Finance Project 3: The Schema

Now that I’ve decided how to structure the data in my finance application I can move on to the next step – designing the schema.

As I discussed in the last post, I’ve decided not to have a complex hierarchy of accounts, although on reflection I think I will break it down further than I had first anticipated. After much consideration, I’ve decided on three layers of organisation.

The account type is straightforward double entry bookkeeping – asset, liability, equity, expense and income (or revenue, when you see it in a business context).

The account category is the next level, and examples will include “Utilities”, “Eating out” and so on. There aren’t going to be that many categories. There are only so many different things I spend money on.

The account will then be whatever level of breakdown I think necessary. For example I might choose to have specific vendors within “Eating out” with an “Other” category for ad-hoc meals. There will be fields to add narrative to individual transactions if I want to add more detail.

Some account categories may only have a single account, but enforcing the three levels keeps things simple. A truly flexible approach would replace the account category with an arbitrarily deep level of parent – child accounts, but that gets unnecessarily complicated to maintain and display. I know this from experience, having developed just such a system before…

The account structure schema will therefore look something like this:

Now we come to the actual transactions. I considered two approaches to maintaining double entry integrity.

One option was to store all transactions in a table with fields for two accounts – the account to credit and the account to debit, effectively rolling both sides of the double entry into a single record. It would be impossible to break the double entry integrity as long as both accounts were mandatory fields.

There are two downsides to that approach.

Let’s say the records look something like this. Note that I’m simplifying the illustration by using names for the accounts – in reality they would be the foreign keys to the accounts table:

debit_account credit_account amount
Bank Account Salary 1000
Gas Bill Bank Account 100
Groceries Credit Card 50

 
To get all of the transactions on my bank account, the query would be:

SELECT * FROM transactions WHERE debit_account = "Bank Account" or credit_account = "Bank Account"

Getting the balance is even harder. I’d either have to sum the debits and credits separately and subtract the credits from the debits or write something ugly. In MySQL it would be:

SELECT SUM(IF(debit_account = "Bank Account", amount, -1 * amount)) FROM transactions WHERE debit_account = "Bank Account" or credit_account = "Bank Account"

In other words, convert all of the credits to negative numbers. It’s complicated and ugly, and might be dependent on the database – other database engines may need the case expression instead of the if statement.

Now imagine an alternative approach with separate entries for each side of the transaction, and using negative numbers for credits. This time the transaction entry table will just link multiple transaction records together, so the individual transaction entries will be held on a separate table:

transaction_id account amount
1 Bank Account 1000
1 Salary -1000
2 Gas Bill 100
2 Bank Account -100
3 Groceries 50
3 Credit Card -50

 
The list of bank account transactions is simplified to:

SELECT * FROM entry WHERE account = "Bank Account"

And the balance is simply:

SELECT SUM(amount) FROM entry WHERE account = "Bank Account"

It’s also really simple to check that the double entry transactions balance – they should sum to zero. A purist might argue against this approach and suggest that the amount should always be positive with a flag to indicate whether it’s a credit or debit, but that just brings back the same problems with queries.

It might look a bit odd that the Salary account has a negative balance, but treating credits as negative numbers is a well established convention in databases. In effect, the negative sign is, in itself, the flag to indicate that the amount is a credit.

The other benefit of having multiple lines for each side of the transaction is that it allows for split transactions. Let’s say I go to the supermarket and buy some groceries and a new shirt. I want to allocate each to a different expense account. With the first approach, I’d need two credit entries against the credit card account:

debit_account credit_account amount
Groceries Credit Card 30
Clothing Credit Card 20

 
That’s a problem, because it doesn’t reflect reality. I didn’t actually make two separate credit card purchases. The second approach lets me reflect that:

transaction_id account amount
3 Clothing 20
3 Groceries 30
3 Credit Card -50

 
As an aside, a purist would probably hate to see the use of negative numbers to indicate credits in a liability account, such as a credit card, as that seems to breaks the accounting equation:

Equity = Assets – Liablities

If liabilities carry a negative balance, subtracting them from assets will actually increase equity, as the signs cancel out.

The thing to remember is that the use of negative numbers is a database convention to make calculations easier for the database. From the point of view of the database:

Equity = Assets + Liabilities

But only because the numbers in the liability accounts are already negative. Strictly speaking, debits and credits don’t map to positive and negative numbers, but treating them as if they do makes the database much easier to work with.

I’ve seen various approaches to naming the pair of tables in this setup. Some use journal for the overall transaction and posting for the individual entries, some use ledger and ledger_item. Those terms have specific meanings in accountancy and they don’t map exactly to what I’m doing, so I’m using transaction and entry for the table names:

Overall, then, my approach is double entry but not a purist version of double entry. I’ve chosen a method that will make selection and reporting easier at the expense of more complex data entry and validation. Mapping a transaction to a single database entry makes form design and validation simple, but with my method I’ll have to validate that the entries balance before adding the transaction and entries as a single atomic database request.

The schema isn’t complete – I’ll be adding narrative fields and may need to add additional fields or even tables as I go along. It’s enough to give me a basic structure, though, and the next step will be choosing a database and building the actual tables.

in Finance Project

Related Posts

Add a Comment

Your email address will not be published. All comments will be reviewed.