Creating DIY Budget Manager

Why create your own Budget Manager? Why not just use Mint or Money Brilliant, or YNAB, etc?
Because it’s more fun to create own yourself. Because all the existing ones just don’t quite right for me. Because this is perhaps the most important data that we have. Because it serves as great learning exercise to Python and data analysis. Because I have too much time in lockdown.

I like a simple and to the point solution. I need something that just works, I don’t need a fancy UI, or automatic Sync Online Banking transaction. I like something that can create categories according to my lifestyle, separate income, expense and investment, automate report creation to Excel, and I definitely want to own my data 100%, forever. I keep it with me even if I move banks, countries or planets. I also definitely want it 100% free. After all, we are trying to save money.

Unfortunately off-the-shelf solutions are often comes with subscription fee, and it is often difficult to export/import your data if I want to switch providers. It also requires an Online Banking permissions, and sometimes not compatible with Australian Banks, does not track cash transactions, and not to mention security or privacy concerns.

My DIY Budget Manager will use browser automation tool to download CSV data, import them to SQLite, and categorise them based on your frequently used transaction, and you have all of your data, forever.

Database Setup

First of all, I need to setup my database. For that I use SQLite. SQLite comes standard with Anaconda.
There is also a free UI browser, and I would recommend you to install it. Note that SQLite behave slightly different than other databases. For example, SQLite have rowid, which is an auto-increment integer primary key. This primary key is replaced when you declare a unique integer primary key. In my database design, some of the tables will utilise this feature.

This is what your database diagram will look like.

Budget Manager Diagram

The database will revolve mainly around the transactions. The Share_tran table is optional. It is meant to track your Stock Trading Portfolio.

This is the code to build that database.

CREATE TABLE Transactions (
	Transaction_ID	INTEGER NOT NULL,
	Account_ID	INTEGER NOT NULL,
	Category_ID	INTEGER,
	Payee_ID	INTEGER,
	Payee_name	TEXT,
	Tran_date	DATE NOT NULL,
	tran_value	REAL NOT NULL,
	Memo	TEXT,
	Tax_flag	Boolean CHECK(Tax_flag IN (0,1)),
	Transfer_ID	INTEGER,
	PRIMARY KEY(Transaction_ID),
	CONSTRAINT fk_category FOREIGN KEY(Category_ID) REFERENCES Categories(Category_ID),
	CONSTRAINT fk_Account FOREIGN KEY(Account_ID) REFERENCES Accounts(Account_ID),
	CONSTRAINT fk_Transfer FOREIGN KEY(Transfer_ID) REFERENCES Transactions(Transaction_ID)
);

CREATE TABLE Accounts (
	Account_ID  INTEGER PRIMARY KEY not null,
	Bank_BSB	TEXT,
	Bank_Name	TEXT,
	Account_No	TEXT,
	Account_Alias	TEXT,
	Account_Type	TEXT,
	Budget_Flag	Boolean NOT NULL CHECK (Budget_Flag IN (0,1)),
	Open_Date	date,
	Close_Date	date,
	Open_Flag	Boolean NOT NULL CHECK (Open_Flag IN (0,1))
);

CREATE TABLE Categories (
	Category_ID INTEGER not null primary key,
	Master_Category	TEXT,
	Sub_category	TEXT
);

CREATE TABLE Common_categories (
	Payee_name	TEXT,
	Category_ID	INTEGER,
	CONSTRAINT fk_common_cat FOREIGN KEY(Category_ID) REFERENCES Categories(Category_ID),
);

CREATE TABLE Payee (
	Payee_id INTEGER not null primary key,
	Payee_Name	TEXT,
	Default_Category	INTEGER
);

CREATE TABLE Budget (
	Budget_id INTEGER not null primary key,
	Category_ID	INTEGER NOT NULL,
	Budget_amount	REAL NOT NULL,
	Budget_period	date NOT NULL,
	CONSTRAINT fk_category FOREIGN KEY(Category_ID) REFERENCES Categories(Category_ID)
);

CREATE TABLE Share_tran (
	Account_ID	integer,
	Confirmation_Number	INTEGER NOT NULL primary key,
	Order_Number	TEXT NOT NULL,
	Trade_date	date NOT NULL,
	BuySell	TEXT NOT NULL,
	Security	TEXT NOT NULL,
	Units	INTEGER NOT NULL,
	Average_Price	REAL NOT NULL,
	Brokerage	REAL NOT NULL,
	net_proceeds	REAL NOT NULL,
	settlement_date	date NOT NULL,
	confirmation_status	TEXT,
	CONSTRAINT fk_account FOREIGN KEY(Account_ID) REFERENCES Accounts(Account_ID)
);

Now that you have the tables set up, it’s time to populate them with data.

Populate Accounts and Categories

First, let’s populate the accounts’ table. After all, we can’t have transactions without an account. You can set up as many as you need. I would suggest to set up for each account you have in all of your bank accounts, and for the cash in your wallet/under-the-mattress, assets, liabilities, and any personal debts or receivables you may have. This way you get an accurate view of your net worth.
You can also set up for your Superannuation, Paypal, Afterpay, etc. This way you track all your debts and commitments.

This is the query to insert an account. Repeat this for each account you have.

INSERT INTO "main"."Accounts"
("Account_ID", "Bank_BSB", "Bank_Name", "Account_No", "Account_Alias", "Account_Type", "Budget_Flag", "Open_Date", "Open_Flag")
VALUES (1, '062000', 'Commonwealth Bank', '12341234', 'Holiday Savings', 'Savings', 1, '12/01/2010', 1);

When you close the account, you can simply insert the close date, and mark the boolean flag with 0. You do this just for record keeping of when your accounts was closed, if you ever needed this information.

After you create the accounts, we need to create the transaction categories. By categorising your spending, you know which area you can save more, also helps future planning. Later on I will show you a code to automatically update the categories. However for now you need to categorise them manually.

Attached spreadsheet is the list of categories that I should recommend you create. Modify this as per your need and insert them into your table. Insert query also attached.

The Accounts and Categories tables are the basic tables you need for your budget manager. So let’s start populating your Transactions table!

Selenium: Download Online Banking Data Automatically

You can always download your CSVs from your bank manually, however this can be quite tedious and repetitive process:
1. Login to online banking
2. Pick an account
3. View transactions
4. Select date range
5. Export to CSV
6. Rinse and repeat for each account

As a lazy programmer, we have to implement something to automate this.

And here comes Selenium. Selenium was originally intended to test websites, however in this case we can use it to automate our data downloads.

The code below prompts for your online banking username and password and submit it in the Chrome browser via HTTPS (so everything is secured and encrypted), and navigate through your accounts and downloads the CSVs. While this code is optimised for Commonwealth Bank, you should be able to change a few things to let it navigate through your bank account.

First of all, import the necessary packages:

from selenium import webdriver
from selenium.webdriver.common.keys import Keys
import time
import getpass

Then create a function that logs you in to the online banking:

def cbaselenium():
    user_ID =  input("Enter your username: ") 
    #print(user_ID)
    passwd =  getpass.getpass()
    #print(passwd)
    
    browser = webdriver.Chrome("<Change to your Chrome Driver path>")
    browser.maximize_window()
    browser.get("https://www.my.commbank.com.au/netbank/Logon/Logon.aspx?ei=mv_logon-NB") #change to your bank's login page

    username = browser.find_element_by_id("txtMyClientNumber_field")
    username.send_keys(user_ID)
    time.sleep(2)
    
    password = browser.find_element_by_id("txtMyPassword_field")
    password.send_keys(passwd)
    password.send_keys(Keys.RETURN)
    time.sleep(5)

    #at this point you should be logged in
    return

To change the code for your bank’s, you need to use Chrome and right click on the username input box and select Inspect. Find the id of the input box and enter it there.

Now that you can automate your logon, you can automate the CSV downloads. We do this by creating a function.

def getAcctCSV(browser,acctName):
    
    browser.find_element_by_partial_link_text(acctName).click()
    time.sleep(5)

    #scroll down to view as many transactions as possible
    browser.execute_script("window.scrollTo(0, 1080)") 
    browser.execute_script("window.scrollTo(0, 1080)")
    browser.execute_script("window.scrollTo(0, 1080)")
    browser.execute_script("window.scrollTo(0, 1080)")
    
    browser.execute_script("window.scrollTo(0, 1080)")
    browser.execute_script("window.scrollTo(0, 1080)")
    browser.execute_script("window.scrollTo(0, 1080)")
    browser.execute_script("window.scrollTo(0, 1080)")
    
    #click the export button    
browser.find_element_by_id("ctl00_CustomFooterContentPlaceHolder_updatePanelExport1").click()
    time.sleep(2)
    
    #use the drop down button and select CSV
browser.find_element_by_xpath("//select[@name='ctl00$CustomFooterContentPlaceHolder$ddlExportType1$field']/option[text() = 'CSV (e.g. MS Excel)']").click()
    time.sleep(2)

    #click the Download button
browser.find_element_by_id('ctl00_CustomFooterContentPlaceHolder_lbExport1').click()
    time.sleep(2)
    # go back
    browser.execute_script("window.history.go(-1)")
    time.sleep(5)
            
    return browser

Now that you have the function ready, you can call it from the main function and use it for each account you have.

def cbaselenium():
    user_ID =  input("Enter your username: ") 
    #print(user_ID)
    passwd =  getpass.getpass()
    #print(passwd)
    
    browser = webdriver.Chrome("<Change to your Chrome Driver path>")
    browser.maximize_window()
    browser.get("https://www.my.commbank.com.au/netbank/Logon/Logon.aspx?ei=mv_logon-NB") #change to your bank's login page

    username = browser.find_element_by_id("txtMyClientNumber_field")
    username.send_keys(user_ID)
    time.sleep(2)
    
    password = browser.find_element_by_id("txtMyPassword_field")
    password.send_keys(passwd)
    password.send_keys(Keys.RETURN)
    #at this point you should be logged in

    time.sleep(5)

    browser = getAcctCSV(browser,'Savings Offset')
    browser = getAcctCSV(browser,'Daily Budget')
    browser = getAcctCSV(browser,'Diamond Credit Card')
    browser = getAcctCSV(browser,'Home Loan')

    return

I hope that helps you setup an automatic way to download your bank’s data.

Import data

After downloading your data, we need to import them into the database.

The CSV data will be uploaded to the database by converting it to pandas list, and then inserted to the SQLite table. This sounds complicated, but in practise it is quite simple. The only catch is that you need to modify the code to match the columns and formats CSV file from different banks.

First of all the necessary imports:

import pandas
import sqlite3

Then the function to add the transactions. This function takes input of the account id (as per your table), and the file name.

def addTransactions(account_id,filename):
    bank_name = conn.execute("SELECT bank_name FROM Accounts where account_id = " + str(account_id)).fetchone()
    #    print('bank_name:',bank_name[0])
    max_date = conn.execute("SELECT max(tran_date) FROM Transactions where account_id = " + str(account_id)).fetchone()
    print(max_date)
    #we use the max_date in the database to avoid entering duplicate transactions

    #header=None if no header, 0 indicates row number for the header, names to replace the column names when header = 0
    #the below is various methods of importing the CSV or Excel file into your database. For Cash I use a simple Android app called "Expenses: Simple Tracker" by Nominalista.
    if bank_name[0] == 'Cash':
        data = pandas.read_excel(filename,parse_dates=['Date'],thousands=",")
        amount_sql = 'amount*-1'
        payee_sql = 'title'
    elif bank_name[0] == 'CBA':
        data = pandas.read_csv(filename,parse_dates=[0], usecols=[0,1,2,3], thousands=",", dayfirst=True, header=None, names=('Date','Amount','Description','Balance'))
        amount_sql = 'Amount'
        payee_sql = 'description'
    elif bank_name[0] == 'St George':
        data = pandas.read_csv(filename,parse_dates=['Date'], usecols=[0,1,2,3,4],thousands=",", dayfirst=True)
        amount_sql = 'coalesce(debit,0)*-1 + coalesce(credit,0)'
        payee_sql = 'description'
    elif bank_name[0] == 'Citibank':
        data = pandas.read_csv(filename,parse_dates=['Date'],thousands=",", dayfirst=True)
        amount_sql = 'amount'
        payee_sql = 'description'
    elif bank_name[0] == 'ANZ':
        data = pandas.read_csv(filename,header=None,names=('Date','Amount','Description'),thousands=",",parse_dates=['Date'],dayfirst=True)
        amount_sql = 'amount'
        payee_sql = 'description'
    elif bank_name[0] == 'Westpac':
        data = pandas.read_csv(filename,header=0,names=('a','Date','Description','Debit','Credit','b','c','d'),thousands=",",parse_dates=['Date'],dayfirst=True)
        amount_sql = 'coalesce(Debit,0)*-1 + coalesce(Credit,0)'
        payee_sql = 'description'
    elif bank_name[0] == 'PayPal':
        data = pandas.read_csv(filename,header=0,names=('Date','Time','Zone','Description','Type','Status','Currency','Amount','a','b'),thousands=",",parse_dates=['Date'],dayfirst=True)
        amount_sql = 'amount'
        payee_sql = 'description'
    elif bank_name[0] == 'American Express':
        data = pandas.read_csv(filename,header=None, names=('Date','Amount','Description','Memo','a','b'),thousands=",",parse_dates=['Date'],dayfirst=True)
        amount_sql = '-amount'
        payee_sql = 'description'
    else: #standard format
        data = pandas.read_csv(filename,parse_dates=['Date'],thousands=",", dayfirst=True)
        amount_sql = 'amount'
        payee_sql = 'description'
        
    data.to_sql('import_data', conn, if_exists='replace', index=False)

    if max_date[0] is not None:
        conn.execute("delete FROM import_data where date <= '" + max_date[0] + " 00:00:00'").fetchone()

    insert_sql='''
        insert into transactions
        (Account_ID,Category_ID,Payee_ID,Payee_name,Tran_date,tran_value,
                   Memo,Tax_flag,Transfer_ID)
        SELECT ''' + str(account_id) + ''' as account_id
            ,p.category_id
            ,'' as payee_id
            ,''' + payee_sql + ''' as Payee_name
            ,date(a.date)
            ,''' + amount_sql + '''
            ,'' as Memo
            ,0 as Tax
            ,Null as Transfer_id
        FROM import_data a
        left join common_categories p
            on a.''' + payee_sql + ''' = p.payee_name
            ;'''
    #    print(insert_sql)

    conn.execute(insert_sql).fetchall()

    return conn.execute("""select min(tran_date), max(tran_date),sum(tran_value),count(*)
            from transactions
            where account_id = """ + str(account_id)).fetchall()

And that’s it! Now we just need to create the connection to sqlite and call the functions.

path='D:/BudgetManager/'

conn = sqlite3.connect(path+'budget.db')

c = conn.cursor()

# #Cash account
addTransactions(1, path+'Transaction file/Expenses_20200822_124441.xls')
# #CBA Savings
addTransactions(2, path+'Transaction file/CSVData.csv')
# #CBA Daily Budget
addTransactions(3, path+'Transaction file/CSVData (1).csv')
# #St George Home Loan 
addTransactions(4, path+'Transaction file/CSVData (4).csv')
# #Citibank Credit Card
addTransactions(5, path+'Transaction file/CSVData (5).csv')
# #PayPal
addTransactions(6, path+'Transaction file/Paypal_20200904.csv')

conn.commit()

#Check the latest balances
conn.execute("""select a.account_id, a.account_alias,count(*), min(tran_date), max(tran_date), printf("%.2f", sum(tran_value)) 
        from transactions t
        join accounts a
            on a.account_id = t.account_id
        group by 1""").fetchall()

conn.close()

Reporting

Obviously all those data is no good if you can’t do your own reporting.
The report must shows how much our income expenses by month and category.

First things first, the import packages and the Sqlite connection.

import pandas as pd
import numpy as np
import sqlite3
from xlsxwriter.utility import xl_col_to_name
import datetime

path='D:/BudgetManager/'
conn = sqlite3.connect(path+'/budget.db')

c = conn.cursor()

Then we need to get the transactions summary grouped by month and category.

df = pd.read_sql('''SELECT strftime('%Y%m',date(tran_date,'start of month','+1 month','-1 day')) as month
                    ,account_alias
                    ,case when c.master_category is null then 'Unknown' else c.master_category end as master_category
                    ,case when c.sub_category is null then 'Unknown' else c.sub_category end as sub_category
                    ,tran_value
                FROM Transactions t
                join accounts a
                    on a.account_id= t.account_id
                LEFT join categories c
                    on t.category_id = c.rowid
                where budget_flag = 1
                order by month desc
                ''',conn,coerce_float=True)

Then we need to create a pivot table based on the data so that we have summary for each month (total net surplus/deficit), and total for each category.

table = pd.pivot_table(df,index=["master_category","sub_category"],columns=["month"],values=["tran_value"],aggfunc=[np.sum],margins=True,fill_value=0)

pivot_shape = table.shape
datetime.date.today().strftime("%Y%m%d")

#create the filename with date stamp to keep records
writer = pd.ExcelWriter(path+datetime.date.today().strftime("%Y%m%d")+"_Income_Expense_Budget_Report.xlsx", engine='xlsxwriter')

table.to_excel(writer, sheet_name="Budget")

# Get the xlsxwriter workbook and worksheet objects.
workbook  = writer.book
worksheet = writer.sheets['Budget']
worksheet.freeze_panes(4, 2)


# Add some cell formats.
cat_format = workbook.add_format()
cat_format.set_align('left')
cat_format.set_align('vcenter')
dollar_format = workbook.add_format({'num_format': '$#,##0.00'})
sum_format = workbook.add_format({'bold': True,'num_format': '$#,##0.00'})
# Note: It isn't possible to format any cells that already have a format such
# as the index or headers or any cells that contain dates or datetimes.

worksheet.set_column('A:A', 28, cat_format)
worksheet.set_column('B:B', 23, cat_format)
worksheet.set_column('C:AQ', 10, dollar_format)

worksheet.set_row(pivot_shape[0]+3, None, sum_format)

#set the rightmost columns to the sum format
a=xl_col_to_name(pivot_shape[1]+1)
worksheet.set_column(a+':'+a, None, sum_format)


writer.save()
conn.close()

That’s it, I hope you enjoy it and it is useful for you. If you like this post, please press like, subscribe and share it with your friends. Donations are welcome too! If you have questions or feedback please let me know from the Contact Me Page.

2 thoughts on “Creating DIY Budget Manager

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s