Best Free Data Science/Data Management Courses

Over the past few months, I have been gathering the best online Data Science/Computer Science/Data Management courses. I have also included courses that require a student email address, and those that do not.

Data is truly a growing industry. I believe it is now mentioned in almost every job role in any industry, from junior to executive management level. It is not limited to the ability to extract and manipulate data; it also covers visualising, storytelling with and interpreting the data, as well as managing and governing it to ensure it is of high quality.

1. Python

Python has to be first on the list.

Python is perhaps the most popular programming language in the world today. It has an extensive library ecosystem that is very useful for data analytics, machine learning and other scientific disciplines.
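
To give a quick taste of what those libraries make possible, here is a minimal sketch (the dataset and values are made up purely for illustration) using pandas:

import pandas as pd

#a tiny made-up dataset, purely for illustration
sales = pd.DataFrame({
    "region": ["North", "South", "North", "South"],
    "amount": [120.0, 80.5, 95.25, 60.0],
})

#group and aggregate in one line - the kind of task these libraries make trivial
print(sales.groupby("region")["amount"].sum())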

The link directs you to the first of two training units for the PCEP – Certified Entry-Level Python Programmer and PCAP – Certified Associate in Python Programming certification exams. The exams themselves are fairly affordable at USD 59 (Entry) and USD 295 (Associate) each, and can be taken from https://pythoninstitute.org/pcep.

The intermediate course can be taken from this link.

2. Datacamp

Datacamp is one of my favourites, because the training has an integrated IDE, which lets you execute code within your web browser without installing additional software.

Datacamp offers two free courses that introduce Python and R. However, the yearly subscription is good value and comes with dozens of courses, projects, a workspace and well-recognised certification exams with case studies.

3. SAS

While the use of SAS in enterprises is waning compared to Python or R, it is still used in large enterprises and government. According to Seek.com.au, there are 3,300+ jobs requiring this skill today.

SAS offers two free courses, Programming 1 and Statistics 1, and their optional exams are only USD 150 each.

However, if you have a .edu email address, almost their entire library of self-paced training is free, and their exams become half-priced too. That is a saving of thousands of dollars! Follow this link to learn more: link.

4. Alteryx

The next on the list that I would like to recommend is Alteryx. Alteryx is a no-code/low-code IDE, which means you can learn data analytics without prior coding knowledge. A growing number of enterprises are adopting this tool for their Big Data projects, and knowledge of it will certainly help your career.

The Alteryx Academy has robust training with interactive lessons that let you explore the IDE without installing the software itself. However, I do recommend installing their Designer software on the free 14-day trial. There is also a free 3-month Career Changer licence, as well as a free 12-month licence for those with a student email address.

They offer three absolutely free certification exams: Foundation, Core and Advanced Designer. The Foundation and Core levels test basic knowledge of Alteryx’s tools and data analyst skills. The Advanced exam, however, is a step up, testing more specific tools with tougher practical questions. If you are keen to go further, they have Expert and Predictive Master exams at USD 150 each.

5. Dataiku

Dataiku is another no-code/low-code tool, similar to Alteryx. Personally, I don’t have much knowledge of or experience with it, but I am keen to explore it in the near future.

6. Power BI

Before exploring this point onwards, I recommend first becoming familiar with the basics of data analytics covered above.

Power BI is one of the most popular visualisation tools on the market right now. The link above offers tons of free training.

7. Tableau

Tableau is another popular visualisation tool, one that is more common in larger enterprises. Unfortunately, they only offer a few free basic courses.

8. Collibra

Collibra is one of the Data Governance tools used in large enterprises. They have tons of free training; while some of it is very specific to their own software, the general Data Management knowledge is applicable anywhere.

9. Informatica

Informatica is another Data Governance tool that competes directly with Collibra. They also have tons of free training, and the general Data Management knowledge is worth learning.

10. VMEdu

While VMEdu does not offer Data Science/Computer Science/Data Management courses as such, its free training and certifications cover Scrum and Lean/Six Sigma (among others), which are essential for developers in a project environment.

What else?

The list above only covers some of the free training in the world of data. There are others, such as AWS, Azure, Snowflake and Redshift.

Beyond the above, there are dozens of free courses on Udemy, edX and Coursera that are truly worth taking. Coursera also offers one free paid course each calendar year for .edu email addresses! If you have one, the Google or IBM Data Analytics course is very much worth taking.

Do you know of more? Please share your knowledge by sending me an email!

What Non-Techies Should Know About Simplifying Code

“Simplicity is the ultimate sophistication.”

Leonardo da Vinci

Hi all, I really should start blogging again. It has truly been a while; apologies for that. I will try to blog more regularly from now on.

For my return to blogging, I will discuss what non-technical people (PMs, BAs, Marketing, etc.) should know about ‘simple’ coding.

Why should you know this? Because simplicity is one of the goals we are all trying to achieve in our jobs. I believe most organisations have this conversation with their teams every now and then: how can we make this simpler?

Unfortunately, I believe these discussions often only look at how to make the processes simpler. They often neglect simplifying the code that is used to generate the data, calculations or information.

So here it is, a very simple way to simplify coding.

How to identify complexity in code?

First of all, let me define complexity. Complexity in coding is when there is too much unnecessary code. In other words, if less code can achieve the same result, then that code is overly complex.

Think about a document or report and its purpose. If it is unnecessarily long, then it is an overly complex document, right? It is the same with code: if it can be written with less to achieve the same purpose, then it must be too complex.

One way to detect this is by looking at the number of lines (minus the comments). If the count can be reduced by a thousand lines or more, there must be room to improve.
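
As a rough sketch of that check (the file name is just a placeholder, and it only treats lines starting with '#' as comments):

#count non-blank, non-comment lines in a Python file - a rough heuristic only
def count_code_lines(filename):
    with open(filename) as f:
        lines = [line.strip() for line in f]
    return sum(1 for line in lines if line and not line.startswith("#"))

print(count_code_lines("my_script.py"))  #placeholder file name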

Another sign of complexity in code is unnecessarily nested ifs or nested loops.

Loops and ifs are fundamental building blocks of any programming language; they are the first things we learn in any programming course. ‘Nested’ means an if or loop sits inside another if or loop, and so on. The deeper the nesting, the more difficult and complex the code is to understand.

At university, we are often made to learn how to use and implement nested ifs and loops, which is an advanced and complex technique. In real life, however, this is not necessary in most cases.

Think about a Word document again. A document can have multiple sub-headings, right? If a document has sub-headings nested too many levels deep, then it is a complex document. It is the same with code: deeply nested loops or ifs are difficult to understand, and that is complexity. The question to ask is: is that really necessary? Can it be done more simply?
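
To make this concrete, here is a small made-up example: the same check written with nested ifs, then flattened with early returns. Both functions give the same answer, but the second reads in a straight line:

#deeply nested version - harder to follow
def can_approve_nested(order):
    if order is not None:
        if order["amount"] > 0:
            if order["customer_active"]:
                return True
            else:
                return False
        else:
            return False
    else:
        return False

#flattened version - same result, no nesting
def can_approve_flat(order):
    if order is None:
        return False
    if order["amount"] <= 0:
        return False
    return order["customer_active"]

print(can_approve_flat({"amount": 50, "customer_active": True}))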

The third way is about using functions for repeated code. When you have repeated or similar code, put it into a function, so that you use less code, fewer lines and less complexity. Imagine your database has 100 similar tables and you need to append them together. Writing 100 SQL statements with unions in between is unnecessarily complex. Writing one SQL statement template and a function that appends them together is much simpler.
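
As a sketch of that idea (the table names and the commented-out execute call are hypothetical), one small function can generate the whole query instead of 100 hand-written statements:

def union_all_query(table_names):
    #build a single SELECT ... UNION ALL ... statement from a list of table names
    return "\nUNION ALL\n".join("SELECT * FROM " + name for name in table_names)

#hypothetical example: 100 similarly named tables
tables = ["sales_" + str(year) for year in range(2001, 2101)]
query = union_all_query(tables)
#cursor.execute(query)  #run it with whatever database connection you use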

The last thing is structure. Again, let’s compare it to a Word document. Every section should have a sub-heading and a clear purpose, right? A document is poorly written if the introduction appears in the middle, for example, or if new analysis appears in the conclusion.

In code, the structure should be such that each section aims to do just one or maybe two things. It should have a comment block explaining what it is trying to achieve, and that’s it.
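
As a simple illustration (a toy script with made-up steps), that structure can look like this:

def load_numbers(text):
    #Section 1: parse the raw input into numbers
    return [float(x) for x in text.split(",")]

def summarise(numbers):
    #Section 2: compute the summary this section is responsible for
    return {"count": len(numbers), "total": sum(numbers)}

def main():
    #each section does one thing, with a comment explaining what it achieves
    summary = summarise(load_numbers("1, 2, 3.5"))
    print(summary)

if __name__ == "__main__":
    main()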

Why does this matter?

So you might ask: why does this matter? Should we care, as long as it does what it is supposed to do?

Because complex code is difficult to peer-review. It is difficult, and takes longer, to audit. It is also more difficult to hand over should someone resign.

I hope this helps; that is my two cents on coding simplicity. Contact me if you need help designing controls around your coding team.

Mortgage/Loan Interest Validator – Has Your Bank Made a Mistake?

Is your bank charging the right interest?

If you have a mortgage, you may have wondered whether your bank charges your interest correctly. What if the bank didn’t factor in all your offset account balances? What if they charge an incorrect rate? Or perhaps you are just curious.

There are a few reasons why a bank might miscalculate your interest:

  1. Your offset account(s) aren’t linked properly, or not linked in time, such as in this event. This could be due to a loss of communication, human error or system error.
  2. An incorrect interest rate. For whatever reason, the rate is wrong. This could be because the introductory/package/discretionary discounts were not applied properly, or due to miscommunication.
  3. An incorrect payment type, such as Interest Only versus Principal + Interest. This is usually a system error, but can be human error too.

Doing this manually in Excel can be very troublesome and time-consuming. The simple code below will automatically validate every interest charge on your home loan.

After you have followed the procedure below, validating your interest will take only a matter of minutes. And if you haven’t installed Anaconda already, I recommend doing that now.

If you are worried that your data may be compromised, fear not: the code below runs completely offline, and no data will be uploaded anywhere.

Download the Mortgage and Offset Transactions

First of all, you need to download all your home loan and offset account transactions in CSV format. This feature is usually available with all banks; if you cannot find it in your online banking, contact your bank.

FYI, if you have already created your own budget manager, you don’t need to do this again.

After you have downloaded your CSVs, open them in Excel and remove all columns except description, transaction date and value, in that order. Don’t worry about the column names.

However, please make sure of the following (a quick sanity check is sketched after this list):

  1. The transaction values are correctly signed (positive for deposits, negative for interest or redraws).
  2. All interest charges have the word ‘interest’ in their description (case doesn’t matter).
  3. All payments have the word ‘payment’ in their description. We may use this in the future.
  4. The first row has to be the initial balance of the account.
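
To double-check a downloaded file against the points above before running the main script, a quick sanity check along these lines may help (the file name is just an example; the read_csv call mirrors the one used in the code further down):

import pandas as pd

#quick sanity check on one CSV - adjust the file name to yours
df = pd.read_csv('Mortgage1.csv', header=0, names=('Description','Date','Amount'),
                 thousands=",", parse_dates=['Date'], dayfirst=True)

print(df.head())  #the first data row should be the initial balance
print("Interest rows:", df['Description'].str.contains('interest', case=False).sum())
print("Payment rows:", df['Description'].str.contains('payment', case=False).sum())
print("Negative amounts:", (df['Amount'] < 0).sum(), "| Positive amounts:", (df['Amount'] > 0).sum())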

Finally, we need the interest rates of the loan. For that, create a new Excel file and call it ‘interest_rate.xlsx’.

In that file, there should be only two columns: the date the interest rate became effective, and the interest rate itself.
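
If it helps, here is one way to create that file with pandas (the dates and rates below are made-up examples). Note that, given how the code further down uses the rate (balance multiplied by rate, divided by 365), the rate appears to be expected as a decimal fraction, e.g. 0.0450 for 4.50%:

import pandas as pd

#made-up example rates - replace with your own effective dates and rates
rates = pd.DataFrame({
    "Date": pd.to_datetime(["2020-01-01", "2021-06-15", "2022-11-01"]),
    "Rate": [0.0289, 0.0224, 0.0474],  #annual rate as a decimal fraction
})
rates.to_excel("interest_rate.xlsx", index=False)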

The Code – Comments Inside

import pandas as pd
import numpy as np
from datetime import datetime

#put all your csv and py files in this location
path='C:/Users/user/OneDrive/Documents/Blogging/Mortgage Validator/'
number_of_offset = 2 #change for the number of offset accounts

def main():
    #import the mortgage file
    mortgage_df = import_file(path+'Mortgage1.csv')
    
    #prime the mortgage data
    mortgage_df = format_mortgage_data(mortgage_df)
    
    #add the interest rate data
    mortgage_df = addInterestRate(mortgage_df)
    
    #add all the offset files
    mortgage_df = merged_offsets(mortgage_df)
    
    #calculate the daily interest
    mortgage_df = calculateInterest(mortgage_df)
    
    #export the output to excel
    export_output(mortgage_df)
    
def merged_offsets(df):
    i=1
    df['total_offset_balance'] = 0

    while i <= number_of_offset:
        #import the offset file
        offset_df = import_file(path+'offset'+str(i)+'.csv')

        #format the offset data
        offset_df = format_offset_data(offset_df)
        
        #merge the offset data with the mortgage
        df = pd.merge(df, offset_df, on="Date",how='outer')
            
        #rename the offset balance column
        df = df.rename(columns={"offset_balance": "offset"+str(i)+"_balance"})
        
        df['total_offset_balance'] = df[['total_offset_balance','offset'+str(i)+'_balance']].sum(axis=1, numeric_only=True)
        
        i=i+1
    
    return df

def import_file(filename):
    df = pd.read_csv(filename,header=0,names=('Description','Date','Amount'),thousands=",",parse_dates=['Date'],dayfirst=True)
    
    #first, sort data by date
    df = df.sort_values(by=['Date'])
    
    return df

def format_mortgage_data(df):
    #then we need to separate each description to each columns
    #Interest columns
    df['Interest'] = np.where(df['Description'].str.contains('interest',case=False), df['Amount'], 0)
    #Repayment columns
    df['Payment'] = np.where(df['Description'].str.contains('payment',case=False), df['Amount'], 0)
    #Other columns
    df['Other'] = np.where(~df['Description'].str.contains('payment|interest',case=False), df['Amount'], 0)

        
    #Then we need to group duplicate dates in one row
    df = df.groupby(['Date'])[['Interest','Payment','Other']].sum().reset_index()

    #fill the gaps between dates with time series
    r = pd.date_range(start=df.Date.min(), end=datetime.now())
    df = df.set_index('Date').reindex(r).fillna(0.0).rename_axis('Date').reset_index()
    
    return df


def format_offset_data(df):
    #then we need to separate each description to each columns
    #Debit columns
    df['Debit'] = np.where(df['Amount'] < 0, df['Amount'], 0)
    #Credit columns
    df['Credit'] = np.where(df['Amount'] > 0, df['Amount'], 0)

        
    #Then we need to group duplicate dates in one row
    df = df.groupby(['Date'])[['Debit','Credit']].sum().reset_index()

    #fill the gaps between dates with time series
    r = pd.date_range(start=df.Date.min(), end=datetime.now())
    df = df.set_index('Date').reindex(r).fillna(0.0).rename_axis('Date').reset_index()
    
    #calculate daily balance
    df['Balance'] = (df['Debit']+df['Credit']).cumsum()
    
    #drop the Debit and Credit as we no longer needing them    
    df = df.drop(columns=['Debit', 'Credit'])

    #rename the offset balance column
    df = df.rename(columns={"Balance": "offset_balance"})
    
    #Forward fill the balance until today
    df = df.set_index('Date').reindex(r).ffill().rename_axis('Date').reset_index()
    
    return df

def addInterestRate(df):    
    rate_df = pd.read_excel(path+'interest_rate.xlsx',header=0,names=('Date','Rate'))

    r = pd.date_range(start=df.Date.min(), end=df.Date.max())
    
    #Forward fill the rate dataset with the rates for each day
    rate_df = rate_df.set_index('Date').reindex(r).ffill().rename_axis('Date').reset_index()
    
    #join the table to put the interest rate in the daily data
    df = pd.merge(df, rate_df, on="Date",how='outer')
    
    return df


def calculateInterest(df):
    #Now we have a daily dataset with Interest and payment amount populated.
    #Then we need to calculate the daily balance
    df['Balance'] = (df['Interest']+df['Payment']+df['Other']).cumsum()
    
    #Calculate the accrued daily interest
    df['AccruedDailyInterest'] = (df['Balance']+df['total_offset_balance'])*df['Rate']/365
    
    #Calculate the accumulated interest accrued, and reset the amount each time the interest is charged by the bank
    df['AccumIntAccrued'] = df.groupby((df['Interest']!=0).cumsum()).AccruedDailyInterest.cumsum()
    
    #Calculate the interest that should be charged on the interest charged date
    df['AccruedMonthlyInterest'] = np.where(df['Interest']!=0, df.AccumIntAccrued.shift(1),0).round(2)
    
    #Calculate the interest difference
    df['InterestDiff'] = (df['AccruedMonthlyInterest']-df['Interest'])
    
    return df

def export_output(df):
    #export the file to excel
    df.to_excel(path+"output"+datetime.now().strftime("%Y%m%d_%H%M%S")+".xlsx")

#run everything - kept at the bottom so all functions are defined before main() is called
if __name__ == "__main__":
    main()

Final Thoughts

Are your interest charges different from what you have calculated? Try to figure out why yourself before contacting your bank.

Some of the reasons may be:

  1. An incorrect interest rate was applied. Check your statements and documentation.
  2. A different rounding method. Some banks may round the accrued interest daily; if this is the case, the difference should be very small (a variant of the calculation for this is sketched after this list).
  3. Interest charged on a different day. Banks usually charge interest on the monthly anniversary day; however, when this falls on a weekend or public holiday, they may charge it on the following Monday. While this may be how it appears on your statement, the balance may still factor in the interest amount on the anniversary day. Try adjusting the dates yourself to see if this is the cause.
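
If you suspect the daily-rounding case (point 2 above), one possible tweak is the variant of the calculateInterest function below, which rounds each day's accrual to the cent before accumulating; everything else is unchanged from the version above:

def calculateInterestDailyRounding(df):
    #same as calculateInterest(), but rounds each day's accrual to the cent
    df['Balance'] = (df['Interest']+df['Payment']+df['Other']).cumsum()
    df['AccruedDailyInterest'] = ((df['Balance']+df['total_offset_balance'])*df['Rate']/365).round(2)
    df['AccumIntAccrued'] = df.groupby((df['Interest']!=0).cumsum()).AccruedDailyInterest.cumsum()
    df['AccruedMonthlyInterest'] = np.where(df['Interest']!=0, df.AccumIntAccrued.shift(1),0).round(2)
    df['InterestDiff'] = (df['AccruedMonthlyInterest']-df['Interest'])
    return df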

In the future, I might expand this into analysing how quickly you will pay off your loan based on the current parameters.

Interview Tips in the Remote Working Environment

COVID has changed how we do interviews. Here are my tips to ace them.

Article: Tips on how to ace a video interview — People Matters

In today’s climate, you will probably be invited to interview via video conference. Video interviews can be very different from, and more challenging than, face-to-face ones. You may be invited via Skype, Teams, Zoom or another video conferencing application.

Read more

SAS for SQL Developers

I am writing this guide for the SQL developers out there who are learning SAS for the first time.

In case you are wondering, SAS is among the top ten data science programming languages. In Australia, major corporations use SAS across many of their departments. The main reason they choose it over Python or R is its enterprise support.

The following guide covers some of the most common SAS functions.

Read more

Budget Manager: BillSense and BalanceSense

Recently, one of the Big 4 banks in Australia released a feature in their mobile banking app called BillSense. It analyses your transactions and predicts your future bills.

As a customer of the said bank, I understand that BillSense works by looking at my past transactions and identifying future bills from them. The app then asks me to confirm which transactions are bills and which are not. This data is then used to project a future balance and to prompt me to set aside funds for those bills.

While this is great, it is aimed at Millennials, which I am not. For example, it doesn’t predict future income or other regular expenses such as dining out or groceries, which means it doesn’t really estimate a future balance.

I know that I am disciplined with my spending and I have a very good idea of my regular bills. What I need therefore is a balance projection to predict when I will be able to achieve my savings or retirement goal.

Read more

The Best Android Auto Device is not what you think it is

For a few years now, I have been wanting to upgrade my ageing car’s head unit, otherwise known as the radio or stereo system, to a newer system that has Android Auto or Apple CarPlay.

If you haven’t tried Android Auto or Apple CarPlay yet, you should test drive a newer car model that has them. They will blow your mind compared to an old stereo system.

Read more

Creating DIY Budget Manager

Why create your own budget manager? Why not just use Mint, Money Brilliant, YNAB, etc.?
Because it’s more fun to create your own. Because none of the existing ones are quite right for me. Because this is perhaps the most important data that we have. Because it serves as a great learning exercise for Python and data analysis. Because I have too much time in lockdown.

I like a simple, to-the-point solution. I need something that just works; I don’t need a fancy UI or automatic syncing of online banking transactions. I want something that can create categories that suit my lifestyle, separate income, expenses and investments, and automate report creation to Excel, and I definitely want to own my data 100%, forever. I keep it with me even if I move banks, countries or planets. I also definitely want it 100% free. After all, we are trying to save money.

Unfortunately, off-the-shelf solutions often come with a subscription fee, and it is often difficult to export/import your data if you want to switch providers. They also require online banking permissions, are sometimes not compatible with Australian banks, and do not track cash transactions, not to mention the security and privacy concerns.

My DIY Budget Manager uses a browser automation tool to download CSV data, imports it into SQLite, and categorises it based on your frequent transactions, and you keep all of your data, forever.

Read more