Willie Sutton, the bank robber, has much to teach us about coding. After he was finally captured by the authorities, a reporter asked him why he robbed banks. As the story goes (very likely a misquote, but who cares - don’t let the truth interrupt a good story), he replied, “Because that’s where the money is.” Similarly, you, the aspiring data analyst, must learn to code because that’s where the data is.
The data is not, of course, in the code, but in order to access the data you need to code. Why can’t I use Excel, you ask? Well, you certainly can, and I would urge you to continue to do so. However, you cannot use only Excel. That simply won’t do.
Excel is perfect when your data is in a spreadsheet. If your data is in a database, behind an API, sitting on a website, or in pretty much any format other than a spreadsheet, Excel will not be much help. I think you already knew this in your heart. Take a second and let it sink in.
Most introductions to coding for data analysis do not suggest you should continue to use Excel. As someone who lacks an analytical mind (not a problem solver here), I find it abhorrent that I should simply let go of the hard work I’ve done learning to analyze data in Excel. I have trouble enough translating the analysis I have in my mind into Excel formulas. Adding new tools, let alone learning to analyze data with code, is a gargantuan challenge.
It is also, in my not-so-humble opinion, an unnecessary challenge. Excel is a fine tool for data analysis. And let’s be honest, most Excel users do not max out the computational power or push the memory limits of Excel. For most users, getting data from previously inaccessible sources would be the greatest boon to their analytical efforts.
So let’s start there. Excel sucks when the data you want is not in a spreadsheet. This is where coding will be so beneficial to you. Without learning much of anything about programming or computer science, you can use code to gather and format data for use in Excel. I’m not saying that it will be easy, but it certainly is not as hard as you think.
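To make "gather and format data for use in Excel" a little more concrete, here is a minimal sketch in Python that pulls rows out of a database and turns them into CSV text, which Excel opens happily. Everything specific in it is an assumption for illustration: the `sales` table, its made-up rows, and the helper name `rows_to_csv` are hypothetical, and an in-memory SQLite database stands in for whatever database you actually have.

```python
import csv
import io
import sqlite3


def rows_to_csv(connection, query):
    """Run a query and return the result as CSV text that Excel can open."""
    cursor = connection.execute(query)
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    # First row: the column names reported by the database.
    writer.writerow([column[0] for column in cursor.description])
    # Remaining rows: the data itself.
    writer.writerows(cursor)
    return buffer.getvalue()


# Stand-in database with made-up data (hypothetical table and values).
connection = sqlite3.connect(":memory:")
connection.execute("CREATE TABLE sales (region TEXT, amount REAL)")
connection.executemany(
    "INSERT INTO sales VALUES (?, ?)",
    [("North", 120.5), ("South", 98.0)],
)

csv_text = rows_to_csv(
    connection, "SELECT region, amount FROM sales ORDER BY region"
)
print(csv_text)
```

Saving `csv_text` to a file ending in `.csv` gives you something you can double-click and open in Excel, at which point you are back on familiar ground.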
Think of any handyman’s toolbox. While there are certainly tools the handyman knows better than others, the handyman also knows that picking the right tool for the job is the best strategy for solving a problem.
With toolboxes like Python or the Linux command line, you will be exposed to numerous tools. This can become overwhelming. Learning to Google “How to read JSON in Python?” or “How to read HTML in Linux?” will allow you to pick the best tool for the job. Further, you will often find that the code you need has already been presented in a Stack Overflow answer. You may find you do more copying and pasting than writing. This is not a bad thing.
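For a sense of what a search like “How to read JSON in Python?” tends to turn up, here is a small sketch using Python’s built-in `json` module. The JSON text itself is made up for illustration, standing in for what an API might hand back; the field names `city`, `readings`, `day`, and `temp` are assumptions, not from any real service.

```python
import json

# Made-up JSON text, standing in for what an API might return.
raw = (
    '{"city": "Austin", "readings": ['
    '{"day": "Mon", "temp": 31}, {"day": "Tue", "temp": 29}]}'
)

# json.loads turns JSON text into ordinary Python dictionaries and lists.
data = json.loads(raw)

# Pull out one value per reading, then summarize.
temps = [reading["temp"] for reading in data["readings"]]
average = sum(temps) / len(temps)

print(data["city"], average)
```

That is about as much as most copy-and-paste answers contain: one line to read the data, and a couple more to pick out the parts you care about.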