
I get asked this question all the time: “I have several years of software development experience. How do I become a data scientist?” Many IT professionals are interested in switching careers to data science but do not know how to do so. In this article, I will cover this topic.
To begin with, let me share my own story. Back in 2008, I was an IT developer. I was working for a leading Indian IT company as a PL/SQL developer on Enterprise Resource Planning (ERP) implementation projects for a leading global client (GE). I was working on Inventory and Order Management (modules of ERP). That is where I saw algorithms/models being used for decision making (e.g. inventory forecasting), and I became interested in learning more.
However, I did not know where to start. I could not find any online course on this. In fact, online learning was not as popular back then as it is now. Note that these were pre-data-science times. With no other option at my disposal, I decided to quit my job (in 2010) to do a full-time M.Sc. in Economics (to learn Statistics/Econometrics as part of the course).
Back then I did not know of ‘data science’ as a career (now we all do). In fact, the term had not yet been coined. But there were jobs in data analytics (rather a niche area back then) at a few companies. To get these jobs, you needed a postgraduate or PhD degree in Statistics, Mathematics or Economics. It wasn’t easy even for engineering graduates from elite universities to get these jobs (let alone others).
Long story short, I managed to get a data analytics job at a US-based MNC after finishing my M.Sc. in Economics. I have been in this field for almost a decade.
Things have changed so much in a decade's time. Now anyone with a quantitative degree can become a data scientist, provided he/she is willing to learn from material freely available on the internet. The entry barrier is lower now than when I started my career in data analytics. Besides, the demand for data science professionals has increased by leaps and bounds.
So what steps would I take if I were to start all over again?
I would not go for a full-time course (for sure). I already knew coding and came from a Physics background (so I knew enough mathematics). So there was no point spending two years and a good amount of money (being from a lower middle class family) on a postgraduate degree. I would teach myself to become a data scientist. I do not think I would be any less competent a data science professional than I am now had I taken this route.
That is what I suggest to anyone who already has a few years of IT experience. Do not leave your job to get into data science. You can learn on the side while working full time. Unless you are rich or interested in a career break, doing a full-time master's degree in data science is not worth it.
So how do you learn on your own to become a data scientist? Follow the steps below.
General advice:
– Do not think you will become a data scientist overnight. It takes time. Usually it takes about 4-12 months of rigorous study to get a job in data science.
– Plan your study. Gather the resources you will study from. Do not follow too many resources. A few good ones are enough.
– Do not waste time untangling confusing technical jargon (like the difference between data science and data mining).
– Do not fall prey to training centres that promise you placements (most do not deliver).
– Do not be intimidated by technical terms. You will be comfortable with them over time.
– Do not do data science because everyone else is doing it. Other IT jobs are equally rewarding.
– Do spend some time finding out whether you are genuinely interested. Otherwise, do not waste your time. Talk to your friends/colleagues who are already working in this field.
– Do not expect a huge salary increase in a short time. You will be disappointed.
– Data science is very vast, so you do not have to know everything. Focus on just a few areas and try to be good at them (you need not be good at both NLP and computer vision).
– Do not worry if you are from a non-engineering background. I have seen many people from pure science, arts and commerce streams doing well in data science.
– Remember your programming experience (Java, .Net, C++ etc.) or Cloud computing experience will be invaluable in data science.
Specific advice on how to learn:
– Learn Linear algebra and Co-ordinate geometry on Khan Academy (1 week)
– Learn basics of Statistics (Mean, Median, Correlation etc.) on Khan Academy (1 week)
– Learn basics of Python Programming on freeCodeCamp (2-3 weeks)
– Learn basics of SQL on W3Schools (1 week)
– Learn advanced Statistics theory on StatQuest (Linear Regression, Logistic Regression, Quantile Regression, Polynomial Regression, Hypothesis Testing, Time Series Analysis, Cluster Analysis) (4-6 weeks)
– Learn Machine Learning on StatQuest/Krish Naik (Decision Tree, Random Forest, Bagging, Boosting, Support Vector Machine) (4-6 weeks)
– Learn to implement these models in Python on Sentdex (8 weeks). Also do a few basic projects (predicting survival using the Titanic dataset, stock price forecasting etc.).
– Learn advanced ML libraries in Python on Sentdex/Krish Naik (2 weeks)
– Do more advanced (real world) projects (Twitter sentiment analysis, credit scoring in banking etc.) (4 weeks). Contact us for projects.
– You may participate in competitions on Kaggle (but you do not need to do well there to become a data scientist)
– Learn some data visualisation on Edureka (using R/Python or specialised tools like Power BI). Big data/ETL experience also adds value to your CV.
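To give a flavour of the very first steps in the list (basic statistics plus basic Python), here is a minimal sketch that computes a mean, a median and a correlation using only the Python standard library. The data and the names hours_studied, test_scores and pearson_correlation are made up for illustration; they do not come from any of the courses mentioned above.

```python
from math import sqrt
from statistics import mean, median


def pearson_correlation(xs, ys):
    """Return the Pearson correlation coefficient of two equal-length lists."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two lists of the same length with at least 2 values")
    mean_x, mean_y = mean(xs), mean(ys)
    # Sum of products of deviations from the mean (unnormalised covariance).
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Sums of squared deviations (unnormalised variances).
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when one list is constant")
    return cov / sqrt(var_x * var_y)


# Hypothetical toy data: hours studied and the resulting test scores.
hours_studied = [1, 2, 3, 4, 5]
test_scores = [52, 55, 61, 70, 72]

print("mean score:", mean(test_scores))
print("median score:", median(test_scores))
print("correlation:", round(pearson_correlation(hours_studied, test_scores), 3))
```

Once something like this feels comfortable, libraries such as pandas and scikit-learn do the same calculations (and much more) in a line or two, which is what the later steps in the list build towards.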
The things mentioned above can be learnt in 5-6 months. After you have done them, start applying for jobs (internally within your company or outside). You can never gain good experience doing projects at home, so try to get a project within your company or outside. Highlight your (academic) data science project experience in your CV. More than your CV, people will be interested in your projects. For example, if you have done a sales forecasting project, it will interest retail companies. A credit scoring project on your CV will impress banks/fintechs.
These are some of the wonderful free resources (blogs/YouTube channels) for learning data science:
– Sentdex
– freeCodeCamp
– Krish Naik
– Edureka
– Machinelearningmastery.com
– Khan Academy (for maths/stats topics only)
My personal favourite is freeCodeCamp. If you do not want to follow too many blogs/YouTube channels, just follow freeCodeCamp. They have everything you need to become a data scientist.
I also run a YouTube channel (Analytics University). It is not as good as the above sources. However, if you are an absolute beginner and have no issue with an Indian accent, you may follow my channel. There are many beginner-friendly analytics videos (on Python, R and SAS).
While free sources are wonderful (no doubt), they are not very organized. Even the good channels/blogs are not very organized. Hence I suggest you take a few cheap paid online courses. There are many courses on Coursera/Udemy. You may subscribe to DataCamp as well. For a little money, you will get to learn from amazing instructors (in an organized fashion).
If you are planning on attending bootcamps, prefer offline ones. Crowded online bootcamps are not that great (speaking from my experience).
I recommend hiring a career coach. It’s not that expensive. There are many people who provide data science career coaching. Talk to your coach over the phone or Zoom on a regular basis for guidance and to clarify your doubts (technical as well as non-technical). Also ask them to review your CV and give you interview preparation tips. You may hire me as a coach (contact: analyticsuniversity@gmail.com).
Sometimes, you may not have to learn everything I have mentioned above to start applying. You just need to do a couple of projects in a given field and then you will start getting calls. Five years ago I trained someone, who was an absolute beginner, in Supply Chain Analytics for 2 weeks. He managed to get a job with PwC within a month. Another person, who was working as a tester with IBM, managed to get a Financial Crime analytics job after doing a credit scoring course from me (contact me if you are interested in learning credit risk modelling).
My first student (back in 2014) was an experienced IT professional (with 10 years more experience than me) who managed to become a data science manager at Cognizant after learning some basics from me. He is not a hands-on data scientist, but he has enough knowledge to manage data science teams (a route that is ideal for those with more than 10 years of experience).
The bottom line is that you need not become an expert to break into data science. So do not feel intimidated when you read on different blogs that you need to master 100 different techniques before calling yourself a data scientist. Like software engineering projects, there are many types of data science projects. Some are extremely complex; others are moderate to easy.
But there is something for everyone in this field. So if you are genuinely interested, I suggest you give it a shot (and now). This field (despite the over-hype) will only grow in the future (imho).