How to start with machine learning wisely and become a data scientist?
The job title data scientist wasn’t widely used even fifteen years ago. Only a limited (but rapidly growing) number of universities offer a master’s degree in data science. So, what are the most promising and effective ways to become a data scientist? Let’s analyze what we already have on the education market and what the perfect learning path should look like.
The most popular ways to become a data scientist
Adult education is strongly connected with lifelong education and learning experience – the process of gaining knowledge, practice and developing a problem-solving mindset supported by a deep motivation to learn.
What are the most common ways to gain a data science learning experience? There are three baseline paths – academia (still), bootcamps and online courses. There are also more informal ways, including community-based short courses (e.g. Introduction to Data Science by PyData) or old-fashioned, one-on-one classes. In this article I will focus on the three most common and well-structured paths.
Universities – prestige & a problem-solving mindset
The top ten universities (QS Top Universities Ranking, 2018), including MIT, Stanford, Cambridge and ETH Zurich, offer superior data science courses. Graduates not only gain knowledge of specific technologies but also develop a problem-solving mindset. Universities teach students how to process issues, ask questions and look for valid answers – in other words, how to think like a scientist. They also offer prestige and peerless social networking. What universities may fail to provide is experience in turning knowledge into practice. They also take years to complete, can be exclusive and are an expensive way to gain knowledge.
Bootcamps – experience
Bootcamps are a shorter alternative to a university education (usually 3-12 months) that in most cases teach technologies and good practices. They are ideal for those committed to starting a new career in a completely new field. As with universities, most bootcamps are designed to be a full-time activity. What sets them apart is the hands-on experience and project work they provide.
Thanks to internship programs and strong connections with companies, bootcamps are an easy path to one’s first junior data scientist job. The downside of the bootcamp route is that graduates have fewer opportunities to learn how to think about more complex issues. There is simply not enough time to do so. Bootcamps remain a great option for career changers, but the learning experience has its limitations.
Online courses – knowledge
Online courses are a third approach. Stanford, MIT and EPFL offer a wide range of courses, as do established online education players such as Coursera or Udacity. Online courses are particularly suited for those who have learned how to learn. If you do not have that skill, you may well get lost in the forest of information. Data science, neural networks, machine learning and business uses of AI can all be tackled online.
Some of the courses are of very good quality and very accessible, though relatively few provide complex knowledge and even fewer offer a comprehensive learning experience – that is, knowledge, practice and the current state of the art. The challenge with online courses is that they demand extreme self-motivation and independent exploration. Unfortunately, if you lack basic knowledge of data science and experience as a learner, then it’s hard to use online courses effectively.
Are these enough?
Each of the above three forms of education suffers from the same problem: it fails to provide the opportunity to apply knowledge in real projects. Two of them also come up short in providing experience and a problem-solving mindset. Both of these factors are crucial in adult education and essential for employers.
Data Science Training Types
* Per person.
** Instant application of knowledge and skills is very important in adult education. It helps in establishing internal motivation and defining goals. In this case, instant application would mean use in a particular project or other job-related tasks.
*** Prestige is an additional variable that comes from classical, academia-based education.
The fourth way into data science
The most effective route to education in data science is, however, different from Gurdjieff’s, and we won’t be referring here to yogis, monks and fakirs. Instead, let’s do a simple exercise. If you take a look at random LinkedIn profiles, you will find that data scientists very often have a background in statistics, physics, math or programming. If you are considering working as a data scientist, it’s quite likely that this is part of your experience too. It’s not a coincidence.
There’s little doubt that it’s faster and more effective to teach machine learning model building to an engineer who knows how to code, or to a researcher who already knows statistics and has some limited experience in R (a programming language), than to teach the same thing to someone without that experience. If you are an engineer or researcher, data science is a natural career choice.
Learning while building a project
An effective learning experience is also a matter of purpose. You need to understand why you are learning and how you will be able to apply your new skills and experience in practice. If you are a software developer and you know that you need to gain specific knowledge to build a model that can solve an issue in your project, you are more willing to learn the technology that will enable you to do that. You will learn faster and potentially understand the technology more thoroughly, because you will be able to apply it by building the solution.
Team training as a part of your job
Let’s add one more variable. Who can benefit from you becoming a data scientist besides yourself? Your employer, of course. This brings me to the fourth way of becoming a data scientist – going through an internal training program grounded in real projects. This may be a training series combined with mentoring sessions, based on a real project that will be run in-company. This kind of training would be provided by an employer as an investment in a future internal data science team. The solution benefits both the company and the individual.
So, the fourth route to an education in data science is on-the-job training based on team learning (giving you motivation and a sense of purpose) and real projects (focusing your efforts on a practical goal). Such training functions as a bridge to solving an existing issue or building an outstanding future solution with your and your team’s new competencies. Because basic programming knowledge is required, and an academic background or at least experience in solving programming, statistical or economic issues is also helpful, the program may not be for everyone. It’s an advanced path for professionals who value their time and have a strong need to develop.
Characteristics of the Fourth Way Training
* Assumes that the training participants are already familiar with problem solving from academia or years of work experience.
** Depends on the participants’ previous experience and educational needs.
Teaching data science effectively
Based on my many years of work with data scientists and software developers in the US and Europe, I would suggest that none of the main current data science educational paths – universities, bootcamps and online courses – fully covers the job-oriented potential in this field. None of the three offers effective, advanced training that gives you knowledge and experience ready to use in your job. The fourth way is a new force on the advanced, professional educational market, and will give you an opportunity to develop in your company.
The next post will discuss the practical implementation of the fourth way using the 4T training method. Enjoy!