Do you want to become a Data Scientist but lack the formal qualifications? Fret not. One of our data scientists followed this exact path and became one within a year despite coming from a non-technical background. You can definitely become one too. Here are the 15 best courses to learn data science in 2020.
Make sure to read our last post on ‘5 Most Important Skills of a Data Scientist’ to find out why we focus on these skill sets.
Disclosure: Some of the links below are affiliate links, meaning, at no additional cost to you, Self Learn Data Science will earn a commission if you click through and make a purchase.
Probability Courses
1. An Intuitive Introduction to Probability (Coursera) [Free, $49 for Cert]
- Rating: Coursera – 4.7 (694), Class Central – 4.0 (3)
- Estimated Workload: Approx. 14 hours
- Target Audience: Beginner
- Concepts Covered: Basic and Conditional Probability, Bayes rule, Random Variables, Normal Distribution
Review: A basic introduction to probability. It provides enough fundamentals to get you going but not enough for applied work. Suitable for learners without any prior exposure to probability. Even though enrollment is free, certification and weekly quizzes are unlocked only after paying for the course. Recommended for beginners who want a general overview of the topic.
2. Probability – The Science of Uncertainty and Data (edX) [Free, $300 for Cert]
- Rating: Class Central – 4.88 (24)
- Estimated Workload: 10 – 14 hours/week for 16 weeks
- Target Audience: Advanced
- Concepts Covered: Basic and Conditional Probability, Independence, Random Variables, Bayesian Theory, Probability Distributions, Markov Chains
Review: This is a post-graduate, math-intensive course from MIT and is not for the faint-hearted. Unlike your usual MOOC, this course follows the actual MIT classes closely, making it one of the most challenging and rigorous MOOCs available. Take the mathematics prerequisites seriously: the course presumes competency in calculus, and the learning curve steepens as the weeks go by. The time commitment is exceptional for a MOOC, with many learners clocking more than 10 hours per week on lectures and homework. As this is an instructor-led course, lectures and homework are released weekly with tight submission deadlines (usually a week or two after the content is released).
At the time of writing, the next iteration of the course starts on 27 Jan 2020. Get ready to learn all the probability a data scientist needs and to strengthen your mathematics competency. Highly recommended for serious learners who want to dive deep into the math behind data analytics. Learners without official qualifications should definitely try to get the verified certificate to tap into MIT’s credentials and build up their credibility in the field.
Statistics Courses
3. Intro to Descriptive Statistics, 4. Intro to Inferential Statistics (Udacity) [Free]
- Ratings: Class Central – 4.0 (12), Class Central – 4.5 (8)
- Estimated Workload: 2 hours/week for 8 weeks
- Target Audience: Beginner
- Concepts Covered: Sampling Methods, Central Tendency, Variability, Hypothesis Testing, Experimental Designs
Review: Intro to Descriptive Statistics and Intro to Inferential Statistics complement each other to provide a general understanding of statistical theory. Take them together for a sufficient introduction to statistics, as both sets of concepts are needed for data science. The courses consist of very short video lectures (no more than 3 minutes each) and basic quizzes to reinforce understanding. They serve as great introductory material for anyone without statistical training. Both courses are free but offer no certification upon completion.
5. Fundamentals of Statistics (edX) [Free, $300 for Cert]
- Ratings: Class Central – 5.0 (2)
- Estimated Workload: 10 – 14 hours/week for 16 weeks
- Target Audience: Advanced
- Concepts Covered: Estimator, Maximum Likelihood, Hypothesis Testing, Confidence Interval, Linear Model, PCA
Review: Similar to ‘Probability – The Science of Uncertainty and Data’, this is another math-intensive course offered by MIT. Most of the review for ‘Probability – The Science of Uncertainty and Data’ applies here, and I should emphasize the time commitment needed for these courses. Both require consistent effort over an extended period (about 4 months), so plan your schedule well if you intend to complete the course within the time frame.
This is especially important if you purchase the verified certificate, as no refund is given if you do not attain the required score by the end of the course. Take note of the dates of your midterm and final exam before you pay for verification, as these carry a higher weight in your final score. I, for one, did not manage to complete the course on my first try due to unexpected time commitments, and you should avoid the same mistake. The next iteration starts on 11 May 2020 at the time of this writing. Highly recommended for serious learners who aspire to become professional data scientists.
Calculus Courses
6. Mathematics for Machine Learning: Multivariate Calculus (Coursera) [Free, $49 for Cert]
- Ratings: Coursera – 4.7 (2,354), Class Central – 4.88 (8)
- Estimated Workload: Approx. 22 hours
- Target Audience: Beginner
- Concepts Covered: Uni-variate & Multi-variate Calculus, Differentiation Rules, Partial Differentiation, Jacobian & Hessian, Calculus in Neural Networks, Taylor Series, Optimisation, Linear & Non-Linear regression
Review: The calculus required for data science applications. A great way to focus your efforts on the important calculus concepts instead of taking general college-level calculus courses. This course introduces the calculus needed for machine learning algorithms. However, the learning curve can be steep, as concepts are introduced, covered, and left behind quickly throughout the course. Recommended for learners who need a refresher on calculus and for beginners who want to focus on the main concepts required in data science. The course is available without any fees, but quizzes and certification are unlocked only after payment. Certification is not a must unless you have no prior technical qualifications.
Linear Algebra Courses
7. Mathematics for Machine Learning: Linear Algebra (Coursera) [Free, $49 for Cert]
- Ratings: Coursera – 4.7 (4,505), Class Central – 2.7 (6)
- Estimated Workload: Approx. 22 hours
- Target Audience: Beginner
- Concepts Covered: Vector Space, Vector & Matrix Operations, Determinants, Inverses, Eigenvectors & Eigenvalues
Review: Similar to ‘Mathematics for Machine Learning: Multivariate Calculus’, this course introduces the linear algebra concepts used in data science. Its pros and cons are similar to those of the calculus course. Recommended for learners who need a refresher course or beginners who want to focus their efforts on specific linear algebra concepts. The course is available without any fees, but quizzes and certification are unlocked only after payment. Certification is not a must unless you have no prior technical qualifications.
8. Linear Algebra – Foundations to Frontiers (edX) [Free, $49 for Cert]
- Ratings: Class Central – 4.4 (11)
- Estimated Workload: 6 – 10 hours/week for 15 weeks
- Target Audience: Intermediate
- Concepts Covered: Vector Space, Vector & Matrix Operations, Linear Transformation, Eigenvectors & Eigenvalues
Review: An advanced course on linear algebra. As linear algebra forms the core of vectorization (a process that operates on a set of values at a time instead of a single value) and of a number of data science algorithms, it is a good idea to have a deeper understanding of the topic. This comprehensive course takes learners through all the fundamental concepts and operations in linear algebra and is definitely a good place to build up mathematics competency. One downside is the use of MATLAB in this course: students have to learn a new programming language on top of the already content-heavy linear algebra concepts. I do not find this approach efficient, as it might get in the way of learning the main concepts. However, some might find translating theory into computer programs useful. Highly recommended for serious learners, and while you are at it, get the verified certificate to add to your resume. I always advocate getting certification from established universities, and this one is from The University of Texas at Austin.
Remarks: A special mention to Khan Academy, a great free resource for learning mathematical concepts. Anytime you feel stuck or do not understand a concept, make sure you visit Khan Academy and watch their lectures. They provide clear visual explanations of the concepts mentioned above and are a great alternative way to learn math. One tip for learning calculus and linear algebra is to go through the concepts introduced in the ‘Mathematics for Machine Learning’ series and fill in your gaps using Khan Academy.
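To make the idea of vectorization concrete, here is a minimal sketch in Python with NumPy. Note the assumptions: the course itself uses MATLAB, NumPy is our own choice for illustration, and the data and function names below are hypothetical. The sketch computes the same weighted sum twice, once value by value in a loop and once as a single dot product over whole arrays.

```python
import numpy as np

# Hypothetical data for illustration only: weekly study hours and weights.
hours = [6.0, 8.0, 10.0, 7.5]
weights = [0.1, 0.2, 0.3, 0.4]


def weighted_sum_loop(values, coeffs):
    """Element-by-element version: one multiplication per loop step."""
    total = 0.0
    for v, c in zip(values, coeffs):
        total += v * c
    return total


def weighted_sum_vectorized(values, coeffs):
    """Vectorized version: a single dot product over the whole arrays."""
    return float(np.dot(np.asarray(values), np.asarray(coeffs)))


loop_result = weighted_sum_loop(hours, weights)
vec_result = weighted_sum_vectorized(hours, weights)

# Both approaches agree up to floating-point rounding.
print(np.isclose(loop_result, vec_result))
```

The dot product here is exactly the kind of linear algebra operation the course covers, which is why a solid grasp of vectors and matrices pays off when you later write vectorized data science code.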
Programming Courses
9. Complete Python Bootcamp: Go from zero to hero in Python 3 (Udemy) [Paid]
- Ratings: Udemy – 4.5 (216,531)
- Estimated Workload: 185 lectures with 24 hours worth of videos
- Target Audience: Beginner
- Concepts Covered: Data Types, Data Structures, Flow Control, Functions, Object-oriented Programming, Modules & Packages, Error Handling, Decorators, Generators
Review: A relatively complete course for learning Python. It covers the fundamentals and some advanced modules using short videos, quizzes, exercises in a built-in IDE, and Jupyter notebook projects. The Jupyter notebook projects are a nice addition: the notebook is arguably the most common tool for data scientists, and exposing students to that environment helps prepare them for future courses. The lectures are clear, comprehensive and beginner-friendly, as Jose explains concepts in a simple, step-by-step manner. Recommended for beginners with no prior experience in any programming language. If you already have some programming experience, the Python documentation is a great resource to get you started in the language.
10. Introduction to Computer Science and Programming Using Python (edX) [Free, $75 for Cert]
- Ratings: Class Central – 4.5 (122)
- Estimated Workload: 14-16 hours/week for 9 weeks
- Target Audience: Beginner
- Concepts Covered: Data Types, I/O, Flow Control, Simple Algorithms, Functions, Recursive Functions, Testing/Debugging, Error Handling, Object-oriented Programming, Algorithmic Complexity, Visualisation
Review: Another MIT course that makes the list, and with good reason. Apart from the usual video lectures, this course sets itself apart from other MOOCs with well-thought-out weekly problem sets that are both challenging and engaging. If you are used to the classroom style of learning, this course is perfect for you. You will have lectures, lab exercises, midterms and exams, which is as close as you can get to a real classroom setting. Even though this is advertised as an introductory course, the learning curve might be steep for someone without prior programming knowledge. Hence, we recommend that beginners pick up some basic programming knowledge before attempting this course.
As this is an instructor-led course, assignments and lectures are released weekly and submission deadlines are tight, so make sure you are able to commit for the whole time frame. As of this writing, the next iteration starts on 23 Jan 2020. Highly recommended for everyone stepping into the world of computer science. Again, purchase the verified certificate if possible, especially if you do not have prior formal education in computer science.
11. Introduction to Computational Thinking and Data Science (edX) [Free, $75 for Cert]
- Ratings: Class Central – 4.4 (30)
- Estimated Workload: 14-16 hours/week for 9 weeks
- Target Audience: Intermediate
- Concepts Covered: Optimization Algorithms, Recursive Functions, Dynamic Programming, Graph Theory, Visualisation, Monte Carlo Simulation, Experimental Data, Introduction to Machine Learning
Review: A continuation of ‘Introduction to Computer Science and Programming Using Python’, this course builds on its concepts and introduces computational thinking and more complex algorithms. The delivery is similar to the previous course, and it helps to translate programming knowledge into data science algorithms. Although it is not required, we recommend taking these 2 courses in sequence. This course isn’t mandatory, but it is recommended for beginners without formal computer science education, as the verified certificates lend credibility to your programming competency. Holding verified certificates for both courses qualifies you for the MITx ‘Computational Thinking using Python’ XSeries certificate.
Visualization Courses
12. Tableau 10 A-Z: Hands-On Tableau Training for Data Science (Udemy) [Paid]
- Ratings: Udemy – 4.6 (36,808)
- Estimated Workload: 70 lectures with 7 hours worth of videos
- Target Audience: Beginner
- Concepts Covered: Common Chart Types, Dashboard, Table Calculation, Data Preparation
Review: A visualization A-Z course by Kirill Eremenko. Kirill is known for his A-Z series on data science concepts, as his content is comprehensive and explains complex ideas in a simple-to-grasp manner. This is an easy-to-follow tutorial on using Tableau for visualization and dashboards. Tableau is one of the most popular visualization tools in the industry, so definitely take this course as part of your learning journey. Recommended for learners who want to expand their skill set and competency in industry tools.
13. Data Visualization with Python (Coursera) [Free, $39 for Cert]
- Ratings: Coursera – 4.6 (4,969), Class Central – 4 (1)
- Estimated Workload: Approx. 10 hours
- Target Audience: Beginner
- Concepts Covered: Matplotlib/Seaborn/Folium, Common Chart Types, Geo-spatial Visualization
Review: A course focusing on Python visualization libraries. It introduces the two most common visualization packages in Python, namely matplotlib and seaborn. It gives great exposure to what these libraries can do and helps you start to appreciate the power of visualization as a data science tool. Recommended for beginners with no exposure to these libraries. As with other Coursera courses, learners have to purchase the course to access the full content and certification.
Machine Learning Courses
14. Machine Learning (Coursera) [Free, $80 for Cert]
- Ratings: Coursera – 4.9 (123,171), Class Central – 4.8 (352)
- Estimated Workload: Approx. 56 hours
- Target Audience: Intermediate
- Concepts Covered: Supervised/Unsupervised Learning, Linear Regression, Logistic Regression, Regularization, Neural Network, Model Selection, Support Vector Machine, K-means, Dimensionality Reduction, Anomaly Detection, Recommender System
Review: The most highly recommended course in the data science community. Machine Learning is taught by Andrew Ng, a prominent figure in AI and a co-founder of Coursera. It is almost a must for an aspiring data scientist to take this course, as most practitioners in the field recognize its credibility. Andrew Ng uncovers the magic of these machine learning algorithms as he illustrates the math and science behind each one, gives practical use cases and provides well-designed programming assignments. No data scientist should implement an algorithm without understanding the math behind it, and this explains the popularity of the course: Andrew Ng instills confidence and competence in machine learning concepts. Apart from algorithms, Andrew also shares knowledge from his own experience of machine learning projects, making this one of the most comprehensive machine learning courses currently available.
The only downside is the dated use of MATLAB as the programming language for the assignments. In the current climate, Python would be a much better choice, as it has some of the best machine learning and AI libraries and communities. However, given the popularity of this course, multiple Python implementations of the programming assignments are available on GitHub, and you can use these as a reference. Highly recommended for everyone. This course is free and you have free access to all content and assignments. However, the certificate is only available upon purchase.
15. Introduction to Machine Learning for Coders (Fastai) [Free]
- Ratings: Class Central – 4 (1)
- Estimated Workload: Approx. 20 hours
- Target Audience: Intermediate
- Concepts Covered: Random Forest, Model Validation, Feature Selection, Forest Interpretation, Logistic Regression, Regularization, Natural Language Processing, Embedding, Ethical Issues
Review: A different approach from Andrew Ng’s Machine Learning course. Jeremy Howard of fastai takes the practical path and introduces concepts through projects and Jupyter notebooks. Depending on your learning style, you might prefer one over the other, but I would suggest going for Andrew Ng’s Machine Learning course before taking fastai. The reason is that Andrew Ng introduces more general algorithms and provides a better overview, while fastai focuses on only a few complex but more powerful techniques. Regardless, this course is definitely worth checking out, as it is totally free of charge, though it comes without any certification. Jeremy also teaches industry best practices, which I have found extremely useful in my daily practice.
This course uses the fastai library developed by Jeremy and his team, which incorporates some of the best practices as default parameters in its models. However, fastai might not be as established as other, more popular data science frameworks, and fewer companies use it in their workflow. Regardless, this is another highly recommended machine learning course that I strongly advise taking.
Here are all the courses needed to become a data scientist. What are you waiting for? Click on one of the links and get started. Our data scientist took a year. How long will you take?
What else do you need to become a Data Scientist? Find out in our massive guide to becoming one in 2020!