December 3, 2020

5 Data Science Books Every Data Scientist Should Know


Most blogs recommend online courses or interactive programming platforms for learning data science. But what if you prefer learning from a book? They say there are more treasures in a book than in a pirate’s loot, so why not find your treasures there? Here, we list the 5 most recommended data science books that every data scientist should own.


Disclosure: Some of the links below are affiliate links, meaning, at no additional cost to you, Self Learn Data Science will earn a commission if you click through and make a purchase.


Hands down the best book any data scientist can read. I have read the whole book from cover to cover not once, not twice, but thrice. ‘An Introduction to Statistical Learning’ not only serves as a great introductory text for beginners but also functions as a reference book for more experienced data scientists. This book introduces the logic and rationale behind each algorithm using college-level math, so that anyone with some math background can get right into it. You should be able to understand 80% of the content with just college-level math.

Apart from exposing readers to the different statistical learning models, this book also covers concepts that are paramount for a data scientist. Some examples are the bias-variance trade-off, evaluation of model performance, and re-sampling methods. You should have a better understanding and appreciation of the work done by data scientists after reading this.
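
To give a flavour of what a re-sampling method looks like, here is a minimal sketch of k-fold cross-validation in plain Python. It is not taken from the book (whose labs are in R); the function names `k_fold_splits`, `fit_line` and `cross_val_mse` and the toy data are assumptions made purely for illustration.

```python
import random
from statistics import mean


def k_fold_splits(n_samples, k, seed=0):
    """Yield (train_idx, test_idx) pairs for k-fold cross-validation."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Deal the shuffled indices into k folds of near-equal size.
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        test_idx = folds[i]
        train_idx = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield train_idx, test_idx


def fit_line(xs, ys):
    """Least-squares fit of y = a + b * x; returns (a, b)."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    b = sxy / sxx
    return y_bar - b * x_bar, b


def cross_val_mse(xs, ys, k=5):
    """Average held-out mean squared error of the line fit across k folds."""
    fold_errors = []
    for train_idx, test_idx in k_fold_splits(len(xs), k):
        a, b = fit_line([xs[i] for i in train_idx], [ys[i] for i in train_idx])
        fold_errors.append(mean((ys[i] - (a + b * xs[i])) ** 2 for i in test_idx))
    return mean(fold_errors)


# Hypothetical toy data: a straight line plus Gaussian noise.
rng = random.Random(42)
xs = [float(i) for i in range(20)]
ys = [2.0 * x + 1.0 + rng.gauss(0, 1) for x in xs]
print(cross_val_mse(xs, ys, k=5))
```

Each observation is held out exactly once, so the averaged error estimates how the model performs on data it has not seen, which is the idea behind evaluating model performance with re-sampling.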

Although the programming assignments and labs are in R, do not be put off if you are not familiar with the language. You can skip the Lab and Assignment sections completely without missing out on any content. The book stands on its theory and concepts alone, and the labs are a happy bonus if you are learning R!

What can you expect:

  1. What is Statistical Learning
  2. Linear Regression
  3. Classification
  4. Linear Model Selection and Regularization
  5. Tree-based Models
  6. Support Vector Machines
  7. Unsupervised Learning

View in Amazon

If ‘Introduction to Statistical Learning’ is the bible for mathematical concepts, ‘Hands-On Machine Learning’ is the equivalent for their application in Python. Using the most common Python libraries, this book teaches readers the whole end-to-end process of a machine learning project in code. The book also covers advanced topics such as neural networks and deep learning. Perfect for anyone looking to turn what they learned from courses or the classroom into actual code.

If you have zero knowledge of machine learning, fret not. This book also contains a complete walk-through of the common machine learning algorithms, albeit in less depth than ‘Introduction to Statistical Learning’. If you like a structured form of learning, you will love this book, as it follows a theory -> hands-on -> exercise format that gives you a classroom-like setting.

This is the 2nd edition of the title and it is a great improvement over the 1st. It now includes Keras, a high-level deep learning API that makes building neural networks faster and simpler than working with TensorFlow directly. I think the reliance on lower-level TensorFlow was a major downside of the 1st edition that turned beginners away.

What can you expect:

  1. Machine Learning Landscape
  2. End-to-end Machine Learning Project
  3. Regression, Classification
  4. Neural Networks and Deep Learning
  5. ANN, CNN, RNN
  6. Reinforcement Learning


View in Amazon

Another great book for those who want to learn Python for data science. ‘Python for Data Analysis’ teaches libraries that are more fundamental than those in ‘Hands-On Machine Learning’, but just as essential, if not more so. This book is filled with coding examples and snippets which you can save on your local computer and reuse whenever you are faced with similar problems. Very useful, as even the most experienced data scientist would not remember every function.

A bonus for beginners: installation and setup guides are included in this book. It also introduces some of the more common IDEs (Integrated Development Environments) for data science, such as Jupyter Notebook.

What can you expect:

  1. Basics of Python
  2. Data structures
  3. Functions
  4. NumPy
  5. Pandas
  6. Matplotlib

View in Amazon

This is the book I went to great lengths to obtain a physical copy of. ‘The Hundred-Page Machine Learning Book’ by Andriy Burkov has received plenty of positive reviews and is said to be the best machine learning book you can get your hands on. Being the FOMO (Fear of Missing Out) person I am, I rushed to get it from Amazon, and it did not disappoint.

In just 100+ pages, this book covers all types of machine learning algorithms and best practices. The math notation used is beginner-friendly but still provides enough depth. The explanation for each concept is concise, elegant, and straight to the point. Andriy wastes no words here, and every sentence is packed with knowledge. The book is also thin and light enough to carry everywhere as a reference. Very useful.

At only $40 for the paperback version, this is a great investment for beginners and experts alike.

What can you expect:

  1. Probability and Statistics
  2. Regression
  3. Decision Tree
  4. Support Vector Machine
  5. K-nearest neighbors
  6. Neural Networks and Deep Learning
  7. End-to-end machine learning process

View in Amazon

To be frank, ‘Deep Learning’ frightened me. Apart from deep learning, as indicated in the title, this book works its way up to the subject from applied math and machine learning basics. However, these aren’t as basic as you might think. Advanced math topics are included in the book, written in advanced math notation. If you do not have advanced math knowledge, you will be put off by the very first chapter (that’s a lot of ‘advanced’ for this advanced book).

However, this is also the reason this book is highly recommended. Apart from the fact that Ian Goodfellow is a recognized figure in machine learning research, this book gives deep learning practitioners an in-depth explanation of, and background to, the field. Highly recommended for learners who wish to go into research. As for beginners, brush up on your math and finish the other books on this list before attempting this beast.

What can you expect:

  1. Linear Algebra
  2. Probability and Information Theory
  3. Numerical Computation
  4. Machine Learning Basics
  5. Deep Feedforward Networks
  6. Convolutional Networks
  7. Sequence Networks
  8. Deep Learning Research

Do you aspire to become a Data Scientist after reading these books? See our step-by-step guide on how you can become one at ‘How to become a Data Scientist in 2020’.