Want to start learning Data Science while consolidating your Python Knowledge? Then this is the perfect book for you.
The following is a review of the book Python Data Science Handbook: Essential Tools for Working with Data by Jake VanderPlas.
Review of Python Data Science Handbook
Python Data Science Handbook is a reference manual and learning resource that teaches its readers the statistical and analysis methods crucial to data science. You will learn how to do Exploratory Data Analysis (one of the most important steps of a Machine Learning project, where we get insights from the data), using the three best-known Python libraries for it: Pandas, NumPy, and Matplotlib.
The only prerequisites for this book are a slight programming background and maybe some statistics knowledge, although the latter can be skipped. You will learn to manipulate, transform, and clean data, visualise it in many cool and meaningful ways, and explore it in order to answer business questions or build your own Machine Learning models.
It includes many examples, applications, and tools of very high practical value. In a few words, it is a book to always have close by when playing with structured data. Let's see what is inside it!
2. Introduction to NumPy: Introduces NumPy's data types, arrays and matrices, and how they differ from normal Python lists; how to manipulate and combine them using different operations; different techniques to sort arrays; broadcasting; and indexing.
3. Data Manipulation with Pandas: Chapter 3 describes the goal and uses of Pandas, Series and DataFrames, operating on and handling data, merges and joins, pivot tables, and how to handle missing values.
4. Visualisation with Matplotlib: This chapter covers most of the plots that we would want to use when analysing data: scatter plots, histograms, density plots, and how to customise them and make them presentation-ready.
5. Machine Learning: The last chapter of the book is dedicated to familiarising the reader with Scikit-Learn (the most used Python library for Machine Learning) and the different Machine Learning algorithms that can be found inside it. It won't make you an expert, but it's a nice first step into this world.
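To give a flavour of the NumPy and Pandas topics listed above, here is a minimal sketch of broadcasting, boolean indexing, sorting, filling missing values, a merge, and a pivot table. It is not taken from the book: the `sales` and `managers` tables, their column names, and all the values are made up for illustration.

```python
import numpy as np
import pandas as pd

# NumPy broadcasting: the 1-D offsets array is applied to every row of the 2-D grid
grid = np.arange(6).reshape(2, 3)        # [[0, 1, 2], [3, 4, 5]]
offsets = np.array([10, 20, 30])
shifted = grid + offsets                 # result has shape (2, 3)

# NumPy boolean indexing and sorting
evens = grid[grid % 2 == 0]              # keeps only the even entries
ordered = np.sort(np.array([3, 1, 2]))   # returns a sorted copy

# Pandas: a small, made-up sales table with one missing revenue value
sales = pd.DataFrame({
    "region": ["north", "north", "south", "south"],
    "quarter": ["Q1", "Q2", "Q1", "Q2"],
    "revenue": [100.0, np.nan, 80.0, 120.0],
})

# Handle the missing value by filling it with the column mean
filled = sales.fillna({"revenue": sales["revenue"].mean()})

# Merge (left join) with a second made-up table on the shared "region" column
managers = pd.DataFrame({"region": ["north", "south"], "manager": ["Ana", "Ben"]})
merged = filled.merge(managers, on="region", how="left")

# Pivot table: one row per region, one column per quarter, summed revenue
table = merged.pivot_table(index="region", columns="quarter",
                           values="revenue", aggfunc="sum")
print(table)
```

Filling with the column mean is only one of several ways to treat missing data; dropping the affected rows with `dropna` is another common choice, and which one fits depends on the dataset.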
Summary of Python Data Science Handbook
Python Data Science Handbook builds upon Python basics (IPython, Jupyter, NumPy, Pandas, and Matplotlib) and, with that knowledge, discusses some important Machine Learning models.
It is a great book for learning how to use the main Python libraries for Data Science and for getting a quick sneak peek at what Machine Learning is. It is also a fantastic resource not just for a first-time read, but as a companion when doing Data Analysis.
For us, the book has only one drawback: the lack of exercises in each section to consolidate what has been learned. It is true that the goal of the text is to provide a handbook for quick reference, but some small tests wouldn't have hurt.
If you get hold of this book and want to practice what you have learned after each chapter, you can go to Dataquest.com. It features a lot of nice coding exercises about NumPy, Visualisation, Pandas, and almost anything you can think of.
Still, it is an awesome book that we recommend having close by. You can buy it on Amazon here:
Python Data Science Handbook
- O'Reilly Media
- VanderPlas, Jake (Author)
- English (Publication Language)
- 548 Pages - 12/20/2016 (Publication Date) - O'Reilly Media (Publisher)
We hope you enjoy it, and that it serves you as well as it has served us. Also, if you want to take a step further, take a look at our Machine Learning books, or if you are looking for more books like this one, take a look at our Data Analysis books section.
Thank you for reading How to Learn Machine Learning and have a wonderful day!