Data Science with Python

Related: Data Science with R

Getting started? Choose one guide:

  1. kdnuggets
  2. analyticsvidhya

We’ll break down data science activities into four parts: getting and cleaning data, exploratory data analysis, machine learning, and data infrastructure engineering.

Exploratory Data Analysis

Amazingly, exploratory data analysis is so powerful that it alone is often enough to produce tremendous value.
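As a rough illustration of a first exploratory pass, here is a minimal sketch using pandas. The data and the column names (`region`, `sales`) are made up for this example and do not come from any of the linked guides.

```python
import pandas as pd

# Hypothetical sample data; the column names are illustrative only.
df = pd.DataFrame({
    "region": ["north", "south", "north", "south", "north"],
    "sales": [120.0, 95.0, 130.0, 80.0, 110.0],
})

# Summary statistics (count, mean, std, quartiles) for the numeric column.
summary = df["sales"].describe()

# Mean sales per region, a typical first grouping question.
mean_by_region = df.groupby("region")["sales"].mean()

print(summary)
print(mean_by_region)
```

Even two calls like `describe()` and `groupby()` often surface outliers or group differences worth a closer look before any modeling.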

Getting and Cleaning Data

For this particular task, pandas (together with NumPy) is indispensable. Besides providing a convenient DataFrame structure, it handles time series data very efficiently.
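To make the time series point concrete, here is a small sketch that fills a missing reading and then downsamples. The daily readings are invented for the example, and linear interpolation is just one possible way to fill the gap.

```python
import numpy as np
import pandas as pd

# Hypothetical daily readings with one missing value (illustrative data).
index = pd.date_range("2016-01-01", periods=6, freq="D")
readings = pd.Series([1.0, np.nan, 3.0, 4.0, 5.0, 6.0], index=index)

# Fill the gap by linear interpolation between neighboring values.
cleaned = readings.interpolate()

# Downsample to two-day bins and take the mean of each bin.
two_day_mean = cleaned.resample("2D").mean()

print(cleaned)
print(two_day_mean)
```

Because the index is a `DatetimeIndex`, resampling needs no manual date arithmetic, which is much of what makes pandas pleasant for this kind of cleaning.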

Other tools suitable for getting/cleaning data:

Also see: Tidy Data in Python
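For a quick taste of what tidying looks like in pandas, the sketch below reshapes a wide table into one row per observation with `melt`. The table and its column names are hypothetical and are not taken from the linked article.

```python
import pandas as pd

# Hypothetical wide table: one column per year (illustrative data).
wide = pd.DataFrame({
    "city": ["A", "B"],
    "2015": [10, 20],
    "2016": [11, 22],
})

# Reshape to a tidy layout: one row per (city, year) observation.
tidy = wide.melt(id_vars="city", var_name="year", value_name="value")

print(tidy)
```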

Machine Learning

In Python, the most popular tool for Machine Learning is still scikit-learn, but it’s certainly not the only powerful tool:
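To give a feel for the scikit-learn workflow, here is a minimal sketch that trains a classifier on the iris dataset bundled with the library. The choice of logistic regression, the 25% test split, and `random_state=0` are arbitrary assumptions made for the example.

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Load the bundled iris dataset as feature matrix X and label vector y.
X, y = load_iris(return_X_y=True)

# Hold out a quarter of the rows for evaluation, keeping class balance.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

# Fit a simple classifier; max_iter is raised so the solver converges.
model = LogisticRegression(max_iter=1000)
model.fit(X_train, y_train)

# Mean accuracy on the held-out rows.
accuracy = model.score(X_test, y_test)
print(f"Test accuracy: {accuracy:.2f}")
```

Most scikit-learn estimators share the same `fit` and `score` interface, so swapping in a different model usually means changing only the line that constructs it.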

See also: Top 20 Python ML Projects 2016

Learn with scikit-learn: Cheat sheet

Examples of use cases:

Data Infrastructure Engineering



Resources to learn Python

  2. Introduction to Python for Econometrics, Statistics and Data Analysis
  3. Yhat blog: moving from R to Python
  4. Python for Informatics
