Hitchhiker's Guide to Data Science

About this content

This material collects several short courses that introduce popular data analysis technologies through worked examples and solutions to mathematical problems. It is aimed at people who are interested in getting started with data science and big data technologies.

About this material

The material presented here was developed with free software. Most of the code in the notebooks is original and was written for academic and training purposes.

Software used to develop the content:

Apache Spark 2.0.1 https://spark.apache.org/
Jupyter notebooks http://jupyter.org/
Python 3 https://www.python.org/download/releases/3.0/
Docker https://hub.docker.com/r/jupyter/all-spark-notebook/

Python

Introduction to Python Get notebook
Variables Get notebook
Basic Operators Get notebook
Control Statements Get notebook
Functions Get notebook
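
As a rough taste of the topics listed above, the following minimal sketch combines variables, basic operators, a control statement and a function in a few lines of Python 3. It is not taken from the notebooks; the names `readings`, `threshold` and `mean` are hypothetical and chosen only for illustration.

```python
# Illustrative sketch only; not part of the course notebooks.

# Variables: a list of sample values and a cut-off (both made up).
readings = [3, 8, 15, 4]
threshold = 5


# Functions: a small helper that returns the arithmetic mean.
def mean(values):
    """Return the arithmetic mean of a non-empty list of numbers."""
    return sum(values) / len(values)


# Control statements and basic operators: keep the values above the cut-off.
above = []
for value in readings:
    if value > threshold:
        above.append(value)

average = mean(readings)
print(above)
print(average)
```

The notebooks cover each of these topics separately and in more depth.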

PySpark

Spark Introduction Get notebook
Distributed operations with RDDs Get notebook
DataFrames and Relational Operations Get notebook