Big data and open data (BIG)
Internal code: BIG
Course syllabus:
Course leader: Professor Patrick BURY
Big Data & Open Data
This course gives a broad overview of the main tools used in the Big Data world. Each lecture is split into two parts: a theoretical part and a hands-on practical part working with data. We introduce cloud providers (Microsoft, Google, Amazon), the main DevOps tools (Docker, GitLab, …), workflows (map-reduce), databases (graph databases, document-oriented databases, …) and tools (TensorFlow). We also give a short introduction to data visualization (dataviz) and statistical interpretation.
Course map:
- Introduction to Big Data
- Open Data: definition, sources, licences
- Big Data definition and anti-definition
- Usual scales in Big Data
- Overview of solution providers
- Big Data Databases
- Introduction, shortcuts and pitfalls in dataviz
- Introduction to Hadoop
- Map-Reduce principles and usage
- AWS offerings
- Azure offerings
- Google offerings
- Sovereign cloud (new for 2020-2021)
- Docker: principles and applications
- GitLab how-to
- Map-Reduce with Azure (subject to change)
- Graph databases
- TensorFlow (new for 2020-2021)