Discipline(s): Computer Science and Telecommunications

Cloud and Big Data

Semester: 2
Type: Elective
Nature: Teaching unit (UE)

Objectives

The goal of this lecture is to serve as a first step towards exploring the Hadoop platform and to provide a short introduction to working with big data in Hadoop. An overview of Big Data will be presented, including definitions, the sources of Big Data, and the main challenges it introduces. We will then present MapReduce, an important programming model for Big Data processing in the Cloud. The Hadoop ecosystem and some of its major features will then be discussed. Finally, we will discuss several approaches and methods used to optimise the performance of Hadoop in the Cloud.

Several hands-on sessions may be provided to study the operation of the Hadoop platform, along with the implementation of MapReduce applications.
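To give a concrete idea of what such a MapReduce application looks like, below is a minimal sketch of the classic word-count job written against the standard Hadoop MapReduce Java API. The class names, job name, and input/output paths are illustrative only and may differ from the actual hands-on material.

```java
// Minimal word-count sketch using the Hadoop MapReduce API (org.apache.hadoop.mapreduce).
// Class names and paths are illustrative; the actual lab assignment may differ.
import java.io.IOException;
import java.util.StringTokenizer;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class WordCount {

  // Map phase: emit (word, 1) for every token in the input split.
  public static class TokenizerMapper
      extends Mapper<Object, Text, Text, IntWritable> {
    private static final IntWritable ONE = new IntWritable(1);
    private final Text word = new Text();

    @Override
    public void map(Object key, Text value, Context context)
        throws IOException, InterruptedException {
      StringTokenizer itr = new StringTokenizer(value.toString());
      while (itr.hasMoreTokens()) {
        word.set(itr.nextToken());
        context.write(word, ONE);
      }
    }
  }

  // Reduce phase: sum the counts gathered for each word after the shuffle.
  public static class IntSumReducer
      extends Reducer<Text, IntWritable, Text, IntWritable> {
    private final IntWritable result = new IntWritable();

    @Override
    public void reduce(Text key, Iterable<IntWritable> values, Context context)
        throws IOException, InterruptedException {
      int sum = 0;
      for (IntWritable val : values) {
        sum += val.get();
      }
      result.set(sum);
      context.write(key, result);
    }
  }

  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    Job job = Job.getInstance(conf, "word count");
    job.setJarByClass(WordCount.class);
    job.setMapperClass(TokenizerMapper.class);
    job.setCombinerClass(IntSumReducer.class);  // local pre-aggregation on the map side
    job.setReducerClass(IntSumReducer.class);
    job.setOutputKeyClass(Text.class);
    job.setOutputValueClass(IntWritable.class);
    FileInputFormat.addInputPath(job, new Path(args[0]));
    FileOutputFormat.setOutputPath(job, new Path(args[1]));
    System.exit(job.waitForCompletion(true) ? 0 : 1);
  }
}
```

Such a job is typically packaged as a JAR and submitted to the cluster with the hadoop jar command, with the map and reduce tasks distributed across the nodes that hold the input splits.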

Content

Cloud Part

1. Introduction to cloud computing: definitions, principles, usages, the cloud stack, limitations
2. Virtualization: definitions, different virtualization techniques, the virtual machine from a systems perspective, migration, suspend/resume
3. IaaS-level management: roles, functionalities, resource management, scheduling, monitoring
4. PaaS-level management: elasticity, SaaS, SLAs, case study of flash crowds
5. Different cloud architectures: case studies (OpenNebula, OpenStack), data management, network management
6. Hot topics: green computing, security, distributed clouds, mobile cloud, hybrid clouds, federations, performance analysis, economic models


Big Data Part

Data volumes are ever growing, across a broad application spectrum ranging from traditional database applications and scientific simulations to emerging applications such as Web 2.0 and online social networks. To cope with this growing weight of Big Data, we have recently witnessed a paradigm shift in the way data is processed, through the MapReduce model. First promoted by Google, MapReduce has become, thanks to the popularity of its open-source implementation Hadoop, the de facto programming paradigm for Big Data processing in large-scale data centers and clouds.

Updated on 13 April 2018