Digital - Information Systems

27260047 - Big Data Storage and Management

Degree level
ECTS credits: 6
Total hours: 40
Lecture hours (CM): 40

Course leaders

  • VARGAS SOLAR Genoveva

Objectives

This course explores the design and use of data warehouses, data lakes, and lakehouses to support business intelligence and decision-making. Students learn modeling, integration, and querying techniques through hands-on labs and projects. Topics include ETL/ELT, OLAP, cloud deployment, and data governance. Emphasis is placed on user needs, system architecture, and analytical performance. The course culminates in a final project in which students build and present a complete BI solution.

Estimated personal workload (outside class): 10 hours

TARGETED KNOWLEDGE AND SKILLS 

In terms of knowledge, the module enables students:
1. To understand the definition, structure, and purpose of data warehouses, data marts, other forms of storage, and dimensional modeling;
2. To define the key steps of data integration and understand the added value of specialized tools compared to conventional programming languages;
3. To grasp how data access and analysis tools can empower end users.

Following this course, the acquired skills will include:
· Defining and implementing standards, methods, tools, and procedures that meet business requirements
· Continuously adapting to evolving technological environments
· Recommending or negotiating IT solutions across administrative, industrial, scientific, and technical domains

Content

COURSE OUTLINE

Module 1: Introduction to Business Intelligence 
Module 2: Data Warehousing 
Module 3: Data Lakes & Lakehouses 
Module 4: BI in the Cloud 
Module 5: Final Project – BI Demo Fest 
▪ BI Project Management 
▪ BI Lifecycle
▪ Final Presentation & Demo 
 

Bibliography

PRESCRIBED TEXTS AND PUBLICATIONS

1. Ralph Kimball, Laura Reeves, Margy Ross, Warren Thornthwaite (2006). The Data Warehouse Lifecycle Toolkit, 2nd Edition, Wiley.
2. Efraim Turban, Ramesh Sharda, Dursun Delen, David King (2010). Business Intelligence: A Managerial Approach, 2nd Edition, Prentice Hall. 
3. Paulraj Ponniah (2010). Data Warehousing Fundamentals for IT Professionals, 2nd Edition, Wiley.

EMBLEMATIC BOOKS OR RESEARCH PAPERS REGARDING THE SUBJECT OF THE COURSE

1. Michalczyk, S., Nadj, M., Azarfar, D., Maedche, A., & Gröger, C. (2020). A state-of-the-art overview and future research avenues of self-service business intelligence and analytics.
2. Abu-AlSondos, I. (2023). The impact of business intelligence system (BIS) on quality of strategic decision-making. International Journal of Data and Network Science, 7(4), 1901-1912.

Assessment

Individual grade
In-class exam, 2 h

Other grade(s)
Written Test
Project report and presentation

Weight: 40/60

Additional information

TEACHING METHODS
- Interactive lessons: theoretical contributions accompanied by concrete examples and discussions.
- Group case studies: progressive application of concepts to a specific scenario.

NATURE OF MATERIALS
Lecture slides and a detailed case study.

TEACHING INNOVATIONS AND USE OF TECHNOLOGY

The course introduces a pedagogical innovation by using role-based learning where students adopt different professional perspectives such as data engineers, analysts, and decision-makers. This approach helps learners better understand how data systems serve various business needs and encourages collaboration and communication within diverse teams. Another key aspect is project-based learning. Throughout the course, students work on a realistic case study that guides them through designing and implementing a complete business intelligence solution. This hands-on method reinforces technical skills while also developing problem-solving and critical thinking abilities.
The course also promotes active engagement through challenge-based assessments. Instead of traditional exams, students solve practical tasks such as SQL query challenges and OLAP exercises. These interactive activities make learning more dynamic and help students receive immediate feedback.
On the technology side, the course uses cloud-based environments and open-source tools to simulate real data ecosystems. Students gain experience with platforms like PostgreSQL, Delta Lake, and BI tools such as Power BI or Superset, which are widely used in industry.
Finally, the course supports reproducibility and transparency by encouraging the use of version-controlled notebooks and data pipelines. Technologies like Jupyter and Git help students document their work clearly and collaborate more effectively, preparing them for real-world data projects.

PRE-REQUISITES IN TERMS OF KNOWLEDGE AND SKILLS
Solid knowledge of system design and modeling, database management systems, and the SQL language.

ADVISED PRIOR READING
https://fr.wikipedia.org/wiki/Entrep%C3%B4t_de_donn%C3%A9es

Programs that include this course