Data Management (academic year 2024/2025)



For whom is this course. This 6 credits course is for the students of the Master of Science in Engineering of Computer Science (School of Engineering) of the Sapienza Università di Roma. This course is also for students of the Corso di Laurea Magistrale in Ingegneria Gestionale of the same School. The language for both the course and the exam is English. The lectures are held in the second semester (February 2025 - May 2026).
Prerequisites. A good knowledge of the fundamentals of Programming Structures (algorithms and data structures), Programming Languages, Databases (SQL, relational data model, Entity-Relationship data model, conceptual and logical database design), Theoretical Computer Science (computational complexity, computability) is required.
Course goals. The course presents the basic concepts of data management systems. Several major issues related to the theory and the design of data management systems are covered, including NoSQL databases, transaction management, concurrency control, recovery, file and index organizations, query processing.


  • News
    • April 30, 2025. The exam of the summer session will held on 5/6/2025 at 2pm in classroom 204MP, on 11/7/2025 at 9am in classroom 204MP and on 17/9/2025 at 9am in classroom 204MP. Who do not register within the deadline will not be accepted at the exam and there will be no exception to this rule.
    • February 22, 2025. The course will start on February 27, 2025.
  • Tutor: Roberto Delfino (delfino@diag.uniroma1.it)
  • Lecture material
    • M. Lenzerini, Lecture notes (slides to download)
      Students can download the course slides by accessing the MOODLE system at this page.
      The slides will be available during the lecture period. Please, note that all students of Sapienza can access the MOODLE system by using the user name and the password of the university
    • R. Ramakrishnan, J. Gehrke. Database Management Systems. McGraw-Hill, 2004
    • Students willing to read more about concurrency control can freely download an excellent book from the following site: http://research.microsoft.com/en-us/people/philbe/ccontrol.aspx
      If one is looking for a more modern book, which is not free, a good suggestion is:
      Gerhard Weikum, Gottfried Vossen, "Transactional Information Systems: Theory, Algorithms, and the Practice of Concurrency Control and Recovery", The Morgan Kaufmann Series in Data Management Systems.

  • Lectures
    • Monday: [1:00pm - 3:00pm], via Eudossiana 18 (RM041), classroom 41,
    • Thursday: [10:00am - 12:00pm], via Eudossiana 18 (RM041), classroom 41,
    • Thursday: [12:00pm - 01:00pm], via Eudossiana 18 (RM041), classroom 41.

    The lectures can also be attended on-line, following this link:
    https://uniroma1.zoom.us/j/83220889311?pwd=ZkxrQ3crNFJDVlIwS21jelp4bjRXZ...

    The recording of the on-line lectures will be available in the MOODLE page of the course.

    Week Monday (1:00pm - 3:00pm) classroom 41 Thursday (10:00pm - 12:00pm) classroom 41 Thursday (12:00pm - 01:00pm) classroom 41
    01 (Feb 24)
    -------
    Lectures 1,2
    - Course overview
    - Relational data model
    - Recap of relational algebra
    Lectures 3
    - Exercises on relational algebra
    02 (Mar 03) Lectures 4,5
    - Recap on SQL
    - Exercises on SQL
    Lectures 6,7
    - Windows in SQL
    - Data Warehousing: introduction and architectures
    Lectures 8
    - OLTP vs OLAP
    03 (Mar 10)
    -------
    Lectures 9,10
    - OLAP operators
    Lectures 11
    - Conceptual modeling of multidimensional data
    04 (Mar 17) Lectures 12,13
    - Logical design of multidimensional modeling
    Lectures 14,15
    - Exercises on Data Warehousing
    - The tableaux system for OLAP
    Lectures 16
    - Illustrations of projects for the exam (Dott. Roberto Delfino)
    05 (Mar 24) Lectures 17,18
    - NoSQL databases
    - Graph-oriented databases
    Lectures 19,20
    - Graph databases
    - The Neo4j system
    Lectures 21
    - RDF databases
    06 (Mar 31) Lectures 22,23
    - The SPARQL query language
    Lectures 24,25
    - Introduction to Knowledge Graphs
    - RDFS-based Knowledge Graphs: syntax
    Lectures 26
    - RDFS-based Knowledge Graphs: semantics
    07 (Apr 07)
    -------
    Lectures 27,28
    - Document databases
    Lectures 29
    - The MongoDB system
    08 (Apr 14) Lectures 30,31
    - RDFS-based Knowledge Graphs: inference
    -------
    -------
    09 (Apr 21)
    -------
    Lectures 32,33
    - Buffer management
    - Transactions and concurrency
    - Serializability
    Lectures 34
    - View-serializability
    10 (Apr 28) Lectures 35,36
    - Conflict-serializability
    -------
    -------
    11 (May 05) Lectures 37,38
    - Concurrency control based on locking
    Lectures 39,40
    - Deadlock management
    - The notion of recoverability
    Lectures 41
    - Exercises on concurrency control
    12 (May 12) Lectures 42,43
    - Concurrency control based on timestamps
    - Concurrency control in SQL
    - Simple file organizations
    Lectures 44,45
    - Sorting in secondary storage
    - Types of indexes
    Lectures 46
    - Sorted indexes
    13 (May 19) Lectures 47,48
    - ISAM indexes
    - B+ tree indexes
    Lectures 49,50
    - Query evaluation: introduction
    - One pass algorithms
    - Nested loop algorithms
    Lectures 51
    - Two pass algorithms based on sorting
    14 (May 26) Lectures 52,53
    - Two pass algorithms based on hashing
    - Multipass algorithms
    Lectures 54,55,56 (28/5/2025)
    - Exercises on concurrency and access file

    Lectures 57,58,59 (29/5/2025)
    - Index-based algorithms
    - Parallel algorithms
    - Exercises on evaluation of operators

  • Topics covered (programma d'esame)
    • 1. Recap on databases and SQL
    • 2. Data warehousing: architectures, multidimensional modeling, OLAP
    • 3. Data Warehouse design methodologies; The DFM model; The logical models for DW: the star schema, the snowflake schema
    • 4. NoSQL data models and systems: the case of graph databases; The Neo4j system
    • 5. The notion of Knowledge Graph; RDFS Knowledge Graphs:syntax, semantics, inference
    • 6. NoSQL data models and systems: the case of Document databases; The MongoDB system
    • 7. The structure of a Data Management System (DBMS)
    • 8. Buffer management: buffer pool, replacement strategies, operations on the buffer
    • 9. Concurrency management: The concept of transaction; The notion of serializability; Concurrency management strategies; Concurrency control in SQL and PostgreSQL; Recovery: Crash management; Classification of failures; Recovery strategies
    • 10. Physical structures for data management: Record and page organizations, Simple file organizations, Indexed file organizations
    • 11. Query processing: evaluation of relational algebra operators
  • Exam
    • Each student can choose between two options:
      • OPTION 1: doing only the written exam; in this case the written exam is full and the mark is between 18 and 30, as usual.
      • OPTION 2: doing both the written exam and the project. In this case,
        • the written exam is shortened wrt to OPTION 1, the maximum mark is 24 and the minimum mark for passing the exam is 15,

        • for the project, the maximum mark is 8 and the minimum mark for passing the project is 4.

        The final mark will be the sum of the two marks.

      The procedure for choosing between the two options is easy: if a student chooses option 2, then (s)he must contact the tutor and tell him about the decision of choosing OPTION 2 before booking in INFOSTUD for the written exam, and even before formulating the project proposal; if a student chooses option 1, then (s)he simply has to book for the written exam without having contacted the tutor for the step described above.
      Notice that there is no deadline for choosing OPTION 2; notice also that booking for the exam without having informed the tutor about the decision of opting for OPTION 2 implies that the student has opted for OPTION 1.

    • Past written exams: you can have a look at the texts of past exams
    • To book for the written exam: Please, follow the on-line booking procedure.
    • Schedule of written exams (decided by the administration office, not by the professor):
      • First written exam: June 2025
      • Second written exam: July 20245/li>
      • Third written exam: September 2025
      • Special exam session (only for "fuori corso" or "part-time" students): October 2025
      • Fourth written exam: January 2026
      • Fifth written exam: February 2026
      • Second special written exam session (only for "fuori corso" and "part-time" students): April 2026
  • Projects

    Projects can be presented at any time, before, or even AFTER the written exam. The exam will be formally registered when both the project presentation and the written exam will be carried out. When both parts of the exam will be successfully completed, the student must wait for the first exam session ("appello") available, and book in INFOSTUD in order to have the exam formally registered.

    The procedure for the proposal, the assignment, the preparation and the discussion of a project is described in this document.

  • Information about course evaluation by students
    • Data about the evaluation of the course by students of the previous editions are available in the home pages of the corresponding editions (see below). Data about the 2024-2025 edition will be posted here as soon as they are available.
  • Information on past editions of this course
  • Office hours. Tuesday, 5:00 pm at the Meet room at https://meet.google.com/hzy-save-oqw -- please, look at the last minute news for the next office hours.