CIS 624               Spring 2008                  Assignment 3 – Data Modeling - Aggregates

 

Assigned: 04/03/08

Due: Start of Class on 04/10/08

 

Task:

Answer the following questions. Your answer should be neat and easy to read.

 

Situation:

Imagine a university that is building a data warehouse around the business process of “registering students”.  They have an existing 3NF Database (design attached) as a source of data. Brief explanation: Each Student may enroll in several sections, each of which can enroll several students. Each section has a room and a professor, while the room can be used for multiple sections, and each professor can teach multiple sections. The section is one of many for a catalog course. The catalog courses can have many prerequisites and be a prerequisite for multiple courses as well.

 

Suppose there are 5000 students, 150 Rooms, 600 Courses, 300 Faculty, Time is kept in Minutes (1 record per minute), Date is kept in Days (1 record per day), and data is to be kept for a five year period. Suppose that the average section has 15 students in it and that there are 3000 sections offered per year.

A)      Ignoring business requirements for this question, if the Student dimension has 5000 students which are in a hierarchy with 50 majors, in 3 schools, in 1 university, what might be a favorable aggregate(s) to build? Why?

Suppose an aggregate is being created for registrations by StudentMajor by Semester in a relational database according to Kimball’s recommendation for RDB aggregate storage. There are 50 majors and 3 semesters per year.

B)      What is the largest number of rows that can be in the aggregate fact table?

I expect that 30 majors will only have enrollments during two of the semesters, while the rest will have enrollments all three semesters each year.

C)      Estimate the amount of disk storage needed for the aggregate fact table. Assume each record will be 20 bytes.

D)      What dimensions will be related to the aggregate fact table.

E)      For each dimension, what attributes will be included?

F)       Why may dimension tables related to an aggregate fact table be “shrunken” in both the number of rows and in the number of columns?