CIS 624                                                   Spring 2001

03/06/01                                                 Midterm Exam                                                        Test Form A

 

Name:

 

Instructions:

    Answer all questions in the answer sheet provided. On Multiple Choice, choose the ONE BEST ANSWER.

    Remember to put the letter of your test form on the top of your answer sheet.

    Hand in Test, Answer Sheet, and Help Sheets, all with you name.

 

Multiple Choice

(2 points each)

 

1. Which of the following is NOT part of data warehouse project planning?

A)     Readiness assessment

B)      Interviewing future users

C)      Determining staffing needs

D)      All of the above

E)       None of the above.

 

2. What products / infrastructure can be important aspects of a data warehousing initiative?

A)     database management system

B)      extract, transform and load tool

C)      server(s)

D)      data access tool, such as an OLAP tool

E)       All of the above

F)       None of the above

 

 

True/False – If False Explain Why!

(3 points each)

 

3. Typically, the biggest justification of a data warehouse project is the time savings for the organization’s business analysts.

 

4. It is important to discuss an interview with teammates and fill in gaps in notes as soon as possible after the interview.

 

5. It is recommended that initial development of a data warehouse should target as wide an area within a company as possible in order to gain organizational allies and supporters.

 

6. A dimension in a star schema could have multiple hierarchies.

 

7. Aggregate data is redundant if transaction data is stored in the data warehouse.

 

8. One advantage of using a star schema design for a data mart is that it makes it easy for data access tools to generate SQL code in order to produce reports.

 

 

Completion (fill in the blank) (mostly key terms (not all of which are a single word))

(3 points each)

 

9. When a business user desires more detailed information that they are currently looking at, they can ________, looking at data divided by more refined attributes or restricted to a particular value for an attribute. For instance, sales by year could be refined to sales by quarter, or to sales of a particular department by year.

 

10. The ________ of a project is the task or tasks that if delayed would delay the completion of the whole project. Project management software can aid in the identification of this.

 

11. In a dimensional model based on a star schema, a(n) ________ table is in the center of all of the other tables, linked to each.

 

12. Some data warehouses have failed because different parts of the warehouse were developed independently, and as it turns out incompatibly. These individual, incompatible data marts have been referred to as ________.

 

13. The _______ room is another way of describing the process of extracting data from source systems and preparing it for the data warehouse.

 

14. ________ is data about data. It is specified by members of the data warehouse team – frequently into a tool that will use it to carry out part of the data warehouse process.

 

 

Short Answer

(points as shown)

 

(5 points)

15. Data warehouses exist separate from the operational systems. Why is this separation desirable? (at least 2 good reasons)

 

(4 points)

16. Why is it recommended that fact tables be at the atomic grain size?

 

(6 points)

17. Briefly explain - once a data mart has been delivered to users, what activities are important for the on-going success of the data mart?

 

(5 points)

18. What is involved in building a business justification for a data warehousing project?

 

(4 points)

19. Why is it important to break the project plan down into small tasks, preferably taking two weeks or less?

 

(8 points)

20. Briefly explain advantages and disadvantages of using a tape recorder during requirements determination interviews.

 

 

Short Answer: Applied

(points as shown)

 

(12 points)

21. Imagine that a music swapping service is building a data mart surrounding the downloading of songs. A song dimension includes an attribute of whether the song is subject to royalty payments or not. If this attribute is modeled as a slowly changing dimension using Kimball et al’s “type 2” method – create a new record after a change, briefly explain how this method yields correct results for both questions such as a) show the number of downloads by royalty status and b) show the number of downloads by musical genre.

 

(8 points)

22. Imagine that an auto insurance company is building a data mart surrounding accident claims. They have determined that the grain size of the fact table will be an individual accident. Dimensions include the insured person, the car, the time, etc. Candidate facts include: total claim amount, speed of insured at time of accident, and weather conditions at time of accident. For each fact, explain how additive it is.

 

(8 points)

23. Critique the following question, asked of a manager of customer service (it isn’t the first question asked):

                “Don’t you think the warehouse should be web-enabled for maximum data availability ?”