CIS 624                                                   Fall 2005

10/31/05                                                 Midterm Exam                                                       Test Form A

 

Name:

 

Instructions:

    Answer all questions in the answer sheet provided. On Multiple Choice, choose the ONE BEST ANSWER.

    Remember to put the letter of your test form on the top of your answer sheet.

    Hand in Test, Answer Sheet, and Help Sheets, all with you name.

 

Multiple Choice

(2 points each)

 

1.        Which of the following is a form of “drill-down” a decision-maker for a university might do?

A)     Move from looking at “credits enrolled for by students” by department to looking at “credits enrolled for by students” by school.

B)      Move from looking at “credits enrolled for by students” by school to looking at “credits enrolled for by students” by semester

C)      Move from looking at “credits enrolled for by students” by school to looking at “credits enrolled for by students” by departments within a school

D)      All of the above

E)       None of the above.

 

2.        In a dimensional model, which of the following is a useful attribute in the Date dimension?

A)     Day of Week

B)      Fiscal Quarter

C)      Season

D)      Month

E)       All of the above

F)       None of the above

 

3.        Source systems data could be in which of the following formats?

A)     flat file

B)      relational database

C)      hierarchical database

D)      file format specified by source system programs

E)       All of the above

 

4.        Why does data staging tend to be a long and involved process?

A)     Because few people have any experience doing it

B)      Because data in source systems needs more cleaning than would be expected

C)      Because you cannot use a database for the staging area

D)      All of the above

E)       None of the above

 

5.        Which of the following is true about requirements determination interviews

A)     having pre-written questions needlessly removes spontaneity from the interview

B)      tape recording the interview has many advantages and no disadvantages

C)      interviewees are likely to react very negatively to having two interviewers

D)      all of the above

E)       none of the above

 

 


Completion (fill in the blank) (mostly key terms (not all of which are a single word))

(2 points each)

 

6.        In a dimensional model based on a star schema, a(n) ________ table is in the center of all of the other tables, linked to each.

 

7.        It has been recommended that the data warehouse design should avoid the use of ________; these are keys that are made up of a concatenation of different attribute values.

 

8.        ________ is data about data. It is specified by members of the data warehouse team – frequently into a tool that will use it to carry out part of the data warehouse process.

 

9.        ________ is a data staging process that involves identifying members of a single economic household, so that households could be an area of analysis, instead of (just) individuals.

 

10.     Some dimension tables experience periodic changes. These ________ dimensions present a dilemma for data warehouses since the data warehouse is intended to retain history, not just current information.

 

 

True/False – If False Explain Why!

(3 points each)

 

11.     Tables should not be normalized in a data warehouse data model.

 

12.     It is important to discuss an interview with teammates and fill in gaps in notes as soon as possible after the interview.

 

13.     It is largely infeasible to identify customers who are included as multiple records in a source system customer table.

 

14.     One advantage of using a star schema design for a data mart is that it makes it easy for data access tools to generate SQL code in order to produce reports.

 

15.     If an organization has a strong business justification for a data warehouse, that by itself is enough to show readiness for the project.

 

 

Short Answer

(6 points each)

 

16.     Data warehouses exist separate from the operational systems. Why is this separation desirable? (at least 2 good reasons)

 

17.     Why is it beneficial to build a data warehouse incrementally, starting with a single data mart?

 

18.     It has been said that some data warehouse projects suffer from being “in search of demand”. What does this mean? Why is it a problem?

 

19.     What is involved in building a business justification for a data warehousing project?

 

 


Short Answer: Applied

(points as shown)

 

(32 points -  A) 6 points; B) 6 points; C) 8 points; D) 6 points; E) 6 points)

20.     Imagine a hotel chain that is building a data warehouse around the business process of “renting rooms”.

A)     During requirements determination, your colleague has asked for advice on a set of questions s/he has developed.  Critique the following question, to be asked of a hotel chain regional manager during a requirements determination interview:

                        “How does higher management judge your performance?”

B)      Two candidate fact table grain sizes are: each room rented (obviously the room is in one hotel) for a given night, OR registrations for a given kind of room in one hotel for a given night. What are the advantages and disadvantages of the former compared to the latter?

Suppose that the decision has been made that the grain size should be each room rented for a given night.

C)      What would be appropriate contents of the fact table? For each fact(s), explain how additive it is. (I have several in mind; I’d like at least 3)

D)      Further imagine that after data modeling one of the attributes of customer is smoking preference: none, smoking, or non. If a customer changes their smoking preference from smoking to non, the old value (smoking) is replaced by the new value (non). Briefly explain the problem that this presents for the question: show revenues by year by smoking preference.

E)       In planning for aggregates, and ignoring business requirements for this question, if the hotel chain rents 6,000,000 room nights per year, from among 40,000 rooms, in 200 hotels, spread across 50 states, in 6 regions, in 1 company, what might be a favorable aggregate(s) to build? Why?

 

(10 points)

21.     Using Cognos PowerPlay, with the Univ Registrations data cube, show data based on the following criteria:

·         Show data in a table format

·         Show NumCourses

·         Show in row headings cities within Pennsylvania

·         Show in column headings – years

·         Restrict data to Computer Science courses

·         Sort results by total number of courses for all years combined

Print the table and answer the following question: What city had the second most computer science course registrations during the six year time period (after Philadelphia)?