CIS 624                                                             Fall 2004

10/19/04                                                           Midterm Exam                                                 Test Form A

 

Name:

 

Instructions:

    Answer all questions in the answer sheet provided. On Multiple Choice, choose the ONE BEST ANSWER.

    Remember to put the letter of your test form on the top of your answer sheet.

    Hand in Test, Answer Sheet, and Help Sheets, all with you name.

 

Multiple Choice

(2 points each)

 

  1. What might be specified as part of documenting the scope of a data warehousing project?

A)    who the project team member are

B)     which group(s) of users will be supported

C)     the dimensional model

D)    all of the above

E)     none of the above

 

  1. After all data has been initially loaded and successfully deployed in a data mart, when should incremental loads of up-to-date data be done?

A)    Nightly

B)     Weekly

C)     Monthly

D)    on a schedule appropriate for the business requirements

 

3.      Data Warehouses exist separate from the operational systems. Why is this separation desirable?

A)    so that the business-side can control the data warehouse

B)     so that they don’t interfere with each other’s speed of performance

C)     so that users of the data warehouse do not have access to too much data

D)    all of the above

E)     none of the above

 

4.      In what way does a star schema design differ from recommendations from entity-relationship modeling theory for databases?

A)    Fact tables are denormalized

B)     Dimension tables are normalized (in 3rd Normal Form)

C)     Dimension tables are denormalized

D)    None of the above

 

 

Completion (fill in the blank) (mostly key terms (not all of which are a single word))

(2 points each)

 

5.      On-line Analytical Processing tools frequently store the data that they use (particularly aggregates, but for some atomic data as well) in a form of proprietary format called a(n) ________.

 

6.      The ________ of a project is the task or tasks that if delayed would delay the completion of the whole project. Project management software can aid in the identification of this.

 

7.      Some data warehouses have failed because different parts of the warehouse were developed independently, and as it turns out incompatibly. These individual, incompatible data marts have been referred to as ________.

 

8.      ________ keys are recommended in data warehouses, instead of using the primary keys from the transaction oriented systems.

 

True/False – If False Explain Why!

(3 points each)

 

9.      Interviews with potential users should be carried out in the IT conference room so project team members spend less time running around.

 

10.   An existing entity relationship diagram for source systems is of no value to data warehouse data modelers, since a different model is being used.

 

11.   Smart keys are useful because they avoid the need to store longer textual descriptions.

 

12.   The idea of meta data is entirely theoretical, no data warehouse tool currently uses it.

 

13.   Data warehouse project success is more likely if the first deployment offers some benefit to all stakeholders in the organization.

 

14.   It is common for organizations to have a standard way of measuring expected benefit of IT projects so that they can be compared to each other.

 

 

Short Answer

(points as shown)

 

(6 points)

15.   How does having data warehouse data in a star schema design aid developers of end-user data access tools? (NOTE – In this question, I am not interested in any other reasons for star schema design)

 

(10 points)

16.   Briefly explain each of the critical readiness factors for the success of a data warehousing initiative.

 

(6 points)

17.   In trying to limit a data warehousing project, why can it be beneficial to limit the number of users to be supported?

 

(6 points)

18.   Why is it a good idea to have at least two members of the project time at a requirements determination interview? (at least 3 good reasons)

 

 

Short Answer: Applied

(points as shown)

 

(8 points)

19.   Imagine that a music swapping service is building a data mart surrounding the downloading of songs. A song dimension includes an attribute of whether the song is subject to royalty payments or not. If this attribute is modeled as a slowly changing dimension using Kimball et al’s “type 2” method – briefly explain how this method yields correct results for both questions such as a) show the number of downloads by royalty status and b) show the number of downloads by musical genre. You should makes clear why both questions will be correctly answered using this strategy.   Example records could be very useful.

 


For questions 20-22, imagine that a realtor that specializes in renting houses by the week down at the Jersey shore is building a data mart around the rental aspect of their business.

 

(6 points)

20.   Critique the following question, asked of a manager of a particular office of the realtor during a requirements determination interview (it isn’t the first question asked):

“How do you know if the agents that work for you have been doing a good job?”

 

(8 points)

21.   They have determined that the grain size of their fact table will be a one week rental of a particular house (if somebody rents a house out for two weeks, that will be two records in the fact table). Candidate facts include the Asking Price for the Week, the Amount Paid, and the Number of Weeks the Renter actually rented. For each fact, explain how additive it is.

 

(6 points)

22.   Ignoring business requirements for this question, if a realtor lists 100,000 houses in 40 zip codes in 20 cities, in 5 states, what might be a favorable aggregate to build? Why?

 

 

Applied

(10 points)

 

  1. Using Cognos PowerPlay, with the Sample Cube data cube, show data based on the following criteria:

·        Show data in a table format

·        Show quantity sold (instead of revenues)

·        Show in row headings, all months within 2001

·        Show in column headings – order methods

·        Restrict data to sold via golf shops (vendors)

·        Sort by amount sold via e-mail

Print the table and answer the following question: What month in 2001 had the highest quantity sold through Golf Shops  via e-mail?