Midterm Exam Answers

Test Form C.

  1. D
  2. D
  3. D
  4. D
  5. Greedy
  6. Supervised
  7. Nearest Neighbors
  8. Prior Probability
  9. Black box
  10. Nominal
  11. True
  12. True – but an answer of False, with an argument that “in some tasks people accept argument via cases,” would also get full credit
  13. False – there are a number of successful fielded applications, as discussed in Chapter 1: screening borderline loan approvals, detecting oil slicks, forecasting electricity demand, etc. Each provided significant benefit to the company using it.
  14. False – some methods can do numeric prediction, including regression tree and model tree methods, instance-based learning, etc.
  15. People must be involved in a number of ways (listing 4 would earn full credit):

·         Clean data – somebody with knowledge of the data is very valuable in cleaning the data, and actually cleaning the data usually requires human action

·         Data preparation – it may be valuable for the learning method for a person to develop new attributes based on existing attributes. Human intelligence is needed to determine what would be valuable

·         Determine what experiments to do – people determine what algorithms are appropriate for the current data, what to try

·         Evaluate results – people must determine if the accuracy found in tests is good or not

·         Evaluate results – people must determine if the info learned makes sense

·         Use results – people must determine what to do with what is learned

  16. If the algorithm suggests making a decision based on an attribute that would be considered discriminatory (race, ethnicity, age, gender), the result (if used) is discrimination.
  17. A total of ___9___ of ___18___ predictions were correct.

·         A total of ___9___ predictions were incorrect.

·         Of the ___12___ times Cancer was predicted, this prediction was correct ___5___ times.

·         Of the ___6___ times Not Cancer was predicted, this prediction was correct ___4___ times.

·         Of the ___7___ times that Cancer occurred, the prediction was correct ___5___ times and incorrect ___2___ times.

·         Of the ___11___ times that Not Cancer occurred, the prediction was correct ___4___ times and incorrect ___7___ times.
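The tallies in the confusion-matrix answer can be cross-checked with a few lines of arithmetic. The sketch below only restates the four cell counts given in the answer; the variable names, and the derived precision and recall figures, are illustrative additions and were not asked for on the exam.

```python
# Cell counts taken from the answer above.
tp = 5  # Cancer occurred, Cancer predicted
fn = 2  # Cancer occurred, Not Cancer predicted
fp = 7  # Not Cancer occurred, Cancer predicted
tn = 4  # Not Cancer occurred, Not Cancer predicted

total = tp + fn + fp + tn
correct = tp + tn
incorrect = fp + fn

predicted_cancer = tp + fp
predicted_not_cancer = tn + fn
actual_cancer = tp + fn
actual_not_cancer = tn + fp

# Derived measures (illustrative, not part of the exam answer).
accuracy = correct / total
precision_cancer = tp / predicted_cancer
recall_cancer = tp / actual_cancer

print(f"{correct} of {total} correct, {incorrect} incorrect")
print(f"accuracy={accuracy:.3f} precision={precision_cancer:.3f} recall={recall_cancer:.3f}")
```

Every row and column total in the answer (12, 6, 7, 11) falls out of the same four counts, which is a quick way to check that the blanks are mutually consistent.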


  18. Probabilities with Laplace Estimator

Area          Purchase = Yes   Purchase = No
Mt Airy       4/10             4/13
Germantown    5/10             3/13
Manyunk       1/10             6/13

Home          Purchase = Yes   Purchase = No
Own           5/9              5/12
Rent          4/9              7/12

Age           Purchase = Yes   Purchase = No
Young         6/11             2/14
Established   3/11             5/14
Middle Aged   1/11             4/14
Old           1/11             3/14

To Predict    Yes              No
Purchase      8/19             11/19
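As a rough illustration of where these fractions come from, the sketch below applies the Laplace estimator (add 1 to each count, then add the number of attribute values to the denominator). The raw counts are an assumption: they are back-derived from the smoothed Area fractions for Purchase = Yes and do not appear in this answer key. The helper name `laplace` is hypothetical.

```python
from fractions import Fraction

def laplace(counts):
    """Add 1 to every count; the denominator grows by the number of values."""
    denominator = sum(counts.values()) + len(counts)
    return {value: Fraction(c + 1, denominator) for value, c in counts.items()}

# ASSUMED raw counts of Area among the Purchase = Yes instances,
# back-derived from the smoothed fractions 4/10, 5/10, 1/10.
area_given_yes = {"Mt Airy": 3, "Germantown": 4, "Manyunk": 0}

smoothed = laplace(area_given_yes)
for value, p in smoothed.items():
    # Fraction prints in lowest terms, so 4/10 appears as 2/5.
    print(value, p)
```

The point of the estimator shows up in the Manyunk row: a raw count of zero becomes 1/10 rather than 0, so it cannot wipe out the whole product.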

 

Test Instance:  Mt Airy, Rent, Established

 

Prob(Yes | Evidence) is proportional to 4/10 * 4/9 * 3/11 * 8/19 = .0204

Prob(No | Evidence) is proportional to 4/13 * 7/12 * 5/14 * 11/19 = .037

Predict No, since its value is higher.
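The two products for the test instance can be reproduced directly from the table entries. This is a minimal sketch using exact fractions; the dictionary layout and names are illustrative. The normalization step at the end is an addition (not required by the answer) that turns the two scores into probabilities summing to 1.

```python
from fractions import Fraction as F
from math import prod

# Table entries for the test instance: Mt Airy, Rent, Established.
likelihoods = {
    "Yes": [F(4, 10), F(4, 9), F(3, 11)],
    "No": [F(4, 13), F(7, 12), F(5, 14)],
}
priors = {"Yes": F(8, 19), "No": F(11, 19)}

# Naive Bayes score: product of per-attribute fractions times the class prior.
scores = {c: prod(likelihoods[c]) * priors[c] for c in priors}

# Optional normalization so the two values sum to 1.
total = sum(scores.values())
posteriors = {c: s / total for c, s in scores.items()}

prediction = max(scores, key=scores.get)
for c in scores:
    print(c, round(float(scores[c]), 4), round(float(posteriors[c]), 3))
print("Predict", prediction)
```

Normalizing does not change which class wins; it only rescales both scores by the same factor.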

 

  19. Sorted by Rating, showing the value for Buy. Tally Yeses and Nos until there are at least 3 of one, then continue until the Buy answer switches.

Rating   Buy   Num Yes   Num No
21       No    0         1
25       Yes   1         1
27       No    1         2
28       Yes   2         2
29       No    2         3
30       No    2         4
30       No    2         5
33       No    2         6
35       No    2         7
38       No    2         8
40       Yes   1         0
41       Yes   2         0
41       No    2         1
45       No    2         2
48       No    2         3
49       No    2         4
51       No    2         5
52       Yes   1         0
53       Yes   2         0

 

Dividing lines fall halfway between the ratings on either side of each switch (38 and 40, 51 and 52), hence 39 and 51.5.

Technically, the first two categories could be collapsed into one category, since they have the same majority answer (No).
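The tallying procedure above can be sketched in code. This is one possible reading of it, under stated assumptions: "switches" is taken to mean that the next instance's Buy value differs from the current category's majority value, and a split is never placed between two identical ratings. The function name `discretize` and the `min_bucket` parameter are hypothetical.

```python
# (Rating, Buy) pairs from the table above, already sorted by Rating.
data = [
    (21, "No"), (25, "Yes"), (27, "No"), (28, "Yes"), (29, "No"),
    (30, "No"), (30, "No"), (33, "No"), (35, "No"), (38, "No"),
    (40, "Yes"), (41, "Yes"), (41, "No"), (45, "No"), (48, "No"),
    (49, "No"), (51, "No"), (52, "Yes"), (53, "Yes"),
]

def discretize(rows, min_bucket=3):
    """Return the dividing lines found by the tally-until-3-then-switch rule."""
    rows = sorted(rows, key=lambda r: r[0])  # stable: ties keep their order
    breakpoints = []
    counts = {"Yes": 0, "No": 0}
    for i, (rating, buy) in enumerate(rows):
        majority = max(counts, key=counts.get)
        full = counts[majority] >= min_bucket
        # Close the category once it is full and the answer switches
        # (assumption: never split between two equal ratings).
        if full and buy != majority and rating != rows[i - 1][0]:
            breakpoints.append((rows[i - 1][0] + rating) / 2)
            counts = {"Yes": 0, "No": 0}
        counts[buy] += 1
    return breakpoints

print(discretize(data))
```

The sketch stops at finding the dividing lines; merging adjacent categories that share a majority answer, as noted above, would be a separate pass.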