Quinlan, "C4.5: Programs for Machine Learning", Morgan Kaufmann, Oct 1992
2. Information:
This file concerns credit card applications. All attribute names and values have been changed to meaningless symbols to protect confidentiality of the data.
Class "-": 383 instances (55.5%)
A1: b(0), a(1).
A2: continuous.
A3: continuous.
A4: u(0), y(1), l(2), t(3).
A5: g(0), p(1), gg(2).
A6: c(0), d(1), cc(2), i(3), j(4), k(5), m(6), r(7), q(8), w(9), x(10), e(11), aa(12), ff(13).
A7: v(0), h(1), bb(2), j(3), n(4), z(5), dd(6), ff(7), o(8).
A8: continuous.
A9: t(0), f(1).
A10: t(0), f(1).
A11: continuous.
A12: t(0), f(1).
A13: g(0), p(1), s(2).
A14: continuous.
A15: continuous.
A16: +(0), -(10) (class attribute)
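The symbol-to-number coding listed above can be sketched in a few lines of Python. This is only an illustration, not the tool's own conversion step: the names `CODES`, `CLASS_CODES`, and `encode_row` are hypothetical, and the symbolic example row is built by inverting the table on the first test instance shown later in this file.

```python
# Symbol-to-integer codes for the categorical attributes, copied from the list above.
# The position of a symbol in its list is its code.
# Attributes absent from this table (A2, A3, A8, A11, A14, A15) are continuous.
CODES = {
    "A1": ["b", "a"],
    "A4": ["u", "y", "l", "t"],
    "A5": ["g", "p", "gg"],
    "A6": ["c", "d", "cc", "i", "j", "k", "m", "r", "q", "w", "x", "e", "aa", "ff"],
    "A7": ["v", "h", "bb", "j", "n", "z", "dd", "ff", "o"],
    "A9": ["t", "f"],
    "A10": ["t", "f"],
    "A12": ["t", "f"],
    "A13": ["g", "p", "s"],
}
CLASS_CODES = {"+": 0, "-": 10}  # A16, the class attribute


def encode_row(symbols):
    """Encode one 16-field symbolic row (A1..A16) as a list of numbers."""
    if len(symbols) != 16:
        raise ValueError("expected 16 fields, A1 through A16")
    encoded = []
    for index, value in enumerate(symbols[:15], start=1):
        name = f"A{index}"
        if name in CODES:
            encoded.append(CODES[name].index(value))  # categorical: list position
        else:
            encoded.append(float(value))  # continuous: parse as a number
    encoded.append(CLASS_CODES[symbols[15]])
    return encoded


# Hypothetical symbolic row (assumption: inverse of the first encoded test instance).
example = ["b", "30.83", "0", "u", "g", "w", "v", "1.25",
           "t", "t", "1", "f", "g", "202", "0", "+"]
print(encode_row(example))
```

An unknown symbol raises `ValueError` from `list.index`, which is a deliberate choice here so that a typo in the input is not silently coded.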
4. Database: 653 instances.
The Credit Database file has 653 - 37 = 616 instances. The data required for 100% accuracy is:
Class     #Required   #Actual
0 = +     1600        307
10 = -    1600        383
Total     3200        653
There is not enough data in this database for 100% accuracy. As we will see, DecisionMaker 2.5 nevertheless reaches 100% accuracy on the ten test rows, which is more than expected.
The first 5 rows (all class 0) and the last 5 rows (all class 10) are used for the question file. The Credit Database file then has 606 instances. Below are the first 5 instances in the Database file:
Credit Database File:
0 32.08 4 0 0 6 0 2.5 0 1 0 0 0 360 0 0
0 33.17 1.04 0 0 7 1 6.5 0 1 0 0 0 164 31285 0
1 22.92 11.585 0 0 2 0 0.04 0 1 0 1 0 80 1349 0
0 54.42 0.5 1 1 5 1 3.96 0 1 0 1 0 180 314 0
0 42.5 4.915 1 1 9 0 3.165 0 1 0 0 0 52 1442 0
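The hold-out step described above (the first 5 and last 5 rows go to the question file, the rest stay in the Database file) might look like the following sketch. The function name `split_question_rows` and the placeholder rows are assumptions for illustration; only the counts 616 and 606 come from this file.

```python
def split_question_rows(rows, n=5):
    """Return (database_rows, question_rows); the first and last n rows are held out."""
    if len(rows) < 2 * n:
        raise ValueError("need at least 2*n rows")
    question = rows[:n] + rows[len(rows) - n:]
    database = rows[n:len(rows) - n]
    return database, question


# Placeholder rows standing in for the 616 encoded instances (not real data).
rows = [[i] for i in range(616)]
database, question = split_question_rows(rows)
print(len(database), len(question))
```

With 616 input rows this leaves 606 rows in the database part and 10 in the question part, matching the counts stated above.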
Below are the first 5 and the last 5 instances in the database, which will be used to test the DecisionMaker's accuracy:
0 30.83 0 0 0 9 0 1.25 0 0 1 1 0 202 0
0
1 58.67 4.46 0 0 8 1 3.04 0 0 6 1 0 43 560
0
1 24.5 0.5 0 0 8 1 1.5 0 1 0 1 0 280 824
0
0 27.83 1.54 0 0 9 0 3.75 0 0 5 0 0 100 3
0
0 20.17 5.625 0 0 9 0 1.71 0 1 0 1 2 120 0
0
0 21.08 10.085 1 1 11 1 1.25 1 1 0 1 0 260 0
10
1 22.67 0.75 0 0 2 0 2 1 0 2 0 0 200 394
10
1 25.25 13.5 1 1 13 7 2 1 0 1 0 0 200 1
10
0 17.92 0.205 0 0 12 0 0.04 1 1 0 1 0 280 750
10
0 35 3.375 0 0 2 1 8.29 1 1 0 0 0 0 0
10
5. Results.
Set precision = 15.
Click "Integer/+ Predict" to get the following answer file:
=================== Beginning =====================
0 30.83 0 0 0 9 0 1.25 0 0 1 1 0 202 0
Possibility Confidence*Probability
0 8000
------------------------------------------------------
0
1 58.67 4.46 0 0 8 1 3.04 0 0 6 1 0 43 560
Possibility Confidence*Probability
0 180577
10 2500
------------------------------------------------------
0
1 24.5 0.5 0 0 8 1 1.5 0 1 0 1 0 280 824
Possibility Confidence*Probability
0 28340
10 16500
------------------------------------------------------
4
0 27.83 1.54 0 0 9 0 3.75 0 0 5 0 0 100 3
Possibility Confidence*Probability
0 542000
------------------------------------------------------
0
0 20.17 5.625 0 0 9 0 1.71 0 1 0 1 2 120 0
Possibility Confidence*Probability
0 5753
10 1125
------------------------------------------------------
2
0 21.08 10.085 1 1 11 1 1.25 1 1 0 1 0 260 0
Possibility Confidence*Probability
0 3244
10 151845
------------------------------------------------------
10
1 22.67 0.75 0 0 2 0 2 1 0 2 0 0 200 394
Possibility Confidence*Probability
10 512000
------------------------------------------------------
10
1 25.25 13.5 1 1 13 7 2 1 0 1 0 0 200 1
Possibility Confidence*Probability
0 974
10 1360
------------------------------------------------------
6
0 17.92 0.205 0 0 12 0 0.04 1 1 0 1 0 280 750
Possibility Confidence*Probability
0 20000
10 28000
------------------------------------------------------
6
0 35 3.375 0 0 2 1 8.29 1 1 0 0 0 0 0
Possibility Confidence*Probability
0 98750
10 256000
------------------------------------------------------
7
Precision of each number:
0.355556
=================== End ==========================
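The source does not say how the integer printed under each table is derived. One reading that is consistent with all ten results above is a mean of the candidate class values weighted by their Confidence*Probability scores, rounded to the nearest integer. The sketch below checks that reading against the printed numbers; `predicted_value` is a hypothetical name, and the formula is an assumption, not DecisionMaker's documented method.

```python
def predicted_value(scores):
    """Score-weighted mean of the candidate class values, rounded to an integer.

    Assumption: this mirrors how the printed integer relates to the table;
    the source does not confirm the formula.
    """
    total = sum(scores.values())
    weighted = sum(value * weight for value, weight in scores.items())
    return round(weighted / total)


# Possibility -> Confidence*Probability, copied from the answer file in order.
answer_tables = [
    {0: 8000},
    {0: 180577, 10: 2500},
    {0: 28340, 10: 16500},
    {0: 542000},
    {0: 5753, 10: 1125},
    {0: 3244, 10: 151845},
    {10: 512000},
    {0: 974, 10: 1360},
    {0: 20000, 10: 28000},
    {0: 98750, 10: 256000},
]
printed = [0, 0, 4, 0, 2, 10, 10, 6, 6, 7]  # integer printed under each table

computed = [predicted_value(table) for table in answer_tables]
print(computed == printed)
```

For example, the third table gives 10 * 16500 / (28340 + 16500), about 3.68, which rounds to the printed 4.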
6. Analysis
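As you can see, the answers are 100% correct, which is more than expected: every predicted value is closer to the code of the instance's true class (0 or 10) than to the other code.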
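The 100% figure can be checked by snapping each printed integer to the nearer class code and comparing with the true class. This is a sketch under the assumption that "correct" means nearest class code; the helper `nearest_class` is hypothetical, and the two lists are copied from the answer file and the test instances above.

```python
def nearest_class(value, class_codes=(0, 10)):
    """Return the class code closest to a predicted value."""
    return min(class_codes, key=lambda code: abs(value - code))


printed = [0, 0, 4, 0, 2, 10, 10, 6, 6, 7]  # predictions from the answer file
actual = [0] * 5 + [10] * 5                  # true classes of the ten test rows

labels = [nearest_class(value) for value in printed]
accuracy = sum(l == a for l, a in zip(labels, actual)) / len(actual)
print(labels, accuracy)
```

A prediction of exactly 5 would be a tie between the two codes; none occurs in these results, so the tie-breaking rule does not affect the outcome.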