4. Application Database Examples
4.1 Wisconsin Breast Cancer Database
4.2 Car Evaluation Database
4.3 Credit Approval Database
4.4 Ecoli database: Protein Localization Sites
4.5 Heart Disease Databases
4.6 Boston Housing Database
4.7 Forensic Science Database
4.8 Nursery School Database
4.9 Thyroid Database
4.10 More
Figure 9. Examples.
All examples here are built into the DecisionMaker. To generate
the Heart Disease Database, for example, click "Example/Heart Disease Database
(13, 1), 5". Here "(13, 1), 5" means:
- 13 variables make up the questions
- 1 variable provides the answers
- 5 classes
Look at the status bar; it indicates this example has a total of 920 instances.
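The "(13, 1), 5" notation can also be read mechanically. The following is a minimal sketch in Python, not part of the DecisionMaker itself; the names `ExampleShape` and `parse_shape` are hypothetical and exist only for this illustration. It assumes the label always ends in the form "(questions, answers), classes".

```python
import re
from typing import NamedTuple


class ExampleShape(NamedTuple):
    """Shape of an example database (hypothetical helper type)."""
    inputs: int    # variables that make up the questions
    outputs: int   # variables that provide the answers
    classes: int   # number of classes


def parse_shape(label: str) -> ExampleShape:
    """Read the trailing '(inputs, outputs), classes' part of a menu label."""
    match = re.search(r"\((\d+)\s*,\s*(\d+)\)\s*,\s*(\d+)\s*$", label)
    if match is None:
        raise ValueError(f"no '(inputs, outputs), classes' suffix in {label!r}")
    return ExampleShape(*(int(group) for group in match.groups()))


shape = parse_shape("Example/Heart Disease Database (13, 1), 5")
print(shape.inputs, shape.outputs, shape.classes)
```

For the Heart Disease menu entry this gives 13 question variables, 1 answer variable, and 5 classes, matching the list above.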
As a rule of thumb, your database should have 100 instances per attribute
per class. If you increase the precision from 10 to 100, your database
should have 200 instances per attribute per class. This amount of data
can easily generate a prediction with 100% accuracy.
Example 1: 10 attributes, 2 classes.
100 * 10 * 2 ===> 2,000 instances for precision = 10.
200 * 10 * 2 ===> 4,000 instances for precision = 100.
If you do not have enough data, the accuracy will drop. If you have
substantially less data than the above amount, you can only use the DecisionMaker
(DM) for making an educated guess.
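The rule of thumb above is simple arithmetic and can be sketched in Python. This is an illustration only, not DecisionMaker code; the function name `required_instances` is hypothetical. It covers only the two precision settings the text mentions (10 and 100) and makes no assumption about any other value.

```python
# Instances per attribute per class for the two precision settings
# named in the text. Other precision values are not documented here.
INSTANCES_PER_ATTRIBUTE_PER_CLASS = {10: 100, 100: 200}


def required_instances(attributes: int, classes: int, precision: int = 10) -> int:
    """Suggested database size under the rule of thumb (hypothetical helper)."""
    if precision not in INSTANCES_PER_ATTRIBUTE_PER_CLASS:
        raise ValueError(f"rule of thumb is only stated for precision 10 or 100, got {precision}")
    return INSTANCES_PER_ATTRIBUTE_PER_CLASS[precision] * attributes * classes


# Example 1 from the text: 10 attributes, 2 classes.
print(required_instances(10, 2, precision=10))    # 2000
print(required_instances(10, 2, precision=100))   # 4000

# Assumption: the 13 question variables of the Heart Disease example
# count as attributes. The status bar reports 920 instances.
print(required_instances(13, 5, precision=10))    # 6500
```

Comparing the result with the instance count shown in the status bar gives a rough idea of whether a database is large enough for making a decision or only for making an educated guess.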
The databases used by the DecisionMaker as examples are:
- Wisconsin Breast Cancer Database: Integer Weighted Average
- Car Evaluation Database: Integer Distribution
- Credit Approval Database: How to handle text input
- Ecoli Database: Not enough data, educated guess
- Boston Housing Database: Continuous classes
- Heart Disease Database: How to handle missing data
- Forensic Science Database: Not enough data, educated guess
- Nursery School Database: Enough data ==> 100% accuracy
- Thyroid Database: 2 output variables
Depending on the amount of data you have, DM 2.5 can be used
in 2 ways:
- Making a decision
- Making an educated guess
4.10 More
Please see the UCI Machine Learning Repository at:
http://www.ics.uci.edu/~mlearn/MLSummary.html
- Abalone Database
- Adult Database
- Annealing Database
- Arrhythmia Database
- Artificial Characters Database
- Audiology Databases
- Auto-Mpg Database
- Automobile Database
- Badges Database
- Balance Scale Database
- Balloons Database
- Breast Cancer Database (restricted access)
- Wisconsin Breast Cancer Databases
- Pittsburgh Bridges Database
- Car Evaluation Database
- Census Income Database
- Chess Databases
- Bach Chorales (time-series) Database
- Connect-4 Opening Database
- Credit Screening Databases
- Computer Hardware Database
- Covertype data
- Cylinder Bands Database
- Dermatology Database
- Diabetes Data
- Document Understanding Database
- EBL Domain Theories and Examples
- Echocardiogram Database
- Ecoli Database
- Flags Database
- Function Finding Databases
- Glass Identification Database
- Hayes-Roth Database
- Heart Disease Databases
- Hepatitis Database
- Horse Colic Database
- Housing Database (Boston)
- ICU Data
- Image segmentation Database
- Ionosphere Database
- Iris Plant Database
- Kinship Database
- Labor relations Database
- LED Display Domains
- Lenses Database
- Letter Recognition Database
- Liver-disorders Database
- Logic-theorist
- Lung Cancer Database
- Lymphography Database (restricted access)
- Mechanical Analysis Data
- Meta-data Database
- Mobile Robots Database
- Molecular Biology Databases
- MONK's Problems
- Moral Reasoner Database
- Mushrooms Database
- MUSK Databases
- Nursery Database
- Othello Domain Theory
- Page Blocks Classification Database
- Pima Indians Diabetes Database
- Optical Recognition of Handwritten Digits
- Pen-Based Recognition of Handwritten Digits
- Postoperative Patient Database
- Primary Tumor Database (restricted access)
- Qualitative Structure Activity Relationships (QSARs)
- Quadraped Animals Data Generator
- Servo Database
- Shuttle Landing Control Database
- Solar Flare Databases
- Soybean Databases
- Challenger USA Space Shuttle O-Ring Databases
- Low Resolution Spectrometer Database
- Sponge Database
- Statlog Project Databases
- Student Loan Relational Database
- Tic-Tac-Toe Endgame Database
- Thyroid Disease Database
- Trains Database
- University Database
- Congressional Voting Records Database
- Water Treatment Plant Database
- Waveform Data Generator
- Wine Recognition Database
- Yeast Database
- Zoo Database
- Undocumented Databases