Performance Evaluation on Testing Data

Across the two experiments we performed, the best-performing classifiers (those with high validation precision and recall) are:

(1) cost-sensitive Naive Bayes trained on the 'large' dataset

(2) cost-sensitive Bayes Net trained on the 'large' dataset

(3) cost-sensitive 5NN trained on the 'large' dataset
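
For readers unfamiliar with the term, one common way to make a classifier cost-sensitive is to pick the label with the lowest expected misclassification cost instead of the most probable label. The sketch below illustrates that idea only. The cost values, the probabilities, and the function names are hypothetical, and this is not necessarily the configuration used in the experiments.

```python
# Hypothetical cost matrix, COST[actual][predicted]. The values are invented
# for illustration: missing a delayed flight is assumed to cost 5x more
# than a false delay alarm.
COST = {
    "delayed": {"delayed": 0.0, "on-time": 5.0},
    "on-time": {"delayed": 1.0, "on-time": 0.0},
}


def expected_cost(probs, predicted):
    """Expected cost of predicting `predicted`, given class probabilities."""
    return sum(p * COST[actual][predicted] for actual, p in probs.items())


def cost_sensitive_predict(probs):
    """Return the label whose expected misclassification cost is lowest."""
    return min(COST, key=lambda label: expected_cost(probs, label))


# Made-up class probabilities from some underlying classifier.
probs = {"delayed": 0.3, "on-time": 0.7}

plain = max(probs, key=probs.get)         # most probable label
adjusted = cost_sensitive_predict(probs)  # minimum expected cost label
print(plain, adjusted)
```

With these assumed costs, a flight that is only 30% likely to be delayed is still labeled as delayed, because a missed delay is penalized more heavily than a false alarm.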

The testing precision and recall of these 3 classifiers are shown in the table below:

Cost-sensitive 5NN is the best classifier for our task: it has both the highest testing precision for on-time predictions and the highest testing recall for delayed flights.
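
As a reminder of how the two metrics are defined per class, here is a minimal sketch that computes precision and recall for one label treated as the positive class. The label lists are made-up toy data, not our test set, and the function name is hypothetical.

```python
def precision_recall(actual, predicted, positive):
    """Precision and recall for one class, treating `positive` as the target label."""
    pairs = list(zip(actual, predicted))
    tp = sum(1 for a, p in pairs if a == positive and p == positive)
    fp = sum(1 for a, p in pairs if a != positive and p == positive)
    fn = sum(1 for a, p in pairs if a == positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


# Toy labels, invented for illustration only.
actual = ["delayed"] * 3 + ["on-time"] * 5
predicted = ["delayed"] * 5 + ["on-time"] * 3

on_time_precision, on_time_recall = precision_recall(actual, predicted, "on-time")
delayed_precision, delayed_recall = precision_recall(actual, predicted, "delayed")
print(on_time_precision, on_time_recall, delayed_precision, delayed_recall)
```

In this toy case the classifier over-predicts delays, so every on-time prediction is correct (high on-time precision) and every delayed flight is caught (high delayed recall), at the price of lower on-time recall and lower delayed precision.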

Future Work

Because complex models take a long time to build and Weka has its limits, we weren't able to get results from logistic regression or a multilayer perceptron, though we believe both should be reasonable models for this task. We even tried training these two models on a tiny dataset (around 1000 instances), but the results were unsatisfactory, so we had to give up. Both models take a considerable amount of time to run, especially once the dataset exceeds 10,000 instances. In the future, we would like to use a more capable tool such as scikit-learn to build more efficient models on a larger training set and to test the classifiers on larger test sets.

Click here to read our final report.

Flight Delay Prediction

EECS 349, Northwestern University
