Originally Posted by dfreeman02
I need to propose a project for a data mining class. Given my favorite "hobby," I'd like to work with the DOT airline data sets on (domestic) ticket sales and on-time performance, and use them to predict some feature of air travel. My first two ideas were:
1. Predict number of unsold F seats -- probably can't be done with the available data.
2. Predict on-time performance for flights/city pairs -- already done by FlightCaster.
Any other ideas? The data are the following:
- Ticket data: for 10% of all domestic itineraries, includes origin/destination/connecting cities, ticketing/operating carriers, fare, class, distance, number of pax.
- On-time data: date, carrier, tail number, flight number, origin/destination, scheduled/actual arrival/departure times, actual wheels up/down and taxi times, delay reason, diversion/cancellation info.
The ticket data has *no* information about purchase date or flight date (other than which quarter the flight was in), and no info about domestic flights connecting to/from international flights.
How about weather delays in certain cities or at certain times of year? Or on-time performance, but just for major holidays (i.e., the use would be 'how likely is your flight the day before Thanksgiving to be on time?').
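To make the holiday idea a bit more concrete, here is a rough sketch of the baseline number you would be trying to beat: the historical on-time rate for the day before Thanksgiving. Everything here is an assumption for illustration. The field names (`date`, `arr_delay_min`, `cancelled`) are hypothetical and are not the actual DOT column names, the rows are made up, and I'm counting a flight as on time if it wasn't cancelled and arrived less than 15 minutes late, which you should check against the DOT definition before relying on it.

```python
from datetime import date, timedelta


def thanksgiving(year):
    """Return US Thanksgiving (fourth Thursday of November) for a year."""
    nov1 = date(year, 11, 1)
    # date.weekday(): Monday is 0, so Thursday is 3.
    first_thursday = nov1 + timedelta(days=(3 - nov1.weekday()) % 7)
    return first_thursday + timedelta(weeks=3)


def on_time_rate(flights, day, threshold_min=15):
    """Share of flights scheduled on `day` that count as on time.

    A flight counts as on time if it was not cancelled and its arrival
    delay is under `threshold_min` minutes (assumed definition).
    Returns None when there are no flights on that day.
    """
    same_day = [f for f in flights if f["date"] == day]
    if not same_day:
        return None
    on_time = sum(
        1
        for f in same_day
        if not f["cancelled"] and f["arr_delay_min"] < threshold_min
    )
    return on_time / len(same_day)


# Made-up rows with hypothetical field names, just to exercise the code.
flights = [
    {"date": date(2009, 11, 25), "arr_delay_min": 5, "cancelled": False},
    {"date": date(2009, 11, 25), "arr_delay_min": 42, "cancelled": False},
    {"date": date(2009, 11, 25), "arr_delay_min": -3, "cancelled": False},
    {"date": date(2009, 11, 25), "arr_delay_min": 0, "cancelled": True},
    {"date": date(2009, 11, 24), "arr_delay_min": 90, "cancelled": False},
]

day_before = thanksgiving(2009) - timedelta(days=1)
rate = on_time_rate(flights, day_before)
print(day_before, rate)
```

For the class project you'd compute this per airport or city pair across several years, and then see whether a model using carrier, time of day, and so on does better than that simple historical average.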