School Project - How to get flight schedule data in bulk?
#1
Original Poster




Join Date: Aug 2005
Location: DEN, FL350
Programs: IHTFP, AS 100k, ex-AA EXP, ex-UA 1K, Ex-BD/A6/AC *G (RIP)
Posts: 518
Wow, I'm actually using FT for something pertaining to my schoolwork. This is a first.
I didn't see an obvious forum for an odd request like this (feel free to move as necessary), but here it goes...
I'm working on a project for a class that I'm taking this term and am looking for source data for some of my simulations. I realize that I could harvest this data from published schedules by hand, but I'm hoping that someone here on FT has a quick and easy way to gather this information.
First of all, I'm really only interested in non-stop flights between the US and Europe. For these flights, I'm hoping to make a fairly comprehensive list (a CSV list would be ideal) of flights which includes the following information:
- Flight number, including carrier (ex: UA902)
- Origin airport code
- Destination airport code
- Departure time (scheduled, not actual time for a specific flight on a certain day)
- Arrival time (scheduled, not actual time for a specific flight on a certain day)
- Frequency (optional)
I thought I could massage the information I'm after into a convenient form on ITA, but that doesn't seem to be the case. So far, the most convenient form of information that I've found (for manual gathering) is an ITA many-city to many-city search constrained in the following ways:
- Source city: list of major US gateways (NYC,WAS,PHL,CLT,ORD,ATL,etc.)
- Dest city: major European gateways
- Non-stops only
- Text view, sort by departure time (helps to eliminate code-shares*)
Obviously the ITA search reveals nothing about frequency of service, and getting the flight number is a pain (you must mouse-hover over the bar in the graphical display). Maybe there are some ITA gurus out there who know better.

In any case, that's the gist of what I'm looking for. I'm sure I've left something out, though, so feel free to ask for clarification!
PS: I'm fairly handy with scripting and parsing tools, so if anyone has anything close to what I'm looking for, I'd certainly be interested in hearing about it!
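For anyone scripting this, the "sort by departure time to spot code-shares" trick can be approximated in code. The following is only a sketch under stated assumptions: the function name and the tuple layout are hypothetical, the sample records are made-up placeholders rather than real flights, and identical times on the same airport pair are treated as a hint of a code-share, not proof.

```python
from collections import defaultdict


def group_probable_codeshares(flights):
    """Group flight numbers that share origin, destination and both times.

    `flights` is an iterable of (flight, origin, dest, depart, arrive)
    tuples; this layout is an assumption made for the sketch.
    Only groups with more than one flight number are returned.
    """
    groups = defaultdict(list)
    for flight, origin, dest, depart, arrive in flights:
        groups[(origin, dest, depart, arrive)].append(flight)
    return {key: sorted(names) for key, names in groups.items() if len(names) > 1}


# Placeholder records (not real schedule data).
sample = [
    ("XX1", "AAA", "BBB", "08:55", "21:10"),
    ("YY9001", "AAA", "BBB", "08:55", "21:10"),
    ("ZZ7", "AAA", "BBB", "09:00", "21:15"),
]
probable = group_probable_codeshares(sample)
```

Each returned group still needs a manual check to decide which flight number is the operating one.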
#2

Join Date: Mar 2007
Posts: 642
This might not be particularly easy but most airlines have PDF files of their schedules. You could parse these and extract the pertinent information.
I believe that there are a couple of PDF->Text file converters. Maybe after running one of these you could parse the text file.
Again, not particularly easy, but an option if you really need the info.
#3
FlyerTalk Evangelist
Join Date: Jun 2006
Location: IAD/DCA
Posts: 31,871
maybe contact this guy >
http://www.aaronkoblin.com/work/faa/
#4
Join Date: Mar 2003
Location: ATL / SJC
Programs: DL DM, SPG Lifetime Platinum, Marriott Platinum, Hyatt Globalist
Posts: 213
OAG (Official Airline Guide) is a great source of schedule data.
http://www.oag.com/oag/website/com/en/Home/
Back Aviation Solutions has a nice interface for this data, but is quite expensive.
http://www.backaviation.com/Informat.../schedules.htm
Seabury has a similar (and better) interface with their APG DAT product, but again, quite expensive.
http://www.seaburyapg.com/index.html
#5

Join Date: Aug 2000
Location: Exile
Posts: 16,064
OAG and Innovata would be the best sources for this data, but I doubt they will be cheap.
http://www.oag.com/
http://www.innovatallc.com/
#6
In Memoriam
Join Date: Feb 2000
Location: Easton, CT, USA
Programs: ua prem exec, Former hilton diamond
Posts: 31,801
Does it need to be current? I think if you approach the companies listed here and explain what you are doing maybe they would be willing to part with older information from a few months ago or whatever.
#7




Join Date: Sep 2005
Location: YOW
Programs: AC E75K *G
Posts: 7,242
You should consider KVS - http://www.kvstool.com/
For $30 you can get Platinum level access, which affords access to the Amadeus timetable. It is very fast.
You can enter your gateways (or even cities like NYC and LON to get all flights from JFK/LGA to LHR/LGW/STN/LTN/LCY) and set the airline code to YY. It will give the entire schedule for all airlines.
More importantly, you can copy the results as (say) tab-separated values. This can be pasted into a spreadsheet as-is, but you'd probably be better off doing your own processing in Perl or whatever.
You would be able to compile some pretty comprehensive schedules this way, and very quickly.
Here's an example of the NYC-LON schedule for the week including May 30:
[KVS Availability Tool 3.1.0/Platinum - Amadeus: Timetable/NL-BCDF]
Code:
NYC New York Metro NY US = JFK LGA
LON London Metro UK = LHR LGW STN LTN LCY
TUE 27 May 2008 - 03 Jun 2008

Carrier   Flight From Depart To   Arrive    A/C  St Frequency | Dur'n | Dep T | Arr T | Effect | Ending | Exceptions
--------------------------------------------------------------------------------------------------------------------
VS        26     JFK  07:30  LHR  19:10     346  0  1234567     06:40   4       3       30 Mar   25 Oct
BA        172    JFK  07:45  LHR  19:40     777  0  123456-     06:55   7       4       28 May   09 Jun   Tue 27 May
BA        186    EWR  08:00  LHR  20:10     767  0  123456-     07:10   B       4       30 Apr   25 Oct
VS        18     EWR  08:05  LHR  20:00     346  0  1234567     06:55   B       3       30 Mar   25 Oct
AA        142    JFK  08:30  LHR  20:30     777  0  1234567     07:00   8       3       28 Apr   07 Jun
BA        178    JFK  08:45  LHR  20:35     EQV  0  1234567     06:50   7       4       27 May   03 Jun
AF/DL     3663   JFK  08:55  LHR  21:10     767  0  1234567     07:15   3       4       30 Mar   24 Oct
DL        3      JFK  08:55  LHR  21:10     767  0  1234567     07:15   3       4       30 Mar   04 Jun
E0        10     JFK  09:00  STN  21:00     752  0  1-----7     07:00   4       -       26 May   21 Jul
CO        18     EWR  09:00  LHR  21:15     762  0  1234567     07:15   C       4       30 Mar   25 Oct
DL        5      JFK  17:30  LGW  06:25 +1  752  0  1234567     07:55   3       N       26 May   04 Jun
AF/DL     3665   JFK  17:30  LGW  06:25 +1  752  0  1234567     07:55   3       N       12 Apr   24 Oct
AA        100    JFK  18:05  LHR  06:25 +1  777  0  1234567     07:20   8       3       30 Apr   30 Jun
BA        112    JFK  18:15  LHR  06:25 +1  744  0  1234567     07:10   7       4       29 Apr   15 Jun
BA        184    EWR  18:30  LHR  06:25 +1  777  0  1234567     06:55   B       4       29 Apr   14 Aug
CO/VS     8224   JFK  18:30  LHR  06:35 +1  346  0  1234567     07:05   4       3       30 Mar   24 Oct
VS        4      JFK  18:30  LHR  06:35 +1  346  0  1234567     07:05   4       3       30 Mar   24 Oct
AA        104    JFK  18:35  LHR  06:55 +1  777  0  12345-7     07:20   8       3       30 Mar   30 Jun
VS/CO     3114   EWR  18:35  LGW  06:55 +1  757  0  1234567     07:20   C       N       29 Mar   24 Oct
CO        114    EWR  18:35  LGW  06:55 +1  757  0  1234567     07:20   C       N       05 May   05 Jun
CO        28     EWR  18:40  LHR  06:45 +1  777  0  1234567     07:05   C       4       04 May   24 Oct
BA        174    JFK  18:45  LHR  06:55 +1  744  0  ---4---     07:10   7       4       01 May   23 Oct
E0        2      JFK  18:45  STN  07:10 +1  752  0  1234567     07:25   4       -       25 May   06 Jun
AA        124    JFK  18:50  STN  07:15 +1  763  0  1234567     07:25   8       -       30 Mar   24 Oct
BA        174    JFK  19:00  LHR  07:15 +1  744  0  123-567     07:15   7       4       29 Apr   08 Jun
E0        22     EWR  19:30  STN  07:30 +1  752  0  12345-7     07:00   B       -
Y7        102    EWR  19:30  LTN  07:40 +1  762  0  1234567     07:10   B       -       29 Mar   24 Oct
BA        176    JFK  19:30  LHR  07:40 +1  744  0  1234567     07:10   7       4       27 May   03 Jun
CO/VS     8246   JFK  19:40  LHR  07:50 +1  744  0  1234567     07:10   4       3       30 Mar   24 Oct
VS        46     JFK  19:40  LHR  07:50 +1  EQV  0  1234567     07:10   4       3       27 May   03 Jun
AI        112    JFK  19:45  LHR  07:30 +1  77W  0  1234567     06:45   4       3       03 Apr   25 Oct
E0        4      JFK  19:45  STN  08:15 +1  752  0  ----5--     07:30   4       -       30 May   06 Jun
BA        116    JFK  20:01  LHR  08:10 +1  744  0  1234567     07:09   7       4       27 May   03 Jun
VS/CO     3116   EWR  20:25  LGW  08:40 +1  757  0  1234567     07:15   C       N       30 Mar   24 Oct
CO        116    EWR  20:25  LGW  08:40 +1  757  0  1234567     07:15   C       N       30 Mar   24 Oct
E0        6      JFK  20:30  STN  08:45 +1  752  0  1234567     07:15   4       -       22 Apr   04 Jul
AF/DL     3664   JFK  20:30  LHR  09:00 +1  767  0  ----5--     07:30   3       4       11 Apr   25 Jul
DL        1      JFK  20:30  LHR  09:00 +1  767  0  ----5--     07:30   3       4       02 May   30 May
AF/DL     3664   JFK  20:55  LHR  09:25 +1  767  0  1234-67     07:30   3       4       27 May   04 Jun
DL        1      JFK  20:55  LHR  09:25 +1  767  0  1234-67     07:30   3       4       01 May   04 Jun
BA        188    EWR  21:00  LHR  09:15 +1  767  0  -----6-     07:15   B       4       03 May   18 Oct
CO/VS     8222   EWR  21:10  LHR  09:05 +1  346  0  1234567     06:55   B       3       01 May   24 Oct
VS        2      EWR  21:10  LHR  09:05 +1  346  0  1234567     06:55   B       3       01 May   24 Oct
BA        188    EWR  21:15  LHR  09:15 +1  777  0  12345-7     07:00   B       4       29 Apr   15 Jun
BA        114    JFK  21:20  LHR  09:20 +1  744  0  1234567     07:00   7       4       29 Apr   15 Jun
AA        132    JFK  21:20  LHR  09:25 +1  777  0  1234567     07:05   8       3       06 Apr   04 Jul
CO/VS     8230   JFK  21:35  LHR  09:30 +1  343  0  1234567     06:55   4       3       01 May   24 Oct
VS        10     JFK  21:35  LHR  09:30 +1  343  0  1234567     06:55   4       3       01 May   24 Oct
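If Perl isn't your thing, here is a rough Python sketch of the kind of processing I mean: it turns timetable rows like the ones in this listing into the CSV fields asked for in the first post. Treat all of it as assumption rather than anything KVS documents: the column order is inferred from this one sample, it expects one flight per line, `parse_row`, `decode_days` and `to_csv` are names made up for the sketch, and the day digits are read as 1 = Monday through 7 = Sunday, which is the usual schedule convention but worth verifying. A carrier written like AF/DL is kept verbatim; it presumably marks a code-share.

```python
import csv
import io
import re

# Assumed layout: carrier flight origin depart dest arrive [+N] aircraft stops frequency ...
ROW = re.compile(
    r"^(?P<carrier>[A-Z0-9]{2}(?:/[A-Z0-9]{2})?)\s+"
    r"(?P<number>\d{1,4})\s+"
    r"(?P<origin>[A-Z]{3})\s+(?P<depart>\d{2}:\d{2})\s+"
    r"(?P<dest>[A-Z]{3})\s+(?P<arrive>\d{2}:\d{2})"
    r"(?:\s+\+(?P<day_offset>\d))?\s+"
    r"(?P<aircraft>\w{3})\s+(?P<stops>\d)\s+"
    r"(?P<frequency>[1-7-]{7})(?=\s|$)"
)

# Assumption: digit 1 means Monday and digit 7 means Sunday.
DAY_NAMES = ["Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun"]
FIELDS = ["flight", "origin", "dest", "depart", "arrive",
          "day_offset", "frequency", "days"]


def decode_days(frequency):
    """Turn a pattern such as '123456-' into a list of day names."""
    return [DAY_NAMES[int(ch) - 1] for ch in frequency if ch != "-"]


def parse_row(line):
    """Return a dict for a flight row, or None for headers and rules."""
    match = ROW.match(line.strip())
    if match is None:
        return None
    rec = match.groupdict()
    rec["flight"] = rec.pop("carrier") + rec.pop("number")
    rec["day_offset"] = int(rec["day_offset"] or 0)  # '+1' means next-day arrival
    rec["days"] = " ".join(decode_days(rec["frequency"]))
    return rec


def to_csv(lines):
    """Convert an iterable of timetable lines into CSV text."""
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=FIELDS,
                            extrasaction="ignore", lineterminator="\n")
    writer.writeheader()
    for line in lines:
        rec = parse_row(line)
        if rec is not None:
            writer.writerow(rec)
    return out.getvalue()
```

Something like `print(to_csv(open("nyc_lon.txt")))` would then dump the whole listing, where `nyc_lon.txt` is just a placeholder for wherever you pasted the text. Rows with different effective dates (like the two BA 174 entries) come out as separate lines, so you would still need to decide how to merge them.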
#8
Original Poster




Join Date: Aug 2005
Location: DEN, FL350
Programs: IHTFP, AS 100k, ex-AA EXP, ex-UA 1K, Ex-BD/A6/AC *G (RIP)
Posts: 518
Thanks everyone for the input. These are all great suggestions that I'll have to look into. The professional data sources are probably out of our price range, but I'll see what I can get out of them for free.
Wow, I can't believe that I missed the KVS option (I've subscribed to KVS in the past), and I had forgotten that it outputs data in that format. Thanks Zorn!

