ITA software scrape

Old Dec 4, 2018, 7:57 am
  #1  
Original Poster
 
Join Date: Sep 2004
Programs: LH HON, AF Plat, BA Gold, SPG Plat, HHo Gold
Posts: 155
ITA software scrape

Just curious whether anybody has succeeded in creating a scraping script for ITA Software. I have seen a post over on headforpoints that suggested it was possible. Any insight?

Thanks,

Jan
Jan@BRU is offline  
Old Dec 4, 2018, 11:28 pm
  #2  
Moderator: Hyatt; FlyerTalk Evangelist
 
Join Date: Jun 2015
Location: WAS
Programs: :rolleyes:, DL DM, Mlife Plat, Caesars Diam, Marriott Tit, UA Gold, Hyatt Glob, invol FT beta tester
Posts: 18,889
Originally Posted by Jan@BRU
Just curious whether anybody has succeeded in creating a scraping script for ITA Software. I have seen a post over on headforpoints that suggested it was possible. Any insight?

Thanks,

Jan
What exactly do you mean by scraping?

Are you looking for functionality provided by either the Userscript or bookwithmatrix?

ITA-Matrix-PowerTools - Userscript for Orbitz/DL/UA/AA/BA/CZ/IB/LA/LH/LX/TK

BookWithMatrix.com: a tool to easily book with ITA Matrix
Zorak is offline  
Old Dec 7, 2018, 8:22 am
  #3  
Original Poster
 
Join Date: Sep 2004
Programs: LH HON, AF Plat, BA Gold, SPG Plat, HHo Gold
Posts: 155
Zorak,
No, that's not what I mean. From what I understand, the tools you mention assist you in purchasing specific search results, which indeed is a big help.

Scraping refers to automating requests sent to a website and harvesting the results.
Take for example the recently introduced limitation on ITA that restricts jointly queried departure points to cities in a single country. If you want to do a regional query of flights from all departure points in neighboring countries, you need to send several requests.
A scraping script would send out a request, wait for the results to display, and then store the results before moving on to the next request automatically, etc.
It's obvious that Google discourages that kind of data harvesting and has put a few barriers in place against it. I was wondering whether somebody has managed to successfully scrape ITA nonetheless.
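
As a rough illustration of the loop described above (one request per country, wait, store, move on), here is a minimal Python sketch. The `search` argument is a hypothetical stand-in for whatever actually fetches results for a list of origins; nothing in this thread documents how ITA's site is queried, so that part is deliberately left out. The `price` key and the default pause are assumptions as well.

```python
import time
from typing import Callable, Dict, List


def run_regional_search(
    origins_by_country: Dict[str, List[str]],
    search: Callable[[List[str]], List[dict]],
    pause_seconds: float = 5.0,
) -> List[dict]:
    """Issue one search per country and pool the results into one list."""
    pooled: List[dict] = []
    countries = list(origins_by_country.items())
    for index, (country, origins) in enumerate(countries):
        # One request per country, since origins may not span countries.
        for row in search(origins):
            pooled.append({"country": country, **row})
        # Pause between requests (not after the last one).
        if index < len(countries) - 1:
            time.sleep(pause_seconds)
    # Cheapest first, so the pooled list reads like a single regional query.
    return sorted(pooled, key=lambda row: row["price"])
```

Passing the fetching step in as a function keeps the pooling logic testable with canned data, whatever ends up doing the real requests.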

Thanks

Jan
Jan@BRU is offline  
Old Dec 7, 2018, 10:10 am
  #4  
Moderator: Hyatt; FlyerTalk Evangelist
 
Join Date: Jun 2015
Location: WAS
Programs: :rolleyes:, DL DM, Mlife Plat, Caesars Diam, Marriott Tit, UA Gold, Hyatt Glob, invol FT beta tester
Posts: 18,889
I know what scraping means in general, but wanted to understand your specific use case and whether it had to do with booking or searching.

IMO if someone wanted to automate the searching process or aggregate the results of multiple searches, a more direct approach might be to use Developer Tools to try and figure out the underlying search API and then issue customized requests directly to that instead of going via the UI.
Zorak is offline  
Old Dec 7, 2018, 12:40 pm
  #5  
 
Join Date: Feb 2015
Location: London
Posts: 203
Same problem discussed in a slightly different way. Here in Western Europe several international airports may be within 150 miles of each other but in four different countries, so while the old ITA Matrix allowed a quick search, that has now become more tedious. Any suggestions on how a searcher might extend a search to 'airports within x miles'?
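
One way to build an 'airports within x miles' list yourself is to compute great-circle distances with the haversine formula and then feed the resulting codes into separate per-country searches. The sketch below assumes a small hand-made table of airports; the coordinates are rounded approximations for illustration only, and `airports_within` is a made-up helper name, not part of any ITA tool.

```python
from math import asin, cos, radians, sin, sqrt
from typing import Dict, List, Tuple

EARTH_RADIUS_MILES = 3958.8  # mean Earth radius

# Approximate (latitude, longitude) in degrees, rounded; illustration only.
AIRPORTS: Dict[str, Tuple[float, float]] = {
    "BRU": (50.90, 4.48),
    "AMS": (52.31, 4.76),
    "DUS": (51.29, 6.77),
    "LUX": (49.62, 6.20),
    "CDG": (49.01, 2.55),
}


def great_circle_miles(a: Tuple[float, float], b: Tuple[float, float]) -> float:
    """Haversine distance in miles between two (lat, lon) points in degrees."""
    lat1, lon1 = map(radians, a)
    lat2, lon2 = map(radians, b)
    h = sin((lat2 - lat1) / 2) ** 2 + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2
    return 2 * EARTH_RADIUS_MILES * asin(sqrt(h))


def airports_within(
    origin: str,
    radius_miles: float,
    airports: Dict[str, Tuple[float, float]] = AIRPORTS,
) -> List[str]:
    """Return codes of other airports within radius_miles of origin, nearest first."""
    center = airports[origin]
    distances = [
        (great_circle_miles(center, position), code)
        for code, position in airports.items()
        if code != origin
    ]
    return [code for distance, code in sorted(distances) if distance <= radius_miles]
```

Calling `airports_within("BRU", 150)` would give the nearby codes to group by country before searching.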
Just a thought

Last edited by gbs1112; Dec 7, 2018 at 12:41 pm Reason: spelling
gbs1112 is offline  
Old Dec 12, 2018, 4:48 am
  #6  
Original Poster
 
Join Date: Sep 2004
Programs: LH HON, AF Plat, BA Gold, SPG Plat, HHo Gold
Posts: 155
Originally Posted by Zorak
I know what scraping means in general, but wanted to understand your specific use case and whether it had to do with booking or searching.

IMO if someone wanted to automate the searching process or aggregate the results of multiple searches, a more direct approach might be to use Developer Tools to try and figure out the underlying search API and then issue customized requests directly to that instead of going via the UI.
Sorry, yes it's about the searching, not the booking.

I appreciate that a developer API might be better; it just so happens that ITA Software closed down its already limited QPX API last month.

So any thoughts on a JS- or VB.NET-based routine that jumps the ITA Software anti-scraping hurdles would be welcome.
Jan@BRU is offline  
Old Dec 13, 2018, 7:44 pm
  #7  
Moderator: Hyatt; FlyerTalk Evangelist
 
Join Date: Jun 2015
Location: WAS
Programs: :rolleyes:, DL DM, Mlife Plat, Caesars Diam, Marriott Tit, UA Gold, Hyatt Glob, invol FT beta tester
Posts: 18,889
Originally Posted by Jan@BRU
Sorry, yes it's about the searching, not the booking.

I appreciate that a developer API might be better; it just so happens that ITA Software closed down its already limited QPX API last month.

So any thoughts on a JS- or VB.NET-based routine that jumps the ITA Software anti-scraping hurdles would be welcome.
There may not be a public-facing API anymore but there's still whatever API their UI talks to. As I said above, my first thought absent any better suggestions would be to use Developer Tools in a browser, sniff whatever requests it's making and try to reverse engineer what it's doing.
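
To make the Developer Tools suggestion a bit more concrete: browsers can export the Network tab as a HAR file, which is plain JSON, and a short script can list which endpoints the page called. The sketch below is only a starting point for that kind of inspection. It assumes the interesting calls return JSON, which may not hold for this site, and `summarize_har` is a made-up helper name.

```python
import json
from collections import Counter
from typing import List, Tuple
from urllib.parse import urlsplit


def summarize_har(har_text: str) -> List[Tuple[str, str, int]]:
    """Count JSON-returning requests in a HAR export, grouped by method and path."""
    har = json.loads(har_text)
    counts: Counter = Counter()
    for entry in har["log"]["entries"]:
        mime = entry["response"]["content"].get("mimeType", "")
        if "json" not in mime:
            continue  # skip images, scripts, stylesheets, and HTML
        request = entry["request"]
        parts = urlsplit(request["url"])
        # Drop the query string so repeated calls to one endpoint group together.
        counts[(request["method"], parts.netloc + parts.path)] += 1
    return [(method, endpoint, n) for (method, endpoint), n in counts.most_common()]
```

The endpoints that show up most often while you run a search by hand are the first candidates to look at more closely.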
Zorak is offline  
Old Dec 17, 2018, 9:56 am
  #8  
Original Poster
 
Join Date: Sep 2004
Programs: LH HON, AF Plat, BA Gold, SPG Plat, HHo Gold
Posts: 155
Originally Posted by Zorak
There may not be a public-facing API anymore but there's still whatever API their UI talks to. As I said above, my first thought absent any better suggestions would be to use Developer Tools in a browser, sniff whatever requests it's making and try to reverse engineer what it's doing.
Zorak,
Oh yes, thanks. I misread that. I obviously tried the sniff approach already. That works pretty well with many websites, particularly if they use the URL to submit the request.
Unfortunately, this site is a lot more sophisticated and way beyond my capabilities. I can't even fill in the user form, with all kinds of cookies and session IDs and whatnot ;-(
Jan@BRU is offline  
Old Dec 17, 2018, 6:08 pm
  #9  
 
Join Date: Nov 2018
Location: San Francisco
Programs: DL
Posts: 466
ITA's API is quite obfuscated. It will probably be easier to screen-scrape it, but you'll presumably get blocked at some point, either because of IP-level throttling, JS checks, or something else. I don't know what ITA does in terms of anti-abuse but it's pretty easy to imagine many ways they can silently screw you (start randomizing prices, hiding low-fare classes, etc).
dlflyer00 is offline  
Old Dec 19, 2018, 1:36 pm
  #10  
 
Join Date: Oct 2013
Posts: 639
yes, very easy to scrape ITA and no IP ban in place
fuyao is offline  
Old Jan 18, 2019, 8:06 am
  #11  
 
Join Date: Nov 2007
Location: Beijing
Posts: 349
Originally Posted by fuyao
yes, very easy to scrape ITA and no IP ban in place
care to elaborate please?
timesnaps is offline  
Old Jan 24, 2019, 10:08 pm
  #12  
 
Join Date: Jun 2005
Location: 🇸🇬 🇭🇰 🇫🇷
Programs: Many
Posts: 4,749
Originally Posted by timesnaps
care to elaborate please?
+1
bodory is offline  
Old Jan 27, 2019, 3:29 pm
  #13  
 
Join Date: Sep 2012
Posts: 29
Has anyone tried scraping using the Selenium driver? I tried, but I get different results on ITA Matrix. It shows only a couple of airlines instead of all airlines.
maverick2202 is offline  
Old Mar 3, 2019, 3:44 am
  #14  
UNC
 
Join Date: Dec 2018
Programs: AY+
Posts: 16
Originally Posted by maverick2202
Has anyone tried scraping using the Selenium driver? I tried, but I get different results on ITA Matrix. It shows only a couple of airlines instead of all airlines.
I built a scraper that goes through the advanced controls (using Selenium) and specifically searches for a given airline. The scraping seems to give higher prices than a manual search, but I didn't bother to check further for the reason behind this. Maybe specifying the fare basis would help with this?
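
For anyone wanting to check whether scraped results really differ from a manual search, a simple spot check is to note a few manual prices and compare them with the scraped ones. This is a minimal sketch under assumptions: both inputs are plain dictionaries keyed by an itinerary label of your own choosing, and `compare_runs` and the 1% tolerance are made up for illustration.

```python
from typing import Dict, List, Tuple


def compare_runs(
    manual: Dict[str, float],
    scraped: Dict[str, float],
    tolerance: float = 0.01,
) -> Tuple[List[str], List[Tuple[str, float, float]]]:
    """Return itineraries missing from the scrape, and those whose price differs."""
    # Itineraries seen manually but absent from the scraped results.
    missing = sorted(manual.keys() - scraped.keys())
    gaps: List[Tuple[str, float, float]] = []
    for itinerary in sorted(manual.keys() & scraped.keys()):
        manual_price = manual[itinerary]
        scraped_price = scraped[itinerary]
        # Flag differences larger than the relative tolerance.
        if abs(scraped_price - manual_price) > tolerance * manual_price:
            gaps.append((itinerary, manual_price, scraped_price))
    return missing, gaps
```

A non-empty `missing` list would match the "only a couple of airlines" symptom, while entries in `gaps` would match the higher-prices symptom.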
UNC is offline  
Old Apr 28, 2022, 9:38 am
  #15  
 
Join Date: Jan 2015
Posts: 234
Originally Posted by fuyao
yes, very easy to scrape ITA and no IP ban in place
What about the new ITA Matrix, does this statement apply to that too?
drvb is offline  


