FlyerTalk Forums


Jan@BRU Dec 4, 2018 7:57 am

ITA software scrape
 
Just curious, whether anybody has succeeded in creating a scraping script for ITA software. I have seen a post over on headforpoints, which suggested it was possible. Any insight?

Thanks,

Jan

Zorak Dec 4, 2018 11:28 pm


Originally Posted by Jan@BRU (Post 30497520)
Just curious, whether anybody has succeeded in creating a scraping script for ITA software. I have seen a post over on headforpoints, which suggested it was possible. Any insight?

Thanks,

Jan

What exactly do you mean by scraping?

Are you looking for functionality provided by either the Userscript or bookwithmatrix?

https://www.flyertalk.com/forum/trav...-lh-lx-tk.html

https://www.flyertalk.com/forum/trav...ta-matrix.html

Jan@BRU Dec 7, 2018 8:22 am

Zorak,
No, that's not what I mean. From what I understand, the tools you mention assist you in purchasing specific search results, which is indeed a big help.

Scraping refers to automating requests sent to a website and harvesting the results.
Take, for example, the recently introduced ITA restriction that limits jointly queried departure points to cities in a single country. If you want to run a regional query of flights from all departure points in neighboring countries, you need to send several requests.
A scraping script would send out a request, wait for the results to display and then store them before automatically moving on to the next request, and so on.
It's obvious that Google discourages that kind of data harvesting and has put a few barriers in place against it. I was wondering whether somebody has managed to scrape ITA successfully nonetheless.

Thanks

Jan
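
For illustration, the per-country batching described above can be sketched in Python. This is only a sketch under stated assumptions: the airport-to-country mapping is hand-written, and `fake_search` is a hypothetical stand-in for whatever actually performs one search. Nothing here talks to ITA Matrix.

```python
from collections import defaultdict
from typing import Callable, Dict, List


def group_by_country(airports: Dict[str, str]) -> Dict[str, List[str]]:
    """Map each country code to a sorted list of its airport codes."""
    groups = defaultdict(list)
    for code, country in airports.items():
        groups[country].append(code)
    return {country: sorted(codes) for country, codes in sorted(groups.items())}


def run_batches(
    airports: Dict[str, str],
    destination: str,
    search: Callable[[List[str], str], list],
) -> Dict[str, list]:
    """Run one search per country and collect the results by country."""
    results = {}
    for country, origins in group_by_country(airports).items():
        # One request per country, because origins cannot span countries.
        results[country] = search(origins, destination)
    return results


def fake_search(origins: List[str], destination: str) -> list:
    """Hypothetical placeholder: returns route labels instead of real fares."""
    return [f"{origin}-{destination}" for origin in origins]


# Assumed example data, not taken from the thread.
ORIGINS = {"BRU": "BE", "AMS": "NL", "DUS": "DE", "CGN": "DE", "LUX": "LU"}
batches = run_batches(ORIGINS, "JFK", fake_search)
for country, routes in batches.items():
    print(country, routes)
```

Passing the search step in as a function keeps the batching logic separate from however the results are actually obtained.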

Zorak Dec 7, 2018 10:10 am

I know what scraping means in general, but wanted to understand your specific use case and whether it had to do with booking or searching.

IMO if someone wanted to automate the searching process or aggregate the results of multiple searches, a more direct approach might be to use Developer Tools to try and figure out the underlying search API and then issue customized requests directly to that instead of going via the UI.
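
One way to start on the Developer Tools approach is to export the Network panel as a HAR file and list the requests that look like API calls. The sketch below assumes the standard HAR layout (`log.entries[].request` and `.response`). The sample capture and its `example.com` URLs are made up for illustration and say nothing about what ITA Matrix actually sends.

```python
import json
from typing import List, Tuple

# Made-up sample capture in HAR layout; the URLs are placeholders.
SAMPLE_HAR = json.loads("""
{"log": {"entries": [
  {"request": {"method": "GET", "url": "https://example.com/app.js"},
   "response": {"status": 200, "content": {"mimeType": "application/javascript"}}},
  {"request": {"method": "POST", "url": "https://example.com/search"},
   "response": {"status": 200, "content": {"mimeType": "application/json"}}},
  {"request": {"method": "GET", "url": "https://example.com/logo.png"},
   "response": {"status": 200, "content": {"mimeType": "image/png"}}}
]}}
""")


def list_api_calls(har: dict) -> List[Tuple[str, str, int]]:
    """Return (method, url, status) for POST requests or JSON responses."""
    calls = []
    for entry in har.get("log", {}).get("entries", []):
        request = entry.get("request", {})
        response = entry.get("response", {})
        mime = response.get("content", {}).get("mimeType", "")
        # Heuristic: API traffic is usually a POST or returns JSON.
        if request.get("method") == "POST" or "json" in mime:
            calls.append(
                (request.get("method", ""), request.get("url", ""), response.get("status", 0))
            )
    return calls


for method, url, status in list_api_calls(SAMPLE_HAR):
    print(method, url, status)
```

With a real capture, the same function could be fed `json.load(f)` on the exported file. The filter is only a heuristic and may need adjusting for the site in question.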

gbs1112 Dec 7, 2018 12:40 pm

Same problem discussed in a slightly different way. Here in Western Europe, several international airports may be within 150 miles of each other but in four different countries, so the quick search that the old ITA Matrix allowed has now become more tedious. Any suggestions on how a searcher might extend a search to 'airports within x miles'?
Just a thought
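
As a rough sketch of the 'airports within x miles' idea, the list of nearby airports can be computed locally with the haversine formula and then fed into separate per-country searches. The coordinates below are approximate values entered by hand for illustration, and `airports_within` is a hypothetical helper, not a Matrix feature.

```python
import math
from typing import Dict, List, Tuple

EARTH_RADIUS_MILES = 3958.8

# Approximate (latitude, longitude) in degrees; assumed values for illustration.
AIRPORTS: Dict[str, Tuple[float, float]] = {
    "BRU": (50.90, 4.48),
    "AMS": (52.31, 4.76),
    "DUS": (51.29, 6.77),
    "LUX": (49.63, 6.21),
    "CDG": (49.01, 2.55),
    "LHR": (51.47, -0.45),
}


def haversine_miles(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance between two points, in statute miles."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2
    return 2 * EARTH_RADIUS_MILES * math.asin(math.sqrt(a))


def airports_within(
    origin: str,
    radius_miles: float,
    airports: Dict[str, Tuple[float, float]] = AIRPORTS,
) -> List[str]:
    """Return the codes of all other airports within radius_miles of origin."""
    lat, lon = airports[origin]
    return sorted(
        code
        for code, (alat, alon) in airports.items()
        if code != origin and haversine_miles(lat, lon, alat, alon) <= radius_miles
    )


print(airports_within("BRU", 150))
```

The resulting list still has to be split by country before searching, given the single-country restriction mentioned earlier in the thread.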

Jan@BRU Dec 12, 2018 4:48 am


Originally Posted by Zorak (Post 30509296)
I know what scraping means in general, but wanted to understand your specific use case and whether it had to do with booking or searching.

IMO if someone wanted to automate the searching process or aggregate the results of multiple searches, a more direct approach might be to use Developer Tools to try and figure out the underlying search API and then issue customized requests directly to that instead of going via the UI.

Sorry, yes it's about the searching, not the booking.

I appreciate that a developer API might be better, but it just so happens that ITA Software closed down its already limited QPX API last month.

So any thoughts on a JS- or VB.NET-based routine that gets over the ITA Software anti-scraping hurdles would be welcome.

Zorak Dec 13, 2018 7:44 pm


Originally Posted by Jan@BRU (Post 30526473)
Sorry, yes it's about the searching, not the booking.

I appreciate that a developer API might be better, but it just so happens that ITA Software closed down its already limited QPX API last month.

So any thoughts on a JS- or VB.NET-based routine that gets over the ITA Software anti-scraping hurdles would be welcome.

There may not be a public-facing API anymore but there's still whatever API their UI talks to. As I said above, my first thought absent any better suggestions would be to use Developer Tools in a browser, sniff whatever requests it's making and try to reverse engineer what it's doing.

Jan@BRU Dec 17, 2018 9:56 am


Originally Posted by Zorak (Post 30533308)
There may not be a public-facing API anymore but there's still whatever API their UI talks to. As I said above, my first thought absent any better suggestions would be to use Developer Tools in a browser, sniff whatever requests it's making and try to reverse engineer what it's doing.

Zorak,
Oh yes, thanks. I misread that. I obviously tried the sniffing approach already. It works pretty well with many websites, particularly if they use the URL to submit the request.
Unfortunately, this site is a lot more sophisticated and way beyond my capabilities. I can't even fill in the user form, with all kinds of cookies and session IDs and whatnot ;-(

dlflyer00 Dec 17, 2018 6:08 pm

ITA's API is quite obfuscated. It will probably be easier to screen-scrape it, but you'll presumably get blocked at some point, either because of IP-level throttling, JS checks, or something else. I don't know what ITA does in terms of anti-abuse but it's pretty easy to imagine many ways they can silently screw you (start randomizing prices, hiding low-fare classes, etc).

fuyao Dec 19, 2018 1:36 pm

yes, very easy to scrape ITA and no IP ban in place

timesnaps Jan 18, 2019 8:06 am


Originally Posted by fuyao (Post 30554015)
yes, very easy to scrape ITA and no IP ban in place

care to elaborate please?

bodory Jan 24, 2019 10:08 pm


Originally Posted by timesnaps (Post 30670638)
care to elaborate please?

+1

maverick2202 Jan 27, 2019 3:29 pm

Has anyone tried scraping using the Selenium driver? I tried, but I get different results from ITA Matrix. It shows only a couple of airlines instead of all airlines.

UNC Mar 3, 2019 3:44 am


Originally Posted by maverick2202 (Post 30707561)
Has anyone tried scraping using the Selenium driver? I tried, but I get different results from ITA Matrix. It shows only a couple of airlines instead of all airlines.

I built a scraper that goes through the advanced controls (using Selenium) and searches specifically for a given airline. The scraper seems to return higher prices than a manual search, but I didn't bother to investigate the reason. Maybe specifying the fare basis would help with this?
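
A simple way to quantify such a gap is to spot-check scraped fares against a handful of manually recorded ones. The sketch below is only an illustration: the fares are made-up numbers, and `flag_price_gaps` and its 2% tolerance are arbitrary assumptions rather than anything from the posts above.

```python
from typing import Dict, List, Optional, Tuple


def flag_price_gaps(
    manual: Dict[str, float],
    scraped: Dict[str, float],
    tolerance: float = 0.02,
) -> List[Tuple[str, float, Optional[float]]]:
    """Return (route, manual_price, scraped_price) for suspicious routes.

    A route is flagged when the scraper returned nothing for it, or when
    the scraped price differs from the manual one by more than `tolerance`
    (expressed as a fraction of the manual price).
    """
    flagged = []
    for route, manual_price in sorted(manual.items()):
        scraped_price = scraped.get(route)
        if scraped_price is None:
            flagged.append((route, manual_price, None))
        elif abs(scraped_price - manual_price) > tolerance * manual_price:
            flagged.append((route, manual_price, scraped_price))
    return flagged


# Made-up fares for illustration only.
manual_fares = {"BRU-JFK": 450.0, "AMS-JFK": 480.0, "DUS-JFK": 500.0}
scraped_fares = {"BRU-JFK": 455.0, "AMS-JFK": 560.0}

for route, manual_price, scraped_price in flag_price_gaps(manual_fares, scraped_fares):
    print(route, manual_price, scraped_price)
```

A consistent gap across routes would suggest the automated session is being shown different inventory, while isolated gaps might simply reflect fares changing between the two searches.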

drvb Apr 28, 2022 9:38 am


Originally Posted by fuyao (Post 30554015)
yes, very easy to scrape ITA and no IP ban in place

What about the new ITA Matrix, does this statement apply to that too?


All times are GMT -6.

