FlyerTalk Forums


HGF Sep 27, 2014 4:33 pm

Flight Finding program
 
Hey all,

I've been working on a flight-finding program that queries ITA's Matrix for flights and alerts you to cheap or low-ppm ones. I took a break for a few months due to life events, but I am going to be starting up again.

Right now, it only searches out of ORD to any number of input destinations, between the specified dates and for the specified range of trip lengths. It then prints out the lowest-ppm and lowest-priced trips per destination, I believe.
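To make the "lowest ppm and lowest priced trips per destination" step concrete, here is a minimal sketch. It assumes ppm means price per mile, and the `Trip` class, its fields, and the sample numbers are all made up for illustration; none of this is taken from the flight_finder repository.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Trip:
    """One priced round trip. Hypothetical fields, not the repo's data model."""
    destination: str
    price_usd: float
    miles_flown: float

    @property
    def ppm(self) -> float:
        # Price per mile: lower is better for a mileage run.
        return self.price_usd / self.miles_flown


def best_per_destination(trips):
    """Return {destination: (lowest_ppm_trip, lowest_price_trip)}."""
    best = {}
    for trip in trips:
        # Start from the current trip if this destination has not been seen yet.
        by_ppm, by_price = best.get(trip.destination, (trip, trip))
        if trip.ppm < by_ppm.ppm:
            by_ppm = trip
        if trip.price_usd < by_price.price_usd:
            by_price = trip
        best[trip.destination] = (by_ppm, by_price)
    return best


if __name__ == "__main__":
    # Made-up prices and distances, only to exercise the function.
    sample = [
        Trip("LAX", 300.0, 4000.0),
        Trip("LAX", 280.0, 2800.0),
        Trip("HNL", 700.0, 8000.0),
    ]
    for dest, (by_ppm, by_price) in sorted(best_per_destination(sample).items()):
        print(f"{dest}: lowest ppm {by_ppm.ppm:.4f}, lowest price ${by_price.price_usd:.2f}")
```

Note that the cheapest trip and the lowest-ppm trip to the same destination can differ, which is why both are tracked separately.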

I was working on making it RESTful since I don't like doing UI work. After that, setting up a DB to cache results would probably be best. Matrix likes to throw errors relating to random failures or exceeding quotas a lot.

If anyone wants to help out, I have it on GitHub here: https://github.com/HGF/flight_finder

gnutello Sep 28, 2014 8:33 am

Please consider merging with this project. You'll see that some of the legwork is done already. There are some more recent updates in the branches. They've already put a little consideration in how to store the results in a (NoSQL) database.

https://github.com/mayanez/flight_scraper

HGF Sep 28, 2014 12:46 pm

The goals of that project and my project are different. I am not trying to determine a correlation between seat availability and price fare information. I just want to find great flight deals for either mileage runs or cheap vacations by essentially automating the process that everyone already does on Matrix. There also hasn't been an update in over 6 months and a pull request has been pending for over a month.

I also want to do a lot of the legwork, so I can learn.

mgo72 Sep 28, 2014 1:56 pm

Hello,

What help do you need?

HGF Sep 28, 2014 2:10 pm


Originally Posted by mgo72 (Post 23594619)
Hello,

What help do you need?

I need to implement a DB for caching. I am also in the process of making it RESTful, so someone can create a separate UI for it. Supporting multi-destination itineraries would be another good thing as well. Or, if you have your own features in mind, do those as well and they can be merged in.
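One possible shape for the caching DB, as a rough sketch only: a small SQLite table keyed by the search parameters, with an age limit so stale fares are not served forever. The class name, table layout, key format, and the six-hour default are all assumptions made for illustration, not anything from the repository.

```python
import json
import sqlite3
import time


class SearchCache:
    """Tiny SQLite-backed cache keyed by the search parameters (hypothetical)."""

    def __init__(self, path=":memory:", max_age_seconds=6 * 3600):
        self.max_age_seconds = max_age_seconds
        self.conn = sqlite3.connect(path)
        self.conn.execute(
            "CREATE TABLE IF NOT EXISTS results ("
            "search_key TEXT PRIMARY KEY, "
            "fetched_at REAL NOT NULL, "
            "payload TEXT NOT NULL)"
        )

    @staticmethod
    def make_key(origin, destination, depart_date, return_date):
        # One cache entry per distinct search.
        return "|".join([origin, destination, depart_date, return_date])

    def get(self, key, now=None):
        """Return the cached result, or None if missing or older than max_age_seconds."""
        now = time.time() if now is None else now
        row = self.conn.execute(
            "SELECT fetched_at, payload FROM results WHERE search_key = ?", (key,)
        ).fetchone()
        if row is None or now - row[0] > self.max_age_seconds:
            return None
        return json.loads(row[1])

    def put(self, key, result, now=None):
        """Store a JSON-serializable result, replacing any older entry."""
        now = time.time() if now is None else now
        self.conn.execute(
            "INSERT OR REPLACE INTO results (search_key, fetched_at, payload) "
            "VALUES (?, ?, ?)",
            (key, now, json.dumps(result)),
        )
        self.conn.commit()
```

The idea would be to call `get` before hitting Matrix and `put` after a successful search, so repeated or failed runs do not burn through the quota again.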

I should create a TODO list in the readme for what needs to be done, to give people some guidance.

CAETravlr Sep 28, 2014 8:31 pm

I have written something similar. My suggestion is to look for APIs if you can. You will have your IP blocked by Matrix if they determine that you are scraping their site. I learned the hard way, but probably because I got ambitious and multithreaded it. Look up the QPX API and see if you can get access to that. I am in the process of refactoring now to use that.

HGF Sep 29, 2014 5:53 pm


Originally Posted by CAETravlr (Post 23596057)
I have written something similar. My suggestion is to look for APIs if you can. You will have your IP blocked by Matrix if they determine that you are scraping their site. I learned the hard way, but probably because I got ambitious and multithreaded it. Look up the QPX API and see if you can get access to that. I am in the process of refactoring now to use that.

Using the QPX API costs money per query though.

sokolov Sep 30, 2014 12:20 am

One problem is that ITA Matrix searches are only yay deep. You will never see everything, especially if you are searching for periods longer than single days.

Furthermore, results may differ between the web version and the mobile app.

Caching can be interesting for historical comparison, but for current results, I don't know. You would be caching something that comes out of a cache already. :-)

CAETravlr Oct 1, 2014 8:04 am


Originally Posted by HGF (Post 23601020)
Using the QPX API costs money per query though.

After the first 50 queries each day, I believe. And even then, it is not very expensive. How many queries are you planning to run? Granted, I haven't yet refactored my program to get the data this way instead of scraping Matrix like I did before, but I think it is a small price to pay for a more reliable interface and for staying within their terms of use.

sokolov Oct 5, 2014 3:37 pm


Originally Posted by HGF (Post 23601020)
Using the QPX API costs money per query though.

Would you have the link to the pricing structure?

angatol Oct 5, 2014 6:28 pm

.....

bittihuduga Oct 9, 2014 3:52 pm

I get an error in the Mac terminal:

    python FlightFinder.py
    Traceback (most recent call last):
      File "FlightFinder.py", line 5, in <module>
        from ITADao import ITADao
      File "flight_finder-master/flight_finder/ITADao.py", line 4, in <module>
        import requests
    ImportError: No module named requests

Debonaire Oct 9, 2014 4:36 pm

It looks like you're just missing the requests module? Can you pip install requests?

Debonaire Oct 9, 2014 4:45 pm


Originally Posted by sokolov (Post 23630607)
Would you have the link to the pricing structure?

I think this is it: https://developers.google.com/qpx-express/v1/pricing

How does one get seat availability from ITA? Is the solution just to increase the number of passengers (up to 9)? That would significantly increase the number of queries.

Why not just match it with fare class availability via, say, ExpertFlyer?

gnutello Oct 10, 2014 12:00 am


Originally Posted by HGF (Post 23594365)
The goals of that project and my project are different. I am not trying to determine a correlation between seat availability and price fare information. I just want to find great flight deals for either mileage runs or cheap vacations by essentially automating the process that everyone already does on Matrix. There also hasn't been an update in over 6 months and a pull request has been pending for over a month.

I also want to do a lot of the legwork, so I can learn.

My pull request finally got accepted. While the purpose of the project might be different, I see no harm in forking it -- particularly since it already has a NoSQL database integrated. Basically all you have to do is implement more XMLHttpRequests. The code is pretty easy to work with -- at the minimum, you can reuse a lot of its basic functionality as long as you redesign the database "tables".

When I get the chance, I'll take a look at your code and see if there's a fast way to port over some of the other code. You'll probably want to look into using Tor with Python to allow multithreaded queries. Also, when you make requests from ITA, make a driver that repeats requests until you get past the "server capacity exceeded" error, or 1 minute passes.
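The "repeat until it succeeds or a minute passes" driver could look roughly like the sketch below. `CapacityExceeded` and `fetch` are stand-ins: how the real code detects the "server capacity exceeded" response is not shown in this thread, so that part is an assumption. A pause between attempts is included because hammering the site is what reportedly gets IPs blocked.

```python
import time


class CapacityExceeded(Exception):
    """Hypothetical error raised when the server reports it is over capacity."""


def retry_until(fetch, timeout_seconds=60.0, delay_seconds=2.0,
                clock=time.monotonic, sleep=time.sleep):
    """Call fetch() until it succeeds or timeout_seconds have elapsed.

    clock and sleep are injectable so the loop can be tested without waiting.
    """
    deadline = clock() + timeout_seconds
    while True:
        try:
            return fetch()
        except CapacityExceeded:
            # Give up if the next attempt would start after the deadline.
            if clock() + delay_seconds > deadline:
                raise
            sleep(delay_seconds)
```

Usage would be something like `retry_until(lambda: run_search(params))`, where `run_search` is whatever function performs a single Matrix request and raises `CapacityExceeded` on that specific error (both names are placeholders).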

gnutello Oct 10, 2014 8:37 am

Okay, multi-city searching is more or less implemented in your code. You'll have to make some changes to the trip print statements to see the full itinerary.

HGF Oct 11, 2014 2:31 pm


Originally Posted by gnutello (Post 23656128)
Okay, multi-city searching is more or less implemented in your code. You'll have to make some changes to the trip print statements to see the full itinerary.

Awesome. I reviewed your pull request and left some comments.

sokolov Oct 12, 2014 7:05 pm


Originally Posted by Debonaire (Post 23653184)

Oh. 3.5 cents per query. That is orders of magnitude more than I had expected - and you don't get more than 500 itineraries per query. So 30-day searches are pointless.

Thank you for the link!
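For a feel of what that adds up to, here is a back-of-the-envelope sketch. It uses only the figures quoted in this thread (3.5 cents per query, and the first 50 queries each day free, which was stated as a belief rather than a confirmed number). The one-query-per-combination assumption and the function names are mine, not part of any pricing documentation.

```python
def queries_needed(num_destinations, num_departure_dates, num_trip_lengths):
    # Assumes one query per (destination, departure date, trip length) combination.
    return num_destinations * num_departure_dates * num_trip_lengths


def daily_cost_usd(queries_per_day, free_per_day=50, price_per_query_usd=0.035):
    """Estimated daily bill under the figures quoted in this thread."""
    billable = max(0, queries_per_day - free_per_day)
    return billable * price_per_query_usd


if __name__ == "__main__":
    # Example: 10 destinations, 30 departure dates, 3 trip lengths.
    n = queries_needed(10, 30, 3)
    print(f"{n} queries, about ${daily_cost_usd(n):.2f} per day")
```

With those example inputs that is 900 queries, 850 of them billable, or about $29.75 for a single full sweep per day.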

bridgeair Oct 13, 2014 1:14 am

This is interesting. One problem I have found with the matrix.ita tool is that some itineraries it comes up with are not bookable in reality. At least, I have found this with Chinese airlines.

sokolov Oct 14, 2014 9:01 pm


Originally Posted by bridgeair (Post 23668038)
This is interesting. One problem I have found with the matrix.ita tool is that some itineraries it comes up with are not bookable in reality. At least, I have found this with Chinese airlines.

Maybe. It all depends on the data the airlines provide, and this data is being cached, so it is not necessarily up to date.

Or maybe you are not using the right sales channel. Or the right location of ticket issue. Or the right currency.

Or you need a really good travel agent. I have booked several tickets that no travel website would be able to provide (and definitely not the airline's own booking engines). And only certain travel agents after very detailed instructions were able to do it. The ITA Matrix can be ingenious - or wrong. Most of the time it is simply doing a good job. :-)

s0ssos Oct 31, 2014 12:39 pm

Anybody figure out how to determine how much ITA searches? I know there is a limit, because you can search for A to B and it won't find the cheapest option (A to C to B), and if you limit the airlines sometimes you will find that A to C to B option.

I just find I have to plug in and plug in and churn away. It is really annoying.

I guess my question is what is the limit, or is it unknowable? Like, if I search daily and only from one airport to another is that good enough to look at all possibilities? Or could I stretch it to 2 cities to 1 city?

s0ssos Nov 1, 2014 12:04 pm


Originally Posted by angatol (Post 23770245)
I suspect it's a time limit, i.e. you just get the best option it found in the time allocated to your search. If you really want to find the best option you need to have the most restrictive search. e.g. if I were searching for options from FCO to SEA on oneworld, I'd try FCO::SEA/alliance oneworld, get a fare basis, say, I7SALE, then search FCO::SEA/alliance oneworld;f ..I7SALE. When it offers me LHR, I'd then reject it in the next search with FCO::~LHR/alliance oneworld;f ..I7SALE, etc. etc.

I know, but how do you know what the time limit is? Or where it cut off?
It'd be nice for it to say, searched 1250 out of 250000 options, or something.

Right now it is just us iterating. I guess there is no easy way for a machine to do it, as ITA charges for queries.

fuyao Nov 3, 2014 10:30 am


Originally Posted by s0ssos (Post 23774678)
I know, but how do you know what the time limit is? Or where it cut off?
It'd be nice for it to say, searched 1250 out of 250000 options, or something.

Right now it is just us iterating. I guess there is no easy way for a machine to do it, as ITA charges for queries.

Blogs used to say the time limit is 30 seconds, but it's much shorter.
From my daily experience I'd say it's under 10 seconds or so.


All times are GMT -6.

