Old Jan 27, 2024, 10:52 pm
FlyerTalk Forums Expert How-Tos and Guides
Last edit by: Adam Smith

AC files suit against seats.aero

Old Jan 4, 2024, 9:08 pm
  #196  
A FlyerTalk Posting Legend
 
Join Date: Sep 2012
Location: SFO
Programs: AC SE MM, BA Gold, SQ Silver, Bonvoy Tit LTG, Hyatt Glob, HH Diamond
Posts: 44,583
Originally Posted by YOWgary
If we're sticking to this analogy - and I don't think it's a very good one - then AC is charging the first driver they could catch for doing 130kph on every road in town, all day, every day, for months.
I was going to say the analogy is more like charging the first driver they could catch for doing Warp 9, but sure.
Old Jan 4, 2024, 10:02 pm
  #197  
 
Join Date: Oct 2013
Location: YOW
Programs: AC SE, FOTSG Platinum
Posts: 5,823
Originally Posted by canadiancow
I was going to say the analogy is more like charging the first driver they could catch for doing Warp 9, but sure.
Like I said, it's not a great analogy, but given that Warp 9 is theoretically almost as fast as it's possible for mortals to travel, I think you'll agree Seats.Aero is probably closer to 130 kph than many multiples of the speed of light.
Old Jan 4, 2024, 10:06 pm
  #198  
A FlyerTalk Posting Legend
 
Join Date: Sep 2012
Location: SFO
Programs: AC SE MM, BA Gold, SQ Silver, Bonvoy Tit LTG, Hyatt Glob, HH Diamond
Posts: 44,583
Originally Posted by YOWgary
Like I said, it's not a great analogy, but given that Warp 9 is theoretically almost as fast as it's possible for mortals to travel, I think you'll agree Seats.Aero is probably closer to 130 kph than many multiples of the speed of light.
Fine, Mach 9
supine likes this.
Old Jan 4, 2024, 10:15 pm
  #199  
 
Join Date: Oct 2013
Location: YOW
Programs: AC SE, FOTSG Platinum
Posts: 5,823
Originally Posted by canadiancow
Fine, Mach 9
Reasonable.
Old Jan 5, 2024, 12:06 am
  #200  
 
Join Date: Dec 2020
Programs: QF, CoUniHound Refugee
Posts: 377
Scraping is not a crime

As a matter of law, I suspect the airline is in the wrong here. It is not illegal for third parties to scrape websites. If that were the case, search engines like Google wouldn't exist. Now, if seats.aero used an API they are not authorized to use, then yes, maybe a case could be made there. But even then, I think the question would be how they got access to the API. Was it a case of AC initially providing access and then taking it away, resulting in seats.aero having to resort to scraping (legal IMHO)? Or did they get access through hacking internal systems or reverse engineering à la Wireshark (questionable legality)?

Regardless, at the end of the day the onus is on Air Canada to prove the law is on their side, and so far as I can tell from reading this thread and following the story, there has been no legal action in terms of discovery and evidence to prove anything is wrong here.

-RooFlyer88
Old Jan 5, 2024, 6:05 am
  #201  
 
Join Date: Jan 2017
Location: Halifax
Programs: AC SE100K, Marriott Lifetime Platinum Elite. NEXUS
Posts: 4,586
Originally Posted by kangarooflyer88
As a matter of law I suspect the airline is in the wrong here. It is not illegal for third parties to scrape websites.
No one is being charged with a crime.

The question at hand is whether the activity violates the terms of service, and whether terms of service are enforceable; it is a question of contract law.
Bohemian1 and YOWgary like this.
Old Jan 5, 2024, 9:27 am
  #202  
 
Join Date: Dec 2011
Location: YYZ
Programs: AC SEMM / HH Diamond
Posts: 3,203
Originally Posted by kangarooflyer88
As a matter of law, I suspect the airline is in the wrong here. It is not illegal for third parties to scrape websites. If that were the case, search engines like Google wouldn't exist. Now, if seats.aero used an API they are not authorized to use, then yes, maybe a case could be made there. But even then, I think the question would be how they got access to the API. Was it a case of AC initially providing access and then taking it away, resulting in seats.aero having to resort to scraping (legal IMHO)? Or did they get access through hacking internal systems or reverse engineering à la Wireshark (questionable legality)?
I think our difference of opinion is going to come down to a more nuanced conversation about what an API is.

I would have a lot more sympathy if I believed that seats.aero was only scraping the web site. Let's define what that means.

Fundamentally, a web site is designed for public access. Sure, at a very basic level a browser interacts with a web server via an API - but the key point is that the interaction follows well defined public standards, typically connecting via port 80 or 443, following well defined protocols for access.
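To make "well defined public standards" a bit more concrete, here is a minimal sketch in Python of what an HTTP/1.1 request message looks like on the wire: a request line, header fields, and a blank line, each terminated by CRLF. The host `www.example.com` and the function name `build_http_request` are placeholders made up for illustration; this is not anyone's actual client code.

```python
def build_http_request(method, host, target, headers=None):
    """Serialize a body-less HTTP/1.1 request message.

    Layout: request line, then header fields, then an empty line,
    with every line terminated by CRLF.
    """
    lines = [
        f"{method} {target} HTTP/1.1",  # request line: method SP target SP version
        f"Host: {host}",                # Host is mandatory in HTTP/1.1
    ]
    for name, value in (headers or {}).items():
        lines.append(f"{name}: {value}")
    # The empty line after the header fields marks the end of the header section.
    return ("\r\n".join(lines) + "\r\n\r\n").encode("ascii")


# Hypothetical usage: any client that emits these bytes over a TCP (or TLS)
# connection is speaking the same public protocol, whatever software it is.
request_bytes = build_http_request("GET", "www.example.com", "/", {"Accept": "text/html"})
```

Nothing in that message format identifies the client as a "browser"; that is the sense in which the protocol itself is public.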

At the risk of introducing more broken analogies, web access is akin to a developer building a concrete path through a park. If the path is connected to a public walkway, and there is no fence or gate blocking access, then it's clear & well understood that this path is designed for public access - even if the actual ground is privately owned.

The argument against web scraping would be akin to saying that "the public are allowed to walk on this path, but real estate agents who are going to take pictures are not allowed". To be clear, I disagree with that argument. I think a public path should be a public path, and the landowner does not get to pick & choose which subset of the public to allow - and the same goes for web access.

In other words, I do believe that actual web scraping should be (and is) legal. Let me try to be even clearer - the rules and protocols for web access do not (and should not) limit the client technology to any specific "browser". If I want to write some Java code to access a public web site over a well-known port using a standards-based protocol, then I believe that I, Google, and others (including seats.aero) are all well within our rights to do so.

The world changes when you stop using public APIs that follow well defined standards. Let's say that a web site includes some javascript which in turn invokes a proprietary API using a proprietary protocol ... that does not give me (or you) rights to invoke that same proprietary API. My rights go as far as the well-defined public standards go, and no further. To argue the contrary is a very slippery slope. I don't believe that the level of security that is enforced to access the proprietary API, has any bearing on the ethical question.
  • If I require an access token but it's clear that the access token is just a single byte - does that make it ethical for you to try all 256 options until you can access my API?
    • No, the fact that you *can*, does not make it right.
  • If I require a much longer & more complex access token, but you can find that token in my javascript, does that make it ok for you to use?
    • No, it does not make it ok.
The fact that you *can* do something is very very different than whether it's *ok* (legal, ethical, moral, etc) for you to do something. If I have a proprietary API, then the fact that you can figure out how to invoke that API does not make it ok.
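To put numbers on the token-length example, here is a quick sketch in Python. The function name `keyspace_size` is made up for illustration, and the 16-byte figure is just an arbitrary example of a "much longer" token, not anything known about a real system.

```python
def keyspace_size(token_bytes: int) -> int:
    """Number of distinct values a token of `token_bytes` bytes can take."""
    return 256 ** token_bytes  # each byte has 256 possible values


one_byte = keyspace_size(1)       # the "try all 256 options" case
sixteen_bytes = keyspace_size(16) # same as 2**128
```

The size of the keyspace measures how hard the token is to guess. It says nothing about whether you are authorized to use it, which is the point being made above.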

Now, I do not have access to (or detailed knowledge of) the seats.aero code. Perhaps they really did just do pure screen scraping, using nothing more than HTTP access over port 443, in a way that's compliant with RFC 9112. If so, then I agree the failure is on Air Canada's part.

However, based on several comments I've read above (including from the seats.aero author) - and very explicitly stated in Air Canada's claim (which you can choose to dispute, but this is what's claimed) - it seems very probable that seats.aero was *not* just "screen scraping", but was invoking AC's APIs directly. Heck, I understand: actual screen scraping is a very inefficient way to extract a lot of data. Unfortunately for seats.aero, invoking a proprietary API (that you do not have "the colour of right" to access) is illegal (in my non-lawyer's understanding of the law).

To me, this case is as simple as that.
  • If seats.aero can demonstrate that they only used public protocols & standards, then they should win.
  • If Air Canada can demonstrate that seats.aero directly accessed their proprietary APIs, then they should win.
I'm confident that there are logs which demonstrate which one is true.
Old Jan 5, 2024, 1:12 pm
  #203  
 
Join Date: Oct 2023
Posts: 8
Originally Posted by canopus27
I think our difference of opinion is going to come down to a more nuanced conversation about what an API is.

I would have a lot more sympathy if I believed that seats.aero was only scraping the web site. Let's define what that means.

Fundamentally, a web site is designed for public access. Sure, at a very basic level a browser interacts with a web server via an API - but the key point is that the interaction follows well defined public standards, typically connecting via port 80 or 443, following well defined protocols for access.

At the risk of introducing more broken analogies, web access is akin to a developer building a concrete path through a park. If the path is connected to a public walkway, and there is no fence or gate blocking access, then it's clear & well understood that this path is designed for public access - even if the actual ground is privately owned.

The argument against web scraping would be akin to saying that "the public are allowed to walk on this path, but real estate agents who are going to take pictures are not allowed". To be clear, I disagree with that argument. I think a public path should be a public path, and the landowner does not get to pick & choose which subset of the public to allow - and the same goes for web access.

In other words, I do believe that actual web scraping should be (and is) legal. Let me try to be even clearer - the rules and protocols for web access do not (and should not) limit the client technology to any specific "browser". If I want to write some Java code to access a public web site over a well-known port using a standards-based protocol, then I believe that I, Google, and others (including seats.aero) are all well within our rights to do so.

The world changes when you stop using public APIs that follow well defined standards. Let's say that a web site includes some javascript which in turn invokes a proprietary API using a proprietary protocol ... that does not give me (or you) rights to invoke that same proprietary API. My rights go as far as the well-defined public standards go, and no further. To argue the contrary is a very slippery slope. I don't believe that the level of security that is enforced to access the proprietary API, has any bearing on the ethical question.
  • If I require an access token but it's clear that the access token is just a single byte - does that make it ethical for you to try all 256 options until you can access my API?
    • No, the fact that you *can*, does not make it right.
  • If I require a much longer & more complex access token, but you can find that token in my javascript, does that make it ok for you to use?
    • No, it does not make it ok.
The fact that you *can* do something is very very different than whether it's *ok* (legal, ethical, moral, etc) for you to do something. If I have a proprietary API, then the fact that you can figure out how to invoke that API does not make it ok.

Now, I do not have access to (or detailed knowledge of) the seats.aero code. Perhaps they really did just do pure screen scraping, using nothing more than HTTP access over port 443, in a way that's compliant with RFC 9112. If so, then I agree the failure is on Air Canada's part.

However, based on several comments I've read above (including from the seats.aero author) - and very explicitly stated in Air Canada's claim (which you can choose to dispute, but this is what's claimed) - it seems very probable that seats.aero was *not* just "screen scraping", but was invoking AC's APIs directly. Heck, I understand: actual screen scraping is a very inefficient way to extract a lot of data. Unfortunately for seats.aero, invoking a proprietary API (that you do not have "the colour of right" to access) is illegal (in my non-lawyer's understanding of the law).

To me, this case is as simple as that.
  • If seats.aero can demonstrate that they only used public protocols & standards, then they should win.
  • If Air Canada can demonstrate that seats.aero directly accessed their proprietary APIs, then they should win.
I'm confident that there are logs which demonstrate which one is true.
I would argue against this.

Having built a similar tool myself in the last 6 months, I am very confident that seats.aero uses the exact same API endpoints that I was using to scrape the data.

You say that scraping should be and is legal, but that making requests to non-publicly-disclosed APIs using tokens you can find in JavaScript is not okay. The tokens in this case are all publicly available, so the structure of the request sent to the API endpoints is exactly the same whether you have your browser make it for you by visiting the web page, write some code to have a headless browser do it, or create a script that just makes the HTTP request for you.

There are, however, also 2 other tokens that are not "public":
1. Your identity token: Provided by AWS Cognito; you can generate an identity to sign all of your requests using publicly available tokens.
2. Akamai sensor data: This is the bot detection cookie token, provided by Akamai after you pass their "bot detection" challenge. It is only valid for a couple of requests before you have to refresh the token by providing additional sensor data.

Having had very little experience reverse engineering obfuscated code before, it took me about a week to reverse engineer Akamai Bot Manager to be able to generate sensor data and have Akamai give me a valid cookie. Now, as updates are pushed out, it's quite easy to keep this up to date.

This may sound "shady", but it really isn't. Your browser performs all of these things in the background and makes the exact same requests that I am making. At the core, there is nothing different between this kind of scraping and the one people are more familiar with, where you get an HTML page as a response and parse that. As the web has moved forward, dynamic web content and PWAs have made the old method of scraping obsolete on most websites, because to get the content you need to execute the JavaScript that the site gives you.

To me this is the equivalent of going to a self checkout vs standing in a long line at a grocery store.

If this was done via some non-public API that Air Canada forgot to block from public access and that is meant only for internal use, then I would agree that the scraping is unethical, but since this is the exact same thing the browser does, there's really nothing unethical about this.

I am glad the developer of seats.aero is a very good engineer and has the knowledge to figure out how to bypass the joke of a "security" layer that Air Canada has added. Maybe I'd have a different opinion if Air Canada offered a better service for searching or had reached out to seats.aero. They could realistically fix this issue by hiring better engineering talent, but with how little they pay it's no wonder that the only thing they've tried so far is integrating with an off the shelf bot detection system.
Old Jan 5, 2024, 1:23 pm
  #204  
 
Join Date: Dec 2011
Location: YYZ
Programs: AC SEMM / HH Diamond
Posts: 3,203
Originally Posted by yyz_egg
I would argue against this.

.....
OK. We agree to disagree.

Originally Posted by yyz_egg
I am glad the developer of seats.aero is a very good engineer and has the knowledge to figure out how to bypass the joke of a "security" layer that Air Canada has added.
I think this says it all about our different opinions. You appear to believe that a "security" layer is ok to bypass if it's poorly implemented. OK.

I believe that a poor implementation may not be good from a security perspective, but that does nothing to change how ethical it is to attempt to bypass it.
Old Jan 5, 2024, 1:27 pm
  #205  
A FlyerTalk Posting Legend
 
Join Date: Sep 2012
Location: SFO
Programs: AC SE MM, BA Gold, SQ Silver, Bonvoy Tit LTG, Hyatt Glob, HH Diamond
Posts: 44,583
Originally Posted by canopus27
OK. We agree to disagree.



I think this says it all about our different opinions. You appear to believe that a "security" layer is ok to bypass if it's poorly implemented. OK.

I believe that a poor implementation may not be good from a security perspective, but that does nothing to change how ethical it is to attempt to bypass it.
1. Do the ethics change based on what you're doing with it?
To give three real examples:
a. You're just doing it for your own personal searches
b. You're providing a free service to people
c. You're providing a paid service to people

2. Do the ethics change if it's actually causing real monetary damages to the target?

AC's complaint seems to be focused on 1c and 2.

It would also appear cowtool never hit 2 (based on the timing of seats.aero's Aeroplan launch and the massive uptick in searches), and was definitely 1b.

I think 1a is probably fine, since it would never be noticed by AC (I mean there are ways to detect it such as frequency of searches, but when you're doing a couple hundred total, they likely won't care).
Bohemian1 likes this.
Old Jan 5, 2024, 1:30 pm
  #206  
 
Join Date: Oct 2023
Posts: 8
Originally Posted by canopus27
OK. We agree to disagree.



I think this says it all about our different opinions. You appear to believe that a "security" layer is ok to bypass if it's poorly implemented. OK.

I believe that a poor implementation may not be good from a security perspective, but that does nothing to change how ethical it is to attempt to bypass it.
Can you explain what is different between making a direct request and getting data versus opening up a web browser and having the web browser make those requests for you automatically because they are triggered by some JavaScript code? The exact same thing happens. You can't be for scraping but be against this method; whether there's a browser sitting in the middle really makes zero difference.
Old Jan 5, 2024, 1:51 pm
  #207  
 
Join Date: Dec 2011
Location: YYZ
Programs: AC SEMM / HH Diamond
Posts: 3,203
Originally Posted by canadiancow
1. Do the ethics change based on what you're doing with it?
To give three real examples:
a. You're just doing it for your own personal searches
b. You're providing a free service to people
c. You're providing a paid service to people

2. Do the ethics change if it's actually causing real monetary damages to the target?

AC's complaint seems to be focused on 1c and 2.

It would also appear cowtool never hit 2 (based on the timing of seats.aero's Aeroplan launch and the massive uptick in searches), and was definitely 1b.

I think 1a is probably fine, since it would never be noticed by AC (I mean there are ways to detect it such as frequency of searches, but when you're doing a couple hundred total, they likely won't care).
I'm a computer geek, not an ethicist

But yes, I think intent does matter. To slip back to the somewhat broken speed limit example ... technically anyone who's speeding by even 1km/hour is breaking the law. I expect that the police & courts would generally not enforce speeding of that tiny amount, but if you wanted to argue that this proves that going 1km/hour over is therefore legal - I would disagree.

Similarly, if you're racing down the streets at 30km/hr over the limit, and the police chase you to a hospital whereupon they discover that your passenger is your pregnant wife who's giving birth, or your father who's had a heart attack ... in cases like that, I would expect there to be a lot more leniency about any speeding charges. But none of that makes it technically legal.

Intent does matter (IMHO)

Originally Posted by yyz_egg
Can you explain what is different between making a direct request and getting data versus opening up a web browser and having the web browser make those requests for you automatically because they are triggered by some JavaScript code? The exact same thing happens. You can't be for scraping but be against this method; whether there's a browser sitting in the middle really makes zero difference.
It's not so much that the browser is doing the same thing - it's that the code invoking the API may (or may not) be doing the same thing. As a trivial example, it could be feasible that the developers implemented some form of client side API throttling in the javascript code. Awful implementation, and more realistically the browser is likely to only invoke the API once per page refresh while you might be doing so hundreds of times a second in threads ... but that would be a difference between the browser invoking the API and your code invoking the API.
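For what it's worth, here is a minimal sketch of the kind of client-side throttling I have in mind, written in Python rather than JavaScript to keep it short. The class name `MinIntervalThrottle` and its parameters are invented for illustration, and I am assuming the simplest possible policy (at most one call per fixed interval); I have no knowledge of what AC's front end actually does.

```python
import time


class MinIntervalThrottle:
    """Allow at most one call per `min_interval` seconds (hypothetical policy)."""

    def __init__(self, min_interval, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = min_interval
        self._clock = clock      # injectable so the class can be tested without waiting
        self._sleep = sleep
        self._last_call = None   # time at which the previous call was allowed through

    def wait(self):
        """Block until the next call is allowed; return how long we waited."""
        now = self._clock()
        delay = 0.0
        if self._last_call is not None:
            elapsed = now - self._last_call
            delay = max(0.0, self.min_interval - elapsed)
            if delay > 0:
                self._sleep(delay)
        self._last_call = now + delay
        return delay
```

A well-behaved client would call `wait()` before every API request. The catch is that a limit enforced only in client code binds only the clients that choose to run that code, which is why a script calling the same endpoint directly may not be "doing the same thing" as the browser.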

I stand by my point that a poor implementation (of security, throttling, whatever) doesn't give you free rein to legitimately exploit that poor implementation.
Old Jan 5, 2024, 2:35 pm
  #208  
 
Join Date: Oct 2023
Posts: 8
Originally Posted by canopus27
I'm a computer geek, not an ethicist

But yes, I think intent does matter. To slip back to the somewhat broken speed limit example ... technically anyone who's speeding by even 1km/hour is breaking the law. I expect that the police & courts would generally not enforce speeding of that tiny amount, but if you wanted to argue that this proves that going 1km/hour over is therefore legal - I would disagree.

Similarly, if you're racing down the streets at 30km/hr over the limit, and the police chase you to a hospital whereupon they discover that your passenger is your pregnant wife who's giving birth, or your father who's had a heart attack ... in cases like that, I would expect there to be a lot more leniency about any speeding charges. But none of that makes it technically legal.

Intent does matter (IMHO)



It's not so much that the browser is doing the same thing - it's that the code invoking the API may (or may not) be doing the same thing. As a trivial example, it could be feasible that the developers implemented some form of client side API throttling in the javascript code. Awful implementation, and more realistically the browser is likely to only invoke the API once per page refresh while you might be doing so hundreds of times a second in threads ... but that would be a difference between the browser invoking the API and your code invoking the API.

I stand by my point that a poor implementation (of security, throttling, whatever) doesn't give you free rein to legitimately exploit that poor implementation.
I feel like you have just arbitrarily drawn lines for what is considered ethical or not. Why is the old school "traditional" scraping fine by your rules and not unethical? After all, the old school way also makes direct requests to some web server via some script. Again, it makes no difference if I write a script that makes 200k requests to scrape data or pay 1000 people to make 200 requests each.

I do agree that it's unethical to make so many requests in a short period of time that it brings down the service (DoS), but again I can almost guarantee you that in this case the requests from seats.aero were throttled, as bringing down the service is counterproductive to scraping. If Air Canada's search can't compete, someone will make a better service if available. I remember reading that he attempted to reach out to Air Canada and work with them but was ultimately ignored.
Old Jan 5, 2024, 3:05 pm
  #209  
Moderator, Air Canada; FlyerTalk Evangelist
Original Poster
 
Join Date: Feb 2015
Location: YYC
Programs: AC SE MM, FB Plat, WS Plat, BA Silver, DL GM, Marriott Plat, Hilton Gold, Accor Silver
Posts: 16,853
While I'm neither a lawyer nor a tech guy, I've found some of the recent debate on the technical side of this thing to be fascinating to read. Thanks to @RatherBeInYOW @canopus27 @canadiancow for some really detailed technical points and intriguing discussion (even though you don't all agree with each other).
canopus27, canadiancow and yulscs like this.
Old Jan 5, 2024, 3:12 pm
  #210  
A FlyerTalk Posting Legend
 
Join Date: Sep 2012
Location: SFO
Programs: AC SE MM, BA Gold, SQ Silver, Bonvoy Tit LTG, Hyatt Glob, HH Diamond
Posts: 44,583
Originally Posted by yyz_egg
I remember reading that he attempted to reach out to Air Canada and work with them but was ultimately ignored.
And AC claims this didn't happen. Or rather, they claim the reach out was shortly before the C&D, to some random person on LinkedIn.

I also claim that cowtool did reach out on multiple occasions and was ignored though. I know this to be true, but the rest of you should treat it with as much veracity as both seats.aero and AC's claims.

So whether or not seats.aero reached out, I have doubts AC would have engaged.