Originally Posted by
nevansm
Agreed that there are some legacy issues with a PSS as a whole, but that's only by their own doing: not logically moving towards n-tier, too much reliance on legacy platforms, and an unwillingness to architect a better solution.
Let's get this straight... cloud services like FB, Twitter, and Instagram are way more transactional than airlines. Yet their outages are rarer and less impactful because they've architected or re-architected their apps in better ways.
Airlines (and other large orgs to some extent) have resisted this change because they don't think it's worth the investment in moving away from their legacy systems. It won't happen overnight, but it truly has to eventually. SABRE and Travelport/Worldspan already offer cloud versions of their apps that have a lot of these benefits. So it's not like it's impossible.
It's simply an issue of spending money.
SABRE is one of the biggest mainframe shops in TX...
Facebook and Twitter can architect their applications in different ways because their business demands are different. Facebook is fine with you not seeing your buddy's update for a few minutes because global DNS load balancing sent you to a different DC and their change hasn't replicated there yet. Eventual consistency is fine.
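To make "eventual consistency is fine" concrete, here's a toy sketch of two datacenters with asynchronous replication. Everything in it (the class names, "dc1"/"dc2", the explicit replicate() step) is made up for illustration and isn't how any real social network is built; it just shows a read at the other DC coming back stale until replication catches up.

```python
class Replica:
    """One datacenter's local copy of the feed."""
    def __init__(self):
        self.posts = []


class EventuallyConsistentFeed:
    """Hypothetical two-DC feed: writes land locally, replicate later."""
    def __init__(self):
        self.replicas = {"dc1": Replica(), "dc2": Replica()}
        self.pending = []  # (target_dc, post) pairs not yet replicated

    def write(self, dc, post):
        # The write is acknowledged as soon as the local DC has it.
        self.replicas[dc].posts.append(post)
        for other in self.replicas:
            if other != dc:
                self.pending.append((other, post))

    def read(self, dc):
        # Reads only see what has reached this DC so far.
        return list(self.replicas[dc].posts)

    def replicate(self):
        # Stand-in for the asynchronous replication that runs in the background.
        for dc, post in self.pending:
            self.replicas[dc].posts.append(post)
        self.pending.clear()


feed = EventuallyConsistentFeed()
feed.write("dc1", "new photo")
stale = feed.read("dc2")   # buddy routed to dc2 doesn't see it yet
feed.replicate()
fresh = feed.read("dc2")   # now it shows up
```

For a status update, that stale read costs nothing. The same window on a seat map is the problem.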
Some business requirements are aligned with cloud native application design, and some aren't.
Banking and airline applications are sequential transactions that require immediate consistency. Immediate consistency requires a single source of truth at the data persistence layer. One DB that you read and write from. Milliseconds matter. If you use the kind of data persistence layer that lets a cloud native application survive an entire DC going down, you aren't recording sequential transactions. There is a single ticket in K bucket; I can't sell it to two people because they connected to different data centers. There is one seat 16A; I can't assign it to two people because they connected to different data centers.
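A rough sketch of the seat 16A problem, using SQLite from the Python standard library as a stand-in for the system of record. The table layout, the flight number "DL100", and the passenger names are all invented for the example; a real PSS looks nothing like this. The point is only that one store with a uniqueness constraint rejects the second sale, while two stores that don't coordinate will each happily accept one.

```python
import sqlite3


def make_store():
    # One in-memory database standing in for a single data persistence layer.
    db = sqlite3.connect(":memory:")
    db.execute(
        "CREATE TABLE seat_assignment ("
        " flight TEXT, seat TEXT, passenger TEXT,"
        " PRIMARY KEY (flight, seat))"
    )
    return db


def assign_seat(db, flight, seat, passenger):
    """Return True if the seat was assigned, False if it was already taken."""
    try:
        with db:  # commits on success, rolls back on error
            db.execute(
                "INSERT INTO seat_assignment VALUES (?, ?, ?)",
                (flight, seat, passenger),
            )
        return True
    except sqlite3.IntegrityError:
        # The primary key on (flight, seat) blocks a second assignment.
        return False


# Single source of truth: both requests hit the same store.
single = make_store()
first = assign_seat(single, "DL100", "16A", "alice")
second = assign_seat(single, "DL100", "16A", "bob")

# Two datacenters, each with its own store and no synchronous coordination.
dc_east, dc_west = make_store(), make_store()
east = assign_seat(dc_east, "DL100", "16A", "alice")
west = assign_seat(dc_west, "DL100", "16A", "bob")
```

With the single store, `first` succeeds and `second` is refused. With the two independent stores, both `east` and `west` succeed, which is exactly the double-sold seat.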
You can replicate the data persistence layer, but it is going to take time to fail over. I'm sure DL replicates the SAN under their mainframes and x86 servers, but it takes time to fail over. You have to make the decision to fail over or wait it out based on how long you think the outage will last.
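The fail-over-or-wait call boils down to comparing two estimates. Here's a deliberately simplified sketch; the function name, the parameters, and the numbers in the usage lines are all hypothetical, and a real decision would also weigh things like data loss risk and how confident anyone is in the outage estimate.

```python
def should_fail_over(expected_outage_min, failover_min, failback_min=0.0):
    """Fail over only if waiting is expected to cost more downtime than switching.

    expected_outage_min: best guess at how long the primary stays down
    failover_min: time needed to bring the replica site into service
    failback_min: extra disruption later when moving back to the primary
    """
    return expected_outage_min > failover_min + failback_min


# Hypothetical numbers, in minutes.
long_outage = should_fail_over(expected_outage_min=240, failover_min=45)
short_outage = should_fail_over(expected_outage_min=20, failover_min=45)
```

With these made-up inputs, a four-hour outage justifies the switch and a twenty-minute one doesn't. The hard part in practice is that nobody knows the first number when they have to decide.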