28 August 2012 - United System Outage (Impact, Discussion, etc.)
#451
Join Date: Jan 2005
Location: DEN
Programs: UA Gold-MM, AA Gold-MM, F9-Silver, Hyatt Something, Marriott Gold, IHG Plat, Hilton Diamond
Posts: 6,315
MBA finance would say that giving elites a $200 cert might drive a lot of incremental business back to UA. Put some restrictions on it -- only good on G fares. Only good in January. Do whatever you want UA. But instead they do nothing.

#452
Join Date: Nov 2010
Location: DEN
Programs: 2012 Plat-2013 Plat-2014 Silver-2015 GM
Posts: 816
I think there will be a lot of blame and mud-slinging between the two pre-merger camps. On my flight yesterday, which was caught in the shut-down, the pilot made a point of telling us that the flight crew were legacy Continental crew, in a tone as if to absolve them (and other CO staff) of the chaotic situation.
Working as a data network engineering manager for a firm with multiple data centers, I can attest to the fact that 'no single point-of-failure' network design is neither overly complex nor expensive (considering the outage costs, etc.).
I checked the UA website but did not find any current vacancies for network engineering/management type positions :-)
I'm hopeful "someone" at UA knows the truth. We'll see.
Just a FYI for the anti-SHARES hysteria crowd. I just received word via an internal communication that the cause of the outage wasn't SHARES, nor did it have anything to do with SHARES.
The cause was communication equipment in the data center that failed. This was a hardware problem, not software.
Now, having said that, I am curious to know why there were no redundant systems in the data center or a quick (i.e., instant) method of re-routing network traffic via a different data center.
You beat me to it. I was going to suggest that the host (SHARES) was not the issue. So to blame the PSS, as many articles and lots of posters here are doing, is a bit short-sighted.
My guess would be that it is related to a firewall or router. As far as redundancy, I would wager several years' salary that the redundant hardware is in place. I would bet that a failover test has probably even been run fairly recently. But we know how that goes, especially in large enterprise configurations... the primary fails, HSRP (or whatever failover mechanism is in use) doesn't auto-magically kick in, and you have a national press event.
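To illustrate how a tested failover can still fail to kick in, here is a toy sketch in Python. It is purely an assumption for illustration, not a claim about what actually happened at UA: it models a standby that promotes itself only after it stops hearing hellos from the primary (the 3 s hello / 10 s hold values are HSRP's defaults). The function name `simulate` and the "half-dead primary" scenario are hypothetical.

```python
HELLO_INTERVAL = 3   # seconds between hellos (HSRP default)
HOLD_TIME = 10       # seconds of silence before standby takes over (HSRP default)


def simulate(primary_sends_hellos, primary_forwards, seconds=30):
    """Toy model, not real HSRP: return (standby_active, traffic_flowing)."""
    last_hello = 0
    standby_active = False
    for t in range(seconds + 1):
        # A primary that is alive enough to send hellos resets the timer.
        if primary_sends_hellos and t % HELLO_INTERVAL == 0:
            last_hello = t
        # The standby only promotes itself once the hold time has expired.
        if t - last_hello > HOLD_TIME:
            standby_active = True
    traffic_flowing = standby_active or primary_forwards
    return standby_active, traffic_flowing


if __name__ == "__main__":
    print("healthy primary:  ", simulate(True, True))
    print("primary fully dead:", simulate(False, False))
    print("half-dead primary:", simulate(True, False))
```

In this model a clean failure (the kind a failover test usually exercises) is handled, but a primary that keeps sending hellos while no longer passing traffic never triggers the standby.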
Now you are suggesting that Jeff Smisek makes network architecture decisions?

Last edited by FlyinHawaiian; Aug 30, 12 at 7:44 am Reason: merge

#453
Join Date: Jun 2004
Location: What I write is my opinion alone..don't read into it anything not written.
Posts: 9,674
That sounds like 10th grade economics.
MBA finance would say that giving elites a $200 cert might drive a lot of incremental business back to UA. Put some restrictions on it -- only good on G fares. Only good in January. Do whatever you want UA. But instead they do nothing.
Your restrictions do make your point better... a Saturday-night stay requirement and a 21-day advance purchase requirement... fare fences to make sure that it is used for incremental personal travel rather than dilutionary ticketing on a ticket that was going to be bought anyway.

#454
Join Date: Nov 2011
Programs: UA Premier Exec (err... Premier Platinum), SPG Platinum
Posts: 8
A bit of levity?
Alex: Hi, I'm Alex, your united.com guide. I've been designed to make your time on our website as fast and efficient as possible. Just type in a question, keyword or phrase below and I'll take you to the information you're looking for.
You Asked: What happened to your website yesterday?
Alex: I'm not sure I understand, please try rewording your question.
You Asked: What caused yesterday's system failure?
Alex: I am not positive that I understand what you are asking, please try rewording your question.
You Asked: I think it would be courteous if United posted an explanation and apology.
Alex: I'm not positive that I understand what you are asking. If you would like to reach us, please use the Contact Us link below.
Here's More:
Contact Us

#455
FlyerTalk Evangelist
Join Date: May 2007
Location: Houston
Programs: UA Plat, Marriott Gold
Posts: 12,128
375 were delayed in the same time period the previous day, so it's only 200 additional delays. Although the average delay was worse, 75 min Tuesday vs 50 min Monday.
This is a huge non-sequitur. Every major airline asks for a ground stop for their operations at an airport on occasion.
Last edited by mduell; Aug 29, 12 at 10:40 pm

#456
Join Date: Jun 2002
Location: MHT/BOS
Programs: AA EXP; UA 1P
Posts: 200
Thanks, good to know. Do you know what the deal is with this supposed more-advanced PMCO system?

#457
Join Date: Sep 2010
Location: San Francisco Bay Area
Posts: 5,825
My experience yesterday.
Checked into my flight 24 hours in advance. I'm flying from CHS to IAD.
I check in in Charleston, and my flight is transiting on time. Lots of other flights are delayed due to weather, mostly Delta flights to/from Atlanta.
Our flight lands on time. We don't start boarding at the right time. No announcements. Eventually we're told that the flight crew is checking the airplane. They board other flights to Chicago.
As our departure time passes, they come back on and let us know there will be a 30 min delay because of an APU, which they assure us is just paperwork: a mechanic just has to verify something and that's it. About 45 min past departure, they say if you have any questions, just approach the counter. After about 5 min of this, they announce that their servers are down. At one hour past departure, I thought I would call United and find out if they could protect me on a later flight. That's when I find out that all computers are down everywhere. They can't do anything over the phone. Also, even though I had internet access, I can't do anything there either.
So, our flight finally boarded 90 min late. If I could have reached United, I might have been able to reroute through Chicago or on another airline, because if I missed my connection in IAD, there were no more flights to LAS that night.
We landed and left the airplane 11 min before my LAS flight was scheduled to depart. Tried using the mobile app, but it was not working. The monitors showed the flight departing on time. I arrived at A2 and the departure was from D2, and the shuttle buses were not running; the only way to get between terminals was to take the train.
I ran anyway - and got to the gate to find the airplane boarded but still there. They were there 14 min past departure. They let me on - they were unable to scan my boarding pass - just wrote down where I was seated, etc. About 5 min later, they closed the door and we departed.
Even more shocking was that my checked bag arrived in LAS.
It would not have been so bad if I had been traveling alone, but I was meeting my mother in IAD (who was connecting from another city) and flying with her to LAS.
So, even though I made it, and my bag made it - there was absolutely no useful information that I could get through the Web, agents or phone to tell me what was going on and to make backup plans. All I knew was my departing flight was leaving 90 min late, and my 100 min connection time in IAD dropped to 10 min. Easyupdate never gave me updates on late flights - the gate info available from the mobile app was wrong - had I not called my mother and gone to the gate she said the flight was departing from - I might have tried to go to the wrong gate and missed the connection.
There were long lines at the customer service desks in IAD.
This is just no way to run an airline.
Billy
I had some flexibility in my schedule and did not book my normal direct flight - the goal was to build some UA PQMs.
Flew BDL-IAD-MIA-IAD no problem. The MIA-IAD flight on time, no problems.
The end of my itinerary was IAD-ORF-ORD-SFO, with SFO as final destination 10:00 PM.
So, MIA-IAD went fine, arrived at my A gate for the IAD-ORF flight about 2:30. Initially, the Q200 had a mechanical issue, so a mechanical delay. Couldn't fix it fast, and had another one standing by, so they switched to the second aircraft. No ground power unit (?) available, so a further delay. And somewhere in here, the computers crashed.
I see my flight delaying further, and decide to bail on my original itinerary and see if I can just get on a direct IAD-SFO. Since the system was down, and neither CS nor the 1K line could do anything, I went back over to the C (or D?) gates to see if I could get on the 4:30 or so direct flight to SFO. The GA there was unequivocal in her denial: No way I can switch you on to this flight with our computer system down. Period.
I went back to the A gates - further delay on the ORF flight... Every 10 minutes or so I call the 1K line and get the same message: system is still down. I go to Customer Service desk in A again to see if they can reroute me. No dice.
Finally I decide to get on my IAD-ORF flight and take my chances...
In retrospect, I should have just stayed at IAD and pushed harder to get on one of the direct SFO flights, even if it took until the 10:00 PM departure...
Fly to ORF; the 40 minute flight took 60 minutes since they navigated around a storm. Land, and the airport is in a ground stop due to massive lightning. Not sure how long this was, but at least 45 minutes.
During this time call the 1K line, systems are now back up. They rebook me on an immediate flight back to IAD, then IAD to SFO (the 10:00 PM).
When we finally deplaned an agent was waiting for me with my ORF-IAD and IAD-SFO BP's. She showed great concern and really was the kind of UA employee you like to meet. She even ran out in the rain to get my bag from the handlers for me! (They would not let us get them, even though we gate checked them, and were going to send them all to baggage claim due to the unstable weather.)
Walk off the IAD-ORF, over one gate and right back on to the ORF-IAD (my second there and back in one day, a record for me).
We hit the runway and stop. For nearly an hour. Weather. Finally take off and then arrive at IAD at 10:01 PM, at the A gates. My direct SFO flight was at the D gates, scheduled to depart at 9:59. Oh well.
Go get in line at CS desk (my phone had run out of battery by this point). About 30 minutes later get to the front of the line and am provided with a voucher for the Sheraton Reston, and 2 $10 food certs. They would have given me taxi fare, but the Sheraton has a 24 hour shuttle so I did not need it.
Also, rebooked on the 8:15AM IAD to SFO in Y, so I immediately jump to the top of the upgrade list with 5 seats available.
5 or so hours of sleep, back to IAD on the 6:00 AM shuttle.
Quickly through security, use my second $10 cert for some breakfast, and head to D24. About boarding time, called up to the desk and handed my new F BP. Good, because I needed more sleep...
Uneventful flight, arrive SFO 40 minutes early and of course our gate is still full... 15 - 20 minutes later, at the gate and on my way home.
I picked a bad day for a Mileage Run! (It actually was not really an MR, since I needed to do the flight anyway, but I did add a lot of unnecessary legs. I am not really sure how many PQMs I will earn with the Y rebookings...)
Sorry for the long post...


#458
Join Date: Mar 2012
Posts: 24
Well - we know from multiple first-hand accounts that there was stale and missing data when the system finally came back up. That certainly suggests an application problem, rather than a hardware problem - unless you're saying there was a hardware/network problem that was so severe that they failed over to a redundant system with stale data just to get something back up...

#459
Join Date: Feb 2009
Location: The World.
Programs: UA MP/UC - "RIP Tulip Plat"
Posts: 1,221
I think there will be a lot of blame and mud-slinging between the two pre-merger camps. On my flight yesterday, which was caught in the shut-down, the pilot made a point of telling us that the flight crew were legacy Continental crew, in a tone as if to absolve them (and other CO staff) of the chaotic situation.
Last edited by UAL4life; Aug 30, 12 at 3:41 am

#460
Join Date: Sep 2000
Location: Denver, CO
Programs: UA 1K 23 years/2MM, Honors LT Diamond, AVIS & Hertz Prez Club
Posts: 4,612
I'm surprised this hasn't been quoted up til now.
United’s operations are running normally today following yesterday’s network outage. The outage lasted approximately two hours, and as a result we experienced 580 delays and nine cancellations. (8 yesterday, 1 residual today). The outage was caused when a piece of communication equipment in one of our data centers failed and disabled communications with our airports and web site. We have fully redundant systems and we are working with the manufacturers to determine why the backup equipment did not work as it was supposed to.
Source: http://thebat-sf.com/2012/08/29/over...etwork-outage/
Sounds like a router or firewall device failed and the backup didn't work either.
The only comment I have here is that I would probably design entry points to the systems such that -
united.com
airport ops
reservations
all hit different points of entry with networked devices that back one another up in such a way that if one of these platforms has to be down, the others are not impacted.
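A rough sketch of what I mean, in Python. Everything here is hypothetical (the gateway names `gw-web`, `gw-ops`, `gw-res` and the pairing of who backs up whom are my own assumptions, not anything UA has described): each service gets its own preferred point of entry, with a neighbour's device as standby.

```python
from dataclasses import dataclass


@dataclass
class Gateway:
    name: str
    healthy: bool = True


@dataclass
class EntryPoint:
    service: str
    gateways: list  # Gateway objects, in order of preference

    def route(self):
        """Return the name of the first healthy gateway, or None if all are down."""
        for gw in self.gateways:
            if gw.healthy:
                return gw.name
        return None


def build_entry_points():
    # Hypothetical devices: one primary per service, each backed up by a neighbour.
    web, ops, res = Gateway("gw-web"), Gateway("gw-ops"), Gateway("gw-res")
    gateways = {g.name: g for g in (web, ops, res)}
    entry_points = {
        "united.com": EntryPoint("united.com", [web, ops]),
        "airport ops": EntryPoint("airport ops", [ops, res]),
        "reservations": EntryPoint("reservations", [res, web]),
    }
    return gateways, entry_points


def routes(entry_points):
    return {name: ep.route() for name, ep in entry_points.items()}


if __name__ == "__main__":
    gateways, entry_points = build_entry_points()
    gateways["gw-web"].healthy = False  # simulate one device failing
    print(routes(entry_points))
```

With `gw-web` down, united.com falls back to `gw-ops` while airport ops and reservations stay on their own primaries, so one failed box does not take everything out at once.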

#461
Join Date: Mar 2006
Location: SFO
Programs: DL DM/MM; UA Premier 1K; AA EXP; ICH Plat Ambassador
Posts: 1,565
Alex: Hi, I'm Alex, your united.com guide. I've been designed to make your time on our website as fast and efficient as possible. Just type in a question, keyword or phrase below and I'll take you to the information you're looking for.
You Asked: What happened to your website yesterday?
Alex: I'm not sure I understand, please try rewording your question.
You Asked: What caused yesterday's system failure?
Alex: I am not positive that I understand what you are asking, please try rewording your question.
You Asked: I think it would be courteous if United posted an explanation and apology.
Alex: I'm not positive that I understand what you are asking. If you would like to reach us, please use the Contact Us link below.
Here's More:
Contact Us

#462
Join Date: Apr 2012
Location: SFO; SJC
Programs: UA Silver; WN; Marriott; SPG; Hilton; IHG; National; TSA Pre; Clear
Posts: 199
UA says HW issue caused outage
http://www.mercurynews.com/rss/ci_21429831?source=rss
Failure of communications HW seems like a plausible explanation, since both ticketing and website were down.

#463
Join Date: Jun 2006
Location: ORD
Programs: AA EXP, UA1K/2MM, Marriott Platinum Premier Lifetime
Posts: 357
http://www.mercurynews.com/rss/ci_21429831?source=rss
Failure of communications HW seems like a plausible explanation, since both ticketing and website were down.

#464
FlyerTalk Evangelist
Join Date: Aug 2002
Location: Bay Area, CA
Programs: UA Plat 2MM; AS MVP Gold 75K
Posts: 35,062
http://www.mercurynews.com/rss/ci_21429831?source=rss
Failure of communications HW seems like a plausible explanation, since both ticketing and website were down.

They wouldn't dare blame it on EDS, because EDS is probably why they're using SHARES in the first place -- there's that Texas connection between EDS and CO after all.

#465
Join Date: Dec 2007
Location: Las Vegas
Programs: DL Diamond, UA Platinum, AA Lifetime Gold, Hilton Diamond, Marriott Titanium, Radisson Gold
Posts: 6,561
iPhone app is not showing any activity post 3/3
iPhone app shows me losing nearly half my PQM and RDM. Website is normal.
Last edited by iluv2fly; Aug 30, 12 at 9:35 am Reason: merge
