
28 August 2012 - United System Outage (Impact, Discussion, etc.)

Old Aug 29, 2012, 9:02 pm
  #451  
 
Join Date: Jan 2005
Location: DEN
Programs: UA Gold-MM, AA Gold-MM, F9-Silver, Hyatt Something, Marriott Gold, IHG Plat, Hilton Diamond
Posts: 6,393
Originally Posted by njcommodore
For a 2 hr delay? If they gave everyone (or even all elites) certs for a 2 hr delay the airline would be bankrupt.
That sounds like 10th grade economics.

MBA finance would say that giving elites a $200 cert might drive a lot of incremental business back to UA. Put some restrictions on it -- only good on G fares. Only good in January. Do whatever you want UA. But instead they do nothing.
hobo13 is offline  
Old Aug 29, 2012, 9:07 pm
  #452  
 
Join Date: Nov 2010
Location: DEN
Programs: 2012 Plat-2013 Plat-2014 Silver-2015 GM
Posts: 818
Originally Posted by ani90
I think there will be a lot of blame and mud-slinging amongst the two pre-merger camps. On my flight yesterday which was caught in the shut-down, the pilot made a point of telling us that the flight crew were legacy continental crew in a tone as if to absolve them (and other CO staff) of the chaotic situation.
Typical and not surprising.

Originally Posted by makhdoom
Working as a data network engineering manager for a firm with multiple data centers, I can attest to the fact that 'no single point-of-failure' network design is neither overly complex nor expensive (considering the outage costs, etc.).

I checked the UA website but did not find any current vacancies for network engineering/management type positions :-)
You won't find any until the pmCO IT folks are shown the door. This is not an IT Dept, I'm sure it's a club that covers each other's butts. The pmCO powers-that-be are probably deflecting blame to other than where it belongs.

I'm hopeful "someone" at UA knows the truth. We'll see.

Originally Posted by bocastephen
Just a FYI for the anti-SHARES hysteria crowd. I just received word via an internal communication that the cause of the outage wasn't SHARES, nor did it have anything to do with SHARES.

The cause was communication equipment in the data center that failed. This was a hardware problem, not software.

Now having said that, I am curious to know why there were no redundant systems in the data center or a quick (ie instant) method of re-routing network traffic via a different data center.
Was that communication equipment housed at Hewlett Packard?

Originally Posted by xzh445
You beat me to it. I was going to suggest that the host (SHARES) was not the issue. So to blame the PSS, as many articles and lots of posters here are doing, is a bit short-sighted.

My guess would be that it is related to a firewall or router. As far as redundancy, I would wager several years' salary that the redundant hardware is in place. I would bet that a failover test has probably even been run fairly recently. But, we know how that goes, especially in large enterprise configurations.......the primary fails, and HSRP (or failover) doesn't auto-magically kick in.....you have a national press event.


Now you are suggesting that Jeff Smisek makes network architecture decisions?
Again I ask, even if firewall or router, is that housed at Hewlett Packard?

Last edited by FlyinHawaiian; Aug 30, 2012 at 7:44 am Reason: merge
ibuyyoufly is offline  
Old Aug 29, 2012, 9:17 pm
  #453  
 
Join Date: Jun 2004
Location: What I write is my opinion alone..don't read into it anything not written.
Posts: 9,686
Originally Posted by hobo13
That sounds like 10th grade economics.

MBA finance would say that giving elites a $200 cert might drive a lot of incremental business back to UA. Put some restrictions on it -- only good on G fares. Only good in January. Do whatever you want UA. But instead they do nothing.
That sounds like 9th grade economics. Certs given to people who are most likely going to fly UA again are likely to just reduce that revenue. Certs given to people that don't fly UA...well that is incremental revenue. If I am going to buy a product, any discount is just that much less revenue. If I was not going to buy a product, then a discount might entice me to buy that product and contribute a little more.

Your restrictions do make your point better...a Sat night stay requirement, and 21 day advance purchase requirement...fare fences to make sure that it is used for personal incremental travel vs dilutionary ticketing on a ticket that was going to be bought anyway.
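To put rough numbers on the incremental vs dilutionary argument, here is a minimal sketch. The $200 cert comes from the quoted post; the $500 fare and the share of certs that drive a new trip are purely hypothetical, and the model assumes every cert is redeemed on exactly one ticket.

```python
def expected_net_revenue(p_incremental, fare, cert_value):
    """Expected revenue change per cert issued (hypothetical model).

    p_incremental: share of certs used on a trip that would NOT have
                   been bought otherwise (an assumption, not UA data).
    """
    # Incremental trip: the airline gains the fare minus the cert.
    incremental_gain = p_incremental * (fare - cert_value)
    # Dilutionary trip: the ticket was coming anyway, so the cert is lost revenue.
    dilution_loss = (1 - p_incremental) * cert_value
    return incremental_gain - dilution_loss


def break_even_share(fare, cert_value):
    """Share of incremental redemptions at which the cert pays for itself."""
    # Setting p * fare - cert_value = 0 gives p = cert_value / fare.
    return cert_value / fare


cert_value = 200.0  # from the quoted post
fare = 500.0        # hypothetical average fare

for share in (0.2, 0.4, 0.6):
    net = expected_net_revenue(share, fare, cert_value)
    print(f"incremental share {share:.0%}: net per cert ${net:,.2f}")

print(f"break-even share: {break_even_share(fare, cert_value):.0%}")
```

Under these assumptions the cert only pays off if the share of truly incremental trips exceeds cert value divided by fare, which is why fare fences that push redemptions toward incremental travel matter.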
fastair is online now  
Old Aug 29, 2012, 10:13 pm
  #454  
 
Join Date: Nov 2011
Programs: UA Premier Exec (err... Premier Platinum), SPG Platinum
Posts: 8
A bit of levity?

Alex: Hi, I'm Alex, your united.com guide. I've been designed to make your time on our website as fast and efficient as possible. Just type in a question, keyword or phrase below and I'll take you to the information you're looking for.

You Asked: What happened to your website yesterday?

Alex: I'm not sure I understand, please try rewording your question.

You Asked: What caused yesterday's system failure?

Alex: I am not positive that I understand what you are asking, please try rewording your question.

You Asked: I think it would be courteous if United posted an explanation and apology.

Alex: I'm not positive that I understand what you are asking. If you would like to reach us, please use the Contact Us link below.
Here's More:
Contact Us
NYCJenChannel9 is offline  
Old Aug 29, 2012, 10:29 pm
  #455  
FlyerTalk Evangelist
 
Join Date: May 2007
Location: Houston
Programs: UA Plat, Marriott Gold
Posts: 12,693
Originally Posted by XLR26
News reports say 580 flights were delayed. Ouch.
375 were delayed in the same time period the previous day, so it's only 200 additional delays. Although the average delay was worse, 75 min Tuesday vs 50 min Monday.
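Spelling out the arithmetic behind those figures, as a rough sketch: the four inputs are the numbers quoted in this post, and the total-minutes comparison assumes each average applies to every delayed flight that day.

```python
# Figures quoted in the post above.
delayed_tuesday = 580    # flights delayed on the day of the outage
delayed_monday = 375     # flights delayed in the same period the day before
avg_delay_tuesday = 75   # minutes
avg_delay_monday = 50    # minutes

# Delays beyond the previous day's baseline.
additional_delays = delayed_tuesday - delayed_monday

# Assumption: each average applies to every delayed flight on that day.
total_minutes_tuesday = delayed_tuesday * avg_delay_tuesday
total_minutes_monday = delayed_monday * avg_delay_monday
extra_minutes = total_minutes_tuesday - total_minutes_monday

print(f"additional delayed flights: {additional_delays}")
print(f"total delay minutes, Tuesday: {total_minutes_tuesday}")
print(f"total delay minutes, Monday: {total_minutes_monday}")
print(f"extra delay minutes: {extra_minutes}")
```

So the count of delayed flights rose by roughly half, but under that assumption the total delay minutes more than doubled.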

Originally Posted by mccullo3
When you have an airline having to call the FAA to ground stop planes heading for hubs, that is indicative of a bigger issue within the organization -- this being the IT department in this instance.
This is a huge non-sequitur. Every major airline asks for a ground stop for their operations at an airport on occasion.

Last edited by mduell; Aug 29, 2012 at 10:40 pm
mduell is offline  
Old Aug 29, 2012, 11:48 pm
  #456  
 
Join Date: Jun 2002
Location: MHT/BOS
Programs: AA EXP; UA 1P
Posts: 200
Originally Posted by ibuyyoufly
I haven't read further, but col puck is incorrect. The pmCO system is housed on the pmCO hardware. For the most part, that's all housed with the fine company of Hewlett Packard.
Thanks, good to know. Do you know what the deal is with this supposed more-advanced PMCO system?
rafatmit is offline  
Old Aug 30, 2012, 12:00 am
  #457  
 
Join Date: Sep 2010
Location: San Francisco Bay Area
Posts: 5,825
Originally Posted by Delta3MM
My experience yesterday.

Checked into my flight 24 hours in advance. I'm flying from CHS to IAD.

I check in in Charleston, and my flight is transiting on time. Lots of other flights are delayed due to weather, mostly Delta flights to/from Atlanta.

Our flight lands on time. We don't start boarding at the right time. No announcements. Eventually we're told that the flight crew is checking the airplane. They board other flights to Chicago.

As time passes for us to depart, they come back on and let us know there will be a 30 min delay because of an APU, which they assured us is just paperwork, that a mechanic just has to verify something and that's it. About 45 min past departure, they say if you have any questions, just approach the counter. After about 5 min of this, they announce that their servers are down. At one hour past departure, I thought I would call United and find out if they could protect me on a later flight. That's when I find out that all computers are down everywhere. They can't do anything over the phone. Also, even though I had internet access, I can't do anything there either.

So, our flight finally boarded 90 min late. If I could have reached United, I might have been able to reroute through Chicago or on another airline, because if I missed my connection in IAD, there were no more flights to LAS that night.

We landed and left the airplane 11 min before my LAS flight was scheduled to depart. Tried using the mobile app, but it was not working. The monitors showed the flight departing on time. I arrived at A2 and Departure was from D2 and the shuttle buses are not running, only method of going between terminals was to take the train.

I ran anyway - and got to the gate to find the airplane boarded but still there. They were there 14 min past departure. They let me on - they were unable to scan my boarding pass - just wrote down where I was seated, etc. About 5 min later, they closed the door and we departed.

Even more shocking was that my checked bag arrived in LAS.

It would not have been so bad if I had been traveling alone, but I was meeting my mother in IAD (who was connecting from another city) and flying with her to LAS.

So, even though I made it, and my bag made it - there was absolutely no useful information that I could get through the Web, agents or phone to tell me what was going on and to make backup plans. All I knew was my departing flight was leaving 90 min late, and my 100 min connection time in IAD dropped to 10 min. Easyupdate never gave me updates on late flights - the gate info available from the mobile app was wrong - had I not called my mother and gone to the gate she said the flight was departing from - I might have tried to go to the wrong gate and missed the connection.

There were long lines at the customer service lines in IAD.

This is just no way to run an airline.

Billy
This is similar to my experience yesterday, although mine did not end up as successful as yours...

I had some flexibility in my schedule and did not book my normal direct flight - goal to build some UA PQM's.

Flew BDL-IAD-MIA-IAD no problem. The MIA-IAD flight on time, no problems.

The end of my itinerary was IAD-ORF-ORD-SFO, with SFO as final destination 10:00 PM.

So, MIA-IAD went fine, arrived at my A gate for the IAD-ORF flight about 2:30. Initially, the Q200 had a mechanical issue so a mechanical delay. Couldn't fix it fast, and had another one standing by so they switched to the second aircraft. No ground power unit (?) available, so a further delay. And somewhere in here, the computers crashed.

I see my flight delaying further, and decide to bail on my original itinerary and see if I can just get on a direct IAD-SFO. Since the system was down, and neither CS nor the 1K line could do anything, I went back over to the C (or D?) gates to see if I could get on the 4:30 or so direct flight to SFO. The GA there was unequivocal in her denial: No way I can switch you on to this flight with our computer system down. Period.

I went back to the A gates - further delay on the ORF flight... Every 10 minutes or so I call the 1K line and get the same message: system is still down. I go to Customer Service desk in A again to see if they can reroute me. No dice.

Finally I decide to get on my IAD-ORF flight and take my chances...

In retrospect, I should have just stayed at IAD and pushed harder to get on one of the direct SFO flights, even if it took until the 10:00 PM departure...

Fly to ORF, 40 minute flight took 60 minutes since they navigated around a storm. Land and the airport is in groundstop due to massive lightning. Not sure how long this was, but at least 45 minutes.

During this time call the 1K line, systems are now back up. They rebook me on an immediate flight back to IAD, then IAD to SFO (the 10:00 PM).

When we finally deplaned an agent was waiting for me with my ORF-IAD and IAD-SFO BP's. She showed great concern and really was the kind of UA employee you like to meet. She even ran out in the rain to get my bag from the handlers for me! (They would not let us get them, even though we gate checked them, and were going to send them all to baggage claim due to the unstable weather.)

Walk off the IAD-ORF, over one gate and right back on to the ORF-IAD (my second there and back in one day, a record for me ) At least this flight was on a Q400!

We hit the runway and stop. For nearly an hour. Weather. Finally take off and then arrive at IAD at 10:01 PM, at the A gates. My direct SFO flight was at the D gates, scheduled to depart at 9:59. Oh well.

Go get in line at CS desk (my phone had run out of battery by this point). About 30 minutes later get to the front of the line and am provided with a voucher for the Sheraton Reston, and 2 $10 food certs. They would have given me taxi fare, but the Sheraton has a 24 hour shuttle so I did not need it.

Also, rebooked on the 8:15AM IAD to SFO in Y, so I immediately jump to the top of the upgrade list with 5 seats available.

5 or so hours of sleep, back to IAD on the 6:00 AM shuttle.

Quickly through security, use my second $10 cert for some breakfast, and head to D24. About boarding time, called up to the desk and handed my new F BP. Good, because I needed more sleep...

Uneventful flight, arrive SFO 40 minutes early and of course our gate is still full... 15 - 20 minutes later, at the gate and on my way home.

I picked a bad day for a Mileage Run! (It actually was not really an MR, since I needed to do the flight anyway, but I did add a lot of unnecessary legs. I am not really sure how many PQM's I will earn with the Y rebookings...)

Sorry for the long post...
LarkSFO is offline  
Old Aug 30, 2012, 2:56 am
  #458  
 
Join Date: Mar 2012
Posts: 24
Originally Posted by xzh445
O.K. the HSRP comment was a bit simplified. The points I was making were that I am certain that the design of the network provides for redundancy/failover, and the Host (SHARES) was not the cause of the outage. It was running fine..but couldn't talk to anyone.
Well - we know from multiple first-hand accounts that there was stale and missing data when the system finally came back up. That certainly suggests an application problem, rather than a hardware problem - unless you're saying there was a hardware/network problem that was so severe that they failed over to a redundant system with stale data just to get something back up...
mbeck69 is offline  
Old Aug 30, 2012, 3:28 am
  #459  
 
Join Date: Feb 2009
Location: The World.
Programs: UA MP/UC - "RIP Tulip Plat"
Posts: 1,225
Originally Posted by ani90
I think there will be a lot of blame and mud-slinging amongst the two pre-merger camps. On my flight yesterday which was caught in the shut-down, the pilot made a point of telling us that the flight crew were legacy continental crew in a tone as if to absolve them (and other CO staff) of the chaotic situation.
That is a fire-able offense now. As it should be.

Last edited by UAL4life; Aug 30, 2012 at 3:41 am
UAL4life is offline  
Old Aug 30, 2012, 3:37 am
  #460  
 
Join Date: Sep 2000
Location: Denver, CO
Programs: UA 1K 25 years/2MM, Honors LT Diamond, AVIS & Hertz Prez Club
Posts: 4,753
I'm surprised this hasn't been quoted up til now.

United’s operations are running normally today following yesterday’s network outage. The outage lasted approximately two hours, and as a result we experienced 580 delays and nine cancellations. (8 yesterday, 1 residual today). The outage was caused when a piece of communication equipment in one of our data centers failed and disabled communications with our airports and web site. We have fully redundant systems and we are working with the manufacturers to determine why the backup equipment did not work as it was supposed to.

Source: http://thebat-sf.com/2012/08/29/over...etwork-outage/

Sounds like a router or firewall device failed and the backup didn't work either.

The only comment I have here is that I would probably design entry points to the systems such that -

united.com
airport ops
reservations

all hit different points of entry with networked devices that back one another up in such a way that if one of these platforms has to be down, the others are not impacted.
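As a rough illustration of that separation (not United's actual architecture), here is a minimal sketch. The `Device` and `EntryPoint` classes and all device names are hypothetical; the only things taken from this thread are the three platforms listed above and the reported scenario of a primary failing without the backup taking over.

```python
from dataclasses import dataclass


@dataclass
class Device:
    """A hypothetical network device (router, firewall, etc.)."""
    name: str
    healthy: bool = True


@dataclass
class EntryPoint:
    """One point of entry with its own primary/backup pair (illustrative only)."""
    service: str
    primary: Device
    backup: Device

    def active_device(self):
        # Prefer the primary; use the backup only if the primary is unhealthy.
        for device in (self.primary, self.backup):
            if device.healthy:
                return device
        return None  # both devices failed: this entry point is down

    def is_up(self):
        return self.active_device() is not None


def build_entry_points(services):
    # Each service gets its own pair, so no device is shared between services.
    return {
        s: EntryPoint(s, Device(f"{s}-primary"), Device(f"{s}-backup"))
        for s in services
    }


entry_points = build_entry_points(["united.com", "airport ops", "reservations"])

# Simulate the reported scenario on one entry point only:
# the primary fails and the backup does not take over.
entry_points["united.com"].primary.healthy = False
entry_points["united.com"].backup.healthy = False

status = {name: ep.is_up() for name, ep in entry_points.items()}
print(status)
```

With separate pairs, a double failure on one entry point leaves the other two platforms reachable. Whether that isolation is affordable in practice depends on details this thread does not cover.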
SFO 1K is offline  
Old Aug 30, 2012, 5:38 am
  #461  
 
Join Date: Mar 2006
Location: SFO
Programs: DL DM/MM; UA Premier 1K; AA EXP; ICH Plat Ambassador
Posts: 1,565
Originally Posted by NYCJenChannel9
Alex: Hi, I'm Alex, your united.com guide. I've been designed to make your time on our website as fast and efficient as possible. Just type in a question, keyword or phrase below and I'll take you to the information you're looking for.

You Asked: What happened to your website yesterday?

Alex: I'm not sure I understand, please try rewording your question.

You Asked: What caused yesterday's system failure?

Alex: I am not positive that I understand what you are asking, please try rewording your question.

You Asked: I think it would be courteous if United posted an explanation and apology.

Alex: I'm not positive that I understand what you are asking. If you would like to reach us, please use the Contact Us link below.
Here's More:
Contact Us
That wasn't Alex answering. It was actually SMI/J!
mike_plat is offline  
Old Aug 30, 2012, 7:14 am
  #462  
 
Join Date: Apr 2012
Location: SFO; SJC
Programs: UA Silver; WN; Marriott; SPG; Hilton; IHG; National; TSA Pre; Clear
Posts: 199
UA says HW issue caused outage

http://www.mercurynews.com/rss/ci_21429831?source=rss

Failure of communications HW seems like a plausible explanation, since both ticketing and website were down.
rwmiller56 is offline  
Old Aug 30, 2012, 7:25 am
  #463  
 
Join Date: Jun 2006
Location: ORD
Programs: AA EXP, UA1K/2MM, Marriott Platinum Premier Lifetime
Posts: 357
Originally Posted by rwmiller56
http://www.mercurynews.com/rss/ci_21429831?source=rss

Failure of communications HW seems like a plausible explanation, since both ticketing and website were down.
This was reported yesterday. I call this pushing the blame onto anything but their own faults. I'm surprised they didn't blame it on their outsourcing partner EDS.
shortkidd is offline  
Old Aug 30, 2012, 9:20 am
  #464  
FlyerTalk Evangelist
 
Join Date: Aug 2002
Location: Bay Area, CA
Programs: UA Plat 2MM; AS MVP Gold 75K
Posts: 35,068
Originally Posted by rwmiller56
http://www.mercurynews.com/rss/ci_21429831?source=rss

Failure of communications HW seems like a plausible explanation, since both ticketing and website were down.
They're saying the redundant system failed to kick in. I didn't realize the SHARES mainframe could support a redundant interface. It's tough to do redundancy with only one (1) serial port.


Originally Posted by shortkidd
This was reported yesterday. I call this push the blame onto something else but their own faults. I'm surprised they didn't blame it on their outsourcing partner EDS.
They wouldn't dare blame it on EDS, because EDS is probably why they're using SHARES in the first place -- there's that Texas connection between EDS and CO after all.
channa is offline  
Old Aug 30, 2012, 9:21 am
  #465  
 
Join Date: Dec 2007
Location: Las Vegas
Programs: DL Platinum, AA Lifetime Gold, Hilton Diamond, Marriott Platinum, Radisson Premium
Posts: 6,638
iPhone app is not showing any activity post 3/3

iPhone app shows me losing nearly half my PQM and RDM. Website is normal

Last edited by iluv2fly; Aug 30, 2012 at 9:35 am Reason: merge
demkr is offline  

