Source: http://www.pcmag.com/article2/0,2817,2413716,00.asp
Alas, it appears the downtime was the result of human error. According to Amazon's post-mortem, a developer inadvertently deleted a portion of the ELB state data at 12:24 p.m. Pacific on Dec. 24.
"This data is used and maintained by the ELB control plane to manage the configuration of the ELB load balancers in the region (for example tracking all the backend hosts to which traffic should be routed by each load balancer)," Amazon said.
"Unfortunately, the developer did not realize the mistake at the time," Amazon continued. "After this data was deleted, the ELB control plane began experiencing high latency and error rates for API calls to manage ELB load balancers."
Since Amazon didn't realize that the data had been deleted, its team initially focused on the API errors. "The team was puzzled as many APIs were succeeding (customers were able to create and manage new load balancers but not manage existing load balancers) and others were failing," Amazon said.
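The split the team saw can be sketched with a couple of API calls. The example below uses the modern boto3 client for classic ELB purely for illustration (boto3 postdates the 2012 outage, and running it requires AWS credentials); the API actions shown are real, but the failure scenario is a paraphrase of Amazon's description, and the names and instance IDs are hypothetical.

```python
import boto3
from botocore.exceptions import ClientError

elb = boto3.client("elb", region_name="us-east-1")

# Creating a *new* load balancer reportedly still succeeded...
elb.create_load_balancer(
    LoadBalancerName="new-lb",
    Listeners=[{"Protocol": "HTTP", "LoadBalancerPort": 80,
                "InstanceProtocol": "HTTP", "InstancePort": 80}],
    AvailabilityZones=["us-east-1a"],
)

# ...while calls that managed *existing* load balancers hit high latency
# and error rates, because the state backing them had been deleted.
try:
    elb.register_instances_with_load_balancer(
        LoadBalancerName="existing-lb",
        Instances=[{"InstanceId": "i-0123456789abcdef0"}],  # hypothetical ID
    )
except ClientError as err:
    print("Managing an existing load balancer failed:", err)
```

That mixed picture, with some calls healthy and others degraded, is what pointed the team toward API troubleshooting rather than missing data.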
As a result, it took Amazon several hours to figure out that data had been deleted. When it did, around 5 p.m. Pacific, the team disabled several of the ELB control plane workflows and recovered some data. It tried to restart the system by bringing it back to the state it was in just before the deletion, but that "failed to provide a usable snapshot of the data."