Good news, folks! Dropbox has fully restored access to its services after a system outage left the cloud company’s servers limping and lethargic for several days. The restoration itself took three full days, a delay attributed to the slow software tools required to retrieve data.
The Dropbox Post-Mortem
In a welcome sign of customer care, the world’s most popular backup and cloud storage company has also explained exactly what caused the outage, which had been the subject of false yet panic-inducing hacking claims from two separate groups.
In a post-mortem on Dropbox’s tech blog, Akhil Gupta, the head of infrastructure at Dropbox, wrote:
“On Friday at 5:30 PM PT, we had a planned maintenance scheduled to upgrade the OS on some of our machines. During this process, the upgrade script checks to make sure there is no active data on the machine before installing the new OS.” He went on: “A subtle bug in the script caused the command to reinstall a small number of active machines. Unfortunately, some master-replica pairs were impacted which resulted in the site going down.”
In short, a routine OS upgrade reinstalled a handful of machines that were still in use, and that took the whole site down. Luckily, it seems the affected machines were in charge of services such as photos, uploading and API tools, and did not store any user data or files.
Or at least, that’s what Dropbox is assuring us. Either way, no reports of data loss from Dropbox’s servers have come to light, so there’s no real reason not to take Gupta at his word. Of course, every machine, server and computer at Dropbox has a backup of a backup of a backup.
|Plan|Dropbox Plus|Dropbox Professional|Dropbox Business|
|Monthly|$9.99|$19.99|$15.00|
|Yearly|$119.00|$239.88|$180.00|
Learn, Forgive and Forget
Having learned its lesson and found a solution, Dropbox also assures us that it is taking extra measures to ensure this never happens again. Among the steps are distributed state verification and faster recovery. Machines will now have to verify their own state locally before accepting scripts, and Dropbox has built a specialized tool to significantly speed up the recovery of its MySQL backups. The company has also promised to release this tool as open-source software for others to enjoy and benefit from.
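To make the idea of local state verification more concrete, here is a minimal sketch in Python. It is not Dropbox's code: the `MachineState` fields, the `accept_reinstall` function and the host names are all hypothetical, chosen only to illustrate a machine refusing a destructive command when it knows it is still in use, instead of trusting the upgrade script's own check.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class MachineState:
    """Hypothetical snapshot of what a host knows about itself."""
    hostname: str
    has_active_data: bool   # assumed flag: live data is stored on this host
    serving_traffic: bool   # assumed flag: host is an active master or replica


class UnsafeCommandError(Exception):
    """Raised when a host refuses a destructive command."""


def accept_reinstall(state: MachineState) -> str:
    """Local check: the host itself decides whether a reinstall is safe."""
    if state.has_active_data or state.serving_traffic:
        raise UnsafeCommandError(
            f"{state.hostname}: refusing reinstall, host is still active"
        )
    return f"{state.hostname}: reinstall accepted"


def run_upgrade(fleet: List[MachineState]) -> Dict[str, str]:
    """Send the reinstall command to every host and record each answer."""
    results = {}
    for machine in fleet:
        try:
            results[machine.hostname] = accept_reinstall(machine)
        except UnsafeCommandError as err:
            # A buggy central script can no longer wipe an active machine,
            # because the final decision is made on the machine itself.
            results[machine.hostname] = str(err)
    return results


if __name__ == "__main__":
    fleet = [
        MachineState("idle-01", has_active_data=False, serving_traffic=False),
        MachineState("db-master-01", has_active_data=True, serving_traffic=True),
    ]
    for host, outcome in run_upgrade(fleet).items():
        print(outcome)
```

The point of the design is that the safety check lives in two places. Even if the script's own "no active data" check has a bug, as it did here, an active host would still turn the command down.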
In the end, Gupta signed off with an apology and thanked everyone for their patience during this trying period.
“We know you rely on Dropbox to get things done, and we’re very sorry for the disruption. We wanted to share these technical details to shed some light on what we’re doing in response. Thanks for your patience and support.”
Now that Dropbox’s new year fiasco is over, feel free to catch up on parts one and two of this saga, and please share your thoughts on the whole situation in the comments section below.