Tim Owens

Post Mortem on BJM Server Outage

On Monday, June 14th at 2PM EST, an upgrade of the MySQL engine on server bjm.reclaimhosting.com (Shared Hosting) was initiated to move from MySQL 5.7 to MySQL 8. The upgrade had been tested on a development server without issue; however, on bjm it failed and, in the process, corrupted the InnoDB data. We attempted to repair the InnoDB data files through standard recovery processes, but all attempts were unsuccessful. At 8AM EST on June 15th, the decision was made to begin the disaster recovery process and restore all accounts from any available backups. A clean copy of the original database engine, MySQL 5.7, was reinstalled and restoration of all accounts was initiated. This restoration process completed at 8PM on June 15th.

Reclaim Hosting uses a backup tool called JetBackup and stores these backups on an offsite server. Unfortunately, while most accounts had been backing up regularly, some accounts had older backups than others because of their size and the time required to back them up. Every effort was made to recover as much data as possible, but there will be some instances involving newer accounts on this server where database information was lost, and those users will need to restore from their own local backups where available. Reclaim Hosting audits its backup system regularly, but given the size and scope of the data hosted on our systems, keeping every account's backup current is a moving target, and that gap resulted in a small percentage of data loss in this instance. We will continue to rethink our backup strategies, but we also want to stress that, as a best practice, customers should not rely solely on our backups and should keep their own copies of any mission-critical data.
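
For readers who keep their own backups, here is a minimal sketch of the kind of freshness check that surfaces accounts with older backups. It is an illustration only, not Reclaim Hosting's audit process and not the JetBackup API: the function name `stale_accounts`, the account names, the dates, and the one-day threshold are all assumptions made up for the example.

```python
from datetime import datetime, timedelta

def stale_accounts(last_backup, now, max_age=timedelta(days=1)):
    """Return accounts whose newest backup is missing or older than max_age.

    last_backup maps an account name to the datetime of its newest backup,
    or to None if no backup exists. Hypothetical helper for illustration.
    """
    stale = [
        account
        for account, taken_at in last_backup.items()
        if taken_at is None or now - taken_at > max_age
    ]
    # Accounts with no backup at all sort first, then oldest backup first.
    return sorted(stale, key=lambda a: last_backup[a] or datetime.min)

# Made-up sample data; the date is arbitrary.
now = datetime(2000, 1, 10, 14, 0)
sample = {
    "alice": now - timedelta(hours=3),  # recent backup
    "bob": now - timedelta(days=5),     # old backup
    "carol": None,                      # never backed up
}
flagged = stale_accounts(sample, now)
print(flagged)
```

Running a check like this on a schedule, against whatever record of backup times you have, makes a stale backup visible before it is needed.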

We can’t apologize enough for the trouble this has caused our customers. We are lucky that major outages are few and far between, but a major incident like this one is never a comfortable situation to be in, regardless of how many years of experience we have. If there is anything we can do to alleviate the situation for affected users, please reach out and let us know.
