Aug. 4th 2014 | by Stefan Schuster
As some of you have probably noticed, Mind42 wasn't available this weekend. This was caused by the server move and database migration announced last week. Mind42 is now back online, and it's time for a summary of the events and the problems we faced.
There were two reasons for this maintenance downtime. One was that we moved Mind42 to new servers. The other was that we tried to migrate Mind42 to a new database (as we did last February). Because Mind42 has now been online for more than 5 years, moving or migrating it unfortunately always takes a long time, given the sheer amount of data created since its launch.
In short: the move worked without any problems, but the migration was aborted (again). The idea behind migrating to a new database system is to make Mind42 more reliable. The new database would be a cluster of 3 servers, so if one fails, Mind42 could still operate - something we can't do with the system we are currently running. Unfortunately, we simply can't test the new system under real conditions. That is what caused the abort the last time we tried this, in February. We learned from the problems back then and were better prepared this time. But another problem hit us. It all started with a power outage in one of our hosting provider's data centers, immediately after the data migration had finished. Some data on disk got corrupted. We tried to repair the migrated data. Since this took too long, we started the migration process again, but the script aborted in the middle of the night. So this morning we still weren't ready to switch to the new database, and we aborted the process. Mind42 is now back online with the old setup, on new servers.
We will of course analyze why the power outage, and the resulting reboot of one of the servers, corrupted the data. The whole idea behind the cluster is to be able to deal with such situations. One lesson we have learned, though, is to avoid such mega-migrations. Such long downtimes are not desirable, and too much can go wrong - like an unforeseeable power outage and a failing UPS at our hosting provider. We will reconsider our plans for making Mind42 more reliable.