Migrating ds106 to the Reclaim Cloud

If the migration of bavatuesdays was a relatively simple move to Reclaim Cloud, doing the same for ds106 was anything but. Five days after starting the move I finally was successful, but not before a visceral sense of anguish consumed my entire week. Obsession is not healthy, and at least half the pain was my own damn fault. If I would have taken the time to read Mika Epstein’s 2012 meticulous post about moving a pre-3.5 version of WordPress Multisite from blogs.dir to uploads/sites in its entirety, none of this would have ever happened.

I started the migration on Tuesday of last week, and I got everything over pretty cleanly on the first try. At first glance everything was pretty much working so I was thrilled. I was even confident enough to point DNS away from the low-tenant shared hosting server it had been residing on.*

The question might be asked, why move the ds106 sites to Reclaim Cloud at all?  First off, I thought it would be a good test for seeing how the new environment handles a WordPress Cluster that is running multisite with subdomains. What’s more, I was interested in finding out during our Reclaim Cloud beta exactly how many resources are consumed and how often the site needs to scale to meet resource demands. Not only to do a little stress-testing on our one-click WordPress Cluster, but also try and get insight into costs and pricing. All that said, Tim did warn me that I was diving into the deep end of the cloud given the number of moving parts ds106 has, but when have I ever listened to reason?

Like I said, everything seemed smooth at first. All pages and images on ds106.us were loading as expected, I was just having issues getting local images to load on subdomain sites like http://assignments.ds106.us or http://tdc.ds106.us. I figured this would be an easy fix, and started playing with the NGINX configuration given from experience I knew this was most likely a WordPress Multisite re-direct issue. WordPress Multisite was merged into WordPress core in version 3.0, when this happened older WordPress Multi-user instances (like ds106) were working off legacy code, one of the biggest differences is where images were uploaded and how they were masked in the URL. In WPMU images for sub sites were uploaded to wp-content/blogs.dir/siteID/files, and using .htaccess rules were re-written to show the URL as http://ds106.us/files/image1.jpg. After WordPress 3.0 was released, all new WordPress Multisite instances (no longer was it called multi-user) would be uploaded to wp-content/uploads/sites/siteID, and they they no longer mask, effectively including the entire URL, namely http://ds106.us/wp-content/uploads/sites/siteID/image1.jpg.

So, that’s a little history to explain why I assumed it was an issue with the .htaccess rules masking the subdomain URLs. In fact, in the end I was right about that part at least. But given ds106.us was moving from an apache server-based stack to one running NGINX, I made another assumption that the issue was with the NGINX redirects—and that’s where I was wrong and lost a ton of time. On the bright side, I learned more than a little about the nginx.conf file, and let me take a moment to document some of that below for ds106 infrastructure posterity. So, the .htaccess file is what Apache uses to control re-directs, and the those look something like this for a WordPress Multisite instance before 3.4.2:

# BEGIN WordPress
RewriteEngine On
RewriteBase /
RewriteRule ^index\.php$ - [L]

# uploaded files
RewriteRule ^files/(.+) wp-includes/ms-files.php?file=$1 [L]

RewriteCond %{REQUEST_FILENAME} -f [OR]
RewriteCond %{REQUEST_FILENAME} -d
RewriteRule ^ - [L]
RewriteRule . index.php [L]
# END WordPress

In WordPress 3.5 the ms-files.php function was deprecated, and this was my entire problem, or so I believe. Here is a copy of the .htaccess file for WordPress Multisite after version 3.5:

RewriteEngine On
RewriteBase /
RewriteRule ^index\.php$ - [L]

# add a trailing slash to /wp-admin
RewriteRule ^wp-admin$ wp-admin/ [R=301,L]

RewriteCond %{REQUEST_FILENAME} -f [OR]
RewriteCond %{REQUEST_FILENAME} -d
RewriteRule ^ - [L]
RewriteRule ^(wp-(content|admin|includes).*) $1 [L]
RewriteRule ^(.*\.php)$ $1 [L]
RewriteRule . index.php [L]

No reference to ms-files.php at all. But (here is where I got confused cause I do not have the same comfort level with nginx.conf as I do .htaccess) in the nginx.conf file on the Reclaim Cloud server there is a separate subdom.conf file that deals with these re-directs like so:

    #WPMU Files
        location ~ ^/files/(.*)$ {
                try_files /wp-content/blogs.dir/$blogid/$uri /wp-includes/ms-files.php?file=$1 ;
                access_log off; log_not_found off;      expires max;
        }

    #WPMU x-sendfile to avoid php readfile()
    location ^~ /blogs.dir {
        internal;
        alias /var/www/example.com/htdocs/wp-content/blogs.dir;
        access_log off;     log_not_found off;      expires max;
    }

    #add some rules for static content expiry-headers here
}

[See more on nginx.conf files for WordPress here).]

Notice the reference to WPMU in the comments, not WPMS. But I checked the ds106.us instance on the apache server it was being migrated from and this line existed:

RewriteRule ^files/(.+) wp-includes/ms-files.php?file=$1 [L]

So ds106 was still trying to use ms-files.php even though it was deprecated long ago. While this is very much a legacy issue that comes with having a relatively complex site online for over 10 years, I’m still stumped as to why the domain masking and redirects for images on the subdomain sites worked cleanly on the Apache server but broke on the NGINX server (any insight there would be greatly appreciated). Regardless, they did and everything I tried to do to fix it (and I tried pretty much everything) was to no avail.

I hit this post on Stack Exchange that was exactly my problem fairly early on in my searches, but avoided doing it right away given I figured moving all uploads for subdomain  sites out of blog.dir into uploads/sites would be a last resort. But alas 3 days and 4 separate migrations of ds106 later—I finally capitulated and realized that Mika Epstein’s brilliant guide was the only solution I could find to get this site moved and working. On the bright side, this change should help future-proof ds106.us for the next 10 years 🙂

I really don’t have much to add to Mika’s post, but I will make note of some of the specific settings and commands I used along the way as a reminder when in another 10 years I forget I even did this.

I’ll use Martha Burtis‘s May 2011 ds106 course (SiteID 3) as an example of a subdomain migrated to capture the commands.

The following command moves the files for site with ID 3 (may11.ds106.us) into its new location at uploads/sites/3

mv ~/wp-content/blogs.dir/3 ~/wp-content/uploads/sites/

This command takes all the year and month-based files in 3/files/* and moves them up one level, effectively getting rid of the files directory level:

mv ~/wp-content/uploads/sites/3/files/* ~/wp-content/uploads/sites/3

At this point we use the WP-CLI tool do a find and replace of the database for all URLs referring to may11.ds106.us/files and replace them with may11.ds106.us/wp-content/uploads/sites/3:

wp --network --allow-root search-replace 'may11.ds106.us/files' 'may11.ds106.us/wp-content/uploads/sites/3'

The you do this 8 or 9 more times for each subdomain, this would obviously be very , very painful and need to be scripted for a much bigger site with many 10s, 100s or 1000s of sub sites.†

To move over all the files and the database I had to run two commands. The first was to sync files with the new server:

rsync -avz root@ds106.oldserver.com:/home/ds106/public_html/ /data/ROOT/

Rsync is is the best command ever and moves GBs and GBS of data in minutes.

The second command was importing the database, which is 1.5 GBs! I exported the database locally, then zipped it up and uploaded it to the database cluster container and then unzipped it and ran the database import tool, which takes a bit of time:

mysql -u user_name -p database_name < SQL_file_to_import

After that, I had to turn off ms_files_rewriting, the culprit behind all my issues. That command was provided in Mika’s post linked to above:

INSERT INTO `my_database`.`wp_sitemeta` (`meta_id`, `site_id`, `meta_key`, `meta_value`) VALUES (NULL, '1', 'ms_files_rewriting', '0');

You also need to add the following line to wp-config.php:

define( 'UPLOADBLOGSDIR', 'wp-content/uploads/sites' );

The only other thing I did for safe-keeping was create a quick plugin function based on Mika’s stupid_ms_files_rewriting to force the re-writing for any stragglers to the new URL:

function stupid_ms_files_rewriting() {
$url = '/wp-content/uploads/sites/' . get_current_blog_id();
define( 'BLOGUPLOADDIR', $url );
}
add_action('init','stupid_ms_files_rewriting');

I put that in mu-plugins, and the migrated ds106.us multisite install worked! There was some elation and relief this past Saturday when it finally worked. I was struggle-bussing all week as a result of this failed migration, but I am happy to say the Reclaim Cloud environment was not the issue, rather legacy WordPress file re-writes seemed to be the root cause of my problems.

I did have to also update some hardcoded image URLs in the assignment bank theme , but that was easy. The only thing left to do now is fix the ds106 MediaWIki instance and write that to HTML so I can preserve some of the early syllabi and other assorted resources. It was a bit of a beast, but I am very happy to report that ds106 is now on the Reclaim Cloud and receiving all the resources it deserves on-demand 🙂


*VIP1 was the most recent in a series of temporary homes given how resource intensive the site can be given the syndication hub it has become.

†I did all these changes on the Apache live site before moving them over (take a database back-up if you are living on the edge like me), and then used the following tool to link all the

bava in the cloud with clusters

http://reclaim.cloud

Last weekend I took a small step for the bava, but potentially a huge step for Reclaim Hosting. This modest blog was migrated (once again!) into a containerized stack in the cloud in ways we could only dream about 7 years ago. There is more to say about this, but I’m not sure now is the right time given there is terror in the streets of the US of A and the fascist-in-charge is declaring warfare on the people. What fresh hell is this?!  But, that said, I’ve been hiding from American for years now, and quality lockdown time in Italy can make all the difference. Nonethless, I find myself oscillating wildly between unfettered excitement about the possibilities of Reclaim and fear and loathing of our geo-political moment. As all the cool technologists say, I can’t go on, I’ll go on….

For anyone following along with my migrations since January, there have been 4 total. I migrated from one of Reclaim Hosting’s shared hosting server’s in early January because the bava was becoming an increasingly unpredictable neighbor. The HOA stepped in, it wasn’t pretty. So, it meant new digs for the bava, and I blogged my moved from cPanel to a Digital Ocean droplet that I spun up. I installed a LEMP environment, setup email, firewall, etc. I started with a fresh Centos 7.6 sever and set it up as a means to get more comfortable with my inner-sysadmin. It went pretty well, and costs me about $30 per month with weekly backups. But while doing a migration I discovered a container-based WordPress hosting service called Kinsta which piqued my interest, so I tried that out. But it started to get pricey, so I jumped back to Digital Ocean in April (that’s the third move) thinking that was my last.*

Imag

But a couple of weeks later I was strongly considering a fourth move to test out a new platform we’re working on, Reclaim Cloud, that would provide our community a virtualized container environment to fill a long-standing gap in our offerings to host a wide array of applications run in environments other than LAMP. I started with a quick migration of my test Ghost instance using the one-click installer for Ghost (yep, that’s right, a one-click installer for Ghost). After that it was a single export/import of content and copying over of some image files. As you can see from the screenshot above, while this Ghost was a one-click install, the server stack it runs on is made visible. The site has a load balancer, an NGINX application server, and a database which we can then scale or migrate to different data centers around the world.

In fact, geo-location at Reclaim for cloud-based apps will soon be a drop-down option. You can see the UK flag at the top of this one as hope springs eternal London will always be trEU. This was dead simple, especially given I was previously hosting my Ghost instance on a cPanel account which was non-trival to setup. So, feeling confident after just a few minutes on a Saturday, I spent last Sunday taking on the fourth (and hopefully final) migration of this blog to the Reclaim Cloud! I’ve become an old hand at this by now, so grabbing a database dump was dead simple, but I did run into an issue with using the rsync command to move files to the new server, but I’ll get to that shortly.

First, I had to setup a WordPress cluster that has a NGINX load balancer, 2 NGINX application servers, a Gallera cluster of 3 MariaDB databases, and a NFS file system. Each of these are within their own containers, pretty cool, no? But don’t be fooled, I didn’t set this up manually—though one could with some dragging and dropping—the Reclaim Cloud has a one-click WordPress Custer install that allows me to spin-up a high-performance WordPress instance, all of which are different layers of a containerized stack:

And like having my own VPS at Digital Ocean, I have SSH and SFTP access to each and every container (or node) in the stack.

In fact, the interface also allows access and the ability to edit files right from the web interface—a kind of cloud-based version of the File Manager in cPanel.

I needed SSH access to rsync files from Digital Ocean, but that is where I ran into my only real hiccup. My Digital Ocean server was refusing the connection because it was defaulting to a SSH key, and given the key on the Reclaim Cloud stack was not what it was looking for, I started to get confused. SSH keys can make my head spin, Tim explained it like this:

I never liked that ssh keys were both called keys. Better analogy would be “private key and public door”. You put your door on both servers but your laptop has the private key to both. But the key on your laptop is not on either server, they both only have the public door uploaded. On your laptop at ~/.ssh you have two files id_rsa and id_rsa.pub. The first is the key. Any computer including a server that needs to communicate over ssh without a password would need the key. And your old server was refusing password authentication and requiring a key.

That’s why Timmy rules, after that I enabled the prompting of an SSH server password when syncing between the Cloud and Digital Ocean using this guide. After that hiccup, I was in business. The last piece was mapping the domain bavatuesdays.com:

And issuing an SSL certificate through Let’s Encrypt:

It’s worth noting here that I am using Cloudflare for DNS, and once I pointed bavatuesdays.com to the new IP address and cleared the local hosts file on my laptop the site resolved cleanly with https and was showing secure. Mission accomplished. I was a cloud professional, I can do anything. THE BAVA REIGNS! I RULE!  Ya know, the usual crap from me.

But that was all before I was terribly humbled by trying to migrate ds106.us the following day. That was a 5-day ordeal that I will blog about directly, but until then—let me enjoy the triumph of a new, clustered day of seamless expansion of resources for my blog whenever resources run high.

I woke up to this email which is what the clustering is all about, I have bavatuesdays set to add another NGINX application server to the mix when resource on the existing two go over 50%. That’s the elasticity of the Cloud that got lost when anything not on your local machine was referred to as the cloud. A seamlessly scaling environment to meet the resource demands, but only costing you what you use like a utility was always the promise that most “cloud” VPS providers could not live up to. Once the resource spike was over I got an email telling me the additional NGINX node was spun down. I am digging this feature of the bava’s new home; I can sleep tight knowing the server Gremlins will be held at bay my the elastic bands of virtualized hardware.


*I worked out the costs of Digital Hosting vs Kinsta, and that was a big reason to leave Kinsta given the bava was running quite well in their environment.

N.B:  While writing this Tim was working on his own post and he found some dead image links on the bava as a result of my various moves, and with the following command I fixed a few of them 🙂
wp search-replace 'https://bavatuesdays.com/wp-content/uploads' 'https://bavatuesdays.com/wp-content/uploads'
….
Made 8865 replacements. Please remember to flush your persistent object cache with `wp cache flush`.