Migrating ds106 to the Reclaim Cloud

If the migration of bavatuesdays was a relatively simple move to Reclaim Cloud, doing the same for ds106 was anything but. Five days after starting the move I finally succeeded, but not before a visceral sense of anguish consumed my entire week. Obsession is not healthy, and at least half the pain was my own damn fault. If I had taken the time to read, in its entirety, Mika Epstein’s meticulous 2012 post about moving a pre-3.5 version of WordPress Multisite from blogs.dir to uploads/sites, none of this would have ever happened.

I started the migration on Tuesday of last week, and I got everything over pretty cleanly on the first try. At first glance everything was pretty much working so I was thrilled. I was even confident enough to point DNS away from the low-tenant shared hosting server it had been residing on.*

The question might be asked, why move the ds106 sites to Reclaim Cloud at all? First off, I thought it would be a good test for seeing how the new environment handles a WordPress Cluster that is running multisite with subdomains. What’s more, I was interested in finding out during our Reclaim Cloud beta exactly how many resources are consumed and how often the site needs to scale to meet resource demands. The point was not only to do a little stress-testing on our one-click WordPress Cluster, but also to try and get insight into costs and pricing. All that said, Tim did warn me that I was diving into the deep end of the cloud given the number of moving parts ds106 has, but when have I ever listened to reason?

Like I said, everything seemed smooth at first. All pages and images on ds106.us were loading as expected; I was just having issues getting local images to load on subdomain sites like http://assignments.ds106.us or http://tdc.ds106.us. I figured this would be an easy fix, and started playing with the NGINX configuration, given I knew from experience this was most likely a WordPress Multisite re-direct issue. WordPress Multi-user (WPMU) was merged into WordPress core in version 3.0 and renamed Multisite. After that, older instances like ds106 kept working off legacy code, and one of the biggest differences is where images are uploaded and how they are masked in the URL. In WPMU, images for sub sites were uploaded to wp-content/blogs.dir/siteID/files, and .htaccess rules re-wrote the URL to show as http://ds106.us/files/image1.jpg. Since WordPress 3.5, new WordPress Multisite instances upload to wp-content/uploads/sites/siteID and no longer mask the path, so the URL includes the whole thing, namely http://ds106.us/wp-content/uploads/sites/siteID/image1.jpg.

So, that’s a little history to explain why I assumed it was an issue with the .htaccess rules masking the subdomain URLs. In fact, in the end I was right about that part at least. But given ds106.us was moving from an Apache-based stack to one running NGINX, I made another assumption that the issue was with the NGINX redirects, and that’s where I was wrong and lost a ton of time. On the bright side, I learned more than a little about the nginx.conf file, and let me take a moment to document some of that below for ds106 infrastructure posterity. So, the .htaccess file is what Apache uses to control re-directs, and those look something like this for a WordPress Multisite instance before 3.4.2:

# BEGIN WordPress
RewriteEngine On
RewriteBase /
RewriteRule ^index\.php$ - [L]

# uploaded files
RewriteRule ^files/(.+) wp-includes/ms-files.php?file=$1 [L]

RewriteCond %{REQUEST_FILENAME} -f [OR]
RewriteCond %{REQUEST_FILENAME} -d
RewriteRule ^ - [L]
RewriteRule . index.php [L]
# END WordPress

In WordPress 3.5 the ms-files.php handler was deprecated, and this was my entire problem, or so I believe. Here is a copy of the .htaccess file for WordPress Multisite after version 3.5:

RewriteEngine On
RewriteBase /
RewriteRule ^index\.php$ - [L]

# add a trailing slash to /wp-admin
RewriteRule ^wp-admin$ wp-admin/ [R=301,L]

RewriteCond %{REQUEST_FILENAME} -f [OR]
RewriteCond %{REQUEST_FILENAME} -d
RewriteRule ^ - [L]
RewriteRule ^(wp-(content|admin|includes).*) $1 [L]
RewriteRule ^(.*\.php)$ $1 [L]
RewriteRule . index.php [L]

No reference to ms-files.php at all. But (and here is where I got confused, because I do not have the same comfort level with nginx.conf as I do with .htaccess) the NGINX configuration on the Reclaim Cloud server includes a separate subdom.conf file that deals with these re-directs like so:

    #WPMU Files
    location ~ ^/files/(.*)$ {
        try_files /wp-content/blogs.dir/$blogid/$uri /wp-includes/ms-files.php?file=$1;
        access_log off;     log_not_found off;      expires max;
    }

    #WPMU x-sendfile to avoid php readfile()
    location ^~ /blogs.dir {
        internal;
        alias /var/www/example.com/htdocs/wp-content/blogs.dir;
        access_log off;     log_not_found off;      expires max;
    }

    #add some rules for static content expiry-headers here
}

(See more on nginx.conf files for WordPress here.)

Notice the reference to WPMU in the comments, not WPMS. But I checked the ds106.us instance on the Apache server it was being migrated from, and this line existed:

RewriteRule ^files/(.+) wp-includes/ms-files.php?file=$1 [L]

So ds106 was still trying to use ms-files.php even though it was deprecated long ago. While this is very much a legacy issue that comes with having a relatively complex site online for over 10 years, I’m still stumped as to why the domain masking and redirects for images on the subdomain sites worked cleanly on the Apache server but broke on the NGINX server (any insight there would be greatly appreciated). Regardless, they did and everything I tried to do to fix it (and I tried pretty much everything) was to no avail.

I hit this post on Stack Exchange that was exactly my problem fairly early on in my searches, but avoided doing it right away given I figured moving all uploads for subdomain sites out of blogs.dir into uploads/sites would be a last resort. But alas, 3 days and 4 separate migrations of ds106 later, I finally capitulated and realized that Mika Epstein’s brilliant guide was the only solution I could find to get this site moved and working. On the bright side, this change should help future-proof ds106.us for the next 10 years 🙂

I really don’t have much to add to Mika’s post, but I will make note of some of the specific settings and commands I used along the way as a reminder when in another 10 years I forget I even did this.

I’ll use Martha Burtis’s May 2011 ds106 course (SiteID 3) as an example of a migrated subdomain to capture the commands.

The following command moves the files for the site with ID 3 (may11.ds106.us) into their new location at uploads/sites/3:

mv ~/wp-content/blogs.dir/3 ~/wp-content/uploads/sites/

This command takes all the year and month-based files in 3/files/* and moves them up one level, effectively getting rid of the files directory level:

mv ~/wp-content/uploads/sites/3/files/* ~/wp-content/uploads/sites/3

At this point we use the WP-CLI tool to do a find and replace across the database, swapping every URL that refers to may11.ds106.us/files for may11.ds106.us/wp-content/uploads/sites/3:

wp --network --allow-root search-replace 'may11.ds106.us/files' 'may11.ds106.us/wp-content/uploads/sites/3'

Then you do this 8 or 9 more times, once for each subdomain. This would obviously be very, very painful and would need to be scripted for a much bigger site with many 10s, 100s or 1000s of sub sites.†
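A minimal sketch of how those three per-site steps might be scripted. It is an illustration under stated assumptions, not something taken from the original migration: the WP_ROOT and WP_CLI variables and the migrate_site function name are made up here, and only site 3 (may11.ds106.us) comes from the post. Unlike the plain `mv .../files/*`, it uses find so that hidden files move too, and it removes the emptied files directory afterwards.

```shell
#!/usr/bin/env bash
# Sketch only: WP_ROOT, WP_CLI and migrate_site are illustrative assumptions.
WP_ROOT="${WP_ROOT:-$HOME}"   # directory that contains wp-content
WP_CLI="${WP_CLI:-wp}"        # set WP_CLI=echo to print the database step instead of running it

# migrate_site ID DOMAIN
# Repeats the three manual steps for one subsite.
migrate_site() {
  local id="$1" domain="$2"
  local old="$WP_ROOT/wp-content/blogs.dir/$id"
  local new="$WP_ROOT/wp-content/uploads/sites/$id"

  if [ ! -d "$old" ]; then
    echo "skip $id: $old not found" >&2
    return 1
  fi
  if [ -e "$new" ]; then
    echo "skip $id: $new already exists" >&2
    return 1
  fi

  mkdir -p "$WP_ROOT/wp-content/uploads/sites"

  # 1. blogs.dir/ID becomes uploads/sites/ID
  mv "$old" "$new"

  # 2. Lift the year/month folders out of files/ and drop the empty directory
  if [ -d "$new/files" ]; then
    find "$new/files" -mindepth 1 -maxdepth 1 -exec mv {} "$new/" \;
    rmdir "$new/files"
  fi

  # 3. Rewrite the old masked /files URLs in the database
  $WP_CLI --network --allow-root search-replace \
    "$domain/files" "$domain/wp-content/uploads/sites/$id"
}

# Example call (add one line per subsite, using your own ID/domain pairs):
#   migrate_site 3 may11.ds106.us
```

Running it first with WP_CLI=echo against a copy of wp-content is a cheap way to check the file moves before touching the database, and a database backup beforehand is still a good idea.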

To move over all the files and the database I had to run two commands. The first was to sync files with the new server:

rsync -avz root@ds106.oldserver.com:/home/ds106/public_html/ /data/ROOT/

Rsync is the best command ever and moves GBs and GBs of data in minutes.

The second command was importing the database, which is 1.5 GB! I exported the database locally, zipped it up, uploaded it to the database cluster container, unzipped it, and then ran the database import tool, which takes a bit of time:

mysql -u user_name -p database_name < SQL_file_to_import

After that, I had to turn off ms_files_rewriting, the culprit behind all my issues. That command was provided in Mika’s post linked to above:

INSERT INTO `my_database`.`wp_sitemeta` (`meta_id`, `site_id`, `meta_key`, `meta_value`) VALUES (NULL, '1', 'ms_files_rewriting', '0');

You also need to add the following line to wp-config.php:

define( 'UPLOADBLOGSDIR', 'wp-content/uploads/sites' );

The only other thing I did for safe-keeping was create a quick plugin function based on Mika’s stupid_ms_files_rewriting to force the re-writing for any stragglers to the new URL:

// Point BLOGUPLOADDIR at the current site's new uploads/sites/ID path.
function stupid_ms_files_rewriting() {
    $url = '/wp-content/uploads/sites/' . get_current_blog_id();
    define( 'BLOGUPLOADDIR', $url );
}
add_action( 'init', 'stupid_ms_files_rewriting' );

I put that in mu-plugins, and the migrated ds106.us multisite install worked! There was some elation and relief this past Saturday when it finally worked. I was struggle-bussing all week as a result of this failed migration, but I am happy to say the Reclaim Cloud environment was not the issue, rather legacy WordPress file re-writes seemed to be the root cause of my problems.

I also had to update some hardcoded image URLs in the assignment bank theme, but that was easy. The only thing left to do now is fix the ds106 MediaWiki instance and write that to HTML so I can preserve some of the early syllabi and other assorted resources. It was a bit of a beast, but I am very happy to report that ds106 is now on the Reclaim Cloud and receiving all the resources it deserves on-demand 🙂


*VIP1 was the most recent in a series of temporary homes, given how resource intensive the site can be as the syndication hub it has become.

†I did all these changes on the Apache live site before moving them over (take a database back-up if you are living on the edge like me), and then used the following tool to link all the

WordPress Multisite: Multi-Network versus Multiple Independent Networks

One of the things we find ourselves doing more and more of at Reclaim Hosting is managed hosting, in particular for WordPress Multisite (WPMS). In the end was the beginning for this blog. So, I was on a call last week where the discussion centered on running multiple, independent WPMS instances versus one WPMS instance with multiple networks, i.e., sites.stateu.org and courses.stateu.org represent two functioning WPMS instances using subdomains (or subdirectories) such as mysite.sites.stateu.org or mycourse.courses.stateu.org that both point to and share one set of core WordPress files. I experimented with this over 10 years ago by running a WPMS (then called WPMU) service for Longwood University off the core WordPress files of UMW Blogs. I thought it would be revolutionary for the ability to share infrastructure across Virginia public institutions of higher ed, but not so much. That said, I was glad to see Curtiss Grymala take the whole idea of multi-networks to the next level for UMW’s main website.

Anyway, enough about the past, that was then, this is now …. for now. The question is why would you run several independent WPMS instances with distinct core files versus running multiple instances of WPMS off of one shared set of files, plugins, themes, etc.? For me the value of running everything off one shared set of files was shared themes, plugins, and updates that make management easier than across numerous separate installs.* Another benefit was a single space for site/user administration between networks. Additionally, managing single sign-on through one instance should prove a bit easier for setup, but I will need to double-check on this one. I also know you can have various portals for each WPMS network mapped on a single set of files, so it will not be confusing for the users; for them the fact they share core files will be invisible. So, in this regard the choice comes down to whether or not consolidation makes sense for the WPMS admin, which is often a question of convenience.

But there may be some practical reasons not to use a multi-network setup. For example, if you are planning on running thousands of sites on each of these WPMS instances, you may want to keep them separate given scaling issues with the WPMS database.** Having three WPMS instances share core files means if one goes down, they all go down, which can be an issue. Also, if you have an existing WPMS site you want to incorporate into an existing multi-network setup, it may get tricky depending on whether there are shared users across the various instances of WPMS that you’re combining. I will have to do more research here, and would love to know about anyone’s experience in this regard, but I imagine users across a multi-network instance would need to be able to access the various networks with the same email/username for the sake of both convenience and single sign-on (which are often one and the same).

Which raises another question that I’m unsure of: if users sign in through one network of a multi-network setup, can they cleanly move between sites on different networks? I’m wondering if keeping single sign-on and users separate in this instance may prove less problematic in the long run. I’ll be working through these scenarios this week, but wanted to post this here because I know a few folks have experience with running multi-networks on big sites, and I wanted to be sure I was not overlooking any major red flags before making some recommendations.


*It also allows you to share any premium themes or plugins across one instance.

**Although if this is the case you will have to shard databases anyway, so one could argue it would be easier to do that for one instance rather than many. 

Reclaiming WordPress Multisite

Lauren Brumfield already announced that we’re officially rolling out WordPress Multisite (WPMS) hosting at Reclaim. What’s more, she created an online calculator that provides transparent pricing going forward, which is a big part of why we’re finally announcing something we have done for years. While we’ve been pretty laser-focused on shared hosting and Domain of One’s Own for the last four years, we’ve still picked up more than a few WPMS instances. In fact, we jumped in at the deep-end of the pool when we started hosting the colossus that is VCU’s Rampages. As a result Tim was able to really fine-tune high demand WPMS environments like Rampages, and we’re in a situation where we can comfortably manage just about anything out there in higher ed.

It’s fun for me because I cut my teeth on WordPress Multiuser (even before it was multisite), and when Tim came onboard at UMW the first thing we asked him was how he felt about managing UMW Blogs. The rest is Reclaim history, he proved an insanely quick study and went from UMW Blogs to Hippie Hosting to Domain of One’s Own to Reclaim Hosting in two short years. That’s a resume!

Back to the future: we really weren’t comfortable with announcing WPMS at Reclaim too early because we were experimenting with different setups across various data centers like AWS, Linode, and Digital Ocean, so things were always custom based on several factors, which meant the pricing varied. But when Digital Ocean recently announced their new plans and pricing model, we were sure we had a solid setup through Digital Ocean that would allow us to stabilize our WPMS offerings as well as make them extremely competitive when it comes to pricing.

Before we could announce anything we had to reach out to all existing customers and let them know of the new setup; for many of them this meant a significant savings. Once we took care of that, we figured it was high time to officially announce that we are in the WPMS hosting business. So if you have a WPMS site you want to offload to external hosting, let’s talk. Pricing is simple: server, backups, and software licensing (Bitninja, cPanel, etc.) at cost, whereas our monthly maintenance and management fees start at $250 per month. This model finally allows us to decouple server and software costs from management demands, and establishes a baseline for what our time is worth to ensure you get the service we’re known for. It also makes clear that what you pay us for is not the hardware or software, but the peace of mind that tried and true experts are on the job. I mean let’s be honest here, this isn’t some hack outfit working from a ramshackle UMW office in duPont Hall trying to duct tape together some kind of chitty-chitty-bang-bang syndication solution, we’re professionals, and for that you must pay!