Reclaim Cloud Case Study: Containing TEI Publisher in the Cloud

It started out as an innocent enough ticket into Reclaim Hosting: Dr. Laura Morreale, whose work involves transcribing and translating texts from medieval manuscripts using online digital facsimiles, asked if we could run eXist-db on her cPanel account in shared hosting. In particular she needed to run TEI Publisher, an open source application that is described as follows in its documentation:

The motivation behind TEI Publisher was to provide a tool which enables scholars and editors to publish their materials without becoming programmers, but also does not force them into a one-size-fits-all framework. Experienced developers will benefit as well by writing less code, avoiding redundancy, improve maintenance and interoperability – to just name a few. TEI Publisher is all about standards, modularity, reusability and sustainability!

A quick look at the basic installation documentation for eXist-db told me it was a Java app, which is a hard no for cPanel. But avoiding hard NOs when someone comes asking for help is one of the main reasons we started Reclaim Cloud. A cursory search for a Docker container for this application led me to one that seemed outdated. I responded suggesting we could try installing it on the Cloud if there was a current Docker image, which I was not finding. Turns out I wasn’t looking hard enough: it was linked from the eXist-db homepage right in front of my eyes. I was wrong, and Dr. Morreale responded suggesting she was becoming increasingly frustrated trying to get this application running online, saying, and I misquote for comic effect: “Dammit Jim, I am a Medievalist, not a server admin!” She was right, and this was why we started the Cloud in the first place; I needed to try harder. What’s more, I appreciated the fact she was so determined to make this work. So much so that soon after the last email I sent trying to get this working, she sent me a link to the right Docker container on the recommendation of the folks at eXist-db:

That was all we needed: I simply searched for this container in the Docker area when creating a new environment in Reclaim Cloud:

I clicked “Next,” added the subdomain of this test environment, in my example teipublisher.us.reclaim.cloud (now deleted), and then clicked “Create.”

And within moments I was able to access the site at that subdomain:

The eXist-db splash page redirects to a suite of tools, including TEI Publisher!

A click on that icon brings us into that application:
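For anyone curious to kick the tires outside of Reclaim Cloud, the same container should run on any Docker host. A rough sketch, with the caveat that the image name here is from memory and worth checking against the link the eXist-db folks shared (eXist-db listens on port 8080 by default):

docker pull existdb/teipublisher
docker run -d --name teipublisher -p 8080:8080 existdb/teipublisher
# then browse to http://localhost:8080 for the eXist-db splash page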

While there are still a few things to work out in regards to user management for the application, it seems like we may have a winner with this Docker container. In fact, Dr. Morreale’s struggle highlights a pain point for many humanities PhDs who need to run an application that demands a bespoke server environment. This is where the value of containers is extremely evident: running a Java server environment that gives a Medievalist a stable and citable publication venue for her transcriptions and translations is a perfect case in point. Dr. Morreale was kind enough to furnish me with some insight into her work, process, and challenges for this post:

Like a growing number of humanities PhDs, I am an independent scholar who maintains relationships with several programs and institutions. I am currently affiliated in an official capacity with Fordham, Georgetown, and Harvard Universities, and am also engaged in ongoing projects with partners at Stanford and Princeton Universities.  My medievalist practice has always been characterized by a physical distance from both the repositories that hold sources which I study, and the institutions where my scholarly work finds its home. For this reason, digital methods have offered me a solution for my scholarly work when I had few others.

Some of the most rewarding efforts, which have in turn informed much of my traditional analytical work, involve transcribing and translating texts found in medieval manuscripts using online digital facsimiles. Using a tool called FromThePage combined with IIIF image technology, I can now easily choose digitized manuscript images from any online repository, upload them, then immediately begin to transcribe the text from the medieval source. I can also translate my own transcription after it is complete, and I have undertaken both individual and collaborative translation projects using this method. Right now my projects include a corpus of early 13th century aristocratic legal codes from Crusader Cyprus, a rarely-cited history of Florence that was buried in a late 14th-century letter from a father to his son, and a little known work by Renaissance Florentine Leon Battista Alberti, found in a larger manuscript that has been broken up, with parts of it now housed at Harvard’s Houghton Library.

The one difficulty has been to find a stable and citable publication venue for these transcriptions and translations. I have tried several different programs over the years, but could never easily publish all the work I had done to bring more attention to these texts and manuscripts. Using Reclaim Hosting and a program called TEI Publisher allows me to create the kind of edition I would like, and to integrate images, notes, and other explanatory materials into my online editions.

In the end, the fact that we could help Dr. Morreale get what she needed fairly seamlessly is a thrill, and it highlights everything we hoped Reclaim Cloud would be. I am planning on turning this Docker container into a one-click application for the Reclaim Cloud marketplace so that other folks can hopefully scratch a similar itch. And special thanks to Dr. Morreale for so generously sharing her process and work to complete this post. Avanti!

The Evolution of the Cloud

We have some really exciting things in the works that we'll be sharing in the coming days and weeks. But I think it's worth first taking a look back at my personal history in hosting, and Reclaim Hosting's, to understand why I think this next step is such an important one. This will be a long post, but anyone who has been watching or has been a part of this history will likely appreciate the context for how these ideas have evolved (I know I personally do, and after all that's what this blog is for).

Shortly after I joined DTLT at the University of Mary Washington in the summer of 2011, one of my first projects was to start DTLT Today, a semi-regular video podcast that everyone in the group participated in to some capacity. We streamed it live, which provided a really cool synchronous element, and I still go back and watch the shows today to get a sense of where our heads were at around particular topics at the time. You can find all the episodes on my YouTube channel if you scroll back far enough. And that streaming part was an interesting one. I had experience through ds106.tv with third-party tools like Justin.tv (now defunct because they became Twitch), and this was before the days of YouTube offering free streaming. I got it in my head to play around with Amazon Web Services to quickly create a streaming server of our own running streaming software called Wowza. The software would have been quite expensive and we surely had no server of our own on campus I could run it on, but for just an hour of live streaming we could do it very cheaply (pennies per day).

Streaming Live Video without Ads for Pennies
I’ve been in search of a decent low cost video streaming solution for a long time now. It doesn’t take long playing with Ustream [http://ustream.tv] and Livestream [http://livestream.com…

That was my first time experiencing what "the cloud" could really do for web-based projects. The distributed, plug-and-play nature of the tools got me hooked because there was a very low barrier to entry. If something wasn't working right, destroy it and start a new one. There was a marketplace of applications I could rely on for prebuilt stuff, and even back in 2011 AWS had a lot of different services (now exponentially so).

By 2014 we had started to turn our eyes to AWS once again, this time as a possible alternative hosting environment for UMW Blogs. We had suffered performance issues on that platform for quite a while as we tried to handle growth, and this seemed an opportunity for us to have a more flexible infrastructure that could grow with that project. We had planned to just move the database to an RDS instance and maybe the uploads to S3. But we ran into such latency that rather than scrap everything we added an EC2 server and moved the whole application. And it was so fast!

UMW Blogs in the Cloud
Anyone tasked with running a large WordPress Multisite install at an institutional level has likely dealt with their fair share of issues in the past few years from database scaling with large growth to the constant barrage of spam and login attempts that can DDOS a server and bring it to its knee…

In the Fall of 2014 I had also begun playing with a piece of software I was starting to hear about called Docker. Something about servers being the old guard and "containers" being the new hotness. But what was a container? On the face of it they felt a lot like virtual servers. There was a hub I could search to find all kinds of different applications, just like at AWS. But these were much faster than spinning up an EC2 instance: because containers didn't have to boot an entire operating system or claim all the hardware of the computer they were running on, they were much smaller virtualized instances of just the application they were running, with the ability to share all the common OS-level stuff across multiple containers.
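To make that speed difference concrete, here's a minimal sketch using the public nginx image (any image from the hub works the same way):

docker run -d --name web -p 8080:80 nginx
# the container is serving within a second or two; an EC2 instance
# running the same thing would take minutes to boot
docker ps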

Keep in mind Reclaim Hosting had been a thing for a year at this point, and Domain of One's Own at UMW for two years. Add to that Hippie Hosting and I had about three years of server experience under my belt, but it was all in the context of shared hosting with Plesk and cPanel. Because Docker allowed me to run applications I could never run previously on LAMP servers, I was immediately hooked. I figured out a way to share multiple Ghost containers on a single server, so we started offering that through Reclaim Hosting. We did the same with Discourse forum hosting. I even had a Federated Wiki instance up and running for a time on Docker containers (remember that?!). It's been a long-running dream for us to offer an elegant solution that lets people play with this stuff. But there was still a lot of manual work creating and removing these containers when users would sign up, and the lack of a real solution meant slow adoption and difficulty supporting it.

Fast forward to the end of 2014 and I've given notice that I'm going to do Reclaim Hosting full time, quite a leap of faith given the security of working at a public institution. But this shit was fun, people were trusting us, and the community was growing. In December of that year Jim and I had the thought to bring Kin Lane to Fredericksburg to talk to us about APIs and think through how we might build an architecture for that at Reclaim Hosting. My head was still in the clouds with all the Docker stuff and we couldn't resist thinking about it in terms of containers, with all the different endpoints and applications as various containers with their own unique hostnames. It was a solid few days of brainstorming that pushed the very boundaries of my understanding at the time, and I still treasure not just these scribbled notes on a whiteboard but also the generosity of Kin's time with us.


In early 2015 I learned of a project called Sandstorm that allowed for container-based hosting of applications. It was open source, and I set up a server to start playing. It checked a lot of boxes in making it easy to start containers running Ghost, Etherpad, and other apps without the overhead of needing to be a developer, but the drawbacks were that it was not using Docker and building apps to run in the environment was very difficult with their model. It also seemed to pretty heavily favor collaboration applications over publishing, so oddly enough content management systems were more difficult to work with in their model. The group developing Sandstorm got hired by Cloudflare and the project is mostly abandoned at this point, but I know OpenETC has made quite a bit of use of the platform.

The Coming Sandstorm.io

By August of 2015 I had returned to AWS again, this time to experiment with moving Reclaim Hosting's main site to a distributed stack there. Not just a single EC2 instance with files in S3, but rather a load-balanced stack with multiple app servers in different regions, a CDN on top, and deployment through Git with a staging server to boot. This was amazing, but it also ended up being quite expensive and a lot of overhead to manage. I didn't find Amazon's tools very user-friendly (they are much more developer-focused, of course) and I couldn't justify the high costs of multiple EC2 instances for a site that, while mission critical, also wasn't a massive load at the time even on a shared server.

Reclaim Hosting in the Cloud
This past week marked the 2 year anniversary of Reclaim Hosting [https://reclaimhosting.com] and what started as something of an experiment has turned into a successful business and one of the most rewarding things I’ve been a part of in my professional career. We’ve come a long way in 2 years but w…

Three years ago, in the Fall of 2017, I learned of a project called Cloudron and started playing with it. Similar to Sandstorm, you could install it on your own server, it was (key word: was) open source, and this system used Docker containers at its core. Even better, it made it fairly easy to map custom domains, provision SSL certs, and run your software on the web, and the list of supported apps had a lot of amazing stuff like GitLab, Ghost, Etherpad, and more. We piloted a program with Bates College in which we integrated a Cloudron server with their Domain of One's Own program, and we even considered at one point whether Cloudron could be the backend of some kind of future "Domains 2.0" environment where folks weren't limited by the LAMP stack. Unfortunately, what we found was that Cloudron's focus on being a personal server made it less of a great fit for a multi-tenant model. There was basic user management built in, but most models reflected a single admin or a few admins handling all the installation rather than the true platform-as-a-service style offering we were hoping for. The business model of Cloudron has also changed a lot over the years, with open source now being nothing more than a label (turns out the ability to install applications is not open source and requires subscription fees paid to them, so there's a lot of lock-in there), and development of our own applications was proving challenging because they supported Docker but you had to build with their container as a base and set up volumes in a very particular way for them to work. So back to the drawing board.

Beyond LAMP
Since Reclaim Hosting was founded in 2013, cPanel and the traditional “LAMP” stack have been at its core. LAMP is an acronym for Linux, Apache, MySQL, PHP and some of the more familiar applications you love like WordPress and Omeka run that tech stack. This comes with its own limitations as newer so…

The common themes here are that this stuff is not new (nothing ever is), but that it has traditionally been very complex and framed for developer audiences. And in a lot of ways that's where Reclaim Hosting found itself with the state of web hosting in 2013. It wasn't that cPanel or LAMP stacks were some brave new world we were exploring, but we saw an opportunity to build a community around providing easy hosting for educators in a context where you could get real support and build out a presence on the web. We found our niche in that and have continued to double down on it with our Domain of One's Own program and Managed Hosting services. The platform plays a big role, but so do the people.

Domain of One’s Own: Notes from the Trailing Edge

Maybe you're building your Next Generation Digital Learning Environment and want to use open source tools like Mattermost, Etherpad, Jitsi Meet, and WordPress to create a framework for your courses. Maybe you're like me in 2011 and want to play with some complex piece of software without investing in the high server costs and overhead of buying a VPS (ever tried to run your own install of Canvas?). Maybe you're curious to try running your blog on Ghost, like this one here. And maybe you want to start a project small, with the potential for it to grow in a big way over time, without having to continually adjust hosting environments to account for whatever new traffic you are experiencing (we see that a lot with WP Multisites on campuses, where year 1 looks very different from year 5 of the project, or hell, even a Sunday evening when everyone's homework is due). Is there a DH project you've been eyeing but aren't sure how to install and run? Are you trying to get support for newer scholarly publishing platforms at your university but aren't sure where to start?

Our efforts and experiments throughout all of this have been to answer the question of "what's next?" and look towards the future of web hosting with an evolution of the past. It needs to have a user-friendly interface. It needs to be affordable. It needs to be without limitations. It needs to scale.

Today starts that journey, and Reclaim Cloud is coming very soon. In the coming weeks we will be talking more and more about what this platform is and what it can provide. If any of this post hits on struggles or interests you've had, please do add your email to the list to be updated, and we'll be working to add more people to the beta as we ramp up towards a public launch.

Restarting a Discourse Container

We have a server that runs a kind of multisite Discourse environment that I discussed a number of years ago in this post. It is an Ubuntu server with Docker installed, and each of the Discourse instances on that server is spun up in a Docker container. It's a very small, experimental part of what we do. In fact, we discontinued offering Discourse and Ghost in this kind of environment a while back, and are far more interested in options like Cloudron, which makes hosting Ghost a breeze. That said, we have a couple of Discourse instances we still host, and today the biggest one went down, which is always a bit of a scare for me given it is a unique environment. So, this post is simply going to retrace my steps in the terminal to fix this, because I always forget given it is not something I do often enough.

When I learned the server was down I figured I would try stopping and restarting the container to see if that worked. To do that I needed to go to /var/discourse:

cd /var/discourse

From there, I tried to stop the container. (To find the container name, I looked in the /var/discourse/containers/ directory, which has a YAML file for each install; the container names are everything before the .yml extension.)
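For example (the file name here is hypothetical):

ls /var/discourse/containers/
# app.yml  →  the container name is "app"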

./launcher stop containername

That will stop the container and the following will restart it:

./launcher start containername
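If memory serves, the launcher also bundles the two into a single command:

./launcher restart containername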

But when I went to stop the container I got a storage full error, and when I ran a

df -h

on the server it was confirmed: the disk was full. I then proceeded to run the trusty ncdu command to get a sense of what was taking up all the space. I have a suspicion it might be related to the overlay2 storage space issue others have complained about with Docker, but I took the easy route and deleted 10 GB of old backups for the site, and it was immediately back up and running. In the end a restart was not necessary, and I was able to solve a fairly random issue fairly quickly.
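For my future self, the cleanup looked roughly like this (the backup path is from memory and may differ between installs, so verify before deleting anything):

ncdu /                               # interactively find the biggest directories
du -sh /var/lib/docker/overlay2      # check whether Docker's overlay2 storage is the culprit
rm /var/discourse/shared/*/backups/default/*.tar.gz   # clear old Discourse backups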

Look a(nother) Ghost

Since May of 2014 I have been playing on and off with the blogging platform Ghost. It has been an on-again, off-again affair, and I have never left WordPress for it, but rather use it as a test bed for exploring how Reclaim might host applications outside the LAMP stack—an ongoing theme for us over the last 3 or 4 years. So, I have been marking my progress with running Ghost both here on the bava as well as on my Ghost blog. I talked about the idea of this as the Next Generation Sandbox, experimented with getting Ghost running on AWS using Bitnami, feeble terminal work, setting up key pairs in AWS, moving to Reclaim's container-based setup for a kind of multi-site Ghost, setting up mail for Ghost, and most recently using Cloudron to set up Ghost.

Seven posts over three years about (and on) Ghost is not that much in the end (I'm running out of punny titles), but reading over them while writing this I realized there's a lot of learning wrapped up in trying to figure out AWS, Bitnami images, the command line, Docker containers, and Cloudron. All stuff I have been trying to focus on more and more, so this side site in many ways lives up to its subtitle: "Letters from the Cloud." And I came back to it recently because while I blogged about setting up Ghost through Cloudron back in September, my Ghost instance on Reclaim had been terminated when we decided to no longer offer it through Reclaim Hosting. Given my Ghost blogging had been dormant for a while, I totally forgot I was hosting it through Reclaim and it vanished. Luckily I blogged everything on Ghost through the bava, so nothing was lost, and I had backups of all images, etc. So, I used the occasion of things finally slowing down at Reclaim Hosting, and my being under the weather, to finally get BavaGhost back online, and now it is!


And you get a server, and you get a server, and you….

I have been remiss in responding to Keegan's post from early August exploring the idea of "A Server of One's Own," but I have not forgotten it. In fact, what he outlines in that post is something that dogs me regularly: namely, how can we provide more options for folks when it comes to hosting a more diverse array of applications beyond what Domain of One's Own currently provides?

Let me explain. As it stands right now, Domain of One's Own has definite technical limitations given it is built around a LAMP server environment. What does that mean? Well, it means that beyond HTML you are pretty much limited to the PHP, Python, and Perl scripting languages. Also, it only supports the Apache web server software and MySQL (or MariaDB) databases. In other words, it is a specific server environment (a.k.a. a stack) that only supports specific applications. But given the wild success of PHP apps over the last 15 years, in particular WordPress, for most of us web plebeians that has been enough.


A Domain of the Practical


Adam Croom offered up a hypothesis in response to my post about the “Long Short History of Reclaim.” He argues that as much as Domains at the University of Oklahoma is deeply embedded in a philosophy of empowerment, ownership, and experimentation, it’s also extremely useful. Who knew?!

OU Create for us has become a practical tool for our community as much as a philosophical one. It is indeed an infrastructure that makes building full websites possible for a much greater audience. It also gives us enough slack to build in a plethora of digital literacy components. This complexity is highly valuable in serving a range of needs.

I think the practical component of folks having their own space to publish easily to the web has been a huge draw. Tim has made the whole experience so seamless and dead simple that someone can literally help themselves to an Omeka or WordPress instance (or both) on a brand new domain in seconds. This is where the practical meets good design to make a near perfect marriage. When you take someone through a demo they’re incredulous, “That’s it?” And we’re convinced we can make it even more streamlined. While we’re driven by the ideals undergirding reclaiming the web, we are also deeply conscious of the fact that good design with practical applications will make that vision a reality quicker than any of the rhetoric.

Another interesting post that dovetails with this idea is the great Tony Hirst’s “Getting Your Own Space on the Web.”  Tony acknowledges the value of offering a space to folks who want to assume the responsibility of running their own applications for publishing to the web. But what about those who don’t?

What if you only want to share an application to the web for a short period of time? What if you want to be able to "show and tell" an application for a particular class, and then put it back on the shelf, available to use again but not always running? Or what if you want to access an application that might be difficult to install, or isn't available for your computer?

I would add to this: what if the application you want to install doesn't run on the widely popular LAMP stack we've built Reclaim Hosting on? This is where Tony's explorations of virtualized server environments and containers over the last year have been fascinating. Tony has traditionally been the canary in the coal mine when it comes to pushing innovative edtech. The work he's been doing and the questions he's been asking fit well with the work Tim and I have been pushing on for over a year (with some serious help from Kin Lane). How does this personal webspace also include virtualized apps and containers glued together with APIs, enabling experimentation with a wide range of applications across a variety of server environments and dependencies for short (or long) periods of time? How do we start realizing the possibilities of server infrastructure as a teaching and learning utility we can count on for fast, cheap, and out of control edtech?

Tony is thinking hard about how this affects deploying educational software for distance and online education, his role—assumed or official I don't know—at the Open University. That practical use case provides some truly compelling challenges and possibilities for such work. The issue remains that it's still not easy to work with virtual servers and containers, though Docker hosting services like Tutum are beginning to make some real headway in this regard. As my time at UMW comes to a close, more and more of my attention and focus will be pointed at this emerging virtual architecture of edtech, and what it might mean in terms of the work we do at Reclaim.

Abstractions: Running WordPress Multi-Site using AWS, Docker, and BTSync

Heads up: this is not a technical run through, but more of a conceptual overview. Apologies if you came here looking for a how-to. Hopefully we will have just that in the next few months.

But enough about the past, let’s talk about the future!

[Image: diagram of the AWS WordPress Multisite setup]

This past week Tim Owens and I went down to VCU's ALT Lab to meet with Tom Woodward, Jon Becker, and Mark Luetke about the work they're doing with Ram Pages. I already blogged about a couple of plugins they created for making syndication-based course sites dead simple. We also got to talking about some of the ways we have been using Amazon Web Services (AWS) to scale UMW Blogs. At this point Tim took us to school on the whiteboard, explaining a possible setup he has been imagining, which is still fairly experimental.

Don’t let Tim fool you, he is DevOps #4life now. He can be found in his spare time watching presentations about load balancing a site for a billion users or scaling infrastructure for small services like Netflix. I’m becoming more and more interested in infrastructure discussions because they highlight interesting trends in the shifting nature of tech that deeply affects edtech, such as virtualization, containers, and APIs.


Anyway, the image above is a look at a potential setup for a large WordPress Multisite instance on AWS. It has a couple of elements worth discussing in some detail because I want to try and get my head around each of them. The first is a load balancer that runs in its own EC2 instance.

[Image: the load balancer]

What the load balancer does is direct traffic to the EC2 instance running the WordPress core files with the least load. So if you have four EC2 instances, each running WordPress's core files, the one with the least usage gets the next request. Additionally, if all the instances have too great a load, another could, theoretically, be spun up to meet the demand. That's one of the core ideas behind elastic computing. The load balancer Tim used for UMW Blogs was HAProxy.
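I won't pretend to know Tim's actual config, but a minimal HAProxy sketch of the idea looks something like this (the backend addresses are made up, and a real config would add global/defaults sections, SSL, and so on):

frontend www
    bind *:80
    default_backend wordpress

backend wordpress
    balance leastconn                 # send each request to the server with the fewest connections
    server app1 10.0.0.11:80 check    # "check" enables health checks on each backend
    server app2 10.0.0.12:80 check
    server app3 10.0.0.13:80 check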


As mentioned above, you can set up a series of EC2 instances on AWS with the core WordPress files, save the wp-content directory, which is the only directory folks write to. But you will notice in the fourth instance Tim switched things up. He suggested here that we could have an EC2 instance running Docker that could then run several WordPress instances within it. What's the difference? This is where I am still struggling a bit, but from what I understand this allows you to spin up new instances quicker, isolate instances from each other for more security, and upgrade and switch out instances seamlessly. It effectively makes WordPress upgrades in a large environment trivial.
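As a hedged sketch of what that fourth instance might look like, using the official wordpress image from the hub (the names, ports, and database host are all illustrative, and the DB credential variables are omitted):

docker run -d --name wp1 -p 8081:80 -e WORDPRESS_DB_HOST=10.0.0.20 wordpress
docker run -d --name wp2 -p 8082:80 -e WORDPRESS_DB_HOST=10.0.0.20 wordpress
# upgrading is then just pulling the new image and swapping containers
docker pull wordpress
docker stop wp1 && docker rm wp1
docker run -d --name wp1 -p 8081:80 -e WORDPRESS_DB_HOST=10.0.0.20 wordpress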


We have yet another EC2 instance that is the network file storage; this holds the wp-content files: the uploads, plugins, themes, upgrades, etc. Each of the above instances shares this one. They all write to it, but one of the issues here is that it can be a single point of failure, kinda like the load balancer. So, Tim suggested BitTorrent Sync (BTSync), which I still don't totally understand but sounds awesome. It's basically technology that syncs files from your computer to a spot on the internet, or between spots on the internet, etc. So, what if we had several buckets where the various instances of WordPress core files were writing the upload files, themes, plugins, etc., and those buckets used BTSync to share between them almost immediately? Then you wouldn't have a single point of failure; you would have the various instances writing to various buckets of files that would be constantly syncing using the technology behind BitTorrent. Far out, right?

[Image: BTSync diagram]
BTSync provides the ability to immediately copy and sync files across several buckets of the same files that get written to regularly.

Another option, and I think this was before we started talking about BTSync, though I'm not sure whether it would be possible in addition to BTSync, is to have the blogs.dir folder for a WordPress Multisite, which holds all the files uploaded to individual sites, sent to S3, Amazon's file storage service.
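One hedged way to wire that up at the filesystem level would be s3fs, which mounts an S3 bucket as if it were a local directory (the bucket name and path here are hypothetical):

s3fs wpms-uploads /var/www/html/wp-content/blogs.dir -o allow_other
# assumes AWS credentials are already configured in ~/.passwd-s3fs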


You get the sense that part of what's happening when you move an application like WordPress Multisite onto AWS, or some other cloud-based, virtualized server environment, is that each element is abstracted out to its basic functions. Core files that are read-only are separate from anything that is written to, whether that be themes, plugins, or uploads. Additionally, the database is also abstracted out. Could you run an EC2 instance on AWS with Docker containers each running MySQL (with SharDB or HyperDB to further break up the load) that also replicate various writes and calls using BTSync? No single point of failure, and you greatly reduce the load on a WPMS. I'm completely out of my depth here, but if I accomplished anything it might be giving you an insight into my confusion, which is also my excitement about figuring out the possibilities.

I have no idea if this makes sense, and I would really love any feedback from anyone who knows what they are talking about, because I'm admittedly writing this to try and understand it. Regardless, it was pretty awesome hearing Tim lay it out, because it certainly provides a pretty impressive solution to running a large, resource-intensive WordPress Multisite instance.

Duke’s Website has Gone Docker

I was excited to see Tony Hirst retweet the news that Duke University's website is being run in a Docker environment, and can even be served through Amazon Web Services. Chris Collins, senior Linux admin at Duke, wrote about "Using Docker and AWS to Survive an Outage" they had as a result of DDoS attacks on their main site back in January. I love the way he tells the story:

While folks were bouncing ideas around on how to bring the site up again while still struggling with the outage, I mentioned that I could pretty quickly migrate the site over to Amazon Web Services and run it in Docker containers there. The higher-ups gave me the go-ahead and a credit card (very important, heh) and told me to get it setup.  The idea was to have it there so we could fail over to the cloud if we were unable to resolve the outage in a reasonable time.

TL;DR – I did, it was easy, and we failed over all external traffic to the cloud. Details below.

He goes on to describe his process in some detail, and it struck me how quickly IT infrastructure is shifting, and it also made me wonder how many IT organizations in higher ed are truly rethinking their architecture along these lines. It's one thing to push your services to a third-party vendor that hosts all your stuff; it's altogether different to bring in a team that understands and is prepared to move a university's infrastructure into a container-based model that can be hosted in the cloud. Not to mention what this might soon mean for personal options, and a robust menu of teaching and learning applications heretofore unimaginable. This would make the LAMP environment options Domain of One's Own offers look like Chucky from Child's Play.

I know Tim and I are looking forward to thinking about what such a container-based architecture might means for an educational hosting environment that is simple, personalized, and expansive. Tim turned me on to Tutum recently, which starts to get at the idea of a personalized cloud across various providers—something Tim Klapdor gets at brilliantly:

MYOS is very much the model that Jon Udell laid out as "hosted life bits" – a number of interconnected services that provide specific functionality, access and affordances across a variety of contexts. Each fits together in a way that allows data to be controlled, managed, connected, shared, published and syndicated. The idea isn't new, Jon wrote about life bits in 2007, but I think the technology has finally caught up to the idea and it's now possible to make this a reality in very practical way.

His post on the topic deserves a close reading, and it's the best conceptual mapping of what we might build that I have read yet. I wanna help realize this vision, and I guess I am writing about Duke University's move to Docker because it suggests this is the route Higher Ed IT will be moving towards anyway (sooner or later—which could be a long later for some). Seems we might have an opportunity to inform what it might look like for teaching and learning from the ground floor. It's not a given it will be better; that will depend upon us imagining what exactly a teaching and learning infrastructure might look like. Tim Klapdor has provided one of the most compelling visions to date, building on Jon Udell's thinking, but that's just the beginning.

Dockers

[GIF: the vanishing container from The Wire, season 2, episode 9]

The above GIF is from an episode of The Wire during season 2. The docks are ubiquitous in season 2, and this particular image is a visualization from a cloned machine that captures the vanishing container—presumably filled with illegal cargo. I'm fascinated by the representation of technology throughout the series, but season 2 in particular is really interesting. There's the highlighting of a cultural move to digital cameras, the increasing popularity of the web, GPS, and much more that's constantly being discussed, but there's also the radical changes to the physical technology of the dock. The first part of the following video features the presentation from season 2, episode 7 about the automation of the port of Rotterdam.

Frank Sobotka refers to this as a "horror movie," noting the eroding need for stevedores, and more generally for labor. The automated container technology becomes a sign of labor's vanishing past.


At the same time the container systems that have redefined the way shipping works have metaphorically come to servers thanks to Docker.


To the degree I fully understand it, Docker provides an open platform for building, running, and shipping distributed applications. In other words, you can get a pre-configured container through Docker that has the proper server environment for running a specific application. For example, if you want to run the forum software Discourse or the blog engine Ghost (which is what Tim Owens has figured out recently for Reclaim Hosting), we have a server with the Docker engine installed that allows us to quickly fire up different application environments and run them for anyone who requests it.
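As a concrete sketch of that workflow, firing up one such environment from the public Ghost image is just a couple of commands (the container name is illustrative; Ghost listens on port 2368 by default):

docker pull ghost
docker run -d --name bava-ghost -p 2368:2368 ghost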


And we are grabbing those application images from an open repository of virtualized possibilities that helps us avoid becoming overly dependent on a closed platform like Amazon Web Services, which is a major bonus. Additionally, Tim is playing with Shipyard, which allows you to manage the various containers and resources on your server. What strikes me about all of this is how the metaphorical language of docks, shipyards, and containers helps me wrap my head around this technology. What's more, it's cool to see it both through the eyes of Frank Sobotka and Tim Owens—two of my heroes :)