Broken backs and body-flounders: Building


Techies my age always seem to be the ones asked to fix this computer or that printer, only because we have some knowledge of these things. My current day job is a mix of web development, DevOps engineering, and infrastructure architecture. When I'm done at work for the day, it's rare that I want to do more web development.

When something you consider a good is threatened, and you can help, you gotta answer the call.

A blog in need...

EscherGirls is a comic and media commentary blog. Its goal -- by sheer relentless volume of examples -- is to demonstrate how women are distorted, misproportioned, or sexualized out of context. It's a method my college anthropology professor would have approved of. It started as a blog on Tumblr, a social network with a rather fraught history. Tumblr changed hands from an independent project to Yahoo, then to Oath/Verizon Media. Nothing much changed until last year, when new "community standards" descended.

The business motive was obvious -- the new owners wanted less a Batman Returns, and more a Batman Forever -- something safe and marketable from which they can shake profit. The problem is, a social network isn't just tech, nor is it just property, it's people. And the new owners had so little respect for the people that made up their new Facebook Killer. They began censoring the artwork, locking down blogs without warning or notice, while making EULA changes which forced many accounts off the platform altogether. Traffic and interest in the social network collapsed so badly, that once again, Tumblr is up for auction.

Excellent job. Quelle surprise.

I've been increasingly worried about this trend on the Internet the last few years. While I was fortunate to go online prior to the existence of the term "social media", many my age and younger have little idea how the web is made outside of Facebook, Twitter, and...whoever. While I maintain a presence on these networks, I try to keep most of my own content on my own site. After all, to post material, you need to sign an End User License Agreement (EULA) which grants ownership of whatever you post to the network. While this is intended to ensure posters can't sue the network for redistributing what they post there, many have used it as a legal pretext to steal artwork, strip credit, and bury independent creators. Several years ago, in a panic, I closed my DeviantArt account, losing a few smaller pieces and sketches of which I had no other copy.

EscherGirls' content was on the knife-edge of the new "community standards", and the blog's owner decided that she was done with US-run social media mega-giants. She wanted to create her own site, and host it in her own nation of Canada, where the copyright laws are rather less vindictive than they are in the 'States.

The problem was...she had never done this before.

When her call went out on her Twitter account, I was staring down the beginning of 6 months of 60-80 hour weeks of work. In addition to my day job, I often do freelance technical writing. I enjoy the work, but even with my holidays free (I haven't been invited to a family event in over a decade), it was a lot of work. Still, I didn't want to say no. In my view, EscherGirls does good work. It raises awareness while doing so with enjoyable snark and wit.

Not given a reason, I decided to go find one.

I decided to do this project for free, simply because I knew that it was unlikely to be anything else. However, there are a lot more reasons to do something than money alone. Migrating EscherGirls off of Tumblr and creating a new site wasn't just a positive contribution to the world in my view, but an opportunity to challenge myself and learn something new.

Getting the content

The first challenge was getting the content in the first place. When the "community standards" were announced, I decided to export all my content from the site, and delete everything. The exports took hours to days to produce, no doubt hampered by the sudden surge in demand. When downloaded, posts were available in an enormous XML file. In theory, that file could be extracted by some custom code, transformed, and then loaded into a new site. This ETL (Extract - Transform - Load) process is common in the industry.

Drupal 8 supports its own ETL process via the Migrate API. Having worked on half a dozen site migrations, as well as maintaining some migration modules on, I knew my way around. Still, the XML file was huge and a mess to try to process. Thanks GDPR, but I swear most companies aren't complying willingly.

Fortunately, an open source project called Tumblr Utils also performs an export of Tumblr content. Tumblr Utils creates an offline copy of your blog in a human-consumable format. You're given an index page, an archive page with year links, a directory of HTML files, and a directory of full-size media. Each post in the export is a single HTML file, and all the media files had unique names. While intended for people, the format was just structured enough to be machine readable.

My first task was to take the content of a single HTML file, and transform it into XHTML -- a stricter form of HTML that complies with the XML standard. Why is this important? As XML, I could leverage XPath to extract the content of individual tags. Fortunately, the exported content was so regular that this process was surprisingly easy to accomplish. I threw together a PHP script to handle the extraction given a file's contents. It showed enough promise that I was confident we could import it to any sort of site we wanted.
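The original script was PHP and isn't reproduced here. As a rough illustration of the idea, here is a minimal Python sketch using only the standard library. The markup in SAMPLE_POST is a made-up stand-in, not the real Tumblr Utils output, and the function name and field names are my own assumptions. It also assumes the file has already been converted to well-formed XHTML with no XML namespace declared.

```python
import xml.etree.ElementTree as ET

# Hypothetical XHTML for one exported post. The real Tumblr Utils
# markup will differ; this only illustrates the technique.
SAMPLE_POST = """<html>
  <body>
    <article id="post-123">
      <h2>Example title</h2>
      <p class="body">Some commentary.</p>
      <img src="../media/example_1.jpg" />
      <p class="tags"><a href="#">anatomy</a> <a href="#">covers</a></p>
    </article>
  </body>
</html>"""


def extract_post(xhtml: str) -> dict:
    """Pull a few fields out of a well-formed XHTML post.

    ElementTree supports only a subset of XPath, which is enough
    for markup this regular.
    """
    root = ET.fromstring(xhtml)
    article = root.find(".//article")
    return {
        "id": article.get("id"),
        "title": article.findtext("h2", default=""),
        "body": article.findtext("p[@class='body']", default=""),
        "images": [img.get("src") for img in article.findall(".//img")],
        "tags": [a.text for a in article.findall("p[@class='tags']/a")],
    }


post = extract_post(SAMPLE_POST)
```

Because every post file follows the same layout, one function like this can be pointed at each file in turn. Anything that isn't valid XML (an unclosed tag, an HTML-only entity) would make the parser raise an error, which is exactly why the XHTML conversion step comes first.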

You wanted Drupal, right?

So, I'm...opinionated about my CMS. I work with Drupal sites every day. While EscherGirls could have been created in WordPress, or Laravel, or any CMS, the vast majority of my experience is with Drupal. I knew that I could create the site she wanted in Drupal, but also that she could be supported by Drupal's welcoming community even without me.

Ownership was also a prime concern. Everything in Drupal is fully open source. Official modules must be Free and Open Source; they cannot be for profit. As a result, the software is fully under the control of the site owner. Drupal's community is international -- there are even events in the site owner's own city -- so I knew she wouldn't have to deal with someone else's "community standards" or predatory EULAs again.

Not just to learn, but to give back

In addition to learning, EscherGirls was an opportunity for me to write code and give that back to the community for everyone to use.

One of the problems with the Migrate API is that it tends to work best when migrating from a database to a database. When migrating from files in a directory to a database, migrations become less repeatable and interruptible. Often during development of a migration, you need to run and rerun the migration again and again until you catch most of the edge cases and 95% of the total source content.

The solution was to load all the files, do only minimal processing on them, and then store the content in a database table. Then, a follow-up migration can draw from the table and create the final data for the Drupal site, doing whatever additional transformation along the way. To resolve both of these problems, a pair of new Drupal modules were created: Migrate Directory, and Migrate Staging Table.

The first provides a migration source for files. It takes a directory as input, then loops through each file, presenting it to the migration as an input. Once we have the file's contents, we need to save them to a database table. This is where Migrate Staging Table comes into play. It provides a destination to save the contents to a custom database table, but also provides a means to create the table in the first place. Furthermore, the module also provides a migration source from the staging table to use in follow-up migrations.
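The two modules themselves are Drupal plugins written in PHP, and their code isn't shown here. To make the pattern concrete, here is a minimal Python sketch of the same idea using SQLite. The table name, column names, and function names are all my own assumptions for illustration; they are not the modules' actual schema or API.

```python
import sqlite3
from pathlib import Path


def stage_directory(source_dir: Path, conn: sqlite3.Connection) -> int:
    """Copy each HTML file's raw contents into a staging table.

    Rows are keyed by filename, so rerunning the step overwrites
    rows instead of duplicating them.
    """
    conn.execute(
        "CREATE TABLE IF NOT EXISTS staging ("
        "filename TEXT PRIMARY KEY, contents TEXT NOT NULL)"
    )
    count = 0
    for path in sorted(source_dir.glob("*.html")):
        conn.execute(
            "INSERT OR REPLACE INTO staging (filename, contents) VALUES (?, ?)",
            (path.name, path.read_text(encoding="utf-8")),
        )
        count += 1
    conn.commit()
    return count


def staged_rows(conn: sqlite3.Connection):
    """Yield (filename, contents) pairs for a follow-up step to consume."""
    yield from conn.execute(
        "SELECT filename, contents FROM staging ORDER BY filename"
    )
```

The point of the design is that the slow, fragile part (walking the filesystem) happens once, while the follow-up step reads from a table that can be queried, resumed, and rerun as often as needed.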

The follow-up migrations do a number of different transformations on each file. Often, we need to extract key pieces of data, search and replace key pieces of text (like part of a URL) or manipulate arrays of data. I've created several other modules at work for each of these purposes, and leveraged them for this site as well. While there's no one module that will take a Tumblr Utils export and import it into a Drupal site, there are several pieces that you can pull off the shelf for free and build your own solution.
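As a small illustration of the kinds of transformations involved, here is a Python sketch of two of them: swapping part of a URL and tidying an array of values. The function names and the example paths are hypothetical; the real work was done by Drupal migrate plugins, not this code.

```python
def rewrite_media_url(url: str, old_prefix: str, new_prefix: str) -> str:
    """Swap a leading URL prefix, leaving unrelated URLs untouched."""
    if url.startswith(old_prefix):
        return new_prefix + url[len(old_prefix):]
    return url


def normalize_tags(raw_tags: list[str]) -> list[str]:
    """Trim, lowercase, and de-duplicate tags, keeping first-seen order."""
    seen = set()
    result = []
    for tag in raw_tags:
        cleaned = tag.strip().lower()
        if cleaned and cleaned not in seen:
            seen.add(cleaned)
            result.append(cleaned)
    return result


# Example paths are assumptions, not the site's real layout.
new_url = rewrite_media_url(
    "../media/example_1.jpg", "../media/", "/sites/default/files/media/"
)
tags = normalize_tags([" Anatomy", "covers", "anatomy ", ""])
```

Each step is tiny on its own. The value comes from chaining small, reusable steps like these over every staged row.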

A sweeter, more syrupy locale

Now that we knew we could migrate the site, and what we were migrating into, we needed somewhere for the site to live. We needed Infrastructure. For Drupal sites, I usually recommend using a Virtual Private Server provided by Linode or DigitalOcean. Unfortunately, both of those are American companies, and (at the time) only provided hosting in the US.

I had never looked into Canadian hosting. The landscape is...much smaller than the US. While there are many shared hosting companies, VPS hosts were few and often very expensive. After a lot of research we settled on one that had an affordable price for a reasonable amount of hardware and bandwidth. The experience taught me a lot, and left me daydreaming about starting my own hosting company further north with a heavy emphasis on ethical operation and user privacy. I still have that dream today, although without monetary and legal help, I doubt that'd be possible.

Once we had the server set up, we also needed somewhere to host the source code. The source code isn't the site, but it makes up the software that powers the site. While many people choose GitHub for hosting code, the fact that repositories needed to be public presented a problem. Instead, we went with GitLab. GitLab has free private repositories and built-in Continuous Integration. We set up two repositories -- one to configure the server itself, and one to host site code. Now, whenever we need to roll out a change, we only need to interact with the git repository, and not the underlying infrastructure itself. For a single site, this seems a bit overblown, but it's the same setup I use for my own site, so it was familiar and easy to configure.

The art is awful, but the site looks good

Most of the project so far has been things I've been familiar with. Most of the above is either backend development, or operations. That's usually what I do at work, so I was able to accomplish a lot of that quickly. What was left was to design and create a theme for the site. We needed a frontend.

A site theme is more than just making it look good. It had to also be functional. I'm not sure if workflow-based design is a thing or not in the web field, but that's what I like to call my method. Instead of just focusing on the appearance, the primary goal is to facilitate the kind of interactions we want, and to encourage follow-up interactions. For my own site, I realized it is more likely that people will land on the site via a shared link or a web search. Once on a page, the content should be the star of the show. All other design elements need to stay out of the way. Once you finish reading, only then is it a good idea to show links for related or popular content. Anything earlier is just a distraction.

With this in mind, I pulled out a black paper notebook and a bag of gel pens. There's something about working in bright colors on a dark background that makes it easier for me to focus and get down my thoughts. I didn't have a solid idea, but after a page or two a few things became clear.

Like my own site, EscherGirls will likely get most of its traffic not from the front page, but by arriving on a particular post from a social media post or a shared link. Unlike my site, there was an additional need to drive engagement such as re-sharing, submitting additional content, and supporting the site.

The resulting sketches became the basis for the site's design. Most of EG's 5000 posts have one or more images as their central focus. On smaller devices, the images are shown edge to edge to give maximum detail. On larger devices, the image is placed alongside the remaining post's content such as commentary and community comments. The pages are fully responsive and adapt even if you change the window size.

Instead of a header bar, EG has persistent, floating elements over the page. The menu icon and site logo appear over each image, with text content sliding out of the way where applicable. The menu is also "partially collapsing". By default, the menu is closed on all devices, but key icons for sharing, submitting, and finding more content remain on screen.

On paper, I loved this design, but now I had to get down to the work of making it happen.

Originally, I wanted to build the theme in Bootstrap, a modern frontend framework. The problem was that my understanding of CSS is spotty at best, and I would need to learn Bootstrap while also trying to solidify my understanding of some core CSS concepts. The project had gone on for several months already, as I was scraping time between work, freelancing, and writing for my own Patreon. There was a lot of pressure to complete the bare minimum needed to launch the site.

So...I cheated.

I threw Bootstrap aside and started with a copy of my own site's theme. The idea was that I already knew how the theme was structured, where all the interaction points were, so it would be faster than trying to start with a new Drupal "base theme" or Bootstrap. This actually worked rather well, and I borrowed ideas from the failed Bootstrap attempt to create a better theme for EG.

To recreate a comic book feel, I chose Bangers, a free and open web font. For the body text, I wanted something suitably bold and unapologetic. Usually I use Oswald for header text, but it works surprisingly well as body text here. It helps that many of EG's posts are short, so making the body text stand out more adds to the overall effect. The colors were also brighter than my usual choices. I almost went with primary colors, but to keep it from feeling like a plastic kid's toy, I desaturated the colors slightly to make them look like old newsprint.

While the partially collapsed menu was challenging, it posed fewer obstacles than I expected and came together surprisingly quickly over the course of a weekend.

The majority of the theme was finished. We decided to remove some features so we could get to launching the site more quickly. We spent what seemed to be another two weeks polishing and refining.

Then, it came time to launch it.

This part was actually rather anticlimactic. We had already set up the live server environment ahead of time. In fact, most of our internal polishing was done on the server that would become the new live site. Once we were ready, the only thing that was necessary was to update the domain's DNS entry. 15 minutes later, the new site was live. We were off Tumblr.

Thanks to our Continuous Integration system with GitLab, it was easy to roll out new features and fixes with a minimum of downtime. From my end, I only need to commit the code to the repository, then push my changes. The CI does all the rest of the work of building and updating the site, consistently, each time. A week after launch we rolled out new tag pages using CSS Grid -- another first for me! There are still several things we'd like to add -- especially Disqus integration -- that we have yet to deploy.

EG was a lot of work to put together, and like any project, I'm relying on an extensive open source library in order to make it happen. Even though it took several nights and weekends over the course of months to create, I'm really happy with the result. The site looks good, is more fluid than my own site, and I've learned a ton more about front end design and CSS.

This post was created with the support of my wonderful supporters on Patreon.

If you like this post, consider becoming a supporter at:

Thank you!!!