Premise
This is a tale about a blog on WordPress.com that had a loyal readership and regular, high quality content (so yeah, not writing about this blog). The owner wanted to use the odd plugin and advanced theme, and I was always bothered that WordPress was living off of, well I exaggerate wildly now, this person’s words by putting ads everywhere in addition to the massive annual fee.
Liberation
So with the lure of freedom on them yonder hills, we moved the blog off of WordPress.com onto a Windows VM on Azure (yeah, well… yeah…). Domain hosted on dnSimple, so the logistics of pointing the domain to Azure instead of WordPress and setting up verification TXT records and such was a doddle.
Hosted MySQL instance on Azure was easy enough, but the WordPress.COM theme we had been using was not available on WordPress.Org so we had to pick another one. Sadly we went with Customizr which really means vendor lock-in, as you do a bunch of customisations, hence the name, that are all out the window once you change themes.
Of course, there is no option but to run HTTPS today, and trying to pinch pennies we weren’t going to buy an EV cert from one of the remaining dodgy CAs out there, but iinstead we went with Let’s Encrypt using a tutorial posted by Scott Hanselman.
Selling out – but is anybody buying?
To make the big bucks we hooked the the site up to Google Analytics and ditto Adsense, and there were plugins to really automate that stuff. Yoast SEO beat out MonsterInsights on features for the analytics and integrates both with Search Console and Analytics. The killer feature for Yoast SEO is the customisable canonical URL which is useful if you reprint blog posts from another site and want to beg Google for mercy for the crime of duplicate content.
The actual ads, how do they work? Well by cunningly just clicking like an insane person (which really is the best way to learn), I managed to understand the concept of Auto Ads. This again is abstracted away by a plugin, in our case Advanced Ads. As the site owner didn’t want ads on all pages, we had to hack it by creating a plain text and code ad with the Auto Ad code from Google pasted in there and then the Advanced Ads thing deciding which pages to actually serve the ad code. The downside is a persistent nagging that you ‘shouldn’t display visible ads in headers’, but I guess that’s fine. They are just script tags, so there is nothing visible there..
Also, all the cool kids enter the Amazon Affiliate program, so we did that. They do have a minimum number of referrals you have to make as they don’t want to deal with tiny unprofitable sites, so I suspect we shall be unceremoniously booted out fairly soon, but the concept of having widgets where you choose your favourite books related to the subject of your blog and maybe in the long term share some revenue if people take you up on your recommendations seems fair. Shame that the widgets themselves are so immensely horribly broken and difficult to use. Allegedly, they are supposed to update when you make changes in the affiliate program site, but they really aren’t. I don’t get paid by Amazon so I shan’t debug their system, but it can really only be that the command that goes back to save settings isn’t picked up, or that they are unable to bust the cache and have old widgets served, but I strongly suspect it is the actual save that is broken, since the widget loses the data already in the wizard before you even enter the last page.
AMPed up
After a few hours I noticed that all the permalinks from the old site were broken on the new one, so I checked the Permalinks tab and it turned out there was a custom setting that I just set to default which made things work and there was much rejoicing. No audit log here so I can’t check, but if I made that change it must have been unintentional. My favourite hypothesis is that somehow the otherwise impressive WordPress XML-based import somehow failed to bring over the settings correctly.
As I rarely venture out into the front end I had not quite grasped what AMP is. I realised I was getting another load of 404s – his time for URL’s ending in /amp. I did a bit of googling and I realised I should probably get yet another plugin to handle this. Like with most WordPress plugins there are varying degrees of ambition and usually they want you to spend $200 in extras to get what you need, but although I brought the site off WordPress.com to deny them ad revenue for the site in question, I was under no illusion that I would be able to produce any such revenue to the owner as whatever $3 would be produced would definitely be eclipsed by the hosting cost.
By going with the default WordPress AMP plugin you can’t do ads, but it works – ish, by using the major competitor you get a functional site, but a completely different look compared to the non-AMP site, and we didn’t want that after all the effort we had already put in.
After reading some more, I realised that everybody was going off AMP anyway, for varying reasons, but that was all the peer pressure I needed, so I broke out the Azure debug console and edited web.config to put in a URL redirect from AMP URL to a normal one.
This was incredibly frustrating as first I forgot that .NET Regexes are different from normal regexes and also you have to not be stupid and use the correct match in the redirect expression ({R:0} is the whole source data, while {R:1} is the first match, which is what I needed).
<staticContent>
<remove fileExtension=".woff2" />
<mimeMap fileExtension=".woff2" mimeType="font/woff2" />
</staticContent>
<rewrite>
<rules>
<rule name="Disable AMP" stopProcessing="true">
<match url="^(.)amp\/?\r?$" />
<action type="Redirect"
url="https://<awesomesite>.com/{R:1}"
redirectType="Found" />
</rule>
<rule name="Redirect to naked" stopProcessing="true">
<match url="(.)" />
<conditions>
<add input="{HTTP_HOST}"
pattern="www.<awesomesite>.com" />
</conditions>
<action type="Redirect"
url="https://<awesomesite>.com/{R:0}"
/>
</rule>
<rule
name="WordPress: https://<awesomesite>.com"
patternSyntax="Wildcard">
<match url="*"/>
<conditions>
<add input="{REQUEST_FILENAME}"
matchType="IsFile" negate="true"/>
<add input="{REQUEST_FILENAME}"
matchType="IsDirectory" negate="true"/>
</conditions>
<action type="Rewrite" url="index.php"/>
</rule>
</rules>
</rewrite>
So there are a couple of things here – first a mime type correction to make IIS server web fonts, thne a redirect for AMP sites, then a redirect from www.awesomesite.com to awesomesite.com for prettiness, and also to canonicalise it to avoid duplicate records in the offices of Google, which they do not like. WordPress itself will force https if necessary, so all we need to do in this config file is to curb the use of www.
Summary
The actions we took to move the blog were the following:
- Set up the new site
- Create site
- Create blob storage
- Create redis cache (I did this later, but you might as well)
- Set up a database
- Export existing data from old blog
- Import data into new system
- Choose a theme
- Verify that old google links work on the new site (I didn’t do this fast enough)
- Verify that any way you try and call the site is redirected to a canonical represenation. Use a hosts file if you haven’t redirected the DNS yet, which with hindsight should have been the way I did it.
- Move the DNS to point to the new site
- Add the LetsEncrypt support to the site by following the guide. No more certificate errors
- Install plugins for analytics and ads.
- Create a Google account
- Register with Google Search Console
- Register with Bing search console (for those two or three people that don’t know Google.
- Register with Google Analytics
- Register with Google AdSense
Conclusion
So this was very easy and horribly frustrating at once. DnSimple and provisioning resources was a doddle. Following the internet guide to set up Let’s Encrypt and HTTPS was super straightforward, but then WordPress plug-in management, PHP and Amazon widgets were shit shows to be honest. I mean I realise Amazon has a complex architecture and their systems are never 100% up or 100% down and so on, but a save button being completely broken doesn’t feel even slightly “up” from the point of view of the end user.
PHP is garbage and brittle and you are hard-pressed to build anything viable on top of it (but obviously some have succeeded). These plugin smiths aren’t Facebook though. They would correctly interject that I am running WordPress on the least suitable platform imaginable. That is true (it has to do with how the Azure VM instances are set up, on the fact that they run on Windows and most importantly NTFS which has performance characteristics that are completely unsuitable for Unix style applications and favour a small number of large files where EXT4 favours large amounts of small files), but if the Powers that Be really consider Windows and NTFS to be such tremendous deal-breakers, then they should simply not allow Microsoft to host WordPress on Windows at all. As it stands, it does WP no favours with 1 minute turnaround to save settings for a plugin and similar. Then again, I also live in the UK which notoriously has a Internet infrastructure dating back to the Victorian Era, so it’s hard to tell what’s actually the worst culprit, but the sidecar web app that hosts the debug console for the blog is a lot snappier than WordPress, and that is hosted in the same IIS intallation as the WordPress site, although not in the same app pool.