How I Host this Static Ghost Blog on GitHub Pages with wget
Lately I've been wanting to go back to writing more on my personal blog, so I'm currently in the process of gathering articles, slides, videos and code samples I wrote on various platforms.
Initially I wanted to write articles and pages in Markdown, generate some static blog pages from them, push them to a repository, and host them on GitHub Pages. I tried several approaches, including building my blog on top of the great Stencil site generator. But then I decided that I didn't want to deal too much with templates and design, or I would spend more time coding and I'd never get to writing 😙
Coming from Medium, I liked Ghost's design and simplicity the most. Maintaining another Node.js server with NGINX and MySQL, plus backups and so on, not so much 😅. I found some middle ground: installing Ghost locally (with the database in a SQLite file I can commit to a repo) and generating a static site with all the articles and pages.
I found many how-tos using Buster, a Python tool that automates the process of grabbing every page and generating the static site. The problem is that I soon noticed the tool is no longer maintained, and many things have changed in Ghost since it was last updated, so the result is essentially broken. I spent the rest of the day trying the various forks with mixed results: each one fixed some issues but not others. Either links were still pointing to localhost or pages were missing.
Then I remembered that the fantastic wget command has some backup and mirroring features, which Buster was also using under the hood. So I wrote a very simple bash script to automate the static generation process.
Here are the steps I followed to set up this blog. If you have a local Ghost installation, you can skip ahead to the script.
Installing Ghost locally
To install Ghost locally you will need the following:
- A supported version of Node.js
- npm or yarn to manage packages
- A clean, empty directory on your machine
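If you're not sure whether the first two are already in place, a quick check from the terminal will tell you (the exact version numbers you see will vary):
node --version
npm --version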
Ghost has a nice CLI tool we can install globally to handle multiple blogs:
npm install -g ghost-cli
Then we need to create the empty directory where we want to store the blog, cd into it and run the local install command:
ghost install local
After that, the blog will be installed in this directory and served at http://localhost:2368 if you don't have previous Ghost installations (additional local installs get a different port). The admin is accessible from http://localhost:2368/ghost.
The database file is located at content/data
, in case you want to keep a backup in a private repo.
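The Ghost CLI also manages the local process for you, so from the blog directory you can stop, restart or list your local installs at any time:
ghost stop
ghost start
ghost ls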
Generating the static site
Once we have set up our Ghost blog, edited our user, created some articles and pages or installed a different theme, let's generate the static site so it can be served by GitHub Pages (or Firebase, Netlify, etc).
First, if you're using macOS like me, let's install wget via Homebrew (it's like the apt-get of macOS). If you use Linux, you should already have it installed.
brew install wget
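If for some reason it's missing on a Debian or Ubuntu machine, the equivalent would be:
sudo apt-get install wget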
And now for the final part, the bash script. I saved it in the root directory of the blog and called it generate.sh:
#!/bin/bash
# Copy blog content
wget --recursive --no-host-directories --directory-prefix=static --adjust-extension --timeout=30 --no-parent --convert-links http://localhost:2368/
# Copy 404 page
wget --no-host-directories --directory-prefix=static --adjust-extension --timeout=30 --no-parent --convert-links --content-on-error --timestamping http://localhost:2368/404.html
# Copy sitemaps
wget --recursive --no-host-directories --directory-prefix=static --adjust-extension --timeout=30 --no-parent --convert-links http://localhost:2368/sitemap.xsl
wget --recursive --no-host-directories --directory-prefix=static --adjust-extension --timeout=30 --no-parent --convert-links http://localhost:2368/sitemap.xml
wget --recursive --no-host-directories --directory-prefix=static --adjust-extension --timeout=30 --no-parent --convert-links http://localhost:2368/sitemap-pages.xml
wget --recursive --no-host-directories --directory-prefix=static --adjust-extension --timeout=30 --no-parent --convert-links http://localhost:2368/sitemap-posts.xml
wget --recursive --no-host-directories --directory-prefix=static --adjust-extension --timeout=30 --no-parent --convert-links http://localhost:2368/sitemap-authors.xml
wget --recursive --no-host-directories --directory-prefix=static --adjust-extension --timeout=30 --no-parent --convert-links http://localhost:2368/sitemap-tags.xml
# Replace localhost with domain
LC_ALL=C find ./static -type f -not -wholename "*.git*" -exec sed -i '' -e 's/http:\/\/localhost:2368/https:\/\/fdezromero.com/g' {} +
LC_ALL=C find ./static -type f -not -wholename "*.git*" -exec sed -i '' -e 's/localhost:2368/fdezromero.com/g' {} +
LC_ALL=C find ./static -type f -not -wholename "*.git*" -exec sed -i '' -e 's/http:\/\/www.gravatar.com/https:\/\/www.gravatar.com/g' {} +
# Set up Github Pages CNAME
echo "fdezromero.com" > static/CNAME
In order to adapt it to your blog, replace all the occurrences of localhost:2368 with your own port if it's different, and fdezromero.com with your own domain or subdomain. You can also use the subdomain provided by GitHub Pages, like username.github.io.
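If you prefer to do that replacement in one go, a quick one-liner (using the same macOS-style sed as the script, and assuming example.com is your domain) should do it:
sed -i '' -e 's/fdezromero.com/example.com/g' generate.sh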
Notice that all the script does is crawl every link starting from the home page and convert them to relative links where possible, then do the same with the sitemaps for better SEO. At the end, it replaces any broken links still pointing to the local installation and creates the CNAME file needed by GitHub Pages to use your own custom domain.
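Once the script has run, a quick sanity check is to grep the generated directory for anything still pointing at the local install; it should come back empty:
grep -rl "localhost:2368" static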
Once we have the script configured, we just need to give it execution permissions and run it:
chmod +x generate.sh
./generate.sh
If the blog is up, the script will scrape its contents and you'll have a static copy of your blog in the static
directory. From here, you can initialize a git repo and push the site to GitHub:
cd static
git init
git add .
git commit -m "Initial static site"
git remote add origin https://github.com/user/repo.git
git push -u origin master
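If you're publishing from a project repository instead of a username.github.io one, GitHub Pages may expect the site on a gh-pages branch; in that case, pushing the local master branch there should work:
git push origin master:gh-pages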
And finish configuring your custom domain or subdomain to point to your blog with this guide.