Alex Pearwin

Simpler Jekyll searching

This post has been archived. It's pretty old and likely concerns topics that are avoidable by using more modern tools and techniques. The original text is preserved below for posterity but it may no longer be relevant or correct.

All the way back in 2012, I wrote about how to create a Jekyll search page using a combination of a JSON page index and some JavaScript. I recently rewrote large parts of my blog when upgrading to Jekyll 3, and implemented a lazier, hackier, but simpler way of achieving the same effect. This post guides you through how it’s done.

What we want

You can label Jekyll posts with whatever metadata you want using YAML front matter, which looks like this for the post you’re looking at:

---
layout: post
title: Simpler Jekyll searching
category: Tips
tags: [Jekyll, JavaScript]
description: How to create a searchable post index on a Jekyll site
---

The interesting things here are the category value and the tags list. If we’re going to label our posts like this, it would be nice to have an index for each category and tag: a page that lists all the posts sharing that category or tag.

The old way

In my old post, the general idea was this:

  1. Create a near-empty search.html page;
  2. Generate a list of all posts, along with their category and tags, as a JSON file;
  3. When a user hits the search page with a URL like search.html?tag=foobar, some JavaScript searches the JSON file for matching posts, and injects them into the page.

This worked, but it required quite a bit of JavaScript to go from the JSON to a formatted HTML list. It also required an additional HTTP request to asynchronously retrieve the JSON file.
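For comparison, the filter-and-render step of the old approach might be sketched roughly as below. This is not the original code: the shape of the JSON index (title, url, category, tags) and the names examplePosts, findPosts, escapeHtml and renderList are all assumptions made for illustration.

```javascript
// Hypothetical JSON index: one object per post (the shape is assumed).
const examplePosts = [
  { title: 'First post', url: '/first/', category: 'Tips', tags: ['Jekyll', 'JavaScript'] },
  { title: 'Second <post>', url: '/second/', category: 'Notes', tags: ['Jekyll'] },
];

// Return the posts whose tag list or category matches the search value.
// `key` is either 'tag' or 'category', mirroring the GET parameter name.
function findPosts(posts, key, value) {
  return posts.filter((post) =>
    key === 'tag' ? post.tags.includes(value) : post.category === value
  );
}

// Escape the characters that are unsafe inside HTML text and attributes.
function escapeHtml(text) {
  return text
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;');
}

// Turn the matching posts into the markup that gets injected into the page.
function renderList(posts) {
  const items = posts.map(
    (post) => `<li><a href="${escapeHtml(post.url)}">${escapeHtml(post.title)}</a></li>`
  );
  return `<ul>${items.join('')}</ul>`;
}
```

In a browser, the index would first have to be fetched with a separate request, and the string returned by renderList would then be assigned to some container element. That extra request and the hand-built markup are the overhead described above.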

I came up with something arguably less elegant, but easier to understand.

The new way

Now, instead of generating a list of posts as a JSON file, I just generate one list per category and tag in the search.html page directly. This means looping over each category, creating a header, and then an unordered list of post titles and links.

<div class="category-index">
  {% for category in site.categories %}
    {% assign nposts = category | last | size %}
    <div class="collection" data-name="{{ category | first | escape }}">
      <h1>{{ category | first }}</h1>
      <h2>{{ nposts }} post{% if nposts != 1 %}s{% endif %}</h2>
      <ul>
        {% for posts in category %}
          {% for post in posts %}
            {% if post.title %}
              <li><a href="{{ post.url }}">{{ post.title }}</a></li>
            {% endif %}
          {% endfor %}
        {% endfor %}
      </ul>
    </div>
  {% endfor %}
</div>

This snippet is for the categories. The loop over the tags follows the same logic, iterating over site.tags instead of site.categories.

After this, search.html contains one list per category and tag, and so all we need to do in the JavaScript is to hide all the lists that don’t match the label the user is searching for. The logic I went for goes like this:

  1. The user searches for something, visiting /search.html?tag=foo for example;
  2. Some JavaScript parses the GET parameters for either a tag or a category parameter, storing the value;
  3. The list of .collection containers is searched, and a container is hidden if its data-name attribute doesn’t match the value of the search parameter.
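The three steps above can be sketched in a few lines of plain JavaScript. This is a minimal illustration rather than the code I actually use: the function names getSearchLabel and hideNonMatching are made up for this example, and it assumes the containers can be hidden with the standard hidden property.

```javascript
// Sketch only: the function names are hypothetical.

// Return the value of the `tag` or `category` GET parameter, or null
// if neither is present in the query string.
function getSearchLabel(queryString) {
  const params = new URLSearchParams(queryString);
  return params.get('tag') || params.get('category');
}

// Hide every container whose data-name attribute does not match the label,
// and make sure the matching ones are visible.
function hideNonMatching(containers, label) {
  for (const container of containers) {
    container.hidden = container.getAttribute('data-name') !== label;
  }
}

// Only touch the page when running in a browser.
if (typeof document !== 'undefined') {
  const label = getSearchLabel(window.location.search);
  if (label) {
    hideNonMatching(document.querySelectorAll('.collection'), label);
  }
}
```

Note that the comparison is an exact, case-sensitive match against data-name, so a link to the search page needs to use the label exactly as it is written in the front matter.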

There are many ways you could write the JavaScript logic, but you can view what I ended up with on GitHub.

Wrap up

The new JavaScript is less than half the length of the old, and no longer depends on jQuery. Hooray for faster page loads!

As I said, I think the new implementation is more of a hack than the old. But I’m OK with that; sometimes it’s worth sacrificing a ‘pure’ implementation for something that’s easier to understand, as it encourages maintainability in the future (who wants to touch that magic one-liner?).

One improvement that could be made is to hide all the containers by default, in the CSS, and show matching containers, rather than hiding ones that don’t match. This would avoid the flash of ‘unstyled’ content where all containers are briefly shown before the JavaScript kicks in, but it would mean that everything is hidden when a user has JavaScript disabled. Which path to take is up to you.
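A rough sketch of that show-matching variant follows. It assumes a stylesheet that sets .collection to display: none and restores it for a hypothetical is-match class; the class name and the showMatching function are inventions for this example, not part of the site as it stands.

```javascript
// Sketch only: `showMatching` and the `is-match` class are hypothetical.

// Mark containers whose data-name matches the label so the stylesheet can
// reveal them; everything else stays hidden by the default CSS rule.
// Returns the number of containers that were revealed.
function showMatching(containers, label) {
  let shown = 0;
  for (const container of containers) {
    const isMatch = container.getAttribute('data-name') === label;
    container.classList.toggle('is-match', isMatch);
    if (isMatch) {
      shown += 1;
    }
  }
  return shown;
}

// Only touch the page when running in a browser.
if (typeof document !== 'undefined') {
  const params = new URLSearchParams(window.location.search);
  const label = params.get('tag') || params.get('category');
  if (label) {
    showMatching(document.querySelectorAll('.collection'), label);
  }
}
```

The return value makes it possible to show a "no posts found" message when nothing matches, which is one way to soften the all-hidden default.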

It should be easy to adapt the system I use for your own site. I’d love to hear if anyone gives it a go!