Shiki syntax highlighting (with Markdoc) on Cloudflare Workers

Shiki and Markdoc service on Cloudflare Workers for powerful documentation or blogging, on the edge

GitHub repos:
service worker
client app

Shiki is a great choice for code syntax highlighting thanks to its high accuracy, its support for many languages and themes, and its customizability.

Our goal is to run it as an on-demand service on Cloudflare Workers, called by a React Router client app that runs on a separate Worker and is connected through service bindings. If we want to call the service worker from a consumer outside the Cloudflare Workers ecosystem, we can do so with the standard Fetch API.
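
As a rough sketch of the client side, the call through a service binding could look something like the following. The binding name HIGHLIGHTER, the /render path, the placeholder hostname and the plain-text request body are all assumptions made for illustration, not details taken from the repos. The small ServiceBinding interface stands in for the Fetcher type that Cloudflare's type definitions provide.

```typescript
// Minimal structural type for a service binding. In a real Worker this would
// be the Fetcher type from Cloudflare's type definitions.
interface ServiceBinding {
  fetch(input: Request): Promise<Response>;
}

// Hypothetical environment: the binding name HIGHLIGHTER is an assumption.
interface Env {
  HIGHLIGHTER: ServiceBinding;
}

// Send Markdoc source to the service worker and return the response body.
export async function renderViaService(env: Env, source: string): Promise<string> {
  // Placeholder URL: the binding decides which Worker receives the request,
  // but the Request constructor still needs a valid absolute URL.
  let request = new Request("https://highlighter.internal/render", {
    method: "POST",
    headers: { "Content-Type": "text/plain" },
    body: source,
  });
  let response = await env.HIGHLIGHTER.fetch(request);
  if (!response.ok) {
    throw new Error(`Highlighter service responded with ${response.status}`);
  }
  return response.text();
}
```

In a React Router loader on the client Worker, a function like this would be called with the Worker's env. The binding itself is declared under services in the client Worker's Wrangler configuration.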

This sounds good, but deploying an app that imports Shiki the default way fails on Cloudflare with an error along the lines of: Uncaught CompileError: WebAssembly.instantiate(): Wasm code generation disallowed by embedder

The Shiki docs give a solution that still uses the original Oniguruma WASM engine: import only the core dependencies and the bare minimum of languages and themes required for the project.

We can go further by trying Shiki's new JavaScript Engine.

src/shiki/index.ts
// import from shiki/core
import { createHighlighterCore } from "shiki/core";
import { createJavaScriptRegexEngine } from "shiki/engine/javascript";
// WASM alternative:
// import { createOnigurumaEngine } from "shiki/engine/oniguruma";
// import only the languages and themes we need
import bash from "shiki/langs/bash.mjs";
import tsx from "shiki/langs/tsx.mjs";
import ts from "shiki/langs/typescript.mjs";
import tokyoNight from "shiki/themes/tokyo-night.mjs";

// WASM alternative, the original Oniguruma engine
// (the dynamic import needs a @ts-expect-error comment on the line above it):
// let engine = await createOnigurumaEngine(import("shiki/onig.wasm"));

// JavaScript regex engine, no WASM required
let engine = createJavaScriptRegexEngine();

export let highlighter = await createHighlighterCore({
  // pass exactly one engine; an object literal cannot repeat the `engine` key
  engine,
  langs: [bash, ts, tsx],
  themes: [tokyoNight],
});

By swapping only the regex engine while everything else remains constant, total bundle size is reduced by 23% (39% gzipped). Here are the results from wrangler deploy --dry-run:

  • WASM: Total Upload: 1481.81 KiB / gzip: 319.77 KiB
  • jsEngine: Total Upload: 1144.72 KiB / gzip: 194.18 KiB
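
For anyone wanting to check the percentages, the arithmetic can be reproduced with a few lines. The sizes are copied from the dry-run output above; the helper name percentReduction is just an illustrative choice.

```typescript
// Sizes in KiB, copied from the wrangler dry-run output
let wasmBuild = { total: 1481.81, gzip: 319.77 };
let jsEngineBuild = { total: 1144.72, gzip: 194.18 };

// Relative reduction from `before` to `after`, as a percentage
function percentReduction(before: number, after: number): number {
  return ((before - after) / before) * 100;
}

console.log(percentReduction(wasmBuild.total, jsEngineBuild.total).toFixed(1)); // "22.7"
console.log(percentReduction(wasmBuild.gzip, jsEngineBuild.gzip).toFixed(1)); // "39.3"
```

Rounded to whole numbers, these give the 23% and 39% figures quoted above.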

Unsurprisingly, a smaller bundle can also be expected to shorten Worker startup time.

Why Markdoc?

In some scenarios we may just want to get back the pure, freshly highlighted HTML from our service, without the addition of something else such as Markdoc.
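
A minimal sketch of such a highlight-only endpoint might look like this. The JSON request shape ({ code, lang }) and the function name handleHighlight are assumptions for illustration, not the service's actual API. The HtmlHighlighter interface is a structural stand-in so that the sketch does not depend on Shiki; in the service we would pass in the highlighter exported from src/shiki.

```typescript
// Structural stand-in for the Shiki highlighter, limited to the one method
// this sketch needs.
interface HtmlHighlighter {
  codeToHtml(code: string, options: { lang: string; theme: string }): string;
}

// Hypothetical request body: JSON with string fields `code` and `lang`.
type HighlightPayload = { code?: unknown; lang?: unknown } | null;

export async function handleHighlight(
  request: Request,
  highlighter: HtmlHighlighter,
): Promise<Response> {
  if (request.method !== "POST") {
    return new Response("Method Not Allowed", { status: 405 });
  }
  let payload: HighlightPayload;
  try {
    payload = (await request.json()) as HighlightPayload;
  } catch {
    return new Response("Invalid JSON", { status: 400 });
  }
  let code = payload?.code;
  let lang = payload?.lang;
  if (typeof code !== "string" || typeof lang !== "string") {
    return new Response("Expected string fields `code` and `lang`", { status: 400 });
  }
  // Only languages and themes loaded into the highlighter can be used here.
  let html = highlighter.codeToHtml(code, { lang, theme: "tokyo-night" });
  return new Response(html, {
    headers: { "Content-Type": "text/html; charset=utf-8" },
  });
}
```

Taking the highlighter as a parameter keeps the handler easy to exercise with a stub, and the Worker's fetch handler can delegate to it directly.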

For writing documentation or posting articles, however, it's usually more convenient to write the content in Markdown, with the code placed inside code fences. MDX is a popular choice because it allows React components to be embedded in the Markdown to provide interactivity.

Markdoc was created by Stripe to power their documentation. It takes a different approach and in doing so brings some workflow and performance optimizations over MDX to the party.

Our Shiki syntax highlighter can be configured to integrate with Markdoc something like this:

src/markdoc/markdoc.config.ts
import Markdoc, { type Config, type Node } from "@markdoc/markdoc";
import {
  transformerNotationDiff,
  transformerNotationHighlight,
} from "@shikijs/transformers";
import { highlighter } from "../shiki";

export let config: Config = {
  nodes: {
    document: {
      // render markdoc document into html section
      render: "section",
    },
    fence: {
      render: "Codeblock",
      transform(node: Node) {
        let { content, language } = node.attributes;
        let html = highlighter.codeToHtml(content, {
          colorReplacements: {
            // improve accessibility (color contrast ratio)
            "#51597d": "#8a98d5",
            "#9d7cd8": "#a789dc",
          },
          lang: language,
          theme: "tokyo-night",
          transformers: [
            transformerNotationDiff(),
            transformerNotationHighlight(),
          ],
        });
        return new Markdoc.Tag("Codeblock", {
          ...node.attributes,
          innerHtml: html,
        });
      },
    },
  },
  // nothing special going on here for this example
  tags: {
    aside: {
      attributes: {
        children: {},
        type: {},
      },
      render: "Aside",
    },
  },
};

Since we're serving our content from Cloudflare Workers, we can improve performance by storing Markdoc documents as plain text strings in an edge database such as D1 or Turso in order to benefit from colocation of database and server.
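
As a sketch of what the lookup could look like with D1, assuming a hypothetical documents table with slug and content columns (the schema and the function name loadDocument are illustrative assumptions): the two interfaces below mirror only the parts of the D1 client API the sketch uses (prepare, bind, first), so it stays self-contained. In a real Worker the D1Database type from Cloudflare's type definitions would take their place.

```typescript
// Minimal structural types covering the D1 methods used below.
interface PreparedStatement {
  bind(...values: unknown[]): PreparedStatement;
  first<T>(): Promise<T | null>;
}
interface Database {
  prepare(query: string): PreparedStatement;
}

// Fetch the raw Markdoc source for a slug, or null if no row matches.
// Hypothetical schema: documents(slug TEXT PRIMARY KEY, content TEXT).
export async function loadDocument(
  db: Database,
  slug: string,
): Promise<string | null> {
  let row = await db
    .prepare("SELECT content FROM documents WHERE slug = ?")
    .bind(slug)
    .first<{ content: string }>();
  return row ? row.content : null;
}
```

The returned string can then be parsed and transformed with the Markdoc config shown earlier before the result is sent back to the client.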