Gnuxie & Draupnir: late summer 2024.

Introduction

Draupnir development has been building momentum and is now full steam ahead. If you didn't hear, Draupnir is now being supported with a grant from NLnet.

https://marewolf.me/posts/draupnir/24-nlnet-goals.html

We've now cleared the first two goals of the grant, the command system library and refactor.

The library lives in a repository on the Draupnir GitHub organisation: @the-draupnir-project/interface-manager. I don't think the repository on its own is particularly helpful to anyone getting started writing TypeScript bots, so I plan to spend some time soon creating a simple bot that uses the interface-manager to act as a project template and example.

We've had two releases for Draupnir since the previous update, beta.5 and beta.6.

In this update I'm going to be talking about development on Draupnir's command system and its history, and how it became what we now call the interface-manager1.

Please click on and read the footnotes if you need extra context or something is confusing. They're there for a reason.

Draupnir Commands

How commands work in Mjolnir

First we're going to look at how commands work in Mjolnir. Mjolnir is the older sister of Draupnir, and as Draupnir started as a fork of Mjolnir, Mjolnir will provide us with both the beginning and pre-history of Draupnir's command system.

The command handler2 in Mjolnir is simplistic. When Mjolnir receives a message in the management room, Mjolnir will use split(" ")3 to tokenize a command into an array of words. For a ban command, this would look something like ["!mjolnir", "ban", "coc", "@foo:example.com", "spam"]4.

The command handler then dispatches from ["!mjolnir", "ban"] to find a handle for the command.
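
To make the tokenize-then-dispatch flow concrete, here is a minimal sketch. It is not Mjolnir's code: the CommandHandle type, the handles table, and the dispatch function are all hypothetical names invented for illustration.

```typescript
// Hypothetical handle type: each handle receives the full token array.
type CommandHandle = (parts: string[]) => string;

// Hypothetical dispatch table keyed by the command name (the second token).
const handles = new Map<string, CommandHandle>([
  ["ban", (parts) => `ban handle received ${parts.length - 2} argument(s)`],
  ["status", () => "status handle received no arguments"],
]);

function dispatch(message: string): string {
  // Naive tokenization: every single space is treated as a separator.
  const parts = message.split(" ");
  if (parts[0] !== "!mjolnir") {
    return "not a command";
  }
  const handle = handles.get(parts[1] ?? "");
  if (handle === undefined) {
    return "unknown command";
  }
  // Each handle is left to make sense of the raw tokens on its own.
  return handle(parts);
}

console.log(dispatch("!mjolnir ban coc @foo:example.com spam"));
// Two consecutive spaces produce an empty-string token, which is one
// reason this style of tokenizing is fragile.
console.log("!mjolnir ban  coc".split(" "));
```

Nothing in the dispatch step knows what a user ID or a list shortcode is, so every handle has to rediscover that from the strings.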

Each individual command handle then has to process and parse this argument array on its own. This is done using fragile ad-hoc parsing code that attempts to parse each argument into some type relevant to the command.

let argumentIndex = 2; // where the arguments start in the arguments array
let ruleType: string | null = null;
let entity: string | null = null;
let list: PolicyList | null = null;
let force = false;
while (argumentIndex < 7 && argumentIndex < parts.length) {
    const arg = parts[argumentIndex++];
    if (!arg) break;
    if (["user", "room", "server"].includes(arg.toLowerCase())) {
        // set the rule type from the explicit argument for the rule type...
    } else if (!entity && (arg[0] === '@' || arg[0] === '!' || arg[0] === '#' || arg.includes("*"))) {
        entity = arg;
        // infer the rule type from the entity argument
    } else if (!list) {
        // set the list or infer it from the default....
    }
    if (entity) break;
}

if (parts[parts.length - 1] === "--force") {
    // set the force flag...
}

if (!entity) {
    // more code to figure out where the entity is in this arguments array...
}

if (!list) {
    list = mjolnir.policyListManager.lists.find(b => b.listShortcode.toLowerCase() === defaultShortcode) || null;
}

The above is an edited example from Mjolnir's ban command; you should really look at the real thing if you want to see the other glue5.

This isn't necessarily bad on its own; this parsing code is OK if it is covered by unit tests. However, it is a problem if we need to write this code, along with unit tests to make sure that it works correctly, every time we create a command.

It's also not a great foundation for fostering an ecosystem of protections or bots, and it sets a high bar for creating any sort of command6.

So, it is clear that we need to use a command parser that is aware of the different types that Draupnir uses, such as references to rooms in the form of raw room identifiers or matrix.to urls.

My first attempt at refactoring this just separated out the parsing step from the command execution step, since in Mjolnir both steps are currently baked into one handle7.

But what was really needed was a presentation interface.

The presentation interface

I wanted Draupnir to have a presentation interface8. Presentation interfaces are command-oriented interfaces that associate semantics with input and output9. That doesn't mean anything yet, but basically in Draupnir we associate a type with each command argument when it is read by our command parser. So a room identifier would become associated with a MatrixRoomIDPresentationType.

interface PresentationType<ObjectType = unknown> {
  name: string;
  validator: (value: unknown) => value is ObjectType;
  wrap: (object: unknown extends ObjectType ? never : ObjectType) => Presentation<ObjectType>;
}

interface Presentation<ObjectType = unknown> {
  object: ObjectType;
  presentationType: PresentationType<ObjectType>;
}
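
As a rough illustration of how these two interfaces fit together, the sketch below repeats them and defines two presentation types by hand. ExampleRoomIDPresentationType, ExampleStringPresentationType, and readToken are hypothetical names made up for this example, and the room ID check is a simplification, so none of this should be read as the interface-manager's own definitions.

```typescript
interface PresentationType<ObjectType = unknown> {
  name: string;
  validator: (value: unknown) => value is ObjectType;
  wrap: (object: unknown extends ObjectType ? never : ObjectType) => Presentation<ObjectType>;
}

interface Presentation<ObjectType = unknown> {
  object: ObjectType;
  presentationType: PresentationType<ObjectType>;
}

// Hypothetical presentation type for raw room IDs such as "!abc:example.com".
const ExampleRoomIDPresentationType: PresentationType<string> = {
  name: "ExampleRoomID",
  validator: (value: unknown): value is string =>
    typeof value === "string" && /^![^:]+:.+$/.test(value),
  wrap(object) {
    return { object, presentationType: ExampleRoomIDPresentationType };
  },
};

// Hypothetical fallback that accepts any string.
const ExampleStringPresentationType: PresentationType<string> = {
  name: "ExampleString",
  validator: (value: unknown): value is string => typeof value === "string",
  wrap(object) {
    return { object, presentationType: ExampleStringPresentationType };
  },
};

// Associate a token with the first presentation type whose validator accepts it.
function readToken(token: string): Presentation<string> {
  const candidates = [ExampleRoomIDPresentationType, ExampleStringPresentationType];
  for (const candidate of candidates) {
    if (candidate.validator(token)) {
      return candidate.wrap(token);
    }
  }
  throw new TypeError(`No presentation type accepts: ${token}`);
}

console.log(readToken("!abc:example.com").presentationType.name);
console.log(readToken("spam").presentationType.name);
```

Once a token carries its presentation type, later stages can ask what the value means rather than guessing from its first character.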

When defining Draupnir commands, we can then describe the presentation types (or presentation schema) for each parameter and the command parser will automatically match arguments up to each parameter for us10.

{
  parameters: tuple(
    {
      name: "entity",
      description:
        "The entity to ban. This can be a user ID, room ID, or server name.",
      acceptor: union(
        MatrixUserIDPresentationType,
        MatrixRoomReferencePresentationSchema,
        StringPresentationType
      ),
    },
    {
      name: "list",
      acceptor: union(
        MatrixRoomReferencePresentationSchema,
        StringPresentationType
      ),
      async prompt(draupnir: Draupnir) {
        return Ok({
          suggestions: draupnir.policyRoomManager
            .getEditablePolicyRoomIDs(
              draupnir.clientUserID,
              PolicyRuleType.User
            )
            .map((room) => MatrixRoomIDPresentationType.wrap(room)),
        });
      },
    }
  ),
  async executor(draupnir: Draupnir, entity, list) {
    // The body of the ban command...
  },
}

Because we use the presentation types and schema directly, TypeScript will automatically infer the types of entity and list in the command executor for us. This is really powerful: the presentation schema describes exactly which types are valid for each executor argument when it is parsed, and the type information is inferred from the same schema, so there isn't a way for the programmer to specify mismatched schema and types. The schema and types are always consistent, and if we did override the type annotation for entity or list with an inaccurate type, TypeScript would even show us an error.
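
The inference can be sketched with a mapped tuple type. Everything below (SimplePresentationType, ObjectTypeOf, ExecutorArguments, SimpleCommand, describeSimpleCommand) is a deliberately simplified, hypothetical stand-in written for this post, not the interface-manager's real types, and it leaves out unions, keywords, and the Draupnir context.

```typescript
// A deliberately simplified presentation type: only a name and a validator.
interface SimplePresentationType<ObjectType> {
  name: string;
  validator: (value: unknown) => value is ObjectType;
}

// Extract the object type that a presentation type describes.
type ObjectTypeOf<Type> =
  Type extends SimplePresentationType<infer ObjectType> ? ObjectType : never;

// Map a tuple of presentation types to the tuple of executor argument types.
type ExecutorArguments<Params extends SimplePresentationType<any>[]> = {
  [Index in keyof Params]: ObjectTypeOf<Params[Index]>;
};

interface SimpleCommand<Params extends SimplePresentationType<any>[], Result> {
  parameters: [...Params];
  executor: (...args: ExecutorArguments<Params>) => Result;
}

// Identity function whose only job is to let TypeScript infer `Params`.
function describeSimpleCommand<Params extends SimplePresentationType<any>[], Result>(
  command: SimpleCommand<Params, Result>
): SimpleCommand<Params, Result> {
  return command;
}

const SimpleString: SimplePresentationType<string> = {
  name: "string",
  validator: (value: unknown): value is string => typeof value === "string",
};

const SimpleNumber: SimplePresentationType<number> = {
  name: "number",
  validator: (value: unknown): value is number => typeof value === "number",
};

const repeatCommand = describeSimpleCommand({
  parameters: [SimpleString, SimpleNumber],
  // No annotations here: `text` is inferred as string and `times` as number.
  executor: (text, times) => text.repeat(times),
});

console.log(repeatCommand.executor("ab", 3));
```

The parameter list is the single source of truth here, so the executor's argument types cannot drift away from it.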

You may have also noticed the prompt method in the parameter description for list. Essentially if the command parser notices that there is a missing argument for list, the parser will return an error and provide the interface the option to prompt the user just for that argument specifically, rather than asking them to write out the entire command again. This even includes some helpful suggestions for what the argument should be, which in this case is one of the policy rooms that Draupnir is watching.
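
A minimal sketch of that prompting behaviour follows. ParameterSketch, MatchOutcome, and matchArguments are hypothetical names, and the prompt is synchronous here for brevity, whereas the prompt in the ban command above is async and returns a Result.

```typescript
// Hypothetical shapes for illustration; not the interface-manager's real types.
interface ParameterSketch {
  name: string;
  // Optional source of suggestions, consulted only when the argument is missing.
  prompt?: () => string[];
}

type MatchOutcome =
  | { kind: "ok"; args: string[] }
  | { kind: "prompt"; parameterName: string; suggestions: string[] }
  | { kind: "error"; message: string };

function matchArguments(parameters: ParameterSketch[], tokens: string[]): MatchOutcome {
  // Find the first parameter that has no corresponding token.
  const missing = parameters.find((_parameter, index) => tokens[index] === undefined);
  if (missing === undefined) {
    return { kind: "ok", args: tokens.slice(0, parameters.length) };
  }
  if (missing.prompt === undefined) {
    return { kind: "error", message: `Missing argument: ${missing.name}` };
  }
  // Rather than failing outright, ask only for the argument that is missing.
  return { kind: "prompt", parameterName: missing.name, suggestions: missing.prompt() };
}

const banParametersSketch: ParameterSketch[] = [
  { name: "entity" },
  { name: "list", prompt: () => ["!policy-a:example.com", "!policy-b:example.com"] },
];

console.log(matchArguments(banParametersSketch, ["@spam:example.com"]));
```

The interface that receives the "prompt" outcome is then free to decide how to ask the user, for example by listing the suggestions in a follow-up message.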

You can find what Draupnir's ban command looks like as of writing in this link.

Rendering in Mjolnir

Another major issue with Mjolnir's command system was how commands are rendered. Matrix events have two distinct formats, plain text and org.matrix.custom.html11. Because not all Matrix clients include an HTML renderer, Mjolnir would have to render both text and HTML versions of a message at the same time, duplicating the work.

This was done ad hoc in Mjolnir's code base, and as you can see, escaping HTML from content needs to be handled explicitly, which is risky. The example below shows how this code would work, although you should take a look at the whole file to get a good picture.

const renderPolicyLists = (header: string, lists: PolicyList[]) => {
    html += `<b>${header}:</b><br><ul>`;
    text += `${header}:\n`;
    for (const list of lists) {
        const ruleInfo = `rules: ${list.serverRules.length} servers, ${list.userRules.length} users, ${list.roomRules.length} rooms`;
        html += `<li>${htmlEscape(list.listShortcode)} @ <a href="${list.roomRef}">${list.roomId}</a> (${ruleInfo})</li>`;
        text += `* ${list.listShortcode} @ ${list.roomRef} (${ruleInfo})\n`;
    }
    if (lists.length === 0) {
        html += "<li><i>None</i></li>";
        text += "* None\n";
    }
    html += "</ul>";
}
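
To show why explicit escaping is risky, here is a small sketch. The escapeHTML helper and the untrusted shortcode are made up for this example and are not Mjolnir's htmlEscape or real data.

```typescript
// A minimal escaping helper, written here purely for illustration.
function escapeHTML(text: string): string {
  return text
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

// Hypothetical untrusted value, for example a shortcode chosen by someone else.
const untrustedShortcode = "<img src=x onerror=alert(1)>";

// Forgetting the helper lets the markup through unchanged.
const unsafeHTML = `<li>${untrustedShortcode}</li>`;

// Every single interpolation has to remember to call the helper.
const safeHTML = `<li>${escapeHTML(untrustedShortcode)}</li>`;

console.log(unsafeHTML);
console.log(safeHTML);
```

With string concatenation the compiler cannot tell these two lines apart, so the only safeguard is the programmer remembering to escape each value.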

Even once the two string buffers of text and HTML had been built, Mjolnir would still need to arbitrarily send an event from the command itself. While this provides some freedom, it creates a lot of room for inconsistency and programmer error.

const reply = RichReply.createFor(roomId, event, text, html);
reply["msgtype"] = "m.notice";
await mjolnir.client.sendMessage(roomId, reply);

As an aside, we can see in the example above that there's no real error handling of the call to client.sendMessage. Which is another pattern that is seen throughout the Mjolnir code base, and TypeScript code more generally. Draupnir and the matrix-protection-suite use a Result type to force type safe error handling, see @gnuxie/typescript-result. This is largely the reason why Draupnir now rarely experiences random crashes that Mjolnir users will be familiar with.
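
The general idea of a Result type can be sketched as follows. SketchResult, this Ok, this isError, and sendMessageSketch are hypothetical definitions written for this post; @gnuxie/typescript-result's real API may well differ.

```typescript
// Hypothetical Result shape: a value is either a success or a failure.
type SketchResult<OkType, ErrorType = Error> =
  | { isOkay: true; ok: OkType }
  | { isOkay: false; error: ErrorType };

function Ok<OkType>(ok: OkType): SketchResult<OkType, never> {
  return { isOkay: true, ok };
}

function isError<OkType, ErrorType>(
  result: SketchResult<OkType, ErrorType>
): result is { isOkay: false; error: ErrorType } {
  return !result.isOkay;
}

// Hypothetical stand-in for a client call that can fail.
function sendMessageSketch(roomID: string, body: string): SketchResult<string> {
  if (!roomID.startsWith("!")) {
    return { isOkay: false, error: new Error(`Not a room ID: ${roomID}`) };
  }
  return Ok(`$event-for-${body.length}-characters`);
}

function describeOutcome(roomID: string): string {
  const result = sendMessageSketch(roomID, "hello");
  if (isError(result)) {
    return `failed: ${result.error.message}`;
  }
  // `result.ok` only type checks once the error case has been ruled out.
  return `sent: ${result.ok}`;
}

console.log(describeOutcome("!room:example.com"));
console.log(describeOutcome("nope"));
```

Because the failure is part of the return type, the caller cannot reach the success value without first dealing with the error branch.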

Rendering in Draupnir

One of the very first things that I did when developing Draupnir was to allow messages to be rendered using JSX templates. If you don't know, JSX is the template technology used by the well-known React framework. But did you know that in TypeScript you can create your own JSXFactory to render whatever you like12? The JSXFactory that I developed would work with org.matrix.custom.html and work backwards to create a plain text fallback (usually with markdown).

Each document is then carefully rendered to Matrix events, and is automatically split across multiple events if the message is too large.
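
Here is a rough sketch of the idea without any JSX syntax. With TypeScript's classic JSX transform and the factory set to a function named h, an expression like <b>hi</b> compiles to h("b", null, "hi"), so the sketch calls a hypothetical h directly. DocumentNodeSketch, renderHTML, and renderPlainText are invented for this example and are far simpler than Draupnir's renderer, which also handles splitting large messages.

```typescript
// Hypothetical document node produced by the factory function.
interface DocumentNodeSketch {
  tag: string;
  attributes: Record<string, string>;
  children: (DocumentNodeSketch | string)[];
}

// The factory: this is the function that compiled JSX would call.
function h(
  tag: string,
  attributes: Record<string, string> | null,
  ...children: (DocumentNodeSketch | string)[]
): DocumentNodeSketch {
  return { tag, attributes: attributes ?? {}, children };
}

function escapeText(text: string): string {
  return text
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

// Escaping happens in one place, so callers never concatenate raw HTML.
function renderHTML(node: DocumentNodeSketch | string): string {
  if (typeof node === "string") {
    return escapeText(node);
  }
  const attributes = Object.entries(node.attributes)
    .map(([key, value]) => ` ${key}="${escapeText(value)}"`)
    .join("");
  const inner = node.children.map((child) => renderHTML(child)).join("");
  return `<${node.tag}${attributes}>${inner}</${node.tag}>`;
}

// Work backwards from the same tree to a markdown-flavoured fallback.
function renderPlainText(node: DocumentNodeSketch | string): string {
  if (typeof node === "string") {
    return node;
  }
  const inner = node.children.map((child) => renderPlainText(child)).join("");
  switch (node.tag) {
    case "b":
      return `**${inner}**`;
    case "a":
      return `[${inner}](${node.attributes["href"] ?? ""})`;
    case "li":
      return `* ${inner}\n`;
    default:
      return inner;
  }
}

const listDocument = h(
  "ul",
  null,
  h(
    "li",
    null,
    h("a", { href: "https://matrix.to/#/!list:example.com" }, "!list:example.com"),
    " (",
    h("b", null, "3"),
    " rules)"
  )
);

console.log(renderHTML(listDocument));
console.log(renderPlainText(listDocument));
```

The document is described once and both formats are derived from it, which removes the duplicated string building shown in the Mjolnir example.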

const renderedLists = lists.map((list) => {
  return (
    <li>
      <a href={list.revision.room.toPermalink()}>
        {list.revision.room.toRoomIDOrAlias()}
      </a>{" "}
      ({list.revision.shortcode ?? "<no shortcode>"}) propagation:{" "}
      {list.watchedListProfile.propagation} {" "} (rules:{" "}
      {list.revision.allRulesOfType(PolicyRuleType.Server).length} servers,{" "}
      {list.revision.allRulesOfType(PolicyRuleType.User).length} users,{" "}
      {list.revision.allRulesOfType(PolicyRuleType.Room).length} rooms)
    </li>
  );
});

Draupnir commands describe renderers separately to the commands themselves, against an interface adaptor.

In theory, a command could have different renderers based on what adaptor called the command. So for example, it would be possible to make Draupnir commands accessible to a web API without too much work. All that would be needed is to create an interface adaptor.

DraupnirInterfaceAdaptor.describeRenderer(DraupnirStatusCommand, {
  JSXRenderer(result) {
    if (isError(result)) {
      return Ok(undefined);
    }
    return Ok(renderStatusInfo(result.ok));
  },
});

The DraupnirInterfaceAdaptor is an instance of the MatrixInterfaceAdaptor. This will automatically make the right client API calls to render the JSX document to Matrix events in the management room. It will even relay error messages if the command fails and handle sending emoji (this is what we are deferring to by not doing anything when the result is an error).
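
The separation between commands and renderers can be sketched like this. InterfaceAdaptorSketch, CommandSketch, and Outcome are hypothetical and much smaller than the real MatrixInterfaceAdaptor; the point is only that one command can be rendered differently depending on which adaptor invoked it.

```typescript
type Outcome<Value> = { ok: Value } | { error: string };

interface CommandSketch<Value> {
  name: string;
  executor: () => Outcome<Value>;
}

class InterfaceAdaptorSketch<Output> {
  private readonly renderers = new Map<string, (value: any) => Output>();
  private readonly renderError: (message: string) => Output;

  constructor(renderError: (message: string) => Output) {
    this.renderError = renderError;
  }

  // Renderers are registered against the command, separately from its body.
  describeRenderer<Value>(
    command: CommandSketch<Value>,
    renderer: (value: Value) => Output
  ): void {
    this.renderers.set(command.name, renderer);
  }

  invoke<Value>(command: CommandSketch<Value>): Output {
    const outcome = command.executor();
    if ("error" in outcome) {
      // Errors are relayed by the adaptor, not by each command's renderer.
      return this.renderError(outcome.error);
    }
    const renderer = this.renderers.get(command.name);
    if (renderer === undefined) {
      throw new TypeError(`No renderer described for ${command.name}`);
    }
    return renderer(outcome.ok);
  }
}

const statusCommandSketch: CommandSketch<{ protectedRooms: number }> = {
  name: "status",
  executor: () => ({ ok: { protectedRooms: 3 } }),
};

// One adaptor renders to a chat message...
const chatAdaptor = new InterfaceAdaptorSketch<string>((message) => `Error: ${message}`);
chatAdaptor.describeRenderer(statusCommandSketch, (info) => `Protecting ${info.protectedRooms} rooms`);

// ...and another renders the same command for a hypothetical web API.
const webAdaptor = new InterfaceAdaptorSketch<{ status: number; body: unknown }>(
  (message) => ({ status: 500, body: message })
);
webAdaptor.describeRenderer(statusCommandSketch, (info) => ({ status: 200, body: info }));

console.log(chatAdaptor.invoke(statusCommandSketch));
console.log(webAdaptor.invoke(statusCommandSketch));
```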

The renderer description can still provide an arbitrary renderer if the command needs to do more than just render messages, by providing an arbritraryRenderer method.

We can use the JSXFactory provided by Draupnir's interface-manager anywhere too, not just in commands. This is how protections, such as the BanPropagationProtection, render their messages.

Testing commands

Previously I have talked about how both Mjolnir and Draupnir's commands were hard to write unit tests for13. Mjolnir had some unit tests for parsing command arguments, but not for the code responsible for the functionality of the commands themselves.

In Draupnir we had made some progress here, since the matrix-protection-suite meant that the different components used by the commands were able to be unit tested, but still not the commands themselves.

This is because we gave the commands too much context, usually the entirety of Draupnir, that they would then destructure from and call various APIs. Just by the nature of that, any test would then become an integration test unless we could figure out how to fake Draupnir14.

What I ended up doing instead is just allowing commands to describe glue code to destructure what they need from Draupnir into a command specific context. Then when we test the main executor of a command, we just need to give the command what it is actually going to use. This works really well with the matrix-protection-suite's ClientPlatform which breaks down the various client server APIs into granular capabilities that can then be handed out liberally without implicitly introducing dependencies into code on the entirety of Draupnir or a Matrix client.

DraupnirContextToCommandContextTranslator.registerTranslation(
  DraupnirKickCommand,
  function (draupnir) {
    return {
      roomKicker: draupnir.clientPlatform.toRoomKicker(),
      roomResolver: draupnir.clientPlatform.toRoomResolver(),
      setMembership: draupnir.protectedRoomsSet.setMembership,
      taskQueue: draupnir.taskQueue,
      noop: draupnir.config.noop,
    };
  }
);
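
The shape of this glue can be sketched in a few lines. WholeBotSketch, KickContextSketch, ContextualCommandSketch, and ContextTranslatorSketch are hypothetical names, and this is not how DraupnirContextToCommandContextTranslator is actually implemented; it only shows why a narrow context makes the executor easy to call from a test.

```typescript
// Hypothetical stand-in for "the entirety of Draupnir".
interface WholeBotSketch {
  config: { noop: boolean };
  kickUser: (userID: string) => string;
  connectToHomeserver: () => void; // something a unit test should never need
}

// The narrow context that the hypothetical kick command actually uses.
interface KickContextSketch {
  kickUser: (userID: string) => string;
  noop: boolean;
}

interface ContextualCommandSketch<Context> {
  name: string;
  executor: (context: Context, userID: string) => string;
}

class ContextTranslatorSketch<Whole> {
  private readonly translations = new Map<string, (whole: Whole) => unknown>();

  registerTranslation<Context>(
    command: ContextualCommandSketch<Context>,
    translate: (whole: Whole) => Context
  ): void {
    this.translations.set(command.name, translate);
  }

  invoke<Context>(
    whole: Whole,
    command: ContextualCommandSketch<Context>,
    userID: string
  ): string {
    const translate = this.translations.get(command.name);
    if (translate === undefined) {
      throw new TypeError(`No translation registered for ${command.name}`);
    }
    return command.executor(translate(whole) as Context, userID);
  }
}

const kickCommandSketch: ContextualCommandSketch<KickContextSketch> = {
  name: "kick",
  executor: (context, userID) =>
    context.noop ? `would kick ${userID}` : context.kickUser(userID),
};

const translator = new ContextTranslatorSketch<WholeBotSketch>();
translator.registerTranslation(kickCommandSketch, (bot) => ({
  kickUser: bot.kickUser,
  noop: bot.config.noop,
}));

// A unit test only has to supply the narrow context, never the whole bot.
const unitTestOutput = kickCommandSketch.executor(
  { kickUser: (userID) => `kicked ${userID}`, noop: false },
  "@spam:example.com"
);
console.log(unitTestOutput);
```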

One of the commands that I know has been broken since the introduction of the matrix-protection-suite is the kick command. So it made a good candidate for exploring the unit testing features.

Below is a simple test case for the kick command. It might not be obvious what is going on here so let me break it down.

  1. createProtectedRooms is a function that uses a utility from the matrix-protection-suite called describeProtectedRooms, which allows us to fake an entire protected rooms set without a Matrix homeserver or client. This includes a model of the room state, as it would appear if Draupnir was really running.
  2. roomKicker is just a simple mock of the matrix-protection-suite's granular RoomKicker capability15. We are creating this mock to make sure that the command will only kick the users from the server that we told it to.
  3. We call the executor of the kick command; the second argument is the context specific to the kick command that we showed earlier. This context is where the kick command will get all of the dependencies that it is going to use.

it("Will kick users from protected rooms when a glob is used", async function () {
  const { protectedRoomsSet } = await createProtectedRooms();
  const roomKicker = createMock<RoomKicker>({
    async kickUser(_room, userID, _reason) {
      // We should only kick users that match the glob...
      expect(userServerName(userID)).toBe("testserver.example.com");
      return Ok(undefined);
    },
  });
  const kickResult = await CommandExecutorHelper.execute(
    DraupnirKickCommand,
    {
      taskQueue,
      setMembership: protectedRoomsSet.setMembership,
      roomKicker,
      roomResolver,
      noop: false,
    },
    {
      keywords: { glob: true },
    },
    MatrixUserID.fromUserID(`@*:testserver.example.com` as StringUserID)
  );
  const usersToKick = kickResult.expect("We expect the kick command to succeed");
  expect(usersToKick.size).toBe(51);
});

Well it's a good thing that we wrote a test for the kick command.

Ever since we migrated Draupnir to use the matrix-protection-suite, the kick command has been broken because of a simple bug where we forgot to initialise the ThrottlingQueue, which would be used to process kicks over time in the background.

Unfortunately, when I recently fixed this bug as part of the unit testing work, I discovered a much worse bug. If you were to use a glob kick, then Draupnir would have removed every single user, including itself, from either the target room or every single protected room. Fortunately there was no way to trigger this bug in any of the existing releases. But it goes to show the value of this work16.

Interface manager library

When Draupnir's command system was written, I didn't really understand entirely what it should be, and I didn't understand how TypeScript's type inference worked.

So for example, in legacy Draupnir code, you would define presentation types and commands by interning them into a hash table with a name, and then retrieve them with a function such as findPresentationType("MatrixRoomReference"). This is a habit from my time working with Common Lisp, and this pattern is how you do any kind of meta-programming over there. It didn't occur to me that I could just export the presentation types like so export const MatrixRoomReference = { ... }; and then refer to them directly. Referring to them directly is exactly why type inference on command executor parameters now works.
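
The difference between the two styles can be sketched as follows. NamedTypeSketch, RoomReferenceSketch, internedTypes, and findPresentationTypeSketch are hypothetical and only loosely modelled on the legacy code described above.

```typescript
interface NamedTypeSketch<ObjectType> {
  name: string;
  validator: (value: unknown) => value is ObjectType;
}

// Direct style: declare (or export) the value and refer to it by identifier.
const RoomReferenceSketch: NamedTypeSketch<string> = {
  name: "RoomReferenceSketch",
  validator: (value: unknown): value is string =>
    typeof value === "string" && (value.startsWith("!") || value.startsWith("#")),
};

// Legacy style: intern into a table and look the type up by name.
// The table can only promise NamedTypeSketch<unknown> to its callers.
const internedTypes = new Map<string, NamedTypeSketch<unknown>>();
internedTypes.set(RoomReferenceSketch.name, RoomReferenceSketch);

function findPresentationTypeSketch(typeName: string): NamedTypeSketch<unknown> {
  const found = internedTypes.get(typeName);
  if (found === undefined) {
    throw new TypeError(`Unknown presentation type: ${typeName}`);
  }
  return found;
}

function firstCharacter(value: unknown): string {
  // Direct reference: the validator narrows `value` to string.
  if (RoomReferenceSketch.validator(value)) {
    return value.charAt(0);
  }
  // With findPresentationTypeSketch("RoomReferenceSketch").validator(value),
  // `value` would stay `unknown` and `value.charAt` would not compile.
  return "";
}

console.log(firstCharacter("!room:example.com"));
```

A string key erases the type parameter at the lookup, whereas an identifier carries it all the way to the call site.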

As a result of this debt I did spend a little more time than I had anticipated cleaning things up, but I think that is fine. The command system is important and is one of the places any new or potential contributor will touch first, so it does need to be in good condition so as not to scare them away.

You can view the library here on GitHub @the-draupnir-project/interface-manager.

Footnotes:

1

If you're wondering why we skipped the fourth update (2404.html), let's just say that I burnt out from everything while I was waiting to get the details of the NLnet grant.

4

And yes the order of the arguments for the ban command is the opposite way around to Draupnir.

6

Although this is how the dreaded UNIX argv works and that never stopped anyone apparently.

8

It's worth mentioning again that it would have been very unlikely that I could convince Element to let me work on the command refactor in the first place. The work is ideological in a way that is opposed to the way command line interfaces work, and the presentation interface is not a command line interface; it's a command-oriented interface, sure. I did make a note of this previously. https://marewolf.me/posts/draupnir/2403.html#fn.22

10

A presentation schema is just a way of describing which presentation types are valid for a given parameter. The interface-manager has three different types of PresentationSchema: single, union, and top. Here single is just a presentation type, union is a union of presentation types, and top can be any presentation type, equivalent to TypeScript's unknown.

14

Technically, we could have also broken commands into smaller functions and methods that we could then unit test and leave the actual command as only glue. But this would require an equal amount of work and possibly disorganise the code base.

15

Damn I regret not calling this RoomMemberRemover but ok lol.

16

The PR where I discovered this can be found here: https://github.com/the-draupnir-project/Draupnir/pull/553