Gnuxie & Draupnir: late summer 2024.
Introduction
Draupnir development has been building momentum and is now full steam ahead. If you didn't hear, Draupnir is now being supported with a grant from NLnet.
https://marewolf.me/posts/draupnir/24-nlnet-goals.html
We've now cleared the first two goals of the grant, the command system library and refactor.
The library lives in a repository on the Draupnir GitHub organisation: @the-draupnir-project/interface-manager. I don't think that is particularly helpful to anyone getting started writing TypeScript bots, so I plan to spend some time soon to create a simple bot using the interface-manager to act as a project template and example.
We've had two releases for Draupnir since the previous update, beta.5 and beta.6.
In this update I'm going to be talking about the development of Draupnir's command system and its history, and how it became what we now call the interface-manager [1].
Please click on and read the footnotes if you need extra context or something is confusing. They're there for a reason.
Draupnir Commands
How commands work in Mjolnir
First we're going to look at how commands work in Mjolnir. Mjolnir is the older sister of Draupnir, and as Draupnir started as a fork of Mjolnir, Mjolnir provides us with both the beginning and the pre-history of Draupnir's command system.
The command handler [2] in Mjolnir is simplistic. When Mjolnir receives a message in the management room, Mjolnir will use split(" ") [3] to tokenize a command into an array of words. For a ban command, this would look something like ["!mjolnir", "ban", "coc", "@foo:example.com", "spam"] [4]. The command handler then dispatches from ["!mjolnir", "ban"] to find a handle for the command.
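To make the tokenize-then-dispatch step concrete, here is a minimal sketch of the idea. It is not Mjolnir's actual code: the `CommandHandler` type, the `handlers` table, and the `dispatch` function are hypothetical names made up for illustration.

```typescript
// Hypothetical handler signature: each handler receives the whole token array
// and has to make sense of it on its own.
type CommandHandler = (parts: string[]) => string;

// Hypothetical dispatch table keyed by the command name (the second token).
const handlers = new Map<string, CommandHandler>([
  ["ban", (parts) => `ban handler got ${parts.length - 2} argument(s)`],
  ["status", () => "status handler"],
]);

function dispatch(message: string): string {
  // Naive tokenization: every single space is treated as a separator.
  const parts = message.split(" ");
  const [prefix, commandName] = parts;
  if (prefix !== "!mjolnir" || commandName === undefined) {
    return "not a command";
  }
  const handler = handlers.get(commandName);
  return handler === undefined
    ? `unknown command: ${commandName}`
    : handler(parts);
}

const tokens = "!mjolnir ban coc @foo:example.com spam".split(" ");
// tokens is ["!mjolnir", "ban", "coc", "@foo:example.com", "spam"]

// A doubled space produces an empty token, which shifts every later index.
const fragile = "!mjolnir ban  coc".split(" ");
// fragile is ["!mjolnir", "ban", "", "coc"]
```

The second example hints at why this approach is fragile: the handlers only ever see untyped strings at positions that can shift with the user's whitespace.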
Each individual command handle then has to process and parse this argument array on its own. This is done using fragile ad-hoc parsing code that attempts to parse each argument into some type relevant to the command.
    let argumentIndex = 2; // where the arguments start in the arguments array
    let ruleType: string | null = null;
    let entity: string | null = null;
    let list: PolicyList | null = null;
    let force = false;
    while (argumentIndex < 7 && argumentIndex < parts.length) {
      const arg = parts[argumentIndex++];
      if (!arg) break;
      if (["user", "room", "server"].includes(arg.toLowerCase())) {
        // set the rule type from the explicit argument for the rule type...
      } else if (!entity && (arg[0] === '@' || arg[0] === '!' || arg[0] === '#' || arg.includes("*"))) {
        entity = arg;
        // infer the rule type from the entity argument
      } else if (!list) {
        // set the list or infer it from the default....
      }
      if (entity) break;
    }
    if (parts[parts.length - 1] === "--force") {
      // set the force flag...
    }
    if (!entity) {
      // more code to figure out where the entity is in this arguments array...
    }
    if (!list) {
      list = mjolnir.policyListManager.lists.find(b => b.listShortcode.toLowerCase() === defaultShortcode) || null;
    }
Above is an edited example from Mjolnir's ban command; you should really look at the real thing if you want to see the rest of the glue [5].
This isn't necessarily bad on its own; this parsing code is OK if it is covered by unit tests. However, it is a problem if we need to write this code every time we create a command, along with unit tests to make sure that it works correctly.
It's also not a great foundation for fostering an ecosystem of protections or bots, and it sets a high bar for creating any sort of command [6].
So, it is clear that we need to use a command parser that is aware of the different types that Draupnir uses, such as references to rooms in the form of raw room identifiers or matrix.to urls.
My first attempt at refactoring this just separated the parsing step from the command execution step, since currently in Mjolnir both steps are baked into one handle [7].
But what was really needed was a presentation interface.
The presentation interface
I wanted Draupnir to have a presentation interface [8]. Presentation interfaces are command-oriented interfaces that associate semantics with input and output [9]. That doesn't mean anything yet, but basically in Draupnir we associate a type with each command argument when it is read by our command parser. So a room identifier would become associated with a MatrixRoomIDPresentationType.
    interface PresentationType<ObjectType = unknown> {
      name: string;
      validator: (value: unknown) => value is ObjectType;
      wrap: (
        object: unknown extends ObjectType ? never : ObjectType
      ) => Presentation<ObjectType>;
    }

    interface Presentation<ObjectType = unknown> {
      object: ObjectType;
      presentationType: PresentationType<ObjectType>;
    }
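As a rough illustration of what "associating a type with each argument" can look like, the sketch below defines two presentation types and a reader that wraps a token with the first type that accepts it. The interfaces are simplified copies of the ones above (`wrap` is declared with method syntax so that specific presentation types fit in one array). `RoomIDPresentationType`, `StringPresentationType`, the regular expression, and `readToken` are assumptions made up for this example, not the interface-manager's real definitions.

```typescript
// Simplified copies of the interfaces above, for illustration only.
interface PresentationType<ObjectType = unknown> {
  name: string;
  validator(value: unknown): value is ObjectType;
  wrap(object: ObjectType): Presentation<ObjectType>;
}

interface Presentation<ObjectType = unknown> {
  object: ObjectType;
  presentationType: PresentationType<ObjectType>;
}

// Hypothetical presentation type for raw room IDs such as "!abc:example.com".
const RoomIDPresentationType: PresentationType<string> = {
  name: "RoomID",
  validator: (value: unknown): value is string =>
    typeof value === "string" && /^![^:]+:.+$/.test(value),
  wrap(object) {
    return { object, presentationType: RoomIDPresentationType };
  },
};

// Hypothetical fallback presentation type that accepts any string.
const StringPresentationType: PresentationType<string> = {
  name: "String",
  validator: (value: unknown): value is string => typeof value === "string",
  wrap(object) {
    return { object, presentationType: StringPresentationType };
  },
};

// Associate a token with the first presentation type that accepts it.
function readToken(token: string, types: PresentationType[]): Presentation {
  for (const type of types) {
    if (type.validator(token)) {
      return type.wrap(token);
    }
  }
  throw new TypeError(`No presentation type accepts: ${token}`);
}

const knownTypes = [RoomIDPresentationType, StringPresentationType];
const room = readToken("!abc:example.com", knownTypes);
const word = readToken("coc", knownTypes);
// room.presentationType.name is "RoomID", word.presentationType.name is "String"
```

Once every argument carries its presentation type, later stages no longer need to guess what a string was meant to be from its position in the array.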
When defining Draupnir commands, we can then describe the presentation types (or presentation schema) for each parameter and the command parser will automatically match arguments up to each parameter for us [10].
    {
      parameters: tuple(
        {
          name: "entity",
          description:
            "The entity to ban. This can be a user ID, room ID, or server name.",
          acceptor: union(
            MatrixUserIDPresentationType,
            MatrixRoomReferencePresentationSchema,
            StringPresentationType
          ),
        },
        {
          name: "list",
          acceptor: union(
            MatrixRoomReferencePresentationSchema,
            StringPresentationType
          ),
          async prompt(draupnir: Draupnir) {
            return Ok({
              suggestions: draupnir.policyRoomManager
                .getEditablePolicyRoomIDs(
                  draupnir.clientUserID,
                  PolicyRuleType.User
                )
                .map((room) => MatrixRoomIDPresentationType.wrap(room)),
            });
          },
        }
      ),
      async executor(draupnir: Draupnir, entity, list) {
        // The body of the ban command...
      },
    }
Because we use the presentation types and schema directly, TypeScript will automatically infer the types of entity and list in the command executor for us. This is really powerful: the presentation schema describes exactly which types are valid when each executor argument is parsed, and the type information is inferred from the same schema, so there isn't a way for the programmer to specify the wrong schema or types. The schema and types are always consistent, and if we did override the type annotation for entity or list with an inaccurate type, TypeScript would even show us an error.
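The inference described above can be reproduced in miniature with a mapped tuple type. This is a sketch of the general TypeScript technique under simplifying assumptions, not the interface-manager's implementation: `describeCommand`, `ExecutorArguments`, and the two presentation types here are hypothetical stand-ins, and the real library also deals with unions, keywords, and a context argument.

```typescript
// Minimal stand-in for a presentation type; only what inference needs.
interface PresentationType<ObjectType> {
  name: string;
  validator(value: unknown): value is ObjectType;
}

// Extract the object type that a presentation type describes.
type ObjectTypeOf<P> = P extends PresentationType<infer T> ? T : never;

// Map a tuple of presentation types to a tuple of their object types.
type ExecutorArguments<Params extends PresentationType<any>[]> = {
  [K in keyof Params]: ObjectTypeOf<Params[K]>;
};

interface CommandDescription<Params extends PresentationType<any>[], Result> {
  parameters: [...Params];
  executor: (...args: ExecutorArguments<Params>) => Result;
}

// Identity function whose only job is to let TypeScript infer `Params`.
function describeCommand<Params extends PresentationType<any>[], Result>(
  description: CommandDescription<Params, Result>
): CommandDescription<Params, Result> {
  return description;
}

type UserID = `@${string}:${string}`;

const UserIDPresentationType: PresentationType<UserID> = {
  name: "UserID",
  validator: (value: unknown): value is UserID =>
    typeof value === "string" && /^@[^:]*:.+$/.test(value),
};

const StringPresentationType: PresentationType<string> = {
  name: "String",
  validator: (value: unknown): value is string => typeof value === "string",
};

const banCommand = describeCommand({
  parameters: [UserIDPresentationType, StringPresentationType],
  // No annotations here: `entity` is inferred as UserID and `list` as string.
  executor(entity, list) {
    const serverName = entity.slice(entity.indexOf(":") + 1);
    return `ban ${entity} (server ${serverName}) on ${list}`;
  },
});

const summary = banCommand.executor("@foo:example.com", "coc");
// summary is "ban @foo:example.com (server example.com) on coc"
```

Because the executor's parameter types are derived from `parameters`, annotating `entity` as, say, `number` should be rejected by the compiler rather than silently drifting out of sync with the schema.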
You may have also noticed the prompt method in the parameter description for list. Essentially, if the command parser notices that there is a missing argument for list, the parser will return an error and give the interface the option to prompt the user for just that argument, rather than asking them to write out the entire command again. This even includes some helpful suggestions for what the argument should be, which in this case is one of the policy rooms that Draupnir is watching.
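A minimal sketch of that behaviour follows, assuming a much simpler parser than the real one. Everything here (`ParameterDescription`, `ParseResult`, `parseArguments`, and the synchronous `prompt` returning plain strings) is hypothetical; in the snippet above the real `prompt` is asynchronous and returns wrapped presentations.

```typescript
interface ParameterDescription {
  name: string;
  // Optional hook offering suggestions when this argument is missing.
  prompt?: () => string[];
}

type ParseResult =
  | { kind: "ok"; arguments: Map<string, string> }
  | { kind: "missing"; parameter: string; suggestions: string[] };

function parseArguments(
  parameters: ParameterDescription[],
  tokens: string[]
): ParseResult {
  const parsed = new Map<string, string>();
  for (let index = 0; index < parameters.length; index++) {
    const parameter = parameters[index];
    const token = tokens[index];
    if (token === undefined) {
      // Report exactly which argument is missing so that the interface can
      // prompt for that one argument instead of rejecting the whole command.
      return {
        kind: "missing",
        parameter: parameter.name,
        suggestions: parameter.prompt?.() ?? [],
      };
    }
    parsed.set(parameter.name, token);
  }
  return { kind: "ok", arguments: parsed };
}

const banParameters: ParameterDescription[] = [
  { name: "entity" },
  {
    name: "list",
    prompt: () => ["!policy-a:example.com", "!policy-b:example.com"],
  },
];

const incomplete = parseArguments(banParameters, ["@foo:example.com"]);
// incomplete.kind is "missing" for the parameter "list", with two suggestions.
const complete = parseArguments(banParameters, ["@foo:example.com", "coc"]);
// complete.kind is "ok"
```

The important design point is that a missing argument is an ordinary, structured return value, so an interface can decide how to present the prompt.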
You can find what Draupnir's ban command looks like as of writing in this link.
Rendering in Mjolnir
Another major issue with Mjolnir's command system was how commands are rendered. Matrix events have two distinct formats, plain text and org.matrix.custom.html [11]. Because not all Matrix clients include an HTML renderer, Mjolnir would have to render both text and html versions of a message at the same time, duplicating the work.
This was done ad hoc in Mjolnir's code base, and as you can also see, escaping HTML from content needs to be handled explicitly, which is risky. The example below shows how this code would work, although you should take a look at the whole file to get a good picture.
    const renderPolicyLists = (header: string, lists: PolicyList[]) => {
      html += `<b>${header}:</b><br><ul>`;
      text += `${header}:\n`;
      for (const list of lists) {
        const ruleInfo = `rules: ${list.serverRules.length} servers, ${list.userRules.length} users, ${list.roomRules.length} rooms`;
        html += `<li>${htmlEscape(list.listShortcode)} @ <a href="${list.roomRef}">${list.roomId}</a> (${ruleInfo})</li>`;
        text += `* ${list.listShortcode} @ ${list.roomRef} (${ruleInfo})\n`;
      }
      if (lists.length === 0) {
        html += "<li><i>None</i></li>";
        text += "* None\n";
      }
      html += "</ul>";
    }
Even once the two string buffers, text and html, had been built, Mjolnir would still need to send an event itself from within the command. While this provides some freedom, it creates a lot of room for inconsistency and programmer error.
    const reply = RichReply.createFor(roomId, event, text, html);
    reply["msgtype"] = "m.notice";
    await mjolnir.client.sendMessage(roomId, reply);
As an aside, we can see in the example above that there's no real error handling of the call to client.sendMessage. This is another pattern that is seen throughout the Mjolnir code base, and TypeScript code more generally. Draupnir and the matrix-protection-suite use a Result type to force type-safe error handling, see @gnuxie/typescript-result. This is largely the reason why Draupnir now rarely experiences the random crashes that Mjolnir users will be familiar with.
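For readers who have not met the pattern, here is a minimal sketch of a Result type and how it forces the caller to deal with failure. This is a hypothetical simplification written for this post and should not be read as the API of @gnuxie/typescript-result; `sendNotice` is a made-up, synchronous stand-in for a client call.

```typescript
// A value is either a success carrying `ok` or a failure carrying `error`.
type Result<T, E = Error> =
  | { isOkay: true; ok: T }
  | { isOkay: false; error: E };

function Ok<T>(value: T): Result<T, never> {
  return { isOkay: true, ok: value };
}

function Err<E>(error: E): Result<never, E> {
  return { isOkay: false, error };
}

// Hypothetical stand-in for a client call that can fail. Instead of throwing,
// it returns the failure as a value.
function sendNotice(roomID: string, body: string): Result<string> {
  if (roomID.charAt(0) !== "!") {
    return Err(new Error(`Not a room ID: ${roomID}`));
  }
  return Ok(`event-for-${roomID}:${body.length}`);
}

function describeOutcome(result: Result<string>): string {
  // `result.ok` is not accessible until the failure case has been ruled out,
  // so forgetting to handle the error is a compile-time mistake.
  if (!result.isOkay) {
    return `failed: ${result.error.message}`;
  }
  return `sent ${result.ok}`;
}

const good = describeOutcome(sendNotice("!room:example.com", "hi"));
// good is "sent event-for-!room:example.com:2"
const bad = describeOutcome(sendNotice("oops", "hi"));
// bad is "failed: Not a room ID: oops"
```

Compared with an unhandled rejected promise, the failure here cannot be ignored by accident, because the type checker will not allow `ok` to be read without first checking `isOkay`.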
Rendering in Draupnir
One of the very first things that I did when developing Draupnir was to allow messages to be rendered using JSX templates. If you don't know, JSX is the same templating technology that is used by the well-known React framework. But did you know that in TypeScript you can create your own JSXFactory to render whatever you like [12]? The JSXFactory that I developed works with org.matrix.custom.html and works backwards to create a plain text fallback (usually with markdown).
Each document is then carefully rendered to Matrix events, and is automatically split across multiple events if the message is too large.
    const renderedLists = lists.map((list) => {
      return (
        <li>
          <a href={list.revision.room.toPermalink()}>
            {list.revision.room.toRoomIDOrAlias()}
          </a>{" "}
          ({list.revision.shortcode ?? "<no shortcode>"}) propagation:{" "}
          {list.watchedListProfile.propagation} {" "}
          (rules:{" "}
          {list.revision.allRulesOfType(PolicyRuleType.Server).length} servers,{" "}
          {list.revision.allRulesOfType(PolicyRuleType.User).length} users,{" "}
          {list.revision.allRulesOfType(PolicyRuleType.Room).length} rooms)
        </li>
      );
    });
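To show why building a document tree first helps, the sketch below renders one tree to both HTML and a markdown-style text fallback, with escaping handled in a single place. It is a toy written under stated assumptions, not Draupnir's JSXFactory: `h`, `toHTML`, `toText`, and the handful of supported tags are all hypothetical. The factory is called directly so that the example runs without any JSX compiler settings; with TypeScript's classic JSX transform and the `jsxFactory` option pointed at such a function, `<b>hi</b>` compiles to `h("b", null, "hi")`.

```typescript
// A tiny document tree: either text or an element with children.
interface ElementNode {
  tag: string;
  attributes: { [name: string]: string };
  children: DocumentNode[];
}
type DocumentNode = string | ElementNode;

// Hypothetical factory with the shape that a JSX factory is called with.
function h(
  tag: string,
  attributes: { [name: string]: string } | null,
  ...children: DocumentNode[]
): DocumentNode {
  return { tag, attributes: attributes ?? {}, children };
}

function escapeHTML(text: string): string {
  return text
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

// Escaping happens here for every text node and attribute value, so
// individual commands never have to remember to do it.
function toHTML(node: DocumentNode): string {
  if (typeof node === "string") {
    return escapeHTML(node);
  }
  const { tag, attributes, children } = node;
  const renderedAttributes = Object.keys(attributes)
    .map((name) => ` ${name}="${escapeHTML(attributes[name])}"`)
    .join("");
  return `<${tag}${renderedAttributes}>${children.map(toHTML).join("")}</${tag}>`;
}

// The plain text fallback is derived from the same tree.
function toText(node: DocumentNode): string {
  if (typeof node === "string") {
    return node;
  }
  const inner = node.children.map(toText).join("");
  switch (node.tag) {
    case "b":
      return `**${inner}**`;
    case "li":
      return `* ${inner}\n`;
    case "a": {
      const href = node.attributes.href;
      return href === undefined ? inner : `[${inner}](${href})`;
    }
    default:
      return inner;
  }
}

const item = h(
  "li",
  null,
  h("a", { href: "https://matrix.to/#/!list:example.com" }, "!list:example.com"),
  " (",
  h("b", null, "<no shortcode>"),
  ")"
);
const html = toHTML(item);
const text = toText(item);
```

Here `html` contains the escaped `&lt;no shortcode&gt;`, while `text` keeps the raw `<no shortcode>`, and neither required the caller to build two string buffers by hand.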
Draupnir commands describe renderers separately to the commands themselves, against an interface adaptor.
In theory, a command could have different renderers based on what adaptor called the command. So for example, it would be possible to make Draupnir commands accessible to a web API without too much work. All that would be needed is to create an interface adaptor.
    DraupnirInterfaceAdaptor.describeRenderer(DraupnirStatusCommand, {
      JSXRenderer(result) {
        if (isError(result)) {
          return Ok(undefined);
        }
        return Ok(renderStatusInfo(result.ok));
      },
    });
The DraupnirInterfaceAdaptor is an instance of the MatrixInterfaceAdaptor. This will automatically make the right client API calls to render the JSX document to Matrix events in the management room. It will even relay error messages if the command fails and handle sending emoji (this is what we are deferring to by not doing anything when the result is an error).
The renderer description can still provide an arbitrary renderer, if the command needs to do more than just render messages, by providing an arbritraryRenderer method.
We can use the JSXFactory provided by Draupnir's interface-manager anywhere too, not just in commands. This is how protections, such as the BanPropagationProtection, render their messages.
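The separation between commands and renderers described in this section can be sketched roughly as follows. The `InterfaceAdaptor` class, its methods, and the two adaptors are hypothetical and far simpler than the real MatrixInterfaceAdaptor; the point is only that the same command can be rendered differently depending on which adaptor invoked it, and that an adaptor can supply default error rendering when a renderer declines to handle an error.

```typescript
type Result<T> = { isOkay: true; ok: T } | { isOkay: false; error: Error };

interface Command<T> {
  name: string;
  execute(): Result<T>;
}

// One adaptor per interface (a Matrix room, a web API, ...). Renderers are
// registered against the adaptor rather than stored on the command.
class InterfaceAdaptor<Output> {
  private readonly renderers = new Map<
    string,
    (result: Result<any>) => Output | undefined
  >();
  private readonly renderError: (error: Error) => Output;

  constructor(renderError: (error: Error) => Output) {
    this.renderError = renderError;
  }

  describeRenderer<T>(
    command: Command<T>,
    renderer: (result: Result<T>) => Output | undefined
  ): void {
    this.renderers.set(command.name, renderer);
  }

  invoke<T>(command: Command<T>): Output | undefined {
    const result = command.execute();
    const renderer = this.renderers.get(command.name);
    const output = renderer === undefined ? undefined : renderer(result);
    // When nothing was rendered for an error, fall back to the default.
    if (output === undefined && !result.isOkay) {
      return this.renderError(result.error);
    }
    return output;
  }
}

const statusCommand: Command<{ protectedRooms: number }> = {
  name: "status",
  execute: () => ({ isOkay: true, ok: { protectedRooms: 3 } }),
};
const failingCommand: Command<number> = {
  name: "fail",
  execute: () => ({ isOkay: false, error: new Error("boom") }),
};

// A chat-style adaptor renders to message text.
const chatAdaptor = new InterfaceAdaptor<string>(
  (error) => `Error: ${error.message}`
);
chatAdaptor.describeRenderer(statusCommand, (result) =>
  result.isOkay ? `Protecting ${result.ok.protectedRooms} rooms` : undefined
);

// A web-style adaptor renders the same command to a JSON-friendly object.
const webAdaptor = new InterfaceAdaptor<object>((error) => ({
  error: error.message,
}));
webAdaptor.describeRenderer(statusCommand, (result) =>
  result.isOkay ? result.ok : undefined
);

const chatOutput = chatAdaptor.invoke(statusCommand);
const webOutput = webAdaptor.invoke(statusCommand);
const errorOutput = chatAdaptor.invoke(failingCommand);
```

In this sketch `chatOutput` is a sentence, `webOutput` is the raw status object, and `errorOutput` comes from the adaptor's default error renderer even though no renderer was described for the failing command.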
Testing commands
Previously I have talked about how both Mjolnir and Draupnir's commands were hard to write unit tests for [13]. Mjolnir had some unit tests for parsing command arguments, but not for the code responsible for the functionality of the commands themselves.
In Draupnir we had made some progress here, since the matrix-protection-suite meant that the different components used by the commands were able to be unit tested, but still not the commands themselves.
This is because we gave the commands too much context, usually the entirety of Draupnir, that they would then destructure from and call various APIs. Just by the nature of that, any test would then become an integration test unless we could figure out how to fake Draupnir [14].
What I ended up doing instead is just allowing commands to describe glue code to destructure what they need from Draupnir into a command-specific context. Then when we test the main executor of a command, we just need to give the command what it is actually going to use. This works really well with the matrix-protection-suite's ClientPlatform, which breaks down the various client-server APIs into granular capabilities that can then be handed out liberally without implicitly introducing dependencies into code on the entirety of Draupnir or a Matrix client.
    DraupnirContextToCommandContextTranslator.registerTranslation(
      DraupnirKickCommand,
      function (draupnir) {
        return {
          roomKicker: draupnir.clientPlatform.toRoomKicker(),
          roomResolver: draupnir.clientPlatform.toRoomResolver(),
          setMembership: draupnir.protectedRoomsSet.setMembership,
          taskQueue: draupnir.taskQueue,
          noop: draupnir.config.noop,
        };
      }
    );
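The idea behind this translation step can be sketched in a few lines. All of the names below (`Bot`, `KickContext`, `ContextTranslator`, `KickCommand`) are hypothetical stand-ins rather than Draupnir's real types. The sketch shows a command that depends only on a narrow context, glue code that derives that context from the whole bot, and a test-style call that supplies a hand-written context instead.

```typescript
// Stand-in for the whole bot: far more than any one command needs.
interface Bot {
  kickUser(roomID: string, userID: string): void;
  protectedRooms: string[];
  config: { noop: boolean };
}

// The narrow context that the kick command actually uses.
interface KickContext {
  kickUser(roomID: string, userID: string): void;
  protectedRooms: string[];
  noop: boolean;
}

interface Command<Context, Result> {
  name: string;
  executor(context: Context, userID: string): Result;
}

// Hypothetical registry of glue code that maps the full context to each
// command's specific context.
class ContextTranslator<FullContext> {
  private readonly translations = new Map<
    string,
    (full: FullContext) => unknown
  >();

  registerTranslation<Context, Result>(
    command: Command<Context, Result>,
    translate: (full: FullContext) => Context
  ): void {
    this.translations.set(command.name, translate);
  }

  invoke<Context, Result>(
    command: Command<Context, Result>,
    full: FullContext,
    userID: string
  ): Result {
    const translate = this.translations.get(command.name);
    if (translate === undefined) {
      throw new TypeError(`No translation registered for ${command.name}`);
    }
    return command.executor(translate(full) as Context, userID);
  }
}

const KickCommand: Command<KickContext, string[]> = {
  name: "kick",
  executor(context, userID) {
    const kickedFrom: string[] = [];
    for (const roomID of context.protectedRooms) {
      if (!context.noop) {
        context.kickUser(roomID, userID);
      }
      kickedFrom.push(roomID);
    }
    return kickedFrom;
  },
};

const translator = new ContextTranslator<Bot>();
translator.registerTranslation(KickCommand, (bot) => ({
  kickUser: (roomID, userID) => bot.kickUser(roomID, userID),
  protectedRooms: bot.protectedRooms,
  noop: bot.config.noop,
}));

// In a unit test there is no bot at all, only the narrow context.
const kicks: string[] = [];
const kickedFrom = KickCommand.executor(
  {
    kickUser: (roomID, userID) => {
      kicks.push(`${userID} from ${roomID}`);
    },
    protectedRooms: ["!a:example.com", "!b:example.com"],
    noop: false,
  },
  "@spam:example.com"
);
```

Because the executor never sees `Bot`, the test needs neither a homeserver nor a fake of the entire bot, only the three members of `KickContext`.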
One of the commands that I know has been broken since the introduction of the matrix-protection-suite is the kick command. So it made a good candidate for exploring the unit testing features.
Below is a simple test case for the kick command. It might not be obvious what is going on here so let me break it down.
- createProtectedRooms is a function that uses a utility from the matrix-protection-suite called describeProtectedRooms, which allows us to fake an entire protected rooms set without a Matrix homeserver or client. This includes a model of the room state, as it would appear if Draupnir was really running.
- roomKicker is just a simple mock of the matrix-protection-suite's granular RoomKicker capability [15]. We are creating this mock to make sure that the command will only kick the users from the server that we told it to.
- We call the executor of the kick command; the second argument is the context specific to the kick command that we showed earlier. This context is where the kick command will get all of the dependencies that it is going to use.
    it("Will kick users from protected rooms when a glob is used", async function () {
      const { protectedRoomsSet } = await createProtectedRooms();
      const roomKicker = createMock<RoomKicker>({
        async kickUser(_room, userID, _reason) {
          // We should only kick users that match the glob...
          expect(userServerName(userID)).toBe("testserver.example.com");
          return Ok(undefined);
        },
      });
      const kickResult = await CommandExecutorHelper.execute(
        DraupnirKickCommand,
        {
          taskQueue,
          setMembership: protectedRoomsSet.setMembership,
          roomKicker,
          roomResolver,
          noop: false,
        },
        {
          keywords: { glob: true },
        },
        MatrixUserID.fromUserID(`@*:testserver.example.com` as StringUserID)
      );
      const usersToKick = kickResult.expect("We expect the kick command to succeed");
      expect(usersToKick.size).toBe(51);
    });
Well, it's a good thing that we wrote a test for the kick command. Ever since we migrated Draupnir to use the matrix-protection-suite, the kick command has been broken because of a simple bug where we forgot to initialise the ThrottlingQueue, which would be used to process kicks over time in the background.
Unfortunately, when I recently fixed this bug as part of the unit testing work, I discovered a much worse bug. If you were to use a glob kick, then Draupnir would have removed every single user, including itself, from either the target room or every single protected room. Fortunately there was no way to trigger this bug in any of the existing releases. But it goes to show the value of this work [16].
Interface manager library
When Draupnir's command system was written, I didn't really understand entirely what it should be, and I didn't understand how TypeScript's type inference worked.
So for example, in legacy Draupnir code, you would define presentation types and commands by interning them into a hash table with a name, then retrieve them with a function call such as findPresentationType("MatrixRoomReference"). This is a habit from my time working with Common Lisp, and this pattern is how you do any kind of meta-programming over there. It didn't occur to me that I could just export the presentation types like so, export const MatrixRoomReference = { ... };, and then refer to them directly. Referring to them directly is exactly why type inference on command executor parameters now works.
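The difference between the two styles can be seen in a small sketch. The names here (`internPresentationType`, `NumberPresentationType`, and this particular `findPresentationType`) are hypothetical reconstructions for illustration, not the legacy Draupnir code. A lookup by string can only promise a `PresentationType<unknown>`, whereas a directly referenced constant keeps its precise type.

```typescript
interface PresentationType<ObjectType> {
  name: string;
  validator(value: unknown): value is ObjectType;
}

// Legacy style: intern by name into a table and look up with a string.
const internedTypes = new Map<string, PresentationType<unknown>>();

function internPresentationType(type: PresentationType<unknown>): void {
  internedTypes.set(type.name, type);
}

function findPresentationType(name: string): PresentationType<unknown> {
  const type = internedTypes.get(name);
  if (type === undefined) {
    // A misspelled name is only discovered at runtime.
    throw new TypeError(`Unknown presentation type: ${name}`);
  }
  return type;
}

// Direct style: a plain constant that can be exported and imported.
const NumberPresentationType: PresentationType<number> = {
  name: "Number",
  validator: (value: unknown): value is number => typeof value === "number",
};
internPresentationType(NumberPresentationType);

function formatViaLookup(value: unknown): string {
  const type = findPresentationType("Number");
  if (type.validator(value)) {
    // The compiler only knows `value is unknown`, so a cast is required.
    return (value as number).toFixed(1);
  }
  return "not a number";
}

function formatDirectly(value: unknown): string {
  if (NumberPresentationType.validator(value)) {
    // Narrowed to number by the type predicate, no cast needed.
    return value.toFixed(1);
  }
  return "not a number";
}
```

Both functions behave the same at runtime, but only the direct version lets the compiler carry the object type through, which is the property that executor parameter inference relies on.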
As a result of this debt I did spend a little more time than I had anticipated cleaning things up, but I think that is fine. The command system is important and is one of the places any new or potential contributor will touch first, so it does need to be in good condition so as not to scare them away.
You can view the library here on GitHub @the-draupnir-project/interface-manager.
Closing
Footnotes:
If you're wondering why we skipped the fourth update (2404.html), let's just say that I burnt out from everything while I was waiting to get the details of the NLnet grant.
And yes, the order of the arguments for the ban command is the opposite way around to Draupnir.
Although this is how the dreaded UNIX argv works and that never stopped anyone apparently.
It's worth mentioning again that it would have been very unlikely that I could convince Element to let me work on the command refactor in the first place. The work is ideological in a way that is opposed to the way command line interfaces work, and the presentation interface is not a command line interface; it's a command-oriented interface, sure. I did make a note of this previously. https://marewolf.me/posts/draupnir/2403.html#fn.22
A presentation schema is just a way of describing which presentation types are valid for a given parameter. The interface-manager has three different types of PresentationSchema: single, union, and top. Single is just a presentation type, union is a union of presentation types, and top can be any presentation type, equivalent to TypeScript's unknown.
Technically, we could have also broken commands into smaller functions and methods that we could then unit test and leave the actual command as only glue. But this would require an equal amount of work and possibly disorganise the code base.
Damn I regret not calling this RoomMemberRemover but ok lol.
The PR where I discovered this can be found here: https://github.com/the-draupnir-project/Draupnir/pull/553