Under the Hood

Why Auto-Typing on Reddit is a Nightmare (And How Chrome's Debugger API Saved My Sanity)

Karma Builder Engineering

When I set out to build Reddit Karma Builder—a Chrome extension that uses AI to generate and auto-type contextual replies into Reddit threads—I thought it would be a weekend project.

Find the text box, fetch the post context, generate a clever AI draft, and inject the text. Easy, right?

Wrong.

What followed was a descent into the absolute madness of modern web frameworks, Shadow DOMs, and a rich-text editor that fought back at every single turn. Here is the story of how a "simple" DOM manipulation task turned into low-level keyboard input simulation.

Roadblock 1: The Shadow DOM Labyrinth

First, I had to find the comment box. If you've inspected Reddit's new UI recently, you know it's a nested doll of Web Components. It isn't just a <textarea>. It's a shreddit-composer, which has a Shadow Root, which contains a shreddit-composer-editor, which has another Shadow Root, which finally houses a div[contenteditable="true"].

Oh, and the "AI Reply" button? If you inject it on the main feed, the extension crashes because there's no comment tree context. I had to strictly gate the injection to /comments/ routes.
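A minimal sketch of that route gate, assuming thread pages are the ones whose path contains /comments/ followed by a post ID. The helper name isCommentsRoute and the placeholder injectReplyButton are illustrative, not the extension's actual code.

```javascript
// Hypothetical helper: true only when the path contains "/comments/<postId>".
// Assumption: post IDs are alphanumeric, as in /r/<sub>/comments/<postId>/<slug>/.
function isCommentsRoute(pathname) {
  return /\/comments\/[a-z0-9]+/i.test(pathname);
}

// Usage in a content script (injectReplyButton stands in for the real injection code):
// if (isCommentsRoute(location.pathname)) injectReplyButton();
```

Because Reddit navigates between pages without full reloads, a check like this would likely need to run again whenever the URL changes, not just once at load.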

But traversing the Shadow DOM basically just took a recursive query selector. Annoying, but standard. The real boss fight was yet to come.
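For reference, a recursive selector of that kind might look like the sketch below. It assumes every shadow root on the path is open (a closed root is not reachable through element.shadowRoot), and the name deepQuerySelector is made up for illustration.

```javascript
// Recursively searches `root` and any open shadow roots beneath it.
// Returns the first match for `selector`, or null if nothing matches.
function deepQuerySelector(root, selector) {
  // Try the current tree first (document or shadow root).
  const direct = root.querySelector(selector);
  if (direct) return direct;

  // Otherwise descend into every element that hosts an open shadow root.
  for (const el of root.querySelectorAll("*")) {
    if (el.shadowRoot) {
      const found = deepQuerySelector(el.shadowRoot, selector);
      if (found) return found;
    }
  }
  return null;
}

// Usage in a content script:
// const editable = deepQuerySelector(document, 'div[contenteditable="true"]');
```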

Roadblock 2: The Lexical Editor Boss Fight

Reddit uses Lexical, Meta's highly extensible text editor framework. Lexical is incredibly powerful, but it rigidly controls its internal state.

My first naive attempt:

element.textContent = "AI generated reply";

Lexical ignored it completely. When you clicked "Reply", the box submitted empty.

Okay, fine. I needed to trigger events. I wrote a script to focus the element and use document.execCommand("insertText").

Here is where the ghost in the machine appeared: The first character typed perfectly. The rest of the string vanished.

Why? Because of Lexical's DOM reconciliation. When you insert a character, Lexical intercepts the beforeinput event, updates its internal editor state (a tree of nodes that is separate from the DOM), and then reconciles the DOM to match, which can replace the nodes that were there a moment ago.

Because the DOM node was thrown away, my browser's Selection (the text cursor) was suddenly pointing into the void. The next execCommand silently failed because there was nowhere to insert it.

I spent hours writing increasingly unhinged workarounds:

  • Yielding to the event loop with a zero-delay timer after every single letter (await new Promise(r => setTimeout(r, 0))) so Lexical could finish reconciling.
  • Writing a function to re-traverse the DOM, find the deepest text node, and forcefully re-place the caret before every keystroke.
  • Firing synthetic InputEvent fallbacks.
  • Simulating full mousedown, pointerup, and focusin events just to wake the editor up.
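To give a sense of the caret workaround from the list above, here is a rough sketch. It walks to the last text node rather than strictly the deepest one, the function names are hypothetical, and it ignores the fact that selections inside shadow roots behave differently across browsers.

```javascript
// Finds the last text node under `node` (text nodes have nodeType 3).
function findLastTextNode(node) {
  if (node.nodeType === 3) return node;
  const children = node.childNodes || [];
  for (let i = children.length - 1; i >= 0; i--) {
    const found = findLastTextNode(children[i]);
    if (found) return found;
  }
  return null;
}

// Collapses the selection to the end of the editable's content.
// `doc` defaults to the page document; pass another one for testing.
function placeCaretAtEnd(editable, doc = document) {
  const range = doc.createRange();
  const textNode = findLastTextNode(editable);
  if (textNode) {
    range.setStart(textNode, textNode.data.length);
    range.collapse(true); // collapse to the start point, i.e. end of the text
  } else {
    range.selectNodeContents(editable);
    range.collapse(false); // empty editor: collapse to the end of its contents
  }
  const selection = doc.getSelection();
  selection.removeAllRanges();
  selection.addRange(range);
}
```

Having to call something like placeCaretAtEnd before every single keystroke is a good illustration of why this approach was so slow and fragile.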

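It was brittle, slow, and randomly confused the comment box UI. Events dispatched from script arrive with isTrusted: false, and the browser does not run its default editing behavior for them, so the editor never treated them as real input. It felt like fighting a framework designed specifically to stop what I was doing.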

The Breakthrough: Dropping to the Hardware Level

I realized front-end JavaScript wasn't going to cut it. I needed to convince Chrome that a human being was sitting at a keyboard, physically hitting keys.

Enter the Chrome DevTools Protocol (CDP).

By adding the "debugger" permission to my manifest.json, I could attach a local debugging session to the user's active tab and send raw CDP commands. Specifically: Input.dispatchKeyEvent.

Instead of fighting the DOM, I built a background worker that:

  1. Attaches the Chrome debugger to the active tab.
  2. Loops through the AI-generated text.
  3. Sends a rawKeyDown, then a char event, then a keyUp event for every single character.
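  4. Detaches the debugger.

// Sending one character through the Chrome DevTools Protocol, the same protocol DevTools itself speaks
await chrome.debugger.sendCommand({ tabId }, "Input.dispatchKeyEvent", {
    type: "char",            // the event that actually inserts the text
    text: char,
    unmodifiedText: char,
    // Approximation: equals the real virtual-key code only for some keys (digits, uppercase ASCII letters)
    windowsVirtualKeyCode: char.charCodeAt(0),
    modifiers: 0
});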
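Put together, the four steps listed above could be wired up roughly like this. This is a sketch under assumptions, not the extension's real worker: typeViaDebugger is an invented name, the debugger API is passed in so the function can be exercised outside Chrome, and the rawKeyDown and keyUp events carry only a key value instead of a full key mapping.

```javascript
// Sketch only: types `text` into a tab through CDP key events.
// `debuggerApi` defaults to chrome.debugger (promise-based in Manifest V3).
async function typeViaDebugger(tabId, text, debuggerApi = globalThis.chrome?.debugger) {
  const target = { tabId };
  const send = (params) =>
    debuggerApi.sendCommand(target, "Input.dispatchKeyEvent", params);

  // 1. Attach, requesting DevTools Protocol version 1.3.
  await debuggerApi.attach(target, "1.3");
  try {
    // 2. Loop over the text one code point at a time.
    for (const char of text) {
      // 3. One rawKeyDown / char / keyUp triple per character.
      await send({ type: "rawKeyDown", key: char });
      await send({ type: "char", text: char, unmodifiedText: char });
      await send({ type: "keyUp", key: char });
    }
  } finally {
    // 4. Always detach, even if a command failed midway.
    await debuggerApi.detach(target);
  }
}
```

Detaching in a finally block matters here because a tab left with a debugger attached stays in that state until something detaches it.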

Because these come directly from the browser engine, they arrive at Reddit's frontend with isTrusted: true. Lexical has absolutely no idea it's a script. It just sees perfect, sequential keystrokes.

To make the typing look natural, I randomized the delay between keystrokes, pausing longer on spaces, commas, and formatting breaks, so the text appears on screen the way it would from a human thinking through a sentence.
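One way such a delay function could look is sketched below. Every number in it is an invented placeholder rather than the extension's actual tuning, and the random source is injectable so the behavior can be checked deterministically.

```javascript
// Hypothetical pause table: [base delay in ms, random jitter range in ms].
const PAUSES = {
  "\n": [250, 250], // formatting breaks: longest pause
  ",": [120, 120],  // commas: medium pause
  " ": [80, 80],    // spaces: short pause
};
const DEFAULT_PAUSE = [40, 60]; // every other character

// Returns how long to wait after typing `char`.
// `rng` must return a number in [0, 1), like Math.random.
function typingDelayMs(char, rng = Math.random) {
  const [base, jitter] = PAUSES[char] ?? DEFAULT_PAUSE;
  return base + Math.floor(rng() * jitter);
}

// Usage inside a typing loop:
// await new Promise((resolve) => setTimeout(resolve, typingDelayMs(char)));
```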

The UX Polish: Don't Leave The User Hanging

Once the typing was solved, I ran into a UX issue. To generate a high-quality context-aware reply, the extension silently downloads the thread's .json data to parse the conversation tree.

On massive threads (like a 5,000-comment sticky post), this network request and parsing step took 2-3 seconds. Users were clicking "AI Reply" and sitting in silence, thinking the extension was broken.
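The fetch-and-parse step can be sketched as follows. It assumes the thread's .json response is a two-element array (a post listing, then a comment listing) in which each comment has kind "t1" and a replies field that is either an empty string or a nested listing. The function names are illustrative, and "load more" placeholders are simply skipped.

```javascript
// Builds the .json URL for a thread page URL (query string and hash are dropped).
function threadJsonUrl(pageUrl) {
  const url = new URL(pageUrl);
  const path = url.pathname.replace(/\/+$/, ""); // strip trailing slashes
  return `${url.origin}${path}.json`;
}

// Flattens a comment listing into [{ author, body, depth }], parents before children.
function flattenComments(listing, depth = 0, out = []) {
  const children = listing?.data?.children ?? [];
  for (const child of children) {
    if (child.kind !== "t1") continue; // skip non-comment entries such as "more"
    out.push({ author: child.data.author, body: child.data.body, depth });
    // Assumption: `replies` is "" when a comment has no replies.
    if (child.data.replies) flattenComments(child.data.replies, depth + 1, out);
  }
  return out;
}

// Downloads and parses a thread. `fetchImpl` is injectable for testing.
async function fetchThread(pageUrl, fetchImpl = fetch) {
  const res = await fetchImpl(threadJsonUrl(pageUrl));
  if (!res.ok) throw new Error(`Thread fetch failed: ${res.status}`);
  const [postListing, commentListing] = await res.json();
  return {
    post: postListing.data.children[0].data,
    comments: flattenComments(commentListing),
  };
}
```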

Instead of making it faster (you can't speed up Reddit's API), I made it feel faster. I built an immediate, snappy Shadow DOM overlay that throws up a stepper:

Checking authentication... 15ms
Fetching post data... Downloading post & comments...
Parsing comments...

By giving users a visual anchor and elapsed time counters, a 3-second wait went from "is this thing broken?" to "oh wow, this is doing heavy lifting."
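The bookkeeping behind a stepper like this can be quite small. Below is a hedged sketch of just the timing logic, with the Shadow DOM rendering left out. The class name ProgressStepper and its methods are invented for illustration, and the clock is injectable so elapsed times can be tested.

```javascript
// Tracks named steps and how long each one took (or has taken so far).
class ProgressStepper {
  constructor(now = () => Date.now()) {
    this.now = now;
    this.steps = [];
  }

  // Finishes the running step, if any, and starts a new one.
  start(label) {
    this.finish();
    this.steps.push({ label, startedAt: this.now(), endedAt: null });
  }

  // Marks the running step as done.
  finish() {
    const current = this.steps[this.steps.length - 1];
    if (current && current.endedAt === null) current.endedAt = this.now();
  }

  // One display line per step; a running step shows its elapsed time so far.
  lines() {
    return this.steps.map((step) => {
      const elapsed = (step.endedAt ?? this.now()) - step.startedAt;
      return `${step.label}... ${elapsed}ms`;
    });
  }
}
```

An overlay would call start() as each phase begins and re-render lines() on a short timer so the counter for the running step keeps ticking.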

Conclusion

Building browser extensions in 2026 is wild. You aren't just writing JavaScript; you are writing adapters to inject yourself into Single Page Applications that actively reject external interference.

But cracking the puzzle—watching the Chrome Debugger seamlessly puppet a Shadow-DOM-protected Lexical editor exactly like a ghost at the keyboard—is one of the most satisfying "Aha!" moments I've had as a dev in a long time.

Sometimes, you just have to stop knocking on the DOM's front door and learn how to use the DevTools backdoor.

Want to see it in action?

Get Karma Builder and experience the hardware-level automation yourself.