Why Canvas Breaks Your Screen Recorder (And What to Do Instead)


If you’re building a screen recorder with a webcam overlay — the picture-in-picture webcam bubble that Loom popularized — you’ll probably reach for canvas first. It’s the obvious approach: draw both video streams onto a canvas, record the canvas output.

It works great in testing. Then your users switch to another tab while recording, and the video turns into a slideshow.

Here’s why that happens and what we did instead.

The obvious approach: canvas compositing

The standard way to combine two video streams in the browser looks something like this:

const canvas = document.createElement('canvas');
const ctx = canvas.getContext('2d');

function draw() {
  // Screen fills the canvas; the webcam goes on top as the PiP bubble
  ctx.drawImage(screenVideo, 0, 0, canvas.width, canvas.height);
  ctx.drawImage(webcamVideo, x, y, pipWidth, pipHeight);
  requestAnimationFrame(draw);
}
draw();

const stream = canvas.captureStream(30);
const recorder = new MediaRecorder(stream);
recorder.start();

You grab the screen via getDisplayMedia(), the webcam via getUserMedia(), draw both onto a canvas every frame, and record the canvas output with captureStream(). Clean, elegant, runs entirely in the browser.
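For context, the two inputs come from the standard capture APIs. A minimal sketch (the constraints are illustrative, not SendRec's exact settings; both calls must run inside an async function):

// Screen: the browser prompts the user to pick a screen, window, or tab
const screenStream = await navigator.mediaDevices.getDisplayMedia({
  video: true,
  audio: true, // tab/system audio, where the browser supports it
});

// Webcam: a separate camera stream for the PiP bubble
const webcamStream = await navigator.mediaDevices.getUserMedia({
  video: { width: 1280, height: 720 },
  audio: false,
});

The screenVideo and webcamVideo elements in the draw loop above are <video> elements with these streams assigned to srcObject.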

It also breaks the moment your user does what screen recordings are for — switching to the app they’re demonstrating.

Why it breaks: background tab throttling

Browsers aggressively throttle background tabs to save CPU and battery. In a hidden tab, Chrome throttles requestAnimationFrame callbacks to roughly one per second at best, and often pauses them entirely. Firefox does something similar. Safari is even more aggressive.

When the user clicks away from your recorder tab to demonstrate their app, the draw loop that was running at 30fps drops to 1fps. The canvas stops updating. The MediaRecorder keeps recording, but it’s recording a nearly static canvas. The result: a video where the webcam overlay freezes for minutes at a time, occasionally flickering to life for a single frame.

This isn’t a bug you can work around with clever scheduling. It’s a deliberate browser optimization. requestAnimationFrame is designed for visual updates, and there’s nothing visual to update in a hidden tab. The browser is doing the right thing — it just happens to destroy your compositing pipeline.

You might think setInterval would help. It doesn’t. Chrome throttles setInterval in background tabs to once per second too. setTimeout gets the same treatment. There’s no reliable way to run a high-frequency draw loop in a background tab across browsers.
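You can watch the throttling happen by logging the gap between requestAnimationFrame callbacks and switching tabs. A minimal sketch:

let last = performance.now();

function tick(now) {
  const gap = now - last;
  last = now;
  // Foreground: ~16ms gaps at 60Hz. Hide the tab and the callbacks
  // slow to a crawl or stop, so the next gap you see is huge.
  if (gap > 100) {
    console.log(`frame gap: ${Math.round(gap)}ms, visibility: ${document.visibilityState}`);
  }
  requestAnimationFrame(tick);
}
requestAnimationFrame(tick);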

Other dead ends

OffscreenCanvas with Web Workers. OffscreenCanvas runs in a worker thread that isn’t subject to background tab throttling. Problem: there’s no captureStream() on OffscreenCanvas. You can draw frames, but you can’t record them. As of early 2026, no browser supports this combination.

fix-webm-duration and similar libraries. MediaRecorder-produced WebM files sometimes have incorrect duration metadata. Libraries exist to fix this by rewriting the WebM container. We tried this and found it corrupted the container structure — playback broke in some browsers. WebM’s container format (based on Matroska) is finicky about post-processing.

setTimeout with aggressive intervals. Even a 16ms setTimeout gets throttled to 1000ms in background tabs. You can’t beat the browser’s power management.

The solution: separate recordings, server-side compositing

Instead of combining streams in the browser, we record them separately and composite on the server.

The browser runs two independent MediaRecorder instances — one for the screen, one for the webcam:

// Screen recorder — records getDisplayMedia() output directly
const screenRecorder = new MediaRecorder(screenStream, {
  mimeType: "video/webm;codecs=vp9,opus",
});

// Webcam recorder — separate getUserMedia() stream
const webcamRecorder = new MediaRecorder(webcamStream, {
  mimeType: "video/webm;codecs=vp9",
});

// Start both
screenRecorder.start(1000);
webcamRecorder.start(1000);

Neither recorder uses canvas. They’re recording their respective MediaStream objects directly. MediaRecorder doesn’t care about tab visibility — it records whatever the stream produces, regardless of whether the tab is in the foreground. The screen stream keeps producing frames because getDisplayMedia() operates at the OS level, not the tab level.
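The start(1000) timeslice makes each recorder emit a chunk roughly every second through its dataavailable event. Collecting those chunks is the usual pattern; the array names below are the ones the later snippets assume:

const screenChunks = [];
const webcamChunks = [];

screenRecorder.ondataavailable = (event) => {
  if (event.data.size > 0) screenChunks.push(event.data);
};
webcamRecorder.ondataavailable = (event) => {
  if (event.data.size > 0) webcamChunks.push(event.data);
};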

When recording stops, the client uploads both files to S3. The server downloads them and composites with ffmpeg:

ffmpeg -i screen.webm -i webcam.webm \
  -filter_complex "[1:v]scale=240:-1,pad=iw+8:ih+8:(ow-iw)/2:(oh-ih)/2:color=black@0.3[pip];[0:v][pip]overlay=W-w-20:H-h-20" \
  -map 0:a? -c:a copy -c:v libvpx-vp9 -y output.webm

The filter chain does three things:

  1. Scales the webcam to 240px wide (maintaining aspect ratio)
  2. Adds a subtle semi-transparent border via pad
  3. Overlays the result in the bottom-right corner with 20px padding

The screen audio track is copied through unchanged. The video gets re-encoded to VP9. The webcam audio is discarded (screen recording already captures system audio).

After compositing, the server replaces the original screen recording with the composited version and deletes the webcam file. A video that was “processing” becomes “ready,” and the thumbnail is generated from the composited output.

Syncing pause and resume

When you have two recorders, pause and resume need to stay synchronized. If the screen pauses but the webcam keeps recording, the streams drift out of sync and the overlay timing is wrong in the final video.

function pauseRecording() {
  screenRecorder.pause();
  webcamRecorder.pause();
}

function resumeRecording() {
  screenRecorder.resume();
  webcamRecorder.resume();
}

MediaRecorder.pause() and resume() are synchronous calls — they take effect immediately on the stream. Both recorders pause in the same event loop tick, which keeps them in sync. In practice, we haven’t seen drift issues even with multiple pause/resume cycles.

Stop needs care too. If the screen recorder fires onstop before the webcam recorder, you need to wait for both:

const webcamBlobPromise = new Promise((resolve) => {
  webcamRecorder.onstop = () => {
    resolve(new Blob(webcamChunks, { type: "video/webm" }));
  };
});

screenRecorder.onstop = async () => {
  const screenBlob = new Blob(screenChunks, { type: "video/webm" });
  const webcamBlob = await webcamBlobPromise;
  onRecordingComplete(screenBlob, duration, webcamBlob);
};
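The duration passed to onRecordingComplete is not something MediaRecorder reports. One way to track it (a hypothetical helper, not necessarily how SendRec does it) is to accumulate elapsed time across pause and resume so paused time is excluded:

let recordedMs = 0;      // accumulated recording time, excluding pauses
let segmentStart = null; // start of the current un-paused segment

function markStart()  { recordedMs = 0; segmentStart = performance.now(); }
function markPause()  { recordedMs += performance.now() - segmentStart; segmentStart = null; }
function markResume() { segmentStart = performance.now(); }
function markStop() {
  if (segmentStart !== null) recordedMs += performance.now() - segmentStart;
  segmentStart = null;
  return recordedMs / 1000; // duration in seconds
}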

A React gotcha: callback refs

If you’re showing a webcam preview during recording with React, there’s a subtle bug waiting for you. When the component transitions from “idle” to “recording” state, React may remount the element. If you’re using a plain useRef, the new element doesn’t get the stream attached:

// This breaks — ref.current points to the old element after remount
const videoRef = useRef(null);

The fix is a callback ref that reattaches the stream whenever the element mounts:

const webcamVideoRef = useCallback((node) => {
  if (node && webcamStream.current) {
    node.srcObject = webcamStream.current;
  }
}, []);

This guarantees the preview works regardless of when React decides to mount, unmount, or remount the video element.
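Wiring it up looks the same as any other ref. A minimal usage sketch:

<video ref={webcamVideoRef} autoPlay muted playsInline />

muted keeps the preview silent, and playsInline stops iOS Safari from forcing the preview into fullscreen.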

Graceful fallback

Server-side compositing introduces a failure point. ffmpeg might crash, the server might run out of disk space, the download from S3 might fail. Any of these would leave the user with no video at all if we weren’t careful.

Our compositing function wraps every step with a fallback: if anything fails, set the video status to “ready” with just the screen recording. The user gets their screen recording without the webcam overlay, which is better than losing the video entirely. A screen-only recording is still useful. A failed upload is not.

setReadyFallback := func() {
    // Postgres string literals use single quotes; double quotes would be
    // parsed as an identifier and the update would fail.
    db.Exec(ctx,
        `UPDATE videos SET status = 'ready', webcam_key = NULL WHERE id = $1`,
        videoID,
    )
}

Every error path calls this fallback before returning. The webcam key is cleared so the system doesn’t try to composite again.

How it works in practice

On our Hetzner CX33 server (4 vCPU, 8GB RAM, €16/month), compositing takes about 40 seconds for a short recording. The video is available for viewing immediately in screen-only mode while compositing runs in the background. When it finishes, the composited version with the webcam overlay replaces the original.

The tradeoff is clear: canvas compositing is instant but unreliable. Server-side compositing adds latency but works every time, regardless of what the user does with their browser tabs.

For a screen recorder, reliability wins. The whole point is to record while doing something else.

Try it

SendRec is open source (AGPL-3.0) and self-hostable. The webcam overlay feature is live at app.sendrec.eu. The compositing code is in composite.go if you want to see the full implementation.


