What it is
MIT-licensed UPenn GitHub repo that uses headless Chrome and ffmpeg to render steady-speed scrolling (and cue-driven) webpage recordings into H.264 MP4 files.
Gabriel’s notes
Quick take: A small, practical “make the browser do a thing, then turn it into video” repo—built to work as a Codex skill, but also usable as a plain Node CLI. If you’ve ever wanted a clean scroll-capture MP4 without janky real-time screen recording, this is the vibe.
upenn/web-scroll-video is a University of Pennsylvania–published GitHub repository (“Web Scroller Tool”) that opens a URL in headless Chrome, captures viewport screenshots at fixed scroll offsets, and pipes frames into ffmpeg to produce an H.264 MP4. Defaults are 1920×1080 at 30 fps, and it supports a cue-sheet workflow for pauses, clicks, typing, zooms, highlights, and optional cursor rendering.
I bookmarked this because it’s the rare automation repo that’s both (a) boring-in-a-good-way and (b) immediately useful. It’s essentially a deterministic “scroll narrative renderer” for the web—great for product demos, documentation clips, portfolio reels, and internal walkthroughs.
My original note, cleaned up: This repo contains an AI-adjacent “skill” that lets tools like OpenAI’s Codex install and run a local pipeline to capture an MP4 video of a web page that scrolls at a steady speed. Under the hood, it opens a URL in headless Chrome, captures viewport screenshots at fixed scroll offsets, and streams frames into ffmpeg. The documented default output is 1080p (1920×1080), 30 fps, H.264 MP4. I haven’t used it yet, but it’s on my “I will absolutely tinker with this” list. It’s also a UPenn repo, which is reassuring—while still firmly in the “trust, but verify” category that all GitHub code lives in.
I saved this under Dev & code because it’s basically a programmable video renderer built on standard developer tooling (Node + headless Chrome + ffmpeg), not a SaaS screen recorder.
Good fit if you want to:
- Create consistent scroll-through MP4s of web pages (no real-time screen-recording hiccups).
- Generate “producer-style” sequences (pause, click, type, zoom, highlight) via simple cue sheets.
- Produce mobile-shaped or custom viewport videos by setting width/height explicitly.
- Integrate a repeatable capture workflow into an agent runbook (Codex skill) or a CI-ish script.
- Debug failures with artifacts like error screenshots/reports (when a step can’t find a target).
Pricing snapshot (auto-enriched):
The repository is MIT-licensed and free to use/modify/distribute. Your real costs are runtime dependencies (Chrome/Chromium/Edge, ffmpeg, Node 22+) and whatever compute you run it on. If you use it through Codex, Codex-related costs depend on your OpenAI account/plan and model usage: Unknown / not confirmed.
Work-use / compliance snapshot (auto-enriched):
This tool programmatically loads webpages and records what’s rendered. That means: treat it like automated browsing plus video capture. Ensure you have the right to capture/redistribute the content you’re recording (copyright, site ToS, client permissions, and any login-gated material). Also assume anything visible in the viewport (PII, tokens, internal URLs) could end up in the MP4. The repo notes it launches headless Chrome with a temporary profile and DevTools bound to 127.0.0.1; in locked-down environments you may need additional permissions or sandboxing. Any formal compliance guarantees: Unknown / not confirmed.
Alternatives (auto-enriched):
- WebReel: A tool focused on recording scripted browser demos to MP4/GIF/WebM; it auto-downloads Chrome and ffmpeg, which can be convenient if you want a more “productized” workflow than a raw repo.
- heygen-com/hyperframes: A repo aimed at rendering video from HTML (agent-friendly) rather than capturing arbitrary live sites; better if you control the scene, less direct if you need “record this existing webpage as-is.”
Before you adopt it:
- Run it first against a “boring” public page and confirm output metadata (fps, resolution, codec) matches your needs.
- Use a dedicated Chrome profile / clean environment for captures—especially if you’re touching authenticated pages.
- Decide whether you want the one-shot scroll mode (simple) or cue sheets (powerful, but you’ll iterate).
Sources:
- https://github.com/upenn/web-scroll-video
- https://github.com/upenn/web-scroll-video/blob/main/SKILL.md
- https://github.com/upenn/web-scroll-video/blob/main/LICENSE
- https://help.openai.com/en/articles/11096431-openai-codex-cli-getting-started
- https://webreel.dev/