Distiller

Web extraction for AI agents

Turn a URL into clean content for agents.

Distiller turns messy webpages into clean Markdown, HTML, and text so AI agents can read, summarize, and act on web content without fighting page clutter.

What it does

Turns webpages into clean, structured content that is easier for AI systems to work with.

Why it matters

Most webpages are built for browsers, ads, and layouts. Distiller gives you the useful content instead of the noise.

Why teams choose it

It combines extraction, simple public access, and agent-friendly usage patterns in one place.

Live Demo

Try Distiller on a real page.

Start free with Trafilatura cleaning. Upgrade to paid AI cleaning for higher-quality normalization and priority extraction.

Markdown output

A clean starting point for summaries, extraction, and downstream agent tasks.

Get a free API key to start building.

The paid AI tier is available through Stripe checkout after signup.

Raw page source is rarely what an agent needs.

When an agent reads a page, it usually wants the article, product details, pricing, or help content, not every script, widget, and layout fragment around it.

  • Markdown for the best default reading format.
  • HTML when document structure matters.
  • Text when you want the leanest prompt input.
  • Browser rendering for pages that depend on JavaScript.

Built for practical use, not just demos.

Use Distiller to power research agents, OpenClaw tools, internal automations, or public pay-per-use APIs. The payment rail is there when you need it, but the core value is clean, reliable web content.

OpenClaw setup should feel simple.

web-distiller https://example.com --format markdown

Why Distiller

Better inputs mean better outputs.

Your agent gets better answers from clean Markdown than from raw HTML. Distiller handles the extraction step your agent would otherwise have to do itself, and it caches results across users so repeat requests are faster.
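One way to picture "cached across users" is a cache keyed by a normalized URL, so the first request pays for extraction and later requests reuse the result. The sketch below is illustrative only: `normalize_url`, `SharedCache`, and the `extract` callable are hypothetical names, and Distiller's actual normalization rules and cache lifetime are not documented here.

```python
from typing import Callable, Dict
from urllib.parse import urlsplit, urlunsplit


def normalize_url(url: str) -> str:
    """Lowercase scheme and host and drop the fragment so equivalent URLs share a key."""
    parts = urlsplit(url)
    path = parts.path or "/"
    return urlunsplit((parts.scheme.lower(), parts.netloc.lower(), path, parts.query, ""))


class SharedCache:
    """Hypothetical URL-keyed cache shared by every caller."""

    def __init__(self, extract: Callable[[str], str]) -> None:
        self._extract = extract            # stand-in for the real extraction step
        self._store: Dict[str, str] = {}
        self.misses = 0                    # counts how often extraction actually ran

    def get(self, url: str) -> str:
        key = normalize_url(url)
        if key not in self._store:
            self.misses += 1
            self._store[key] = self._extract(key)
        return self._store[key]
```

A real service would also need an expiry policy so that cached pages do not go stale; that is left out of this sketch.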

Pricing

Start free and upgrade when you need AI-quality cleaning. Savings estimates assume about 50,000 raw tokens versus about 12,000 cleaned tokens per page, priced at GPT-4.1 input rates.
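The savings estimate can be worked through as simple arithmetic. The token counts below come from the assumption stated above; the rate of $2.00 per million input tokens is an assumed placeholder for GPT-4.1 input pricing and should be checked against current pricing before relying on the dollar figures.

```python
def savings_per_page(
    raw_tokens: int = 50_000,              # from the pricing note
    cleaned_tokens: int = 12_000,          # from the pricing note
    usd_per_million_tokens: float = 2.00,  # ASSUMED rate, not taken from the page
) -> dict:
    """Estimate per-page token and cost savings from sending cleaned content."""
    raw_cost = raw_tokens / 1_000_000 * usd_per_million_tokens
    cleaned_cost = cleaned_tokens / 1_000_000 * usd_per_million_tokens
    return {
        "token_reduction": 1 - cleaned_tokens / raw_tokens,  # fraction of tokens removed
        "raw_cost_usd": raw_cost,
        "cleaned_cost_usd": cleaned_cost,
        "saved_usd": raw_cost - cleaned_cost,
    }


if __name__ == "__main__":
    result = savings_per_page()
    print(f"Token reduction: {result['token_reduction']:.0%}")
    print(f"Saved per page:  ${result['saved_usd']:.3f}")
```

With these inputs the reduction is 76 percent of input tokens, in line with the "~75% fewer tokens" figure, and the saving is about $0.076 per page at the assumed rate.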

Overage: $0.006 per call beyond the plan limit. For 100K+ volumes, contact us.
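As a rough illustration of how overage adds up, the sketch below applies the $0.006 per-call rate to calls beyond a plan limit. The plan limit of 10,000 calls in the usage example is hypothetical, since plan sizes are not listed here, and `overage_charge` is an illustrative helper, not part of any Distiller client.

```python
from decimal import Decimal

OVERAGE_USD_PER_CALL = Decimal("0.006")  # rate from the pricing note


def overage_charge(calls_made: int, plan_limit: int) -> Decimal:
    """Return the overage charge in USD for calls beyond the plan limit."""
    extra_calls = max(0, calls_made - plan_limit)
    return OVERAGE_USD_PER_CALL * extra_calls


if __name__ == "__main__":
    # Hypothetical plan of 10,000 calls with 10,500 calls made: 500 extra calls.
    print(overage_charge(10_500, 10_000))
```

`Decimal` is used so that fractions of a cent accumulate exactly instead of drifting with floating-point rounding.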

Cached: Same URL served instantly to all users
~75% fewer tokens: Clean Markdown vs raw HTML
One API call: Markdown + text + metadata in one request

What is Distiller?

Distiller is a web extraction API that turns public webpages into clean Markdown, HTML, and text for AI agents and automation systems.

When should teams use Distiller?

Teams should use Distiller when they need cleaner web inputs for agents, retrieval workflows, research automation, or OpenClaw tools instead of raw page source.

Can Distiller handle JavaScript-heavy pages?

Yes. Distiller starts with lightweight HTTP extraction and can fall back to browser rendering when the useful content depends on JavaScript.
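The fallback described above can be sketched as a two-step strategy: try the cheap HTTP path first and switch to browser rendering only if too little content comes back. Everything in this sketch is an assumption for illustration: the `fetch_static` and `fetch_rendered` callables, the `min_chars` threshold, and the length-based heuristic are not Distiller's documented behavior.

```python
from typing import Callable, Tuple


def looks_empty(text: str, min_chars: int = 200) -> bool:
    """Assumed heuristic: very short extracted text suggests content is rendered by JavaScript."""
    return len(text.strip()) < min_chars


def extract_with_fallback(
    url: str,
    fetch_static: Callable[[str], str],    # hypothetical lightweight HTTP extractor
    fetch_rendered: Callable[[str], str],  # hypothetical headless-browser extractor
    min_chars: int = 200,
) -> Tuple[str, str]:
    """Return (content, method), where method is 'http' or 'browser'."""
    text = fetch_static(url)
    if not looks_empty(text, min_chars):
        return text, "http"
    # Static extraction produced too little content, so pay for a full render.
    return fetch_rendered(url), "browser"
```

Passing the two fetchers in as arguments keeps the decision logic testable without a network connection or a browser.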

Frequently asked questions

When should I use Markdown instead of cleaned HTML?

Use Markdown for LLM reading and summarization. Use cleaned HTML when links, headings, and structure matter to the task.