pure.md - global cache between LLMs and the web

                                               __     
                                              /\ \    
 _____   __  __  _ __    __        ___ ___    \_\ \   
/\ '__`\/\ \/\ \/\`'__\/'__`\    /' __` __`\  /'_` \  
\ \ \L\ \ \ \_\ \ \ \//\  __/  __/\ \/\ \/\ \/\ \L\ \ 
 \ \ ,__/\ \____/\ \_\\ \____\/\_\ \_\ \_\ \_\ \___,_\
  \ \ \/  \/___/  \/_/ \/____/\/_/\/_/\/_/\/_/\/__,_ /
   \ \_\                                              
    \/_/

pure.md is a REST API that lets AI agents and developers reliably access and cache web content. With pure.md, you can:

Avoid bot detection by mimicking real user behavior
Render JavaScript-heavy websites, PDFs, images, and files
Scrape web pages into markdown optimized for an LLM
Crawl search engines for up-to-date knowledge
Extract JSON from web pages using natural language

▶ ▶ ▶ Prefix any URL with `pure.md/` ◀ ◀ ◀
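
As a rough illustration of the prefix rule, here is a minimal Python sketch. The helper names `to_pure_url` and `fetch_markdown` are hypothetical, and dropping the scheme from the target URL is an assumption based on the `https://pure.md/reuters.com` example further down this page. The `User-Agent` value is also made up.

```python
from urllib.parse import urlsplit
from urllib.request import Request, urlopen

PURE_MD = "https://pure.md/"


def to_pure_url(url: str) -> str:
    """Prefix a target URL with pure.md/, dropping the scheme if present.

    Assumption: the target is written without its scheme, as in the
    https://pure.md/reuters.com example on this page.
    """
    # urlsplit only recognizes the host when a scheme or "//" is present.
    parts = urlsplit(url if "://" in url else "//" + url)
    target = parts.netloc + parts.path
    if parts.query:
        target += "?" + parts.query
    return PURE_MD + target


def fetch_markdown(url: str, timeout: float = 30.0) -> str:
    """GET the page through pure.md and return the response body as text."""
    request = Request(
        to_pure_url(url),
        headers={"User-Agent": "pure-md-example/0.1"},  # hypothetical value
    )
    with urlopen(request, timeout=timeout) as response:
        return response.read().decode("utf-8", errors="replace")
```

Calling `fetch_markdown("https://example.com")` would then request `https://pure.md/example.com`.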

Send HTTP requests like a human

Avoid getting flagged as a bot. Our proxy mimics real browser fingerprints and rotates egress IP addresses on every request. If a site can't be reached, we seamlessly fall back to fetching responses from Common Crawl and Internet Archive datasets.



Request

   │
   ╰─────▶ Regional cache ─────╮
   │                           │
   │                           ▼
   ╰───▶ Datacenter proxies ───╮
   │                           │
   │                           ▼
   ╰──▶ Residential proxies ───╮
   │                           │
   │                           ▼
   ╰──────▶ Common Crawl ──────╮
   │                           │
   │                           ▼
   ╰─────▶ Wayback Machine ────╮
                               │
                               ▼

                           Response
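
The diagram above describes an ordered fallback chain. The following is a minimal sketch of that pattern, not our actual implementation: the fetchers are stand-ins that simulate a cache miss, a blocked proxy, and an archived copy, and every name in it is hypothetical.

```python
from typing import Callable, Optional, Sequence, Tuple

# A fetcher returns the page body, returns None on a miss, or raises on failure.
Fetcher = Callable[[str], Optional[str]]


def fetch_with_fallback(
    url: str, sources: Sequence[Tuple[str, Fetcher]]
) -> Tuple[str, str]:
    """Try each source in order; return (source_name, body) from the first hit."""
    errors = []
    for name, fetch in sources:
        try:
            body = fetch(url)
        except Exception as exc:  # a failing tier should not stop the chain
            errors.append(f"{name}: {exc}")
            continue
        if body is not None:
            return name, body
        errors.append(f"{name}: no content")
    raise LookupError(f"all sources failed for {url}: {'; '.join(errors)}")


# Stand-in fetchers for demonstration only (no network access).
def cache_miss(url: str) -> Optional[str]:
    return None


def blocked(url: str) -> Optional[str]:
    raise ConnectionError("blocked")


def archived(url: str) -> Optional[str]:
    return f"# Archived copy of {url}"


SOURCES = [
    ("regional cache", cache_miss),
    ("datacenter proxies", blocked),
    ("residential proxies", blocked),
    ("common crawl", archived),
    ("wayback machine", archived),
]
```

With these stand-ins, `fetch_with_fallback("example.com", SOURCES)` skips the first three tiers and returns the Common Crawl result.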

Headless content rendering

Single-page applications (SPAs, such as those built with React) require JavaScript to render the content of each page — a process known as DOM hydration. A direct curl or fetch of these websites will just leave you with an empty shell of HTML.

Fetching through pure.md, on the other hand, hydrates the DOM of SPAs in the background so that pages render completely.

Similarly, PDFs are parsed into pure markdown automatically. Images are run through AI models for object detection and summarization. Excel and Numbers spreadsheets can also be converted into markdown.

Markdown written for LLMs

Cut your inference costs and speed up your agents' workflows. Powered by HTMLRewriter, our URL-to-markdown service is optimized for low latency and low token output. We strip fluff from web pages and add page metadata as frontmatter, so that LLMs get the most context in the fewest characters possible.
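
If you want to separate the frontmatter from the page body before building a prompt, a small splitter is enough. This sketch assumes the frontmatter is a `---`-delimited block of simple `key: value` lines at the top of the response, which is the common convention but is not spelled out on this page. The function name and the field names in the sample are hypothetical.

```python
from typing import Dict, Tuple


def split_frontmatter(markdown: str) -> Tuple[Dict[str, str], str]:
    """Return (metadata, body). Metadata is empty if no frontmatter is found.

    Assumption: frontmatter is delimited by '---' lines and holds flat
    'key: value' pairs. Nested YAML is not handled here.
    """
    lines = markdown.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}, markdown
    for end, line in enumerate(lines[1:], start=1):
        if line.strip() == "---":
            break
    else:
        return {}, markdown  # unterminated block: treat everything as body
    metadata = {}
    for line in lines[1:end]:
        key, sep, value = line.partition(":")  # split on the first colon only
        if sep:
            metadata[key.strip()] = value.strip()
    body = "\n".join(lines[end + 1:]).lstrip("\n")
    return metadata, body


# Hypothetical sample response; real field names may differ.
SAMPLE = "---\ntitle: Example Domain\nurl: https://example.com\n---\n\n# Example Domain\n"
```

Keeping the metadata separate lets you decide per prompt whether the title and URL are worth the extra tokens.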



           |
 r.jina.ai |░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░  143K tokens
           |
           |
tavily.com |░░░░░░░░░░░░░░░ 55K tokens
           |
           |
   pure.md |████████ 28K tokens ✅
           |_______̩_______̩_______̩_______̩_______̩_______̩_______̩___

                 25K           75K           125K          175K

Input tokens from the Wikipedia article on Artificial Intelligence



           |
 r.jina.ai | ❌ Verifying you are human...
           |
           |
tavily.com | ❌ Failed to fetch content
           |
           |
   pure.md |█████ 22K tokens ✅
           |_______̩_______̩_______̩_______̩_______̩_______̩_______̩___

                 25K           75K           125K          175K

Input tokens from a science.org article

Knowledge in real time

Make your AI apps aware of recent events. With our built-in search engine result page (SERP) crawling, you can turn user queries into a concatenated markdown string of answers that you can feed directly into your prompts. Your model will think it was trained yesterday.

Inference when you need it

Extract data from any page or search simply by changing from GET to POST. pure.md offers a selection of generative AI models for extracting structured or unstructured data from web pages. Stream back responses in markdown for tasks like summarization, or generate JSON that conforms to a custom schema.

POST https://pure.md/reuters.com

{
  "prompt": "What are the top 5 headlines from today?",
  "model": "meta/llama-3.1-8b",
  "schema": {
    "type": "object",
    "properties": {
      "headlines": {
        "type": "array",
        "items": {"type": "string"}
      }
    },
    "required": ["headlines"]
  }
}
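
The same request can be sketched in Python with only the standard library. The body fields (`prompt`, `model`, `schema`) come from the example above. The function names are hypothetical, and the optional `Authorization: Bearer` header is an assumption, since this page does not describe how requests are authenticated.

```python
import json
from typing import Optional
from urllib.request import Request, urlopen


def build_extraction_request(
    target: str,
    prompt: str,
    model: str,
    schema: Optional[dict] = None,
    api_token: Optional[str] = None,
) -> Request:
    """Build a POST request to pure.md/<target> with a JSON extraction body."""
    payload = {"prompt": prompt, "model": model}
    if schema is not None:
        payload["schema"] = schema  # omit for free-form text answers
    headers = {"Content-Type": "application/json"}
    if api_token:
        # Assumption: bearer-token auth. Not confirmed by this page.
        headers["Authorization"] = f"Bearer {api_token}"
    return Request(
        f"https://pure.md/{target}",
        data=json.dumps(payload).encode("utf-8"),
        headers=headers,
        method="POST",
    )


def send(request: Request, timeout: float = 60.0) -> str:
    """Send the request and return the response body as text."""
    with urlopen(request, timeout=timeout) as response:
        return response.read().decode("utf-8", errors="replace")


HEADLINES_SCHEMA = {
    "type": "object",
    "properties": {"headlines": {"type": "array", "items": {"type": "string"}}},
    "required": ["headlines"],
}
```

For example, `send(build_extraction_request("reuters.com", "What are the top 5 headlines from today?", "meta/llama-3.1-8b", HEADLINES_SCHEMA))` would issue the request shown above.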

Pricing as a feature

Simple, easy-to-understand pricing for projects of any size. All plans are available for commercial use. Pay for what you need, and cancel at any time.

Starter

Pay as you go

60 requests/minute
$0.003/fetch
$0.005/search
No GenAI extraction
Email support
$1 free credit

Growth

$19/mo + metered usage

600 requests/minute
$0.002/fetch
$0.003/search
GenAI extraction
Slack/email support
$20/mo free credit

Business

$99/mo + metered usage

3000 requests/minute
$0.001/fetch
$0.002/search
GenAI extraction
Slack/email support
$100/mo free credit

Do I need a credit card to sign up?

No, you can sign up without entering a credit card. However, a strict rate limit applies until you have an active subscription.

How do the free credits work?

Each month, you pay a flat fee up front (except on the Starter plan, which is $0/mo). Over the course of the month, your usage is deducted from your credit allotment. Once your credits are used up, any further usage is billed in your next payment period. Unused credits do not roll over to the next month.
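
To make the arithmetic concrete, here is a small sketch that estimates one month's bill from the plan figures listed above. It is an illustration, not our billing code: the function and field names are hypothetical, GenAI extraction token costs are left out, and the Starter plan's "$1 free credit" is treated as if it applied to the month being computed, which is an assumption.

```python
from decimal import Decimal
from typing import Dict

# Figures copied from the plan list on this page (USD).
PLANS = {
    "starter": {"base": Decimal("0"), "credit": Decimal("1"),
                "fetch": Decimal("0.003"), "search": Decimal("0.005")},
    "growth": {"base": Decimal("19"), "credit": Decimal("20"),
               "fetch": Decimal("0.002"), "search": Decimal("0.003")},
    "business": {"base": Decimal("99"), "credit": Decimal("100"),
                 "fetch": Decimal("0.001"), "search": Decimal("0.002")},
}


def monthly_cost(plan: str, fetches: int, searches: int) -> Dict[str, Decimal]:
    """Estimate flat fee plus metered overage for one month of usage."""
    p = PLANS[plan]
    usage = p["fetch"] * fetches + p["search"] * searches
    # Usage draws down the credit first; only the remainder is billed.
    overage = max(Decimal("0"), usage - p["credit"])
    return {"usage": usage, "overage": overage, "total": p["base"] + overage}
```

On the Growth plan, 15,000 fetches and 1,000 searches come to $33 of usage. The $20 credit covers part of it, leaving $13 of overage on top of the $19 flat fee, for $32 in total.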

How does my payment get processed?

We use Stripe as a payment processor. All credit card transactions take place on a subdomain of stripe.com.

How much does data extraction cost?

Data extraction uses generative AI to format answers in JSON or raw streams of text. Pricing varies by model — see the table below. These costs are only incurred on the POST endpoints, not the GET endpoints.

Model	Input (per million tokens)	Output (per million tokens)
meta/llama-3.1-8b	$0.09	$0.19
mistral/hermes-2-pro-7b	$0.19	$0.19
meta/llama-3.3-70b	$0.49	$0.99
deepseek/r1-distill-qwen-32b	$0.89	$2.49
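
As a worked example of the table above, this sketch converts token counts into a dollar figure. It covers only the per-token model cost and does not say how that combines with per-fetch pricing. The function name is hypothetical.

```python
from decimal import Decimal

# USD per million tokens, copied from the table above: (input, output).
MODEL_PRICES = {
    "meta/llama-3.1-8b": (Decimal("0.09"), Decimal("0.19")),
    "mistral/hermes-2-pro-7b": (Decimal("0.19"), Decimal("0.19")),
    "meta/llama-3.3-70b": (Decimal("0.49"), Decimal("0.99")),
    "deepseek/r1-distill-qwen-32b": (Decimal("0.89"), Decimal("2.49")),
}
MILLION = Decimal(1_000_000)


def extraction_cost(model: str, input_tokens: int, output_tokens: int) -> Decimal:
    """Return the token cost in USD for a single extraction call."""
    price_in, price_out = MODEL_PRICES[model]
    return (price_in * input_tokens + price_out * output_tokens) / MILLION
```

A 28K-token page summarized into 500 output tokens with `meta/llama-3.1-8b` works out to (0.09 × 28,000 + 0.19 × 500) / 1,000,000 = $0.002615.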

Can pure.md access content behind a login?

Yes, just include your authorization cookies in the request. pure.md passes request headers along to the target URL.
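
A minimal sketch of what that could look like in Python, assuming the cookies are sent as a standard `Cookie` request header. The function name, cookie names, and target URL are hypothetical.

```python
from typing import Dict
from urllib.request import Request


def build_authenticated_request(target: str, cookies: Dict[str, str]) -> Request:
    """Build a GET request to pure.md/<target> carrying the given cookies."""
    # Standard Cookie header format: name=value pairs joined by "; ".
    cookie_header = "; ".join(f"{name}={value}" for name, value in cookies.items())
    return Request(f"https://pure.md/{target}", headers={"Cookie": cookie_header})


# Hypothetical session cookies for a hypothetical site.
request = build_authenticated_request(
    "example.com/account", {"sessionid": "abc123", "theme": "dark"}
)
```

Pass the resulting request to `urllib.request.urlopen` to send it. Treat session cookies like passwords and keep them out of logs and source control.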

Is pure.md safe to use in production?

Yes, pure.md is safe to use in production. Our infrastructure runs on a combination of Cloudflare, AWS, and Railway servers, and is designed to autoscale with demand. Visit our status page for uptime history.

What file types are supported for markdown conversion?

HTML, PDF, image, and spreadsheet file types are supported. Specifically:

.csv
.et
.htm
.html
.jpeg
.jpg
.numbers
.pdf
.png
.svg
.webp
.xls
.xlsb
.xlsm
.xlsx
.xml
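
If you want to pre-filter URLs on the client, the list above can be turned into a simple extension check. This is a sketch with hypothetical names. A `False` result only means the URL path has no listed extension: many pages without any extension still serve HTML, so this is a hint and not a guarantee either way.

```python
from pathlib import PurePosixPath
from urllib.parse import urlsplit

# Extensions copied from the list above.
SUPPORTED_EXTENSIONS = frozenset({
    ".csv", ".et", ".htm", ".html", ".jpeg", ".jpg", ".numbers", ".pdf",
    ".png", ".svg", ".webp", ".xls", ".xlsb", ".xlsm", ".xlsx", ".xml",
})


def has_supported_extension(url: str) -> bool:
    """Return True if the URL path ends in one of the listed extensions."""
    # Only the path is inspected, so query strings do not affect the result.
    suffix = PurePosixPath(urlsplit(url).path).suffix.lower()
    return suffix in SUPPORTED_EXTENSIONS
```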

Can I use pure.md as an MCP server?

Yes, pure.md supports the Model Context Protocol (MCP) developed by Anthropic. Read the instructions at puremd/puremd-mcp to teach Cursor, Windsurf, Claude Desktop, and other MCP clients how to route traffic through the pure.md network.