pure.md is a REST API for converting web pages and web searches into easily-digestible input for LLMs.
Generate a unique API token in your dashboard. Then include that token in the `x-puremd-api-token` request header for all requests.
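As a minimal sketch of the authentication step, the snippet below builds (but does not send) a request that carries the token header, using only Python's standard library. The base URL `https://pure.md/` with the target URL appended, the helper name `build_request`, and the placeholder token are assumptions made for illustration, not values confirmed by this reference.

```python
import urllib.request

# Assumption: the API is served from this base URL and takes the target URL as the path.
API_BASE = "https://pure.md/"


def build_request(target_url: str, api_token: str) -> urllib.request.Request:
    """Build a GET request that authenticates via the x-puremd-api-token header."""
    return urllib.request.Request(
        API_BASE + target_url,
        headers={"x-puremd-api-token": api_token},
        method="GET",
    )


# "YOUR_API_TOKEN" is a placeholder for the token generated in the dashboard.
req = build_request("https://example.com", "YOUR_API_TOKEN")

# urllib capitalizes stored header names; HTTP header names are case-insensitive.
print(req.get_header("X-puremd-api-token"))
```

Passing `req` to `urllib.request.urlopen` would send it; that call is left out so the sketch runs without network access.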
Rate limits vary by subscription plan. See pricing for details.
Subscription type | Requests per minute |
---|---|
Logged out / anonymous | 6 |
Logged in, no subscription | 10 |
Starter plan | 60 |
Growth plan | 600 |
Business plan | 3000 |
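One conservative way to stay under these limits is to space requests evenly on the client side. The sketch below does that; the plan keys (`"starter"`, `"growth"`, and so on) and the `Throttle` class are hypothetical names, and how the service actually windows its per-minute counts is not stated here, so even spacing is an assumption rather than a requirement.

```python
import time

# Requests-per-minute limits copied from the table above; the keys are made-up labels.
RATE_LIMITS = {
    "anonymous": 6,
    "no_subscription": 10,
    "starter": 60,
    "growth": 600,
    "business": 3000,
}


def min_interval_seconds(plan: str) -> float:
    """Smallest gap between requests that keeps an evenly spaced client under the limit."""
    return 60.0 / RATE_LIMITS[plan]


class Throttle:
    """Blocks before each request so that calls are at least one interval apart."""

    def __init__(self, plan: str, clock=time.monotonic, sleep=time.sleep):
        self.interval = min_interval_seconds(plan)
        self._clock = clock
        self._sleep = sleep
        self._next_allowed = None  # no request has been made yet

    def wait(self) -> float:
        """Sleep if the previous request was too recent; return the seconds slept."""
        now = self._clock()
        delay = 0.0
        if self._next_allowed is not None and now < self._next_allowed:
            delay = self._next_allowed - now
            self._sleep(delay)
            now = self._next_allowed
        self._next_allowed = now + self.interval
        return delay
```

Calling `Throttle("starter").wait()` before each request would cap the client at roughly one request per second.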
All request headers pass through to the target URL, except those that begin with `x-puremd-`.
Response headers from the origin server are returned in the response.
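To make the pass-through rule concrete, the sketch below splits a set of outgoing headers into those that would reach the target URL and those reserved by the `x-puremd-` prefix. The function name `split_headers` is hypothetical, and the case-insensitive prefix match is an assumption based on HTTP header names being case-insensitive, not on documented service behavior.

```python
RESERVED_PREFIX = "x-puremd-"


def split_headers(headers: dict[str, str]) -> tuple[dict[str, str], dict[str, str]]:
    """Return (forwarded, reserved) according to the pass-through rule."""
    forwarded: dict[str, str] = {}
    reserved: dict[str, str] = {}
    for name, value in headers.items():
        # Compare in lowercase because HTTP header names are case-insensitive.
        bucket = reserved if name.lower().startswith(RESERVED_PREFIX) else forwarded
        bucket[name] = value
    return forwarded, reserved


forwarded, reserved = split_headers(
    {
        "User-Agent": "my-bot/1.0",
        "Accept-Language": "en",
        "x-puremd-api-token": "YOUR_API_TOKEN",
    }
)
print(sorted(forwarded))  # headers the target URL would receive
print(sorted(reserved))   # headers kept back by the prefix rule
```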
Retrieves the content of a given URL in markdown format. Use this endpoint to scrape text content from a web page without getting blocked.
Parameter | Type | Description |
---|---|---|
url (required) | string | The URL |
    <WebPage url="https://example.com">
    title: Example Domain
    access_date: Wed, 05 Mar 2025 22:27:19 GMT
    ---
    # Example Domain
    This domain is for use in illustrative examples in documents. You may use this domain in literature without prior coordination or asking for permission.
    More information...
    </WebPage>
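The sample response wraps the markdown in a `<WebPage>` element with a few metadata lines before a `---` separator. A minimal sketch of pulling those pieces apart follows; it assumes each metadata field and the separator sit on their own line, which the sample suggests but this reference does not guarantee. The `parse_webpage` helper is hypothetical.

```python
import re


def parse_webpage(text: str) -> dict:
    """Split a <WebPage> response into its URL, metadata fields, and markdown body."""
    match = re.search(r'<WebPage url="([^"]*)">\s*(.*?)\s*</WebPage>', text, re.DOTALL)
    if match is None:
        raise ValueError("no <WebPage> wrapper found")
    url, inner = match.groups()

    # Metadata comes before the first '---' line; everything after it is markdown.
    header, separator, body = inner.partition("\n---\n")
    if not separator:
        header, body = "", inner

    meta = {}
    for line in header.splitlines():
        key, colon, value = line.partition(":")
        if colon:
            meta[key.strip()] = value.strip()
    return {"url": url, "meta": meta, "markdown": body.strip()}


sample = (
    '<WebPage url="https://example.com">\n'
    "title: Example Domain\n"
    "access_date: Wed, 05 Mar 2025 22:27:19 GMT\n"
    "---\n"
    "# Example Domain\n"
    "This domain is for use in illustrative examples in documents.\n"
    "</WebPage>"
)
page = parse_webpage(sample)
print(page["meta"]["title"])
```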
This endpoint is only available on paid plans.
Runs inference on the content of a given URL. Use this endpoint to extract structured JSON from a webpage.
Parameter | Type | Description |
---|---|---|
url (required) | string | The URL |
prompt (required) | string | The user message |
model | string | Enum: "meta/llama-3.1-8b" … 3 more. The generative AI model to use. Smaller models are faster, while larger models are more accurate. Default model: |
schema | object | JSON schema of the desired response. Omit this property to get a response in plaintext. |
    {
      "prompt": "What are the top 5 headlines from today?",
      "model": "meta/llama-3.1-8b",
      "schema": {
        "type": "object",
        "properties": {
          "headlines": {
            "type": "array",
            "items": {
              "type": "string"
            }
          }
        },
        "required": [
          "headlines"
        ]
      }
    }
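The request body above can be assembled programmatically. The sketch below serializes `prompt`, `model`, and `schema` into JSON and leaves the optional fields out when they are not given. The helper name `build_inference_body` is hypothetical, and because the sample body does not show how the required `url` parameter is passed, it is deliberately left out here.

```python
import json
from typing import Optional

# Schema taken from the request sample: an object with a required list of strings.
HEADLINES_SCHEMA = {
    "type": "object",
    "properties": {
        "headlines": {"type": "array", "items": {"type": "string"}},
    },
    "required": ["headlines"],
}


def build_inference_body(
    prompt: str,
    model: Optional[str] = None,
    schema: Optional[dict] = None,
) -> str:
    """Serialize the inference parameters, omitting optional ones that are unset."""
    body: dict = {"prompt": prompt}
    if model is not None:
        body["model"] = model
    if schema is not None:
        # Leaving schema out requests a plaintext response instead of JSON.
        body["schema"] = schema
    return json.dumps(body)


payload = build_inference_body(
    "What are the top 5 headlines from today?",
    model="meta/llama-3.1-8b",
    schema=HEADLINES_SCHEMA,
)
print(payload)
```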
    {
      "type": "object",
      "properties": {
        "headlines": {
          "type": "array",
          "items": {
            "type": "string"
          }
        }
      },
      "required": [
        "headlines"
      ]
    }
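On the client side, the response can be handled differently depending on whether a `schema` was sent. The sketch below assumes that a request with a `schema` returns a JSON document whose top-level keys follow that schema, and that a request without one returns plain text, as the parameter description says. The `decode_inference_response` helper is hypothetical and only checks the `required` keys; it is not a full JSON Schema validator.

```python
import json
from typing import Any, Optional


def decode_inference_response(raw: str, schema: Optional[dict] = None) -> Any:
    """Return plaintext as-is, or parse JSON and check the schema's required keys."""
    if schema is None:
        # No schema was sent, so the body is treated as plaintext.
        return raw

    data = json.loads(raw)
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object in the response")
    missing = [key for key in schema.get("required", []) if key not in data]
    if missing:
        raise ValueError(f"response is missing required keys: {missing}")
    return data


schema = {
    "type": "object",
    "properties": {"headlines": {"type": "array", "items": {"type": "string"}}},
    "required": ["headlines"],
}
result = decode_inference_response('{"headlines": ["First", "Second"]}', schema)
print(result["headlines"])
```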
This endpoint is only available on paid plans.
Crawls the top results from a search engine query and concatenates the web content from all pages into markdown. Use this endpoint to gather knowledge of news, current events, or specific topics.
Parameter | Type | Description |
---|---|---|
q (required) | string | The URL-encoded search query |
    # Title of the Page
    ## Introduction
    This is the introduction text from the webpage, purified and optimized for LLM processing.
    ## Main Content
    The main content of the page converted to clean markdown format, with unnecessary elements removed.
    ### Subsection
    Content organized in logical subsections with proper hierarchy.
    ## Conclusion
    The concluding information from the webpage.
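Because `q` must be URL-encoded, it helps to let a library do the escaping rather than building the query string by hand. The sketch below uses Python's `urllib.parse` for that; the endpoint path `https://pure.md/search` and the helper name `build_search_url` are assumptions for illustration, since this reference does not spell out the path.

```python
from urllib.parse import quote, urlencode

# Assumption: the search endpoint lives at this path.
SEARCH_ENDPOINT = "https://pure.md/search"


def build_search_url(query: str) -> str:
    """Return the search URL with the query percent-encoded in the q parameter."""
    # quote_via=quote encodes spaces as %20 instead of '+'.
    return SEARCH_ENDPOINT + "?" + urlencode({"q": query}, quote_via=quote)


print(build_search_url("latest AI news"))
print(build_search_url("C++ & Rust"))
```

Reserved characters such as `&`, `+`, and `?` are escaped, so they cannot be mistaken for query-string delimiters.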
This endpoint is only available on paid plans.
Crawls the top results from a search engine query and runs inference on their web content. Use this endpoint to answer questions about news, current events, or general user queries that require searching.
Parameter | Type | Description |
---|---|---|
url (required) | string | The URL |
prompt (required) | string | The user message |
model | string | Enum: "meta/llama-3.1-8b" … 3 more. The generative AI model to use. Smaller models are faster, while larger models are more accurate. Default model: |
schema | object | JSON schema of the desired response. Omit this property to get a response in plaintext. |
    {
      "prompt": "Who won the baseball game last night?",
      "model": "meta/llama-3.1-8b",
      "schema": {
        "type": "object",
        "properties": {
          "headlines": {
            "type": "array",
            "items": {
              "type": "string"
            }
          }
        },
        "required": [
          "headlines"
        ]
      }
    }
"string"