pure.md API (1.0.0)

Introduction

pure.md is a REST API for converting web pages and web searches into easily digestible input for LLMs. With pure.md, you can:

  • Convert URLs into markdown optimized for an LLM
  • Crawl search engines for up-to-date knowledge
  • Extract data from web pages using natural language
  • Scrape social media content from supported providers
  • Avoid bot detection by mimicking real user behavior

Authentication

Generate a unique API token in your dashboard. Then include that token in the x-puremd-api-token request header for all requests.
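As a minimal sketch of the header described above, the following builds (but does not send) a request that carries the token. The base URL `https://pure.md`, the helper name `build_request`, and the placeholder token are assumptions for illustration; only the header name comes from this section.

```python
import urllib.request

# Assumption: the base URL is not stated in this section; adjust as needed.
BASE_URL = "https://pure.md"


def build_request(path: str, api_token: str) -> urllib.request.Request:
    """Return an unsent request that carries the API token header."""
    return urllib.request.Request(
        f"{BASE_URL}/{path.lstrip('/')}",
        headers={"x-puremd-api-token": api_token},
    )


# "YOUR_API_TOKEN" is a placeholder for the token generated in the dashboard.
req = build_request("example.com", "YOUR_API_TOKEN")
# Send it with urllib.request.urlopen(req) once a real token is in place.
```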

Rate limits

Rate limits vary by subscription plan. See pricing for details.

| Subscription type | Requests per minute |
| --- | --- |
| Logged out / anonymous | 6 |
| Logged in, no subscription | 10 |
| Starter plan | 60 |
| Growth plan | 600 |
| Business plan | 3000 |
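One simple way to stay within these limits on the client side is to space requests evenly. The sketch below converts each limit listed above into a minimum delay between requests. The plan keys are hypothetical labels, and even spacing is a conservative assumption, since this section does not say how the limit window is measured.

```python
# Requests per minute, copied from the limits listed above.
# The dictionary keys are hypothetical labels, not official plan identifiers.
REQUESTS_PER_MINUTE = {
    "anonymous": 6,
    "no_subscription": 10,
    "starter": 60,
    "growth": 600,
    "business": 3000,
}


def min_interval_seconds(plan: str) -> float:
    """Smallest even spacing between requests that stays within the plan's limit."""
    return 60.0 / REQUESTS_PER_MINUTE[plan]
```

A caller could sleep for `min_interval_seconds(plan)` between consecutive requests, for example with `time.sleep`.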

Headers pass through

All request headers pass through to the target URL, except ones that begin with x-puremd-.

The origin's response headers are returned in the response.
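To illustrate the pass-through rule, the sketch below splits a set of request headers into those that would be forwarded to the target URL and those reserved for pure.md. The function name is hypothetical, and matching the prefix case-insensitively is an assumption based on HTTP header names being case-insensitive in general.

```python
RESERVED_PREFIX = "x-puremd-"


def split_headers(headers: dict) -> tuple:
    """Split headers into (forwarded to the target URL, reserved for pure.md)."""
    forwarded, reserved = {}, {}
    for name, value in headers.items():
        # Assumption: the prefix check ignores case, like HTTP header names do.
        bucket = reserved if name.lower().startswith(RESERVED_PREFIX) else forwarded
        bucket[name] = value
    return forwarded, reserved


forwarded, reserved = split_headers({
    "User-Agent": "my-crawler/1.0",
    "Accept-Language": "en-US",
    "x-puremd-api-token": "YOUR_API_TOKEN",
})
```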


Fetch web content

Retrieves the content of a given URL in markdown format. Use this endpoint to scrape text content from a web page without getting blocked.

Authorizations: APIToken

Path parameters:

  • url (string, required): The URL

Responses

Response samples

Content type
text/plain
<WebPage url="https://example.com">

title: Example Domain
access_date: Wed, 05 Mar 2025 22:27:19 GMT

---

# Example Domain

This domain is for use in illustrative examples in documents. You may use this domain in literature without prior coordination or asking for permission.

More information...

</WebPage>
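The response sample above can be split into its URL, metadata lines, and markdown body with a few lines of parsing. This is a sketch that assumes every response follows the layout of that single sample (a `<WebPage>` wrapper, `key: value` metadata, then a `---` separator); the function name is hypothetical.

```python
import re


def parse_web_page(text: str) -> dict:
    """Split a <WebPage> response into url, metadata, and markdown body."""
    match = re.search(r'<WebPage url="([^"]*)">(.*?)</WebPage>', text, re.DOTALL)
    if match is None:
        raise ValueError("no <WebPage> element found")
    url, inner = match.group(1), match.group(2)
    # Metadata lines come before the first '---' separator line.
    head, _, body = inner.partition("\n---\n")
    metadata = {}
    for line in head.strip().splitlines():
        key, sep, value = line.partition(": ")
        if sep:
            metadata[key.strip()] = value.strip()
    return {"url": url, "metadata": metadata, "markdown": body.strip()}


sample = (
    '<WebPage url="https://example.com">\n\n'
    "title: Example Domain\n"
    "access_date: Wed, 05 Mar 2025 22:27:19 GMT\n\n"
    "---\n\n"
    "# Example Domain\n\n"
    "More information...\n\n"
    "</WebPage>"
)
page = parse_web_page(sample)
```

The non-greedy match stops at the first `</WebPage>`, so a page whose own text contains that closing tag would need more careful handling.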

Fetch and extract data

This endpoint is only available on paid plans.

Runs inference on the content of a given URL. Use this endpoint to extract structured JSON from a webpage.

Authorizations: APIToken

Path parameters:

  • url (string, required): The URL

Request body schema: application/json

  • prompt (string, required): The user message
  • model (string, enum: "meta/llama-3.1-8b" … 3 more): The generative AI model to use. Smaller models are faster, while larger models are more accurate. Default model: meta/llama-3.1-8b
  • schema (object): JSON schema of the desired response. Omit this property to get a response in plaintext.

Responses

Request samples

Content type
application/json
{
  "prompt": "What are the top 5 headlines from today?",
  "model": "meta/llama-3.1-8b",
  "schema": { }
}

Response samples

Content type
{
  "type": "object",
  "properties": { },
  "required": [ ]
}
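A request body for this endpoint might be assembled as follows. The helper name and the `headlines` schema are hypothetical examples; only the `prompt`, `model`, and `schema` fields come from the request body description above.

```python
import json


def build_extract_body(prompt: str, schema: dict = None, model: str = None) -> bytes:
    """Serialize the JSON body, leaving out optional fields that were not given."""
    body = {"prompt": prompt}
    if model is not None:
        body["model"] = model
    if schema is not None:
        body["schema"] = schema
    return json.dumps(body).encode("utf-8")


# Hypothetical JSON schema asking for a list of headline strings.
headline_schema = {
    "type": "object",
    "properties": {
        "headlines": {"type": "array", "items": {"type": "string"}},
    },
    "required": ["headlines"],
}

payload = build_extract_body(
    "What are the top 5 headlines from today?",
    schema=headline_schema,
    model="meta/llama-3.1-8b",
)
```

Calling `build_extract_body(prompt)` with no schema produces a body that requests a plaintext answer, as described above.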

Search the web

This endpoint is only available on paid plans.

Crawls the top results from a search engine query and concatenates the web content from all pages into markdown. Use this endpoint to gather knowledge of news, current events, or specific topics.

Authorizations: APIToken

Query parameters:

  • q (string, required): The URL-encoded search query
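Because `q` must be URL-encoded, it is safer to let a library do the encoding than to build the query string by hand. In this sketch the base URL and the `/search` path are assumptions, since this section does not state them.

```python
from urllib.parse import urlencode

# Assumption: neither the base URL nor the "/search" path is given in this section.
SEARCH_URL = "https://pure.md/search"


def build_search_url(query: str) -> str:
    """Return the search URL with the query safely encoded in the q parameter."""
    return f"{SEARCH_URL}?{urlencode({'q': query})}"


search_url = build_search_url("latest AI news & trends")
```

`urlencode` encodes spaces as `+` and escapes reserved characters such as `&`, so the query cannot leak into other parameters.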

Responses

Response samples

Content type
text/plain
# Title of the Page

## Introduction
This is the introduction text from the webpage, purified and optimized for LLM processing.

## Main Content
The main content of the page converted to clean markdown format, with unnecessary elements removed.

### Subsection
Content organized in logical subsections with proper hierarchy.

## Conclusion
The concluding information from the webpage.

Search and extract data

This endpoint is only available on paid plans.

Crawls the top results from a search engine query and runs inference on their web content. Use this endpoint to answer questions about news, current events, or general user queries that require searching.

Authorizations: APIToken

Path parameters:

  • url (string, required): The URL

Request body schema: application/json

  • prompt (string, required): The user message
  • model (string, enum: "meta/llama-3.1-8b" … 3 more): The generative AI model to use. Smaller models are faster, while larger models are more accurate. Default model: meta/llama-3.1-8b
  • schema (object): JSON schema of the desired response. Omit this property to get a response in plaintext.

Responses

Request samples

Content type
application/json
{
  "prompt": "Who won the baseball game last night?",
  "model": "meta/llama-3.1-8b",
  "schema": { }
}

Response samples

Content type
"string"
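Since the response is plaintext when `schema` is omitted, a client needs to decode the body differently depending on what it sent. The sketch below assumes that a request with a `schema` returns a JSON document and that the body is UTF-8; the function name is hypothetical.

```python
import json


def decode_extract_response(raw: bytes, schema_was_sent: bool):
    """Return parsed JSON if a schema was sent, otherwise the plaintext answer."""
    text = raw.decode("utf-8")
    # Assumption: responses to requests with a schema are valid JSON documents.
    return json.loads(text) if schema_was_sent else text
```

For example, `decode_extract_response(b'{"winner": "Home team"}', True)` yields a dictionary, while the same call with `False` returns the raw text unchanged.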