
# Supported File Types

AgentDock accepts the following file formats for knowledge base uploads. Files that do not match a supported format are rejected at upload time with an error indicating the unsupported type.

***

## Document formats

| Format        | Extension       | Notes                                                                                                |
| ------------- | --------------- | ---------------------------------------------------------------------------------------------------- |
| PDF           | `.pdf`          | Text-layer PDFs only. Scanned or image-based PDFs must be pre-processed with OCR before upload.      |
| Word Document | `.docx`         | Standard formatting is preserved. Complex layouts (multi-column, sidebars) may not chunk cleanly.    |
| HTML          | `.html`, `.htm` | Navigation, header, and footer elements are stripped during parsing. Upload multiple pages as a ZIP. |
| Markdown      | `.md`, `.mdx`   | Full CommonMark support. Frontmatter fields are preserved as chunk metadata.                         |

***

## Structured data formats

| Format | Extension | Notes                                                                                                                                            |
| ------ | --------- | ------------------------------------------------------------------------------------------------------------------------------------------------ |
| JSON   | `.json`   | Must be an array of objects or a single object. Each top-level item is treated as a discrete document. Nested fields are flattened for chunking. |
| CSV    | `.csv`    | Each row is treated as a chunk. Column headers become metadata keys. Rows with empty required fields are skipped.                                |
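
The JSON row above describes two steps: splitting the payload into top-level items and flattening nested fields. The sketch below illustrates that idea only. The dot-separated key convention and the function names `flatten` and `split_documents` are assumptions for illustration; this page does not document the scheme AgentDock actually uses.

```python
from typing import Any


def flatten(value: Any, prefix: str = "") -> dict[str, Any]:
    """Flatten nested dicts and lists into a single-level dict.

    Assumption: dot-separated keys with numeric list indices. The real
    flattening scheme is not specified on this page.
    """
    flat: dict[str, Any] = {}
    if isinstance(value, dict):
        for key, item in value.items():
            flat.update(flatten(item, f"{prefix}.{key}" if prefix else str(key)))
    elif isinstance(value, list):
        for index, item in enumerate(value):
            flat.update(flatten(item, f"{prefix}.{index}" if prefix else str(index)))
    else:
        flat[prefix] = value
    return flat


def split_documents(payload: Any) -> list[dict[str, Any]]:
    """Treat a top-level array as many documents and a single object as one."""
    items = payload if isinstance(payload, list) else [payload]
    for item in items:
        if not isinstance(item, dict):
            raise ValueError("expected an array of objects or a single object")
    return [flatten(item) for item in items]
```

For example, `split_documents([{"id": 1, "author": {"name": "Ada"}}])` yields one document with the keys `id` and `author.name`.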

***

## Pre-built embeddings

If you have already generated embeddings for your content with a supported model, you can upload them directly. This avoids embedding the same content twice and is recommended for large knowledge bases or when you want precise control over the embedding model.

**Format:** JSONL, with one JSON object per line in the following structure:

```json
{ "text": "The original chunk text", "embedding": [0.012, -0.034, ...], "metadata": {} }
```

The `embedding` field must be a float array matching the dimensionality of the declared embedding model. The `metadata` field is optional.

**Supported embedding models for direct upload:**

| Model                              | Dimensions |
| ---------------------------------- | ---------- |
| `text-embedding-3-large` (OpenAI)  | 3072       |
| `text-embedding-3-small` (OpenAI)  | 1536       |
| `embed-english-v3.0` (Cohere)      | 1024       |
| `embed-multilingual-v3.0` (Cohere) | 1024       |

If you used a different model, contact support to discuss compatibility.
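
Checking a JSONL file locally before upload can catch dimension mismatches early. The following is a minimal sketch, not an official validator: the dimensions come from the table above, while the function name `validate_jsonl`, the decision to ignore blank lines, and the requirement that `metadata` be an object are assumptions.

```python
import json
from numbers import Real

# Dimensions taken from the supported-models table on this page.
MODEL_DIMENSIONS = {
    "text-embedding-3-large": 3072,
    "text-embedding-3-small": 1536,
    "embed-english-v3.0": 1024,
    "embed-multilingual-v3.0": 1024,
}


def validate_jsonl(lines, model):
    """Return (line_number, problem) tuples; an empty list means no problems found."""
    expected = MODEL_DIMENSIONS[model]
    problems = []
    for number, line in enumerate(lines, start=1):
        if not line.strip():
            continue  # assumption: blank lines are ignored
        try:
            record = json.loads(line)
        except json.JSONDecodeError as exc:
            problems.append((number, f"invalid JSON: {exc.msg}"))
            continue
        if not isinstance(record, dict):
            problems.append((number, "line is not a JSON object"))
            continue
        if not isinstance(record.get("text"), str):
            problems.append((number, "missing or non-string 'text'"))
        embedding = record.get("embedding")
        is_numeric_list = isinstance(embedding, list) and all(
            isinstance(x, Real) and not isinstance(x, bool) for x in embedding
        )
        if not is_numeric_list:
            problems.append((number, "'embedding' must be an array of numbers"))
        elif len(embedding) != expected:
            problems.append((number, f"expected {expected} dimensions, got {len(embedding)}"))
        # Assumption: 'metadata', when present, is a JSON object.
        if "metadata" in record and not isinstance(record["metadata"], dict):
            problems.append((number, "'metadata' must be an object"))
    return problems
```

A typical call would be `validate_jsonl(open("chunks.jsonl", encoding="utf-8"), "text-embedding-3-large")`, where `chunks.jsonl` is a placeholder file name.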

***

## Bulk upload

For uploads larger than 500MB or containing more than 1,000 files, use the bulk upload API instead of the dashboard. Archive the files as a ZIP before uploading.

```bash
curl -X POST https://api.useagentdock.xyz/v1/upload \
  -H "Authorization: Bearer <your_api_key>" \
  -F "files=@archive.zip" \
  -F 'config={"chunking":"recursive","embedding_model":"text-embedding-3-large"}'
```
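
Because unsupported files are rejected at upload time, it can help to filter them out while building the archive. The sketch below uses only the Python standard library. The helper name `build_archive`, the case-insensitive extension match, and the preservation of subdirectories inside the ZIP are assumptions, since this page does not specify the expected archive layout.

```python
import zipfile
from pathlib import Path

# Extensions listed in the format tables on this page.
SUPPORTED_EXTENSIONS = {".pdf", ".docx", ".html", ".htm", ".md", ".mdx", ".json", ".csv"}


def build_archive(source_dir, archive_path):
    """Zip every supported file under source_dir; return the skipped relative paths."""
    source = Path(source_dir)
    archive = Path(archive_path).resolve()
    skipped = []
    with zipfile.ZipFile(archive, "w", compression=zipfile.ZIP_DEFLATED) as zf:
        for path in sorted(source.rglob("*")):
            # Skip directories and the archive itself if it lives inside source_dir.
            if not path.is_file() or path.resolve() == archive:
                continue
            relative = path.relative_to(source).as_posix()
            # Assumption: extensions are matched case-insensitively.
            if path.suffix.lower() in SUPPORTED_EXTENSIONS:
                zf.write(path, arcname=relative)
            else:
                skipped.append(relative)
    return skipped
```

The returned list makes it easy to review what was left out before sending the resulting archive with the `curl` command above.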

***

## Size limits

| Limit                                       | Value   |
| ------------------------------------------- | ------- |
| Single file (dashboard)                     | 50MB    |
| Total upload per knowledge base (dashboard) | 500MB   |
| Bulk upload archive                         | 10GB    |
| Chunks per knowledge base                   | 500,000 |

Knowledge bases that exceed these limits require a creator plan upgrade or direct engagement with support.
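
The limits above, together with the bulk upload thresholds, can be turned into a simple pre-flight check. This is a rough sketch under stated assumptions: megabytes are treated as decimal (the page does not say MB versus MiB), the uncompressed total stands in for the archive size, the 50MB single-file limit is assumed to apply to the dashboard only, and the function name and return labels are hypothetical.

```python
MB = 1_000_000  # assumption: decimal megabytes
GB = 1_000 * MB

# Values from the size-limits table and the bulk upload section.
DASHBOARD_MAX_FILE = 50 * MB
DASHBOARD_MAX_TOTAL = 500 * MB
DASHBOARD_MAX_FILES = 1_000
BULK_MAX_ARCHIVE = 10 * GB


def choose_upload_route(file_sizes):
    """Return 'dashboard', 'bulk', or 'upgrade-or-support' for file sizes in bytes."""
    total = sum(file_sizes)
    # Conservative: compares the uncompressed total against the archive limit.
    if total > BULK_MAX_ARCHIVE:
        return "upgrade-or-support"
    needs_bulk = (
        len(file_sizes) > DASHBOARD_MAX_FILES
        or total > DASHBOARD_MAX_TOTAL
        or any(size > DASHBOARD_MAX_FILE for size in file_sizes)
    )
    return "bulk" if needs_bulk else "dashboard"
```

The check does not cover the 500,000-chunk limit, because the chunk count depends on the chunking configuration and is only known after processing.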

