Supported File Types
| Format | Extensions | Notes |
|---|---|---|
| PDF | .pdf | Text is extracted from all pages |
| Word | .docx, .doc | Full document content extracted |
| Text | .txt | Plain text files |
| Markdown | .md | Markdown formatting preserved |
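Checking extensions locally before uploading can save a round trip. The following is a minimal sketch, assuming the extensions in the table above are the complete list; `SUPPORTED_EXTENSIONS` and `is_supported` are hypothetical names used for illustration, not part of the product.

```python
from pathlib import Path

# Extensions copied from the table above (assumed to be the complete list).
SUPPORTED_EXTENSIONS = {".pdf", ".docx", ".doc", ".txt", ".md"}


def is_supported(filename: str) -> bool:
    """Return True if the file's extension appears in the supported list."""
    # Compare case-insensitively so "Report.PDF" is treated like "report.pdf".
    return Path(filename).suffix.lower() in SUPPORTED_EXTENSIONS


if __name__ == "__main__":
    for name in ["handbook.pdf", "notes.md", "slides.pptx"]:
        label = "ok" if is_supported(name) else "unsupported"
        print(f"{name}: {label}")
```

This only inspects the file name. It cannot tell whether the file itself is corrupted or password protected.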
Upload a Document
Uploading documents requires Admin access.
- Navigate to Knowledge Base
- Open the dataset where you want to add documents
- Click Upload Documents
- Either:
  - Drag and drop files into the upload area, or
  - Click Browse to select files from your computer
- Click Upload
Document Processing
After upload, documents go through automatic processing:
| Status | Meaning |
|---|---|
| Pending | Queued for processing |
| Processing | Being indexed (chunking, embedding generation) |
| Completed | Ready for search |
| Failed | Error during processing (check file format) |
Processing typically completes within a few minutes. Larger documents may take longer.
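If you track document status from a script, the lifecycle above amounts to polling until a terminal status is reached. This page does not describe a programmatic interface, so the sketch below takes a hypothetical `get_status` callable that you would supply yourself; the poll interval and timeout are also assumptions, and only the status names come from the table.

```python
import time
from typing import Callable

# "Completed" and "Failed" are the two statuses after which nothing changes.
TERMINAL_STATUSES = {"Completed", "Failed"}


def wait_for_processing(
    get_status: Callable[[], str],   # hypothetical: returns the current status
    poll_seconds: float = 5.0,       # assumed interval between checks
    timeout_seconds: float = 600.0,  # assumed upper bound on waiting
) -> str:
    """Poll get_status until the document is Completed or Failed."""
    deadline = time.monotonic() + timeout_seconds
    while True:
        status = get_status()
        if status in TERMINAL_STATUSES:
            return status
        if time.monotonic() >= deadline:
            raise TimeoutError(f"document still in status {status!r}")
        time.sleep(poll_seconds)
```

A `TimeoutError` here corresponds to the "stuck in Processing" case covered under Troubleshooting.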
What Happens During Processing
- Text Extraction: Content is extracted from the document
- Semantic Chunking: Content is split into meaningful segments (up to 2000 characters each)
- Embedding Generation: Each chunk is converted to a vector embedding by an AI embedding model
- Indexing: Embeddings are stored in the database for fast similarity search
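To give a feel for the chunking step, here is a simplified sketch that packs paragraphs into segments of at most 2000 characters. It is an illustration only: it splits on blank lines rather than on meaning, so it is not the semantic chunking the product performs, and `chunk_text` is a hypothetical name.

```python
MAX_CHUNK_CHARS = 2000  # limit stated in the steps above


def chunk_text(text: str, max_chars: int = MAX_CHUNK_CHARS) -> list[str]:
    """Pack paragraphs into chunks no longer than max_chars characters."""
    chunks: list[str] = []
    current = ""
    for paragraph in text.split("\n\n"):
        paragraph = paragraph.strip()
        if not paragraph:
            continue
        # A paragraph longer than the limit is cut into fixed-size pieces.
        pieces = [
            paragraph[i:i + max_chars]
            for i in range(0, len(paragraph), max_chars)
        ]
        for piece in pieces:
            candidate = f"{current}\n\n{piece}" if current else piece
            if len(candidate) <= max_chars:
                # The piece still fits, so keep growing the current chunk.
                current = candidate
            else:
                # The chunk is full: store it and start a new one.
                chunks.append(current)
                current = piece
    if current:
        chunks.append(current)
    return chunks


if __name__ == "__main__":
    sample = "First paragraph.\n\nSecond paragraph.\n\n" + "x" * 4500
    for index, chunk in enumerate(chunk_text(sample)):
        print(index, len(chunk))
```

Each resulting chunk would then be embedded and indexed separately, which is why focused documents tend to produce cleaner search results.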
Delete a Document
- Open the dataset containing the document
- Find the document in the table
- Click the trash icon
- Confirm the deletion
To delete multiple documents at once:
- Select multiple documents using the checkboxes
- Click Delete in the action bar
- Confirm the deletion
Troubleshooting
Document stuck in 'Processing'
Processing usually completes within minutes. If a document stays in “Processing” for an extended period:
- The document may contain complex formatting
- Try re-uploading the document
- Contact support if the issue persists
Document shows 'Failed' status
Common causes:
- Corrupted file: Try opening the file locally to verify it’s readable
- Unsupported format: Ensure the file is a supported type (PDF, DOCX, DOC, TXT, MD)
- Password protection: Remove password protection and re-upload
- Scanned PDF without OCR: PDFs must contain extractable text, not just images
File exceeds size limit
If your file is larger than 10 MB:
- Split large documents into smaller sections
- Remove embedded images or unnecessary formatting
- Compress the file before uploading
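A quick local size check can flag oversized files before you try to upload them. This is a minimal sketch, assuming "10 MB" means 10 × 1024 × 1024 bytes (the exact byte threshold is not stated here); `exceeds_size_limit` is a hypothetical helper.

```python
import os

# Assumption: 10 MB is interpreted as 10 * 1024 * 1024 bytes.
MAX_UPLOAD_BYTES = 10 * 1024 * 1024


def exceeds_size_limit(path: str, limit: int = MAX_UPLOAD_BYTES) -> bool:
    """Return True if the file at path is larger than limit bytes."""
    return os.path.getsize(path) > limit


if __name__ == "__main__":
    import sys

    # Usage: python check_size.py file1.pdf file2.docx
    for path in sys.argv[1:]:
        size_mb = os.path.getsize(path) / (1024 * 1024)
        verdict = "too large" if exceeds_size_limit(path) else "ok"
        print(f"{path}: {size_mb:.1f} MB ({verdict})")
```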
Best Practices
- Keep documents focused on single topics when possible
- Update documents by deleting the old version and uploading the new one
- Review document status to ensure processing completed successfully
- Organize related documents in the same dataset