Supported file types
All document formats you can upload to the platform.
Fully supported
| Format | Extensions | Notes |
|---|---|---|
| PDF | .pdf | Native PDFs give the best results. Scanned PDFs also work, but quality depends on scan clarity. |
| Microsoft Word | .doc, .docx | Both old (.doc) and new (.docx) formats. |
| Microsoft Excel | .xls, .xlsx | Good for pricing tables, technical matrices, evaluation grids. |
| Microsoft PowerPoint | .ppt, .pptx | For presentation-format proposals or briefings. |
| Markdown | .md | Plain text with formatting markup. |
| EDOC archives | .edoc | Latvia-specific digitally signed archive. All contained files are extracted and parsed automatically. |
Best practices by format
PDFs
Native PDFs (exported from Word, Excel, or other software) parse with near-perfect accuracy because the text is already digital.
Scanned PDFs go through OCR (optical character recognition). Results depend on:
- Scan resolution (300 DPI or higher recommended)
- Paper quality (no wrinkles, folds, stains)
- Text clarity (sharp print, standard fonts)
- Orientation (straight, not skewed)
If you have the source file (the Word or Excel document), upload that instead of a scanned PDF.
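To make the 300 DPI recommendation concrete, here is a minimal sketch of the arithmetic that relates page size, resolution, and pixel dimensions. The function names are illustrative and not part of the platform; the calculation is the standard DPI definition (pixels per inch).

```python
MM_PER_INCH = 25.4


def pixels_for_page(width_mm: float, height_mm: float, dpi: int = 300) -> tuple[int, int]:
    """Return the (width, height) in pixels of a page scanned at the given DPI."""
    width_px = round(width_mm / MM_PER_INCH * dpi)
    height_px = round(height_mm / MM_PER_INCH * dpi)
    return width_px, height_px


def effective_dpi(width_px: int, width_mm: float) -> float:
    """Return the resolution implied by a scan's pixel width and the paper width."""
    return width_px / (width_mm / MM_PER_INCH)


# An A4 page (210 mm x 297 mm) scanned at 300 DPI.
a4_pixels = pixels_for_page(210, 297)
print(a4_pixels)  # (2480, 3508)

# A scan of the same page that is only 1240 pixels wide is roughly 150 DPI.
print(round(effective_dpi(1240, 210)))  # 150
```

If a scanned A4 page is much narrower than about 2480 pixels, it was probably scanned below the recommended resolution and is worth rescanning.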
Excel files
Excel files are parsed sheet by sheet. The system handles:
- Multiple sheets
- Tables and data ranges
- Headers and formula cells (the calculated values are extracted, not the formulas themselves)
- Basic formatting
Complex pivot tables, charts, and conditional formatting may not be extracted fully. If a requirement is buried in a complex Excel formula, consider moving that data into a simpler format, such as a plain table.
EDOC archives
EDOC is a digitally signed document archive format used in Latvia. Upload the .edoc file directly: the system extracts all documents inside and parses each one individually. Each extracted file is listed separately with its own parsing status.
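If you want to preview which files an archive contains before uploading, the sketch below lists them locally. It assumes the .edoc file is a ZIP-based container, which this page does not state, so treat it as an illustration rather than a description of how the platform extracts files. The function name and the demo file names are hypothetical.

```python
import io
import zipfile


def list_archive_members(source) -> list[str]:
    """Return the names of all files (not directories) in a ZIP-based container.

    Assumption: the archive can be opened as a ZIP file. Signature and
    metadata entries, if present, are listed alongside the documents.
    """
    with zipfile.ZipFile(source) as archive:
        return [info.filename for info in archive.infolist() if not info.is_dir()]


# Demo with an in-memory stand-in archive; a real file would be opened by path,
# for example list_archive_members("tender.edoc").
demo = io.BytesIO()
with zipfile.ZipFile(demo, "w") as archive:
    archive.writestr("specification.docx", b"placeholder")
    archive.writestr("pricing.xlsx", b"placeholder")
demo.seek(0)

members = list_archive_members(demo)
print(members)
```

If the file cannot be opened this way, `zipfile.BadZipFile` is raised, which only means the assumption above does not hold for that file. It says nothing about whether the platform can parse it.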
File size limits
- Maximum per file: 100 MB
- No limit on the number of files per procurement or bid
If a file exceeds 100 MB, try:
- Splitting multi-part documents into separate files
- Removing large embedded images that aren't relevant
- Compressing the PDF (many PDF tools can reduce file size)
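A quick local check can flag oversized files before you upload. The sketch below is a minimal example, not platform code. It assumes "100 MB" means 100,000,000 bytes, the stricter decimal reading, so a file that passes here also passes under the binary reading (104,857,600 bytes). The names `MAX_FILE_BYTES`, `is_within_limit`, and `oversized_files` are illustrative.

```python
import os

# Assumption: 1 MB = 1,000,000 bytes (the stricter of the two common readings).
MAX_FILE_BYTES = 100 * 1000 * 1000


def is_within_limit(size_bytes: int, limit: int = MAX_FILE_BYTES) -> bool:
    """Return True if a file of this size fits within the per-file limit."""
    return size_bytes <= limit


def oversized_files(paths: list[str]) -> list[str]:
    """Return the paths whose size on disk exceeds the per-file limit."""
    return [path for path in paths if not is_within_limit(os.path.getsize(path))]
```

For example, `oversized_files(["spec.pdf", "drawings.pdf"])` returns only the paths that need to be split or compressed using the suggestions above.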
Unsupported formats
The following are not supported for content parsing:
- Image files (.jpg, .png, .gif, .tiff) - convert them to PDF with a scanner or OCR tool first
- AutoCAD (.dwg, .dxf) - convert drawings to PDF
- Video and audio files
- Compressed archives (.zip, .rar) other than .edoc
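The supported and unsupported lists on this page can be turned into a simple pre-upload check. The sketch below is only an illustration built from the extensions listed here. The function name is hypothetical, and treating extensions as case-insensitive is an assumption about how the platform matches them.

```python
from pathlib import Path

# Extensions taken from the "Fully supported" table on this page.
SUPPORTED = {".pdf", ".doc", ".docx", ".xls", ".xlsx", ".ppt", ".pptx", ".md", ".edoc"}

# Suggested workarounds for the unsupported formats listed above.
UNSUPPORTED_HINTS = {
    ".jpg": "convert to PDF with a scanner or OCR tool",
    ".png": "convert to PDF with a scanner or OCR tool",
    ".gif": "convert to PDF with a scanner or OCR tool",
    ".tiff": "convert to PDF with a scanner or OCR tool",
    ".dwg": "convert the drawing to PDF",
    ".dxf": "convert the drawing to PDF",
    ".zip": "only .edoc archives are supported",
    ".rar": "only .edoc archives are supported",
}


def check_extension(filename: str) -> tuple[bool, str]:
    """Return (is_supported, message) based on the file extension alone."""
    # Assumption: extension matching is case-insensitive.
    ext = Path(filename).suffix.lower()
    if ext in SUPPORTED:
        return True, "supported"
    return False, UNSUPPORTED_HINTS.get(ext, "not listed as a supported format")


print(check_extension("Tender.PDF"))
print(check_extension("site-plan.dwg"))
```

The check looks only at the file name, so a renamed file would still pass. It is a convenience before uploading, not a validation of the file's contents.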