# Rust Engine API and Worker ## Overview - HTTP API (warp) under /api for file management and query lifecycle - MySQL for metadata, Qdrant for vector similarity - Background worker resumes queued work and re-queues stale InProgress jobs at startup ## Environment variables - DATABASE_URL: mysql://USER:PASS@HOST:3306/DB - QDRANT_URL: default - GEMINI_API_KEY: used for Gemini content generation (optional in demo) - DEMO_DATA_DIR: path to the folder containing PDF demo data (default resolves to `demo-data` under the repo or `/app/demo-data` in containers) - ASTRA_STORAGE: directory for uploaded file blobs (default `/app/storage`) - AUTO_IMPORT_DEMO: set to `false`, `0`, `off`, or `no` to disable automatic demo import at startup (defaults to `true`) ## Endpoints (JSON) - POST /api/files (multipart) - Form: file=@path - Response: {"success": true} - GET /api/files/list - Response: {"files": [{"id","filename","path","storage_url","description"}]} - POST /api/files/import-demo[?force=1] - Copies PDFs from the demo directory into storage and queues them for analysis. - Response: {"imported": N, "skipped": M, "files_found": K, "source_dir": "...", "attempted_paths": [...], "force": bool} - `force=1` deletes prior records with the same filename before re-importing. - GET /api/files/delete?id= - Response: {"deleted": true|false} - POST /api/query/create - Body: {"q": "text", "top_k": 5} - Response: {"id": "uuid"} - GET /api/query/status?id= - Response: {"status": "Queued"|"InProgress"|"Completed"|"Cancelled"|"Failed"|"not_found"} - GET /api/query/result?id= - Response (Completed): { "result": { "summary": "Found N related files", "related_files": [ {"id","filename","path","description","score"} ], "relationships": "...", "final_answer": "..." } } - GET /api/query/cancel?id= - Response: {"cancelled": true} ## Worker behavior - Ensures Qdrant collection exists (dim 64, cosine) - Re-queues InProgress older than 10 minutes - Processing stages: 1) Set InProgress 2) Embed query text (demo now; pluggable Gemini later) 3) Search Qdrant top_k (default 5) 4) Join file metadata (MySQL) 5) Gemini step: relationship analysis (strictly from provided files) 6) Gemini step: final answer (no speculation; say unknown if insufficient) 7) Persist result (JSON) and set Completed - Checks for cancellation between stages ## Local quickstart 1. docker compose up -d mysql qdrant 2. set env DATABASE_URL and QDRANT_URL 3. cargo run 4. (optional) import demo PDFs - Populate a folder with PDFs under `rust-engine/demo-data` (or point `DEMO_DATA_DIR` to a custom path). The server auto-resolves common locations such as the repo root, `/app/demo-data`, and the working directory when running in Docker. When the engine boots it automatically attempts this import (can be disabled by setting `AUTO_IMPORT_DEMO=false`). - Call the endpoint: - POST - Optional query `?force=1` to overwrite existing by filename. The JSON response also echoes where the engine looked (`source_dir`, `attempted_paths`) and how many PDFs were detected (`files_found`) so misconfigurations are easy to spot. Imported files are written to the shared `/app/storage` volume; the web-app container mounts this volume read-only and serves the contents at `/storage/`. - Or run the PowerShell helper: - `./scripts/import_demo.ps1` (adds all PDFs in demo-data) - `./scripts/import_demo.ps1 -Force` (overwrite existing) ## Notes - Replace demo embeddings with real Gemini calls for production - Add auth to endpoints if needed (API key/JWT)