CodeRED-Astra/rust-engine
2025-10-19 10:04:34 -05:00

Rust Engine API and Worker

Overview

  • HTTP API (warp) under /api for file management and query lifecycle
  • MySQL for metadata, Qdrant for vector similarity
  • Background worker resumes queued work and re-queues stale InProgress jobs at startup

Environment variables

  • DATABASE_URL: mysql://USER:PASS@HOST:3306/DB
  • QDRANT_URL: default http://qdrant:6333
  • GEMINI_API_KEY: used for Gemini content generation (optional in demo)
  • DEMO_DATA_DIR: path to the folder containing PDF demo data (default resolves to demo-data under the repo or /app/demo-data in containers)
  • ASTRA_STORAGE: directory for uploaded file blobs (default /app/storage)
  • AUTO_IMPORT_DEMO: set to false, 0, off, or no to disable automatic demo import at startup (defaults to true)
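The AUTO_IMPORT_DEMO rule above can be sketched as a small helper. This is an illustration, not the engine's actual code: the function names are hypothetical, and case-insensitive matching with whitespace trimming is an assumption that the README does not confirm.

```rust
use std::env;

/// Interprets a raw AUTO_IMPORT_DEMO value.
/// An unset variable means "enabled", matching the documented default of true.
/// Assumption: matching ignores ASCII case and surrounding whitespace.
fn auto_import_enabled(raw: Option<&str>) -> bool {
    match raw {
        None => true,
        Some(value) => {
            let normalized = value.trim().to_ascii_lowercase();
            // Only the four documented "off" spellings disable the import.
            !matches!(normalized.as_str(), "false" | "0" | "off" | "no")
        }
    }
}

/// Reads the flag from the process environment (hypothetical wrapper).
fn auto_import_from_env() -> bool {
    auto_import_enabled(env::var("AUTO_IMPORT_DEMO").ok().as_deref())
}
```

Keeping the parsing separate from the environment lookup makes the rule easy to check without mutating process-wide state.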

Endpoints (JSON)

  • POST /api/files (multipart)

    • Form: file=@path
    • Response: {"success": true}
  • GET /api/files/list

    • Response: {"files": [{"id","filename","path","storage_url","description"}]}
  • POST /api/files/import-demo[?force=1]

    • Copies PDFs from the demo directory into storage and queues them for analysis.
    • Response: {"imported": N, "skipped": M, "files_found": K, "source_dir": "...", "attempted_paths": [...], "force": bool}
    • force=1 deletes prior records with the same filename before re-importing.
  • GET /api/files/delete?id=<file_id>

    • Response: {"deleted": true|false}
  • POST /api/query/create

    • Body: {"q": "text", "top_k": 5}
    • Response: {"id": "uuid"}
  • GET /api/query/status?id=<query_id>

    • Response: {"status": "Queued"|"InProgress"|"Completed"|"Cancelled"|"Failed"|"not_found"}
  • GET /api/query/result?id=<query_id>

    • Response (Completed): { "result": { "summary": "Found N related files", "related_files": [ {"id","filename","path","description","score"} ], "relationships": "...", "final_answer": "..." } }
  • GET /api/query/cancel?id=<query_id>

    • Response: {"cancelled": true}
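A client polling /api/query/status needs to map the status strings listed above onto something typed. The following is a minimal sketch under stated assumptions: the QueryStatus type and its methods are hypothetical names, and treating every state other than Queued and InProgress as final is an assumption about the lifecycle, not something the API documents.

```rust
use std::str::FromStr;

/// The status strings returned by GET /api/query/status.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum QueryStatus {
    Queued,
    InProgress,
    Completed,
    Cancelled,
    Failed,
    NotFound,
}

impl QueryStatus {
    /// Wire representation, exactly as listed in the endpoint description.
    fn as_str(self) -> &'static str {
        match self {
            QueryStatus::Queued => "Queued",
            QueryStatus::InProgress => "InProgress",
            QueryStatus::Completed => "Completed",
            QueryStatus::Cancelled => "Cancelled",
            QueryStatus::Failed => "Failed",
            QueryStatus::NotFound => "not_found",
        }
    }

    /// Assumption: a poller can stop once the job is neither Queued nor InProgress.
    fn is_final(self) -> bool {
        !matches!(self, QueryStatus::Queued | QueryStatus::InProgress)
    }
}

impl FromStr for QueryStatus {
    type Err = String;

    fn from_str(s: &str) -> Result<Self, Self::Err> {
        match s {
            "Queued" => Ok(QueryStatus::Queued),
            "InProgress" => Ok(QueryStatus::InProgress),
            "Completed" => Ok(QueryStatus::Completed),
            "Cancelled" => Ok(QueryStatus::Cancelled),
            "Failed" => Ok(QueryStatus::Failed),
            "not_found" => Ok(QueryStatus::NotFound),
            other => Err(format!("unknown query status: {}", other)),
        }
    }
}
```

A polling loop would parse the "status" field on each response and request /api/query/result only after seeing Completed.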

Worker behavior

  • Ensures the Qdrant collection exists (64 dimensions, cosine distance)
  • Re-queues InProgress jobs older than 10 minutes
  • Processing stages:
    1. Set InProgress
    2. Embed the query text (demo embedding for now; pluggable Gemini embedding later)
    3. Search Qdrant top_k (default 5)
    4. Join file metadata (MySQL)
    5. Gemini step: relationship analysis (strictly from provided files)
    6. Gemini step: final answer (no speculation; say unknown if insufficient)
    7. Persist result (JSON) and set Completed
    • Checks for cancellation between stages
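The two control-flow rules above (re-queueing stale jobs and checking for cancellation between stages) can be sketched as follows. This is a simplified illustration, not the worker's real implementation: the Status enum, the function names, and the stage labels are hypothetical, and the stage bodies are omitted.

```rust
use std::time::Duration;

/// Jobs left InProgress longer than this are treated as stale (10 minutes).
const STALE_AFTER: Duration = Duration::from_secs(10 * 60);

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Status {
    Queued,
    InProgress,
    Completed,
    Cancelled,
}

/// Stale-job rule: an InProgress job older than the threshold returns to Queued.
/// Every other job keeps its current status.
fn requeue_if_stale(status: Status, age: Duration) -> Status {
    if status == Status::InProgress && age > STALE_AFTER {
        Status::Queued
    } else {
        status
    }
}

/// Runs the stages in order and consults `is_cancelled` before each one.
/// Returns the final status and the labels of the stages that actually ran.
fn run_stages<C>(stages: &[&str], mut is_cancelled: C) -> (Status, Vec<String>)
where
    C: FnMut() -> bool,
{
    let mut completed = Vec::new();
    for stage in stages {
        if is_cancelled() {
            return (Status::Cancelled, completed);
        }
        // A real worker would embed, search, join metadata, or call Gemini here.
        completed.push(stage.to_string());
    }
    (Status::Completed, completed)
}
```

Passing the cancellation check as a closure keeps the sketch independent of how the flag is stored; in the engine it would presumably be a lookup against the query's row in MySQL.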

Local quickstart

  1. docker compose up -d mysql qdrant
  2. Set the DATABASE_URL and QDRANT_URL environment variables
  3. cargo run
  4. (Optional) Import demo PDFs
    • Populate a folder with PDFs under rust-engine/demo-data (or point DEMO_DATA_DIR to a custom path). The server auto-resolves common locations such as the repo root, /app/demo-data, and the working directory when running in Docker. On boot, the engine attempts this import automatically; set AUTO_IMPORT_DEMO=false to disable it.
    • Call the endpoint: POST /api/files/import-demo
    • Append ?force=1 to overwrite existing records with the same filename. The JSON response also echoes where the engine looked (source_dir, attempted_paths) and how many PDFs it detected (files_found), so misconfigurations are easy to spot. Imported files are written to the shared /app/storage volume; the web-app container mounts this volume read-only and serves the contents at /storage/<filename>.
    • Or run the PowerShell helper:
      • ./scripts/import_demo.ps1 (adds all PDFs in demo-data)
      • ./scripts/import_demo.ps1 -Force (overwrite existing)
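The directory resolution described in step 4 can be pictured with the sketch below. It is not the engine's code: the function names, the exact candidate list, and the search order (DEMO_DATA_DIR first, then locations relative to the working directory, then /app/demo-data) are assumptions chosen to mirror the source_dir, attempted_paths, and files_found fields of the response.

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

/// Returns the first candidate that is an existing directory, plus every path
/// that was tried up to that point (comparable to attempted_paths).
/// Assumption: an explicit override is tried before the built-in locations.
fn resolve_demo_dir(env_override: Option<&str>, cwd: &Path) -> (Option<PathBuf>, Vec<PathBuf>) {
    let mut candidates = Vec::new();
    if let Some(dir) = env_override {
        candidates.push(PathBuf::from(dir));
    }
    candidates.push(cwd.join("demo-data"));
    candidates.push(cwd.join("rust-engine").join("demo-data"));
    candidates.push(PathBuf::from("/app/demo-data"));

    let mut attempted = Vec::new();
    for candidate in candidates {
        attempted.push(candidate.clone());
        if candidate.is_dir() {
            return (Some(candidate), attempted);
        }
    }
    (None, attempted)
}

/// Counts regular files with a .pdf extension (comparable to files_found).
/// Assumption: the extension check ignores ASCII case.
fn count_pdfs(dir: &Path) -> io::Result<usize> {
    let mut count = 0;
    for entry in fs::read_dir(dir)? {
        let path = entry?.path();
        let is_pdf = path
            .extension()
            .and_then(|ext| ext.to_str())
            .map_or(false, |ext| ext.eq_ignore_ascii_case("pdf"));
        if path.is_file() && is_pdf {
            count += 1;
        }
    }
    Ok(count)
}
```

Returning the attempted paths alongside the result is what lets a caller report exactly where the lookup went wrong when no directory is found.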

Notes

  • Replace demo embeddings with real Gemini calls for production
  • Add auth to endpoints if needed (API key/JWT)