MATRIX-AI
Stage-1 • Observe + Assist

The AI planning microservice for the Matrix EcoSystem. It generates short, low-risk, auditable remediation plans for Matrix-Guardian and offers a compact, RAG-assisted chat over Matrix docs.

Service: matrix-ai · Version: 1.0.0 · API: /v1/plan · /v1/chat

🛠️ Plan Engine

Given a compact health context from Matrix-Guardian, matrix-ai produces a JSON plan with bounded steps, a risk label, and a brief rationale. Strict schema, PII redaction, short timeouts, and retries ensure safe operation.

• Strict JSON schema (plan_id, steps[], risk, explanation)
• Bounded tokens, low temperature
• Robust parsing + safe fallback
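A minimal sketch of the "robust parsing + safe fallback" idea, using only the field names from the bullet list above. The validator function, the no-op fallback plan, and the exact type checks are illustrative assumptions, not the service's actual implementation:

```python
import json

# Field names taken from the schema bullet above; types are assumptions.
REQUIRED = {"plan_id": str, "steps": list, "risk": str, "explanation": str}
ALLOWED_RISK = {"low", "medium", "high"}

def parse_plan(text: str, max_steps: int = 3) -> dict:
    """Validate raw model output; return a no-op fallback plan if malformed."""
    fallback = {
        "plan_id": "fallback",
        "steps": [],
        "risk": "low",
        "explanation": "Model output failed validation; no action taken.",
    }
    try:
        plan = json.loads(text)
        for key, typ in REQUIRED.items():
            if not isinstance(plan.get(key), typ):
                return fallback
        if plan["risk"] not in ALLOWED_RISK or len(plan["steps"]) > max_steps:
            return fallback
        return plan
    except (json.JSONDecodeError, AttributeError):
        return fallback

good = '{"plan_id": "p-1", "steps": [{"action": "restart"}], "risk": "low", "explanation": "restart"}'
print(parse_plan(good)["risk"])           # valid plan passes through
print(parse_plan("not json")["plan_id"])  # garbage degrades to the fallback
```

The point is that a malformed or over-long plan never reaches Guardian: anything outside the schema degrades to an empty, low-risk plan rather than raising.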

📚 RAG Assist

Lightweight retrieval over curated Matrix docs. We assemble a compact CONTEXT and instruct the model to answer only from those facts (or say it doesn’t know).

• FAISS + embeddings (MiniLM)
• Optional re-ranking for accuracy
• Sources returned with answers

🛡️ Guardrails

Production-safe defaults: JSON logs, request IDs, in-memory rate limiting, idempotent writes (where applicable), and ETag support for caches.

• PII redaction pre-inference
• 429 limits, 5xx propagation
• Trace-friendly logging
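One way the in-memory rate limit could work is a per-client token bucket; when a bucket is empty the handler responds 429. The capacity, refill rate, and class name below are illustrative assumptions, not the service's actual configuration:

```python
import time
from collections import defaultdict

class RateLimiter:
    """Per-client token bucket; parameters here are placeholders."""

    def __init__(self, capacity=5, refill_per_sec=1.0):
        self.capacity = capacity
        self.refill = refill_per_sec
        # Each client starts with a full bucket.
        self.buckets = defaultdict(lambda: (capacity, time.monotonic()))

    def allow(self, client_id: str) -> bool:
        tokens, last = self.buckets[client_id]
        now = time.monotonic()
        # Refill proportionally to elapsed time, capped at capacity.
        tokens = min(self.capacity, tokens + (now - last) * self.refill)
        if tokens < 1:
            self.buckets[client_id] = (tokens, now)
            return False   # caller responds 429 Too Many Requests
        self.buckets[client_id] = (tokens - 1, now)
        return True

rl = RateLimiter(capacity=2, refill_per_sec=0.0)
print([rl.allow("c1") for _ in range(3)])   # [True, True, False]
```

Because state lives in process memory, limits reset on restart and are per-replica; that trade-off is what makes this a "default" rather than a distributed quota.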

How it fits

Matrix-Guardian → /v1/plan → matrix-ai → HF Router → Upstream LLM
Operators → /v1/chat → matrix-ai → RAG → KB (FAISS)

Endpoints
POST /v1/plan   # JSON plan for Guardian (non-stream)
POST /v1/chat   # Q&A over Matrix docs (RAG, stream or non-stream)
GET  /healthz   # Liveness
Quick start (local)
# 1) Export your token (or use Space Secret)
export HF_TOKEN="hf_xxx"

# 2) Run the service
uvicorn app.main:app --host 0.0.0.0 --port 7860

# 3) Try chat (non-stream)
curl -s -X POST localhost:7860/v1/chat \
  -H 'content-type: application/json' \
  -d '{"query":"What is MatrixHub?"}' | jq

# 4) Try plan
curl -s -X POST localhost:7860/v1/plan \
  -H 'content-type: application/json' \
  -d '{"mode":"plan","context":{"entity_uid":"matrix-ai"},"constraints":{"max_steps":3,"risk":"low"}}' | jq