MATRIX-AI
Stage-1 • Observe + Assist

The AI planning microservice for the Matrix EcoSystem. It generates short, low-risk, auditable remediation plans for Matrix-Guardian and offers a compact, RAG-assisted chat over Matrix docs.

Service: matrix-ai · Version: 1.0.0 · API: /v1/plan · /v1/chat

🛠️ Plan Engine

Given a compact health context from Matrix-Guardian, matrix-ai produces a JSON plan with bounded steps, a risk label, and a brief rationale. Strict schema, PII redaction, short timeouts, and retries ensure safe operation.

• Strict JSON schema (plan_id, steps[], risk, explanation)
• Bounded tokens, low temperature
• Robust parsing + safe fallback
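A minimal sketch of the "robust parsing + safe fallback" idea, using only the field names from the bullet list above. The validator function, the no-op fallback plan, and the exact type checks are illustrative assumptions, not the service's actual implementation:

```python
import json

# Field names taken from the schema bullet above; types are assumptions.
REQUIRED = {"plan_id": str, "steps": list, "risk": str, "explanation": str}
ALLOWED_RISK = {"low", "medium", "high"}

def parse_plan(text: str, max_steps: int = 3) -> dict:
    """Validate raw model output; return a no-op fallback plan if malformed."""
    fallback = {
        "plan_id": "fallback",
        "steps": [],
        "risk": "low",
        "explanation": "Model output failed validation; no action taken.",
    }
    try:
        plan = json.loads(text)
        for key, typ in REQUIRED.items():
            if not isinstance(plan.get(key), typ):
                return fallback
        if plan["risk"] not in ALLOWED_RISK or len(plan["steps"]) > max_steps:
            return fallback
        return plan
    except (json.JSONDecodeError, AttributeError):
        return fallback

good = '{"plan_id": "p-1", "steps": [{"action": "restart"}], "risk": "low", "explanation": "restart"}'
print(parse_plan(good)["risk"])           # valid plan passes through
print(parse_plan("not json")["plan_id"])  # garbage degrades to the fallback
```

The point is that a malformed or over-long plan never reaches Guardian: anything outside the schema degrades to an empty, low-risk plan rather than raising.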

📚 RAG Assist

Lightweight retrieval over curated Matrix docs. We assemble a compact CONTEXT and instruct the model to answer only from those facts (or say it doesn’t know).

• FAISS + embeddings (MiniLM)
• Optional re-ranking for accuracy
• Sources returned with answers

🛡️ Guardrails

Production-safe defaults: JSON logs, request IDs, in-memory rate limiting, idempotent writes (where applicable), and ETag support for caches.

• PII redaction pre-inference
• 429 limits, 5xx propagation
• Trace-friendly logging
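One way the in-memory rate limit could work is a per-client token bucket; when a bucket is empty the handler responds 429. The capacity, refill rate, and class name below are illustrative assumptions, not the service's actual configuration:

```python
import time
from collections import defaultdict

class RateLimiter:
    """Per-client token bucket; parameters here are placeholders."""

    def __init__(self, capacity=5, refill_per_sec=1.0):
        self.capacity = capacity
        self.refill = refill_per_sec
        # Each client starts with a full bucket.
        self.buckets = defaultdict(lambda: (capacity, time.monotonic()))

    def allow(self, client_id: str) -> bool:
        tokens, last = self.buckets[client_id]
        now = time.monotonic()
        # Refill proportionally to elapsed time, capped at capacity.
        tokens = min(self.capacity, tokens + (now - last) * self.refill)
        if tokens < 1:
            self.buckets[client_id] = (tokens, now)
            return False   # caller responds 429 Too Many Requests
        self.buckets[client_id] = (tokens - 1, now)
        return True

rl = RateLimiter(capacity=2, refill_per_sec=0.0)
print([rl.allow("c1") for _ in range(3)])   # [True, True, False]
```

Because state lives in process memory, limits reset on restart and are per-replica; that trade-off is what makes this a "default" rather than a distributed quota.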

How it fits

Matrix-Guardian → /v1/plan → matrix-ai → HF Router → Upstream LLM
Operators → /v1/chat → matrix-ai → RAG → KB (FAISS)

Endpoints
POST /v1/plan   # JSON plan for Guardian (non-stream)
POST /v1/chat   # Q&A over Matrix docs (RAG, stream or non-stream)
GET  /healthz   # Liveness
Quick start (local)
# 1) Export your token (or use Space Secret)
export HF_TOKEN="hf_xxx"

# 2) Run the service
uvicorn app.main:app --host 0.0.0.0 --port 7860

# 3) Try chat (non-stream)
curl -s -X POST localhost:7860/v1/chat \
  -H 'content-type: application/json' \
  -d '{"query":"What is MatrixHub?"}' | jq

# 4) Try plan
curl -s -X POST localhost:7860/v1/plan \
  -H 'content-type: application/json' \
  -d '{"mode":"plan","context":{"entity_uid":"matrix-ai"},"constraints":{"max_steps":3,"risk":"low"}}' | jq