Architecture Overview¶

Intelis Insights is made up of five services that work together. This page explains what each one does and how they connect.

System Diagram¶

graph TB
    User([User / Browser])

    subgraph "Intelis Insights"
        App["PHP App<br/>(Slim 4)<br/>:8080"]
        MySQL_App["MySQL<br/>intelis_insights<br/>(app data)"]
        MySQL_Lab["MySQL<br/>vlsm<br/>(lab data)"]
        Sidecar["LLM Sidecar<br/>(Bun/Hono)<br/>:3100"]
        RAG["RAG API<br/>(FastAPI)<br/>:8089"]
        Qdrant["Qdrant<br/>(vector DB)<br/>:6333"]
    end

    LLM["LLM Provider<br/>(Anthropic, OpenAI, etc.)"]

    User --> App
    App --> MySQL_App
    App --> MySQL_Lab
    App --> Sidecar
    App --> RAG
    Sidecar --> LLM
    RAG --> Qdrant

Services¶

PHP App (port 8080)¶

The main application. Serves the dashboard, chat interface, and REST API. Built with Slim 4 and Eloquent ORM.

Renders the frontend (Tailwind CSS, Alpine.js, Chart.js)
Handles the full query pipeline — from question to SQL to chart
Reads and writes to both MySQL databases
Calls the LLM Sidecar and RAG API

MySQL (port 3306)¶

A single MySQL 8.0 server hosts two databases. See Two Databases for details.

intelis_insights — Application data (reports, chat sessions, audit logs)
vlsm — Laboratory data (the InteLIS database that LLM-generated SQL queries run against)

LLM Sidecar (port 3100)¶

A lightweight gateway that routes LLM requests to the configured provider (Anthropic, OpenAI, DeepSeek, Google, Groq, or local Ollama). Written in TypeScript, runs on Bun.

The PHP app never calls LLM APIs directly — it goes through the sidecar
Supports streaming, structured output, tool calling, and embeddings
Manages API keys and model aliases (e.g. sonnet → anthropic:claude-sonnet-4-20250514)

See LLM Sidecar Reference for details.

RAG API (port 8089)¶

A Python FastAPI service that handles text embedding and semantic search. It converts text into vectors and searches for similar content in Qdrant.

Embeds text using the BAAI/bge-small-en-v1.5 model (runs locally, no API calls)
On first start, downloads the embedding model (~200 MB) — this is cached for future starts
Provides /v1/upsert (store snippets) and /v1/search (find similar snippets)

Qdrant (port 6333)¶

An open-source vector database. Stores the embedded RAG snippets and performs fast similarity search.

Holds ~2,400 snippets derived from the database schema, business rules, and field guide
When the app needs context for SQL generation, it searches Qdrant for relevant snippets
Has a built-in web dashboard at localhost:6333/dashboard

Two Query Modes¶

The application has two distinct ways to query data:

1. Dashboard (No LLM)¶

Pre-defined national indicators. The SQL is hard-coded — no AI involved.

User → Dashboard → PHP App → MySQL (vlsm) → Charts

Fast, deterministic, always produces the same results.

2. Conversational Chat (LLM-Powered)¶

The user asks a question in plain English. The system runs a multi-step pipeline to generate, validate, and execute SQL.

User Question
  → Validate against business rules (privacy, forbidden patterns)
  → Check conversation context (references to prior turns)
  → Detect intent via LLM (+ domain relevance check — rejects non-lab questions)
  → Select relevant tables via LLM
  → RAG API (find relevant schema/rules from Qdrant)
  → Build strict RAG allowlist (grounding constraints)
  → Generate SQL via LLM (constrained to allowlist)
  → Validate SQL (privacy + grounding)
  → Execute SQL against MySQL vlsm
  → Store conversation context (for follow-up questions)
  → Suggest chart type (heuristic + optional LLM)
  → JSON Response with data + chart

See Query Pipeline for a detailed step-by-step walkthrough.

Data Flow¶

sequenceDiagram
    participant U as User
    participant A as PHP App
    participant R as RAG API
    participant Q as Qdrant
    participant L as LLM Sidecar
    participant P as LLM Provider
    participant D as MySQL (vlsm)

    U->>A: "VL suppression rate by district?"
    A->>A: Validate business rules (privacy, forbidden patterns)
    A->>A: Check conversation context (prior turns)
    A->>L: Detect intent (classify question)
    L->>P: API call
    P-->>L: Intent + domain relevance
    L-->>A: Intent analysis
    Note over A: Reject if domain_relevance = low
    A->>L: Select relevant tables
    L->>P: API call
    P-->>L: Table list
    L-->>A: Tables (e.g. form_vl, facility_details)
    A->>R: Search for relevant schema/rules
    R->>Q: Vector similarity search
    Q-->>R: Matching snippets
    R-->>A: Schema context + business rules
    A->>A: Build strict RAG allowlist
    A->>L: Generate SQL (with allowlist + context)
    L->>P: API call (Claude/GPT/etc.)
    P-->>L: Generated SQL + confidence
    L-->>A: SQL + verification
    A->>A: Validate SQL (privacy + grounding)
    A->>D: Execute validated SQL
    D-->>A: Query results
    A->>A: Store conversation context
    A->>A: Suggest chart type
    A-->>U: Data + chart + explanation