System Architecture

A modular, redundant platform built for pharmacy operations. 50+ API modules, 37 pharmacy-specific endpoints, 3-node on-premises processing, automatic failover.

Patient Touchpoints

Phone Calls
SMS / Text
Web Portal
Walk-in

AI Processing Layer (On-Premises)

Voice Pipeline: Whisper STT + TTS
Intent Engine: Classify + Route
Tier Classifier: T0/T1/T2/T3
Local LLM: Llama 70B

Core Services

Patient Service: CRUD + Verify
RX Service: Workflow Engine
PA Service: Prior Auth Flow
Inventory: Stock + Reorder

Integrations

PMS Connector: Pioneer Rx / Liberty
Twilio: Voice + SMS
Claude API: Complex Reasoning
PDMP: Controlled Substance

Data Layer

Encrypted Storage: AES-256 + JSON
HIPAA Audit Log: Append-Only, 6yr
Call Recordings: Encrypted Archive

8 Core Modules

1. Voice Pipeline

Twilio + Whisper STT + TTS

Natural language call handling
Multi-turn conversation
DTMF fallback
Encrypted recordings
"Transfer to pharmacist" always available

2. Intent Engine

NLU classification + routing

Refill detection
Status inquiries
Transfer handling
Insurance routing
Emergency triggers

3. Patient Verification

Identity confirmation

Name + DOB matching
Phone number verification
Address confirmation
Cannot confirm meds to unverified callers
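
The verification rule above can be sketched roughly as follows. This is a minimal illustration, not the Patient Service itself: the `PatientRecord` shape, the function names, and the choice to require name, date of birth, and phone together are all assumptions, and address confirmation is left out for brevity.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class PatientRecord:
    # Hypothetical record shape; the real schema is not shown in this document.
    name: str
    dob: date
    phone: str


def _normalize_name(name: str) -> str:
    # Compare names case-insensitively and ignore extra whitespace.
    return " ".join(name.lower().split())


def _normalize_phone(phone: str) -> str:
    # Keep digits only and compare the last 10, so "+1" prefixes do not matter.
    digits = "".join(ch for ch in phone if ch.isdigit())
    return digits[-10:]


def is_verified(record: PatientRecord, spoken_name: str,
                spoken_dob: date, caller_phone: str) -> bool:
    # Assumption: all three checks must pass before the caller counts as verified.
    return (
        _normalize_name(spoken_name) == _normalize_name(record.name)
        and spoken_dob == record.dob
        and _normalize_phone(caller_phone) == _normalize_phone(record.phone)
    )


def medication_reply(verified: bool, meds: list[str]) -> str:
    # Unverified callers never receive medication details.
    if not verified:
        return "I can't confirm medications until we verify your identity."
    return "Medications on file: " + ", ".join(meds)
```

Keeping the medication gate in one function means every channel (phone, SMS, portal) would pass through the same check.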

4. PMS Integration

Pharmacy management system connector

Pioneer Rx adapter
Liberty Software adapter
Computer-Rx, QS/1, McKesson
HL7 / NCPDP protocols

5. Tier Classification

Automation routing engine

10 default classification rules
Controlled substance detection
DUR alert severity routing
Configurable rule priorities
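
A priority-ordered rule engine of the kind listed above might look like the sketch below. The rule names, priority numbers, request fields, the `"severe"` label, and the default tier are illustrative assumptions; only the idea of prioritized rules and the routing of controlled substances and drug-interaction issues to the pharmacist tier (T2) come from this document.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Rule:
    name: str
    priority: int                     # lower number is evaluated first
    matches: Callable[[dict], bool]   # predicate over a request dict
    tier: str                         # "T0".."T3"


# Three hypothetical rules standing in for the 10 defaults.
RULES = [
    Rule("controlled_substance", 10,
         lambda r: r.get("controlled", False), "T2"),
    Rule("severe_dur_alert", 20,
         lambda r: r.get("dur_severity") == "severe", "T2"),
    Rule("eligible_refill", 90,
         lambda r: r.get("action") == "refill" and r.get("eligible", False), "T0"),
]


def classify(request: dict, rules=RULES, default: str = "T1") -> str:
    # The first matching rule in priority order decides the tier.
    for rule in sorted(rules, key=lambda r: r.priority):
        if rule.matches(request):
            return rule.tier
    # Assumption: anything no rule recognizes goes to a human (T1), not to automation.
    return default
```

Because the safety rules carry lower priority numbers, an eligible refill for a controlled substance still lands at T2 rather than being automated.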

6. Staff Dashboard

Real-time operations interface

Live call queue
Approval workflow (T1-T3)
Transcript viewer
One-click actions
Analytics + metrics

7. Notifications

Patient communication system

SMS ready-for-pickup alerts
Refill reminders (3 days before)
Pickup reminders (7 days)
Quiet hours enforcement
Caregiver copy option
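
The reminder timing and quiet-hours behavior can be sketched as below. The quiet window (21:00 to 08:00), the reading of "3 days before" as three days before the supply runs out, and the function names are assumptions; time zones are ignored to keep the example short.

```python
from datetime import date, datetime, time, timedelta

QUIET_START = time(21, 0)  # assumed start of quiet hours
QUIET_END = time(8, 0)     # assumed end of quiet hours


def refill_reminder_date(runs_out_on: date) -> date:
    # Assumption: the reminder goes out 3 days before the supply runs out.
    return runs_out_on - timedelta(days=3)


def in_quiet_hours(t: time) -> bool:
    # The window wraps past midnight, so it is the union of two ranges.
    return t >= QUIET_START or t < QUIET_END


def next_send_time(requested: datetime) -> datetime:
    # Messages outside quiet hours go out immediately.
    if not in_quiet_hours(requested.time()):
        return requested
    # Otherwise hold the message until quiet hours end.
    send_day = requested.date()
    if requested.time() >= QUIET_START:
        send_day += timedelta(days=1)  # evening message waits until next morning
    return datetime.combine(send_day, QUIET_END)
```

A message requested at 22:30 would be held until 08:00 the next day, while one requested at 06:00 would go out at 08:00 the same day.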

8. Audit Module

HIPAA compliance engine

All PHI access logged
6-year append-only retention
Anomaly detection
On-demand compliance reports
Actor + IP + timestamp tracking
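
One common way to make an append-only log tamper-evident is to chain each entry to the hash of the previous one. The sketch below shows that technique with the actor, IP, and timestamp fields listed above. Hash chaining is an assumption here, not something this document states the Audit Module uses, and the in-memory list stands in for real storage. Retention is not modeled.

```python
import hashlib
import json
from datetime import datetime, timezone

GENESIS = "0" * 64  # placeholder "previous hash" for the first entry


def _digest(body: dict) -> str:
    # Hash a canonical JSON encoding so key order cannot change the result.
    return hashlib.sha256(json.dumps(body, sort_keys=True).encode()).hexdigest()


class AuditLog:
    def __init__(self):
        self._entries = []  # hypothetical in-memory store

    def append(self, actor: str, ip: str, action: str, when=None) -> dict:
        when = when or datetime.now(timezone.utc)
        prev = self._entries[-1]["hash"] if self._entries else GENESIS
        body = {"actor": actor, "ip": ip, "action": action,
                "timestamp": when.isoformat(), "prev": prev}
        entry = {**body, "hash": _digest(body)}
        self._entries.append(entry)
        return entry

    def verify(self) -> bool:
        # Recompute every hash; any edited or removed entry breaks the chain.
        prev = GENESIS
        for entry in self._entries:
            body = {k: v for k, v in entry.items() if k != "hash"}
            if entry["prev"] != prev or _digest(body) != entry["hash"]:
                return False
            prev = entry["hash"]
        return True
```

A compliance report could call `verify()` first and refuse to run if the chain is broken.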

AI / LLM Stack

Hybrid approach: local LLM handles 90% of requests. Cloud AI handles complex reasoning with anonymized context.

On-Premises (Primary)

Llama 3.1 70B on Mac Mini M4 Pro

Intent classification
Routine patient queries
Data extraction
Response validation
No PHI leaves premises
No per-token cost
~15 tokens/sec inference

Cloud (Fallback)

Claude API with BAA

Complex reasoning
Nuanced patient questions
Sentiment analysis
Document generation
Anonymized context only
GPT-4 as secondary fallback
99.9% SLA
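
The hybrid routing described above can be sketched as a small fallback chain. Everything named here is hypothetical: providers are plain callables rather than real client libraries, `is_complex` is assumed to come from the Intent Engine, and the two regular expressions are only a stand-in for real de-identification, which needs far more than pattern matching.

```python
import re
from typing import Callable, Sequence

Provider = Callable[[str], str]  # hypothetical: takes a prompt, returns a reply

# Illustrative patterns only; not a complete PHI scrubber.
_PATTERNS = [
    (re.compile(r"\b\d{3}[-.\s]?\d{3}[-.\s]?\d{4}\b"), "[PHONE]"),
    (re.compile(r"\b\d{1,2}/\d{1,2}/\d{4}\b"), "[DATE]"),
]


def anonymize(text: str) -> str:
    # Replace recognizable identifiers before anything leaves the premises.
    for pattern, token in _PATTERNS:
        text = pattern.sub(token, text)
    return text


def answer(prompt: str, is_complex: bool,
           local: Provider, cloud: Sequence[Provider]) -> str:
    if is_complex:
        safe_prompt = anonymize(prompt)
        for provider in cloud:          # e.g. primary cloud model, then secondary
            try:
                return provider(safe_prompt)
            except Exception:
                continue                # provider down: try the next one
    # Routine requests, or every cloud provider failed: stay on the local model.
    return local(prompt)
```

Only the anonymized prompt is ever passed to a cloud provider; the local model receives the original text because it runs on-premises.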

3-Node Redundancy

Scenario	What Happens	Downtime
Primary server crash	Hot standby takes over automatically	<30 seconds
Both active nodes fail	Cold standby activated manually	<5 minutes
Internet outage at pharmacy	Phone system continues (Twilio hosted externally)	0 seconds
All cloud LLM providers down	Local Llama handles calls with simpler responses, escalates more to staff	0 seconds
Power outage	UPS keeps Mac Minis running for 15-30 min. Twilio continues externally.	0 seconds (calls)

Total redundancy cost: ~$85/month (standby VPS + monitoring + backups). The previous vendor's system went down because it had a single point of failure. This design has none.
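
The first two rows of the failover table can be sketched as a heartbeat check. The 10-second timeout, the `Node` shape, and the function name are assumptions for illustration; the document states only that hot-standby takeover is automatic and cold-standby activation is manual.

```python
from dataclasses import dataclass

HEARTBEAT_TIMEOUT_S = 10.0  # assumed; must sit well under the 30-second target


@dataclass
class Node:
    name: str
    last_heartbeat: float  # seconds on a shared monotonic clock

    def healthy(self, now: float) -> bool:
        # A node is healthy if it reported within the timeout window.
        return now - self.last_heartbeat <= HEARTBEAT_TIMEOUT_S


def choose_active(primary: Node, hot_standby: Node, now: float) -> str:
    if primary.healthy(now):
        return primary.name
    if hot_standby.healthy(now):
        return hot_standby.name            # automatic takeover
    return "MANUAL: activate cold standby"  # both active nodes are down
```

A monitor would call `choose_active` on a short interval and repoint traffic whenever the result changes.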

Modular Architecture

50+ isolated modules. Each one runs independently. Update one without touching the rest. No full reimaging. No full rollbacks. No downtime.

Voice Pipeline
Intent Engine
Patients
Prescriptions
Prior Auth
Inventory
Clinical Svc
Notifications
Tier Engine
Automations
Audit Trail
Analytics
PMS Bridge
PDMP Check
Insurance
Drug DB
Auth / MFA
Security
Scheduler
Events Bus

Each box is an independent module. Pull one out, the rest keep running.

Why Modularity Matters for Reliability

Hot-Swap Updates

Update one module at a time across the 3-node cluster

Update the Inventory module on Node B while Node A handles traffic
Verify it works on Node B, then roll it to Node A
Node C stays untouched as the known-good fallback
If the update breaks something: roll back that one module, not the whole system

Fault Isolation

One module failing never takes down the system

If the PA module has a bug, prescriptions still process
If the voice pipeline has an issue, the dashboard still works
If inventory throws an error, patient records are untouched
Each module has its own error handling and recovery

No Full Reimaging

Traditional systems require full OS restores on failure

Old way: system goes down, reimage the whole server, restore from backup, pray
Our way: identify the broken module, swap it with the known-good version from another node
Minutes to fix, not hours
Data layer is separate from logic layer -- your patient data is never at risk

Modular + 3-Node = Rolling Updates

Step	Node A (Primary)	Node B (Hot Standby)	Node C (DR)
1. Prep update	Serving traffic	Receives update	Unchanged
2. Test on B	Serving traffic	Testing updated module	Unchanged
3. Promote B	Standby (old version)	Now serving traffic	Unchanged
4. Update A	Receives same update	Serving traffic	Unchanged
5. Both current	Back to primary	Hot standby (current)	Snapshot updated

Total downtime during the entire update process: zero seconds. Traffic never stops. If the update fails on Node B, Node A is still running the proven version. Node C is always available as the last-known-good backup.
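
The five steps in the table can be traced with a small simulation. The function name, the version strings, and the `passes_tests` callback are hypothetical; the sketch only tracks which node serves traffic and which version each node runs, and it assumes Node A is serving when the update starts.

```python
from typing import Callable


def rolling_update(
    versions: dict[str, str],
    new_version: str,
    passes_tests: Callable[[str], bool],
) -> tuple[dict[str, str], str, list[str]]:
    """Return (versions, serving_node, step_log) after attempting the update."""
    versions = dict(versions)  # work on a copy; the caller's dict is untouched
    serving = "A"              # assumption: A is primary at the start
    log = []

    previous_b = versions["B"]
    versions["B"] = new_version
    log.append("1. B receives update; A serving")

    if not passes_tests(new_version):
        versions["B"] = previous_b  # roll back only the node under test
        log.append("2. test failed on B; B rolled back; A still serving")
        return versions, serving, log
    log.append("2. B passed tests; A serving")

    serving = "B"
    log.append("3. B promoted; A on old version as standby")

    versions["A"] = new_version
    log.append("4. A receives update; B serving")

    serving = "A"
    versions["C"] = new_version  # refresh the DR snapshot last
    log.append("5. A back to primary; B hot standby; C snapshot updated")
    return versions, serving, log
```

At every step exactly one node is serving, and Node C changes only after both A and B run the new version.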

The Previous Vendor's Mistake

Monolithic systems go down because everything is coupled. One bad update, one crashed service, or one corrupted config, and the whole thing stops. With modular architecture, you're swapping a single Lego brick while the rest of the structure stands. With 3 nodes, you always have a working copy to fall back to. This is how enterprise systems are built. This is why your phones stay up.

Action	T0 Auto	T1 Tech	T2 RPh	T3 Mgr
Status inquiry	AUTO
Refill request (eligible)	AUTO
Store hours / location	AUTO
Transfer request		TECH
Insurance question		TECH
PA form submission		TECH
New Rx verification			RPh
Controlled substance			RPh
Drug interaction override			RPh
Clinical consultation			RPh
Price adjustment				MGR
Refund processing				MGR
Patient complaint				MGR
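
The table above maps directly onto a lookup structure. The sketch below copies the rows as data; the dictionary and function names are hypothetical, and treating an unlisted action as an error is an assumption rather than documented behavior.

```python
# Role responsible for each tier, taken from the table header.
TIER_ROLES = {"T0": "AUTO", "T1": "TECH", "T2": "RPh", "T3": "MGR"}

# One entry per table row.
ACTION_TIERS = {
    "Status inquiry": "T0",
    "Refill request (eligible)": "T0",
    "Store hours / location": "T0",
    "Transfer request": "T1",
    "Insurance question": "T1",
    "PA form submission": "T1",
    "New Rx verification": "T2",
    "Controlled substance": "T2",
    "Drug interaction override": "T2",
    "Clinical consultation": "T2",
    "Price adjustment": "T3",
    "Refund processing": "T3",
    "Patient complaint": "T3",
}


def approver_for(action: str) -> str:
    # Raises KeyError for actions that are not in the table (assumed behavior).
    return TIER_ROLES[ACTION_TIERS[action]]


def needs_human(action: str) -> bool:
    # Only T0 actions complete without staff approval.
    return ACTION_TIERS[action] != "T0"
```

For example, `approver_for("Controlled substance")` returns `"RPh"`, and `needs_human("Status inquiry")` returns `False`.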