TinyAI is our family of purpose-built AI models. Each one is under 300MB, runs on standard CPU hardware with no GPU, and deploys entirely inside your environment. The architecture is a 3-layer stack where specialized models work together to turn data into auditable decisions.
The architecture
A TinyAI deployment is typically a 3-layer stack: NLP feeds ML, and ML feeds Logic. An orchestration layer coordinates everything, running multiple specialized models in parallel with sub-300ms latency.
NLP layer: extraction, recognition, and classification. Converts unstructured documents, contracts, forms, and scanned records into structured data fields that the layers above can process.
ML layer: pattern recognition, scoring, and anomaly detection. Trained on your historical data, from a minimum of 200 labelled datasets, in under one hour. Compares new inputs against learned patterns to generate predictions with confidence scores.
Logic layer: deterministic decision chains where one model calls the next, combining policy rules, routing logic, threshold checks, and human-in-the-loop checkpoints. Aggregates outputs across decision dimensions and produces a single structured decision with a full audit trail.
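The NLP, ML, and Logic flow described above can be sketched in a few lines of Python. This is an illustrative sketch only, not TinyAI's actual API: the function names (`nlp_extract`, `ml_score`, `logic_decide`, `run_pipeline`), the `key=value` input format, the toy risk formula, and the 0.7 confidence threshold are all assumptions made for the example.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Decision:
    outcome: str                      # "approve", "reject", or "human_review"
    confidence: float
    trail: List[str] = field(default_factory=list)  # human-readable audit trail


def nlp_extract(text: str) -> Dict[str, float]:
    """NLP layer stand-in (hypothetical): parse 'key=value' tokens into fields."""
    fields: Dict[str, float] = {}
    for token in text.split():
        if "=" in token:
            key, value = token.split("=", 1)
            fields[key] = float(value)
    return fields


def ml_score(fields: Dict[str, float]) -> Dict[str, float]:
    """ML layer stand-in (hypothetical): toy risk score with a confidence."""
    risk = min(fields["amount"] / max(fields["income"], 1.0), 1.0)
    confidence = max(risk, 1.0 - risk)  # far from 0.5 means more confident
    return {"risk": risk, "confidence": confidence}


def logic_decide(pred: Dict[str, float], threshold: float = 0.7) -> Decision:
    """Logic layer stand-in: deterministic rules plus a human checkpoint."""
    trail = [f"risk={pred['risk']:.2f}", f"confidence={pred['confidence']:.2f}"]
    if pred["confidence"] < threshold:
        outcome = "human_review"      # low confidence routes to a person
    elif pred["risk"] >= 0.5:
        outcome = "reject"
    else:
        outcome = "approve"
    trail.append(f"threshold={threshold} -> {outcome}")
    return Decision(outcome, pred["confidence"], trail)


def run_pipeline(text: str) -> Decision:
    """Chain the three layers: NLP feeds ML, ML feeds Logic."""
    return logic_decide(ml_score(nlp_extract(text)))


if __name__ == "__main__":
    decision = run_pipeline("amount=500 income=1000")
    print(decision.outcome, decision.trail)
```

The same input always yields the same decision and the same trail, which is the property the Logic layer's audit trail depends on.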
<300MB per model · No GPU required
<1hr training time · On 200+ labelled datasets
<300ms inference latency · End-to-end, no round trips
93%+ accuracy · After fine-tuning on your data
Explainability
TinyAI produces structured, deterministic outputs, not free-form text. Every decision is explainable and audit-ready by design.
Each model produces a single-variable structured output. Each decision carries its confidence score and the features that drove it.
NLP models use contextual attribution. Vision models produce spatial heatmaps. Both operate on actual model weights, not surrogate approximations. Results are persisted alongside inference logs.
Every decision produces a structured audit record: input features, weight contributions, model version, timestamp, and confidence threshold. Designed to support regulatory requirements across banking, pharma, and insurance without additional dependencies.
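As a rough illustration of the audit record described above, the sketch below builds one record with the listed fields and serializes it as a JSON line. The class and function names, the added `confidence` field, and the linear model used to compute weight contributions (weight multiplied by feature value) are assumptions for the example, not TinyAI's actual schema or attribution method.

```python
import json
from dataclasses import asdict, dataclass
from datetime import datetime, timezone
from typing import Dict


@dataclass(frozen=True)
class AuditRecord:
    input_features: Dict[str, float]
    weight_contributions: Dict[str, float]
    model_version: str
    timestamp: str                  # ISO 8601, UTC
    confidence_threshold: float
    confidence: float               # assumed extra field for this sketch


def build_audit_record(
    features: Dict[str, float],
    weights: Dict[str, float],
    model_version: str,
    confidence: float,
    confidence_threshold: float,
) -> AuditRecord:
    """Assemble one per-decision record (hypothetical helper)."""
    # Assumption: a linear model, so each contribution is weight * value.
    contributions = {name: weights[name] * value for name, value in features.items()}
    return AuditRecord(
        input_features=dict(features),
        weight_contributions=contributions,
        model_version=model_version,
        timestamp=datetime.now(timezone.utc).isoformat(),
        confidence_threshold=confidence_threshold,
        confidence=confidence,
    )


record = build_audit_record(
    features={"amount": 2.0, "tenure": 4.0},
    weights={"amount": 0.5, "tenure": -0.25},
    model_version="v1",
    confidence=0.91,
    confidence_threshold=0.7,
)

# One JSON object per line is easy to store next to inference logs.
audit_line = json.dumps(asdict(record), sort_keys=True)
```

Freezing the dataclass keeps a record from being reassigned after it is created, which suits a write-once audit log.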
Deployment
Every deployment option comes with full data residency and sub-300ms latency. Zero data egress. No PII is transmitted to Synapze or any third party.
Customer-managed servers. Zero outbound calls. Air-gapped compatible. Full business continuity even disconnected from the internet.
Inside your own AWS, Azure, or GCP account. No data leaves the cloud boundary. Same performance guarantees as on-premise.
EU-only or jurisdiction-specific data centres. Satisfies GDPR Art. 44 to 49. Compatible with your existing DLP and network controls.
Get started
A 30-minute discovery call, a no-obligation ROI assessment, then a pilot on a single process. Prove it works before you commit.
No GPU procurement · No core banking disruption · Live in 12 weeks