Under 300MB per model

How TinyAI works.

TinyAI is our family of purpose-built AI models. Each one is under 300MB, runs on standard CPU hardware with no GPU, and deploys entirely inside your environment. The architecture is a 3-layer stack where specialized models work together to turn data into auditable decisions.

The architecture

Three layers. One decision.

A TinyAI deployment is typically a 3-layer stack: NLP feeds ML, and ML feeds Logic. The orchestration layer coordinates everything, running multiple specialized models in parallel with sub-300ms latency.

L1

NLP
(Foundation)

Extraction, recognition, and classification. Converts unstructured documents, contracts, forms, and scanned records into structured data fields that the layers above can process.

L2

Machine Learning (Intelligence)

Pattern recognition, scoring, and anomaly detection. Trained on your historical data in under one hour, from a minimum of 200 labelled datasets. Compares new inputs against learned patterns to generate predictions with confidence scores.
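To make "compares new inputs against learned patterns" concrete, here is a minimal sketch. It assumes a nearest-centroid classifier with a softmax-style confidence, which is an illustrative stand-in and not necessarily how TinyAI models score inputs; the function names and sample labels are hypothetical.

```python
import math
from collections import defaultdict


def fit_centroids(rows, labels):
    """Average the feature vectors of each label (the 'learned patterns')."""
    sums = {}
    counts = defaultdict(int)
    for row, label in zip(rows, labels):
        if label not in sums:
            sums[label] = [0.0] * len(row)
        sums[label] = [s + x for s, x in zip(sums[label], row)]
        counts[label] += 1
    return {
        label: [s / counts[label] for s in total]
        for label, total in sums.items()
    }


def predict(centroids, row):
    """Return (label, confidence) for one new input.

    Confidence is a softmax over negative Euclidean distances, so an input
    that sits close to one centroid and far from the rest scores near 1.0.
    """
    weights = {
        label: math.exp(-math.dist(row, centroid))
        for label, centroid in centroids.items()
    }
    best = max(weights, key=weights.get)
    return best, weights[best] / sum(weights.values())


# Hypothetical labelled history: two features per record.
centroids = fit_centroids(
    [[0, 0], [0, 2], [10, 10], [10, 12]],
    ["ok", "ok", "fraud", "fraud"],
)
label, confidence = predict(centroids, [1, 1])
```

The point of the sketch is the shape of the output: every prediction arrives with a confidence score that the layer above can check against a threshold.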

L3

Logic and Orchestration (Decisions)

Deterministic decision chains where one model calls the next. Policy rules, routing logic, threshold checks, and human-in-the-loop checkpoints. Aggregates outputs across decision dimensions and produces a single structured decision with a full audit trail.
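The three layers described above can be sketched as a short chain. This is an illustration under stated assumptions, not TinyAI's implementation: the regex extractor, the toy risk score, the thresholds, and every function name are hypothetical stand-ins for real models.

```python
import re
from dataclasses import dataclass, field


@dataclass
class Decision:
    outcome: str        # "approve", "reject", or "human_review"
    confidence: float
    audit_trail: list = field(default_factory=list)


def nlp_extract(text):
    """Layer 1 stand-in: pull an amount field out of free text."""
    match = re.search(r"amount[:\s]+(\d+(?:\.\d+)?)", text, re.IGNORECASE)
    return {"amount": float(match.group(1)) if match else None}


def ml_score(fields):
    """Layer 2 stand-in: toy risk score plus a confidence value."""
    amount = fields["amount"]
    if amount is None:
        return 0.0, 0.0  # nothing to score, so zero confidence
    risk = min(amount / 10_000, 1.0)
    confidence = abs(risk - 0.5) * 2  # further from the boundary = surer
    return risk, confidence


def decide(text, risk_threshold=0.5, min_confidence=0.6):
    """Layer 3: deterministic rules over the outputs of layers 1 and 2."""
    trail = []
    fields = nlp_extract(text)
    trail.append(("nlp_extract", fields))
    risk, confidence = ml_score(fields)
    trail.append(("ml_score", {"risk": risk, "confidence": confidence}))
    if confidence < min_confidence:
        outcome = "human_review"  # human-in-the-loop checkpoint
    elif risk >= risk_threshold:
        outcome = "reject"        # threshold check
    else:
        outcome = "approve"
    trail.append(("logic", {"outcome": outcome}))
    return Decision(outcome, confidence, trail)


low = decide("Invoice amount: 1200")
high = decide("amount: 9500")
unsure = decide("amount: 5200")
missing = decide("no figure in this document")
```

Because the last layer is plain rules, the same input always yields the same outcome, and the trail records what each layer contributed.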

<300MB

Per model

No GPU required

<1hr

Training time

On 200+ labelled datasets

<300ms

Inference latency

End-to-end, no round trips

93%+

Accuracy

After fine-tuning on your data

Explainability

Every decision, explained in two layers.

TinyAI produces structured, deterministic outputs, not free-form text. Every decision is explainable and audit-ready by design.

L1

Structured Decision Output

One single-variable structured output per model. Each decision carries its confidence score and the features that drove it, plus a per-decision audit record: input features, weight contributions, model version, timestamp, and confidence threshold.

L2

Context Attribution

NLP models use contextual attribution. Vision models produce spatial heatmaps. Both operate on actual model weights, not surrogate approximations. Results are persisted alongside inference logs.

Audit-Ready by Design

Every decision produces a structured audit record: input features, weight contributions, model version, timestamp, and confidence threshold. Designed to support regulatory requirements across banking, pharma, and insurance without additional dependencies.
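As a rough illustration of what such an audit record could look like, the sketch below uses the five fields named above and adds the confidence and decision for context. It assumes a simple linear model, where each weight contribution is just weight times feature value; the class, function, feature names, and version string are all hypothetical.

```python
import json
import math
from dataclasses import asdict, dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class AuditRecord:
    input_features: dict
    weight_contributions: dict
    model_version: str
    timestamp: str
    confidence_threshold: float
    confidence: float
    decision: str


def score_with_audit(features, weights, bias, model_version, confidence_threshold):
    """Score one input with a linear model and return its audit record."""
    # Read contributions directly from the model's own weights,
    # not from a surrogate explainer.
    contributions = {name: weights[name] * value for name, value in features.items()}
    logit = bias + sum(contributions.values())
    confidence = 1 / (1 + math.exp(-logit))  # logistic squashing to 0..1
    decision = "flag" if confidence >= confidence_threshold else "pass"
    return AuditRecord(
        input_features=dict(features),
        weight_contributions=contributions,
        model_version=model_version,
        timestamp=datetime.now(timezone.utc).isoformat(),
        confidence_threshold=confidence_threshold,
        confidence=confidence,
        decision=decision,
    )


record = score_with_audit(
    features={"amount_z": 2.0, "new_payee": 1.0},
    weights={"amount_z": 0.8, "new_payee": 1.1},
    bias=-1.0,
    model_version="hypothetical-v1",
    confidence_threshold=0.7,
)
serialized = json.dumps(asdict(record))  # ready to store next to inference logs
```

A frozen dataclass is used here so a record cannot be altered after it is written, which suits an audit trail.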

Deployment

Three topologies. One performance promise.

All three offer full data residency and sub-300ms latency. Zero data egress. No PII transmitted to Synapze or any third party.

01

On-Premises

Customer-managed servers. Zero outbound calls. Air-gapped compatible. Full business continuity even disconnected from the internet.

02

Private Cloud (VPC)

Inside your own AWS, Azure, or GCP account. No data leaves the cloud boundary. Same performance guarantees as on-premises.

03

Sovereign Cloud Regions

EU-only or jurisdiction-specific data centres. Satisfies GDPR Art. 44 to 49. Compatible with your existing DLP and network controls.

Get started

Start with a one-model pilot.

A 30-minute discovery call, a no-obligation ROI assessment, then a pilot on a single process. Prove it works before you commit.

No GPU procurement · No core banking disruption · Live in 12 weeks