Aina

Push an ONNX file.
Roll it across your edge fleet.

Cross-compile for each target with Apache TVM, distribute across your KubeEdge fleet, and hot-swap the running model with one API call. No per-device SSH loop.

AGPL-3.0 · Self-hostable · k8s-native
Aina dashboard — edge fleet overview showing devices, deployments, and live metrics

Roll a new model across the fleet.

yolov8n v3.1 → v3.2 · rolling, one device at a time
edge-01 · jetson-nano · yolov8n@v3.1
edge-02 · rpi-4 / arm64 · yolov8n@v3.1
edge-03 · arm64 server · yolov8n@v3.1
edge-04 · x86_64 · yolov8n@v3.1
fleet active · 0/4 on new version
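A rolling update of this kind can be sketched in a few lines. This is an illustrative sketch, not Aina's implementation: the `rolling_update` function and the `swap` callback are hypothetical names, and halting at the first failed device is an assumed policy.

```python
from typing import Callable, Dict, List


def rolling_update(
    devices: List[str],
    versions: Dict[str, str],
    new_version: str,
    swap: Callable[[str, str], bool],
) -> List[str]:
    """Move devices to new_version one at a time; stop at the first failure."""
    updated = []
    for device in devices:
        if versions[device] == new_version:
            continue  # already on the new version
        if not swap(device, new_version):
            break  # assumed policy: halt so remaining devices keep the old version
        versions[device] = new_version
        updated.append(device)
    return updated


# Example fleet from the panel above; the swap on edge-03 is made to fail.
fleet = {"edge-01": "v3.1", "edge-02": "v3.1", "edge-03": "v3.1", "edge-04": "v3.1"}
done = rolling_update(list(fleet), fleet, "v3.2", swap=lambda d, v: d != "edge-03")
print(f"{len(done)}/{len(fleet)} on new version")
```

Going one device at a time keeps a bad build from reaching more than one device before someone notices.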
Cross-compiles for the targets you actually deploy to.
One ONNX file → TVM-autotuned, op-fused builds per target.
NVIDIA Jetson · arm64
Raspberry Pi · arm64 · neon
ARM64 server · aarch64
x86_64 · avx2 · llvm

The full deployment lifecycle, end to end.

Most teams stitch together a registry, a build farm, an OTA system, and a metrics stack. Aina is one coherent platform for all of it.

Model hot-swap, no SSH

Deploy a new model and the device swaps what it serves on its own. The edge runtime reloads the new graph and resumes. No scp, no SSH, no manual restart.

RUNTIME
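
The hot-swap idea can be sketched as a slot that holds the serving model and replaces it in place. This is a minimal sketch under stated assumptions: `ModelSlot`, `swap`, and the toy lambda "models" are hypothetical and stand in for whatever the edge runtime actually loads.

```python
import threading
from typing import Callable

Model = Callable[[float], float]


class ModelSlot:
    """Holds the model being served and lets it be replaced without a restart."""

    def __init__(self, model: Model, version: str) -> None:
        self._lock = threading.Lock()
        self._model = model
        self._version = version

    def predict(self, x: float) -> float:
        with self._lock:
            model = self._model  # take the current reference under the lock
        return model(x)  # run inference outside the lock

    def swap(self, loader: Callable[[], Model], version: str) -> None:
        new_model = loader()  # load fully first; a failed load leaves the old model serving
        with self._lock:
            self._model = new_model
            self._version = version

    @property
    def version(self) -> str:
        with self._lock:
            return self._version


slot = ModelSlot(lambda x: x * 2.0, "v3.1")
before = slot.predict(2.0)
slot.swap(lambda: (lambda x: x * 3.0), "v3.2")
after = slot.predict(2.0)
```

Loading the new model before taking the lock means requests keep hitting the old model until the replacement is ready.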

Hardware-aware compilation

Cross-compile your ONNX model — classification, detection, super-resolution, even speech encoders — with Apache TVM for x86 or ARM64. MetaSchedule autotunes the kernels to the device (~7× faster inference, measured) before you ship.

BUILD
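
As a rough illustration of per-target builds, a fleet's architectures can be reduced to one build per distinct target. The target strings below are LLVM-backend strings commonly used with TVM; whether Aina uses these exact strings is an assumption, and `TVM_TARGETS`, `target_for`, and `build_matrix` are hypothetical names.

```python
from typing import Dict, List

# Assumed mapping from architecture to a TVM target string (not confirmed by Aina).
TVM_TARGETS: Dict[str, str] = {
    "arm64": "llvm -mtriple=aarch64-linux-gnu -mattr=+neon",
    "x86_64": "llvm -mcpu=core-avx2",
}


def target_for(arch: str) -> str:
    """Return the target string for an architecture, or fail loudly."""
    if arch not in TVM_TARGETS:
        raise ValueError(f"unsupported architecture: {arch}")
    return TVM_TARGETS[arch]


def build_matrix(model: str, version: str, archs: List[str]) -> List[dict]:
    """Plan one build per distinct architecture, however many devices share it."""
    return [
        {"model": model, "version": version, "arch": arch, "target": target_for(arch)}
        for arch in sorted(set(archs))
    ]


# Three arm64 devices and one x86_64 device need only two builds.
builds = build_matrix("yolov8n", "v3.2", ["arm64", "arm64", "arm64", "x86_64"])
```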

Durable job pipeline

Compilation and deployment run as NATS JetStream jobs with retries. If the backend restarts mid-flight, stuck jobs recover on their own. Submit and walk away.

QUEUE
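
The recovery behaviour can be illustrated with a small in-memory simulation. It does not use the NATS client; it only mimics the idea behind JetStream's `AckWait` and `MaxDeliver` consumer settings, where a message that is not acknowledged in time is delivered again. `LeaseQueue` and `Job` are hypothetical names.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class Job:
    job_id: str
    deliveries: int = 0
    leased_until: Optional[float] = None  # None: not currently held by a worker
    done: bool = False


class LeaseQueue:
    def __init__(self, ack_wait: float, max_deliver: int) -> None:
        self.ack_wait = ack_wait
        self.max_deliver = max_deliver
        self.jobs: Dict[str, Job] = {}

    def submit(self, job_id: str) -> None:
        self.jobs[job_id] = Job(job_id)

    def fetch(self, now: float) -> Optional[Job]:
        """Hand out a job that is unleased or whose lease has expired."""
        for job in self.jobs.values():
            available = job.leased_until is None or job.leased_until <= now
            if not job.done and available and job.deliveries < self.max_deliver:
                job.deliveries += 1
                job.leased_until = now + self.ack_wait
                return job
        return None

    def ack(self, job_id: str) -> None:
        self.jobs[job_id].done = True


q = LeaseQueue(ack_wait=30.0, max_deliver=3)
q.submit("compile-yolov8n-v3.2")
first = q.fetch(now=0.0)    # a worker takes the job, then crashes without acking
during = q.fetch(now=10.0)  # lease still held, nothing to hand out
second = q.fetch(now=31.0)  # lease expired, the job is redelivered
q.ack(second.job_id)
```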

Kubernetes-native

Compilation runs as Kubernetes Jobs; the edge runtime ships as a DaemonSet on your KubeEdge nodes. The whole stack installs from one Helm chart.

CONTROL_PLANE

Edge-first execution

Built for intermittent links and constrained devices. Runs on KubeEdge nodes; each device pulls its model on demand through a short-lived presigned URL.

DISTRIBUTION

Prometheus observability

Per-device inference latency, throughput, errors, model-load time, and CPU/memory/disk, exported in Prometheus format and rendered on the dashboard.

TELEMETRY

See your fleet.

Live device health, per-node inference metrics, and one-click compile — all in one place.

Aina dashboard — edge fleet overview with device, deployment, and resource metrics
Edge devices page — per-node arch, IP, OS, runtime, live CPU/memory/disk gauges and inference metrics
Per-device telemetry
Compile modal — target architecture, measure-on-device, and autotune trial count
One-click compile + autotune

From aina push to live inference, in one flow.

Four stages, one flow.

01

Upload

Push a model artifact to Aina. It is stored in object storage and versioned by name.

02

Compile

Aina spawns a Kubernetes Job that compiles the model with Apache TVM and autotunes the kernels to your target with MetaSchedule. Progress streams live.

03

Deploy

Create a deployment and start it. A NATS JetStream job hands the device a short-lived presigned URL; the device pulls the model over HTTP — a path that survives flaky uplinks and offline windows.
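
To make "short-lived presigned URL" concrete, here is a toy version built on an HMAC signature with an expiry time. It is a sketch of the general idea only: real object stores use their own signing schemes, and `SECRET`, `presign`, `verify`, and the `storage.example` host are hypothetical.

```python
import hashlib
import hmac
from urllib.parse import parse_qs, urlencode, urlparse

SECRET = b"demo-secret"  # hypothetical signing key held by the storage service


def _sign(path: str, expires: int) -> str:
    return hmac.new(SECRET, f"{path}:{expires}".encode(), hashlib.sha256).hexdigest()


def presign(path: str, now: int, ttl: int = 300) -> str:
    """Return a URL that grants access to path until now + ttl seconds."""
    expires = now + ttl
    query = urlencode({"expires": expires, "sig": _sign(path, expires)})
    return f"https://storage.example{path}?{query}"


def verify(url: str, now: int) -> bool:
    """Accept the URL only if it is unexpired and the signature matches the path."""
    parts = urlparse(url)
    query = parse_qs(parts.query)
    try:
        expires = int(query["expires"][0])
        sig = query["sig"][0]
    except (KeyError, ValueError):
        return False
    return now < expires and hmac.compare_digest(sig, _sign(parts.path, expires))


url = presign("/models/yolov8n/v3.2/model.tar", now=1000)
ok_in_window = verify(url, now=1100)
ok_after_expiry = verify(url, now=1300)
ok_tampered = verify(url.replace("v3.2", "v3.1"), now=1100)
```

Because the URL carries its own expiry and signature, the device needs no long-lived storage credentials.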

04

Hot-swap

The device reloads the new model and resumes serving, no SSH required. Roll back at any time with aina deploy stop / start.

~/projects/edge-fleet · aina@0.1.0
# illustrative
$ aina model push ./yolov8n.onnx --name yolov8n --version v3.2
→ uploading 23.4 MB to object storage
✓ registered yolov8n v3.2

Every device, every prediction, every model load — exported.

The edge runtime exports a Prometheus endpoint on every node. Prometheus scrapes it; the dashboard queries it. Plug it into Grafana or scrape it raw. Nothing proprietary, nothing locked in.

edge_predictions_total{device="edge-01"}
edge_prediction_latency_seconds{device="edge-01"}
edge_prediction_errors_total{device="edge-01"}
edge_model_load_seconds{device="edge-01"}
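
For reference, this is roughly what a scrape of those series looks like in the Prometheus text exposition format. The sketch renders counter and gauge samples only (histograms additionally expose `_bucket`, `_sum`, and `_count` series); `render_metric`, the help text, and the sample values are made up for illustration.

```python
from typing import Dict


def render_metric(name: str, kind: str, help_text: str, samples: Dict[str, float]) -> str:
    """Render one metric family with a device label in Prometheus text format."""
    lines = [f"# HELP {name} {help_text}", f"# TYPE {name} {kind}"]
    for device, value in sorted(samples.items()):
        lines.append(f'{name}{{device="{device}"}} {value}')
    return "\n".join(lines) + "\n"


# Example values only, not real measurements.
text = render_metric(
    "edge_predictions_total",
    "counter",
    "Predictions served.",
    {"edge-01": 1234, "edge-02": 987},
)
print(text)
```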
fleet:prod / live (example data)
edge_predictions_total · counter
edge_prediction_latency_seconds · histogram
edge_model_load_seconds · gauge
[Charts: inference latency (example trace); model load duration (example rollout)]
We kept watching teams hand-roll the same fragile glue — a registry, a build farm, an OTA system, scripts that SSH into devices to swap models with a 3am restart. So we built the platform we wanted instead.
// the aina maintainers

Get early access to hosted Aina.

We're building a managed version — control plane, registry, and observability hosted for you. No cluster required. Drop your email and we'll reach out when it's ready.

No spam. Unsubscribe any time.

Self-host on GitHub →