Aina — Distributed Edge ML Deployment

// Platform

The full deployment lifecycle, end to end.

Most teams stitch together a registry, a build farm, an OTA system, and a metrics stack. Aina is one coherent platform for all of it.

Model hot-swap, no SSH

Deploy a new model and the device swaps what it serves on its own. The edge runtime reloads the new graph and resumes. No scp, no SSH, no manual restart.

RUNTIME
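
How a hot-swap like this can work inside a serving process, as a minimal sketch. It is an illustration, not Aina's actual runtime: `ModelSlot`, the `loader` callback, and the version strings are all hypothetical names. The idea shown is to load the new graph first and only then replace the reference that requests read, so serving never stops and a failed load leaves the old model in place.

```python
import threading
from typing import Callable, List, Optional, Tuple

# Hypothetical stand-in: a "loaded graph" is just a callable here.
Model = Callable[[List[float]], List[float]]


class ModelSlot:
    """Holds the model currently being served and swaps it without a restart."""

    def __init__(self, loader: Callable[[str], Model]):
        self._loader = loader  # hypothetical loader: artifact path -> callable model
        self._lock = threading.Lock()
        self._model: Optional[Model] = None
        self._version: Optional[str] = None

    def swap(self, version: str, path: str) -> None:
        # Load outside the lock: the old model keeps serving while this runs.
        # If loading raises, the current model is left untouched.
        new_model = self._loader(path)
        with self._lock:
            self._model, self._version = new_model, version

    def predict(self, inputs: List[float]) -> Tuple[str, List[float]]:
        # Take a consistent (model, version) snapshot, then run without the lock.
        with self._lock:
            model, version = self._model, self._version
        if model is None:
            raise RuntimeError("no model loaded yet")
        return version, model(inputs)
```

Requests that started before the swap finish on the old model; requests after it see the new one. Nothing needs to be restarted.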

Hardware-aware compilation

Cross-compile your ONNX model — classification, detection, super-resolution, even speech encoders — with Apache TVM for x86 or ARM64. MetaSchedule autotunes the kernels to the device (~7× faster inference, measured) before you ship.

BUILD

Durable job pipeline

Compilation and deployment run as NATS JetStream jobs with retries. If the backend restarts mid-flight, stuck jobs recover on their own. Submit and walk away.

QUEUE
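
The retry behaviour can be pictured with a small in-memory simulation. This is a sketch of at-least-once delivery in general, not NATS client code and not Aina's implementation: `Job`, `run_jobs`, and the `max_deliver` parameter are hypothetical names chosen for the example. A job that is not acknowledged goes back on the queue until it succeeds or runs out of attempts.

```python
from collections import deque
from dataclasses import dataclass, field
from typing import Callable, Iterable, List, Tuple


@dataclass
class Job:
    job_id: str
    payload: dict = field(default_factory=dict)
    deliveries: int = 0  # how many times this job has been handed to a worker


def run_jobs(
    jobs: Iterable[Job],
    handler: Callable[[Job], None],
    max_deliver: int = 3,
) -> Tuple[List[str], List[str]]:
    """Deliver each job until the handler succeeds or max_deliver is reached."""
    pending = deque(jobs)
    done: List[str] = []
    dead: List[str] = []
    while pending:
        job = pending.popleft()
        job.deliveries += 1
        try:
            handler(job)
            done.append(job.job_id)  # handler returned: treat as acknowledged
        except Exception:
            if job.deliveries < max_deliver:
                pending.append(job)  # not acknowledged: redeliver later
            else:
                dead.append(job.job_id)  # out of attempts: stop retrying
    return done, dead
```

In a real deployment the queue state lives in the broker rather than in process memory, which is what lets jobs survive a backend restart.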

Kubernetes-native

Compilation runs as Kubernetes Jobs; the edge runtime ships as a DaemonSet on your KubeEdge nodes. The whole stack installs from one Helm chart.

CONTROL_PLANE

Edge-first execution

Built for intermittent links and constrained devices. Runs on KubeEdge nodes; each device pulls its model on demand through a short-lived presigned URL.

DISTRIBUTION
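
For readers new to presigned URLs, here is a generic sketch of the idea using an HMAC signature and an expiry timestamp. It is an assumption-laden illustration: real object stores use their own signing schemes, and `SECRET`, `presign`, `verify`, and the `models.example.com` host are hypothetical. The point is that the device needs no long-lived credentials, only a link that stops working after a short window.

```python
import hashlib
import hmac
import time
from typing import Optional
from urllib.parse import parse_qs, urlencode, urlparse

SECRET = b"demo-secret"  # hypothetical signing key held by the server side


def _sign(path: str, expires: int) -> str:
    msg = f"{path}\n{expires}".encode()
    return hmac.new(SECRET, msg, hashlib.sha256).hexdigest()


def presign(path: str, ttl_seconds: int, now: Optional[float] = None) -> str:
    """Return a URL for `path` that is valid for `ttl_seconds`."""
    current = time.time() if now is None else now
    expires = int(current + ttl_seconds)
    query = urlencode({"expires": expires, "sig": _sign(path, expires)})
    return f"https://models.example.com{path}?{query}"


def verify(url: str, now: Optional[float] = None) -> bool:
    """Accept the URL only if it is unexpired and the signature matches."""
    parsed = urlparse(url)
    query = parse_qs(parsed.query)
    try:
        expires = int(query["expires"][0])
        sig = query["sig"][0]
    except (KeyError, ValueError):
        return False
    current = time.time() if now is None else now
    if current > expires:
        return False
    # Constant-time comparison avoids leaking signature bytes through timing.
    return hmac.compare_digest(sig, _sign(parsed.path, expires))
```

Changing the path or the expiry invalidates the signature, so a leaked link cannot be reused for a different model or after it has expired.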

Prometheus observability

Per-device inference latency, throughput, errors, model-load time, and CPU/memory/disk, exported in Prometheus format and rendered on the dashboard.

TELEMETRY

Every device, every prediction, every model load — exported.

The edge runtime exposes a Prometheus endpoint on every node. Prometheus scrapes it; the dashboard queries it. Plug it into Grafana or scrape it raw. Nothing proprietary, nothing locked in.

edge_predictions_total{device="edge-01"}

edge_prediction_latency_seconds{device="edge-01"}

edge_prediction_errors_total{device="edge-01"}

edge_model_load_seconds{device="edge-01"}
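
What a scrape of such an endpoint might return can be sketched with a few lines of standard-library Python that render the Prometheus text format. The metric names come from the list above; everything else is an assumption for illustration, including the `EdgeMetrics` class and the choice of counter and gauge types. The latency metric is left out because its type (histogram, summary, or gauge) is not stated here.

```python
from collections import defaultdict
from typing import Dict


class EdgeMetrics:
    """Hypothetical in-process metric store for one device."""

    def __init__(self, device: str):
        self.device = device
        self.counters: Dict[str, float] = defaultdict(float)
        self.gauges: Dict[str, float] = {}

    def record_prediction(self, ok: bool = True) -> None:
        self.counters["edge_predictions_total"] += 1
        if not ok:
            self.counters["edge_prediction_errors_total"] += 1

    def record_model_load(self, seconds: float) -> None:
        # Assumed to be a gauge holding the most recent load time.
        self.gauges["edge_model_load_seconds"] = seconds

    def render(self) -> str:
        """Render all series in the Prometheus text exposition format."""
        lines = []
        for kind, series in (("counter", self.counters), ("gauge", self.gauges)):
            for name in sorted(series):
                lines.append(f"# TYPE {name} {kind}")
                lines.append(f'{name}{{device="{self.device}"}} {series[name]}')
        return "\n".join(lines) + "\n"
```

Serving the string returned by `render()` over HTTP is all a scraper needs. Because the format is plain text, the same output works with Prometheus, Grafana, or `curl`.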

Get early access to hosted Aina.

We're building a managed version — control plane, registry, and observability hosted for you. No cluster required. Drop your email and we'll reach out when it's ready.

Self-host on GitHub →

Push an ONNX file.
Roll it across your edge fleet.

Roll a new model across the fleet.

See your fleet.

From `aina push` to live inference, in one flow.

Upload

Compile

Deploy

Hot-swap
