A physical AI cartridge. Embed model weights directly into silicon at manufacture. Plug it in. Get intelligence. No drivers. No cloud. No software stack.
Normally, a neural network runs on a GPU or NPU by loading weights from memory. The memory bus is the bottleneck: data moves from DRAM to compute, and that physical movement costs energy and time.
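A back-of-envelope check on that claim, using commonly cited ballpark energy figures for a ~45 nm process (Horowitz, ISSCC 2014); the exact numbers vary by node, but the gap of two orders of magnitude is the point:

```python
# Rough energy cost of moving data vs. computing on it
# (ballpark ~45 nm figures; illustrative, not measured on this design).
DRAM_READ_PJ = 640.0  # ~energy to read one 32-bit word from off-chip DRAM
FMAC_PJ = 4.6         # ~energy for a 32-bit float multiply-add (3.7 + 0.9)

ratio = DRAM_READ_PJ / FMAC_PJ
print(f"one DRAM fetch costs about {ratio:.0f} multiply-adds")
```

If every weight must cross that bus on every token, the fetch energy dominates the arithmetic.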
A mask-locked chip flips this. The weights are etched into the silicon itself, at the last metal layer of fabrication (the mask step). The chip doesn't load weights; the weights are the chip. This eliminates the memory bottleneck entirely.
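To see why weights-as-wiring is feasible, consider ternary weights, the scheme this design etches into metal: each weight is -1, 0, or +1, so a "multiply" collapses to a subtract, a no-op, or an add, each of which a fixed metal connection can encode directly. A minimal software sketch of that dot product (illustrative only; the real computation happens in silicon):

```python
# Ternary matrix-vector product: no multiplies, only adds and subtracts.
# Each weight value maps to a physical choice: +1 = wire straight through,
# -1 = wire through an inverting path, 0 = no wire at all.
def ternary_matvec(weights, x):
    """weights: rows of values in {-1, 0, +1}; x: input activations."""
    out = []
    for row in weights:
        acc = 0
        for w, xi in zip(row, x):
            if w == 1:
                acc += xi   # direct connection
            elif w == -1:
                acc -= xi   # inverted connection
            # w == 0: connection absent in the mask
        out.append(acc)
    return out

W = [[1, 0, -1],
     [-1, 1, 1]]
print(ternary_matvec(W, [3, 5, 2]))  # [1, 4]
```

Because the pattern of connections never changes, there is nothing to fetch: the "weight read" is the wire itself.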
The result: a physical AI cartridge, like a Nintendo game pak. Insert it, get inference. Swap cartridges for different models. No drivers. No software. The hardware is the model.
Plug it in. It works. No CUDA, no Docker, no Python environment. A Raspberry Pi boots it in under a second.
Weights are mask-embedded in metal. There is no weight memory to dump, no bus to sniff, nothing to patch. The model is the hardware.
A different cartridge for each task. A navigation cartridge. A vision cartridge. A language cartridge. Like loading a different game.
Multiple cartridges coordinate automatically. Each handles what it's best at. The fleet math runs on the host CPU.
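A sketch of what that host-side coordination could look like. The `Cartridge` class and `dispatch` routine here are illustrative stand-ins, not a real driver API; each `run` callable stands in for hardware inference on one cartridge:

```python
# Host-CPU "fleet math": route each task to the cartridge that handles it.
class Cartridge:
    def __init__(self, capability, run):
        self.capability = capability
        self.run = run  # stand-in for inference on the physical cartridge

fleet = [
    Cartridge("vision", lambda x: f"vision({x})"),
    Cartridge("language", lambda x: f"language({x})"),
    Cartridge("navigation", lambda x: f"nav({x})"),
]

def dispatch(task, payload):
    # The host CPU only selects and sequences; all inference stays in silicon.
    for cart in fleet:
        if cart.capability == task:
            return cart.run(payload)
    raise LookupError(f"no cartridge for task: {task}")

print(dispatch("vision", "frame_0"))  # vision(frame_0)
```

Because cartridges are fixed-function, routing reduces to a capability lookup; the host never touches weights.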
┌───────────────────────────────────────┐
│ Application (your code)               │
├───────────────────────────────────────┤
│ Cartridge Runtime ─── Host CPU        │
├───────────────────────────────────────┤
│ Mask-Locked Inference Silicon         │
│ (ternary weights in metal layers)     │
├───────────────────────────────────────┤
│ FPGA Prototype ─── KV260              │
└───────────────────────────────────────┘
Three forces converge in an 18-month window:
The edge AI chip market grows from $26B (2025) to $69B (2030). The sub-$100, sub-5W LLM segment is unclaimed.