A physical AI cartridge. Embed model weights directly into silicon at manufacture. Plug it in. Get intelligence. No drivers. No cloud. No software stack.
Normally, a neural network runs on a GPU or NPU by loading weights from memory. The memory bus is the bottleneck: data moves from DRAM to compute, and that physical movement costs energy and time.
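A back-of-envelope check on that claim, using commonly cited ballpark energy figures for a ~45 nm process (Horowitz, ISSCC 2014); the exact numbers vary by node, but the gap of two orders of magnitude is the point:

```python
# Rough energy cost of moving data vs. computing on it
# (ballpark ~45 nm figures; illustrative, not measured on this design).
DRAM_READ_PJ = 640.0  # ~energy to read one 32-bit word from off-chip DRAM
FMAC_PJ = 4.6         # ~energy for a 32-bit float multiply-add (3.7 + 0.9)

ratio = DRAM_READ_PJ / FMAC_PJ
print(f"one DRAM fetch costs about {ratio:.0f} multiply-adds")
```

If every weight must cross that bus on every token, the fetch energy dominates the arithmetic.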
A mask-locked chip flips this. The weights are etched into the silicon itself, at the last metal layer of fabrication (the mask step). The chip doesn't load weights; the weights are the chip. This eliminates the memory bottleneck entirely.
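To see why weights-as-wiring is feasible, consider ternary weights, the scheme this design etches into metal: each weight is -1, 0, or +1, so a "multiply" collapses to a subtract, a no-op, or an add, each of which a fixed metal connection can encode directly. A minimal software sketch of that dot product (illustrative only; the real computation happens in silicon):

```python
# Ternary matrix-vector product: no multiplies, only adds and subtracts.
# Each weight value maps to a physical choice: +1 = wire straight through,
# -1 = wire through an inverting path, 0 = no wire at all.
def ternary_matvec(weights, x):
    """weights: rows of values in {-1, 0, +1}; x: input activations."""
    out = []
    for row in weights:
        acc = 0
        for w, xi in zip(row, x):
            if w == 1:
                acc += xi   # direct connection
            elif w == -1:
                acc -= xi   # inverted connection
            # w == 0: connection absent in the mask
        out.append(acc)
    return out

W = [[1, 0, -1],
     [-1, 1, 1]]
print(ternary_matvec(W, [3, 5, 2]))  # [1, 4]
```

Because the pattern of connections never changes, there is nothing to fetch: the "weight read" is the wire itself.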
The result: a physical AI cartridge, like a Nintendo game pak. Insert it, get inference. Swap cartridges for different models. No drivers. No software. The hardware is the model.
Plug it in. It works. No CUDA, no Docker, no Python environment. A Raspberry Pi boots it in under a second.
Weights are mask-embedded in metal. There is no weight memory to dump, no bus to sniff, nothing to patch. The model is the hardware.
A different cartridge for each task. A navigation cartridge. A vision cartridge. A language cartridge. Like loading a different game.
Multiple cartridges coordinate automatically. Each handles what it's best at. The fleet math runs on the host CPU.
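A sketch of what that host-side coordination could look like. The `Cartridge` class and `dispatch` routine here are illustrative stand-ins, not a real driver API; each `run` callable stands in for hardware inference on one cartridge:

```python
# Host-CPU "fleet math": route each task to the cartridge that handles it.
class Cartridge:
    def __init__(self, capability, run):
        self.capability = capability
        self.run = run  # stand-in for inference on the physical cartridge

fleet = [
    Cartridge("vision", lambda x: f"vision({x})"),
    Cartridge("language", lambda x: f"language({x})"),
    Cartridge("navigation", lambda x: f"nav({x})"),
]

def dispatch(task, payload):
    # The host CPU only selects and sequences; all inference stays in silicon.
    for cart in fleet:
        if cart.capability == task:
            return cart.run(payload)
    raise LookupError(f"no cartridge for task: {task}")

print(dispatch("vision", "frame_0"))  # vision(frame_0)
```

Because cartridges are fixed-function, routing reduces to a capability lookup; the host never touches weights.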
┌───────────────────────────────────────┐
│ Application (your code)               │
├───────────────────────────────────────┤
│ Cartridge Runtime ─── Host CPU        │
├───────────────────────────────────────┤
│ Mask-Locked Inference Silicon         │
│ (ternary weights in metal layers)     │
├───────────────────────────────────────┤
│ FPGA Prototype ─── KV260              │
└───────────────────────────────────────┘
Three forces converge in an 18-month window:
The edge AI chip market grows from $26B (2025) to $69B (2030). The sub-$100, sub-5W LLM segment is unclaimed.