Noise-Conditioned State Space Model

Voice AI on a $5 Chip

NC-SSM-20K achieves 96.4% keyword spotting accuracy in 10.4ms on a Cortex-M7 MCU, matching DS-CNN-S accuracy with about 5x lower latency (10.4ms vs 53.5ms) and about 10x fewer model MACs (2.44M vs 24.3M).

20K

Parameters

10.4ms

Latency

2.44M

MACs

96.4%

Accuracy

Chip Cost

⚡

Ultra-Low Latency

7.1ms inference for the 7,443-parameter NC-SSM on Cortex-M7 @ 480MHz. The stateful SSM hidden state enables progressive classification without rebuffering.

🔋

Tiny Footprint

7,443 parameters, 720 bytes hidden state. INT8 quantized model fits in 7.3KB. Runs on $1.50 FPGA or $3.50 MCU.
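The 7.3KB figure is what you would expect at one byte per INT8 weight. The check below is a back-of-envelope sketch only: it assumes "KB" means 1024 bytes and that no extra storage is counted for quantization scales or file headers, neither of which the page states.

```python
# Footprint check using the figures quoted above.
PARAMS = 7_443             # parameter count from the page
BYTES_PER_INT8 = 1         # one byte per INT8 weight
HIDDEN_STATE_BYTES = 720   # hidden state size from the page

weights_bytes = PARAMS * BYTES_PER_INT8
weights_kib = weights_bytes / 1024          # assumes KB means 1024 bytes
total_bytes = weights_bytes + HIDDEN_STATE_BYTES

print(f"weights: {weights_bytes} B = {weights_kib:.1f} KB")
print(f"weights + hidden state: {total_bytes} B")
```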

🎤

Noise-Conditioned

Learned noise embedding (sigma) adapts the model to real-world SNR conditions. No manual noise calibration needed.

🚀

Streaming Native

SSM processes audio frame-by-frame with stateful hidden state. Short keywords detected in ~350ms vs ~1053ms for DS-CNN-S.

🔒

Patented (US/KR)

Noise-conditioned SSM architecture protected under US and KR patent applications. Royalty licensing available.

📦

SDK Ready

pip install nano-ssm. PyTorch training + C SDK for Cortex-M deployment. ONNX/TFLite export in one line.

Live Keyword Detection

Real-time voice command recognition running NC-SSM inference


Keywords: Yes, Down, Left, Right, Off, Stop


Performance Benchmark

NC-SSM vs industry-standard CNN models on Cortex-M7 @ 480MHz

NC-SSM (ours)

SSM Family

Parameters: 7,443

MACs (Model): 0.86M

Latency: 7.1ms

Accuracy: 95.3%

BC-ResNet-1

CNN Family

Parameters: 7,464

MACs (Model): 4.99M

Latency: 12.8ms

Accuracy: 95.0%

DS-CNN-S

CNN Family

Parameters: 23,756

MACs (Model): 24.3M

Latency: 53.5ms

Accuracy: 96.4%

Model	Params (K)	Full MACs (M)	Model MACs (M)	Latency (ms)	Clean Acc (%)	Efficiency
BC-ResNet-1	7.5	6.15	4.99	12.8	95.0	15.4
DS-CNN-S	23.8	25.67	24.32	53.5	96.4	3.8
NC-SSM	7.4	3.42	0.86	7.1	95.3	27.8
NC-SSM-Large	10.2	3.80	1.24	7.9	95.6	25.2
NC-SSM-15K	15.8	4.48	1.92	9.3	96.2	21.5
NC-SSM-20K	20.0	5.00	2.44	10.4	96.4	19.3
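The page does not define the Efficiency column. The listed values appear to match clean accuracy divided by full MACs (percent per million MACs) to within 0.1, so the sketch below treats that as a working assumption, not the official definition. It also recomputes the latency and model-MAC ratios quoted elsewhere on the page from the table rows.

```python
# Rows copied from the table above:
# (full MACs M, model MACs M, latency ms, clean acc %, listed efficiency)
ROWS = {
    "BC-ResNet-1":  (6.15, 4.99, 12.8, 95.0, 15.4),
    "DS-CNN-S":     (25.67, 24.32, 53.5, 96.4, 3.8),
    "NC-SSM":       (3.42, 0.86, 7.1, 95.3, 27.8),
    "NC-SSM-Large": (3.80, 1.24, 7.9, 95.6, 25.2),
    "NC-SSM-15K":   (4.48, 1.92, 9.3, 96.2, 21.5),
    "NC-SSM-20K":   (5.00, 2.44, 10.4, 96.4, 19.3),
}

def acc_per_full_mmac(name):
    """Assumed definition of Efficiency: clean accuracy / full MACs."""
    full_macs, _, _, acc, _ = ROWS[name]
    return acc / full_macs

def speedup(baseline, model):
    """Latency of the baseline divided by latency of the model."""
    return ROWS[baseline][2] / ROWS[model][2]

def mac_reduction(baseline, model):
    """Model MACs of the baseline divided by model MACs of the model."""
    return ROWS[baseline][1] / ROWS[model][1]

for name, row in ROWS.items():
    print(f"{name:13s} listed={row[4]:5.1f} recomputed={acc_per_full_mmac(name):6.2f}")

print(f"DS-CNN-S vs NC-SSM-20K latency: {speedup('DS-CNN-S', 'NC-SSM-20K'):.1f}x")
print(f"DS-CNN-S vs NC-SSM-20K model MACs: {mac_reduction('DS-CNN-S', 'NC-SSM-20K'):.1f}x")
print(f"DS-CNN-S vs NC-SSM latency: {speedup('DS-CNN-S', 'NC-SSM'):.1f}x")
```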

Charts: Accuracy vs Parameters; Latency vs MACs

Streaming Advantage

Why SSM architecture fundamentally outperforms CNN in real-time voice AI

SSM

NC-SSM Streaming

Stateful Sequential Processing

Mic → VAD → Onset Detection → Progressive Classify
  +300ms → 1st attempt (short words)
  +500ms → 2nd attempt (medium)
  +750ms → 3rd attempt (confirm)
confidence > threshold → DETECTED!
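The pipeline above can be sketched as a loop that updates a small hidden state once per frame and only attempts a classification at the listed checkpoints. This is a toy illustration, not the NC-SSM architecture: the diagonal recurrence, the weights, the 10ms frame hop and the 0.8 threshold are all hypothetical, and VAD and onset detection are omitted (frames are assumed to start at the onset).

```python
import math

FRAME_MS = 10                      # assumed frame hop; not given on the page
CHECKPOINTS_MS = (300, 500, 750)   # attempts listed in the pipeline above
THRESHOLD = 0.8                    # hypothetical confidence threshold

def softmax(logits):
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

class TinySSM:
    """Toy diagonal linear SSM: h_t = a * h_{t-1} + b * x_t per channel."""

    def __init__(self, decay, in_gain, readout):
        self.decay, self.in_gain, self.readout = decay, in_gain, readout
        self.h = [0.0] * len(decay)

    def step(self, x):
        # Only the hidden state is kept; past frames are never re-read.
        self.h = [a * h + b * x
                  for a, b, h in zip(self.decay, self.in_gain, self.h)]

    def classify(self):
        logits = [sum(w * h for w, h in zip(row, self.h))
                  for row in self.readout]
        return softmax(logits)

def detect(frames, model, labels):
    """Feed post-onset frames and try to classify at each checkpoint."""
    for i, x in enumerate(frames, start=1):
        model.step(x)
        elapsed_ms = i * FRAME_MS
        if elapsed_ms in CHECKPOINTS_MS:
            probs = model.classify()
            best = max(range(len(probs)), key=probs.__getitem__)
            if probs[best] > THRESHOLD:
                return labels[best], elapsed_ms   # early exit: DETECTED
    return None, None

def make_model():
    # Hypothetical weights chosen only to make the example deterministic.
    return TinySSM(decay=[0.9, 0.5], in_gain=[1.0, -1.0],
                   readout=[[1.0, 0.0], [0.0, 1.0]])

result = detect([1.0] * 75, make_model(), ["yes", "no"])
print(result)  # ('yes', 300)
```

With a constant input the first checkpoint already clears the threshold, so the loop exits at 300ms; an all-zero input never does and falls through all three attempts.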

⚡

Stateful hidden state — h_t accumulates temporal context naturally.

🎯

Progressive classification: short words at 300ms, medium words at 500ms, confirmation at 750ms.

🔋

Zero feature map memory — Only 720 bytes hidden state.

💡

Energy-aware — 0 MACs during silence. 10x battery life.

CNN Streaming (DS-CNN-S)

Stateless Window Processing

Mic → Buffer 1.0s → Full Inference → Result

Every 1s: re-buffer → re-compute entire window
No state carried between windows
Short words may be split across windows

🔄

Stateless — Must recompute 24.3M MACs every window.

⏳

Fixed 1s buffer — Minimum latency = 1000ms.

💾

196 KB feature maps — Cannot fit on low-cost FPGA.

🔋

Always computing — Burns MACs on silence.

Detection Timeline: "Go" (200ms word)

NC-SSM

onset → classify → DETECTED

~350ms

DS-CNN-S

buffer 1000ms ... full inference 53ms ... DETECTED

~1053ms

NC-SSM detects 3x faster for short keywords
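The 3x figure follows directly from the two timelines. A quick check, using only the numbers shown above (the 53ms inference time is the timeline's rounding of the 53.5ms benchmark latency):

```python
BUFFER_MS = 1000      # fixed DS-CNN-S window from the timeline
CNN_INFER_MS = 53     # DS-CNN-S inference time as shown in the timeline
SSM_DETECT_MS = 350   # approximate NC-SSM detection time from the timeline

cnn_detect_ms = BUFFER_MS + CNN_INFER_MS
ratio = cnn_detect_ms / SSM_DETECT_MS

print(f"DS-CNN-S: {cnn_detect_ms} ms, NC-SSM: {SSM_DETECT_MS} ms, ratio: {ratio:.1f}x")
```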

~350ms

Short Word Detection

720 Bytes

Hidden State Memory

0 MACs

During Silence

10x

Battery Life

Business Model

Multiple revenue streams from a single core technology

🔒

IP Licensing

US + KR patents on noise-conditioned SSM architecture. Per-chip royalty model for semiconductor companies and OEMs.

$0.01-0.05 / chip

📦

Nano AI SDK

pip install nano-ssm. PyTorch training, C SDK for edge deployment, ONNX export. Community free, Pro $500/mo.

SaaS / B2B

🎤

Custom Wake Word

Train custom keywords for enterprise. "Hey Samsung", "OK LG" on NC-SSM with full edge deployment package.

$10K-50K per project

⚡

Edge AI Module

All-in-one KWS module: STM32H7 + MEMS mic + NC-SSM firmware. BOM $3-5, sell $15-30. 60-70% margin.

Hardware Product

Hardware Module

All-in-one edge AI voice module for mass production

NC-SSM

Edge Voice Module v1.0

STM32H743 + MEMS Mic

25mm x 25mm x 5mm

⚙

STM32H743 (Cortex-M7)

480MHz, 1MB RAM, 2MB Flash, FPU

🎤

ICS-43434 MEMS Microphone

I2S digital output, 65dB SNR

⚡

Ultra-Low Power

Active: 100mW | Listen: 15mW | Sleep: 0.1mW
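Average power depends on how much time the module spends in each mode. The sketch below combines the three figures above into a duty-cycle estimate. The 5% / 25% / 70% split and the 1000 mWh battery are hypothetical values for illustration, not measurements from the page.

```python
import math

ACTIVE_MW = 100.0   # from the spec line above
LISTEN_MW = 15.0
SLEEP_MW = 0.1

def average_power_mw(active_frac, listen_frac, sleep_frac):
    """Time-weighted average power; the fractions must sum to 1."""
    if not math.isclose(active_frac + listen_frac + sleep_frac, 1.0):
        raise ValueError("mode fractions must sum to 1")
    return (active_frac * ACTIVE_MW
            + listen_frac * LISTEN_MW
            + sleep_frac * SLEEP_MW)

def battery_hours(capacity_mwh, avg_mw):
    """Ideal runtime, ignoring regulator losses and self-discharge."""
    return capacity_mwh / avg_mw

# Hypothetical usage profile: 5% active, 25% listening, 70% asleep.
avg_mw = average_power_mw(0.05, 0.25, 0.70)
hours = battery_hours(1000.0, avg_mw)   # hypothetical 1000 mWh battery

print(f"average power: {avg_mw:.2f} mW, runtime: {hours:.0f} h")
```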

🔌

Interface

UART/SPI/I2C, GPIO wake, 3.3V supply

🚀

NC-SSM Inference

7.1ms, 7,443 params, INT8 (7.3KB)

Bill of Materials (Unit Cost @ 10K qty)

$3.50

STM32H743

$0.80

MEMS Mic

$0.50

Passives + PCB

$4.80

Total BOM
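The total is the sum of the three line items. The sketch below reproduces it and computes gross margin on BOM alone at the $15 and $30 sell prices quoted for the module. BOM-only margin excludes assembly, test and distribution costs, so it is an upper bound under that assumption and not directly comparable to a margin figure that includes them.

```python
# Line items from the bill of materials above (USD, unit cost @ 10K qty).
BOM = {
    "STM32H743": 3.50,
    "MEMS Mic": 0.80,
    "Passives + PCB": 0.50,
}

total_bom = round(sum(BOM.values()), 2)

def gross_margin(price, cost):
    """Fraction of the sell price left after the given cost."""
    return (price - cost) / price

margin_low = gross_margin(15.0, total_bom)    # at the $15 sell price
margin_high = gross_margin(30.0, total_bom)   # at the $30 sell price

print(f"total BOM: ${total_bom:.2f}")
print(f"BOM-only margin: {margin_low:.0%} at $15, {margin_high:.0%} at $30")
```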

Why NC-SSM Wins

7.5x

Faster than DS-CNN-S (7.1ms vs 53.5ms)

5.8x

Fewer Model MACs than BC-ResNet-1 (0.86M vs 4.99M)

Target Chip Cost

0.1W

Power Consumption
