AIoT Chip Slashes Power Consumption for Person Detection

eetimes.eu, Jun. 23, 2020

A proof-of-concept chip from French research institutes CEA-Leti and LIST, presented at VLSI Symposium 2020, combines a low-power IoT node with an AI accelerator and demonstrates an ultra-fast wake-up time along with a 15,000X ratio between peak and idle power consumption. For machine learning tasks, the node delivers an efficiency of up to 1.3 tera operations per second per watt (TOPS/W) and a throughput of up to 36 giga operations per second (GOPS).

The chip, named SamurAI, was tested in an occupancy detection system built with off-the-shelf components, including a PIR sensor, a 224×224 pixel black and white camera, FeRAM and a low-power radio. The daily average system power consumption was 105 µW, with SamurAI consuming 26% of that budget. In this test, the room was occupied 8 hours per day, during which the PIR sensor was used at a 5 s interval; the camera ran at 1 frame per second and the radio was used 10 times per day.
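As a rough check on these figures, the short Python sketch below works out SamurAI's share of the system budget and the energy the whole system would draw in a day. The variable names are illustrative, and the calculation assumes the 105 µW figure is an average over a full 24 hours, as "daily average" suggests.

```python
# Figures quoted in the article.
SYSTEM_AVG_POWER_W = 105e-6   # daily average system power: 105 µW
SAMURAI_SHARE = 0.26          # SamurAI consumes 26% of that budget

# Assumption: the average is taken over a full 24-hour day.
SECONDS_PER_DAY = 24 * 60 * 60

samurai_avg_power_w = SYSTEM_AVG_POWER_W * SAMURAI_SHARE
other_components_power_w = SYSTEM_AVG_POWER_W - samurai_avg_power_w
daily_energy_j = SYSTEM_AVG_POWER_W * SECONDS_PER_DAY

print(f"SamurAI average power:      {samurai_avg_power_w * 1e6:.1f} µW")
print(f"Other components (average): {other_components_power_w * 1e6:.1f} µW")
print(f"System energy per day:      {daily_energy_j:.2f} J")
```

Under these assumptions, SamurAI accounts for about 27 µW of the average, the sensor, camera, memory and radio together for about 78 µW, and the whole system draws roughly 9 J per day.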

SamurAI System

SamurAI uses two on-chip subsystems: a low-power, clockless, event-driven wake-up controller which can start up in 207 ns, and an on-demand subsystem comprising a RISC-V CPU with a deep sleep mode, the PNeuro AI accelerator and cryptography accelerators.

This dual subsystem scheme enables a 15,000X peak-to-idle power ratio. The figure below shows the power consumption during different modes; idle mode consumes just 6.4 µW. With the CPU and AI accelerator running, the power consumption is 96 mW.
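The 15,000X figure follows directly from the two power numbers quoted above. A minimal Python sketch of the arithmetic (the function name is illustrative):

```python
# Figures quoted in the article.
IDLE_POWER_W = 6.4e-6   # idle mode: 6.4 µW
PEAK_POWER_W = 96e-3    # CPU and AI accelerator running: 96 mW


def peak_to_idle_ratio(peak_w: float, idle_w: float) -> float:
    """Return how many times larger peak power is than idle power."""
    return peak_w / idle_w


ratio = peak_to_idle_ratio(PEAK_POWER_W, IDLE_POWER_W)
print(f"Peak-to-idle ratio: {ratio:,.0f}X")
```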

The chip is built on STMicro's 28 nm fully depleted silicon-on-insulator (FD-SOI) process, and power figures are given without body biasing. The silicon is 4.5 mm² and has 6 switchable power domains.

SamurAI power consumption measurements by power mode. The modes are, left to right: idle, wake-up controller (WuC) only, wake-up controller and wake-up radio (WuR), wake-up controller and peripherals, and CPU running. (Image: CEA-Leti)

AI accelerator

The chip's AI accelerator, a design the team calls PNeuro, is a single instruction, multiple data (SIMD) programmable accelerator. It comprises two clusters of 32 8-bit processing elements each, with 264 kB of multi-banked SRAM, and can perform up to 64 multiply-accumulates (MACs) per cycle. The PNeuro block achieves its best efficiency, 1.3 TOPS/W, when running at 2.8 GOPS with a 0.48 V supply. At 0.9 V, it can reach up to 36 GOPS for 8-bit fully-connected neural network layers.
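These numbers can be cross-checked with a little arithmetic. The Python sketch below assumes that each processing element performs one MAC per cycle and that the 1.3 TOPS/W figure refers to the PNeuro block alone at its 2.8 GOPS operating point; neither assumption is spelled out in the article, and the implied power is an estimate rather than a reported measurement.

```python
# Figures quoted in the article.
CLUSTERS = 2
PES_PER_CLUSTER = 32
EFFICIENCY_OPS_PER_W = 1.3e12         # 1.3 TOPS/W
THROUGHPUT_AT_BEST_EFF_OPS = 2.8e9    # 2.8 GOPS at 0.48 V

# Assumption: one MAC per processing element per cycle.
macs_per_cycle = CLUSTERS * PES_PER_CLUSTER

# Assumption: the efficiency figure covers the PNeuro block only,
# so power = throughput / efficiency.
implied_power_w = THROUGHPUT_AT_BEST_EFF_OPS / EFFICIENCY_OPS_PER_W

print(f"MACs per cycle: {macs_per_cycle}")
print(f"Implied PNeuro power at best efficiency: {implied_power_w * 1e3:.2f} mW")
```

The first result matches the 64 MACs per cycle quoted above, and the second suggests the accelerator draws on the order of 2 mW at its most efficient operating point.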
