
Key Metrics for Evaluating an Inferencing Engine

Advice on how to compare inferencing alternatives and the characteristics of an optimal inferencing engine.

by Geoff Tate, CEO of Flex Logix, Jan. 30, 2019

In the last six months, we've seen an influx of specialized processors to handle neural inferencing in AI applications at the edge and in the data center. Customers have been racing to evaluate these neural inferencing options, only to find the comparison extremely confusing, with no agreed way to measure them. Some vendors talk about TOPS and TOPS/Watt without specifying models, batch sizes or process/voltage/temperature conditions. Others use the ResNet-50 benchmark, which is a much simpler model than most people need, so its value in evaluating inference options is questionable.

As a result, as we head into 2019, most companies don't know how to compare inferencing alternatives. Many don't even know what the characteristics of an optimal inferencing engine are. This article addresses both of those points.

1. MACs: how many do you need?

Neural network models involve primarily matrix multiplication with billions of multiply-accumulates. Thus, you need MACs (multiply-accumulators) and lots of them.
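To make the scale concrete, here is a minimal sketch that counts MACs per layer. The helper names `conv_macs` and `dense_macs` are hypothetical, not from any library, and the layer shapes are the commonly published ones for ResNet-50's first convolution and final fully connected layer at a 224x224 input. It assumes a standard convolution with no grouping and ignores bias additions.

```python
def conv_macs(out_h, out_w, in_ch, out_ch, k_h, k_w):
    """MACs for one standard convolution layer (no groups, bias ignored).

    Each output element needs k_h * k_w * in_ch multiply-accumulates,
    and there are out_h * out_w * out_ch output elements.
    """
    return out_h * out_w * out_ch * k_h * k_w * in_ch


def dense_macs(in_features, out_features):
    """MACs for one fully connected layer (bias ignored)."""
    return in_features * out_features


# Assumed shapes: ResNet-50 first conv (7x7, 3 -> 64 channels, 112x112 output)
# and final classifier (2048 -> 1000).
first_conv = conv_macs(112, 112, 3, 64, 7, 7)
final_fc = dense_macs(2048, 1000)

print(f"first conv: {first_conv:,} MACs")
print(f"final fc:   {final_fc:,} MACs")
```

Under these assumptions a single early layer already costs over a hundred million MACs per image, so summing every layer of a deeper model at a larger input size quickly reaches the billions.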

Unless you want to explore analog multiplication, you need an inferencing engine whose integer multiplication matches your requirements for precision and throughput. For the bulk of customers today, that means 8x8-bit integer multiplication, with 16-bit activations (a 16x8 multiplication) in the layers where precision is critical. Accumulation needs 32-bit adders because of the size of the matrix multiplies.
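The accumulator width follows from simple arithmetic, sketched below. The function names `accumulator_bits` and `fits` are hypothetical, and the dot-product length of 4608 (a 3x3 kernel over 512 input channels) is an illustrative assumption, not a figure from this article. The bound is a worst case for signed two's-complement operands; real activations and weights rarely reach it.

```python
def accumulator_bits(act_bits, weight_bits, length):
    """Worst-case signed accumulator width for a dot product.

    Assumes signed two's-complement operands. The largest product
    magnitude occurs when both operands are at their most negative value.
    """
    worst_product = (1 << (act_bits - 1)) * (1 << (weight_bits - 1))
    worst_sum = worst_product * length
    # bit_length() gives the magnitude bits; add one for the sign bit.
    return worst_sum.bit_length() + 1


def fits(act_bits, weight_bits, length, acc_bits):
    """True if the worst-case sum fits in an acc_bits-wide signed adder."""
    return accumulator_bits(act_bits, weight_bits, length) <= acc_bits


K = 3 * 3 * 512  # assumed dot-product length for one output element

print(accumulator_bits(8, 8, 1))   # a single 8x8 product
print(accumulator_bits(8, 8, K))   # K accumulated 8x8 products
print(fits(8, 8, K, 16), fits(8, 8, K, 32))
```

A single 8x8 product already fills 16 bits, so any meaningful accumulation overflows a 16-bit adder, while a 32-bit adder leaves headroom for the 8x8 case at this length.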

