Alliance Aims to Deliver Memory-Optimized AI for Inferencing
Apr. 08, 2025 – Swedish memory optimization IP company ZeroPoint Technologies today announced a strategic alliance with Rebellions to develop what the companies describe as the next generation of memory-optimized AI accelerators for AI inferencing. The two plan to unveil new products in 2026, claiming "unprecedented tokens-per-second-per-watt performance."
As part of the collaboration, the two companies aim to increase effective memory bandwidth and capacity for foundation model inference workloads, using ZeroPoint Technologies’ memory compression, compaction and memory management technologies. This hardware-based memory optimization can increase addressable memory capacity in data center environments while operating nearly 1,000× faster than software compression, according to ZeroPoint Technologies’ CEO Klas Moreau.
As a result, the companies hope to improve tokens-per-second-per-watt without sacrificing accuracy, using lossless model compression to reduce both model size and the energy required to move model components.
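To see why a lossless compression ratio translates into tokens-per-second-per-watt, the back-of-the-envelope sketch below ties the bytes of weights moved per generated token to a memory energy cost. All constants (model size, DRAM access energy, achievable compression ratio) are illustrative assumptions, not figures published by ZeroPoint Technologies or Rebellions.

```python
# Back-of-the-envelope: how lossless weight compression affects memory energy
# per generated token. All constants below are illustrative assumptions.
BYTES_PER_TOKEN = 7e9 * 2        # assume a 7B-parameter model with 16-bit weights,
                                 # every weight read once per decoded token
PJ_PER_BYTE_DRAM = 30.0          # assumed order-of-magnitude DRAM/HBM access energy
COMPRESSION_RATIO = 1.5          # assumed lossless compression ratio

baseline_j = BYTES_PER_TOKEN * PJ_PER_BYTE_DRAM * 1e-12
compressed_j = baseline_j / COMPRESSION_RATIO
print(f"memory energy per token: {baseline_j:.2f} J -> {compressed_j:.2f} J")
```

Under these assumptions, the memory energy spent per token falls in direct proportion to the lossless compression ratio, which is the lever the companies describe.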
“At Rebellions, we’re pushing the boundaries of state-of-the-art AI acceleration with an unwavering focus on efficiency,” said Rebellions CEO Sunghyun Park in the companies’ joint announcement. “Our partnership with ZeroPoint enables us to redefine what’s possible in inference performance per watt, delivering smarter, leaner and more sustainable AI infrastructure for the generative AI era.”
“We are convinced that memory acceleration will rapidly evolve from a competitive edge to an indispensable component of every advanced inference accelerator solution, and we’re proud that Rebellions shares our commitment to making AI datacenters far more efficient,” Moreau added in the statement.
Over 70% of data stored in memory is redundant
In a briefing earlier this year with EE Times, Moreau highlighted that over 70% of the data stored in memory is redundant. “This means you can get rid of it entirely and still provide lossless compression. However, for this to work seamlessly, the technology has to do three very specific things within that nanosecond-scale window (which corresponds to just a few system clock cycles).
“First, it needs to handle compression and decompression. Second, it must compact the resulting data [packing small chunks of compressed data together into an individual cacheline to dramatically improve apparent memory bandwidth], and finally it must seamlessly manage the data to keep track of where all the combined pieces are located. To minimize latency, this kind of hardware-accelerated memory optimization approach typically must function at cacheline granularity: compressing, compacting and managing data in 64-byte chunks [in contrast to the much larger 4 kB to 128 kB block sizes used by more traditional compression methods, such as ZSTD and LZ4].”
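To make those three steps concrete, here is a minimal software sketch of cacheline-granularity compression, compaction and management. It uses zlib purely as a stand-in codec and a Python dict as the translation table; the real technology is dedicated hardware operating within a few clock cycles, so everything below is an illustrative model under those assumptions, not ZeroPoint's implementation.

```python
import zlib

CACHELINE = 64  # bytes: the granularity Moreau describes for hardware compression

def compress_cacheline(data: bytes) -> bytes:
    """Losslessly compress one 64-byte cacheline (zlib stands in for the
    proprietary hardware codec)."""
    assert len(data) == CACHELINE
    return zlib.compress(data, 1)

class CompactedMemory:
    """Toy model of compaction + management: packs variable-length compressed
    cachelines back to back and keeps a translation table so each logical line
    can be located and decompressed on demand."""

    def __init__(self):
        self.backing = bytearray()   # packed compressed data ("compaction")
        self.table = {}              # logical line index -> (offset, length) ("management")

    def store(self, index: int, line: bytes) -> None:
        comp = compress_cacheline(line)
        if len(comp) >= CACHELINE:   # incompressible line: store it raw
            comp = line
        self.table[index] = (len(self.backing), len(comp))
        self.backing += comp

    def load(self, index: int) -> bytes:
        off, length = self.table[index]
        blob = bytes(self.backing[off:off + length])
        if length == CACHELINE:      # stored raw
            return blob
        return zlib.decompress(blob)

# Usage: highly redundant data packs into far fewer backing bytes than its
# logical footprint, while every cacheline remains individually recoverable.
mem = CompactedMemory()
for i in range(1024):
    mem.store(i, bytes([i % 4]) * CACHELINE)
assert mem.load(7) == bytes([3]) * CACHELINE
print(f"logical: {1024 * CACHELINE} B, backing: {len(mem.backing)} B")
```

Packing the variable-length compressed lines back to back is the compaction step, and the translation table is the management that keeps track of where each logical cacheline now lives, mirroring the three functions Moreau lists.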