RaiderChip brings Meta Llama 3.2 LLM HW acceleration to low-cost FPGAs
Oct. 01, 2024 –
The company incorporates the latest model, presented by Meta less than a week ago, into the catalog of LLMs already accelerated on a wide range of FPGAs
Spain -- October 1st, 2024 -- Just six days after its launch by Meta, RaiderChip has added support for the new Llama 3.2 model to the list of LLMs hardware-accelerated by its GenAI v1 IP core for FPGAs. The supported list already included earlier models from the same company (Llama 2, Llama 3, and Llama 3.1) as well as models from other providers (Microsoft’s Phi-2 and Phi-3), and it continues to grow, in line with RaiderChip’s strategy of adding the most relevant models as they reach the market.
RaiderChip’s Generative AI hardware-acceleration IP core is designed to accelerate any model built on the Transformer architecture, which underpins the vast majority of LLMs. Which additional models gain FPGA support is driven by customers’ choices of target device and foundational LLM. Supporting a foundational model enables any customer-specific fine-tuned derivative to be accelerated seamlessly, without the need to share its weights, as the sketch below illustrates.
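To make the foundational-model point concrete, here is a minimal sketch (ours, not RaiderChip’s code) of why fine-tuning leaves the compute graph untouched: fine-tuning changes weight values, not the architecture, so an accelerator that implements the base graph can load any derivative’s weights at runtime. The architectural parameters below follow Llama 3.2 1B’s published configuration.

```python
# Illustrative sketch, not RaiderChip code: a fine-tuned derivative of a
# foundational LLM keeps the exact same compute graph; only the weight
# values change. An accelerator implementing the base architecture can
# therefore run any derivative simply by loading different weights.
from dataclasses import dataclass

@dataclass(frozen=True)
class TransformerConfig:
    # Architectural parameters fixed by the foundational model
    # (values follow Llama 3.2 1B's published configuration).
    n_layers: int = 16
    n_heads: int = 32
    d_model: int = 2048
    vocab_size: int = 128256

def runs_on_accelerator(supported: TransformerConfig, model: TransformerConfig) -> bool:
    """A model runs on the accelerator iff its architecture matches."""
    return supported == model

base = TransformerConfig()        # foundational Llama 3.2 1B
derivative = TransformerConfig()  # customer fine-tune: same graph, new weights
print(runs_on_accelerator(base, derivative))  # True -- weights stay private
```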
It is important to note that the various FPGA options available on the market differ in size and in logic and memory capacity. These technical factors, combined with commercial decisions such as unit cost, power consumption, or final functionality, mean that the ideal FPGA and LLM pairing varies for each final product. “For instance, ‘smaller’ models (such as Meta’s Llama 3.2 1B and Microsoft’s Phi-2 2.7B) or 4-bit quantization are ideal for products based on simpler, more economical FPGAs, while larger LLMs in their original format, preserving the model’s full floating-point precision, require larger and more expensive FPGAs,” explains Victor Lopez, the company’s CTO.
GenAI v1-Q running the Llama 3.2 1B LLM with 4-bit quantization on a low-cost Versal FPGA with LPDDR4 memory
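As a rough illustration of why quantization changes the class of FPGA a model fits into, the back-of-the-envelope sketch below (our arithmetic, not RaiderChip figures) estimates weight-memory footprints from the models’ published parameter counts.

```python
# Back-of-the-envelope sketch (our numbers, not RaiderChip's): weight-memory
# footprint of an LLM at different precisions, which drives FPGA selection.

def weight_footprint_gb(params_billion: float, bits_per_weight: int) -> float:
    """Approximate weight storage in GB (ignores activations and KV cache)."""
    return params_billion * 1e9 * bits_per_weight / 8 / 1e9

for name, params in [("Llama 3.2 1B", 1.24), ("Phi-2 2.7B", 2.7)]:
    fp16 = weight_footprint_gb(params, 16)  # original floating-point weights
    q4 = weight_footprint_gb(params, 4)     # 4-bit quantized weights
    print(f"{name}: {fp16:.2f} GB at FP16 -> {q4:.2f} GB at 4-bit")
```

By this estimate, Llama 3.2 1B needs roughly 2.5 GB for weights alone at FP16, dropping to about 0.6 GB at 4 bits, which is what brings it within reach of the LPDDR4 attached to a low-cost FPGA.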
Like previous models, the new Llama 3.2 can be tested through an interactive demo based on a Versal FPGA. “The demonstrator not only exposes simple local and remote API access to the accelerated LLM, but can also be driven directly from a chat terminal application that lets users interact with the different models while they are hardware-accelerated by RaiderChip’s IP core on the FPGA. At RaiderChip, we invite our customers to experience the real performance of our product first-hand, assessing key aspects like intelligence, latency, and tokens per second in a live demonstrator, beyond the usual performance tables based on theoretical or simulated data, which are often confusing,” the team comments.
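For readers who want to quantify such a live run themselves, here is a hypothetical client sketch for timing generation throughput. The endpoint URL, route, and JSON response shape are our illustrative assumptions, not RaiderChip’s documented demo API.

```python
# Hypothetical client sketch: one way to measure end-to-end latency and
# tokens/second against an LLM demo endpoint. The URL and JSON fields below
# are illustrative assumptions, not RaiderChip's documented API.
import time
import requests

DEMO_URL = "http://demo.example/api/generate"  # placeholder endpoint

def measure(prompt: str, max_tokens: int = 128) -> None:
    start = time.monotonic()
    resp = requests.post(
        DEMO_URL,
        json={"prompt": prompt, "max_tokens": max_tokens},
        timeout=120,
    )
    elapsed = time.monotonic() - start
    n_tokens = len(resp.json().get("tokens", []))  # assumed response field
    print(f"{n_tokens} tokens in {elapsed:.2f}s -> {n_tokens / elapsed:.1f} tok/s")

measure("Explain what an FPGA is in one sentence.")
```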
Companies interested in trying the GenAI v1-Q IP core can reach out to RaiderChip for access to the FPGA demo or for a consultation on how its IP cores can accelerate their AI workloads.
More information at https://raiderchip.ai/technology/hardware-ai-accelerators