How Universal Chiplet Interconnect Express Changes SoC Design 

Manuel Mota

Jul 31, 2022 / 6 min read

A self-driving car. A helicopter drone on Mars. A thermostat you can adjust from across the globe. What’ll they think of next? Science fiction writer and futurist Arthur C. Clarke said that “any sufficiently advanced technology is indistinguishable from magic.” The thing is, it’s not magic. It’s engineering. And there are a lot of challenges to work through to make the proverbial “magic” happen. One of the biggest is the ever-closer power, performance, and area (PPA) ceiling in traditional monolithic SoC design. To achieve the next big phase of innovation and break through the PPA ceiling, you must design differently. And one trend that helps you do just that is multi-die chip design.

A multi-die design consists of individual dies, also called chiplets, that support discrete functions and are assembled together—either side-by-side on 2D or 2.5D packages or vertically stacked in 3D packages. The chiplets could be manufactured on different process nodes in a heterogeneous fashion. Until now, employing a multi-die architecture has been difficult. To do it, early adopters have adapted monolithic chip design methodologies to internally defined design and verification flows and developed their own interface technologies. But to make the marketplace for disaggregated dies truly vibrant—one with plug-and-play-like flexibility and interoperability—industry standards and an ecosystem are essential. Enter the Universal Chiplet Interconnect Express (UCIe) specification that enables customizable, package-level integration of chiplets.


Why Chiplets Are Taking Off

As designers seek to pack ever-more transistors into smaller spaces, SoC size is nearing the reticle ceiling for manufacturing. In short, traditional monolithic SoCs are becoming too big and costly to produce for advanced designs, and yield risk grows along with design size. Disaggregating SoC components, manufacturing them separately, and then bringing those distinct functions together in a single package results in less waste. The goal is to reduce cost and greatly improve reliability by assembling only known good dies in the package.

Aside from supporting different components on different process nodes that are optimal for the particular function, a multi-die architecture also allows integration of dies from digital, analog, or high-frequency processes. You can also incorporate highly dense three-dimensional memory arrays, such as High-Bandwidth Memory (HBM), within your design.

Let’s say you’re developing a device that doesn’t need the most advanced process for its I/O interfaces, such as Ethernet, but does need it for its co-processors. By manufacturing each die on the node that’s appropriate for its function, you optimize your PPA at a granular, form-follows-function level. And if you reuse the same I/O subsystem across devices, for example in products that differentiate features across tiers, you gain economies of scale by manufacturing all the I/O interfaces at once. Compare this with monolithic design, where the entire SoC sits on one die regardless of function: your I/O interfaces run on the same process as your most advanced capabilities, and if one component of the design fails, the whole chip fails.

The scale and modular flexibility will also help you meet narrowing time-to-market windows. Dies with standard functions can be mixed and matched—a kind of hard IP—allowing your engineering talent to focus on the differentiating factors of your design, speeding delivery to market.

While all of this may sound great, disaggregated dies introduce a higher level of complexity in terms of bandwidth, interoperability, and data integrity. Because of this, multi-die designs have been the purview of larger players who have the resources to support custom interconnect development between the dies. But as this newer design methodology has gained traction, the bespoke nature of die-to-die interconnects has been at odds with interoperability. Despite these challenges, the chiplet market is expected to grow to $50B by 2024. And UCIe is a key enabler for this growth.

Why UCIe Is the Standard of Choice for Multi-Die Design

You might wonder if other standards are already out there. In fact, several have emerged to address the challenges of multi-die design. UCIe supports 2D, 2.5D, and bridge packages, with 3D packaging expected in the future. And it is the only standard that defines a complete stack for the die-to-die interface. Other standards focus only on specific layers and, unlike UCIe, do not offer a comprehensive specification covering the full protocol stack of the die-to-die interface.

As a leader in EDA and IP solutions, Synopsys looks forward to our future contributions to the UCIe specification. Along with the promoting members AMD, Arm, ASE Group, Google Cloud, Intel, Meta, Microsoft, Qualcomm, Samsung, and TSMC, we are actively helping to promote a healthy ecosystem for UCIe. The deep experience and extensive work of its backers is foundational to the integrity of the specification and its widespread adoption. Because of this, you can be assured it’s a solid choice for your next design.

It’s All in the Stack: Future-Proof Your Design

Not only does UCIe accommodate the bulk of today’s designs, at 8 Gbps to 16 Gbps per pin, but it also accommodates designs at 32 Gbps per pin for high-bandwidth applications, from networking to hyperscale data centers. In other words, the standard supports the bandwidth you need now and in the future. UCIe comprises two package variants:

  • UCIe for advanced packages, such as silicon interposer, silicon bridge, or redistribution layer (RDL) fanout
  • UCIe for standard packages, such as organic substrate or laminate
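To put those per-pin rates in perspective, here is a rough, back-of-the-envelope aggregate bandwidth estimate in Python. It assumes the per-module lane counts commonly associated with the specification (64 data lanes per module for advanced packages, 16 for standard packages); your actual throughput depends on the configuration and the number of modules you instantiate.

```python
# Back-of-the-envelope aggregate bandwidth per UCIe module, one direction.
# Assumed lane counts: 64 data lanes per module for the advanced package,
# 16 for the standard package -- check the specification for the exact
# configuration you plan to use.
for package, lanes in (("advanced", 64), ("standard", 16)):
    for rate_gbps in (8, 16, 32):          # per-pin data rates from the spec
        total_gbps = lanes * rate_gbps      # raw aggregate, before overhead
        print(f"{package:8s} package @ {rate_gbps:2d} Gbps/pin: "
              f"{total_gbps:4d} Gb/s (~{total_gbps // 8:3d} GB/s)")
```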

The UCIe stack itself has three layers. The top Protocol Layer ensures maximum efficiency and reduced latency through a flow-control-unit-based (FLIT-based) protocol implementation, supporting the most popular protocols, including PCI Express® (PCIe®) and Compute Express Link (CXL), as well as user-defined streaming protocols. The second layer, the Die-to-Die Adapter, is where protocols are arbitrated and negotiated and where link management occurs. This layer also provides optional error detection and recovery based on a cyclic redundancy check (CRC) and a retry mechanism. The third layer, the PHY, specifies the electrical interface with the package media. This is where the electrical analog front end (AFE), transmitter and receiver, and sideband channel enable parameter exchange and negotiation between two dies. The logical PHY implements link initialization, training and calibration algorithms, and test-and-repair functionality.
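To make the division of labor concrete, here is a heavily simplified Python sketch of that flow: a protocol layer hands a FLIT to a die-to-die adapter, which appends a CRC and retries on failure before handing bytes to the PHY. The class names, framing, and CRC width are illustrative assumptions, not details taken from the UCIe specification.

```python
# Illustrative-only sketch of the layered flow: protocol layer -> adapter
# (CRC + retry) -> PHY. Framing, CRC width, and APIs are simplified and are
# not taken from the UCIe specification.
import zlib

class LossyPhy:
    """Stand-in for the electrical layer: delivers bytes, may corrupt them."""
    def __init__(self, corrupt_first: bool = False):
        self.corrupt_first = corrupt_first

    def transfer(self, data: bytes) -> bytes:
        if self.corrupt_first:
            self.corrupt_first = False
            return bytes([data[0] ^ 0xFF]) + data[1:]  # flip bits in one byte
        return data

class DieToDieAdapter:
    """Appends a CRC on transmit; checks it and replays the FLIT on failure."""
    def __init__(self, phy: LossyPhy, max_retries: int = 3):
        self.phy = phy
        self.max_retries = max_retries

    def send(self, flit: bytes) -> bytes:
        crc = zlib.crc32(flit).to_bytes(4, "big")
        for _ in range(self.max_retries + 1):
            received = self.phy.transfer(flit + crc)
            payload, rx_crc = received[:-4], received[-4:]
            if zlib.crc32(payload).to_bytes(4, "big") == rx_crc:
                return payload            # delivered intact
            # CRC mismatch: fall through and replay the FLIT
        raise RuntimeError("link retry budget exhausted")

# A PCIe/CXL/streaming protocol layer would sit above, producing FLITs.
adapter = DieToDieAdapter(LossyPhy(corrupt_first=True))
print(adapter.send(b"example FLIT payload"))  # survives one corrupted transfer
```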

Figure: The UCIe protocol stack

Whether your primary goal is high energy efficiency, efficient use of die edge (bandwidth density), low latency, or all of the above, the UCIe specification sets very competitive performance targets.

Synopsys Multi-Die Solutions Ease the UCIe Design Journey

To help you in your journey of adoption, Synopsys has a complete UCIe Solution, so you can put the specification into practice with PHY, controller, and verification IP (VIP):

  • PHY—Supports both standard and advanced packaging options and is available in advanced FinFET processes for high-bandwidth, low-power, and low-latency die-to-die connectivity.
  • Controller IP—Supports PCIe, CXL, and other widely used protocols for latency-optimized network-on-chip (NoC)-to-NoC links with streaming protocols, for example bridging to CXS and AXI interfaces.
  • VIP—Supports various designs under test (DUTs) at each layer of the full stack. Includes testbench interfaces with or without the PCIe/CXL protocol stack, an application programming interface (API) for sideband service requests, and an API for traffic generation. Provides protocol checks and functional coverage at each stack layer and signaling interface, and enables a scalable architecture with a Synopsys-defined interoperability test suite.

Our solution enables robust and reliable die-to-die links, with testability features for known good dies and CRC or parity checks for error detection. It lets you build seamless interconnects between dies for the lowest latency and highest energy efficiency.

Verifying Your Design: How Simulation, Emulation, and Prototyping Play a Role in Overcoming System-Level Challenges

With multi-die designs, the increase in payloads from multiple streaming protocols can stretch simulations to days or even months, limiting their usefulness. To verify your multi-die SoCs, first create various single-node and multi-node models and simulate these minimalistic systems to check data integrity. Once those scenarios are tested, you can move to higher-level system scenarios with multi-protocol layers using the Synopsys ZeBu® emulation system, and then to prototyping with the Synopsys HAPS® prototyping system. This flow from models to simulation to emulation to prototyping, using our verification IP, helps you ensure seamless interoperability pre-silicon.
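As a toy illustration of that first step, the sketch below models single-node and multi-node topologies as simple Python functions, pushes random payloads through them, and asserts that the data arrives intact. It is purely illustrative: a real pre-silicon flow exercises the actual RTL with verification IP, which this does not represent.

```python
# Toy "models first" check: push random payloads through single-node and
# multi-node link models and verify data integrity. Purely illustrative --
# it stands in for behavioral models, not for RTL driven by verification IP.
import os
import random

def link_hop(payload: bytes) -> bytes:
    """Behavioral stand-in for one die-to-die hop (ideal, lossless)."""
    return payload

def run_topology(hops: int, num_payloads: int = 50) -> None:
    for i in range(num_payloads):
        data = os.urandom(random.randint(1, 64))
        received = data
        for _ in range(hops):               # traverse each modeled hop
            received = link_hop(received)
        assert received == data, f"data integrity error on payload {i}"
    print(f"{hops}-hop topology: {num_payloads} payloads arrived intact")

run_topology(hops=1)   # single-node model
run_topology(hops=3)   # multi-node model
```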

Beyond Moore’s Law

Multi-die design is a great option to catapult us beyond the limitations of Moore’s Law. With it, we can realize new levels of efficiency and performance while reducing power and area footprints. UCIe is helping to fast-track this new way of designing for advanced applications. To learn more about how UCIe facilitates multi-die designs, check out our article, Multi-Die SoCs Gaining Strength with Introduction of UCIe.
