Optics packaged together can accelerate generative AI computing

Scientists at IBM Research have announced a series of new advances in chip assembly and packaging, called “co-packaged optics,” that promise to improve power efficiency and increase bandwidth by bringing optical interconnects into devices and within the walls of the data centers used to train and deploy large language models. The new process promises to increase the number of optical fibers that can be connected at the edge of a chip, a measure known as beachfront density, by six times. As artificial intelligence demands ever-increasing bandwidth, this innovation will leverage the world’s first successful polymer fiber optic cable to bring the speed and bandwidth of optics to the edge of chips.

Early results suggest that switching from traditional electrical connections to co-packaged optics will reduce the energy costs of training AI models, speed up model training, and dramatically increase the energy efficiency of data centers.

Today’s advanced chip and chip packaging technologies typically rely on electrical signals for the transistors in the microelectronics that power phones, computers and almost everything we do. Transistors have become many times smaller over the decades, allowing us to pack more capability into a given space. But even the most powerful semiconductor devices are only as fast as the connections between them.

IBM’s polymer optical waveguide prototypes bring the speed and bandwidth of fiber optic connections to the edge of chips, replacing sluggish electrical connectors.

These connections allow us to seamlessly use electronic devices in our daily lives – like driving our cars, which contain chips in almost every system, from the seats to the tires. “Even your refrigerator contains electronics that keep everything working properly,” says IBM research engineer John Knickerbocker, a respected chiplet and advanced packaging engineer.

However, Knickerbocker and his team are thinking smaller. Due to their lower cost and higher energy efficiency, optical connectors are excellent candidates for improving the performance of chip-to-chip and device-to-device communications in data centers, where generative AI computing requires increasingly higher bandwidths.

“Large language models have made AI very popular across the tech industry today,” says Knickerbocker. “And the resulting growth of LLMs – and generative AI more broadly – will require exponential growth in high-speed interconnects between chips and data centers.”

Hsianghan Hsu (left) and John Knickerbocker (right) examine a polymer fiber optic module under a microscope at IBM Research global headquarters in Yorktown Heights, New York.

And while optical cables can carry data in and out of data centers, what happens inside is very different. Even today’s most advanced chips still communicate using copper-based wires that carry electrical signals. It takes quite a bit of energy to make the connection from the edge of a chip to a circuit board, then from the circuit board over miles of optical cables, and then back to another module and to another chip in a remote data center. Whether you’re transmitting data or a voice call, moving a signal seamlessly across all of these nodes takes energy. Low-bandwidth wired connections within servers also leave GPU accelerators sitting idle while they wait for data.

Electrical signals use electrons to provide power and signal communication from one device to another. Optics, on the other hand, which has been used in communications technology for decades, uses light to transmit data. Fiber optic cables, hair-thin and sometimes thousands of kilometers long, can transmit hundreds of terabits of data per second. Bundled and isolated in cables that run under the sea, fiber optics carry nearly all global trade and communications traffic that flows between continents.

Knickerbocker and his colleagues have found that bringing optical interconnects onto circuit boards and all the way down to chips cuts energy consumption by more than 80% compared to electrical interconnects – a reduction from 5 picojoules per bit to less than 1. Across thousands of chips and millions of operations, this represents huge savings.
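The energy figures above can be checked with a few lines of arithmetic. The following is a minimal sketch, not IBM's methodology: the two per-bit energies come from the article (with 1 pJ/bit used as the upper bound for "less than 1"), while the function names and the one-petabyte traffic volume are hypothetical and chosen only for illustration.

```python
# Rough arithmetic behind the "more than 80%" energy claim.
# Per-bit energies are taken from the article; everything else is illustrative.

ELECTRICAL_PJ_PER_BIT = 5.0  # electrical interconnect, per the article
OPTICAL_PJ_PER_BIT = 1.0     # upper bound for "less than 1" pJ/bit


def percent_reduction(before: float, after: float) -> float:
    """Percentage drop when going from `before` to `after`."""
    return 100.0 * (before - after) / before


def energy_joules(bits: float, picojoules_per_bit: float) -> float:
    """Energy needed to move `bits` at a given cost per bit."""
    return bits * picojoules_per_bit * 1e-12


# Hypothetical workload: one petabyte (8e15 bits) moved between chips.
bits_moved = 8e15

reduction = percent_reduction(ELECTRICAL_PJ_PER_BIT, OPTICAL_PJ_PER_BIT)
electrical_j = energy_joules(bits_moved, ELECTRICAL_PJ_PER_BIT)
optical_j = energy_joules(bits_moved, OPTICAL_PJ_PER_BIT)

print(f"Reduction: at least {reduction:.0f}%")
print(f"Electrical: {electrical_j:,.0f} J, optical: at most {optical_j:,.0f} J")
```

Because 1 pJ/bit is only an upper bound, the saving computed here is a floor; any real figure below 1 pJ/bit pushes the reduction above 80%.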

John Knickerbocker carefully handles polymer fiber optic modules in the laboratory. These connectors promise to reduce the time GPUs sit idle waiting for data during AI model training.

IBM Research’s Chiplet and Advanced Packaging team is attempting to streamline this system with co-packaged optics, an approach that promises to improve the efficiency and density of communications both within and between chips. Part of making optical connections on integrated circuit boards involves installing the transmitters and photodetectors that send and receive optical signals. Optical fibers are about 250 micrometers in diameter, about three times the width of a human hair. That may sound tiny, but four fibers add up to a millimeter, and when the millimeters add up, the edges of a chip quickly run out of space.

For the scientists at IBM Research, the solution lies in a next generation of optical links that enable much denser connections: the polymer optical fiber. It allows high-density fiber optic bundles to be placed directly at the edge of a silicon chip, so that the chip can communicate directly through the polymer fibers. High-fidelity optical connections require precise tolerances of half a micrometer or less between fiber and connector – a feat the team has now achieved.

Thanks to these approaches, the team has demonstrated the feasibility of a 50-micron optical channel pitch coupled to silicon photonics waveguides, along with a connector that mates with single-mode fiber (SMF) arrays, all using standard assembly packaging processes. This represents an 80% size reduction compared to the traditional 250-micron pitch. Tests have shown that the pitch can be reduced even further, to 20 or 25 micrometers, which would correspond to an increase in bandwidth of 1,000 to 1,200%.
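The pitch numbers translate into channel density along a chip edge in a straightforward way. The sketch below assumes a purely geometric model in which the channel count scales inversely with pitch; the function names are hypothetical, and the article's own percentages may be rounded or reflect factors this simple ratio ignores.

```python
# Geometric model: channels along a chip edge scale as 1 / pitch.
# Pitch values come from the article; the model itself is an assumption.

TRADITIONAL_PITCH_UM = 250.0


def channels_per_mm(pitch_um: float) -> float:
    """How many channels fit along one millimeter of chip edge."""
    return 1000.0 / pitch_um


def density_gain(old_pitch_um: float, new_pitch_um: float) -> float:
    """Multiplicative gain in channels per unit length."""
    return old_pitch_um / new_pitch_um


def size_reduction_percent(old_pitch_um: float, new_pitch_um: float) -> float:
    """Percentage by which the pitch shrinks."""
    return 100.0 * (old_pitch_um - new_pitch_um) / old_pitch_um


for pitch in (250.0, 50.0, 25.0, 20.0):
    print(
        f"{pitch:5.0f} um pitch: {channels_per_mm(pitch):5.1f} channels/mm, "
        f"{density_gain(TRADITIONAL_PITCH_UM, pitch):4.1f}x the 250 um density, "
        f"{size_reduction_percent(TRADITIONAL_PITCH_UM, pitch):4.0f}% smaller pitch"
    )
```

Under this model a 50-micron pitch is 80% smaller than 250 microns, matching the article, and 25 microns gives ten times the channels per millimeter, which lines up with the lower 1,000% figure.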

An exploded view of the prototype co-packaged optics module.

Insertion loss from the photonic integrated circuit (PIC) to the SMF optical interconnect has typically been 1.5 to 2 decibels (dB) per channel, but in this case it was demonstrated to be less than 1.2 dB for the complete optical interconnect. Additionally, demonstrations with 18.4 micron pitch optical waveguides have shown crosstalk of less than 30 dB, suggesting that this co-packaged optics technology is scalable to very high bandwidth density for chip interconnection.
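To put the insertion-loss figures in more familiar terms, decibels can be converted to the fraction of optical power that survives the link. This is a minimal sketch using the standard power-ratio definition of the decibel; the function name is hypothetical, and the dB values are simply the ones quoted above.

```python
# Convert insertion loss in dB to the fraction of optical power transmitted.
# Uses the standard definition: loss_dB = -10 * log10(P_out / P_in).


def transmitted_fraction(loss_db: float) -> float:
    """Fraction of input power remaining after `loss_db` of insertion loss."""
    return 10.0 ** (-loss_db / 10.0)


# Loss values quoted in the article: typical range, then the new result.
for label, loss_db in (
    ("typical, high end", 2.0),
    ("typical, low end", 1.5),
    ("demonstrated (upper bound)", 1.2),
):
    pct = 100.0 * transmitted_fraction(loss_db)
    print(f"{label:27s} {loss_db:.1f} dB -> {pct:.1f}% of power transmitted")
```

Since 1.2 dB is quoted as an upper bound on the loss, the corresponding transmitted fraction is a lower bound for the demonstrated link.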

This means that the team can follow the example of the telecommunications industry and transmit multiple wavelengths of light per optical channel, which has the potential to increase bandwidth by at least 4,000% and by as much as 8,000%.

Beyond the fiber-to-chip and fiber-to-board connections, the team is also reinforcing traditional optical fibers with high-strength polymers, a move that improves durability and efficiency but also requires advanced modeling simulations of optical lengths to ensure that light can pass through multiple components without loss – the “co-packaging” of everything.

The polymer fiber optic cables are being tested at IBM’s facility in Bromont, Quebec. There they are subjected to heat and cold cycles, high humidity and mechanical stress tests.

This development process also includes industry-standard reliability stress testing to ensure that all optical and electrical connections continue to function when subjected to the stresses encountered during manufacturing and use. The components are subjected to temperatures ranging from -40°C to 125°C as well as mechanical durability tests to confirm that the optical fibers can withstand bending without breaking or suffering data loss. These tests take place at IBM Research’s global headquarters in Yorktown Heights, New York, and at the IBM facility in Bromont, Quebec.

“The big deal is not only that we have this big density improvement for communications on the module, but we’ve also shown that this is compatible with stress tests that optical links have failed in the past,” says Knickerbocker. IBM’s modules are designed to be compatible with standard electronic passive advanced packaging assembly processes, which can result in lower production costs. This innovation allows IBM to produce co-packaged optical modules at its Bromont factory.

The team is creating a roadmap for the next steps this technology will take, including gathering feedback from IBM customers and enabling co-packaged optics to meet the business needs of generative AI computing. “We will also work with the component suppliers to position them for this next step in technology,” says Knickerbocker, “and to position them for the ability to support production volumes, not just prototypes.”
