Transistors on the Edge: The Quest for Energy-Efficient Computing

We are at an inflection point, surrounded by promising technological breakthroughs. In recent years, artificial intelligence and machine learning have advanced at a meteoric pace. From Deep Blue to AlphaGo to ChatGPT, these milestones mark a shift away from traditional computing paradigms. As we build more powerful AI systems, we require ever more computation: AI models are built from complex mathematical operations, such as matrix multiplications and nonlinear transformations, and both training and inference demand enormous processing power. The question is: where does this computational power come from? Transistors. Billions of them.
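To get a feel for the scale involved, consider the arithmetic behind a single matrix multiplication, the workhorse of neural networks. Here is a rough back-of-the-envelope sketch (the layer dimensions are invented for illustration, not taken from any particular model):

```python
# Rough cost of multiplying an (m x k) matrix by a (k x n) matrix:
# each of the m*n outputs needs k multiplies and k adds, so ~2*m*k*n FLOPs.
def matmul_flops(m: int, k: int, n: int) -> int:
    return 2 * m * k * n

# A single hypothetical layer projecting 4096 tokens through a
# 12288 x 12288 weight matrix:
flops = matmul_flops(4096, 12288, 12288)
print(f"{flops:.3e} FLOPs for one matrix multiply")  # ~1.2e12

# Modern models chain thousands of such multiplies per forward pass,
# which is why training runs consume staggering amounts of compute.
```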

A transistor is a tiny electronic switch that can be either on or off; it controls the flow of electricity and is the fundamental building block of modern electronics. The invention of the transistor is among the most important in human history; an estimated 13 sextillion transistors have been manufactured since its invention in 1947. A transistor is a semiconductor device that can amplify or switch electrical signals, and transistors are packed by the billions into computer chips to perform an enormous variety of functions. Chip designers constantly balance a tradeoff between performance, power, and area.

Moore's law, an observation of semiconductor industry trends over the past five decades, describes how chips have scaled: the number of transistors on an integrated circuit doubles roughly every two years. Moore's law held for so long because of Dennard scaling. As transistors shrink, we can pack more of them onto a chip, increasing its density. Normally, more transistors would mean more energy expenditure; however, with the reduced size comes reduced voltage and current, which cancels out the increase in power. In other words, we could boost computation speed while keeping power consumption constant. This exponential growth drove decades of rapid advancement in computing power.

Moore's law and Dennard scaling, however, are now running into physical limits: quantum effects, heat dissipation, and atomic-scale features are all challenges the industry currently faces. In fact, Dennard scaling broke down around 2005. What are the implications? Leakage current, current that flows through a transistor even when it is switched off, grows as transistors shrink; it reduces reliability and wastes power. As a result, power density no longer stays constant: smaller transistors now generate more heat per unit area. This can lead to thermal runaway, a feedback loop in which rising temperature increases leakage, which dissipates more heat, which raises the temperature further until the transistors fail. As we approach the physical limits, thermal management becomes ever more important.

Another issue stems from chip design itself. Traditional computers rely on the von Neumann architecture, which separates memory from computation. This creates a bottleneck: chips waste time and energy shuttling data back and forth between memory and the CPU, and these memory-related delays leave computational resources underused. Addressing these issues remains a crucial area of research and innovation.
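The arithmetic behind Dennard scaling is worth making concrete: dynamic power goes roughly as capacitance times voltage squared times frequency. Here is a minimal sketch of the bookkeeping, using idealized scaling factors rather than real process data:

```python
# Idealized Dennard scaling: shrink linear dimensions by s (e.g. 0.7x).
# Dynamic power per transistor: P ~ C * V^2 * f
# Under classical scaling, C and V both shrink by s while f rises by 1/s.
def scaled_power_density(s: float) -> float:
    C = s          # capacitance scales with dimensions
    V = s          # supply voltage scales with dimensions
    f = 1 / s      # clock frequency rises as gates get faster
    power_per_transistor = C * V**2 * f        # = s^2
    transistors_per_area = 1 / s**2            # density rises as 1/s^2
    return power_per_transistor * transistors_per_area

print(scaled_power_density(0.7))  # ≈ 1.0: power density stays constant
# Post-2005, V can no longer scale down (leakage explodes), so the
# same shrink now *increases* power density instead.
```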

In this post-Moore era, the semiconductor industry has begun shifting away from transistor scaling and toward new and exciting alternatives. One major focus is multicore processing as the primary way to increase performance. This form of parallelism gives each chip many cores, each operating below full capacity. In other words, this architectural approach improves performance by executing many instructions simultaneously across cores.
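As a toy illustration of why more cores help, here is a sketch that splits an embarrassingly parallel workload across worker processes (Python's multiprocessing stands in for hardware cores; the workload itself is invented for the example):

```python
from multiprocessing import Pool

def heavy_task(n: int) -> int:
    # Stand-in for a compute-bound kernel.
    return sum(i * i for i in range(n))

if __name__ == "__main__":
    inputs = [2_000_000] * 8
    # Serial: one core grinds through all eight chunks back to back.
    serial = [heavy_task(n) for n in inputs]
    # Parallel: four worker processes each take a share of the chunks,
    # finishing in roughly a quarter of the wall-clock time.
    with Pool(processes=4) as pool:
        parallel = pool.map(heavy_task, inputs)
    assert serial == parallel
```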

At the forefront of this innovation is NVIDIA, a multinational tech company that designs and sells GPUs, chips, and other technology. Its focus in recent years has been on fueling the AI revolution. As it stands, specialized hardware for artificial intelligence is in high demand, and NVIDIA has a near-monopoly on AI hardware due to its comparative advantage in chip production and graphics processing. On March 18th, 2024, NVIDIA unveiled its Blackwell architecture, a commercial processor that signals the next chapter in AI computing power. Billed as an AI superchip, NVIDIA Blackwell boasts 208 billion transistors and promises up to 25x lower cost and energy consumption than its predecessor. NVIDIA's specialized AI-chip architecture shows the steps the company is taking to keep improving hardware and to address the challenges facing the semiconductor industry.

The rest of the industry is also exploring novel solutions beyond silicon. From quantum computing to neuromorphic computing to photonics to 3D integration, the sky is the limit for the future of technology.

For certain problems, quantum computing can process exponentially more information than classical computing. This is due to quantum superposition, a phenomenon in which a quantum system exists in multiple states simultaneously. Rather than bits, a quantum computer uses qubits, the basic unit of quantum information, which can represent both 0 and 1 at the same time. Together with entanglement, superposition allows certain computations to explore many possibilities at once, producing exponential speedups. Quantum computers thus have the potential to solve some tasks exponentially faster than classical computers.
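A classical computer can still simulate a small quantum state, which makes superposition easy to see. The sketch below (plain numpy, no quantum hardware) puts one qubit into an equal superposition using a Hadamard gate:

```python
import numpy as np

# A qubit is a 2-component complex vector; |0> is [1, 0].
ket0 = np.array([1, 0], dtype=complex)

# The Hadamard gate rotates |0> into an equal superposition of |0> and |1>.
H = np.array([[1, 1],
              [1, -1]], dtype=complex) / np.sqrt(2)

state = H @ ket0
probs = np.abs(state) ** 2
print(probs)  # [0.5 0.5]: measurement yields 0 or 1 with equal probability

# n qubits need a 2**n-component state vector, which is why classical
# simulation hits a wall that quantum hardware aims to sidestep.
```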

Then there is neuromorphic computing, which attempts to mimic the analog nature of the human brain. Researchers at Intel have designed Loihi, a processor that operates more like a biological brain, receiving and transmitting signals through voltage spikes. Rather than relying on the von Neumann architecture, neuromorphic computers use networks of artificial neurons to compute more efficiently.
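To see what communicating through voltage spikes looks like, here is a minimal leaky integrate-and-fire neuron, the standard textbook model underlying spiking chips like Loihi (the constants here are arbitrary illustrations, not Loihi's actual parameters):

```python
# Leaky integrate-and-fire neuron: membrane voltage leaks toward rest,
# integrates incoming current, and emits a spike when it crosses a
# threshold, then resets.
def simulate_lif(currents, leak=0.9, threshold=1.0):
    v, spikes = 0.0, []
    for i in currents:
        v = leak * v + i          # leak, then integrate input current
        if v >= threshold:
            spikes.append(1)      # fire a spike...
            v = 0.0               # ...and reset the membrane
        else:
            spikes.append(0)
    return spikes

print(simulate_lif([0.3] * 10))  # sparse output: [0,0,0,1,0,0,0,1,0,0]
```

The appeal for energy efficiency is that such neurons are event-driven: they only consume power when spikes occur, rather than on every clock cycle.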

Photonic computing is also an interesting avenue to consider, as it harnesses light for data processing. The idea goes as follows: say you had two circuits, identical in functionality and power consumption, except one uses electrical signals and the other uses light. Replacing the electronic components with optical equivalents would let the circuit run much faster and with far greater bandwidth. Though light propagation and modulation pose challenges for photonics, it remains a promising alternative to conventional electronics.

Another direction the semiconductor industry is exploring is stacking transistors in 3D. With the breakdown of Dennard scaling, planar transistors are approaching the physical limits of miniaturization. The advantage of 3D transistors over today's 2D layouts is that scaling can continue: stacking transistors vertically offers faster operation and better power efficiency, circumventing the inefficiencies currently in place.

These innovations are transforming the semiconductor industry and the way chips are designed and fabricated. AI enhances manufacturing processes while manufacturing provides the foundation for AI. This interlacing of the two fields allows companies such as NVIDIA, AMD, Intel, and TSMC to support Amazon, Meta, OpenAI, and countless others in their continued development of AI. The announcement of the Cerebras Wafer Scale Engine 3 (WSE-3) on March 13th, 2024 is proof of this rapid expansion. The third generation of Cerebras's wafer-scale AI megachips, the WSE-3 is purpose-built for cutting-edge AI work, with over 4 trillion transistors and support for AI models of up to 24 trillion parameters. Transistors have revolutionized the modern world, and the breakthroughs associated with them have brought about immense technological change. The invention of the transistor marked an inflection point in human history, and artificial intelligence will be another. These innovations propel us toward an exciting and transformative future!
