A paper titled MCM-GPU: Multi-Chip-Module GPUs for Continued Performance Scalability has been published by researchers from Arizona State University, NVIDIA, the University of Texas at Austin, and the Barcelona Supercomputing Center / Universitat Politecnica de Catalunya.
The paper discusses the prospect of packing multiple interconnected GPU modules onto a single graphics card, as opposed to having a single monolithic GPU on the card, as shown below. They call this design the Multi-Chip-Module GPU, or MCM-GPU for short.
The idea is that putting many smaller GPU modules onto a PCB, rather than one monolithic GPU, would allow Nvidia to pack more SMs onto a graphics card. The paper shows that many of today's GPGPU workloads scale very well with an increased number of SMs (see the figure below this paragraph), and argues that this, along with the slowing of Moore's Law, is the main motivation for the MCM-GPU design.
The paper then goes on to discuss the drawbacks of increasing the number of graphics cards in the user's system, compared to having all the GPU interconnects, scheduling, and communication occur on one card. They argue that a multi-GPU setup relies on programmer expertise and system connectivity to be used effectively — drawbacks the MCM design won't suffer from, because it will appear to the system as a single GPU.
There is a long section discussing cache architecture that I won't go into in great detail in this summary. It is worth noting, however, that the cache and scheduling design is what distinguishes the MCM-GPU from a multi-GPU setup, so if you are interested this part of the paper is worth reading more closely.
At the end of the paper they discuss the performance potential of this design (which they say is feasible on the next generation of manufacturing node), as shown below (it looks good).
In reading this paper I can't help but be reminded of something else: specifically, AMD's Zen architecture. The modular GPU design seems to share certain inspirations with the modular way that AMD now packs CCX modules onto a die (each one containing 4 cores). While the idea may be similar, this potentially bodes well for Nvidia. Ryzen is proving to be an impressive processor series, with people very excited about EPYC and Threadripper. Not only that, but reports suggest that designing CPUs in this way is very cost-efficient compared to trying to make one massive CPU.
The design will likely face hurdles, and potentially performance issues, if scheduling and inter-module communication aren't handled correctly. But if Ryzen is anything to go by, the MCM-GPU design is something people should consider getting excited about.