- GSI's Gemini-I APU reduces the constant data shuffling between the processor and memory systems
- Completes retrieval tasks up to 80% faster than comparable CPUs
- The GSI Gemini-II APU is expected to deliver ten times higher throughput
GSI Technology is promoting a new approach to artificial intelligence processing that places computation directly inside memory.
A new study by Cornell University draws attention to this design, known as the associative processing unit (APU).
It aims to overcome long-standing performance and efficiency limits, suggesting it could challenge the dominance of the best GPUs currently used in AI tools and data centers.
A new contender in AI hardware
Published in an ACM journal and presented at the recent Micro '25 conference, the Cornell study evaluated GSI's Gemini-I APU against leading CPUs and GPUs, including Nvidia's A6000, using retrieval-augmented generation (RAG) workloads.
The tests spanned datasets from 10 to 200GB, representing realistic AI inference conditions.
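For context, the core of such a RAG workload is a similarity search: each query embedding is compared against every stored document embedding, and the best matches are returned. A minimal sketch of that retrieval step, using NumPy and toy dimensions chosen for illustration (not taken from the Cornell benchmark), might look like this:

```python
import numpy as np

# Hypothetical sizes for illustration only; the Cornell tests used
# corpora from 10 to 200GB, far larger than this toy example.
NUM_DOCS, DIM, TOP_K = 100_000, 768, 5

# Document embeddings, as a RAG system would precompute and store them.
docs = np.random.rand(NUM_DOCS, DIM).astype(np.float32)
query = np.random.rand(DIM).astype(np.float32)

# Exhaustive retrieval: one dot product per stored vector. On a CPU
# or GPU, every row of `docs` must be streamed from memory to the
# processor just to make this single pass.
scores = docs @ query
top_ids = np.argsort(scores)[-TOP_K:][::-1]
print(top_ids, scores[top_ids])
```

The compute per vector here is trivial; the dominant cost is moving the vectors to the processor, which is the traffic compute-in-memory designs aim to eliminate.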
By performing computation inside static RAM, the APU reduces the constant data shuffling between the processor and memory.
This is a key source of energy loss and latency in conventional GPU architectures.
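A rough back-of-envelope estimate shows why that traffic matters; the bandwidth figure below is an assumption for illustration, not a number from the study:

```python
# Memory-traffic floor for one exhaustive retrieval pass over the
# largest dataset in the Cornell tests.
corpus_bytes = 200e9     # 200GB corpus, per the study
bus_bandwidth = 100e9    # ~100 GB/s memory bandwidth (assumed, illustrative)

# A conventional processor must pull the whole corpus across the
# memory bus once per exhaustive scan, so bandwidth sets a floor
# on latency regardless of how fast the arithmetic units are:
seconds_per_scan = corpus_bytes / bus_bandwidth
print(f"{seconds_per_scan:.1f} s per scan")  # ~2.0 s under these assumptions

# Compute-in-memory sidesteps this: the comparison happens where the
# data already lives, so most of that bus transfer disappears.
```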
The results showed the APU could achieve GPU-class throughput while consuming far less power.
GSI reported its APU used up to 98% less energy than a standard GPU and completed retrieval tasks up to 80% faster than comparable CPUs.
Such efficiency could make it appealing for edge devices such as drones, IoT systems, and robotics, as well as for defense and aerospace use, where energy and cooling limits are strict.
Despite these findings, it remains unclear whether compute-in-memory technology can scale to the same level of maturity and support enjoyed by the best GPU platforms.
GPUs currently benefit from well-developed software ecosystems that allow seamless integration with leading AI tools.
For compute-in-memory devices, optimization and programming remain emerging areas that could slow broader adoption, especially in large data center operations.
GSI Technology says it is continuing to refine its hardware, with the Gemini-II generation expected to deliver ten times higher throughput and lower latency.
Another design, named Plato, is in development to further extend compute performance for embedded edge systems.
“Cornell’s independent validation confirms what we’ve long believed, compute-in-memory has the potential to disrupt the $100 billion AI inference market,” said Lee-Lean Shu, Chairman and Chief Executive Officer of GSI Technology.
“The APU delivers GPU-class performance at a fraction of the energy cost, thanks to its highly efficient memory-centric architecture. Our recently released second-generation APU silicon, Gemini-II, can deliver roughly 10x faster throughput and even lower latency for memory-intensive AI workloads.”
Via TechPowerUp