November 24, 2024

Obscure startup wins prestigious CES 2024 award — you’ve probably never heard of it, but Panmnesia is the company that could make ChatGPT 6 (or 7) times faster


The highly coveted Innovation Award at the forthcoming Consumer Electronics Show (CES) 2024 event in January has been snapped up by Korean startup Panmnesia for its AI accelerator.

Panmnesia has built its AI accelerator on Compute Express Link (CXL) 3.0 technology, which allows an external memory pool to be shared among host computers and components such as CPUs, translating to near-limitless memory capacity. This is achieved by incorporating a CXL 3.0 controller into the accelerator chip.

CXL is used to connect system devices, including accelerators, memory expanders, processors, and switches. By linking multiple accelerators and memory expanders through CXL switches, the technology can supply the vast memory that demanding AI applications require.

What CXL 3.0 means for LLMs

Using CXL 2.0 in devices like this would give each host access only to its dedicated portion of the pooled external memory, whereas the latest generation lets hosts access the entire pool as and when needed.
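The difference can be illustrated with a minimal sketch. This is a hypothetical model for intuition only, not a real CXL API: under CXL 2.0-style pooling each host is bound to a fixed slice of the pool, while under CXL 3.0-style pooling any host can reach the whole pool.

```python
# Hypothetical model of pooled memory access (illustration, not a real CXL API).

class MemoryPool:
    def __init__(self, total_gb, hosts, shared):
        self.total_gb = total_gb
        self.shared = shared  # True models CXL 3.0; False models CXL 2.0
        # In the 2.0-style model, each host is assigned a fixed slice up front.
        self.partition_gb = None if shared else total_gb // len(hosts)

    def accessible_gb(self, host):
        """Memory a given host can reach in this model."""
        if self.shared:
            return self.total_gb      # 3.0-style: entire pool on demand
        return self.partition_gb      # 2.0-style: only its dedicated slice

hosts = ["host-a", "host-b", "host-c", "host-d"]
cxl2_pool = MemoryPool(1024, hosts, shared=False)
cxl3_pool = MemoryPool(1024, hosts, shared=True)

print(cxl2_pool.accessible_gb("host-a"))  # 256 GB: a fixed quarter of the pool
print(cxl3_pool.accessible_gb("host-a"))  # 1024 GB: the full pool when needed
```

The practical upshot for AI workloads is the second case: a single host running a large model can temporarily draw on far more memory than a fixed partition would allow.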

“We believe that our CXL technology will be a cornerstone for next-generation AI acceleration system,” said Panmnesia founder and CEO Myoungsoo Jung in a statement.

“We remain committed to our endeavor revolutionizing not only for AI acceleration system, but also other general-purpose environments such as data centers, cloud computing, and high-performance computing.”

Panmnesia’s technology works much like clusters of servers sharing external SSDs to store data, and it would be particularly useful for servers because they often need to access more data than they can hold in their built-in memory.

This device is built specifically for large-scale AI applications, and its creators claim it is 101 times faster at performing AI-based search functions than conventional setups, which store data on SSDs linked via networks. The architecture also minimizes energy costs and operational expenditure.

If used, alongside hardware from other suppliers, in the server configurations that the likes of OpenAI rely on to host large language models (LLMs) such as ChatGPT, it could drastically improve the performance of these models.

More from TechRadar Pro

ChatGPT explained: everything you need to know about the AI chatbot
Intel has a new rival to Nvidia’s uber-popular H100 AI GPU — meet Gaudi3 with a whopping 128GB RAM
What is an AI chip? Everything you need to know