LAS VEGAS–(BUSINESS WIRE)–#Linux—Tachyum® today announced that its AI team has successfully demonstrated an algorithm for LLM training in the quantized 4-bit FP4 format, dramatically reducing memory and compute requirements while delivering faster, more cost-effective and energy-efficient training without sacrificing model accuracy or downstream task performance. This advancement, detailed in the company’s latest white paper, “Tachyum demonstrates supercharged LLM training in only 4 bits,” offers transformative potential for LLM development, accelerating innovation by reducing the capital and operational costs of training state-of-the-art AI.
Tachyum’s AI team has demonstrated that a foundation model fine-tuned on a task-specific dataset in the FP4 data type, which represents values in a 4-bit floating-point format rather than the standard 32-bit FP32 or 16-bit BF16 formats, achieves parity with traditional FP32 training baselines.
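To make the format concrete, the short Python/NumPy sketch below illustrates FP4 quantization in the common E2M1 layout, where 1 sign bit, 2 exponent bits and 1 mantissa bit yield just 8 distinct magnitudes per sign. This is an illustration of the data type only, not Tachyum’s training algorithm; the per-tensor max-abs scaling and the function names are assumptions made for the example.

import numpy as np

# The eight non-negative magnitudes representable in E2M1 FP4
# (1 sign bit, 2 exponent bits, 1 mantissa bit).
FP4_GRID = np.array([0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0])

def quantize_fp4(x):
    """Round x to the nearest FP4 grid point after per-tensor scaling."""
    scale = np.abs(x).max() / FP4_GRID[-1]   # map the largest |x| to 6.0
    if scale == 0.0:
        scale = 1.0                          # all-zero tensor edge case
    mags = np.abs(x) / scale
    # nearest-neighbour lookup into the FP4 magnitude grid
    idx = np.abs(mags[..., None] - FP4_GRID).argmin(axis=-1)
    return np.sign(x) * FP4_GRID[idx], scale

def dequantize_fp4(q, scale):
    return q * scale

w = np.random.randn(4, 4).astype(np.float32)
q, s = quantize_fp4(w)
print("max abs round-trip error:", np.abs(w - dequantize_fp4(q, s)).max())

In quantized training, a quantizer of this kind is typically applied in the forward pass, with gradients passed through unchanged (a straight-through estimator); Tachyum’s actual scheme is described in the white paper.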
Quantization reduces the numerical precision of values by encoding each one in fewer bits. Quantized models are therefore more compact, placing lower demands on compute, storage and memory. FP4 promises up to 4x better memory efficiency than 16-bit formats and up to 8x better than 32-bit.
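The 4x and 8x figures follow directly from the bit widths; a quick back-of-the-envelope check of parameter memory alone (the 70B parameter count is illustrative, and optimizer state and activations are ignored in this sketch):

PARAMS = 70e9  # illustrative 70B-parameter model

for name, bits in [("FP32", 32), ("BF16", 16), ("FP4", 4)]:
    gib = PARAMS * bits / 8 / 2**30
    print(f"{name:>4}: {bits:>2} bits/param -> {gib:7.1f} GiB of weights")

# FP32 -> ~260.8 GiB, BF16 -> ~130.4 GiB, FP4 -> ~32.6 GiB:
# FP4 uses 1/4 the memory of BF16 and 1/8 that of FP32.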
LLMs grow in size by as much as 10x from one generation to the next, and the resulting training times introduce significant delays and drive up overall costs. The ability to train in FP4 addresses these pain points by improving performance, shortening training times and further reducing TCO.
“With AI model sizes doubling every 3 to 6 months, there is an ever-growing need for greater efficiency to narrow the ratio between model sizes and processing times,” said Dr. Radoslav Danilak, founder and CEO of Tachyum. “With models moving from 1T to 10T parameters – with the goal to achieve human brain processing of about 100T synapses – traditional FP32 or BF16 formats simply are a barrier to the AI revolution. By quantizing to the 4-bit FP4 format, we continue to transform the economics of AI by delivering industry-leading performance, cost and power efficiency required to properly train LLMs.”
As a Universal Processor offering industry-leading performance across all workloads, Tachyum Prodigy®-powered data center servers can seamlessly and dynamically switch between computational domains (such as AI/ML, HPC and cloud) with a single homogeneous architecture. By eliminating the need for expensive dedicated AI hardware and dramatically increasing server utilization, Prodigy significantly reduces CAPEX and OPEX while delivering unprecedented data center performance, power and economics. Prodigy integrates 256 high-performance custom-designed 64-bit compute cores to deliver up to 18x the performance of the highest-performing GPU for AI applications, 3x the performance of the highest-performing x86 processors for cloud workloads, and up to 8x the performance of the highest-performing GPU for HPC.
Those interested in learning more about Tachyum’s unique approach to solving the most challenging barriers to AI can download the “Tachyum demonstrates supercharged LLM training in only 4 bits” white paper at https://www.tachyum.com/resources/whitepapers/2025/10/08/tachyum-demonstrates-supercharged-llm-training-in-only-4-bits/.
Follow Tachyum
https://x.com/tachyum
https://www.linkedin.com/company/tachyum
https://www.facebook.com/Tachyum/
About Tachyum
Tachyum is transforming the economics of AI, HPC, and public and private cloud workloads with Prodigy, the world’s first Universal Processor. Prodigy unifies the functionality of a CPU, a GPU and a TPU in a single processor to deliver industry-leading performance, cost and power efficiency for both specialty and general-purpose computing. As global data center emissions continue to contribute to a changing climate, and with data centers projected to consume 10 percent of the world’s electricity by 2030, the ultra-low-power Prodigy is positioned to help balance the world’s appetite for computing at a lower environmental cost. Tachyum received a major purchase order from a US company to build a large-scale system that can deliver more than 50 exaflops of performance, which will exponentially exceed the computational capabilities of the fastest inference or generative AI supercomputers available anywhere in the world today. Tachyum has offices in the United States, Slovakia and the Czech Republic. For more information, visit https://www.tachyum.com/.
Contacts
Mark Smith
JPR Communications
818-398-1424
[email protected]