Google launched its Gemma 4 open models this spring, promising a new level of power and performance for local AI. Now the tech giant's edge AI lineup is set to get even faster with the release of Multi-Token Prediction (MTP) drafters for Gemma. Google describes these experimental models as leveraging a form of speculative decoding to predict future tokens, significantly accelerating generation compared to traditional token-by-token processing.
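To make the idea concrete, here is a minimal sketch of greedy speculative decoding: a cheap "drafter" proposes a short run of tokens, and the expensive target model verifies them, accepting the longest agreeing prefix. The toy models, vocabulary, and function names below are illustrative assumptions, not Gemma's actual MTP drafters.

```python
import random

random.seed(0)
VOCAB = 16  # toy vocabulary size

def target_next(context):
    # Stand-in for the large, slow model: deterministic "greedy" next token.
    return (sum(context) * 31 + len(context)) % VOCAB

def draft_next(context):
    # Stand-in for the cheap drafter: agrees with the target most of the time.
    guess = target_next(context)
    return guess if random.random() < 0.8 else (guess + 1) % VOCAB

def speculative_decode(context, k=4, total=16):
    out = list(context)
    while len(out) - len(context) < total:
        # 1) Drafter proposes k tokens in one cheap pass.
        proposal, ctx = [], list(out)
        for _ in range(k):
            t = draft_next(ctx)
            proposal.append(t)
            ctx.append(t)
        # 2) Target verifies; keep the longest prefix it agrees with.
        accepted, ctx = 0, list(out)
        for t in proposal:
            if target_next(ctx) == t:
                out.append(t)
                ctx.append(t)
                accepted += 1
            else:
                break
        # 3) On a mismatch, fall back to the target's own token, so every
        #    round emits at least one token and output matches the target.
        if accepted < k:
            out.append(target_next(ctx))
    return out[len(context):len(context) + total]

print(speculative_decode([1, 2, 3]))
```

The key property of this verify-and-accept scheme is that the output is identical to decoding with the target model alone; the drafter only lets several tokens land per expensive verification pass instead of one.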

The latest Gemma models are built on the same underlying technology that powers Google’s frontier Gemini AI. However, they are specifically tuned to run locally rather than on Google’s cloud infrastructure. While Gemini is optimized for Google’s custom TPU chips—operating in high-performance clusters with ultra-fast interconnects and memory—the Gemma 4 models are designed to function on consumer-grade hardware. A single high-power AI accelerator can run the largest Gemma 4 model at full precision, and quantization techniques enable it to operate on a consumer GPU.

Gemma empowers users to experiment with AI directly on their own devices, reducing reliance on cloud-based AI systems that require sharing data with third parties. In a notable shift, Google updated the license for Gemma 4 to Apache 2.0, a far more permissive open-source license compared to the custom terms used for previous Gemma releases. Despite these advancements, local AI models still face hardware limitations for many users—an issue MTP aims to address.