Google Brings Gemma 4 12B and AI Edge Gallery to Mac: Frontier-Grade AI Now Runs on Your Laptop, Offline
Google has launched Gemma 4 12B, a multimodal 12-billion-parameter model that matches its 26B MoE variant in performance, alongside AI Edge Gallery for macOS, the company's first native Mac app for running Gemma models entirely offline on consumer hardware.
Google DeepMind’s latest open model, Gemma 4 12B, is designed to bring agentic, multimodal intelligence directly to your laptop. It arrives alongside Google AI Edge Gallery for macOS, the company’s first native Mac app for running AI models entirely locally, and AI Edge Eloquent, a free on-device dictation tool.
Unlike cloud-based AI systems such as ChatGPT, local AI models run entirely on a user’s computer, with no internet connection required and no data sent to external servers.
The three-product release is Google’s clearest statement yet on what on-device AI should look like: genuinely useful, private by default, and fast on hardware most people already own.
What Gemma 4 12B Can Do on a Consumer Laptop
While most consumer-facing local models from frontier AI labs stay between 2 billion and 9 billion parameters, Gemma 4 12B delivers performance comparable to Google’s 26-billion-parameter MoE model while remaining small enough to run on a laptop with 16GB of RAM.
The fully multimodal model supports text, image, and audio inputs, handling tasks from content creation to software development and data analysis, making it the most capable multimodal model yet designed for consumer laptop hardware.
Combined with the Google AI Edge stack, it enables on-device capabilities ranging from autonomous data processing and rich visual insights to building functional webpages and executing everyday tool use.
The model runs on Apple Silicon Macs and Windows machines with discrete Nvidia GPUs, with Google recommending at least 16GB of unified memory for smooth inference.
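For a rough sense of why 16GB is a plausible floor, consider that weight memory scales with parameter count times bits per weight. The sketch below is back-of-the-envelope arithmetic, not Google's sizing method. The precision levels (16-, 8-, and 4-bit) are assumptions chosen for illustration, since the announcement as summarized here does not say which quantization is used, and the figures exclude the KV cache, activations, and the operating system's own memory use.

```python
def weight_memory_gb(num_params: int, bits_per_weight: int) -> float:
    """Approximate memory for model weights alone, in decimal gigabytes."""
    return num_params * bits_per_weight / 8 / 1e9


PARAMS_12B = 12_000_000_000  # 12 billion parameters, per the article

# Hypothetical precision levels; the article does not say which one is used.
for bits in (16, 8, 4):
    print(f"{bits}-bit weights: ~{weight_memory_gb(PARAMS_12B, bits):.0f} GB")
```

Under these assumptions, 16-bit weights (about 24 GB) would not fit in 16GB of memory, while 8-bit (about 12 GB) or 4-bit (about 6 GB) weights would, leaving headroom for everything else the machine is running.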
AI Edge Gallery and the Walled Garden Worth Knowing About
AI Edge Gallery lets Mac users run Gemma models locally with no internet connection required, using the computer’s own processing power. The free app is available now on macOS.
Alongside AI Edge Gallery, Google launched AI Edge Eloquent for macOS, a dictation tool that records speech, transcribes it, and improves the text by removing filler words, correcting disfluencies, and enhancing readability, all processed locally on the device.
One limitation is worth noting. Platforms like Ollama and LM Studio, which dominate the local AI scene, let users run virtually any compatible open-weight model. Google AI Edge Gallery, by contrast, supports only Google’s models, at least for now.
As 9to5Mac noted, that makes it feel less like a general-purpose local AI platform and more like a showcase for Google’s model family.
Whether Google opens the gallery to third-party models, as it has with the more open-ended Google Gemini Mac app, could determine how widely developers adopt it beyond the Gemma ecosystem.
Why On-Device AI Is Having Its Moment Right Now
With AI Edge Gallery, Gemma 4 12B, and Eloquent, Google is making a strong push for decentralized AI computing: running advanced models directly on consumer hardware reduces reliance on cloud infrastructure while improving performance and privacy.
The timing is deliberate. Apple is just days away from WWDC 2026, where iOS 27 is expected to introduce third-party AI model support in Apple Intelligence through the Extensions framework, bringing AI models like Gemini and Claude to iPhone as system-level alternatives.
By launching Gemma 4 12B on Mac ahead of that announcement, Google is positioning its local AI infrastructure for a moment when AI model competition on Apple platforms becomes more direct and visible.
Source: Introducing Gemma 4 12B



