Privacy Benefits and Competitive Positioning
Many users appreciate the model as a private, local alternative to frontier APIs, noting its superior tool-calling abilities compared to previous Gemma versions.
While users value Gemma 4 for its local privacy and impressive vision capabilities, many express frustration with its heavy hardware requirements and occasional logical failures or repetitive hallucinations.
Users are actively debating VRAM constraints and slow tokens-per-second throughput, specifically questioning how 31B models at various quantization levels perform on consumer GPUs.
Discussion highlights issues with model consistency, including endless output loops and hallucinations during coding tasks or complex logical reasoning tests.
The community shows high enthusiasm for vision performance, specifically praising the model's accuracy in OCR tasks and its ability to handle bounding boxes.
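The VRAM debate above can be made concrete with a rough back-of-the-envelope estimate. The numbers below are illustrative assumptions, not measured figures: the bits-per-weight values approximate common GGUF quantization formats, and the flat 2 GB overhead stands in for KV cache and runtime buffers, which in practice vary with context length.

```python
# Rough VRAM estimate for an N-billion-parameter model at a given
# quantization. Bits-per-weight values approximate common GGUF quants
# (K-quants store some tensors at higher precision, hence the fractions);
# the 2 GB overhead (KV cache, runtime buffers) is a loose assumption.
BITS_PER_WEIGHT = {"F16": 16.0, "Q8_0": 8.5, "Q5_K_M": 5.7, "Q4_K_M": 4.8}
OVERHEAD_GB = 2.0

def vram_estimate_gb(params_billion: float, quant: str) -> float:
    weights_gb = params_billion * BITS_PER_WEIGHT[quant] / 8  # Gparams * bytes/param
    return round(weights_gb + OVERHEAD_GB, 1)

for quant in BITS_PER_WEIGHT:
    print(f"31B @ {quant}: ~{vram_estimate_gb(31, quant)} GB")
```

Under these assumptions, a 31B model at Q4_K_M lands around 20 GB and squeezes onto a 24 GB consumer GPU, while Q8_0 (roughly 35 GB) does not, which matches the general shape of the community's complaints.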
Amazing! Shrinking size while keeping high intelligence, Apache 2.0, works on phones. This is the 2026 news we need. Keep going, Google.
I installed Gemma 4 (gemma4:e2b, 7.2 GB) running locally and had it derive a Constraint-Dynamical Hamiltonian for a Clifford algebra project I'm working on. You'll have to trust me: what it produced was amazing. So I have a thinking model running locally, using RAG to read files, that can do quite advanced math... that is the absolute bomb. :)
Gemma 4 welcome! 🎉 And thanks to everyone behind Gemma 4's development. We all appreciate the incredible work you all do.
Not surprised. Gemma is just a mini Gemini, so it's good at that stuff. Where GLM 5.1 shines is coding.
I don't know how you ran it. If you're running it locally with llama.cpp, use the b8660 build (more recent versions have a regression, another tokenization issue) and use --temp 0.3 --top-p 0.9 --min-p 0.1 --top-k 20; I'm sure the 26B will do much better. Also, Claude might be favored for its better formatting, so a boolean pass/fail test isn't a good measure. Try the prompt below for the judge: "I am benchmarking many AIs on many tasks. You are a judge. Go through them question by question, not LLM by LLM. For every question, give all AIs a score out of 10, and be sure to be fair with them. Then rank them all by their total score. MAKE SURE to evaluate them correctly, not based on vibe alone (check for misinformation, hallucinations, and whether they are useful or not, and not on formatting). PROMPT= AI 1: ... AI 2: ..."
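For readers unfamiliar with what those sampler flags actually do, here is a minimal sketch of the filtering chain on toy logits: temperature scaling, then top-k, top-p, and min-p truncation. The ordering shown is illustrative; llama.cpp lets you reorder its samplers, so treat this as a sketch of the idea rather than its exact implementation.

```python
import math

def sample_filter(logits, temp=0.3, top_k=20, top_p=0.9, min_p=0.1):
    """Return the token ids that survive the sampler chain.
    Order (temperature -> top-k -> top-p -> min-p) is illustrative."""
    # Temperature scaling, then softmax (shifted by the max for stability).
    m = max(logits)
    weights = [math.exp((l - m) / temp) for l in logits]
    total = sum(weights)
    probs = [w / total for w in weights]
    ranked = sorted(range(len(probs)), key=lambda i: -probs[i])
    # top-k: keep only the k most likely tokens.
    ranked = ranked[:top_k]
    # top-p (nucleus): keep the smallest prefix whose mass reaches top_p.
    kept, mass = [], 0.0
    for i in ranked:
        kept.append(i)
        mass += probs[i]
        if mass >= top_p:
            break
    # min-p: drop tokens below min_p times the best token's probability.
    floor = min_p * probs[kept[0]]
    return [i for i in kept if probs[i] >= floor]
```

At temp 0.3 the distribution is sharpened enough that a clear favorite usually survives alone, e.g. `sample_filter([5.0, 4.0, 1.0, 0.5])` keeps only token 0; lower temperature plus min-p is why this preset reins in the repetitive-loop behavior people report.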
LLM-as-judge = no thanks. It also depends on how you're running Gemma 4 for the test. The new custom parser for Gemma 4 in llama.cpp b8665 fixed it for me: before, it failed the test of just being given the image below; now it solves it.
Super excited about the direction things are going. The next generation will be frontier quality for most daily uses and fit on a single solid GPU like the Intel B70. A couple more turbo-quant-type advances and we're there on SOTA phones, probably two generations out. I'm genuinely concerned about the economy if the AI takeoff is entirely agents running on edge devices and the major labs' trillions in capital go stale, but I'm very glad we're leaning toward the good path where AI won't be controlled by the few.
Gemma 4 is the first actual leap AI has made in a long time. It's not just smaller; it also uses less computing power. I'm running it on my PC, and while it takes up 20 GB, it's equivalent to a 400 GB model... insane. And it's Apache 2.0, so you can build and sell any product you make with it.
Ever since Gemma 2 it's been useful for actually pushing back instead of being a 'yes man' (girl). Agreeableness is a flaw, and I don't like it in Qwen. (I'm absolutely right)
Qwen3 Coder Next losing to the 4B at actual game logic is the most demoralizing benchmark result I've seen this week; Playwright MCP doing the heavy lifting probably explains a lot of the variance here.
Graph based on sampled comments per item (n≤30)
themanmaran
r/LocalLLaMA
NetworkChuck
Google for Developers
DIY Smart Code
Teacher's Tech
零度解说
Zero to MVP
ByteMonk
Prasadtechintelugu
Bart Slodyczka
Ishan Sharma