Tutorial · llama.cpp · multimodal · Arduino UNO Q · edge inference
Arduino UNO Q Runs Local LLMs And VLMs
Relevance Score: 8.1

Edge Impulse engineer Marc Pous recently demonstrated running LLMs and VLMs locally on the Arduino UNO Q using yzma (a Go wrapper for llama.cpp) and compact GGUF models from Hugging Face. He runs a 135M SmolLM2 text model and a 500M SmolVLM2 multimodal model on the board's Debian Linux side, achieving fully offline inference, which makes the setup suitable for privacy-preserving edge applications in robotics and smart-home experiments.
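The source doesn't include Pous's code, and yzma's Go API isn't shown here. As a rough sketch of the same pipeline, the Go program below shells out to llama.cpp's llama-cli binary with a quantized GGUF model; the model path, filename, and prompt are placeholder assumptions.

```go
package main

import (
	"fmt"
	"log"
	"os/exec"
)

func main() {
	// Hypothetical local path: a SmolLM2-135M GGUF previously downloaded
	// from Hugging Face onto the UNO Q's Linux filesystem.
	model := "models/SmolLM2-135M-Instruct-Q8_0.gguf"
	prompt := "Name three uses for a single-board computer."

	// Run llama.cpp's llama-cli entirely offline on the board.
	// -m selects the model, -p supplies the prompt, -n caps generated tokens.
	out, err := exec.Command("llama-cli",
		"-m", model,
		"-p", prompt,
		"-n", "64",
	).CombinedOutput()
	if err != nil {
		log.Fatalf("inference failed: %v\n%s", err, out)
	}
	fmt.Println(string(out))
}
```

A sub-1B quantized model like this keeps memory use within reach of the UNO Q's hardware; running the 500M SmolVLM2 on image input would additionally require llama.cpp's multimodal tooling (or yzma's wrapper for it), which this sketch doesn't cover.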


