Tutorial · llama.cpp · multimodal · Arduino UNO Q · edge inference
Arduino UNO Q Runs Local LLMs And VLMs
Relevance Score: 8.1

Edge Impulse engineer Marc Pous recently demonstrated running LLMs and VLMs locally on the Arduino UNO Q using yzma (a Go wrapper for llama.cpp) and compact GGUF models from Hugging Face. He runs a 135M SmolLM2 text model and a 500M SmolVLM2 multimodal model on the board's Debian Linux side, achieving fully offline inference, which makes the setup suitable for privacy-preserving edge applications in robotics and smart-home experiments.
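The source doesn't include Pous's code, and yzma's Go API isn't shown here. As a rough sketch of the same pipeline, the Go program below shells out to llama.cpp's llama-cli binary with a quantized GGUF model; the model path, filename, and prompt are placeholder assumptions.

```go
package main

import (
	"fmt"
	"log"
	"os/exec"
)

func main() {
	// Hypothetical local path: a SmolLM2-135M GGUF previously downloaded
	// from Hugging Face onto the UNO Q's Linux filesystem.
	model := "models/SmolLM2-135M-Instruct-Q8_0.gguf"
	prompt := "Name three uses for a single-board computer."

	// Run llama.cpp's llama-cli entirely offline on the board.
	// -m selects the model, -p supplies the prompt, -n caps generated tokens.
	out, err := exec.Command("llama-cli",
		"-m", model,
		"-p", prompt,
		"-n", "64",
	).CombinedOutput()
	if err != nil {
		log.Fatalf("inference failed: %v\n%s", err, out)
	}
	fmt.Println(string(out))
}
```

A sub-1B quantized model like this keeps memory use within reach of the UNO Q's hardware; running the 500M SmolVLM2 on image input would additionally require llama.cpp's multimodal tooling (or yzma's wrapper for it), which this sketch doesn't cover.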


