

This Engineer Put an AI Language Model on a USB Stick and It Actually Works

A Raspberry Pi-powered USB stick runs a lightweight AI model, making LLMs portable and plug-and-play.

Tibi Puiu
February 17, 2025 @ 9:45 pm

In a world where artificial intelligence often seems to demand the computing power of a small city, one engineer has managed to shrink it down to the size of a USB stick. Meet the pocket-sized language model, a feat of ingenuity that proves big ideas don’t always need big hardware.

Large language models (LLMs) like GPT and LLaMA have become the rock stars of the AI world, capable of generating human-like text, answering questions, and even writing code. But these models typically have billions of parameters and are usually served from large data centers. Enter YouTuber Binh, a tinkerer who decided to challenge the status quo by cramming an LLM onto a USB stick.

This isn’t your average flash drive. Inside its custom 3D-printed case lies a Raspberry Pi Zero W, a tiny computer no bigger than a stick of gum. Running on this modest hardware is llama.cpp, an open-source C/C++ inference engine built to run models such as Meta’s LLaMA on ordinary hardware. But getting the software to work on the Pi wasn’t easy. The latest version of llama.cpp includes optimizations for ARMv8 processors, while the Raspberry Pi Zero W runs on the older ARMv6 architecture. So Binh had to painstakingly remove the ARMv8 optimizations.

His persistence paid off, and he successfully adapted the software to run on the older hardware. The result is a portable AI that fits in your pocket, with no cloud computing required.

Plug-and-Play AI

The real magic of this project lies in its simplicity. Binh designed the USB stick to be a composite device, meaning it can interact with any computer without requiring special drivers. To use the LLM, all you need to do is plug in the USB stick, create an empty text file, and give it a name. The model automatically generates text and saves it to the file.
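The workflow described above can be pictured as a small polling loop on the stick. What follows is a minimal sketch, not Binh's actual code: it assumes the file name serves as the prompt, and `generate_text`, `fill_empty_files`, and `watch` are hypothetical names, with `generate_text` standing in for a real call into llama.cpp.

```python
import tempfile
import time
from pathlib import Path


def generate_text(prompt: str) -> str:
    """Hypothetical stand-in for the model; a real build would run llama.cpp here."""
    return f"[generated text for prompt: {prompt}]"


def fill_empty_files(drive: Path) -> list[str]:
    """Fill every empty .txt file with generated text.

    Assumption: the file name (without extension) is used as the prompt.
    Returns the names of the files that were filled.
    """
    filled = []
    for path in sorted(drive.glob("*.txt")):
        if path.stat().st_size == 0:  # only touch files the user left empty
            path.write_text(generate_text(path.stem))
            filled.append(path.name)
    return filled


def watch(drive: Path, interval_s: float = 1.0, rounds: int = 3) -> None:
    """Poll the drive a fixed number of times; a real device would loop indefinitely."""
    for _ in range(rounds):
        fill_empty_files(drive)
        time.sleep(interval_s)


if __name__ == "__main__":
    # Demo in a temporary directory that plays the role of the mounted stick.
    drive = Path(tempfile.mkdtemp())
    (drive / "a story about a robot.txt").touch()
    print(fill_empty_files(drive))
    print((drive / "a story about a robot.txt").read_text())
```

Polling keeps the sketch simple and portable; a real device might instead react to file-system events, and would have to cope with the host computer caching writes to the drive.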

While it’s not as fast as its cloud-based counterparts, the USB-based LLM, first reported by Hackaday, is a striking proof of concept. “I believe this is the first plug-and-play USB-based LLM,” Binh said. And he’s probably right.

This project isn’t just a clever hack; it’s a glimpse into the future of AI accessibility. By making language models portable and easy to use, Binh has opened the door to new possibilities. Imagine students in remote areas using USB-based LLMs for homework help, or journalists in the field generating drafts without an internet connection.

It also raises questions about the environmental impact of AI. Large models require vast amounts of energy, contributing to carbon emissions. Smaller, more efficient models like this one could help reduce that footprint.

Of course, there are limitations. The Raspberry Pi Zero W has only 512 MB of RAM, which restricts the size and complexity of the model it can run. But as hardware improves, so too will the capabilities of these pocket-sized AIs.
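A rough back-of-the-envelope calculation shows why 512 MB is so restrictive. The sketch below estimates the memory needed for a model's weights alone as parameters times bits per weight. The parameter counts are illustrative round numbers, not the model Binh used, and `reserve_fraction` (the share of RAM left for the operating system and working buffers) is an assumed value.

```python
PI_ZERO_RAM_BYTES = 512 * 1024**2  # 512 MB, the figure quoted for the Pi Zero W


def weight_bytes(n_params: int, bits_per_weight: int) -> int:
    """Memory needed to hold the weights alone, in bytes."""
    return n_params * bits_per_weight // 8


def fits(n_params: int, bits_per_weight: int,
         ram_bytes: int = PI_ZERO_RAM_BYTES,
         reserve_fraction: float = 0.5) -> bool:
    """True if the weights fit in the RAM left after the reserve.

    reserve_fraction is an assumption, not a measured value.
    """
    budget = ram_bytes * (1 - reserve_fraction)
    return weight_bytes(n_params, bits_per_weight) <= budget


if __name__ == "__main__":
    # Illustrative model sizes only.
    for n_params in (15_000_000, 1_000_000_000, 7_000_000_000):
        for bits in (16, 4):
            size_mb = weight_bytes(n_params, bits) / 1024**2
            print(f"{n_params:>13,} params @ {bits:>2}-bit: "
                  f"{size_mb:10.1f} MB, fits={fits(n_params, bits)}")
```

Under these assumptions, a 7-billion-parameter model needs about 3.5 GB for its weights even at 4 bits each, far beyond the board's memory, while a model with tens of millions of parameters fits comfortably.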

For now, Binh’s USB stick is a reminder that innovation doesn’t always mean building bigger and faster. Sometimes, it’s about thinking smaller. And in this case, small is mighty.
