This advisory summarizes the performance of local large language models (LLMs) on real-world agent tasks, run through OpenClaw with Ollama on a Raspberry Pi 5 paired with an RTX 3090 GPU. The models were evaluated on email management, meeting scheduling, error detection, and browser automation, among other tasks. The qwen3.5:27b-q4_K_M model outperformed every other model by a wide margin with a score of 59.4%, while the larger qwen3.5:35b variant reached only 23.2%; the remaining models scored below 5%. The dominant performance factor was a model's ability to locate and invoke command-line tools correctly, which points to potential vulnerabilities in the underlying software configuration if those tool invocations are not properly secured.
Affected software:
- Ollama
- OpenClaw
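Since tool discovery was the dominant performance factor, one concrete check is verifying that the CLI tools an agent depends on are actually resolvable on the host's PATH before granting the model access to them. The sketch below uses only the Python standard library; the tool list is a hypothetical example and should be replaced with whatever your own agent setup invokes.

```python
import shutil

# Hypothetical list of CLI tools an agent deployment might depend on;
# substitute the tools your own OpenClaw/Ollama setup actually uses.
REQUIRED_TOOLS = ["sh", "ls", "grep"]

def missing_tools(tools):
    """Return the subset of tools that cannot be found on PATH."""
    return [t for t in tools if shutil.which(t) is None]

if __name__ == "__main__":
    missing = missing_tools(REQUIRED_TOOLS)
    if missing:
        print("Missing CLI tools:", ", ".join(missing))
    else:
        print("All required CLI tools found on PATH.")
```

Running a check like this at startup makes tool-availability failures explicit instead of letting them surface as silent agent-task failures.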
Recommended mitigations:
- Review each model's configuration files, focusing on how the model discovers and invokes command-line tools.
- Update models and runtimes to the most recent released versions if patches or improvements have shipped since this test.
- Apply security best practices such as sandboxing and least-privilege access when running LLMs in a production environment.
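One way to apply the sandboxing and least-privilege recommendation is to run the model server in a container with capabilities dropped and the API bound to localhost. This is a minimal sketch assuming Docker is installed and the official ollama/ollama image is used; the container name, volume name, and port bindings are placeholders to adapt to your environment.

```shell
# Sketch: run Ollama in Docker with reduced privileges.
docker run -d --rm --name ollama-sandboxed \
  --cap-drop=ALL \
  --security-opt no-new-privileges \
  -v ollama-models:/root/.ollama \
  -p 127.0.0.1:11434:11434 \
  ollama/ollama
```

GPU passthrough would additionally require `--gpus=all` with the NVIDIA Container Toolkit installed; binding the port to 127.0.0.1 keeps the API off the local network.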
The findings indicate that homelab setups pairing a Raspberry Pi 5 with an RTX 3090 GPU can see large performance variance across models. This variance can undermine the reliability of automated tasks such as email management and error detection and, if the environment is not properly secured, expose systems to additional risk.