Despite self-hosting our workflow stack on our own infrastructure, true control over Large Language Model (LLM) calls remained elusive. Prompts left the network through multiple points of egress, and provider keys were scattered across services, leaving no visibility into model interactions and no confidence that security policies were being followed. In practice, services called different LLM providers without oversight or cost control. This points to the need for stricter access controls and centralized logging, so that every LLM call is auditable and aligned with organizational policy.
- Self-hosted workflow stack with distributed LLM service calls
- Containers managed within the infrastructure
- Create a single entry point for all LLM calls by routing traffic through a centralized gateway service. For example, update each service's configuration (e.g., the environment set in its Dockerfile or compose file) so requests go to the gateway instead of directly to providers.
- Implement strict access controls using IAM roles or an equivalent mechanism so that only authorized services can make LLM calls. For example, declare the roles in the configuration file at /etc/iam/config.yaml.
- Centralize logging for all LLM interactions by configuring a unified log aggregator, such as an ELK stack, to capture each call's origin and purpose in detail.
- Audit current configurations and remove redundant provider keys from unauthorized services and environments (e.g., `rm /path/to/provider/key.json`), then verify that no key is left behind.
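The single-entry-point step can be sketched as follows. This is a minimal illustration, not the actual setup: the gateway hostname `llm-gateway.internal` and the OpenAI-style route are assumptions. The point is that callers build requests against one internal URL rather than embedding provider endpoints and keys.

```python
# Hypothetical single entry point: services never contact providers directly;
# every request targets one internal gateway URL (hostname is illustrative).
GATEWAY_BASE = "http://llm-gateway.internal"

def build_llm_request(model: str, prompt: str) -> dict:
    """Return a request spec that always targets the central gateway,
    regardless of which upstream provider will serve the model."""
    return {
        "url": f"{GATEWAY_BASE}/v1/chat/completions",
        "json": {
            "model": model,
            "messages": [{"role": "user", "content": prompt}],
        },
    }

# Example: the caller only ever sees the gateway URL.
req = build_llm_request("gpt-4o", "ping")
```

With this shape, swapping providers or adding cost controls happens once, at the gateway, instead of in every service.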
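The access-control step could be enforced at the gateway with a per-service policy check. The service and model names below are illustrative, and a real deployment would back this with IAM rather than a hard-coded dict; this only sketches the authorization decision itself.

```python
# Sketch of gateway-side access control: each service may only request
# the models its policy allows (names are illustrative assumptions).
ACCESS_POLICY = {
    "workflow-engine": {"gpt-4o", "claude-3-5-sonnet"},
    "summarizer": {"gpt-4o-mini"},
}

def is_authorized(service: str, model: str) -> bool:
    """True if the named service is allowed to call the named model;
    unknown services are denied by default."""
    return model in ACCESS_POLICY.get(service, set())
```

Denying unknown services by default is the key design choice: a new container gains no LLM access until someone explicitly grants it.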
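For the centralized-logging step, one workable shape is a JSON-lines audit record emitted per call, which a shipper can forward to the aggregator. The field names here are assumptions, chosen to capture the origin and purpose the step asks for.

```python
import json
import time

def audit_record(service: str, model: str, purpose: str, prompt_chars: int) -> str:
    """Serialize one LLM call as a JSON line suitable for a log
    aggregator such as an ELK stack (field names are illustrative)."""
    return json.dumps(
        {
            "ts": time.time(),          # when the call happened
            "service": service,         # origin of the call
            "model": model,             # which model was requested
            "purpose": purpose,         # why the call was made
            "prompt_chars": prompt_chars,  # rough size, without logging the prompt itself
        },
        sort_keys=True,
    )
```

Logging prompt size rather than prompt text keeps the audit trail useful without turning the log store into another point of prompt egress.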
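The key-audit step can be partly automated by scanning environment files for provider key variables before deleting anything. The provider prefixes in the pattern are examples, not an exhaustive list.

```python
import re

# Match common provider key variable names at the start of a line in an
# env file; the provider list here is an illustrative assumption.
KEY_PATTERN = re.compile(r"^(OPENAI|ANTHROPIC|MISTRAL)_API_KEY=", re.MULTILINE)

def find_provider_keys(env_text: str) -> list[str]:
    """Return the names of provider key variables found in an env file's
    contents, so stray keys can be located before removal."""
    return [m.group(0).rstrip("=") for m in KEY_PATTERN.finditer(env_text)]
```

Running a scan like this across every service's environment before and after cleanup gives a simple check that no key was left behind.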
This issue directly impacts homelab stacks where multiple services independently manage LLM calls, potentially leading to inconsistent security policies and unexpected costs.