I use Opencode because it is open source, provider-agnostic, and speaks to local providers like Ollama or LM Studio.
For teams that need more control, there is a clean enterprise path with SSO and self-hosting. Rules live in the repo, not in my head.
Why enterprise setups matter
Running LLMs inside a company is not about chasing the newest model: it is about trust and control.
Enterprise use requires more than just an API key:
- Data privacy: especially in the EU, teams must ensure code and prompts never leave the company boundary.
- Model control: ability to pin model versions, roll back upgrades, and know what weights are running.
- Custom training and fine-tuning: adapting to domain language or code conventions.
- Authentication and SSO: align access with company identity systems.
- Monitoring and observability: track usage, latency, and errors like any other service.
- Support and continuity: someone needs to own updates, scaling, and security patches.
- Integration with CI/CD: interactive coding sessions should map cleanly to automated checks.
These requirements are what distinguish an enterprise pattern from a weekend experiment.
Quick demo: Local edge box (Jetson Orin + Ollama + Qwen Coder)
This is the easiest way to try the setup without hosting big infrastructure. Think of it as a proof-of-concept or a hackathon box, not an enterprise solution.
Why use it
- Fast feedback in the terminal.
- No egress: code and prompts stay on the device.
- Zero cluster work, so anyone can demo it.
Setup
- Install Ollama on Jetson Orin (native or Docker).
- Pull a coding model: `qwen2.5-coder:7b`, or `qwen3-coder:30b` if VRAM allows.
- Point Opencode to the local OpenAI-compatible endpoint.
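A minimal wiring sketch, assuming Ollama's default port (11434) and Opencode's JSON config format for OpenAI-compatible providers; confirm the exact keys against the docs for the Opencode release you run:

```bash
# Confirm Ollama's OpenAI-compatible endpoint is up (default port 11434).
curl -s http://localhost:11434/v1/models

# Point Opencode at the local endpoint. The provider block is a sketch; the
# exact keys can differ between Opencode releases.
cat > opencode.json <<'EOF'
{
  "$schema": "https://opencode.ai/config.json",
  "provider": {
    "ollama": {
      "npm": "@ai-sdk/openai-compatible",
      "name": "Ollama (local)",
      "options": { "baseURL": "http://localhost:11434/v1" },
      "models": { "qwen2.5-coder:7b": { "name": "Qwen 2.5 Coder 7B" } }
    }
  }
}
EOF
```

If the config is picked up, `opencode models` should list the local Qwen entry alongside any cloud providers you have configured.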
Quick smoke test
opencode auth login
opencode models
opencode run "list all REST endpoints in this repo"
Limitations
- Not for multi-team scale or compliance.
- Context and runtime limits vary by device.
- Best used to explore, not to deploy.
Enterprise: Internal hosted service (vLLM + Kubernetes + Opencode)
This is the pattern that scales: it mimics a hosted LLM API, but it runs inside your network. Security teams can review it, and engineering teams can share it.
Why use it
- Consistent behavior across teams and CI.
- Centralized quotas, logs, and upgrades.
- Private DNS only: clean boundary for reviews.
- Meets enterprise needs: privacy, SSO, monitoring, and model control.
Setup
- Deploy vLLM with a GPU-backed Deployment and an internal Service.
- Expose an OpenAI-compatible endpoint inside the network.
- Point Opencode to that endpoint. Add SSO and a policy to disable sharing.
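A quick way to confirm the endpoint and the client wiring; the DNS name, token variable, and config keys below are placeholders rather than anything vLLM or Opencode mandates:

```bash
# Reachability check from a dev box or CI runner; the DNS name and token are
# placeholders for your private endpoint and however you distribute credentials.
curl -s https://llm.internal.example.com/v1/models \
  -H "Authorization: Bearer $LLM_TOKEN"

# In opencode.json, the provider entry has the same shape as a local one; only
# the baseURL changes, plus an API key read from the environment, e.g.:
#   "options": {
#     "baseURL": "https://llm.internal.example.com/v1",
#     "apiKey": "{env:LLM_TOKEN}"
#   }
# The {env:...} substitution is an assumption -- confirm how your Opencode
# version reads secrets.
```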
Design choices
- One namespace per environment.
- Token-based access. Rotate regularly (see the token sketch after this list).
- Per-model quotas: keep requests predictable.
- No public ingress by default.
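For the token piece, a sketch under the assumption that vLLM is launched with its built-in `--api-key` option and the key lives in a Kubernetes Secret; the namespace and resource names are placeholders:

```bash
# vLLM's OpenAI-compatible server can enforce a static API key via --api-key.
# Keep the key in a Secret; names below are placeholders.
kubectl -n llm-prod create secret generic vllm-api-key \
  --from-literal=key="$(openssl rand -hex 32)"

# In the vLLM container command (assuming you launch it with `vllm serve`):
#   vllm serve Qwen/Qwen2.5-Coder-32B-Instruct --api-key "$VLLM_API_KEY"

# Rotation: update the Secret, then restart so pods pick up the new key.
kubectl -n llm-prod rollout restart deployment/vllm
```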
Quick smoke test
opencode auth login
opencode models
opencode run "summarize the changes in this branch"
What I commit to the repo
- `AGENTS.md` with a primary agent and one focused subagent (`db-migrator`). Tools are minimal. Rules are short. A skeleton follows this list.
- `.aider.conf.yml` if the team also uses Aider. Model, file globs, and provider URL go here.
- A tiny loop at the end of every session: ask the agent what it learned and how it would change the system prompt. If it is useful, fold one or two lines back into `AGENTS.md` in a small PR.
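One possible skeleton for that file; the sections and guardrails are illustrative, not a schema Opencode enforces, and how subagent definitions are picked up depends on your setup:

```bash
# Illustrative AGENTS.md skeleton -- adjust the rules to your repo.
cat > AGENTS.md <<'EOF'
# Project rules
- Work in feature branches; never commit directly to main.
- Prefer small diffs and run the test suite before proposing changes.
- Ask before adding dependencies.

# Subagent: db-migrator
- Only touches files under migrations/.
- Minimal tools: read, edit, run tests. No network access.
EOF
```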
My end-of-session prompt
- From this session, what did you learn about this repo and our patterns?
- Rewrite the system prompt for this project with those insights in under 200 words.
- Name three concrete guardrails we should add to avoid mistakes next time.
Runbooks I actually use
Demo: Jetson Orin + Ollama + Qwen
- Install Ollama. Start the server.
- `ollama pull qwen2.5-coder:7b`, or a larger coder model if VRAM allows.
- Increase context to match your workflow (see the Modelfile sketch after this list). Confirm tool-use support.
- `opencode auth login` → set the local endpoint → `opencode models`.
- Commit `AGENTS.md` with minimal tools. Work in feature branches.
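For the context step, one approach is a derived Ollama model built from a Modelfile; the 16384 value is only an example, size it to your VRAM:

```bash
# Ollama's default context window is small; raise it with a derived model.
cat > Modelfile <<'EOF'
FROM qwen2.5-coder:7b
PARAMETER num_ctx 16384
EOF
ollama create qwen2.5-coder:7b-16k -f Modelfile
```

Then point Opencode at `qwen2.5-coder:7b-16k` instead of the base tag.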
Enterprise: vLLM + Kubernetes + Opencode
- Apply the Kubernetes manifest below or your own version.
- Restrict the Service to internal networks. Add token auth and limits.
- Point Opencode at the internal endpoint. Confirm from CI.
- Add one read-only MCP server later if you need docs search (see the config sketch after this list). Start small.
- Consider SSO and self-host options when more teams join.
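If you do add that docs-search MCP server, the registration in Opencode's config looks roughly like this; the `mcp` block shape and the server command are assumptions to verify against your Opencode version's docs:

```bash
# Register a single read-only MCP server in opencode.json. Pick a docs-search
# server you actually trust; the command below is a placeholder.
jq '.mcp = {
  "docs-search": {
    "type": "local",
    "command": ["npx", "-y", "your-docs-mcp-server"],
    "enabled": true
  }
}' opencode.json > opencode.json.tmp && mv opencode.json.tmp opencode.json
```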