Groq Provider
Groq runs open-source models on custom LPU (Language Processing Unit) hardware, delivering extremely low-latency inference – often 10-20x faster than cloud GPU providers.
Get an API Key
- Go to console.groq.com
- Create an account or sign in
- Navigate to API Keys
- Create a new key and copy it
Groq offers a free tier with rate limits.
Configure VibeCody
Option 1: Environment variable (recommended)
export GROQ_API_KEY="gsk_..."
vibecli --provider groq
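Before launching, it can help to confirm that the variable is actually set in the current shell. The sketch below is a hypothetical helper (`check_groq_key` is not part of VibeCody), and it assumes that Groq keys start with `gsk_`, as the example key does.

```shell
# Hypothetical helper: sanity-check GROQ_API_KEY before launching vibecli.
# Assumption: Groq keys start with "gsk_", as in the example export line.
check_groq_key() {
  if [ -z "${GROQ_API_KEY:-}" ]; then
    echo "GROQ_API_KEY is not set" >&2
    return 1
  fi
  case "$GROQ_API_KEY" in
    gsk_*) return 0 ;;
    *)
      echo "GROQ_API_KEY does not start with gsk_" >&2
      return 2
      ;;
  esac
}
```

One possible use is `check_groq_key && vibecli --provider groq`, which starts the CLI only when the key looks plausible.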
Option 2: Config file (~/.vibecli/config.toml)
[groq]
enabled = true
api_key = "gsk_..."
model = "llama-3.3-70b-versatile"
Model Selection
| Model | Strengths | Best for |
|---|---|---|
| llama-3.3-70b-versatile | Strong general coding | Daily coding tasks |
| llama-3.1-8b-instant | Ultra-fast responses | Quick completions, simple edits |
| mixtral-8x7b-32768 | Good balance, 32K context | Longer code analysis |
Default: llama-3.3-70b-versatile
Override from the CLI:
vibecli --provider groq --model llama-3.1-8b-instant
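If you switch models often, the table's guidance can be captured in a small shell function. This is only a sketch: `pick_groq_model` and its `quick` / `long` categories are hypothetical names invented for illustration, not VibeCody features.

```shell
# Hypothetical helper: map a rough task category to a model from the table above.
# The category names ("quick", "long") are made up for this sketch.
pick_groq_model() {
  case "$1" in
    quick) echo "llama-3.1-8b-instant" ;;    # quick completions, simple edits
    long)  echo "mixtral-8x7b-32768" ;;      # longer code analysis
    *)     echo "llama-3.3-70b-versatile" ;; # default: daily coding tasks
  esac
}
```

For example, `vibecli --provider groq --model "$(pick_groq_model quick)"` would select the 8B model.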
Pricing
Groq offers a generous free tier with rate limits. Paid plans raise those limits and add priority access.
Best For
- Ultra-fast iteration – responses often arrive in under a second
- Interactive coding sessions – low latency makes back-and-forth feel instant
- Running open-source models – access Llama, Mixtral without hosting them yourself
Verify Connection
vibecli --provider groq -c "Write a Go function to reverse a linked list"
Troubleshooting
Rate limited on free tier
Error: 429 Too Many Requests
- The free tier has per-minute and per-day token limits
- Wait 60 seconds and retry, or upgrade to a paid plan
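The wait-and-retry step can be automated with a small wrapper. The sketch below is a generic exponential-backoff loop, not a VibeCody feature; `retry_with_backoff` is a hypothetical name, and it assumes the wrapped command exits non-zero when a request fails, which this page does not confirm for `vibecli`.

```shell
# Hypothetical helper: retry a command, doubling the wait after each failure.
# Usage: retry_with_backoff MAX_ATTEMPTS INITIAL_DELAY_SECONDS command [args...]
# Assumption: the command exits non-zero on failure (for example on a 429).
retry_with_backoff() {
  local max_attempts=$1
  local delay=$2
  shift 2
  local attempt=1
  while :; do
    # Run the command; stop as soon as it succeeds.
    "$@" && return 0
    if [ "$attempt" -ge "$max_attempts" ]; then
      echo "giving up after $attempt attempts" >&2
      return 1
    fi
    echo "attempt $attempt failed; retrying in ${delay}s" >&2
    sleep "$delay"
    delay=$((delay * 2))
    attempt=$((attempt + 1))
  done
}
```

For instance, `retry_with_backoff 3 60 vibecli --provider groq -c "Write a Go function to reverse a linked list"` would wait 60 seconds after the first failure and 120 seconds after the second before giving up. Note that it retries on any failure, not only on rate limiting.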
Model not available
- Groq’s model catalog changes; check console.groq.com/docs/models for current availability
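Availability can also be checked from the terminal. The sketch below assumes Groq's OpenAI-compatible model listing endpoint at `https://api.groq.com/openai/v1/models` and that `curl` is installed; `extract_model_ids` and `list_groq_models` are hypothetical helper names. The ID extraction uses a rough grep/sed pattern, so a real JSON parser such as `jq` would be more robust.

```shell
# Pure helper: print every "id" value found in a JSON model listing on stdin.
# A rough text-matching sketch, not a full JSON parser.
extract_model_ids() {
  grep -oE '"id"[[:space:]]*:[[:space:]]*"[^"]*"' | sed -E 's/.*"([^"]*)"$/\1/'
}

# Assumption: Groq's OpenAI-compatible listing endpoint; needs GROQ_API_KEY and curl.
list_groq_models() {
  curl -fsS https://api.groq.com/openai/v1/models \
    -H "Authorization: Bearer ${GROQ_API_KEY}" | extract_model_ids
}
```

Running `list_groq_models | grep -Fx "llama-3.3-70b-versatile"` prints the model ID only if it is currently listed for your key.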