Use Cases
- Test DigitalOcean AI API response times under load
- Validate AI service reliability and uptime
- Capacity planning for AI applications
- Compare different Llama3 model variants
- Monitor API rate limits and quotas
Simple Implementation
Setup Instructions
-
Get DigitalOcean AI Access:
- Sign up for DigitalOcean account
- Enable AI services in your project
- Generate an API token from the control panel
-
Configure the Script in LoadForge:
- Copy the script into LoadForge’s test editor
- Replace
your-digitalocean-ai-token
with your actual API token - Set the target host URL to your DigitalOcean AI endpoint
-
Configure Load Test Settings:
- Start with 1-5 virtual users to test connectivity
- Set appropriate ramp-up time to avoid rate limits
- Monitor response times and error rates
What This Tests
- Response Times: Measure latency for Llama3.3 70B model
- Throughput: Test concurrent request handling
- Rate Limits: Understand DigitalOcean AI quotas and limits
- Reliability: Check API stability under sustained load
- API Performance: Validate DigitalOcean AI service quality
Expected Performance
Typical results for Llama 3.3 70B on DigitalOcean AI:- Response Time: ~3-6 seconds per request
- Quality: High-quality responses with latest model improvements
- Throughput: Suitable for production workloads with proper scaling
Rate Limits & Pricing
- Request Limits: Varies by plan and model
- Token Limits: Based on input/output tokens
- Concurrent Requests: Limited per account tier
- Pricing: Pay-per-use model based on tokens consumed
Common Issues
- Authentication: Ensure API token has correct permissions
- Rate Limiting: Start with low user counts to avoid 429 errors
- Endpoint URLs: Verify the correct DigitalOcean AI endpoint
- Token Limits: Monitor usage to avoid exceeding quotas
Best Practices
- Gradual Ramp-up: Start with 1-5 users, increase gradually
- Monitor Costs: Track token usage to avoid unexpected charges
- Error Handling: Implement proper retry logic for production use
- Caching: Consider caching responses for repeated queries