DigitalOcean AI Performance Testing

This guide shows how to performance-test DigitalOcean’s AI platform using Locust. It is suited to measuring response times, checking reliability, and capacity planning for AI workloads on DigitalOcean.

Use Cases

Test DigitalOcean AI API response times under load
Validate AI service reliability and uptime
Capacity planning for AI applications
Compare different Llama 3 model variants
Monitor API rate limits and quotas

Simple Implementation

import random
from locust import HttpUser, task, between

class Llama3ChatUser(HttpUser):
    wait_time = between(1, 5)

    QUESTIONS = [
        "What is the capital of France?",
        "Translate 'Hello, how are you?' into Spanish.",
        "Who wrote 'Pride and Prejudice'?",
        "What's 13 multiplied by 17?",
        "Name three benefits of a vegan diet.",
        "Give me a quick summary of the plot of '1984'.",
        "Explain the concept of machine learning in simple terms.",
        "What are the main differences between Python and JavaScript?",
        "How does photosynthesis work?",
        "What are some tips for effective time management?"
    ]

    @task
    def chat_completion(self):
        question = random.choice(self.QUESTIONS)

        # Short preview of the question, used as the request name in LoadForge results
        preview = (question[:20] + "...") if len(question) > 20 else question

        payload = {
            "model": "llama3.3-70b-instruct",
            "messages": [{"role": "user", "content": question}],
            "stream": False,
            "include_functions_info": False,
            "include_retrieval_info": False,
            "include_guardrails_info": False
        }
        headers = {
            "Content-Type": "application/json",
            "Authorization": f"Bearer {self.api_token}"
        }

        with self.client.post(
            "/api/v1/chat/completions",
            json=payload,
            headers=headers,
            name=f"chat: {preview}",
            catch_response=True
        ) as response:
            if response.status_code == 200:
                response.success()
            else:
                response.failure(f"Status {response.status_code}")

    def on_start(self):
        # Runs once per simulated user before any task.
        # Set your DigitalOcean AI API token here.
        self.api_token = "your-digitalocean-ai-token"
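
A 200 status alone does not confirm that the model returned usable text. The sketch below shows one way to validate the body before marking a request as successful. It assumes the endpoint returns an OpenAI-style chat completion (a choices list whose first item has message.content); that response shape is an assumption, not something this guide confirms, and extract_answer is a hypothetical helper name.

```python
from typing import Any, Optional


def extract_answer(body: Any) -> Optional[str]:
    """Return the assistant text from a chat completion body, or None.

    Assumes an OpenAI-style shape (not confirmed by this guide):
    {"choices": [{"message": {"content": "..."}}]}
    """
    if not isinstance(body, dict):
        return None
    choices = body.get("choices")
    if not isinstance(choices, list) or not choices:
        return None
    first = choices[0]
    if not isinstance(first, dict):
        return None
    message = first.get("message")
    if not isinstance(message, dict):
        return None
    content = message.get("content")
    # Treat empty or whitespace-only text as a missing answer.
    if isinstance(content, str) and content.strip():
        return content
    return None
```

Inside the with self.client.post(...) block, the task could call extract_answer(response.json()) and report response.failure("Empty completion") when it returns None, guarding the response.json() call against invalid JSON.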

Setup Instructions

1. Get DigitalOcean AI Access:
   - Sign up for a DigitalOcean account
   - Enable AI services in your project
   - Generate an API token from the control panel
2. Configure the Script in LoadForge:
   - Copy the script into LoadForge’s test editor
   - Replace your-digitalocean-ai-token with your actual API token
   - Set the target host URL to your DigitalOcean AI endpoint
3. Configure Load Test Settings:
   - Start with 1-5 virtual users to test connectivity
   - Set an appropriate ramp-up time to avoid rate limits
   - Monitor response times and error rates
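
A run that still carries the placeholder token fails every request with an authentication error, which can be mistaken for a service problem. The sketch below fails fast instead. It reads the token from an environment mapping; the variable name DO_AI_TOKEN and the helper load_api_token are hypothetical, and whether your test runner exposes environment variables to the script is an assumption you should verify.

```python
import os
from typing import Mapping, Optional

# The placeholder string used in the script above.
PLACEHOLDER = "your-digitalocean-ai-token"


def load_api_token(environ: Optional[Mapping[str, str]] = None,
                   var_name: str = "DO_AI_TOKEN") -> str:
    """Return the API token, or raise if it is missing or still the placeholder.

    DO_AI_TOKEN is a hypothetical variable name chosen for this example.
    """
    env = os.environ if environ is None else environ
    token = env.get(var_name, "").strip()
    if not token or token == PLACEHOLDER:
        raise ValueError(f"Set {var_name} to a real API token before running the test")
    return token
```

If the runner does not expose environment variables, the same placeholder check can be applied to a hardcoded value in on_start.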

What This Tests

Response Times: Measure latency for the Llama 3.3 70B model
Throughput: Test concurrent request handling
Rate Limits: Understand DigitalOcean AI quotas and limits
Reliability: Check API stability under sustained load
API Performance: Validate DigitalOcean AI service quality

Expected Performance

Typical results for Llama 3.3 70B on DigitalOcean AI:

Response Time: ~3-6 seconds per request
Quality: High-quality responses with the latest model improvements
Throughput: Suitable for production workloads with proper scaling
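
When comparing a run against the response-time range above, percentiles are usually more informative than an average, because a few slow completions can hide behind a healthy mean. The sketch below computes nearest-rank percentiles from a list of response times in seconds. The helper names are hypothetical, and the sample values you feed it should come from your own test results.

```python
import math
from typing import Dict, Sequence


def percentile(samples: Sequence[float], pct: float) -> float:
    """Nearest-rank percentile: the smallest sample with at least pct% of
    all samples at or below it."""
    if not samples:
        raise ValueError("samples must not be empty")
    if not 0 < pct <= 100:
        raise ValueError("pct must be in (0, 100]")
    ordered = sorted(samples)
    rank = math.ceil(pct / 100 * len(ordered))
    return ordered[rank - 1]


def summarize(samples: Sequence[float]) -> Dict[str, float]:
    """Return the median, 95th percentile, and maximum response time."""
    return {
        "p50": percentile(samples, 50),
        "p95": percentile(samples, 95),
        "max": max(samples),
    }
```

A p50 inside the expected range with a p95 far above it suggests queueing or throttling under load rather than uniformly slow inference.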

Rate Limits & Pricing

Request Limits: Vary by plan and model
Token Limits: Based on input/output tokens
Concurrent Requests: Limited per account tier
Pricing: Pay-per-use model based on tokens consumed
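
Because pricing is token-based, it can help to estimate the cost of a load test before running it at scale. The sketch below is plain arithmetic over token counts. The per-million-token prices are parameters you must supply from DigitalOcean's current pricing; no real prices are assumed here, and estimate_cost is a hypothetical helper name.

```python
def estimate_cost(input_tokens: int,
                  output_tokens: int,
                  input_price_per_million: float,
                  output_price_per_million: float) -> float:
    """Estimate spend from token counts and caller-supplied prices.

    Prices are expressed per one million tokens and are placeholders:
    look up the current rates for your model before relying on the result.
    """
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    total = (input_tokens * input_price_per_million
             + output_tokens * output_price_per_million)
    return total / 1_000_000
```

Multiplying the expected tokens per request by the planned request count gives the totals to pass in.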

Common Issues

Authentication: Ensure API token has correct permissions
Rate Limiting: Start with low user counts to avoid 429 errors
Endpoint URLs: Verify the correct DigitalOcean AI endpoint
Token Limits: Monitor usage to avoid exceeding quotas
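
The script above reports every non-200 response as a generic "Status N" failure. A small mapping from status code to likely cause makes the issues in this list easier to tell apart in the results. The sketch below relies on standard HTTP status semantics; whether DigitalOcean returns exactly these codes in each situation is an assumption, and classify_status is a hypothetical helper name.

```python
def classify_status(status_code: int) -> str:
    """Map an HTTP status code to a short failure label with a hint."""
    if status_code in (401, 403):
        return "auth: check the API token and its permissions"
    if status_code == 404:
        return "endpoint: verify the host URL and request path"
    if status_code == 429:
        return "rate limit: lower the user count or slow the ramp-up"
    if 500 <= status_code < 600:
        return "server error: the service failed to handle the request"
    return f"unexpected status {status_code}"
```

In the task, response.failure(classify_status(response.status_code)) would group failures by cause in the error report.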

Best Practices

Gradual Ramp-up: Start with 1-5 users, increase gradually
Monitor Costs: Track token usage to avoid unexpected charges
Error Handling: Implement proper retry logic for production use
Caching: Consider caching responses for repeated queries
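
For the retry logic mentioned above, exponential backoff with jitter is a common choice in production clients. The sketch below is one possible shape, not a DigitalOcean-provided mechanism: send is a hypothetical zero-argument callable that returns a (status_code, body) tuple, and the set of retryable codes is an assumption. Retries are best kept out of the load test itself, since they hide the failures the test is meant to surface.

```python
import random
import time
from typing import Callable, Optional, Tuple

# Status codes treated as transient in this example (an assumption).
RETRYABLE = {429, 500, 502, 503, 504}


def call_with_retry(send: Callable[[], Tuple[int, object]],
                    max_attempts: int = 4,
                    base_delay: float = 0.5,
                    max_delay: float = 8.0,
                    sleep: Callable[[float], None] = time.sleep,
                    rng: Optional[random.Random] = None) -> Tuple[int, object]:
    """Call send() until it returns a non-retryable status or attempts run out.

    Returns the last (status_code, body) pair, whether or not it succeeded.
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    rng = rng or random.Random()
    attempt = 0
    while True:
        status, body = send()
        attempt += 1
        if status not in RETRYABLE or attempt >= max_attempts:
            return status, body
        # Full jitter: wait a random time up to the capped exponential delay.
        cap = min(max_delay, base_delay * 2 ** (attempt - 1))
        sleep(rng.uniform(0, cap))
```

The sleep and rng parameters exist so the function can be exercised without real delays or nondeterminism.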
