Worker Failures
Learn how to resolve partial worker failures with LoadForge.
If you see an error about a partial or total worker failure, it means that at least one worker failed during your test run.
LoadForge detects these failures automatically. They typically occur when the test configuration or the number of virtual users overloads the worker servers. The quickest fix is usually to increase the number of workers so the load is spread more evenly.
Tip: Increase your test’s “Workers to Launch” setting and re-run the test to see if it stabilizes.
Common Causes of Worker Failures
Worker failures are almost always caused by overloaded test workers. How much load a worker can carry depends on the complexity of the test and the response times of the target application.
A typical worker can handle:
- Up to ~10,000 requests per second in a well-optimized test scenario.
- Up to ~40,000 requests per second using the FastHTTP client.
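As a rough illustration of these figures, the sketch below estimates a minimum worker count for a target request rate. The `headroom` factor and the function name are assumptions made for this example, not LoadForge settings; the per-worker ceilings are the approximate numbers quoted above and apply only to well-optimized tests.

```python
import math

# Approximate per-worker ceilings quoted in this guide (requests per second).
STANDARD_CLIENT_RPS = 10_000
FAST_HTTP_CLIENT_RPS = 40_000


def min_workers(target_rps: float, per_worker_rps: float, headroom: float = 0.5) -> int:
    """Estimate how many workers are needed for a target request rate.

    `headroom` is an assumed safety factor: only this fraction of the
    quoted ceiling is treated as usable, since real tests rarely reach it.
    """
    usable_rps = per_worker_rps * headroom
    return max(1, math.ceil(target_rps / usable_rps))


# Example: a test aiming for 25,000 requests per second.
standard_estimate = min_workers(25_000, STANDARD_CLIENT_RPS)
fast_http_estimate = min_workers(25_000, FAST_HTTP_CLIENT_RPS)
print(standard_estimate, fast_http_estimate)
```

Treat the result as a starting point for "Workers to Launch" and adjust it based on how the test actually behaves.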
However, worker overload is more likely in the following cases:
- Frequent errors in responses: High failure rates consume additional processing power.
- Slow server response times: The longer a server takes to respond, the more active requests a worker must maintain.
- Complex test scenarios: Multiple user actions, authentication steps, and API calls increase CPU and memory usage per worker.
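The effect of slow responses can be estimated with Little's law: the average number of requests in flight equals the request rate multiplied by the response time. The sketch below is only an illustration of that relationship; the function name and the sample numbers are assumptions.

```python
def in_flight_requests(requests_per_second: float, response_time_ms: float) -> float:
    """Little's law: average concurrent requests = arrival rate x time in system."""
    return requests_per_second * response_time_ms / 1000.0


# Same request rate, two different response times.
fast_backend = in_flight_requests(1_000, 100)     # 100 ms responses
slow_backend = in_flight_requests(1_000, 10_000)  # 10-second responses
print(fast_backend, slow_backend)
```

At the same request rate, a backend that responds in 10 seconds leaves a worker holding 100 times as many open requests as one that responds in 100 ms.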
How to Resolve Worker Failures
If you experience worker failures, try the following solutions:
- Increase the Number of Workers
  - Add more workers under “Workers to Launch” to distribute the load.
- Reduce the Number of Virtual Users per Worker
  - Instead of pushing a single worker to its limit, spread users across multiple workers.
- Check Your Server Response Times
  - Use LoadForge’s Response Time Metrics to identify slow endpoints.
  - Look for requests that take 10,000 ms (10 seconds) or more to respond, as these can cause worker failures.
  - Optimize database queries, caching strategies, or API response times.
- Set Timeouts in Your Locustfile
  - If the backend is too slow to respond, workers may hang indefinitely. To prevent this, set timeouts in your Locust test script.
Example timeout setting in Locust:
This ensures that if the server takes longer than 10 seconds to respond, the request fails instead of blocking the worker.
- Simplify Your Test Scenario
  - If your test includes complex user interactions, consider breaking it into smaller tests.
- Use the FastHTTP Client (For High-Throughput Needs)
  - If your test involves simple requests (e.g., static files, APIs), consider using FastHTTP for higher efficiency.
By fine-tuning your test configuration and setting timeouts for slow responses, you can prevent worker failures and ensure reliable load testing results.