posted 06-27-2002 12:48 PM
You actually raised two questions. One is proving that the site/application can scale to XXXX users. The other is showing that the system has no bottlenecks that can't be solved without adding hardware.
Adding hardware should be the last concern, as most issues found during proper load testing turn out to be configuration and design problems.
Points to consider:
About 30% of issues are in the infrastructure: networking, router configuration, firmware, etc. Depending on your architecture this number could be higher or lower.
Your methodology needs to include stress tests for the various servers - independent of the application. This ensures that throughput, connect/disconnect rates, garbage collection, etc. can be handled by your existing web/app/db servers. These are common sources of bottlenecks as well.
Then you get to the application itself. At that point you need to be pragmatic and prioritize the work. You can't test everything, but you need to define the load test models that you want. Document each model precisely, including number of users, types of operations, arrival rates, iterations, think times, etc.
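One way to keep a model precise is to write it down as data rather than prose. The following is only a rough sketch of that idea, not the format of any particular tool: the class name, field names, and the sample numbers are all made up for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class LoadModel:
    """Hypothetical record of one load test model (all names are illustrative)."""
    name: str
    users: int                    # number of virtual users
    arrival_rate_per_sec: float   # how fast new users start
    iterations: int               # times each user repeats the scenario
    think_time_sec: float         # pause between user actions
    operation_mix: dict = field(default_factory=dict)  # operation -> share of work

    def validate(self):
        # The operation shares must add up to 100% of the workload.
        total_share = sum(self.operation_mix.values())
        if abs(total_share - 1.0) > 1e-6:
            raise ValueError("operation mix must sum to 1.0, got %r" % total_share)
        if self.users <= 0 or self.iterations <= 0 or self.arrival_rate_per_sec <= 0:
            raise ValueError("users, iterations and arrival rate must be positive")

    def total_iterations(self):
        # Total scenario runs across all users.
        return self.users * self.iterations

    def ramp_up_seconds(self):
        # Time needed to start every user at the given arrival rate.
        return self.users / self.arrival_rate_per_sec

    def planned_operations(self):
        # Split the total scenario runs across operation types by their share.
        total = self.total_iterations()
        return {op: round(share * total) for op, share in self.operation_mix.items()}


# Sample values only; substitute the numbers from your own requirements.
model = LoadModel(
    name="peak-hour",
    users=500,
    arrival_rate_per_sec=5.0,
    iterations=10,
    think_time_sec=8.0,
    operation_mix={"browse": 0.6, "search": 0.3, "checkout": 0.1},
)
model.validate()
print(model.total_iterations(), model.ramp_up_seconds(), model.planned_operations())
```

Writing the model this way makes it easy to review with the business side and to compare one test run against another.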
There is merit in testing against the production system, as long as you:
[a] plan the event during off-peak hours
[b] take backups
[c] have support staff on hand to assist and recover
[d] approach testing gradually.
The purpose of testing in this manner is not to crash the system, but to gradually increase load and examine various components as they become busier. Often this will expose configuration issues.
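The gradual approach can be sketched as a stepped ramp that stops as soon as a monitored value crosses a limit, instead of pushing on until something falls over. This is only an illustration: the function names are made up, and `measure` stands in for whatever your load tool and monitors actually report at a given user level.

```python
def ramp_levels(start, maximum, step):
    """Return the user levels to apply, from start up to and including maximum."""
    levels = []
    current = start
    while current < maximum:
        levels.append(current)
        current += step
    levels.append(maximum)
    return levels


def run_ramp(levels, measure, max_response_sec):
    """Apply each level in turn and stop at the first one that exceeds the limit.

    measure(users) is a hypothetical callback that applies the load and
    returns the observed response time in seconds.
    """
    results = []
    for users in levels:
        response_sec = measure(users)
        results.append((users, response_sec))
        if response_sec > max_response_sec:
            break  # back off and examine the busy components before going higher
    return results


def fake_measure(users):
    # Stand-in for a real measurement: response time grows with load.
    return 0.2 + users / 1000


levels = ramp_levels(50, 300, 50)
results = run_ramp(levels, fake_measure, max_response_sec=0.42)
for users, response_sec in results:
    print(users, response_sec)
```

With the stand-in measurement the ramp stops at 250 users and never applies the 300 level. In a real run you would watch several components at each step, not just response time.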
The QA lab environment will not help you in troubleshooting production issues.
If you need help doing this, Mercury also offers a service called ActiveTune, which does this for you.