Using Visual Studio Team System to Execute High-Volume Load Tests

This document contains the following sections:

  1. Benefits and Weaknesses of VSTS as a Load Tool
  2. Step-by-step instructions on how to execute VSTS
  3. Descriptions of the VSTS run-time settings (if you have more than one created via VSTS)
  4. How to collect the results of a VSTS load test
  5. How to query the data collected via VSTS
  6. Location of VSTS solution and data source
  7. Location of LR file and templates
  8. Location of all supporting documents/references
  9. Appendix: Data Mining the VSTS LoadTest Database
  10. Appendix B: Load Testing DLLs With VSTS

1. Benefits and Weaknesses of VSTS
VSTS is an outstanding load testing tool. If Microsoft sticks with it, VSTS has the potential to dominate the ASP.NET load testing market. It is far easier to script basic ASP.NET web and load tests in VSTS than it is in LoadRunner. This is primarily due to VSTS’s handling of ASP.NET viewstate using its “Extract Hidden Fields” feature. In LoadRunner, we must often write code specifically to handle viewstate and other parameters, whereas in VSTS viewstate and web controls are handled automatically. This feature makes VSTS a better tool in an “agile” development environment where builds change rapidly*.

 Weaknesses of VSTS include:
  • Poor visual reporting capabilities. VSTS does not have the native ability to compare the load test of one build to that of a prior build (LoadRunner has this in its “Cross with Result” feature), and thus has no native ability to visually show a developer that he has made a code error. This weakness is not really alleviated by the SQL Server Reporting Services solutions which Microsoft is trying to sell. (This problem can be alleviated to some extent by graphing VSTS data in Excel or by further developing and using the MS Access graphing tool I have created: \\loadgen\Projects\Tools\LoadTestGraphsAndData2.mdb.)

  • No support of “rendezvous” feature as in LoadRunner
  • Currently no ability to create a “user initialization” section in the scripts. This tends to lead to excessive logons and logoffs.
  • Limited ability  to update a parameter within a web test (no equivalent of LR’s “update each occurrence” feature)
  • Questionable support for load testing other platforms, such as PHP, and Oracle.
  • More limited and confusing (or, in this version, missing) options for adjusting think times. For instance, there is no “multiply recorded think time by x” feature. I also don’t understand the methodology for kicking off scripts randomly, but within a specified time range. WAST and ACT had this, but I don’t see it in VSTS.
  • The Load Tool is still (Feb.  2, 2007) in a Beta version and there are a few oddities.

Nevertheless, VSTS provides a number of features which make it, well, more fun, and more familiar to testers working on Microsoft platforms. These features include:

  • Tie in with MSDN support, including the MSDN forums (http://forums.microsoft.com/MSDN/ShowPost.aspx?PostID=210575&SiteID=1  ). Although LoadRunner does have a forum, answers are frequently not found there, and you end up on hold with Mercury support.
  • Smaller learning curve than LoadRunner. Testers familiar with the Visual Studio GUI can be quickly trained to at least record and run web and load tests, even if they need some help validating their scripts. Once VSTS is pointed at the correct SQL Server load test database, their results are easily gathered.
  • Easy to use web test design GUI, with the ability to visually cut and paste many of the “objects” which comprise a web test.
  • 1,000 free vusers with every VSTS installation. This is definitely enough to conduct unit tests on individual aspx and individual features. (using users with no think time).
  • Ability to load test web services (see Tom Kleinecke)
  • Easier to understand automation model than LoadRunner (mstest.exe and windows scheduler)
  • Collects an enormous amount of client-side data by default, with no coding or fiddling, and automatically stores it in a useful SQL Server format which can be queried.

 Microsoft has a history of abandoning their load testing tools (WAST and ACT were both dropped), but so far it looks like they have a commitment to further development on VSTS.

 * Note: The parameter handling  feature is not a license for developers to change the names of controls anytime they feel like it. Web page control names should be maintained whenever possible.

 Step-By-Step Instruction on How To Execute a VSTS Load Test

Making a new web test is basically a matter of turning on VSTS record and clicking through IE. This produces a web test.  You then must create a second test, a load test, and add the web test to the load test.  More in depth information about how to create new web tests and load tests can be found in the book, Visual Studio 2005 Team System.

When a web test is added to a load test, the mechanism for saving test results automatically changes from XML to SQL Server. Therefore, to get load results you are, one way or another, going to gather data from a SQL Server. Although Microsoft recommends keeping the .trx files on your hard drive and using those to access the SQL data, labeling the TRX files is painful, whereas querying SQL Server is easy and opens up a world of automation possibilities. All you really need is the LoadRunID from your run and the correct queries. You can verify that the queries are correct by comparing the results of the query with the results displayed in the VSTS GUI. (Queries are currently here: \\SEAW50982\Public\Tools\VSTSTips\VSTS_SQL_Queries, but need to be checked in someplace.)

Microsoft recommends having a separate SQL Server to gather load test data. The machine which houses this SQL Server is currently loadgen11 (10.207.18.101). Microsoft also recommends having a separate, second SQL Server to hold parameter data. This machine is currently SEAD50935 (10.207.18.102).

Tips on Recording and Creating Web Tests

Careful thought about your web tests leads to better results.   

  • ALL (all) requests must have header information added to them. For the most part this must be manually pasted in to each request. The header must always contain Newsmaker’s country code. It is easy to copy and paste once you’ve got one good header. If you do not get the CC header in, every request bounces back to the country code page. I don’t know why VSTS is not good at capturing this cookie, but it isn’t. It does a fine job of capturing viewstate, and getting the username cookie.  There might be another way to get the country cookie, but I have not found it, so I just paste it in every request. I get this header by viewing the request in Wireshark. This is a little tip, without which, you will be paralyzed. 
  • More about Cookies: For some reason, each element of an HTTP header (each part of a cookie, for instance) must be on a separate line. VSTS does not like long strings in the headers. I don’t know why. I think it is a bug.
  • Dependent Requests: Requests in which users sign in must have mediabinfooter.aspx added to the search.aspx requests. For all searches by a signed-in user, mediabinfooter.aspx must:
    • be added as a dependent request (right-click, then “Add dependent request”). Dependent requests are not recorded automatically, but there are time-savers here: once you get one of them you can basically cut and paste.
    • have a country code cookie pasted into it.
    • carry the INDIVID cookie, which must also be added to the parent request (search.aspx).
    • have a think time of zero (because it is an attachment of the search page).
    • have “parse dependent requests” set to false (most of the time).

The above steps should result in the user’s default Lightbox being displayed. I have done very little testing of switching lightboxes.

            Requests in which anonymous users go to the search page do not need to have mediabinfooter.aspx added.

  • Sometimes VSTS does not place an “extract hidden fields” after a request. If it misses it, you must add it. You can tell something is wrong because your recorded test does not play back.  
  • You must periodically check to make sure all your web tests are really running well. You could be running load tests in which you execute 100,000 tests without any of them ever getting past the country code page. (VSTS will not throw an error about this.)  Always run your web tests prior to executing a load test, especially after a build change.  
  • AJAX Requests: There are some requests which no load tools record properly. These are requests which involve AJAX. For these requests, you must use WireShark or Fiddler to gather the actual requests and then construct that request in VSTS. This is no different than would have to be done with LoadRunner or other testing tools. We currently do not have any AJAX requests in the mixed load scenario. But these would occur if we added a web test of the image slider feature. In such cases, the AJAX requests must be set up as a) dependent or sub-request, and b) NOT to have “Extract Hidden Fields” in the request. Otherwise the next request will not run correctly. 
  • Collation: I have a system of capitalization for keeping requests with similar URL’s separated: 

--  Search.aspx = GET of blank search.aspx page (without image results)
--  search.aspx = POST of search.aspx page (with image results)
--  signin.aspx = POST of the signin page to sign out
--  Signin.aspx = POST of the signin page to sign in
--  SIGNIN.aspx = initial GET of the signin page

This capitalization of requests allows us to use collation in our SQL queries to separate out the various requests.
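The effect of the capitalization scheme can be illustrated with a small sketch. This is not one of the actual SQL queries; it only shows, in Python, why case-sensitive grouping keeps the request types apart while case-insensitive grouping merges them. The list of request labels is made up for illustration.

```python
from collections import Counter

# Hypothetical request labels, following the capitalization scheme above.
requests = [
    "Search.aspx",   # GET of the blank search page
    "search.aspx",   # POST of the search page (with image results)
    "search.aspx",
    "SIGNIN.aspx",   # initial GET of the signin page
    "Signin.aspx",   # POST to sign in
    "signin.aspx",   # POST to sign out
]

# Case-sensitive grouping keeps each request type in its own bucket.
case_sensitive = Counter(requests)

# Case-insensitive grouping merges them and loses the distinction.
case_insensitive = Counter(name.lower() for name in requests)

print(case_sensitive)
print(case_insensitive)
```

A case-sensitive collation in a SQL query plays the role of the first counter; a case-insensitive collation behaves like the second.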

  • Transactions: Try to avoid wrapping your requests in transactions. Unlike LoadRunner, VSTS collects client-side page load time data and request load time data constantly and by default. Only changing the “record results” property of the request will stop the collection and recording of this data. By the term “client-side”, I do not just mean perfmon counters on the server; VSTS is actually timestamping and recording when the request went out and when it came back. With LoadRunner, this data is collected only if you wrap the request in a transaction.

In VSTS, therefore, you don’t really need to write many transactions. If you set “Parse dependent requests” to “true”, and “record results” to true, VSTS’s “Avg. Page load time” reading will be the same reading as if you had wrapped the request in a transaction. The only exception I can think of is if you have a page which has dependent requests and you want to measure the entire page load time (primary and dependent requests). This situation occurs, for instance, when we have a signed-on user going to the Search page; here the mediabinfooter loads in addition to search.aspx.

If you have such a page, rather than constantly wrapping it in a transaction, create a separate “measurement” script. In this script, and in this script only, wrap all your requests in transactions.

The reason to limit transactions is that every transaction you add to VSTS chews up memory. Furthermore, giving transactions the same name does not save memory: transactions with the same name but in different web tests count as two transactions.

  • Datasources. Sometimes VSTS loses the connection to its data source. This happens if either the datasource machine or the load generator gets rebooted. In these cases, re-hook up one of the datasources, and then make liberal use of VSTS’s “Edit→Replace in Files” feature to search and replace that long, encrypted GUID which is part of the datasource.
  • Always use the “Parameterize web servers” option. (This is a rt. Click on the top node of the web test).  This allows fairly transparent movement of the test between the LOAD and STAGE environments. 
  • Double- and triple-check think times. After you have recorded a web test, view the properties of each request and give serious thought to the think times. Verify that dependent requests have their think time set to zero. For a super-user unit test, think times for all requests would be zero. For a “real world” multi-page load test, the think times have to represent what users in the real world are actually doing. As you get into large mixed load tests, what large numbers of users are doing at peak load becomes harder and harder to simulate. A rule of thumb for “real world” tests is to set all the requests to 3 seconds, or maybe 5 seconds. The main thing here is to make sure you haven’t left 100 second think times in your test. More than any other factor, think time will affect the outcome of a load test.
  • Periodically review your web test. Remember to once in a while run your web tests individually outside your load tests. Visually check, not only for the little red “x’s” which say the request failed, but also the page itself  for the “Sorry, page not available . . .”  message. In the future it would be good to create a plug in which checks for this message.  
  • Use Wireshark or Fiddler. When having difficulty creating a web test which runs successfully, your best friend is the packet sniffer tool Wireshark or Fiddler. Turn on Wireshark, execute the feature you are testing, then turn off Wireshark and do a “Follow TCP Stream.” This will show you what is really happening with your feature. You may see a cookie which failed to record, or a dependent request or an AJAX request. 
  • You can use Wireshark to both get a capture from IE, and then also to capture a VSTS web test run. You can compare these to see where you went wrong with your VSTS web test.  
  • When I have gotten stuck on header issues, what I do is copy the info right out of Wireshark and paste it, line for line into VSTS. I put each cookie on a single line.  
  • When adding cookies to a web test, try to keep it to the minimum amount of cookies. Do not add the Omniture cookies. 
  • Micro-Sensitive Drag and Drop: A lot of VSTS elements can be dragged and dropped, including entire requests. There is a micro-sensitivity to this. If you drop with the mouse pointer on the bottom of a request, it gets pasted underneath that request. If you drag to the top of the same request, it gets pasted above that request. If you pay attention to this you can correctly re-order a lot of requests.
  • Other Extraction Rules. The Mixed Load currently uses only one other extraction rule. This is “Extract Text” in the purchase path script. I did not really need this rule until build 10.4 in January 2007, but then it came in very handy. It is helpful when there is some kind of security around a resource, in this case the cartID for the user. Until build 10.4, anyone could add anything to anyone’s cart. They then changed that, so only the signed-in user could add to his cart. This means you have to get the user’s cart ID.
  • If the cartID and the userID passed in with the request do not match, you get a “we’re sorry” error.

        The CartID is readily available in the HTML source once you click on “My Cart”.

 To fix the purchase path script I added two requests to the script. The first adds an image to the user’s cart, which ensures that there is a link in the user’s cart with the cart ID. The second is a request to the “My Cart” page, to which I added the “Extract Text” rule to extract the user’s cart ID. This ID is now required for the Pricing Popup calculators.

To create an “Extract Text” rule,  you simply right click on the request and choose “Add Extraction Rule”. You then fill in the fields with the text before and after the parameter you are interested in. 
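The idea behind “text before and after” can be sketched in a few lines of Python. This is only an illustration of the matching logic, not VSTS’s own implementation; the HTML fragment and the marker strings are hypothetical.

```python
def extract_text(body, starts_with, ends_with):
    """Return the text between the first starts_with marker and the next
    ends_with marker, or None if either marker is missing."""
    start = body.find(starts_with)
    if start == -1:
        return None
    start += len(starts_with)
    end = body.find(ends_with, start)
    if end == -1:
        return None
    return body[start:end]

# Hypothetical fragment of the "My Cart" page source.
html = '<a href="cart.aspx?cartID=12345&amp;view=full">My Cart</a>'

cart_id = extract_text(html, "cartID=", "&amp;")
print(cart_id)
```

The extracted value would then be bound to a parameter and passed in with the later requests in the script.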

Remember that the whole point of using any kind of parameter values is so you can add many different users, including users with different carts.

 Note on Extracted Parameters: These do not cut and paste very well from script to script. It is better to create them uniquely in each script.  

Resorting requests within a web test: Dragging a request to the top node of the web test will actually place it at the bottom. If you do this with each request, you can resort the requests. 

Do not query the LoadTest Database while running a load test. This will interfere with the loadtest.

This is all the more reason why we should have 2 loadtest databases.  

Notes on Installing the Load Agent

Many times the load agent will not install correctly unless you are logged onto Windows with the load agent account. I have the controller service running under the account “LoadTester” with credentials “Password.1”.

So if you want to create another load generator, you must create a local account on the load generator machine named “LoadTester”, give it the password Password.1, and then log onto the load generator with the LoadTester account. Then install the agent. (They do not tell you this in any documentation.)

 It is best, therefore, to run all the load generators signed on as “LoadTester”.

 Notes on Parameterization
 The “Random” setting DOES update every “instance,” whereas the “Sequential” setting does not.

 This is different behavior than LoadRunner, which has the Random, Sequential and Unique functionality, but then also has the “update each occurrence” option for each of these choices.

 VSTS sort of combines these two features:

“Random” both randomizes the items in your parameter list and also randomizes it within the webtest or script. I believe that using random is the only way to get “update each occurrence” functionality in your load tests.

 Sequential moves through your parameter list one at a time, and also will not update that parameter within a web test.

 Unique behaves basically like the LoadRunner version, and when you run out of values in your parameter list, the test will stop. For this reason, you seldom want to set the datasource to unique.
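As a rough illustration of how the three access methods pick rows from a parameter list, here is a small Python sketch. It models row selection only, not the “update each occurrence” behavior within a web test. The wrap-around of the sequential mode is an assumption, and the names are made up for the example.

```python
import random

class DataSourceExhausted(Exception):
    """Raised when a unique data source has no rows left."""

def sequential(rows):
    """Yield rows in order, wrapping back to the start (assumed behavior)."""
    if not rows:
        return
    while True:
        for row in rows:
            yield row

def random_access(rows, seed=None):
    """Yield a randomly chosen row each time; rows may repeat."""
    rng = random.Random(seed)
    while True:
        yield rng.choice(rows)

def unique(rows):
    """Yield each row exactly once, then signal that the data has run out."""
    for row in rows:
        yield row
    raise DataSourceExhausted("no unused parameter rows left")

users = ["user1", "user2", "user3"]

seq = sequential(users)
print([next(seq) for _ in range(5)])

uniq = unique(users)
print([next(uniq) for _ in range(3)])  # a fourth next(uniq) would raise
```

The exception in the unique mode corresponds to the test stopping when the parameter list runs out, which is why that setting is seldom what you want for a long load test.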

 Some caveats about Load Tests

  • I set most requests to “Parse dependent requests” = “false”, and then let new users be the users who download all the HTML resources. This saves on load generator RAM. I have one script named “measurements” in which I have “Parse dependent requests” set to true.
  • Do not let the free memory on your load generator fall below 300 MB. If this happens, the load test is invalid. I like to use a freeware tool called “FreeMem.” This puts a little white icon in the system tray showing how much memory you have left. Be sure to disable any and all of FreeMem’s optimization tools. Optimizing memory during a load test will cause the load test to stop during the optimization.

 General LoadTest Troubleshooting

  • Disappearing webtests. Sometimes VSTS drops webtests from a loadtest and replaces them with some weird web test you wrote two weeks ago. When this happens, try re-opening the “Edit Test Mix” dialog and/or exiting VSTS and reopening your solution.
  • Low Memory on load generators. Follow the steps at: http://forums.microsoft.com/MSDN/ShowPost.aspx?PostID=841598&SiteID=1. Also, minimizing the GUI during the load test run will save about 300 MB of memory.

 Load Generator/Load Environment Tips

  • If you have to test in the STAGE or PROD environment, there are often various problems getting data back and forth to the load generating machines. It is best to try to connect by the IP address of the machines. If you can get to one load generator, it is generally easier to then copy data from that load generator to all the others.
  • The load generators will basically be in workgroups (not joined to a domain) and you must have a local account on that machine. The exception will generally be the “perfmon gathering machine” (loadgen06 in STAGE) which must be joined to the STAGE domain in order to gather the perfmon counters.

 Future goals for VSTS Load and Web Tests

  • Write a plug in which checks for the message “Sorry, page not available . . . .”
  • Write a SQL Query which allows us to publish “real” or final load test results to a separate reporting database. The query should be based on the LoadRunID and extract all the data. This would be an excellent way to maintain a history of performance improvement. The MS Access database I have started creating would be an excellent front-end for viewing this data.
  • Get the counter sets for various machines (CPU, memory, etc.) integrated into the load test solution. Create some graphs in MS Access which report on these.
  • Get better at parameterizing the lightbox footer page.
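The check that the first goal describes could look roughly like the following. A real VSTS plug-in would be written in .NET against the web test API; this Python sketch only shows the validation logic. The marker string is the visible start of the site’s error message, and treating HTTP status codes of 400 and above as failures is an assumption.

```python
# Visible start of the site's error message; the full text is not reproduced here.
ERROR_MARKER = "Sorry, page not available"

def page_failed(status_code, body):
    """Flag a response as failed if the HTTP status is an error, or if the
    body carries the error message even though the status looks fine."""
    if status_code >= 400:
        return True
    return ERROR_MARKER.lower() in body.lower()

print(page_failed(200, "<html>Sorry, page not available . . .</html>"))
print(page_failed(200, "<html>search results</html>"))
```

The second condition is the important one: it catches pages that return a normal status but never actually display the feature under test.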

 2) VSTS Run Time Settings

First, on any load generator machine you set up, follow the instructions I posted on Microsoft’s VSTS forum in order to maximize memory utilization of the load generator:

http://forums.microsoft.com/MSDN/ShowPost.aspx?PostID=841598&SiteID=1 , paying particular attention to setting Server-style Garbage collection to true, as described here:

http://forums.microsoft.com/MSDN/ShowPost.aspx?PostID=210575&SiteID=1 

Think Times:
The Omniture data we have gathered (\\loadgen\Projects\WVUI_Framework\HowManyUsers\OmnitureInfo) shows that at the peak viewing hour, 7 to 8 am on Feb. 9th, 2006, there was a total of 181,096 page views. ~12,000 users were responsible for generating that load. This is ~15 pages per user per hour, which means that as a statistical average there is a 4 minute (240 second) think time for each user (60/15 = 4). Using the unrounded figure (181,096 / 12,000 ≈ 15.09 pages per hour), the average works out to roughly 238.5 seconds. This 4 minute statistical average think time is a fact to bear in mind when modifying the mixed load scenario. The December 2006 Load and Performance Test Result document contains more information and graphs which illustrate additional facts about this think time. Probably what is really happening is that users have our site bookmarked, and/or another site links to one of our images. So in a sense, we are hit by a lot of traffic which is not what “real,” purchasing users would do. Based on all this data and these calculations, think times in the mixed load scenario are all currently set to 238 seconds. It would probably be good to modify the mixed load scenario to reflect some users paging through the site more quickly, and some not returning to the site at all. This could be done by duplicating some of the scripts currently in the mixed load scenario. But in the end, the think time should average out to 238 to 239 seconds for the Editorial Load. (Think time for Creative and Editorial combined will be different.)
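The think-time arithmetic can be reproduced with a short calculation. The two input figures come from the Omniture data quoted above; the user count is the approximate 12,000 figure, so the result is an estimate.

```python
# Peak-hour figures quoted above (the user count is approximate).
page_views = 181_096
users = 12_000

# Average number of pages each user viewed during the peak hour.
pages_per_user_per_hour = page_views / users

# Average gap between page views for one user, in seconds.
think_time_seconds = 3600 / pages_per_user_per_hour

print(f"{pages_per_user_per_hour:.1f} pages per user per hour")
print(f"{think_time_seconds:.1f} seconds average think time")
```

Rounding to 15 pages per hour gives the 4 minute (240 second) figure; keeping the decimals gives the 238 to 239 second range used in the mixed load scenario.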

 My personal methodology for setting the think times is to do it by hand, for each of the approximately 150 requests in the mixed load scenario. There might be another way to do it.

 A few requests are not set to 238 seconds, such as logging in and the price pop-ups. I was afraid these pages would time out in 4 minutes, so I set them to 10 seconds.

 Think Time Between Test Iterations
I have this set to zero in the mixed load scenario. There may be other ways to do it.

 Script Percentages:
The percentage at which each of the 17 scripts in the mixed load scenario are run is documented in the December 2006 Load and Performance Test Results document. Basically Searching (27%) and Image details (23%) are viewed the most. These percentages basically follow the Omniture data, keeping in mind that

 Other Settings:

 IP Switching - True -- will not work until we have the agent installed.

 Percentage of New Users. I have this set to 10, 15, or 20% when emulating a peak load. You have to be careful how you set it. If it is set too high, VSTS will record results for image requests (jpgs) rather than ASPX requests. By default VSTS records results for 1,000 different URLs, but if all the users are new users, VSTS gets more jpgs than ASPXs and then starts dropping ASPX requests. Also, the higher this percentage, the more data you are sucking through the network cards. Image data will saturate the network cards very quickly. I have tested Newsmaker with 50% and 100% new users, and the processor utilization is not that different than at 15%. I would leave this at 15%.

 Browser Mix – Currently 100% IE Users. We ought to try some other browsers here.

 Network Mix - 34% LAN, 33% T3, 33% Cable Modem. With no network latency these settings hardly make a difference, but if we ran the mixed load over a WAN simulator, we might see some differences.

 Step Load Pattern

 - Initial Users - 0

 - Max Users - 600 users per 2 GB machine. This could be increased to about 1200 when we get 4 GB per machine.

 - Step duration - 10 (10 seconds)

 - Step Ramp Time - 10 seconds

 - Step User count - This is one which matters. To *really*  emulate our user ramp up time on the Editorial site, this should be no more than 9 or 10 users per second. This is *total* users for all load generating machines combined. If you are in a hurry and you just want to get to a max userload, you could do 30 users total per second, but be aware that Getty does not really have a situation in our live site where users log onto the site that rapidly.

 Counter Sets - I currently leave these at the default settings. I do not have any of the web or SQL servers entered. I have been using LoadRunner to gather these counter sets because LoadRunner has superior graphing capabilities. Eventually, however, we should get good at adding these counters in VSTS. We will also have to develop SQL queries to pick out these counters. If we want VSTS to graph these counters, they will have to be added to the scenario.

 Run Settings:
The following are changed from the defaults:

 Duration (i.e., how long should I run my test?): This depends on what you are testing for. To reach a decent-size user load of 1,800 users (for 8 web servers this would equal 14,400 users), and at 10 users per second, you need 18 minutes just to ramp up and get all your users in. You want at least a half hour of test data running at the 1,800 user level, so that means at a minimum you must run mixed load tests for 40 minutes.

 The above calculations are for the mixed user load scenario when we are testing for capacity. Unit tests on individual features can be run with a lower number of users and no think time. One advantage of this is that you cut out the ramp-up time of all those users. See John Gagnier for more information about this.

 Tracing Enabled - FALSE

 Storage Type – Database. (The database where the results are stored is set in the “Test→Administer Test Controllers” setting.)

 Warm up -- zero. Since we have to run the mixed load so long to ramp up users, I do not bother with a warm-up. I basically just let the test run for 15 minutes longer, or take only the last hour’s page views. For unit testing, warm-up is more important.

 Web Test connection Model - Connection Pool

 Web Test Connection Pool size - 1,000

 3) How to Collect Data

VSTS by default collects data to its own local SQL Server Express database. Query Analyzer can be added for free to SQL Server Express by upgrading to the Advanced version of SQL Server Express (http://msdn.microsoft.com/vstudio/express/sql/download/ ).

 However data is best collected on a separate SQL Server not involved in generating load. We currently use loadgen11. This server is running on a trial version of SQL Server and needs to be upgraded.

 To set up a LoadTest database to collect VSTS data, read this post on how to create a new loadtestResultsRepository: http://forums.microsoft.com/MSDN/ShowPost.aspx?PostID=55303&SiteID=1

 The script to create the LOADTEST database is also here: <deleted>

 Once the LoadTest database is set up, set the recovery mode of the database to simple. Do not leave it in full recovery mode, or space will be used up too quickly.

 Finally, set the correct SQL Server database in the “Test→Administer Test Controllers” menu item of VSTS.

 Once the load test database is set up, you have two ways to collect data:

 1. When a test is done, copy and paste data from the VSTS GUI into Excel or some other place.

 2. Use the following queries I developed to query the VSTS LoadTest database: <deleted>

 Here is a list of the queries I have developed:

  

Script: Function

1. GetLoadTestID.sql.txt: Gets the LoadRunId for the latest run.

2. CountPageHitsByRequestWithDateDiffAndCollation.sql.txt: Collects and collates page views for the run.

3. RequestResponseTimesOrderedByTestcaseAndUserLoad.sql.txt: Gathers and collates the response time of every ASP.NET request and sorts them by webtest.

4. AvgPageResponseTimeVSTS.sql.txt: Gathers and collates the response time of every page (including dependent requests such as gifs, jpgs, etc.) and sorts them by webtest.

5. ErrorMessagesAndTimeStamps.txt: Gathers any error messages which occurred during the test.

6. SetServerStyleGarbageCollection.txt: Script to turn on server-style garbage collection on the load generators.

7. BulkInsertSample.txt: Sample methodology to get parameter data loaded into SQL Server.

8. DeleteOldData.sql.txt: Script to run when the load test db runs out of space.

9. CreateLOADTESTDatabase.sql.txt: The script to create a new, blank LoadTest database.

 Using the SQL Scripts
Only the first 4 scripts are used in normal load testing data collection. The first thing you have to do is query the LoadTest DB for the most recent run ID (your test). You then must take this run ID and substitute it into the other SQL scripts. A few of the queries can be altered in order to find the pages which ran the longest. Currently, what I do is copy the query results out of the Query Analyzer window and paste them into Excel. I then graph a few of the key pages in Excel.
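The two-step workflow (find the latest run ID, then feed it to the other queries) could be automated along the following lines. This is a sketch under heavy assumptions: it uses an in-memory SQLite database as a stand-in for the SQL Server LoadTest database, and the table and column names (runs, page_hits, run_id, and so on) are invented for the example and are not the real LoadTest schema.

```python
import sqlite3

# In-memory SQLite stand-in for the LoadTest database (hypothetical schema).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE runs (run_id INTEGER PRIMARY KEY, start_time TEXT)")
conn.execute("CREATE TABLE page_hits (run_id INTEGER, request_name TEXT)")
conn.executemany(
    "INSERT INTO runs VALUES (?, ?)",
    [(41, "2007-02-01 09:00"), (42, "2007-02-02 09:00")],
)
conn.executemany(
    "INSERT INTO page_hits VALUES (?, ?)",
    [(41, "Search.aspx"), (42, "Search.aspx"),
     (42, "search.aspx"), (42, "search.aspx")],
)

def latest_run_id(conn):
    """Step 1: find the ID of the most recent run."""
    row = conn.execute(
        "SELECT run_id FROM runs ORDER BY start_time DESC LIMIT 1"
    ).fetchone()
    return row[0]

def page_hits(conn, run_id):
    """Step 2: pass the run ID as a query parameter instead of editing
    the SQL text by hand."""
    return conn.execute(
        "SELECT request_name, COUNT(*) FROM page_hits "
        "WHERE run_id = ? GROUP BY request_name ORDER BY request_name",
        (run_id,),
    ).fetchall()

run_id = latest_run_id(conn)
hits = page_hits(conn, run_id)
print(run_id, hits)
```

Passing the run ID as a parameter avoids the manual find-and-replace step, and the query results could be written to a CSV file for graphing instead of being pasted from Query Analyzer.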

 I have been developing a Microsoft Access tool which automates the process of using these scripts and automatically produces desirable graphs.

 4. How to Query the Data Collected Via VSTS

This is described above. Basically after the query runs, paste the resulting data from Query Analyzer into Excel and graph it. The MS Access tool I have partially developed should help greatly with this.

 The only trick is getting the correct LoadRunID. This ID is time and date stamped in the LoadTest database so it is not too hard to find old tests if you can remember when you ran it.

There are some other tricks, but currently these tricks are necessary only because we do not have the Load Test Agent installed. This should be installed shortly.

One final trick is that sometimes a query fails, because sometimes a request has exactly the same time and date stamp as another request. This is going to need a SQL developer to figure out. The load test agent may help with this problem too.

 5. Location of VSTS Solution And Data Sources

The best location to find the most current version of the VSTS mixed load solution is on one of the STAGE load generators: loadgen04, loadgen07, or loadgen10 (10.207.18.104, 107, or 110). I place the mixed load scenario in a folder called “LoadTestSimplified” and the solution file is named “stage.” (But of course, you can move this solution around to the load environment if you have used the “parameterize web servers” feature.)

 

All the POST requests in the solution have a parameterized data source. I always use SQL Server as the data source, rather than a text file. I don’t use text files because they chew up memory and CPU on the load generator. The SQL Server which holds the parameter data is on a different machine from the SQL Server which holds the results. The SQL Server I have been using is <deleted>. This machine is in the STAGE subnet, 10.207.18.102. I always connect with SQL Server credentials, USR: LoadTester, PWD: Password.1 .

 I have been toying with the idea of using the product catalog and Keyword lookup databases as datasources. This would give us more coverage.  

7. Location of all supporting documents/references.

 Your basic reference for questions is Microsoft’s VSTS testers forum: http://forums.microsoft.com/MSDN/ShowForum.aspx?ForumID=19&SiteID=1 . Microsoft FTEs stopped answering questions there about November 1, 2006, but there is still lots of good info.

The book on VSTS, Visual Studio Team Test 2005, is of some use, but not as helpful as the forums. To use the forum’s search functionality, you must type the logical operators “AND” and “NOT” in capitals.

Excel Spreadsheets where I calculated think times and user load are here:

\\loadgen\Projects\StageLoadTesting\TestDetailsFor8WebServersBetaProduction.xls

 All the Omniture data on which the Think Times and Mixed Load Scenarios are based are here:

\\loadgen\Projects\WVUI_Framework\HowManyUsers\OmnitureInfo

 To do

Some of the queries hit a conflict when two time stamps are the same. The next time I run into this, I will use SQL Profiler to see how VSTS resolves the conflict.


Appendix: Data Mining the VSTS LoadTest Database

Some things I have learned about VSTS load testing might be of use in automating VSTS web tests.

Namely, it is possible to store the results of all web tests in a (pre-existing) SQL Server database. VSTS stores load test data in SQL Server instead of trx files. This is the default behavior of VSTS.

But  the results of web tests can also be stored in the Loadtest database.

Some advantages of using SQL Server  to store test result data are:

    • The results of the test can be pulled out of the database by means of SQL queries, thus avoiding the parsing of XML.
    • Centralization of test results
    • History of tests is kept
    • Easier parsing of results
    • More errors caught

These advantages require that the web tests be run from a VSTS load test, with the load test set to an iteration of one. A load test can also be kicked off by mstest.exe.
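For scripted runs, the mstest.exe call can be wrapped in a small helper. The sketch below builds the command line and, optionally, runs it. It assumes mstest.exe is on the PATH of the machine that runs it, and the file names in the usage note are hypothetical. /testcontainer and /resultsfile are standard mstest.exe options.

```python
import subprocess


def mstest_command(load_test_file, results_file=None):
    """Build the mstest.exe argument list for one .loadtest file."""
    cmd = ["mstest.exe", "/testcontainer:" + load_test_file]
    if results_file:
        # Optional: name the .trx results file explicitly
        cmd.append("/resultsfile:" + results_file)
    return cmd


def run_load_test(load_test_file, results_file=None):
    """Run the load test and return the mstest.exe exit code."""
    return subprocess.call(mstest_command(load_test_file, results_file))
```

For example, mstest_command("SingleIteration.loadtest", "run1.trx") yields the arguments for a load test that wraps the web tests with an iteration of one.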

There are two excellent posts in the VSTS forum about  running web tests as load tests and setting this iteration level to one:

Once a web test is run from a load test, you will find the results kept very compactly in a database named “LoadTest” in the SQL Server Express instance on your local machine.

This database can be mined to find the results of your tests or the failure results.

Data Mining the VSTS LoadTest Database

Viewing the data in the LoadTest database requires a painless upgrade to the Advanced version of SQL Server Express. The free download is here: http://msdn.microsoft.com/vstudio/express/sql/download/ (“SQL Server 2005 Express Edition with Advanced Services SP1”). The Advanced version gives you a GUI from which you can access your test data by point and click, and a query manager in which to code up queries. You could also install a regular SQL Server instance. (You can in fact point VSTS to any existing SQL Server anywhere on the network, and use the loadtestresultsrepository.sql file to create a LoadTest database.)

Once you have a GUI you will see all the tables where the results of every load test are stored, including error messages, and you can run queries against all this data to find the results of your tests programmatically.

The VSTS install includes a file called loadtestresultsrepository.sql (in “C:\Program Files\Microsoft Visual Studio 8\Common7\IDE”), which creates the LoadTest database when VSTS is installed. This database stores the results of every load test, including error messages and stack traces. (A precise description of this database is here: http://blogs.msdn.com/billbar/articles/529874.aspx)

The following are some queries I hacked together as a result of much fiddling with loadtestresultsrepository.sql. This file does not automatically give you the queries, or provide them in the most useful format. You have to rearrange statements, combine and subtract things, etc. (By the way, VSTS itself does NOT use these queries to populate the VSTS GUI. If you run SQL Profiler while clicking on a load test TRX file, you will see that VSTS actually uses some really weird code to pull the data into the GUI. But you can compare the results in any VSTS GUI with the results of the SQL queries below and they will match, except where time stamps collide: sometimes two rows in SQL have the same time stamp, or differ by a millisecond. The VSTS GUI will order these data points one way and the SQL query a slightly different way, but the rows for the next second will be back in matching order.)

Here are the queries I have hacked together to pull out load test data. 

I have not experimented with pulling out error info or pass/fail info, but I’m sure mining the LoadTest database would provide rich results for any web test automation.

For all these queries, the first thing you must find is the LoadTestRunID of your test. You must change this ID number in each of the queries below. 
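As a sketch of that first step, a run ID can be looked up by the date on which the test was run. The example uses Python's built-in sqlite3 module as a stand-in for SQL Server so that it is self-contained. The table and column names (LoadTestRun, LoadTestRunId, LoadTestName, StartTime) are assumptions here and should be checked against loadtestresultsrepository.sql; the rows are made-up demo data.

```python
import sqlite3


def build_run_demo_db():
    """Create an in-memory stand-in for the run table with two demo runs."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE LoadTestRun ("
        "LoadTestRunId INTEGER, LoadTestName TEXT, StartTime TEXT)"
    )
    conn.executemany(
        "INSERT INTO LoadTestRun VALUES (?, ?, ?)",
        [
            (11, "stage", "2006-11-09 13:42:00"),
            (12, "stage", "2006-11-14 13:15:00"),
        ],
    )
    return conn


def find_run_ids(conn, start, end):
    """Return (run id, test name, start time) for runs started in [start, end)."""
    cur = conn.execute(
        """
        SELECT LoadTestRunId, LoadTestName, StartTime
        FROM LoadTestRun
        WHERE StartTime >= ? AND StartTime < ?
        ORDER BY StartTime
        """,
        (start, end),
    )
    return cur.fetchall()
```

Calling find_run_ids(conn, "2006-11-14", "2006-11-15") returns only the runs started on November 14; the ID in the first column is the value to substitute into the other queries.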


From: Charles Morrison
Sent: Tue 11/14/2006 1:15 PM
Subject: RE: Test Automation Idea: Using SQL Server to Store WebTest Results

I just tried and was successful at adding a unit test to a load test. (It shows up as a “TestMethod”) and then I also queried the database and it was showing that the unit test executed, and how many times it executed, etc.  It seems like you could code anything into a unit test, write a SQL provider and have whatever results you want thrown into the SQL Server.

Technically, I think that could be done without adding the test to a load test. You could just write your own SQL Connection string.

The loadtest database just makes it easier because they have already optimized the process of throwing results data into a database.

The first step I think is to install that Advanced SQL Server express and start experimenting.

_____________________________________________
From:
Sent: Tuesday, November 14, 2006 12:27 PM
To: Charles Morrison
Subject: RE: Test Automation Idea: Using SQL Server to Store WebTest Results

Can this same technique be applied to unit tests? We have a lot of test cases coming down the pipe that are going to be centered around web services and backend api that are going to be driven through the unit test framework and it would be nice to have one platform we use to query all of the results.

Alex

_____________________________________________
From: Charles Morrison
Sent: Thursday, November 09, 2006 1:42 PM
To: Sandrine McFadden; Aaron Spainhower; Bill Sammons; Tom Kleinecke; Alex Shepard; Steve Swartz; David Organ; Steve Swartz
Subject: Test Automation Idea: Using SQL Server to Store WebTest Results


 

Appendix B Load Testing DLL’s With VSTS
In this document I conclude that the best way to load test DLL’s under development is to create simple web pages which interface with the DLLs, place the pages and DLLs on a separate web server (in a load environment), and use standard load tools, such as VSTS, to create load similar to what the DLL’s will actually experience in production. These simple web pages would actually be prototypes of the finalized web pages already under development by a web team. In this way, stress and performance testing of the web page and the DLL together can be integrated early into the development cycle.

Although white box testing can provide valuable experience and familiarity with the VSTS 2005 GUI (which is good preparation for executing VSTS load tests), there are several obstacles to simply transferring white box API tests over for load use. A simple switch-over of the tests goes against industry standard practices, and it is NOT the way Microsoft advertises VSTS. Some cross-over of knowledge between the two testing methodologies, functional and load, is certainly possible, but nowhere in its documentation for VSTS load testing does Microsoft describe or advocate a simple switch-over or the leveraging of functional tests for load testing purposes.

The API tests developed by our white box testers during Newsmaker development were written in C# and exercised, among other DLL’s, the API of Search-Middle Tier components. This document describes the problems which occurred with the idea of leveraging these white box tests for load use, and suggests some alternatives for future testing. These problems are as follows:

A)     Results from tests executed on DEV and TEST machines require, at their best, large amounts of interpretation and speculation. Many of the white box DLL tests were executed on testers’ local machines. There are several reasons why much interpretation (really too much) is required in this scenario:

1. DLLs perform differently on a TEST or DEV machine, with its limited CPU and RAM, than they do on a server with much more CPU and RAM.

2. It is difficult to separate the CPU, Memory, disk, and other resources consumed by the test harness vs. the resources consumed by the DLL under test.

3. The test machine’s CPU and RAM are a bottleneck to the load test.

4. Executed locally, network issues are not taken into account (i.e., sending too much data back and forth through the network card – a situation which we did encounter. )

B)     API test suites are a poor representation of the load pattern which the DLL will see in actual use. By its nature, a suite of API tests is very comprehensive. A full suite of API tests will exercise one method of a DLL with a hundred different input cases, many of which might consist of sending error conditions at the DLL. This is a completely different load from the one the DLL sees in actual use on a web server. In actual use, it is likely that a single method, with few parameters and therefore few input cases and few error conditions, will be exercised the most frequently. When a functional API suite is used as a load test, some methods of the DLL will be over-exercised. Other methods may not receive enough load, however comprehensive the suite, and the DLL may then fail under the true load.

It is really best to look at the actual traffic of the web site, using a tool such as Omniture or the IIS logs, and to think carefully about the architecture of the DLL, in order to determine which methods of the DLL will actually be hit the most and in what ratios.
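The ratio calculation itself is simple once the traffic has been counted. The sketch below turns a list of requested pages, as might be pulled from IIS logs or an Omniture export, into each page's share of total hits. The page names are hypothetical, and mapping pages to the DLL methods they call still has to be done by someone who knows the architecture.

```python
from collections import Counter


def load_mix(requested_paths):
    """Return each path's share of total hits as a fraction between 0 and 1."""
    counts = Counter(requested_paths)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {path: hits / total for path, hits in counts.items()}


# Hypothetical sample: ten requests taken from a log
sample_hits = (
    ["/search.aspx"] * 6 + ["/product.aspx"] * 3 + ["/checkout.aspx"]
)
sample_mix = load_mix(sample_hits)
```

The resulting fractions can be used directly as the percentages of a mixed load scenario, so that the page (and the DLL method behind it) hit most in production is also hit most in the load test.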

 C)     Most importantly, industry standard coding practices of  functional tests tend to preclude their use as any sort of stress or load harness. API functional tests are coded tests which do a  lot of data manipulation and frequently require more disk, CPU and RAM than is practical for any kind of load test.  For example, a tester functionally testing a web method might approach the coding of his API tests in the following manner:

a.       Save a SOAP XML document to disk which he will modify and send at the web method

b.       Instantiate an XML reader or string reader object to read the XML into memory (he might do this for each test case, in a 150 test case suite)

c.       Open a second file reader to pick and read from a list of parameters he has on disk

d.       Use XPATH to update an attribute of the SOAP

e.       Instantiate a web proxy in preparation for sending the SOAP at the web method.

f.        Create a “Results” object to get the return values from the web method

g.       Write the new SOAP result to disk

h.       Instantiate another string reader to read a set of expected results

i.        Compare the expected XML with the actual XML received from the web method.

These sorts of steps are fine for functional testing, but when you start repeating them in the volume required for a load test, the test machine does not have enough CPU and RAM to instantiate 150 web proxies. All load tools are optimized and thought through so that they do not have to instantiate so many objects in order to generate load (they instantiate singleton objects).


The problems with using a suite of functional test cases as a harness to conduct stress tests are numerous. For instance:

  • When the web proxy is instantiated, does the tester instantiate it once for all the tests, or does he destroy and re-instantiate it for every test?
  • How large are the XML documents when he reads them into the memory of the test machine? How much RAM do they consume?
  • Would it be better to just read one attribute or element of the XML?
  • How much time is lost in writing and reading results from disk?
  • Does he re-read the result file for every test, or does he put that result file into an array in memory first, before the start of the entire test?
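The first question in the list can be made concrete with a small sketch. FakeProxy is a hypothetical stand-in for an expensive web proxy; it does no network work and only counts how many instances are created, so the two styles can be compared.

```python
class FakeProxy:
    """Stand-in for an expensive web proxy; counts how many were created."""

    created = 0

    def __init__(self):
        FakeProxy.created += 1

    def send(self, payload):
        # A real proxy would post the SOAP document to the web method
        return "ok:" + payload


def run_recreating(payloads):
    """Functional-test style: a new proxy is built for every test case."""
    return [FakeProxy().send(p) for p in payloads]


def run_reusing(payloads):
    """Load-test style: one proxy is built once and shared by every case."""
    proxy = FakeProxy()  # created once, before the loop
    return [proxy.send(p) for p in payloads]
```

Run against a 150-case suite, the first style creates 150 proxies and the second creates one, while both return the same results. The same reasoning applies to reading the expected-results file once into memory instead of once per test.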

Thinking through the code in a functional test harness in order to optimize it for load and performance testing is an extra step and is difficult work, work which has already been done for us in the web and load test tools of VSTS. If a white box tester does not construct his tests in a way which optimizes the memory and CPU of his test machine, the machine will run out of these resources and grind to a halt. This is exactly what happened with the Mr. T VSTS functional tests when we tried to use them as a stress harness.

A white box tester should NOT be required to completely design, engineer, and test a new stress harness. That’s not his job, and agile development does not give him time to do such work. Sending load at a web server effectively is precisely the reason why load tools such as ACT, VSTS and Microsoft Web Application Stress Tool were written. These tools are optimized to send load in an efficient manner, deliberately limiting the number of parameters which can be changed or altered by the user so that the test harness remains stable and uses a small amount of resources.

VSTS already has pre-optimized ways of parameterizing XML elements and attributes and then sending the XML at a web server. It is silly not to take advantage of them. See, for instance, the section “Data Mining the VSTS LoadTest Database” for information about how to collect some of the results of tests executed with VSTS. What is mostly required is that we a) take advantage of the VSTS testing abilities and b) execute load tests in a meaningful environment, representative of the machines the DLLs will actually run on in production.

 It is important to work with the intentions of the VSTS designers and not against them.

Conclusion
Although some knowledge of the VSTS GUI is gained by writing API and other functional tests in VSTS, for the reasons stated above, very few of the actual API tests can be leveraged or used in a stress or performance test.

Running a functional test repeatedly can give a very rough idea of the robustness and stability of a DLL as it is being developed, but it is a very relative measurement, taking into account neither the RAM and CPU of the tester’s machine nor a realistic load on the DLL. Again, since these tests are executed on a tester’s machine, at best they can only provide a relative measurement of how the DLL will perform on a much larger web server, and such tests are likely to miss failures which will occur when the true load is exerted against the DLL.
