Friday, July 15, 2011
Presentation: An Introduction to Practical Unit Testing
Monday, February 28, 2011
4 Types of automated tests - unit, integration, UI, and performance
Some key ideas:
- These build off of each other:
- Unit --> Integration: Don't bother with complex integration tests if you can't even have a simple unit test.
- Integration --> UI: It's going to be near impossible to do a UI test (which usually has poor APIs) if you can't at least integrate the backend (which at least has APIs - like web services, SQL, or C# calls).
- UI --> Performance: If you can't at least functionally run the code from end-to-end, then you can't expect reliable performance measures on it. Yes, there are always exceptions, and semantics (one may consider "UI" to be frontend integration, or one may test performance on just the backend APIs and bypass the UI). But in general, these 4 tests are a very natural path to follow.
- The higher you go, the more expensive: Unit tests (low-level) are cheapest, performance tests (high-level) are most expensive. So it's bad business to pay for an integration test to do the work of a unit test. It's like using an Armani suit as a dish rag.
- These 4 types of tests should be separated. You can call any code from a unit test (depending on security, you could even call APIs to shutdown the server), so you could mix all your tests into one test harness. But don't do this - it will burn you. For example, unit tests are generally fast (they're all in-memory), whereas UI and integration are much slower (databases, web services, IIS hosts, etc...) So you don't want them coupled together because the slow integration tests will bog down your fast unit tests, and then developers never run unit tests before check-in because "it takes too long".
- Unit testing is but one tool. There are different types of code (algorithms, data containers, plumbing, installation scripts, UI, persistence plumbing, etc...). This requires an arsenal of developer skills, of which unit testing is one tool. With respect to unit testing and code coverage, the goal isn't N% coverage of all code, but rather N% coverage of unit-testable code. (You can get better coverage tools, like NCover, which can provide coverage when running integration, UI, and even manual tests run by QA, but that's a different story).
Test Type | Good For | Bad For |
Unit | Low-level, in-memory logic: algorithms, validation, boundary conditions. | Code with external dependencies (databases, file systems, UI) - that's integration territory. |
Integration ("backend") | Ensuring high-level flows work, such as calling a web service that loads or saves data to a database and writes something to a file. | Anything that can be handled with a unit test instead. For example, you likely wouldn't use an integration test to verify every input combination for a text field. |
UI ("frontend integration") | Very high-level, functional tests. | Anything that can be handled via backend integration or unit tests. |
Performance | Identifying performance problems that could be costly to the business. | Any functional testing. |
Monday, January 17, 2011
Gaming Unit Test Metrics
1. Metric: Code Coverage
The #1 unit testing metric is code coverage, and this is a good metric, but it's insufficient. For example, a single regular expression to validate an email could require many different tests - but a single test will give it 100% code coverage. Likewise, you could leverage a mocking framework to artificially get high code coverage by mocking out all the "real" code, and essentially having a bunch of "x = x".
2. Metric: Number of unit tests
Sure, everything being equal, 10 tests probably does more work than just 1 test, but everything is not equal. Developer style and the code-being-tested both vary. One dev may write a single test with 10 asserts, another dev may put each of those asserts in its own test. Also, you could have many tests that are useless because they're checking for the "wrong" thing or are essentially duplicates of each other.
3. Metric: % of tests passing
If you have a million LOC with merely 5 unit tests, having 100% tests passing because all 5 pass is meaningless.
4. Metric: Having X tests for every Y lines of code
A general rule of thumb is to have X unit tests (or X lines of unit testing code) for every Y lines of code. However, LOC does not indicate good code. Someone could write bloated unit test code ("copy and paste"), so this could be very misleading.
5. Metric: How long it takes for unit tests to run
"We have 5 hours of tests running, so it must be doing something!". Ignore for the moment that such long-running tests are no longer really "unit" tests, but rather integration tests. These could be taking a long time because they're redundant (loop through every of a 1 GB file), or extensively hitting external machines such that it's really testing network access to a database rather than business logic.
6. Metric: Having unit testing on the project plan
"We have unit testing as an explicit project task, so we must be getting good unit tests out of it!" Ignore for the moment that unit tests should be done hand-in-hand with development as opposed to a separate task - merely having tests as an explicit task doesn't mean it's actually going to be used for that.
7. Metric: Having high test counts on methods with high cyclomatic complexity
This is useful, but it boils down to code coverage (see above) - i.e. find the method with high complexity, and then track that method's code coverage.
Conclusion
Obviously a combo of these metrics would drastically help track unit test progress. If you have at least N% code coverage, with X tests for every Y lines of code, and they all pass - it's better than nothing. But fundamentally the best way to get good tests is by having developers who intrinsically value unit testing, and would write the tests not because management is "forcing" them with metrics, but because unit tests are intrinsically valuable.
Sunday, May 30, 2010
Three cautions with mocking frameworks
I'm a big fan of unit testing. I think in many cases, it's faster to develop with unit tests than without.
Perhaps the biggest problem for writing unit tests is how to handle dependencies - especially in legacy code. For example, say you have a method that calls the database or file system. How do you write a unit test for such a method?
One approach is dependency injection - where you inject the dependency into the method (via some seam, like a parameter, or by instantiating it from a config file). This is powerful, but could require rewriting the code you want to test.
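For example, here's a minimal sketch of the parameter-injection flavor (the Order, IOrderRepository, and BillingHelper names are all hypothetical, just for illustration):

public class Order
{
    public decimal Amount { get; set; }
}

public interface IOrderRepository
{
    Order[] LoadOrders(int customerId);
}

public static class BillingHelper
{
    //the database dependency comes in through a seam (a parameter),
    //so a unit test can pass in a stub instead of hitting a real database
    public static decimal GetTotalOwed(int customerId, IOrderRepository repository)
    {
        decimal total = 0;
        foreach (Order order in repository.LoadOrders(customerId))
        {
            total += order.Amount;
        }
        return total;
    }
}

A unit test can then pass in a hand-rolled stub (or a mock from a framework) and exercise the math without any database.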
Another approach is using a mocking or "isolation" framework, like TypeMock or RhinoMock. TypeMock lets you isolate an embedded method call and replace it with something else (the "mock"). For example, you could replace a database call with a mock method that simply returns an object for your test. This is powerful; it changes the rules of the game. It's great to assist a team in adopting unit testing because it guarantees that they always have a way to test even that difficult code. However, as Spiderman taught us, "With great power comes great responsibility". TypeMock is fire. A developer can do amazing things with it, but they can also burn themselves. If abused, TypeMock could:
- Enable developers to continue to write "spaghetti" code. You can write the most tangled, dependent code ever (with no seams), the kind of thing that would get zero test coverage, and TypeMock will rescue it. One of the key points of unit testing is that by writing testable code, you are writing fundamentally better code.
- Allow developers to get high test coverage by simply mocking every line. The problem is that if everything is mocked, then there's nothing real left that is actually tested.
- Make it harder to refactor because the method is no longer encapsulated. For example, say a spaghetti method has a call to a private database method, so the developer uses TypeMock to mock out that private call. Later, a developer refactors that code by simply changing the name of a private method (or splits a big private method into two smaller ones). It will break the related unit tests. This is the opposite of what you want - encapsulated code means you can change the private implementation without breaking anything, and unit tests are supposed to give confidence to refactoring.
TypeMock can work magic, but it must be used properly.
Sunday, May 23, 2010
Why it is faster to develop with unit tests
I keep hinting at this with various blog posts over the years, so I wanted to just come out and openly defend it.
It is faster for the average developer to develop non-dependent C# code with unit tests than without.
By non-dependent, I mean code that doesn't have external dependencies, like the database, UI controls, FTP servers, and the like. These types of UI/Functional/Integration tests are tricky and expensive, and I fully empathize with why projects may punt on them. But code like algorithms, validation, and data manipulation can often be refactored to in-memory C# methods.
Let's walk through a practical example. Say you have an aspx page that collects user input, loads a bunch of data, and eventually manipulates that data with some C# method (like getting the top N items from an array):
public static T[] SelectTopN<T>(T[] aObj, int intTop)
{
if (aObj == null)
return null;
if (aObj.Length <= intTop || aObj.Length == 0 || intTop <= 0)
return aObj;
//do real work:
T[] aNew = new T[intTop];
for (int i = 0; i < intTop; i++)
{
aNew[i] = aObj[i];
}
return aNew;
}
This is the kind of low-hanging fruit, obvious method that should absolutely be tested. Yet many devs don't unit test it. Yes it looks simple, but there's actually a lot of real code that can easily be refactored to this type of testable method (and it's usually this type of method that has some sort of "silly" error). There's a lot that could go wrong: null inputs, boundary cases for the length of the array, bad indexes on an array, mapping values to the new array. Sure, real code would be more complicated, which just reinforces the need for unit testing even more so.
Here's the thing - the first time the average developer writes a new method like that, they will miss something. Somehow, the dev needs to test it.
So how does the average programmer test it? By setting up the database, starting the app, and navigating 5 levels deep. Oops, missed the null check; try again. That's 3 minutes wasted. Oops, had an off-by-one in the loop; try again. 6 minutes wasted. Finally, set everything back up, test the feature, score! 15 minutes later, the dev has verified that a positive flow works. The dev is busy and under pressure, got that task done, so they move on. 4 weeks later (after the dev has forgotten everything), QA comes and says "somewhere there's a bug", and the dev spends an hour tracking it down, and it was because the dev didn't handle when the array has a length less than the "Select Top N", and the method throws an out-of-range exception. Then the dev makes the fix, hopes they didn't break anything else, and waits a day (?) for that change to be deployed to QA so a test engineer can verify it. Have mercy if that algorithm was a 200-line spaghetti-code mess ("there wasn't time to code it right"), and it's like a Rubik's cube where every change fixes one side only to break another. Have even more mercy if the error is caught in production - not QA.
Unit testing is faster because it stubs out the context. Instead of taking 60 seconds (or 5 minutes, or an hour of hunting a month later!) to set up data and step through the app, you just jump straight to it. Because unit tests are so cheap, the dev can quickly try all the boundary conditions (outside of the few positive flows that the application normally runs when the dev is testing their code). This means that QA and Prod often don't find a new boundary condition that the dev just "didn't have time" to check for. Because unit tests are run continually throughout the day, the CI build instantly detects when someone else's change breaks the code.
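To make that concrete, here's a minimal MSTest sketch of those boundary checks for the SelectTopN method above (the ArrayHelper class name is just an assumption for illustration):

[TestMethod]
public void SelectTopN_BoundaryCases()
{
    //null input returns null
    Assert.IsNull(ArrayHelper.SelectTopN<int>(null, 3));

    int[] aSource = new int[] { 1, 2, 3 };

    //asking for more items than exist returns the original array
    Assert.AreEqual(3, ArrayHelper.SelectTopN(aSource, 10).Length);

    //a zero (or negative) "top" returns the original array
    Assert.AreEqual(3, ArrayHelper.SelectTopN(aSource, 0).Length);

    //normal case - first N items, in order
    int[] aTop = ArrayHelper.SelectTopN(aSource, 2);
    Assert.AreEqual(2, aTop.Length);
    Assert.AreEqual(1, aTop[0]);
    Assert.AreEqual(2, aTop[1]);
}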
Especially with MSTest or NUnit, unit test tools are effectively free. Setting up the test harness project takes 60 seconds. Even if you start with only 5% code coverage for just the public static utility methods - it's still 5% better than nothing.
Of course, the "trick" is to write your code such that more and more of it can be unit tested. That's why dependency injection, mocking, or even refactoring to helper public-static helper utilities is so helpful.
Over the last 5 years, I've heard a lot of objections to testing even simple C# methods, but I don't think any of them actually save time:
Objection against unit testing being faster | Rebuttal |
You're writing more code, so it's actually slower | It's not typing that takes the time, but thinking. |
I can test it faster by just running the app | What does "it" really mean? You're not testing the whole app, just a handful of common scenarios (and you're only running the app occasionally on your machine, as opposed to unit tests that run every day on a verified build server) |
"Unit testing" is one more burden for developers to learn, which slows us down | Unit tests are ultimately just a class library in the language of your choice. They're not a new tool or a new language. The only "learning curve" is that conceptually it requires you write code that can be instantiated and run in a test method - i.e. you need to dogfood your own code, which a dev should be prepared to do anyway. |
My code is already perfect, there are no bugs, so there is no need to write such unit tests. | Ok, so you know your code is perfect (for the sake of argument) - but how will you prove that to others? And how will you "protect" your perfect code from changes caused by other developers? If a dev is smart enough to write perfect code the first time, then the extra time needed to write the unit test will be trivial. |
If I write unit tests, when I change my code, then I need to go update all my tests. | Having that safety net of tests for when you do change your code is one of the benefits of unit tests. Play out the scene - you change your code, 7 tests fail - that's highlighted what would likely break in production. Better to find out about breaking changes from your unit tests rather than from angry customers. |
Devs will just write bad tests that don't actually add value, so it's a waste of time. | Any tool can be abused. Use common sense criteria to help devs write good tests - like code coverage and checking boundary cases. |
I just don't have time | For non-dependent C# code, you don't have time not to. Here's the thing - the code has got to be tested anyway. What is your faster alternative to prove that the code indeed works, and that it wasn't broken by someone else after you moved on? |
Monday, April 19, 2010
Unit testing random methods
You can unit test random methods by running the method in a loop and checking for statistical results.
For example, say you have a method to return a random integer between 1 and 10 (this could just as easily be to return any type between any range). You could run the test 100,000 times and confirm that the statistical distribution makes sense. With a sufficient sample size, there should be at least one of each value. The mathematically advanced could apply better statistics, like checking the proper distribution.
Here's a simple sample. It runs the random method in a loop, and checks just that each value was returned at least once.
public class MathHelper
{
private static Random _r = new Random();
public static int GetRandomInt()
{
//return Random int between 1 and 10
//recall that upper bound is exclusive, so we use 11
return _r.Next(1, 11);
}
}
[TestMethod]
public void Random_1()
{
int iMinRandomValue = 1;
int iMaxRandomValue = 10;
//Initialize results array
//ignore 0 value
int[] aint = new int[11];
for (int i = iMinRandomValue; i <= iMaxRandomValue; i++)
{
aint[i] = 0;
}
//Run method many times, record result
//Every time a number is returned, increment its counter
const int iMaxLoops = 1000;
for (int i = 0; i < iMaxLoops; i++)
{
int n = MathHelper.GetRandomInt();
aint[n] = aint[n] + 1;
}
//assert that each value, 1-10, was returned
for (int i = iMinRandomValue; i <= iMaxRandomValue; i++)
{
Assert.IsTrue(aint[i] > 0,
string.Format("Value {0} never returned.", i));
}
}
Of course, this assumes you're explicitly trying to test the random method - you could mock it out if the method is just a dependency and you're trying to test something else.
Tuesday, February 16, 2010
Where 100% Code Coverage is not sufficient
Sometimes I wish development were as easy as telling junior guys to "follow this one metric", and then they write perfect code. However, it's not. One example is how to know when you've written enough unit tests. Code Coverage is the obvious metric, and therefore "100% Code Coverage" sounds great. But there are plenty of cases where even 100% coverage doesn't do the job.
Case 1: Regular expressions
Take an email validator, something like so:
public static bool IsEmail(string s)
{
return System.Text.RegularExpressions.Regex.IsMatch(s,
@"\b[A-Z0-9._%+-]+@[A-Z0-9.-]+\.[A-Z]{2,4}\b");
}
A single test would give 100% coverage, but obviously there's a lot of other paths to check. Ironically, because regular expressions are often used to validate input data, and it's an in-memory operation (no databases or external files to hit), it's a prime candidate for lots of unit tests to catch all the boundary conditions - as opposed to just the 1 test needed to reach 100% coverage.
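As a sketch, the extra boundary tests might look like the following (assuming IsEmail lives on a hypothetical Validator class). Note that the pattern above only lists A-Z and never passes RegexOptions.IgnoreCase, so a lowercase address is rejected - exactly the kind of gap that a single 100%-coverage test never surfaces:

[TestMethod]
public void IsEmail_BoundaryCases()
{
    //a plain valid address (upper-case, to match the character classes above)
    Assert.IsTrue(Validator.IsEmail("HOMER@EXAMPLE.COM"));

    //clearly invalid inputs
    Assert.IsFalse(Validator.IsEmail("not an email"));
    Assert.IsFalse(Validator.IsEmail("HOMER@"));
    Assert.IsFalse(Validator.IsEmail("@EXAMPLE.COM"));

    //passes with the pattern as written because there's no RegexOptions.IgnoreCase -
    //whether that's a bug or by design, only extra tests like this one surface it
    Assert.IsFalse(Validator.IsEmail("homer@example.com"));
}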
Case 2: Single-line expressions
Similar to the previous case, merely calling this method with one set of inputs (say the "less-than" path, such as i1=5 and i2=10), will get 100% coverage. But that wouldn't test the "greater-than" and "equal to" conditions.
public static bool IsGreater(int i1, int i2)
{
return (i1 > i2);
}
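Covering all three paths only takes a few asserts - something like this (using the same hypothetical Foo helper class as the next case):

[TestMethod]
public void IsGreater_AllThreePaths()
{
    Assert.IsTrue(Foo.IsGreater(10, 5));   //greater-than
    Assert.IsFalse(Foo.IsGreater(5, 10));  //less-than
    Assert.IsFalse(Foo.IsGreater(5, 5));   //equal-to
}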
Case 3: Missing Asserts (bad logic)
Even with 100% coverage, it doesn't guarantee that the method logic is correct.
For example, say you've got a CSV-parsing method:
public static string[] ParseCsvString(string strLine)
{
string[] astr = strLine.Split(',');
return astr;
}
That is tested by:
[TestMethod]
public void ParseCsvString()
{
string[] astr = Foo.ParseCsvString("a, b, c");
}
This will give 100% coverage. However, there are no asserts, so it's really just showing that the method didn't throw an exception. Even if a developer adds an assert, they need to make sure it's asserting the right thing. An assert that the returned array is not null, or has a length of 3, would still miss that the method never trims the whitespace after each comma - we want elements like "b" (no whitespace), not " b". In other words, we'd want the ParseCsvString method to loop through each item and Trim() it.
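As a sketch, a test with meaningful asserts might look like the one below - and it would actually fail against the method above, because the second element comes back as " b" with a leading space, which is exactly what the assert-free version can never tell you:

[TestMethod]
public void ParseCsvString_TrimsWhitespace()
{
    string[] astr = Foo.ParseCsvString("a, b, c");
    Assert.AreEqual(3, astr.Length);
    Assert.AreEqual("a", astr[0]);
    Assert.AreEqual("b", astr[1]); //fails as written - actual value is " b"
    Assert.AreEqual("c", astr[2]);
}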
Case 4: Mocking giving a false sense of security
Mocking Frameworks, like TypeMock, are very powerful tools for increasing unit test coverage. These tools allow you to "mock out" a method call within code, such as that database or logging call that would be hard to run in a test method.
While this is great for testing legacy code, it can easily be abused. If every line is mocked out, there's nothing real that's left to test. So while it does get high coverage, if used incorrectly, it becomes meaningless.
Thursday, January 28, 2010
Four tool approaches to automated web UI tests
We all know that in the ideal world, our web apps would have "sufficient" automated tests. But how? I'm not an expert here, but I've come across four general strategies:
Description | Example | PRO | CON |
Send web requests, and then parse the corresponding response. | MSTest WebTests | | |
Directly tap into the ASPX pipeline (I'm no expert here, but to my limited knowledge, it seemed different from merely the request-response model). | NUnitAsp (but this was officially ended) | | |
Recording and playing back the mouse and keyboard movements. | Don't know offhand, but I've heard of them with COM+ | | |
Run the browser in an automated script. | WatiN | | |
Personally, I've seen the best luck with WatiN. Especially in the Ajax age, automated tests need to run JavaScript. I also find that to get a team to adopt a new tool, it's invaluable to let them run it themselves (i.e. anyone can download the open-source tool and run it without management paying $$$ for a license), and to provide tutorials (i.e. WatiN has an active community).
Wednesday, January 20, 2010
Five Ironies of Unit Testing
I am a huge advocate of unit testing. After years of writing tests, and encouraging other devs to write tests, I find five common ironies:
- The devs who would most benefit from unit tests are the devs who are least likely to write them - and vice versa. The star devs, who would write the code correctly to begin with, are also the ones most open to unit testing. Likewise, the low-quality-code developers who shun testing are the ones whose code could benefit the most from it.
- Writing unit tests actually saves time - not just in integration testing but also in development - because it stubs out the context, allowing you to immediately jump to the area that needs testing instead of spending 5 minutes setting up the scenario.
- Developers often punt on unit testing because "my manager doesn't support it", but unit testing is really an encapsulated development detail that doesn't need managerial support (although of course their support is appreciated).
- Many devs generate the unit tests after they write the code ("those ivory-tower architects said we needed tests"), but tests are most beneficial before you write the code because they force you to think what the code does, and they make it faster to write the code.
- The same teams who don't want to write unit tests are relieved to have such tests on the code they need to maintain.
Sunday, September 20, 2009
I don't have time to put on the parachute
If you're jumping out of a plane at 5000 feet - you take the time to put on the parachute. Sure, you could save yourself a few seconds, and for the first 1000 feet having that parachute doesn't matter yet when you're still free-falling ("I saved schedule time by skipping the 'put-on-parachute' task!"). However, by the time you hit the ground, that parachute is life-saving.
Same thing with software projects and best practices. The project is like jumping out of a plane, and the best practices are like the parachute. Sure, some "best practices" are just useless marketing buzzwords. However, others - like unit testing - are the real deal. And a developer or manager who rejects unit testing (for new .Net code in the middle tier) because they "don't have time" is like jumping out of the plane without putting on your parachute. You save a bit of time upfront, but when the project goes to production and "crashes" into reality, the maintenance costs and untested boundary cases will kill it.
"I don't have time" sounds much more noble and business-like than "I don't understand that idea", or "I just don't feel like doing something new." But within the (very common) context of new .Net class-library development - given that testing frameworks are free (NUnit, MSTest), and that any developer can at least write their tests locally - regardless of management support, and that in certain scenarios it's actually faster to develop code by writing unit tests (because it stubs out the context), the "I don't have time" reply to unit testing certain scenarios is misguided. Sure there are always exceptions, and reasons that good devs may not write unit tests. Testing databases, or legacy code, or the UI, or integration is a different beast. However, in general, testing public methods in a C# class library, without external data dependencies, should be as reasonable as putting a parachute on while jumping out of a plane.
Sunday, July 19, 2009
Would you still write unit tests even if you couldn't automatically re-run them tomorrow?
I am constantly amazed at how difficult it is to encourage software engineering teams to adopt unit testing. Everyone knows (wink wink) that you should test your own code, and we all love automation, and all the experts are pushing for it, and we all know how expensive bug fixes are, etc... Yet, there are still many experienced and good-hearted developers who simply don't write unit tests.
I think a critical question may be "Would you still write unit tests even if you couldn’t automatically re-run them tomorrow?"
Here's why - most managers who push unit tests do so saying something like "Yeah, it's a lot of extra work to write all that testing code right now, but you'll sure be glad in a month when you can automatically re-run them. Oh, and by the way, you can't go home today until you fix these three production issues."
The problem is this demotes unit testing to yet another "invest now; reward later" methodology. This is a crowded field, so it's easy to ignore a newcomer like "unit testing". Obviously, most devs live in the here and now, and they're just trying to survive today, so they care much more about "invest now, reward now".
The "trick" with unit testing - at least with basic unit testing to at least get your foot in the door - is that it adds immediate value today. Even if you can't automate those tests tomorrow, it can often still help get the current code done faster and better. How is this possible?
- Faster to develop - Unit testing is faster to develop because it stubs out the context. Say you have some static method buried deep within your web application. If it takes you 5 minutes to set up the data, recompile the host app, navigate to the page, and do whatever action triggers your method being called - that's a huge lag time. If you could write a unit test that directly calls that method, such that you can bypass all that rigmarole and run the static method in 5 seconds - and now you need to test 10 different boundary cases - you've just saved yourself a good chunk of time.
- Think through your own code - Unit testing forces you to dog food your own code (especially for class-library APIs). It also forces you to think through boundary conditions - per the previous point, if it takes several minutes to test one usage of a function, and that function has many different boundary cases, a time-pressed developer simply won't test all the cases.
- Better design - Testable code encourages a more modular design that is more flexible to change, and easier to debug. Think of it like this: in order to write the unit test, you've got to be able to call the code from a context-free class library; i.e. if a unit test can call it, then so could a windows service, web service, console app, windows app, or anything else. Every external dependency (i.e. the things that usually break in production due to bad configuration) has been accounted for.
Even if you could never re-run those unit tests after the code was written, they are still a good ROI. The fact that you can automatically re-run them, and get all the additional benefits, is what makes unit testing such a winner for most application development.
RELATED: Is unit testing a second class citizen?, How many unit tests are sufficient?, Backwards: "I wanted to do Unit Tests, but my manager wouldn't let me"
Thursday, March 19, 2009
Is unit testing a second class citizen?
Especially with the successful track record of unit tests, no project wants to be caught rejecting the notion of "unit testing your software". However, for many projects, unit testing seems like a second-class citizen. Sure, people speak the buzzwords, but they don't actually believe it, hence they diminish unit tests as some secondary thing unworthy of time or resources, as opposed to "real code".
- Will developers actually spend time writing unit tests while they develop a feature (not just as an afterthought)?
- Will developers (including architects) design their code such that it's conducive to unit testing?
- Will a broken test get a manager's attention, or is it just some nuisance to be worked around?
- When a business logic bug is found, is there enough of an infrastructure such that you could write a unit test to catch it (the test initially fails because the code is broken, then passes once you fix the bug)?
- Will developers invest mental energy in learning to write better tests, such as reading blogs or books on testing, or experimenting with better testing techniques?
- Will developers write unit tests even when no-one's looking, or is it just some "tax" to appease an architect or manager?
- Will management support the hardware for it, like having an external build server with the right software (while NUnit is free, MSTest still requires a VS license)?
- Will a broken unit test cause the build to fail?
- During code reviews, will other devs review your unit tests, similar to how a QA person reviews functionality?
- Does the number of unit tests increase as the rest of the project grows?
- Is the team concerned with code coverage?
Wednesday, January 14, 2009
Why good-intentioned devs might not write good unit tests
I'm a big fan of unit testing. A related question to "How many tests are sufficient?" is "Why don't we write good unit tests?" While I've seen some people attribute it to purely negative things like laziness or dumbness or lack of care for code quality, I think that misses the mark. While sure, there are some devs who don't write tests for those reasons, I think there are tons of other devs who are hard-working, smart, and do care about their work, but still don't write good or sufficient unit tests. Calling these hard-working coworkers "dumb" isn't going to make anything better. Here are some reasons why a good-intentioned developer might not write tests.
I think I already write sufficient unit tests for my code.
I don't have time - the tests take too long to initially write.
I don't have time - the tests take too long to maintain and/or they keep breaking.
The unit tests don't really add value. It's just yet another buzzword. They don't actually catch the real errors. So it's not the best use of my time.
It's so much faster to just (real quick) run through my feature manually because all the context is already there (the data, the web session, the integration with other features, etc...).
My code isn't easily testable - unit tests are great for business logic in C#, but I write code other than C# (SQL, JS), or things that aren't business logic (like UI rendering), or my code is too complex for unit tests.
My code isn't easily testable - there are too many dependencies and limits. For example, I can't even reference an ASP.Net CodeBehind in a unit test.
The tests take too long to run (the full test suite takes about 10 minutes, even without the database tests it still takes 3 minutes).
I write code that already works, so it doesn't require unit tests.
My code is so simple so that it doesn't need tests. For example, I'm not going to test every option in a switch-case.
Sounds great, but I just don't know how to write tests for my code.
Note that I absolutely don't offer these as excuses, but rather as practical ideas to help understand a different perspective so you can improve things. For example, if someone is working on a 2-million line project that takes 5 minutes just to compile, let alone run any sort of test, they might skip running the tests with an "I don't have time" mindset. Yes, I still think it's overall faster to write and run the tests, but at least it helps you understand their perspective so you can try to meet them half way (perhaps improve their machine hardware, split up the solution, split up the tests, etc...). Or, if someone thinks that unit tests don't catch "real errors", then you can have a discussion with concrete examples. Either way, understanding someone's reasons for doing something will help bridge the gap.
Sunday, September 21, 2008
Getting Categories for MSTest (just like NUnit)
Short answer: check out: http://www.codeplex.com/testlistgenerator.
Long answer: I previously discussed Automating Code Coverage from the Command Line. Another way to buff up your unit tests is to add categories to them. For example, suppose you had several long-running tests (perhaps unit testing database procedures), and you wanted the ability to run them separately, using categories, from the command line.
The free, 6-year-old, open-source NUnit has category attributes, which you can easily filter tests by. For a variety of reasons, MSTest - the core testing component of Microsoft's flagship development product - after 2 releases (2005 and 2008) still does not have these. I've had the chance to ask people at MSDN events about this, and I've heard a lot of "reasons":
Proposal for why MSTest does not need categories: | Problem with that: |
Just put all those tests in a separate DLL | |
Just add the description tag to them and then sort by description in the VS GUI | |
Just filter by any of MSTest's advanced filter GUI features - like class or namespace. | |
Just use MSTest test lists | |
Well, you shouldn't really need categories, just run all your unit tests every time. | |
So, if you want to easily get the equivalent of NUnit categories for your MSTest suite, check out http://www.codeplex.com/testlistgenerator.
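For reference, here's roughly what the NUnit version looks like - you tag the test with a category attribute, then include or exclude that category from the console runner (the "Database" category name is just an example):

[Test]
[Category("Database")]
public void Order_SaveAndReload()
{
    //long-running test that hits the database
}

//then, from the command line:
//  nunit-console.exe MyTests.dll /include:Database   (run only these)
//  nunit-console.exe MyTests.dll /exclude:Database   (skip them)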
Thursday, September 18, 2008
Automating Code Coverage from the Command Line
We all know that unit tests are a good thing. One benefit of unit tests is that they allow you to perform code coverage. This works because as each test runs the code (using VS features), it can keep track of which lines got run. The difficulty with code coverage is that you need to instrument your code such that you can track which lines are run. This is non-trivial. There have been open-source tools in the past to do this (like NCover). Then, starting with VS2005, Microsoft incorporated code coverage directly into Visual Studio.
VS's code coverage looks great for a marketing demo. But, the big problem (to my knowledge) is that there's no easy way to run it from the command line. Obviously you want to incorporate coverage into your continuous build - perhaps even add a policy that requires at least x% coverage in order for the build to pass. This is a form of automated governance - i.e. how do you "encourage" developers to actually write unit tests? One way is to not even allow the build to accept code unless it has sufficient coverage. So the build fails if a unit test fails, and it also fails if the code has insufficient coverage.
So, how to run Code Coverage from the command line? This article by joc helped a lot. Assuming that you're already familiar with MSTest and Code Coverage from the VS GUI, the gist is to:
In your VS solution, create a "*.testrunconfig" file, and specify which assemblies you want to instrument for code coverage.
Run MSTest from the command line. This will create a "data.coverage" file in something like: TestResults\Me_ME2 2008-09-17 08_03_04\In\Me2\data.coverage
This data.coverage file is in a binary format. So, create a console app that will take this file, and automatically export it to a readable format.
Reference "Microsoft.VisualStudio.Coverage.Analysis.dll" from someplace like "C:\Program Files\Microsoft Visual Studio 9.0\Common7\IDE\PrivateAssemblies" (this may only be available for certain flavors of VS)
Use the "CoverageInfo " and "CoverageInfoManager" classes to automatically export the results of a "*.trx" file and "data.coverage"
Now, within that console app, you can do something like so:
//using Microsoft.VisualStudio.CodeCoverage;
CoverageInfoManager.ExePath = strDataCoverageFile;
CoverageInfoManager.SymPath = strDataCoverageFile;
CoverageInfo cInfo = CoverageInfoManager.CreateInfoFromFile(strDataCoverageFile);
CoverageDS ds = cInfo.BuildDataSet(null);
This gives you a strongly-typed dataset, which you can then query for results, checking them against your policy. To fully see what this dataset looks like, you can also export it to xml. You can step through the namespace, class, and method data like so:
//NAMESPACE
foreach (CoverageDSPriv.NamespaceTableRow n in ds.NamespaceTable)
{
//CLASS
foreach (CoverageDSPriv.ClassRow c in n.GetClassRows())
{
//METHOD
foreach (CoverageDSPriv.MethodRow m in c.GetMethodRows())
{
}
}
}
You can then have your console app check for policy at each step of the way (classes need x% coverage, methods need y% coverage, etc...). Finally, you can have the MSBuild script that calls MSTest also call this coverage console app. That allows you to add code coverage to your automated builds.
[UPDATE 11/5/2008]
By popular request, here is the source code for the console app:
[UPDATE 12/19/2008] - I modified the code to handle the error: "Error when querying coverage data: Invalid symbol data for file {0}"
Basically, we needed to put the data.coverage and the dll/pdb in the same directory.
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using Microsoft.VisualStudio.CodeCoverage;
using System.IO;
using System.Xml;
//Helpful article: http://blogs.msdn.com/ms_joc/articles/495996.aspx
//Need to reference "Microsoft.VisualStudio.Coverage.Analysis.dll" from "C:\Program Files\Microsoft Visual Studio 9.0\Common7\IDE\PrivateAssemblies"
namespace CodeCoverageHelper
{
class Program
{
static int Main(string[] args)
{
if (args.Length < 2)
{
Console.WriteLine("ERROR: Need two parameters:");
Console.WriteLine(" 'data.coverage' file path, or root folder (does recursive search for *.coverage)");
Console.WriteLine(" Policy xml file path");
Console.WriteLine(" optional: display only errors ('0' [default] or '1')");
Console.WriteLine("Examples:");
Console.WriteLine(@" CodeCoverageHelper.exe C:\data.coverage C:\Policy.xml");
Console.WriteLine(@" CodeCoverageHelper.exe C:\data.coverage C:\Policy.xml 1");
return -1;
}
//string CoveragePath = @"C:\Tools\CodeCoverageHelper\Temp\data.coverage";
//If CoveragePath is a file, then directly use that, else assume it's a folder and search the subdirectories
string strDataCoverageFile = args[0];
//string CoveragePath = args[0];
if (!File.Exists(strDataCoverageFile))
{
//Need to march down to something like:
// TestResults\TimS_TIMSTALL2 2008-09-15 13_52_28\In\TIMSTALL2\data.coverage
Console.WriteLine("Passed in folder reference, searching for '*.coverage'");
string[] astrFiles = Directory.GetFiles(strDataCoverageFile, "*.coverage", SearchOption.AllDirectories);
if (astrFiles.Length == 0)
{
Console.WriteLine("ERROR: Could not find *.coverage file");
return -1;
}
strDataCoverageFile = astrFiles[0];
}
string strXmlPath = args[1];
Console.WriteLine("CoverageFile=" + strDataCoverageFile);
Console.WriteLine("Policy Xml=" + strXmlPath);
bool blnDisplayOnlyErrors = false;
if (args.Length > 2)
{
blnDisplayOnlyErrors = (args[2] == "1");
}
int intReturnCode = 0;
try
{
//Ensure that data.coverage and dll/pdb are in the same directory
//Assume data.coverage in a folder like so:
// C:\Temp\ApplicationBlocks10\TestResults\TimS_TIMSTALL2 2008-12-19 14_57_01\In\TIMSTALL2
//Assume dll/pdb in a folder like so:
// C:\Temp\ApplicationBlocks10\TestResults\TimS_TIMSTALL2 2008-12-19 14_57_01\Out
string strBinFolder = Path.GetFullPath(Path.Combine(Path.GetDirectoryName(strDataCoverageFile), @"..\..\Out"));
if (!Directory.Exists(strBinFolder))
throw new ApplicationException( string.Format("Could not find the bin output folder at '{0}'", strBinFolder));
//Now copy data coverage to ensure it exists in output folder.
string strDataCoverageFile2 = Path.Combine(strBinFolder, Path.GetFileName(strDataCoverageFile));
File.Copy(strDataCoverageFile, strDataCoverageFile2);
Console.WriteLine("Bin path=" + strBinFolder);
intReturnCode = Run(strDataCoverageFile2, strXmlPath, blnDisplayOnlyErrors);
}
catch (Exception ex)
{
Console.WriteLine("ERROR: " + ex.ToString());
intReturnCode = -2;
}
Console.WriteLine("Done");
Console.WriteLine(string.Format("ReturnCode: {0}", intReturnCode));
return intReturnCode;
}
private static int Run(string strDataCoverageFile, string strXmlPath, bool blnDisplayOnlyErrors)
{
//Assume that datacoverage file and dlls/pdb are all in the same directory
string strBinFolder = System.IO.Path.GetDirectoryName(strDataCoverageFile);
CoverageInfoManager.ExePath = strBinFolder;
CoverageInfoManager.SymPath = strBinFolder;
CoverageInfo myCovInfo = CoverageInfoManager.CreateInfoFromFile(strDataCoverageFile);
CoverageDS myCovDS = myCovInfo.BuildDataSet(null);
//Clean up the file we copied.
File.Delete(strDataCoverageFile);
CoveragePolicy cPolicy = CoveragePolicy.CreateFromXmlFile(strXmlPath);
//loop through and display results
Console.WriteLine("Code coverage results. All measurements in Blocks, not LinesOfCode.");
int TotalClassCount = myCovDS.Class.Count;
int TotalMethodCount = myCovDS.Method.Count;
Console.WriteLine();
Console.WriteLine("Coverage Policy:");
Console.WriteLine(string.Format(" Class min required percent: {0}%", cPolicy.MinRequiredClassCoveragePercent));
Console.WriteLine(string.Format(" Method min required percent: {0}%", cPolicy.MinRequiredMethodCoveragePercent));
Console.WriteLine("Covered / Not Covered / Percent Coverage");
Console.WriteLine();
string strTab1 = new string(' ', 2);
string strTab2 = strTab1 + strTab1;
int intClassFailureCount = 0;
int intMethodFailureCount = 0;
int Percent = 0;
bool isValid = true;
string strError = null;
const string cErrorMsg = "[FAILED: TOO LOW] ";
//NAMESPACE
foreach (CoverageDSPriv.NamespaceTableRow n in myCovDS.NamespaceTable)
{
Console.WriteLine(string.Format("Namespace: {0}: {1} / {2} / {3}%",
n.NamespaceName, n.BlocksCovered, n.BlocksNotCovered, GetPercentCoverage(n.BlocksCovered, n.BlocksNotCovered)));
//CLASS
foreach (CoverageDSPriv.ClassRow c in n.GetClassRows())
{
Percent = GetPercentCoverage(c.BlocksCovered, c.BlocksNotCovered);
isValid = IsValidPolicy(Percent, cPolicy.MinRequiredClassCoveragePercent);
strError = null;
if (!isValid)
{
strError = cErrorMsg;
intClassFailureCount++;
}
if (ShouldDisplay(blnDisplayOnlyErrors, isValid))
{
Console.WriteLine(string.Format(strTab1 + "{4}Class: {0}: {1} / {2} / {3}%",
c.ClassName, c.BlocksCovered, c.BlocksNotCovered, Percent, strError));
}
//METHOD
foreach (CoverageDSPriv.MethodRow m in c.GetMethodRows())
{
Percent = GetPercentCoverage(m.BlocksCovered, m.BlocksNotCovered);
isValid = IsValidPolicy(Percent, cPolicy.MinRequiredMethodCoveragePercent);
strError = null;
if (!isValid)
{
strError = cErrorMsg;
intMethodFailureCount++;
}
string strMethodName = m.MethodFullName;
if (blnDisplayOnlyErrors)
{
//Need to print the full method name so we have full context
strMethodName = c.ClassName + "." + strMethodName;
}
if (ShouldDisplay(blnDisplayOnlyErrors, isValid))
{
Console.WriteLine(string.Format(strTab2 + "{4}Method: {0}: {1} / {2} / {3}%",
strMethodName, m.BlocksCovered, m.BlocksNotCovered, Percent, strError));
}
}
}
}
Console.WriteLine();
//Summary results
Console.WriteLine(string.Format("Total Namespaces: {0}", myCovDS.NamespaceTable.Count));
Console.WriteLine(string.Format("Total Classes: {0}", TotalClassCount));
Console.WriteLine(string.Format("Total Methods: {0}", TotalMethodCount));
Console.WriteLine();
int intReturnCode = 0;
if (intClassFailureCount > 0)
{
Console.WriteLine(string.Format("Failed classes: {0} / {1}", intClassFailureCount, TotalClassCount));
intReturnCode = 1;
}
if (intMethodFailureCount > 0)
{
Console.WriteLine(string.Format("Failed methods: {0} / {1}", intMethodFailureCount, TotalMethodCount));
intReturnCode = 1;
}
return intReturnCode;
}
private static bool ShouldDisplay(bool blnDisplayOnlyErrors, bool isValid)
{
if (isValid)
{
//Is valid --> need to decide
if (blnDisplayOnlyErrors)
return false;
else
return true;
}
else
{
//Not valid --> always display
return true;
}
}
private static bool IsValidPolicy(int ActualPercent, int ExpectedPercent)
{
return (ActualPercent >= ExpectedPercent);
}
private static int GetPercentCoverage(uint dblCovered, uint dblNot)
{
uint dblTotal = dblCovered + dblNot;
return Convert.ToInt32( 100.0 * (double)dblCovered / (double)dblTotal);
}
}
}
Monday, August 6, 2007
Easily insert huge amounts of test data
I've plugged the free MassDataHandler tool before - an open source tool that lets you use XML script to easily insert data. One limitation of the tool (perhaps to be solved in future releases) is the inability to easily insert mass amounts of data.
Say you want a table to have 10,000 rows so you can do some SQL performance tuning. Ideally you could specify some algorithm to dictate the number of rows and how each row of data is unique. For example, you may want to say "Insert 10,000 rows of company data, using different company names of the form 'co' + i.ToString(), such as co1, co2, co3, etc...".
You can easily do this. First you could use the MDH to insert the parent data. Then for specific high-volume tables, you could use the SQL while loop to specify the insert strategy, like so:
Declare @i int
select @i = 1
WHILE (@i <= 10000)
BEGIN
  --Define the dynamic data to insert
  Declare @co varchar(10)
  select @co = 'co' + cast(@i as varchar(10))
  --Do the SQL insert
  Insert into MyTable ( [co], [SomeColumn] )
  Values( @co, 'someData' );
  --increment the counter
  select @i = (@i + 1)
END
This would quickly insert 10,000 rows of test data into MyTable. You could customize the technique for other tables, adding multiple inserts in the loop, or adjusting for a multi-column unique index.
Living in Chicago and interested in working for a great company? Check out the careers at Paylocity.
Sunday, October 15, 2006
Backwards: "I wanted to do Unit Tests, but my manager wouldn't let me"
I've heard this before when I give interviews or meet new developers at tradeshows. The thinking seems to go that "While I'd personally love to do this best practice of 'Unit Tests', adding them takes a lot longer (like adding a new development phase), which costs extra money, therefore I need managerial approval." This is fundamentally backwards.
The whole point of Unit Tests is that:
- They save you time: Obviously with regression testing, but also by stubbing out the context so you can very quickly test things in isolation (without wasting tons of time constantly re-setting up that context). They also help you to see all the boundary test cases, and hence prevent future bugs.
- They are free to use - open source tools like NUnit can be downloaded for free and instantly used for your own personal development. It's not like you need to purchase a separate expensive tool, or hire out some auditor to review your code.
- You write tests as you develop, not afterwards.
Here's an analogy: Think of your schedule like a bucket, and your tasks are like rocks that fill up the bucket. You can't increase the size of your bucket, or decrease the number of rocks, therefore the bucket (i.e. your schedule) seems full. However, there are gaps between the rocks (just like there are gaps between tasks - like setting up the context and regression testing). You could pour sand into a full bucket, in the cracks in between the rocks. That's what unit tests are like. If you do them as you develop, you can squeeze them into your schedule without overflowing it.
Sunday, September 3, 2006
Creating Database Unit Tests - new install UI
The MassDataHandler is a framework to assist with Database Unit Testing. The framework makes it very easy to insert test data into a database, which in turn makes it very easy to write database unit tests. The user merely needs to specify their relevant data in a simple XML fragment, and then the framework uses knowledge of the database schema to do all the grunt work and convert that XML data into SQL, from which it populates the database.
It is an open-source .Net app available here.
We recently improved the install process for the app by having a GUI form collect environmental info for SQL 2005, MSTest, and MSBuild. The install script uses the concept of Having an MSBuild script collect user input.
If you're looking to test your data layer, check out this free tool. It's now easier to install.
Monday, August 21, 2006
Creating Database Unit Tests in 10 minutes.
Unit testing the database is hard - and there are mixed reactions to it. My take is that there could be legitimate logic in the database (transactions, SQL aggregations, reusable functions, complex stored procedures like paging, etc...). The problem with unit-testing the database is creating a test database and filling it with data. At Paylocity, we created a tool called the MassDataHandler to help us unit-test our data layer. We recently open-sourced this tool, and you can download the beta here.
The MassDataHandler is a framework to assist with Database Unit Testing. The framework makes it very easy to insert test data into a database, which in turn makes it very easy to write database unit tests. The user merely needs to specify their relevant data in a simple XML fragment, and then the framework uses knowledge of the database schema to do all the grunt work and convert that XML data into SQL, from which it populates the database.
The XML fragments that contain the test data are robust and refactorable. You can include expressions like variable substitution and identity lookup, default row templates, and include statements to import sub-templates. Because the framework already knows the database schema, you only need to specify the relevant data, and the framework can auto-populate the rest of the row’s required columns with dummy data.
For example, the following Xml script would insert data into three tables - providing you variables, default row templates, and automatically handling the identity lookups and non-null fields:
<Root>
<Variables>
<Variable name="lastName" value="Simpson" />
</Variables>
<Table name="Customer">
<Row CustomerName="Homer $(lastName)" />
<Row CustomerName="Marge $(lastName)" />
</Table>
<Table name="Product">
<Row ProductName="KrustBurger" Description="best burger ever" />
</Table>
<Table name="Order">
<Default LastUpdate="12/23/1997" />
<Row CustomerId="Customer.@1" ProductId="Product.@1" />
<Row CustomerId="Customer.@2" ProductId="Product.@1" />
</Table>
</Root>
Download the MDH Framework Beta here.
By making it so easy to insert and maintain test data, the MDH framework helps you write database unit tests.
Thursday, May 18, 2006
How to send the Console output to a string
Sometimes when writing a tool or automated process, you want to read in the console output and send it to a string. From there, you can do whatever you want to it, such as parse for info. For example, a third-party tool (not necessarily even built with .Net) may display result information to the console, and you may want that info as part of some bigger process. The classes to do this reside in System.Diagnostics.
public static string GetConsoleOutput(string strFileName, string strArguments)
{
ProcessStartInfo psi = new ProcessStartInfo();
StringBuilder sb = new StringBuilder();
try
{
psi.WindowStyle = ProcessWindowStyle.Hidden;
//Need to include this for Executable version (appears not to be needed if UseShellExecute=true)
psi.CreateNoWindow = true;
psi.FileName = strFileName;
psi.Arguments = strArguments;
psi.UseShellExecute = false;
psi.RedirectStandardOutput = true;
psi.RedirectStandardError = true;
Process p = Process.Start(psi);
sb.Append(p.StandardOutput.ReadToEnd());
sb.Append(p.StandardError.ReadToEnd());
p.WaitForExit();
}
catch (Exception ex)
{
sb.Append("\r\nPROCESS_ERROR:" + ex.ToString()); ;
}
return sb.ToString();
}
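A quick usage sketch (the tool and arguments here are just placeholders):

string strOutput = GetConsoleOutput("ipconfig.exe", "/all");
Console.WriteLine(strOutput);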