Barry S. Stahl, MCSD
Developer Information Blog

Unit Test "Normalization"

Friday, July 06, 2007

In a recent conversation about unit tests, I was asked how many asserts I would put into a single test, since some feel that there should be only one assert per test. My answer was that I look at it like database normalization, with the test name serving as the primary key; that is, the asserts in the test should relate directly and only to that key. The analogy is also appropriate because DB normalization is a good thing within reason, but can definitely be overdone. Unit test "normalization" can likewise be overdone if we try to break out each assert into its own test.

An example of where multiple asserts might be put into one test is a test of the Add method of a collection object which inherits from System.Collections.CollectionBase. When an item is added, it is appropriate to test that the proper index of that item is returned from the method, as well as to test that the collection is holding the correct number of items after the Add is done. Both asserts relate directly to the Add method. An argument could be made that the count of items relates to the Count property of the collection, and therefore that assert doesn't relate only to the Add method. But since we are usually not coding the Count property (because it was coded for us in CollectionBase), we don't need to test the Count property on its own, and it should be tested as part of the Add test.
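
As a rough sketch of the Add test described above, the following assumes the Visual Studio (MSTest) unit testing framework. NameCollection and the test names are hypothetical and exist only for illustration.

```csharp
using System.Collections;
using Microsoft.VisualStudio.TestTools.UnitTesting;

// Hypothetical collection used only for illustration. Add is the member we
// wrote ourselves; Count is inherited from CollectionBase.
public class NameCollection : CollectionBase
{
    public int Add(string name)
    {
        // IList.Add returns the index at which the item was stored.
        return List.Add(name);
    }
}

[TestClass]
public class NameCollectionTests
{
    [TestMethod]
    public void Add_ReturnsIndexAndGrowsCollection()
    {
        NameCollection names = new NameCollection();
        names.Add("first");

        int index = names.Add("second");

        // Both asserts relate directly, and only, to the "key" named by the test.
        Assert.AreEqual(1, index, "Add should return the index of the new item.");
        Assert.AreEqual(2, names.Count, "The collection should hold both items after Add.");
    }
}
```

Splitting this into two tests would duplicate the arrangement code without making either failure easier to diagnose.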



Removing Assemblies from the GAC

Saturday, June 30, 2007

I recently stumbled across an interesting item in a back-issue of MSDN Magazine. The article, "Improving Application Startup Time" by Claudio Caldato, appeared in the CLR Inside Out segment in February 2006. While discussing strong-named assemblies, Claudio recommended adding them to the GAC for performance.

If an assembly is not installed in the Global Assembly Cache (GAC), you will pay the cost of hash verification of strong-named assemblies along with native code generation (NGEN) image validation if a native image for that assembly is available in the machine. In other words, if an assembly is strong named, the CLR will ensure the integrity of the assembly binary by verifying that the cryptographic hash of the assembly matches the one in the assembly manifest. But if the assembly is in the GAC, this verification can be skipped because the verification is performed as part of installation into the GAC and any update requires administrative permissions. So the CLR is basically assured that changes have not occurred.

The hash verification process is expensive because it involves touching every page in the assembly, which can be bad for cold startup. Also, the hash computation is CPU-intensive and thus impacts warm startup, too. The extent of the impact depends on the size of the assembly being verified.

If an assembly has been precompiled using NGEN but it is not installed in the GAC, then during binding, fusion needs to verify that the native image and the MSIL assembly are the same version (to avoid cases where a newer version of the assembly is deployed on the machine but a newer version of the native image is not generated). In order to accomplish that, the CLR needs to access pages in the MSIL assembly, which can hurt cold startup time.

I found this particularly interesting because I generally do not recommend putting assemblies into the GAC unless there is a particular need. The GAC is a very useful and powerful tool, but it does add complexity to the deployment of applications, occasionally limiting the frequency with which applications can be deployed, and often increasing the testing requirements for deployment of applications that use shared assemblies. As a result, I usually avoid putting assemblies in the GAC unless they truly need to be there (such as shared .dlls in applications that require that they be using the same version of the assembly). I have also heard of people pulling assemblies that were installed in the GAC, back out into bin-folder type deployments in order to simplify the deployment process.

The information from this article adds a wrinkle to the process of removing assemblies from the GAC because it makes the best-practice for doing so include the removal of the strong-name (which was required for inclusion in the GAC). As a result, there may be a performance penalty incurred at each application startup for these apps if the strong-name is left in place. Since removal of the strong-name will not always be possible, this is certainly something to consider. While I doubt that this could cause enough of a performance decrease by itself to make it worth keeping assemblies in the GAC that would otherwise be removed, it is a fact worth knowing, and more importantly, worth testing when considering such a move.



Testing Properties with Inconsistent Accessibility

Monday, June 25, 2007

I ran into an interesting problem today while attempting to test a property member which had a public getter, but an internal setter (a .NET 2.0 construction in C#).

Consider the following class:

Class Definition

This class features an internal constructor, along with a private field (_id) which is exposed by a property (ID) that is read-only on the public interface, but read-write internally to the assembly. This class looks as shown below in Reflector. Notice that the ID property is recognized as having a public getter, but the setter is marked as internal.
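
Since the original listing is an image, here is a minimal reconstruction based only on the description above. The class name TestClass comes from the compiler error quoted later in this post; the int type of the ID property is an assumption.

```csharp
public class TestClass
{
    // Backing field for the ID property (the type is assumed to be int).
    private int _id;

    // Internal constructor: only code inside this assembly can create instances.
    internal TestClass()
    {
    }

    // Read-only on the public interface, read-write inside the assembly.
    public int ID
    {
        get { return _id; }
        internal set { _id = value; }
    }
}
```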

Class in Reflector

Using the Visual Studio 2005 test generator (right-click on the class and select Create Unit Test), I get the following:

Test Method - As Generated

The problem is, this test will not work (notice the blue squiggly). The error is:

Property or indexer 'TestClass.ID' cannot be assigned to -- it is read only

It appears that the code-generator only looks at the primary property scope declaration, sees that it is public, and ignores the internal qualifier on the setter. As a result, the code-generated accessor for the object does not contain an accessor for the ID property, and the generated test will not compile since the property is, in fact, settable only from inside the assembly.

The work-around here is actually quite simple: do within the test what the code-generated accessor object normally does for us:

Test Method - Workaround
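
The workaround listing is an image, so the following is only a sketch of one plausible form, using plain reflection to reach the non-public setter. The helper InternalSetterWorkaround is hypothetical, and TestClass is repeated here (with an assumed int ID) so the block stands on its own.

```csharp
using System;
using System.Reflection;

public class TestClass
{
    private int _id;

    internal TestClass()
    {
    }

    public int ID
    {
        get { return _id; }
        internal set { _id = value; }
    }
}

public static class InternalSetterWorkaround
{
    public static int SetAndReadId(int value)
    {
        // Passing true for nonPublic lets Activator use the internal constructor.
        TestClass target = (TestClass)Activator.CreateInstance(typeof(TestClass), true);

        // The property itself is public, so the default lookup finds it.
        PropertyInfo property = typeof(TestClass).GetProperty("ID");

        // GetSetMethod(true) returns the setter even though it is not public.
        MethodInfo setter = property.GetSetMethod(true);
        setter.Invoke(target, new object[] { value });

        // Read the value back through the public getter.
        return target.ID;
    }
}
```

In a real test method, the value returned here would be compared with the expected value using Assert.AreEqual.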

I figure that Microsoft knows about this problem by now, but I couldn't find anything about it on the net. I may not have been searching using the appropriate terminology. I'll send Scott Guthrie a note just in case. If you are aware of another way around this problem, or if you know of a way to get the code-generator to act properly under these conditions, I would be very interested to hear about it.



Owning Code is Evil

Wednesday, June 13, 2007

Commenting on a Rich Skrenta post, the point of which is that we should write as little code as possible, Jeff Atwood writes:

I couldn't agree more. I've given similar advice when I exhorted developers to Code Smaller. And I'm not talking about a reductio ad absurdum contest where we use up all the clever tricks in our books to make the code fit into less physical space. I'm talking about practical, sensible strategies to reduce the volume of code an individual programmer has to read to understand how a program works. Here's a trivial little example of what I'm talking about:

if (s == String.Empty)
if (s == "")

It seems obvious to me that the latter case is better because it's just plain smaller. And yet I'm virtually guaranteed to encounter developers who will fight me, almost literally to the death, because they're absolutely convinced that the verbosity of String.Empty is somehow friendlier to the compiler. As if I care about that. As if anyone cared about that!

I certainly agree that we should endeavor to write as little code as we can, for all of the reasons spelled out in Rich’s post. The example that Jeff gives is, however, in my opinion totally contrary to the true intent. While he is correct that nobody should care about whether or not code is better for the compiler, String.Empty is generally better to use than double-quotes because it is explicit, and therefore much easier for a human to read. There can be no doubt what String.Empty means, and there can be no doubt about what value it holds. While most editors and compilers eliminate the worry about control characters between double-quotes, there is no easy way to be sure, especially if you are viewing the code in Notepad. As a result, we are generally better off typing the few extra characters. Remember that the goal is to create code that is easier (and therefore cheaper) to create, edit and maintain. Saving a few typed characters here and there does not help achieve that goal; it is simplicity of the code that does. Since one of the key factors in achieving simplicity is clarity, we should do whatever we can to make our code as clear and explicit as possible. This usually includes things like avoiding the use of default values, even though explicitly defining those values will cost us extra keystrokes.

Far more important in terms of code-reduction than saving keystrokes is avoiding owning code that someone else, usually Microsoft, is willing to own for us. I don't know how many times I have seen developers create their own serialization mechanism when .NET serialization would have worked fine, or create their own collection implementation from scratch rather than inheriting from System.Collections.CollectionBase. Please don’t misunderstand me; there are times when it is appropriate to do these things, if the canned mechanisms truly won’t work for the use-case. If these already-existing frameworks will work however, it is imperative that we allow Microsoft to own that code, and allow the thousands of other .NET developers out there to test it for us.



NullReferenceException Reading From SettingsPropertyValueCollection

Thursday, October 05, 2006

While working on a custom Profile provider, I needed to set the values in a SettingsPropertyValueCollection object to pass to the SetPropertyValues method of the provider. Using the code below, I was always getting a NullReferenceException when the provider attempted to read the values out of the collection.

        Dim objProperties As New System.Configuration.SettingsPropertyValueCollection
        Dim objProperty As New System.Configuration.SettingsProperty("BirthDate")
        Dim objPropertyValue As New System.Configuration.SettingsPropertyValue(objProperty)
        ' The value lives on the SettingsPropertyValue, not on the SettingsProperty.
        objPropertyValue.PropertyValue = #2/14/2004#
        objProperties.Add(objPropertyValue)

The problem occurs because the property has no type, so the value cannot be resolved as it comes back out of the collection. By modifying the code as follows, I specify the type of the property, and the process executes as expected.

        Dim objProperties As New System.Configuration.SettingsPropertyValueCollection
        Dim objProperty As New System.Configuration.SettingsProperty("BirthDate")
        ' Specify the type so the value can be read back out of the collection.
        objProperty.PropertyType = GetType(System.DateTime)
        Dim objPropertyValue As New System.Configuration.SettingsPropertyValue(objProperty)
        objPropertyValue.PropertyValue = #2/14/2004#
        objProperties.Add(objPropertyValue)


nUnit vs. VSTS

Wednesday, October 04, 2006

Mark Michaelis posted a hitlist of things to do to convert from nUnit to VSTS tests in his article Converting a class library to a VSTS Test Project. A big part of this process is understanding the attribute translation:


nUnit                    VSTS
[TestFixture]            [TestClass]
[TestFixtureSetUp]       [ClassInitialize]
[TestFixtureTearDown]    [ClassCleanup]
[SetUp]                  [TestInitialize]
[TearDown]               [TestCleanup]
[Test]                   [TestMethod]
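
To show where each translated attribute lands, here is a small sketch of a converted fixture. SeedTests and its members are hypothetical. One difference that I believe matters during conversion: the VSTS class-level methods must be static, and the [ClassInitialize] method takes a TestContext parameter.

```csharp
using Microsoft.VisualStudio.TestTools.UnitTesting;

[TestClass]                                   // nUnit: [TestFixture]
public class SeedTests
{
    private static int _sharedSeed;
    private int _value;

    [ClassInitialize]                         // nUnit: [TestFixtureSetUp]
    public static void ClassInit(TestContext context)
    {
        _sharedSeed = 40;                     // runs once, before the first test
    }

    [ClassCleanup]                            // nUnit: [TestFixtureTearDown]
    public static void ClassDone()
    {
        _sharedSeed = 0;                      // runs once, after the last test
    }

    [TestInitialize]                          // nUnit: [SetUp]
    public void BeforeEachTest()
    {
        _value = _sharedSeed;                 // runs before every test
    }

    [TestCleanup]                             // nUnit: [TearDown]
    public void AfterEachTest()
    {
        _value = 0;                           // runs after every test
    }

    [TestMethod]                              // nUnit: [Test]
    public void Value_StartsFromSharedSeed()
    {
        Assert.AreEqual(40, _value);
    }
}
```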



SQL ERD for Membership and Other ASP.NET 2.0 Services

Friday, September 29, 2006

Image: SQL ERD for Membership and Other ASP.NET 2.0 Services



Solving DataSet Constraint Problems

Tuesday, August 29, 2006

Roy Osherove and one of the commenters on his ISerializable blog explain how to find the source of a constraint problem in a DataSet. To do so, simply set the DataSet.EnforceConstraints property to False, then load your data. Once everything is loaded, set the EnforceConstraints property back to True, while trapping for the error. Once the error occurs, you can iterate through the Tables, testing the HasErrors property. For each table with errors, iterate through its rows testing the same property. Rows that have errors will have a property called RowError that describes the specific problem with that row.
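
The steps above can be sketched roughly as follows. ConstraintDiagnostics and the sample "Customers" table are hypothetical; the sample deliberately loads a duplicate key while constraints are switched off.

```csharp
using System;
using System.Collections.Generic;
using System.Data;

public static class ConstraintDiagnostics
{
    // Re-enables constraints and, if that fails, collects the RowError text
    // of every offending row.
    public static List<string> FindConstraintErrors(DataSet dataSet)
    {
        List<string> messages = new List<string>();
        try
        {
            dataSet.EnforceConstraints = true;
        }
        catch (ConstraintException)
        {
            foreach (DataTable table in dataSet.Tables)
            {
                if (!table.HasErrors) continue;

                foreach (DataRow row in table.Rows)
                {
                    if (row.HasErrors)
                    {
                        messages.Add(table.TableName + ": " + row.RowError);
                    }
                }
            }
        }
        return messages;
    }

    // Hypothetical sample data: two rows that violate a unique constraint.
    public static DataSet BuildSampleWithDuplicateKey()
    {
        DataSet dataSet = new DataSet("Sample");
        DataTable table = dataSet.Tables.Add("Customers");
        DataColumn id = table.Columns.Add("Id", typeof(int));
        table.Constraints.Add(new UniqueConstraint("UQ_Id", id));

        // Load with constraints off so the bad data gets in.
        dataSet.EnforceConstraints = false;
        table.Rows.Add(1);
        table.Rows.Add(1);
        return dataSet;
    }
}
```

Calling FindConstraintErrors(BuildSampleWithDuplicateKey()) should return at least one message naming the Customers table and the violated constraint.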

Roy's original article is: DataSet hell - "Failed to enable constraints. One or more rows contain values....".


Emitting XML

Thursday, July 20, 2006

In Five Ways to Emit Test Results as XML, James McCaffrey provides a number of methods for producing XML data from your applications. His analysis is primarily centered around using the XML output for testing purposes but the information applies to any .NET application that uses XML.

XSL vs. Regular Expressions

Friday, May 19, 2006

I had an interesting discussion today with a colleague on the use of XSL vs. Regular Expressions. During the course of the conversation, I broke the process of translation down into 3 steps: pattern recognition, data interpretation, and data mapping. XSL excels at all 3 of these tasks, while Regular Expressions can do all 3, but excel primarily at pattern recognition. The result of the conversation was that Regular Expressions should be used in all situations where only pattern recognition needs to be done, such as in data validation (e.g. does this string look like an email address), and would be excellent when only 1 item of data needs to be interpreted (e.g. grab the email address from this string and do something with it). When multiple data items need to be interpreted and mapped, XSL is clearly the better choice. Also, XSL is almost always the proper solution when the data is in XML format and is to stay in XML format.
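
As a small illustration of the two Regular Expression cases mentioned above (validation, and pulling out a single item), here is a sketch using System.Text.RegularExpressions. EmailPatterns is a hypothetical helper, and the pattern is deliberately loose: it checks that a string "looks like" an email address and is not a full validator.

```csharp
using System.Text.RegularExpressions;

public static class EmailPatterns
{
    // Loose, illustrative pattern: local part, "@", domain, dot, top-level domain.
    private const string EmailPattern =
        @"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}";

    // Pattern recognition only: does the whole string look like an address?
    public static bool LooksLikeEmail(string input)
    {
        return Regex.IsMatch(input, "^" + EmailPattern + "$");
    }

    // Single-item interpretation: grab the first address found in the text,
    // or null if there is none.
    public static string ExtractFirstEmail(string input)
    {
        Match match = Regex.Match(input, EmailPattern);
        return match.Success ? match.Value : null;
    }
}
```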

Generics Concerns

Thursday, April 27, 2006

Another feature that concerns me (see my earlier post) is .NET Generics. While it is nice to be able to specify a strongly-typed collection without having to create a class, it seems to me like this is still the house of sticks, rather than the house of bricks we really want. Also, extensibility seems to suffer in this model because we are limiting our encapsulation. I certainly have not used this enough to say one way or another for certain; I just currently have concerns.

.NET 2.0 Concerns

Thursday, April 27, 2006

I am seeing some things in .NET 2.0 that concern me. Much of it has to do with Microsoft putting in features that have obviously been demanded by many developers, but were not included in earlier versions of the framework because, for the most part, they are the wrong thing to do the majority of the time. For example, Microsoft has included the ability to have inline code as well as the standard code-behind model in ASP.NET 2.0 pages. While this seems like a nice feature, I can't come up with a good reason to ever mix my object code and HTML code. Perhaps someone else can. If you do, please let me know.

Sample Using Statement in VB.NET 2005

Thursday, April 27, 2006

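        Using wsBlogService As New BlogService.Blog
            Try
                ' Show the blog content returned by the web service.
                Me.Text = wsBlogService.DisplayBlog("BlogName", 0)
            Catch objException As System.Exception
                ' The original format string was lost; "{0} ({1})" is a stand-in.
                Me.Text = String.Format("{0} ({1})", objException.Message, wsBlogService.Url)
            End Try
        End Using ' wsBlogService is disposed here, even if an exception occurred.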