[This was originally posted at
http://timstall.dotnetdevelopersjournal.com/codesmith_code_generation_v_refactoring.htm]
Looking into CodeSmith, a good code generation tool, I started analyzing the differences between code generation and refactoring. Both help reduce redundancy. They are complimentary, with each concept having a different purpose:
- Code Generation - Given input values and a template, automatically generate redundant code that varies only by the input values. This deals with how you create code.
- Refactoring - Keep the functionality of your code the same, but improve the code itself via eliminating redundancy, improving clarity, etc... This deals with the final code, regardless of how that code was created.
Whenever you find yourself making repetitive code, you should at least check if these techniques can bail you out. Common examples may be (depending on your project) the data-access layer, the structure of business entities, documentation, unit tests, etc... Note that code generation and refactoring aren't opposites - but rather complementary. You can create a refactored class that still has some repetitiveness, and then code-generate it. For example, a fully refactored business entity may still have many properties that all follow the same format. For example, say your class has 10 such properties below. Given only the DataType and PropertyName, a code-generation template could crank this out:
public Double MyValue { get { return _dblMyValue; } set { _dblMyValue = value; } }
|
There are certainly places where repetitive code can be refactored such that there is no longer any redundancy, and therefore nothing to code-generate. A good code-generation tool is no excuse for not refactoring properly. However there are many places where refactoring alone is insufficient:
- Design time: Code generation is done at design-time, and thus offers a performance benefit over refactoring (executed at run time). Say you needed a class to access the properties of your business entities. You could use reflection at design time, or (assuming you know the entities) you could use code-generation to create the necessary code before hand.
- More functionality: Certain things can't effectively be refactored. If you needed to refactor a class that could handle many different Data Types, you'd need to use boxing./unboxing to handle the type-conversion. For example, pre-.Net 2.0 (which has generics) most collections, like HashTable or ArrayList, are not strongly typed. This requires boxing and unboxing which has obvious performance problems. .Net 1.1 does not provide a way to refactor the handling of various data types without the performance loss of boxing/unboxing. However you could use CodeSmith's strongly typed collection templates to automatically generate your own collections that don't require boxing. This is essentially still refactored because it provides additional functionality that the "refactored" version didn't meet - type safety.
- Beyond source code: A good code generation tool, like CodeSmith, can create any text file, not just object-oriented source code. For example, suppose you wanted to document your database schema. CodeSmith provides pre-packaged templates that do this. I am not a aware of a way that "refactoring" would solve this problem.
- Always automatic: Refactoring is great for object-oriented code with supporting unit tests. But it is difficult for many things outside of source code like documents or html pages. Short of other tools, these require a more manual approach. Code generation remains automatic
In conclusion, both refactoring and code generation are good things, but they are different and complimentary things.