timstall

Thursday, September 1, 2005

Migrating an app from Excel to .Net, Part I

[This was originally posted at http://timstall.dotnetdevelopersjournal.com/migrating_an_app_from_excel_to_net_part_i.htm]

I recently had the challenge of improving upon an Excel workbook. This was a complicated sheet, with 200 columns of ugly calculations that solved a difficult engineering problem. There were many rows, one for each widget being engineered. I'm going to blog about several issues I encountered for this.

The first issue I needed to deal with: should I extend Excel (with VBA) or rewrite the application in .Net. I see the following pros and cons:

	Extend Excel with VBA	Rewrite in .Net
Pro	Potentially much quicker - not rewriting all the complex calculations Excel is already designed for this kind of thing. It's easy to copy new rows, and it has a built in tools like GoalSeek (a numeric solver) of which there is no clear counterpart available in .Net (perhaps I could implement Newton's Method with numeric derivatives, but that takes some work would be very error prone for me to mess up). Excel is easy to maintain for the end user, especially if they need to make a change.	Powerful platfom Easier to test code with Test-Driven-Development (what if you need to modify a formula in your Excel book, and you want to make sure everything still works) Splits out input, logic, and view into separate components. For example you could have a WinForm take (and validate) the input, a backend DLL could do all the logic, and then you could generate an Xml file or Html document with the final output results.
Con	Limited power, VBA is messy to deal with compared to .Net with C#, Visual Studio, and the .Net Framework Excel combines the input, logic, and view all into one sheet. For example is you wanted to show the client only the calculated output, you'd either send them the entire sheet (which lets them see your calculations unless you do some fancy cell-hiding), or copy those values to a new document.	Lot of extra work Hard for the end user to change or maintain

The final solution I'm looking at: make a hybrid. Continue using Excel for user input and calculations, extending it with VBA. But use .Net to read the Excel document and generate output views. I'll discuss this more in my next blog.

Monday, August 29, 2005

10 Tips on CodeSmith SubTemplates, Batch Mode, and XmlProperties

[This was originally posted at http://timstall.dotnetdevelopersjournal.com/10_tips_on_subtemplates_batch_mode_and_xmlproperties.htm]

I've finished using CodeSmith to help with generating some of our tiers (or should I say "just started"?). So I posted some things I've learned from it: 10 Tips on SubTemplates, Batch Mode, and XmlProperties on the CodeSmith Peer Support Forum. This covers solutions I've found for:

Passing values between subtemplates
Handling that Sub template does not inherit parent's merge strategy
Handling that target of RenderToFile must exist
Handling Multi-level subtemplates
Throwing and using errors in batch mode
Embed XmlProperty filename instead of data
Dynamically make your own batch file
Only write new content
Reusable code for XmlProperties
Instantiate an XmlProperty at runtime

Details on the forum.

Wednesday, August 24, 2005

Using Sysinternal PsTools to find what has locked that process

[This was originally posted at http://timstall.dotnetdevelopersjournal.com/using_sysinternal_pstools_to_find_what_has_locked_that_proce.htm]

Every get one of those frustrating errors "Cannot access file because it is being used by another process"? For example, you may be trying to delete or rename a file, but this error message won't let you. Sometimes it is as simple as closing an application that obviously has locked that file, but sometimes it's just more complicated - perhaps some service (which has no GUI, and therefore may be less intuitive) has locked the file. Ideally you could just click the file in Windows Explorer and it would tell you what locked the file and what to do.

There is a solution - Sysinternals provides a free tool, Process Explorer, that lets you find all the processes accessing a file. The tools requires no MSI; you can directly copy it to your machine and run it. It's like Task Manager on steroids (in a good sense).

Initially one may make sure that they don't have the file open in some sort of editor, but the problem may persist even if everything is closed. Go to Find > Find Handle, and type in the name or directory of the file. Click search, and it will show you all processes accessing that file. You can now kill those processes such that you can delete the file.

This is actually very educational - you'll see that IIS, other instances of Windows Explorer, certain background services, or apps that you wouldn't think of, all can lock files.

Monday, August 22, 2005

CodeSmith: Beyond Just Generation - Doing Simple Data Analysis

[This was originally posted at http://timstall.dotnetdevelopersjournal.com/codesmith_beyond_just_generation__doing_simple_data_analysis.htm]

I've made several CodeSmith related posts on this blog:

There's another topic I'd like to explore - using CodeSmith for analyzing Xml data and Database schemas. For example, suppose you wanted to compare if two tables had similar schemas. For example I wanted to see if one database had different column lengths than another for all tables in the database. There's several ways you can do this:

Manual --> bad (too slow, error prone, tedious, not-reproducible)
Build your own custom console app, use ADO.Net --> very slow to create, to easy to have errors
Find a custom tool specifically for DB comparisons --> yet another tool to learn, may provide far more functionality than needed
Use CodeSmith's Schema Explorer --> has potential!

CodeSmith's Schema explorer is extensive - it lets you start at the database and drill down to its collections of tables, columns, and individual properties. So it has the info needed for most tasks. It's also easy to use, so it saves you time in writing your own equivalent object. It's also a reusable tool. I.e. if you need to learn a technology, learn one that is helps out with more than just the current problem. You could create two SchemaExplorer.DatabaseSchema objects, and then cycle through their tables and columns, comparing them.

CodeSmith also has the XmlProperty, which lets you analyze an Xml document. You could program compare the Xml doc to the database schema.

The "generated code" could be the results of your analysis. For example, wherever the condition you're searching for is met, you could Response.WriteLine the necessary info.

While CodeSmith is designed for CodeGeneration, it also can be used for simple analysis of data sources that are commonly used as inputs for that generate - such as Xml docs and Database schemas.

Sunday, August 14, 2005

Codesmith: Code Generation v. Refactoring

[This was originally posted at http://timstall.dotnetdevelopersjournal.com/codesmith_code_generation_v_refactoring.htm]

Looking into CodeSmith, a good code generation tool, I started analyzing the differences between code generation and refactoring. Both help reduce redundancy. They are complimentary, with each concept having a different purpose:

Code Generation - Given input values and a template, automatically generate redundant code that varies only by the input values. This deals with how you create code.
Refactoring - Keep the functionality of your code the same, but improve the code itself via eliminating redundancy, improving clarity, etc... This deals with the final code, regardless of how that code was created.

Whenever you find yourself making repetitive code, you should at least check if these techniques can bail you out. Common examples may be (depending on your project) the data-access layer, the structure of business entities, documentation, unit tests, etc... Note that code generation and refactoring aren't opposites - but rather complementary. You can create a refactored class that still has some repetitiveness, and then code-generate it. For example, a fully refactored business entity may still have many properties that all follow the same format. For example, say your class has 10 such properties below. Given only the DataType and PropertyName, a code-generation template could crank this out:

public Double MyValue
{
    get
    {
        return _dblMyValue;
    }
    set
    {
        _dblMyValue = value;
    }
}

There are certainly places where repetitive code can be refactored such that there is no longer any redundancy, and therefore nothing to code-generate. A good code-generation tool is no excuse for not refactoring properly. However there are many places where refactoring alone is insufficient:

Design time: Code generation is done at design-time, and thus offers a performance benefit over refactoring (executed at run time). Say you needed a class to access the properties of your business entities. You could use reflection at design time, or (assuming you know the entities) you could use code-generation to create the necessary code before hand.
More functionality: Certain things can't effectively be refactored. If you needed to refactor a class that could handle many different Data Types, you'd need to use boxing./unboxing to handle the type-conversion. For example, pre-.Net 2.0 (which has generics) most collections, like HashTable or ArrayList, are not strongly typed. This requires boxing and unboxing which has obvious performance problems. .Net 1.1 does not provide a way to refactor the handling of various data types without the performance loss of boxing/unboxing. However you could use CodeSmith's strongly typed collection templates to automatically generate your own collections that don't require boxing. This is essentially still refactored because it provides additional functionality that the "refactored" version didn't meet - type safety.
Beyond source code: A good code generation tool, like CodeSmith, can create any text file, not just object-oriented source code. For example, suppose you wanted to document your database schema. CodeSmith provides pre-packaged templates that do this. I am not a aware of a way that "refactoring" would solve this problem.
Always automatic: Refactoring is great for object-oriented code with supporting unit tests. But it is difficult for many things outside of source code like documents or html pages. Short of other tools, these require a more manual approach. Code generation remains automatic

In conclusion, both refactoring and code generation are good things, but they are different and complimentary things.

Thursday, August 11, 2005

Things that fundamentally change how you program

[This was originally posted at http://timstall.dotnetdevelopersjournal.com/things_that_fundamentally_change_how_you_program.htm]

There are a lot of things that fundamentally change how you program - you see something and the light bulb goes off. For me, the following was such a checklist (not in any order)

Object Oriented Code --> Many VB6 developers haven't needed to deal with OOP yet. However VB.Net is now object-oriented.
Unit Tests --> automated tests for specific units of code to assist with development and regression (usually run with NUnit)
Regex Expressions --> Manipulating text
Xml --> Consistent and standard data storage
Code Generation --> eliminate redundant code that can't just be refactored.
Consciously seeking Tools --> tools help you avoid or simplify tedious tasks.

Many people do some of these, but I've seen many developers reluctant to invest in the other areas. They'll write lines of text-parsing code before spending two hours investing in Regular Expressions.

I'm currently looking more into code generation with CodeSmith. It is a great tool, and I plan to post some cool findings on it.

Sunday, August 7, 2005

Referencing lists of Html controls in Repeaters and DataGrids

[This was originally posted at http://timstall.dotnetdevelopersjournal.com/referencing_lists_of_html_controls_in_repeaters_and_datagrid.htm]

If you have list that you bind to a repeater, and you need to dynamically create html objects for each row (such as adding a checkbox or hidden field for each row). .Net provides ways to cleanly handle those Html controls using the "name" attribute.

Note that each html control can have an ID (which must be unique) and a Name (which need not be unique). If you assign your controls a common name attribute, such as giving each checkbox in the repeater rows the name "chkA", then you can easily access those fields. Say you had the following repeater:


	... other columns

On the server, you can create an array of the selected values like:

string[] s = Page.Request["ChkA"].Split(',');

On the client, you can access the fields via the document.getElementsByName('name'):

function DoClick() {
    var v = document.getElementsByName("ChkA");
    var s = "";
    for (var i = 0; i < v.length; i++) {
        if (v[i].checked) {
            s = s + v[i].value + ",";
        }
    }
}