Monday, July 30, 2007

More: How normal life experience helps you better understand software (Part II)

[This was originally posted at http://timstall.dotnetdevelopersjournal.com/more_how_normal_life_experience_helps_you_better_understand.htm]

I mentioned in my previous post how normal life (i.e. things besides software) can help you better understand software engineering because it sometimes explains concepts in a more easily-understood context. Here are some more examples:

  • Reuse - Developers are notorious for avoiding code reuse. However, in the real world, we wouldn't dream of constantly re-inventing the wheel. For example, you'd go to the hardware store to get standard-size nails and bolts (as opposed to smelting your own metal), you'd fuel your car at a standard gas station (as opposed to refining your own fuel), and you'd buy furniture (instead of growing your own wood, cutting the pieces, and assembling things yourself). While there are exceptions, we generally reuse standard things, and entire franchises and industries exist to provide them. The point is that there simply isn't enough time, or enough resources, to do everything from scratch. The same applies to software engineering.
  • Demand for simplicity and reliability - When you see a light switch, you expect a standard behavior: simply switch on or off, with perhaps an intermediate state to dim the lights. You just want to take the light switch for granted and move on to other, more important things. What you don't want is to constantly need to tinker with it and "hope" that it works. Same thing with software. People expect our software to just work, so that they can move on to their important tasks. So much software is like a broken light switch: you need to tinker with the interface, tweak the config file, add an external dependency, maybe recompile something, etc. Ideally, you can just run a simple install script (I like using MSBuild to automate all those tasks) and then take it for granted.
  • Resources limit you - In normal life, things cost money, and that limits us. If you want twice as much food for a party, you pay more for it. In programming it's easy to ignore the cost of resources (like disk space or CPU cycles) because the machine is so fast and the program is usually developed on a dev machine with a light load where the cost of resources is easily ignored. The obvious problem is that when the code goes into production, and there's literally 1000x more demand, it can screech to a halt. Practically, we've all seen developers squander resources (like using tons of unnecessary, yet expensive database hits) in a way that you'd never do in other aspects of daily living.
  • Get rid of junk - Physical objects, like cars and furniture, eventually wear down and become junk. For example, you wouldn't drive across the country in a car so worn that it could leave you stranded. While you may be able to salvage spare parts, it's time to move on. Code is the same way. Some code, via run-away bug lists, flawed design, obsolete technology, or obsolete purpose (business requirements totally change), essentially becomes junk. While parts may still be salvageable, bad code becomes a millstone hanging around your neck, and it may be best to move on (i.e. overhaul, rewrite, use new technologies). The problem is that many developers are emotionally attached to their code, and would rather sink with it than cast it away.

    Living in Chicago and interested in working for a great company? Check out the careers at Paylocity.

Wednesday, July 25, 2007

How normal life experience helps you better understand software

[This was originally posted at http://timstall.dotnetdevelopersjournal.com/how_normal_life_experience_helps_you_better_understand_softw.htm]

There are some things in software engineering that are hard to explain, or hard to convince others to do. One benefit of normal life experience, i.e. things besides programming, is that it can sometimes explain those difficult concepts more effectively by showing them in a different context.

Most excuses for bad code come down to "I don't have time to do it properly", or "It's just throw-away code for my own personal use, it will never be used in production." Yeah right. Many bad developer practices have analogies in everyday life that show how silly these excuses are.

For example, here's a partial list:

  • Bad Labeling - Many developers give their variables and methods useless names, such as "x1" or "DoEverything()". But we label things in normal life, like our luggage that we check in at the airport, or boxes when moving houses. Imagine how silly (and time-consuming) it would be to refuse to label your luggage because "you don't have time - I'll just look for the black suitcase".
  • Packaging and a clean contract - Lots of code has messy contracts: it's not clear how to call the code, or where the code's responsibility ends and the consumer's begins. Apply this to moving houses, where the contract is clear: you put things in designated moving boxes (packing them within those boxes however you see fit), and the movers haul them to the new location. Imagine the mess if you "didn't have time" to pack the boxes. Some movers will still do it for you, but it will cost a lot more.
  • Kicking off a process - A lot of developers program only in series. But in real life we often kick something off while we go do another thing, for example with chores like starting the dishwasher, letting things dry or melt, or letting plants grow. Once you kick these things off, they're easy to maintain. But if you wait until one such task is finished before starting the next, you'll never get all the chores done.
  • The cost of failure - In most engineering practices, failure can be devastating. If your car breaks down on the highway, it's bad. In civil engineering, a failure in a bridge or building could cause the entire structure to collapse and cost lives and tens of millions of dollars. In software engineering, a lot of developers don't really account for potential failure (missing error-checking code, security flaws, bad logic, etc.). Software has errors for several reasons: software engineering is still relatively new and people are still amazed that software actually works, management doesn't want to pay to ensure that the program works, or it's just hard to make something solid. Either way, in software engineering it can be easy to ignore the cost of failure, while in other fields that cost is much clearer.
  • The need for peer review - In most daily activities you'd ask for help if something is complicated, whether it's asking for directions while driving, or how to fix an appliance in your house. However, it still amazes me that many management teams develop incredibly complex applications, but don't want to "waste" time reviewing that error-prone work. It's almost as if some teams spend more time reviewing how to fix their $50 toaster than how to check their $500,000 software application.

Experience in software engineering is great, but some concepts are simply easier for people to understand outside of a software-engineering context. Once understood, those concepts can be re-applied to their engineering discipline.



Thursday, July 12, 2007

Using CodeSmith to create your own Domain-Specific-Language

[This was originally posted at http://timstall.dotnetdevelopersjournal.com/using_codesmith_to_create_your_own_domainspecificlanguage.htm]


Yesterday I mentioned software factories and domain-specific languages (DSLs). A DSL is just that: a higher-level language that maps to a specific problem domain, so that you aren't constantly re-inventing the wheel with a lower-level language. Some common examples of DSLs are:

  • SQL
  • Regular Expressions
  • XPath

Each of these could be achieved by coding in a "low level" language like C#, but you wouldn't think of doing that because it'd be too slow and error-prone. Each DSL is easy to use because it maps naturally to what you're trying to do in that domain.

This same concept applies to application development. For example, an application has different domains:

  • Initial data for your application (like security settings, roles, out-of-the-box dropdown values, etc...) - usually achieved with lots of custom SQL scripts.
  • UI formatting - usually achieved with tons of table or CSS references, or highly-refactored controls
  • Validation
  • Data access code

Each of these can have its own DSL, which you could easily create using a code generator like CodeSmith. You could abstract the concepts to an XML schema, and then use CodeSmith as a "compiler" to transform that XML into the appropriate output (SQL, HTML, or C# code files). CodeSmith's out-of-the-box XmlProperty feature, text-based templates, and huge online community make this very easy to do.

For example, instead of having tons of custom SQL scripts for your security data, you may have a hierarchical XML file that (1) is completely refactored and maps directly to the business needs (something potentially impossible with a flat list of SQL statements), (2) can be easily validated via the XML schema and CodeSmith checks, and (3) is much easier to track version history on.
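
To make the "XML in, SQL out" idea concrete, here is a minimal sketch in plain C# rather than an actual CodeSmith template. The XML shape (security, role, permission), the class name SecurityDslCompiler, and the target tables Role and RolePermission are all hypothetical, made up for illustration.

```csharp
using System;
using System.Text;
using System.Xml;

public static class SecurityDslCompiler
{
    // Hypothetical DSL shape:
    //   <security><role name="..."><permission name="..."/></role></security>
    // Hypothetical target tables: Role(Name), RolePermission(RoleName, PermissionName).
    public static string ToSql(string xml)
    {
        XmlDocument doc = new XmlDocument();
        doc.LoadXml(xml);

        StringBuilder sql = new StringBuilder();
        foreach (XmlElement role in doc.SelectNodes("/security/role"))
        {
            string roleName = Escape(role.GetAttribute("name"));
            sql.AppendLine("INSERT INTO Role (Name) VALUES ('" + roleName + "');");

            // Each nested <permission> becomes one row that links back to its role.
            foreach (XmlElement permission in role.SelectNodes("permission"))
            {
                string permissionName = Escape(permission.GetAttribute("name"));
                sql.AppendLine("INSERT INTO RolePermission (RoleName, PermissionName) VALUES ('"
                    + roleName + "', '" + permissionName + "');");
            }
        }
        return sql.ToString();
    }

    // Doubles single quotes so names are safe inside SQL string literals.
    private static string Escape(string value)
    {
        return value.Replace("'", "''");
    }
}
```

The hierarchy lives in the XML, so adding a permission to a role is a one-line change there, and the repetitive INSERT statements are regenerated instead of being edited by hand.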

Microsoft offers its own DSL toolkit, but I think it doesn't yet compete with CodeSmith because the MS DSL toolkit: (1) requires you to learn a whole new GUI syntax (whereas CodeSmith uses intuitive C# and XML templates), (2) seems limited in what it can generate, and (3) screws up Visual Studio by inserting a new reference (or something like that) into every project.

Once you start code-generating tedious stuff, such as using your own DSL, you'll never go back.



Wednesday, July 11, 2007

Practical Software Factories in .NET

[This was originally posted at http://timstall.dotnetdevelopersjournal.com/practical_software_factories_in_net.htm]

I'm a big fan of automation (especially via MSBuild and CodeSmith), so I'm very interested in the Software Factory community. I recently finished Practical Software Factories in .NET by Gunther Lenz and Christoph Wienands. As I understood it, they emphasize four main parts of a software factory:

  1. Product Line Development - emphasis on a specific type of software (web apps vs. 3D-shooters vs. WinForms)
  2. Reusable assets - class libraries, controls, etc.
  3. Guidance in Context - this could be as simple as instructions in the comments of generated code
  4. Model-driven development (i.e. Domain Specific Languages) - working at a high level of abstraction.

A lot of this boils down to standardization, automation, and reuse - things already covered by classic books (like The Pragmatic Programmer). However, the software factory methodology provides a structured way to achieve those things.

I also found this book interesting because it discusses concepts at a level that is both higher and more practical. For example, there are plenty of "syntax" books out there, like How to Program Technology X. There are also lots of conceptual books that address the theory and problems of software engineering, like Code Complete, The Pragmatic Programmer, Joel on Software, or The Mythical Man-Month. These transcend individual technologies, and are therefore still relevant as new technologies come out.

Practical Software Factories is different because it both addresses concepts and points to the current technologies, websites, articles, and open-source projects that achieve those concepts. So even a year or two from now, when the current crop of tools and articles has been replaced, its concepts will still be relevant, and likely implementable with a new wave of tools.



Wednesday, June 6, 2007

Using Generics for dynamic return type and validation.

[This was originally posted at http://timstall.dotnetdevelopersjournal.com/using_generics_for_dynamic_return_type_and_validation.htm]

Generics are a .NET 2.0 feature that essentially let you abstract out the type. You can learn a lot about generics on MSDN.

One problem that generics can solve is how to have a method dynamically return a given type. For example, in the snippet below, the method GetData returns a different type depending on the type argument you specify - either a double or an Int32. This is useful for creating a generalized method to parse out (and potentially validate) data. Note that the consumer of the GetData method need not deal with conversion - it receives a strongly typed value.

While this is just a trivial snippet, it's a nice demo of one of the features of generics.

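    [TestMethod]
    public void DemoGenerics()
    {
      // The type argument selects the return type; the caller needs no cast.
      int i = GetData<int>("123");
      double d = GetData<double>("456");

      Assert.AreEqual(123, i);
      Assert.AreEqual(456.0, d);
    }

    // Parses strData into the requested type T (only Int32 and Double are supported).
    public static T GetData<T>(string strData)
    {
      string strType = typeof(T).Name;

      switch (strType)
      {
        case "Int32":
          // Cast through object, because C# does not allow casting an int directly to T.
          return (T)(object)Convert.ToInt32(strData);
        case "Double":
          return (T)(object)Convert.ToDouble(strData);
        default:
          throw new NotSupportedException("Type not supported: " + strType);
      }
    }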
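
Since I mentioned validation, here is one possible variation, sketched under my own assumptions rather than taken from any library: the class name GenericParser and the TryGetData method are hypothetical. It swaps the switch statement for Convert.ChangeType, which handles the built-in convertible types, and adds a variant that reports failure instead of throwing.

```csharp
using System;
using System.Globalization;

public static class GenericParser
{
    // Hypothetical helper: converts strData to any type that Convert.ChangeType
    // supports for strings (int, double, decimal, bool, DateTime, ...).
    // The invariant culture is used so that "456.5" parses the same on every machine.
    public static T GetData<T>(string strData)
    {
        return (T)Convert.ChangeType(strData, typeof(T), CultureInfo.InvariantCulture);
    }

    // Hypothetical validating variant: returns false instead of throwing
    // when strData cannot be converted to T.
    public static bool TryGetData<T>(string strData, out T value)
    {
        try
        {
            value = GetData<T>(strData);
            return true;
        }
        catch (FormatException)
        {
            // The text is not in a valid format for T (e.g. "abc" for an int).
        }
        catch (OverflowException)
        {
            // The text is a number outside the range of T.
        }
        catch (InvalidCastException)
        {
            // T is not a type that a string can be converted to.
        }
        value = default(T);
        return false;
    }
}
```

A caller could then write something like `int age; if (!GenericParser.TryGetData<int>(input, out age)) { /* report bad input */ }`. The tradeoff is that the switch version makes the supported types explicit, while this version accepts whatever Convert.ChangeType can handle.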

 



Thursday, May 31, 2007

How to tune a SQL script

[This was originally posted at http://timstall.dotnetdevelopersjournal.com/how_to_tune_a_sql_script.htm]

Performance is critical, and slow SQL procs are often a huge performance bottleneck. The problem is how to measure it. Microsoft provides SQL Profiler to help with that.

While I'm certainly no DBA, here's a basic tutorial on how to tune a SQL script (check here for more on SQL profiler).

1. Get a database with production-sized data.

Because performance can vary non-linearly (i.e. there may be twice as much data, but it goes twenty times slower), you absolutely need to test with production-sized data, else all your measurements could be off.

2. Be able to run the SQL script in an isolated, deterministic environment

We don't want to chase ghosts, so make sure you have a deterministic environment: (A) no one else is changing the script-under-test, (B) you can call the script with a single SQL command (like exec for a SP, or select for a function), and (C) you can call the script repeatedly and get the same functional result every time. Once the script works, we can make it work fast.

3. Open up SQL Profiler with the Tuning template.

SQL Profiler lets you measure how long each SQL command took. This is invaluable if you have a complicated script with many sub-commands. It's almost like stepping through the debugger, where you can evaluate line-by-line.

  1. Open up SQL Profiler (either from SQL Studio > Tools > SQL Server Profiler, or from the Start > Programs menu).
  2. In SQL Profiler, go to File > New Trace, and connect as the SA user.
  3. In the "Use the template" dropdown, specify "Tuning"
  4. Click Run to start the trace
  5. Profiler starts recording every command being sent to the database server. To filter by the specific SPID that you're running your SP from, run the SP_WHO command in your SQL Studio window to get the SPID, and then in SQL profiler:
    1. Pause the SQL Profiler trace
    2. Go to File > Properties
    3. A new window opens up, go to the "Events Selection" tab
    4. Select SPID, and in the filter on the right enter the value into the "Equals" treeview option.
    5. Click OK and resume the trace

 

4. Run your SQL statements and check the Profiler

Simply run your SQL statements in SQL Studio, and check the results in SQL Profiler.

The tuning template in profiler will record every command and sub-command being run. It will show the total duration (in milliseconds) for each line, and the full SQL text of what was run. This allows you to identify the bottlenecks, and tune those by changing the SQL code to something more optimal.


5. Compare the output to ensure same functionality

If you don't have an exhaustive suite of database tests, you can still help ensure that your proc is functionally equivalent by comparing the original SQL output (before tuning) to the new SQL output (after tuning). For example, you could save the output resultset as a file and then use a file-diff tool like Beyond Compare to ensure they're identical.
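
If you'd rather automate that comparison than open a diff tool each time, a few lines of C# can do a basic check. This is only a sketch: the class name ResultSetComparer is hypothetical, and it assumes you have already saved the before and after resultsets as text files.

```csharp
using System;
using System.IO;

public static class ResultSetComparer
{
    // Returns the 1-based number of the first line that differs between the
    // two files, or 0 when the files are identical line by line.
    public static int FirstDifference(string beforePath, string afterPath)
    {
        using (StreamReader before = new StreamReader(beforePath))
        using (StreamReader after = new StreamReader(afterPath))
        {
            int lineNumber = 0;
            while (true)
            {
                string a = before.ReadLine();
                string b = after.ReadLine();
                lineNumber++;

                // Both files ended at the same point without any difference.
                if (a == null && b == null) return 0;

                // Either the text differs, or one file ended before the other.
                if (a != b) return lineNumber;
            }
        }
    }
}
```

One caveat: SQL does not guarantee row order unless the query has an ORDER BY clause, so a difference reported here may only mean the rows came back in a different order. Sorting the output before saving it avoids that false alarm.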

Summary

Again, books could be written on SQL tuning. This is just a brief high-level tutorial to get you started.

 



Wednesday, May 30, 2007

Development Trivia

[This was originally posted at http://timstall.dotnetdevelopersjournal.com/development_trivia.htm]

Real-world development involves so many miscellaneous facts and trivia that I'm going to experiment with writing a "Friday Trivia" blog post. The intent is to discuss trivia that arose during the week.

Specify the default editor for a file

In Windows Explorer, right-click a file, select "Open with" > "Choose program", and then check the checkbox that says "Always use the selected program to open this kind of file". You can then automatically open the file (in the designated editor) just by running System.Diagnostics.Process.Start(strFullFileName).

Selecting Comments with XPath

XPath is a powerful way to select nodes from an XML document. While XPath commonly selects elements, you can also use it to select comments. For example, you may want to select a comment if you're inserting a node into a document, and want the comment to be a placeholder for where you append that node.

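      XmlDocument xDoc = new XmlDocument();
      // Only the <employees> root is known from the XPath below;
      // the child markup shown here is illustrative.
      xDoc.LoadXml(@"
        <employees>
            <!-- Additional Notes -->
            <notes>Text notes</notes>
        </employees>
        ");

      // comment() matches comment nodes; this returns the first comment under <employees>.
      XmlNode n = xDoc.SelectSingleNode("/employees/comment()");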
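
To round out the placeholder idea, here is a small sketch of actually inserting a node at the comment. The class name CommentPlaceholder and the employee element are hypothetical; SelectSingleNode, CreateElement, and InsertAfter are the standard System.Xml calls.

```csharp
using System;
using System.Xml;

public static class CommentPlaceholder
{
    // Inserts a new <employee> element right after the first comment under
    // <employees>, and returns the resulting XML as a string.
    public static string InsertAtPlaceholder(string xml, string employeeName)
    {
        XmlDocument doc = new XmlDocument();
        doc.LoadXml(xml);

        XmlNode placeholder = doc.SelectSingleNode("/employees/comment()");
        if (placeholder == null)
        {
            throw new InvalidOperationException("No placeholder comment found.");
        }

        XmlElement employee = doc.CreateElement("employee");
        employee.InnerText = employeeName;

        // InsertAfter places the new node directly after the placeholder comment,
        // so the comment stays in place for the next insert.
        placeholder.ParentNode.InsertAfter(employee, placeholder);
        return doc.OuterXml;
    }
}
```

Because the comment is left in the document, repeated inserts keep landing at the same marked spot, regardless of how many other nodes surround it.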

 

MSBuild - Command line properties that contain CSV strings

In MSBuild, you can specify properties via a PropertyGroup element or with the /p: switch on the command line. These two are supposed to be identical, but they're not. You can specify a CSV string in a PropertyGroup just fine, but you can't in the command-line switch because it interprets commas as a property delimiter (just like semicolons). A workaround is to use another character (like a hyphen '-'), or to have the MSBuild script import the PropertyGroup from a separate file and write the CSV string to that file. Perhaps there's also a way to escape the comma (maybe with a caret '^').

