Thursday, January 5, 2012

Detecting if a file is a merge in TFS VersionControl database

I was trying to run some metric calculations on files within a changeset, but I only wanted new files – i.e. I wanted to filter out merged, branched, or renamed files. For example, if someone created a branch, that shouldn’t count as adding 1000 new files.
One solution I found was to check the Command column of the TfsVersionControl.dbo.tbl_Version table. I realize TfsVersionControl is a transactional database and that reports are encouraged to run against TfsWareHouse, but the warehouse didn’t seem to contain this field.
Here’s the relevant SQL (NOTE: this is for VS2008; I haven’t tested it on VS2010).
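-- List each file version in one changeset and flag whether it counts as new code
select
      CS.ChangeSetId, FullPath, Command, CreationDate,
      case
            -- Command values that represent adds or edits (see the tables below)
            when Command in (2,5,6,7,10,34) then cast(1 as bit)
            else cast(0 as bit)
      end as IsNew
from TfsVersionControl..tbl_Version V with (nolock)
      inner join TfsVersionControl..tbl_ChangeSet CS with (nolock)
      on V.VersionFrom = CS.ChangeSetId
where CS.ChangeSetId = 20123   -- example changeset id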

The question becomes: what does the tbl_Version.Command column mean, and why those specific values? I couldn’t find official documentation (probably because running queries against this database is discouraged), so I ran a distinct search across 50,000 changesets to find all the values in use, then worked backwards, comparing them against the Team Explorer UI. My conclusion is that the column appears to be a bit field of actions:

Command     Bit value
add         1
edit        2
type        4
rename      8
delete      16
undelete    32
branch      64
merge       128


Recall there can be multiple actions (hence the bit field), such as merge and edit. So, if you want to find new code – i.e. adds or edits – you’d take the following values: 2, 5, 6, 7, 10, and 34. Each of these includes the add or edit bit, and none includes the branch (64) or merge (128) bit.

Is New?   Command value   Actions
Yes       2               edit
Yes       5               add (folder)
Yes       6               type, edit
Yes       7               add (add file)
No        8               rename
Yes       10              rename, edit
No        16              delete
No        24              delete, rename
No        32              undelete
Yes       34              undelete, edit
No        68              branch
No        70              branch, edit
No        84              branch, delete
No        128             merge
No        130             merge, edit
No        136             merge, rename
No        138             merge, rename, edit
No        144             merge, delete
No        152             merge, delete, rename
No        160             merge, undelete
No        162             merge, undelete, edit
No        196             merge, branch
No        198             merge, branch, edit
No        212             merge, branch, delete
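
The bit values can also be unpacked programmatically. The Python sketch below is only an illustration, not anything taken from TFS: the function names are made up, and it assumes bit 4 is the "type" flag, as the combined value 6 ("type/edit") suggests. It also checks that a simple bitmask rule (an add or edit bit is set, and no branch or merge bit is set) reproduces the Is New? column for every value observed above.

```python
# Flag names as inferred in this post; treating bit 4 as "type" is an assumption.
FLAGS = {
    1: "add", 2: "edit", 4: "type", 8: "rename",
    16: "delete", 32: "undelete", 64: "branch", 128: "merge",
}

ADD_OR_EDIT = 1 | 2          # bits that indicate new or changed content
BRANCH_OR_MERGE = 64 | 128   # bits that indicate content copied from elsewhere


def decode_command(command):
    """Return the names of all flags set in a Command value."""
    return [name for bit, name in FLAGS.items() if command & bit]


def is_new(command):
    """True when the value has an add or edit bit and no branch or merge bit."""
    return bool(command & ADD_OR_EDIT) and not (command & BRANCH_OR_MERGE)


# Values observed in the post, mapped to the Is New? column.
OBSERVED = {
    2: True, 5: True, 6: True, 7: True, 8: False, 10: True, 16: False,
    24: False, 32: False, 34: True, 68: False, 70: False, 84: False,
    128: False, 130: False, 136: False, 138: False, 144: False, 152: False,
    160: False, 162: False, 196: False, 198: False, 212: False,
}

# Any observed value where the bitmask rule disagrees with the table.
mismatches = [v for v, expected in OBSERVED.items() if is_new(v) != expected]

print(decode_command(138))  # ['edit', 'rename', 'merge']
print(mismatches)           # []
```

If the rule holds, the hard-coded list in the query could in principle be replaced by a bitwise test such as (Command & 3) <> 0 and (Command & 192) = 0, though I have not tried that against a real TFS database.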


Of course, this is induction, and it’s possible I may have missed something, but given a large sampling and lots of spot-checking, it appears to be reliable.

Wednesday, January 4, 2012

It’s not your code, but it is your opportunity

I occasionally hear developers say “my code”, as in “I’ll check in my code at the end of next week”, or “My code doesn’t need unit tests”.
In one sense, I want developers to think “this is my code” so they take pride in doing the best job possible. But really, it’s not your code, it’s the company’s code – they’re paying for it, and often legally they own it (i.e. it would be illegal to take chunks of code you wrote at one company and either privately sell it, or check it into another company’s source code repository).
This perspective really changes the discussion: “The company would like their code to be checked in on a regular basis”, or “The company would like their code to be properly tested”.
However, it is the developer's "opportunity to learn" – i.e. the company keeps the code, but the developer keeps the improved skill from writing that code.

Tuesday, January 3, 2012

31 User Groups in the Midwest

Clark Sell did a great series on the various user groups in the Midwest. He provided a helpful recap here:
Here's the link for our Lake County .Net Users Group (LCNUG).
There are over a dozen groups in Illinois alone. There's something for everyone. Given the benefits of such groups (meeting dedicated peers face to face, hearing from expert presenters, etc.), check one out.

Friday, December 30, 2011

The benefits to “check in early and often”

I am a huge advocate of checking in early and often. I’ve seen many a project get burnt by the developer who saves 3 weeks of work for a single “glorious” check-in.
I favor frequent check-ins because they give you:
  1. Cheaper integration: Someone once said “Integration is pay me now or pay me later”, and I find it much easier to pay now. Especially with automated builds and continuous integration, it’s much easier to check in 10 little changes than 1 big change (sometimes I think of it as being easier to hold my breath for 30 seconds, ten times, than to hold it for 5 minutes straight). Why? Because with bigger changes, you inevitably get farther out of sync – especially on critical shared files – and there’s more to forget.
  2. A more objective measure of what you really have: Code that isn’t checked in, that just works on a developer’s machine, doesn’t really exist. They might as well say “it works in my head”. Once you actually get the code past a build server’s policy, then we can see what’s really there.
  3. Earlier detection: We all know it’s cheaper to fix a bug or a design flaw the sooner you catch it. I’d rather developers check in code early so we can quickly detect problems (“why are there 5000 lines but no tests?”).
  4. More modular code: Checking in 10 chunks of code, where each one works, implies more granular and modular code. I.e. code that can at least be split into multiple check-ins is more modular than code that can’t be split at all.
Of course there are always exceptions (you do a massive refactoring, etc.), but those should be the exception, not the rule.
Most of the time, in my experience, a large check-in means something bad – spaghetti code, tightly-coupled code, or code that was trying to hide under the radar until right before the deadline, when the developer says “oops, I just don’t have time to change it”. Think of it like this: there is zero benefit to you in waiting one month to see what a developer is doing, but there is a benefit to detecting problems early, so risk-reward wise it’s better to check in early.
Note that for these purposes, a shelveset is not the same as a check-in. Shelvesets are private, and hence deliberately avoid the benefits listed above (which some say is a feature). For example, you most likely don’t have builds running on a private shelveset. For a developer to say “I put my 20,000 lines in a shelveset” is misguided; use a branch instead if you need to.
So how do you encourage checking in early and often?
You could write a whole chapter on this, but here's a short answer: You can explain the benefits so some developers are internally motivated, or you can make it official policy so that other developers are externally “motivated”. You can leverage the TFS Code Churn tables to automatically monitor activity, or even just view check-ins in Team Explorer, to see how often a developer checks in and how much code has changed. If a developer or contractor insists on waiting 1 month to check in their code when “it’s ready”, you’ve got problems, much like if a developer insisted they didn’t need to follow any other policy or good practice.
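
As a rough illustration of that kind of monitoring, here is a small Python sketch. It does not read the TFS Code Churn tables (their schema isn't covered here); it assumes you have already exported check-ins as hypothetical (author, date, lines changed) rows, and the names and sample data are invented. For each developer it reports the number of check-ins, the total lines changed, and the longest gap between consecutive check-ins.

```python
from collections import defaultdict
from datetime import date


def summarize_checkins(checkins):
    """Summarize (author, day, lines_changed) rows per author."""
    by_author = defaultdict(list)
    for author, day, lines_changed in checkins:
        by_author[author].append((day, lines_changed))

    summary = {}
    for author, rows in by_author.items():
        rows.sort()  # oldest check-in first
        days = [day for day, _ in rows]
        # Days elapsed between each pair of consecutive check-ins.
        gaps = [(later - earlier).days for earlier, later in zip(days, days[1:])]
        summary[author] = {
            "checkins": len(rows),
            "lines_changed": sum(lines for _, lines in rows),
            "longest_gap_days": max(gaps, default=0),
        }
    return summary


# Invented sample data: one developer checks in every few days,
# the other saves everything for one big check-in at month end.
sample = [
    ("alice", date(2011, 12, 1), 120),
    ("alice", date(2011, 12, 5), 80),
    ("alice", date(2011, 12, 9), 200),
    ("bob", date(2011, 12, 1), 50),
    ("bob", date(2011, 12, 29), 20000),
]

summary = summarize_checkins(sample)
for author, stats in sorted(summary.items()):
    print(author, stats)
```

A long gap followed by a very large change is the pattern to look for; what counts as "too long" or "too large" is a judgment call for your team.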

Monday, December 5, 2011

Is development for sissies?

I was reading the book "Tales to make boys out of men" (I have 2 young, ferocious gorilla boys). It was filled with adventurous stories of courage and valor, about men who fought battles in the jungle or trekked through the frost-bitten Antarctic. And here I am, a software engineer, essentially doing a "desk job" in an air-conditioned office with free coffee.
Sometimes I see people who sort jobs into two categories: "tough-guy" jobs like fighter pilot, football player, or astronaut, and "sissy" jobs like software engineer sitting behind a desk. What do I tell my impressionable kids?
I see it like this. "Tough-guy" jobs are honorable, and you certainly need them. But don't dismiss a "desk job" as being for sissies. Many IT engineers need to work with the most ferocious, dangerous, lethal, destructive animal on the planet – other people. People inevitably have competing demands and interests, there are ruthless sharks out there, and any job that must constantly deal with people cannot, by definition, be a sissy job.
Second, developers also must work with the most uncaring and cold-hearted beast ever to exist – the compiler. The compiler doesn't care if you've had a bad day, if your code should work, or if you've spent a hundred hours on a 5-minute task. It has no grace. Such an inhuman vacuum is not the field of sissies.
Furthermore, other people are depending on the IT engineer's work. You could have a million customers using your financial application, or a billion dollars of revenue flowing through your processing system. Hackers attack your system every day. And the system has got to work. To have that sort of responsibility is not sissy-like.
Lastly, software engineering is so complex that you inevitably make mistakes (sometimes really big ones) – and then need to own up to them. That takes courage.
Ok, it's still not Rambo, but software engineering is not for the weak.

Friday, December 2, 2011

10 Reasons why the build works locally but fails on the build server

This is a braindump:
1. The developer did not check in all the files, or doesn't have the latest files (sometimes TFS hiccups when getting the latest dll files).
2. Different modes (release vs. debug). Either code is wrapped in #if DEBUG, or the project is unchecked in Configuration Manager.
3. Different bin structure: each project gets its own bin (the default for Visual Studio) vs. a single shared bin for all (the default for TFS). This is especially common when different versions of the same assembly are referenced by multiple projects in the same solution.
4. Different platform/configuration.
5. The build is running other steps (perhaps packaging or command-line unit tests).
6. Different bitness: say the developer workstation is 64-bit but the build server is 32-bit, and some extra step breaks because of this.
7. Rebuild vs. build. The developer is not running a rebuild, so an error in creating a dll goes unnoticed because the dll already exists on the dev machine from some other process, while the build server fails.
8. Workspace mapping is incorrect, so TFS is not getting all the files it needs.
9. Unit test code coverage: Visual Studio (at least 2008) can be very brittle when running command-line unit tests and code coverage.
10. Treat warnings as errors: depending on your process, the build server may fail on these, but Visual Studio may only flag a warning (which the dev ignores).

Tuesday, November 29, 2011

Why I'm liking Pluralsight

My department had scheduled training for each of us. We had different classes, and the vendor for my class needed to cancel. That left me with short notice to squeeze in different training by year end. So, being creative, I got an online subscription to Pluralsight instead of the traditional training.
Pluralsight is a set of online .Net videos created by industry experts. Each course seems to be 2-4 hours' worth of PowerPoint slides and code demos. It's worked out very well. What I'm liking so far:
· Different medium – After 10 linear feet of books, I like the different medium. Hearing someone's voice seems to trigger a different part of the brain for remembering, and seeing the demo from end-to-end has obvious benefits over isolated screenshots in a book or article.
· It's on-demand – It's hard to make it to physical events. I like the inherent benefit of on-demand training, where I can listen on my schedule (by which I mean everyone else's schedule: my kids' sleeping schedule, my company's work schedule, etc.).
· Professional content – There are tons of free videos online, but these are often like reactionary scraps. To break the ice with a new technology, it helps to have a systematic 2-hour block that goes from end-to-end.
· Track progress – Some personality types won't care about this, but I like how it tracks completion progress through courses. It's almost like finishing levels of a video game.
· Coordinated – I don't need 10 videos all telling different or rehashed angles of the same thing (which is often what I'd find in a Google search). Rather, I need one good video that nails it, or a collection of videos that each explain their specific part well.
· Continually improving – They seem to come out with a few new "courses" every week.
It's getting to the point where rather than watch my favorite sitcom, I watch the next Pluralsight video.