Wednesday, January 18, 2012

The problem with "It's not what you know, it's who you know."

I wasn't the most popular kid growing up. Even in college, as I lived up to the analytical stereotype and stayed home studying (a better word would be "experimenting" or "training"), my party-going acquaintances would assure me that I was investing in the wrong thing. "It's not what you know, it's who you know. So don't spend so much effort on the books when it's the relationships that matter." And there certainly is some truth to this. We've all seen the stranger's perfect resume get passed over for the friend's average resume (the stranger is by definition unknown, and therefore risky, so there is a business rationale for picking the safe candidate over the risky one). People ultimately make the decisions, so people are important. It's one reason I so actively endorse the community user groups.
However, there must be balance. There are three caveats that this cliché misses:
1. If what you know is valuable, then people will want to know you. Even a hermit who cures cancer will begrudgingly become famous. Recruiters in every major city are scouring LinkedIn, user groups, Monster, Dice, and every online job board trying to find good candidates, offering bounties, and poaching top talent from competitors. In other words, "what you know" will quickly open doors to "who you know" (and "who knows you").
2. Really, it's not "who you know," but "who knows you." Sharing an elevator, or even a lunch, with someone doesn't mean that they'll risk their reputation giving you a referral, or that you can "phone them for a favor".
3. There are talkers and doers. Talkers can drop a name for every occasion, have 500+ social-networking friends, and can truthfully say things like "Oh, I know Acme's Chicago director, Bill, we met at last Autumn's pumpkin-throwing contest…" They could get the interview with their connections, but they could never pass the interview itself.
Of course, with "what you know" vs. "who you know", like most two-way debates in life, you'd prefer both. But in the field of software engineering, you can never sell short the "what you know".

Monday, January 16, 2012

Command-line Cyclomatic Complexity in VS2008 with VS2010 free Metrics.exe

Visual Studio had code complexity metrics, but they were only available in the GUI. (At least for code coverage you could call the private assemblies and roll your own command-line tool.) However, VS 2010 offers a free power tool that lets you run complexity metrics from the command line! The result is an XML file, so you can leverage that for anything you need.
These blogs tell more:
Part of the cool thing is that even if you're still on VS2008 (!) and can't buy a third-party tool (NDepend!), you can still run the 2010 power tool against .NET 3.5 assemblies. So, you could install VS2010 on your build server and use the power tool on 2008 builds.
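Since the result is XML, a short script can pull out whatever you need. Here is a minimal Python sketch that lists the members whose cyclomatic complexity exceeds a threshold. The report layout used here (Member elements containing Metrics/Metric children with Name and Value attributes) and the sample data are assumptions for illustration, so check them against a real report before relying on this.

```python
import xml.etree.ElementTree as ET

# Illustrative sample only: the element and attribute names are an
# assumption about the report layout, not copied from real tool output.
SAMPLE_REPORT = """
<CodeMetricsReport>
  <Member Name="Parse(string)">
    <Metrics>
      <Metric Name="CyclomaticComplexity" Value="14" />
      <Metric Name="LinesOfCode" Value="40" />
    </Metrics>
  </Member>
  <Member Name="ToString()">
    <Metrics>
      <Metric Name="CyclomaticComplexity" Value="1" />
      <Metric Name="LinesOfCode" Value="3" />
    </Metrics>
  </Member>
</CodeMetricsReport>
"""


def complex_members(xml_text, threshold=10):
    """Return (member name, complexity) pairs above the threshold, worst first."""
    root = ET.fromstring(xml_text.strip())
    hits = []
    for member in root.iter("Member"):
        for metric in member.findall("./Metrics/Metric"):
            if metric.get("Name") == "CyclomaticComplexity":
                # Strip thousands separators in case values are formatted.
                value = int(metric.get("Value").replace(",", ""))
                if value > threshold:
                    hits.append((member.get("Name"), value))
    return sorted(hits, key=lambda hit: hit[1], reverse=True)


if __name__ == "__main__":
    for name, value in complex_members(SAMPLE_REPORT):
        print(name, value)
```

A build step could fail the build, or just log a warning, whenever this list is non-empty.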

Thursday, January 5, 2012

Detecting if a file is a merge in TFS VersionControl database

I was trying to run some metric calculations on files within a changeset, but I only wanted new files – i.e. I wanted to filter out merged, branched, or renamed files. For example, if someone created a branch, that shouldn’t count as adding 1000 new files.
One solution I found was to check the Command column of the TfsVersionControl.dbo.tbl_Version table. I realize TfsVersionControl is a transactional database, and reports are encouraged to go off of TfsWarehouse, but that didn’t seem to contain this field.
Here’s the relevant SQL (NOTE: this is for VS2008; I haven’t tested it on VS2010).
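-- List each file in a changeset and flag whether it counts as new code.
-- Command values 2, 5, 6, 7, 10, 34 are the add/edit combinations described below.
select
      CS.ChangeSetId, FullPath, Command, CreationDate,
      case
            when Command in (2,5,6,7,10,34) then cast(1 as bit)
            else cast(0 as bit)
      end as IsNew
from TfsVersionControl..tbl_Version V with (nolock)
      inner join TfsVersionControl..tbl_ChangeSet CS with (nolock)
      on V.VersionFrom = CS.ChangeSetId
where CS.ChangeSetId = 20123  -- example changeset id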

The question becomes, what does the “tbl_Version.Command” column mean, and why those specific values? I couldn’t find official documentation (probably because running queries against this database is discouraged), so I ran a distinct query over 50,000 different changesets to find all the values in use, then worked backwards, comparing them against the Team Explorer UI. My conclusion is that it appears to be a bit flag for actions:

| Command | Bit value |
|---|---|
| add | 1 |
| edit | 2 |
| type | 4 |
| rename | 8 |
| delete | 16 |
| undelete | 32 |
| branch | 64 |
| merge | 128 |


Recall there can be multiple actions (hence the bit field), such as merge and edit. So, if you want to find new code – i.e. adds or edits that did not come from a branch or merge – then you’d take the following combined values: 2, 5, 6, 7, 10, and 34.

| Is New? | Command value | Actions |
|---|---|---|
| Yes | 2 | edit |
| Yes | 5 | add (folder) |
| Yes | 6 | type, edit |
| Yes | 7 | add (add file) |
| No | 8 | rename |
| Yes | 10 | rename, edit |
| No | 16 | delete |
| No | 24 | delete, rename |
| No | 32 | undelete |
| Yes | 34 | undelete, edit |
| No | 68 | branch |
| No | 70 | branch, edit |
| No | 84 | branch, delete |
| No | 128 | merge |
| No | 130 | merge, edit |
| No | 136 | merge, rename |
| No | 138 | merge, rename, edit |
| No | 144 | merge, delete |
| No | 152 | merge, delete, rename |
| No | 160 | merge, undelete |
| No | 162 | merge, undelete, edit |
| No | 196 | merge, branch |
| No | 198 | merge, branch, edit |
| No | 212 | merge, branch, delete |


Of course, this is induction, and it’s possible I may have missed something, but given a large sampling and lots of spot-checking, it appears to be reliable.
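To make the bit arithmetic concrete, here is a small Python sketch that decodes a Command value into its flags and checks whether it counts as new code. The bit meanings are the inferred ones from this post, not official documentation, and labelling bit 4 as "type" is an assumption based on the row for value 6. The function names are made up for illustration.

```python
# Bit meanings as inferred in this post (not from official documentation).
# Bit 4 is labelled "type" as an assumption, following the row for value 6.
FLAGS = [
    (1, "add"), (2, "edit"), (4, "type"), (8, "rename"),
    (16, "delete"), (32, "undelete"), (64, "branch"), (128, "merge"),
]

# The combined values treated as new code in the SQL above.
NEW_CODE_VALUES = {2, 5, 6, 7, 10, 34}


def decode(command):
    """Return the names of all flags set in a Command value, lowest bit first."""
    return [name for bit, name in FLAGS if command & bit]


def is_new(command):
    """Whitelist check, mirroring the IN (...) list in the SQL."""
    return command in NEW_CODE_VALUES


def is_new_by_bits(command):
    """Alternative rule: an add or edit bit is set, and no branch or merge bit."""
    return bool(command & (1 | 2)) and not command & (64 | 128)


if __name__ == "__main__":
    for value in (7, 10, 70, 138):
        print(value, decode(value), is_new(value), is_new_by_bits(value))
```

For every value listed in the table, the bitwise rule gives the same answer as the whitelist. It would also classify combinations that never appeared in the sample, which may or may not be what you want.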

Wednesday, January 4, 2012

It’s not your code, but it is your opportunity

I occasionally hear developers say “my code”, as in “I’ll check in my code at the end of next week”, or “My code doesn’t need unit tests”.
In one sense, I want developers to think “this is my code” so they take pride in doing the best job possible. But really, it’s not your code, it’s the company’s code – they’re paying for it, and often legally they own it (i.e. it would be illegal to take chunks of code you wrote at one company and either privately sell it, or check it into another company’s source code repository).
This perspective really changes the discussion: for example, “The company would like their code to be checked in on a regular basis”, or “The company would like their code to be properly tested”.
However, it is the developer's "opportunity to learn" – i.e. the company keeps the code, but the developer keeps the improved skill from writing that code.

Tuesday, January 3, 2012

31 User Groups in the Midwest

Clark Sell did a great series on the various user groups in the Midwest. He provided a helpful recap here:
Here's the link for our Lake County .Net Users Group (LCNUG).
There are over a dozen groups in Illinois alone. There's something for everyone. Given the benefits of such groups (meeting dedicated peers face to face, hearing from expert presenters, etc…), check it out.

Friday, December 30, 2011

The benefits to “check in early and often”

I am a huge advocate of checking in early and often. I’ve seen many a project get burnt by the developer who saves 3 weeks of work for a single “glorious” check-in.
I favor frequent check-ins because they give you:
  1. Cheaper integration: Someone once said “Integration is pay me now or pay me later”, and I find it much easier to pay now. Especially with automated builds and continuous integration, it’s much easier to check in 10 little changes than 1 big change (sometimes I think of it as being easier to hold my breath for 30 seconds ten times than for 5 minutes straight). Why? Because with bigger changes, you inevitably get farther out of sync – especially on critical shared files – and there’s more to forget.
  2. A more objective measure of what you really have: Code that isn’t checked in, that just works on a developer’s machine, doesn’t really exist. They might as well say “it works in my head”. Once you actually get the code past a build server’s policy, then we can see what’s really there.
  3. Earlier detection: We all know it’s cheaper to fix a bug or a design flaw the sooner you catch it. I’d rather developers check in code early so we can quickly detect problems (“why are there 5,000 lines but no tests?”).
  4. More modular code: Checking in 10 chunks of code, where each one works, implies more granular and modular code. I.e. code that can at least be split into multiple check-ins is more modular than code that can’t be split at all.
Of course there are always exceptions (you do a massive refactoring, etc…), but those should be the exception, not the rule.
Most of the time, in my experience, large check-ins by developers mean something bad – spaghetti code, tightly-coupled code, code that was trying to hide under the radar until right before the deadline and then the developer says “oops, I just don’t have time to change it”, or something like that. Think of it like this: there is zero benefit to you in waiting one month before seeing what a developer is doing, but there is benefit to early detection of problems, so risk-reward wise it’s better to check in early.
Note that for these purposes, a shelveset is not the same as a check-in. Shelvesets are private, and hence deliberately avoid the benefits listed above (which some say is a feature). For example, you most likely don’t have builds on a private shelveset. For a developer to say “I put my 20,000 lines in a shelveset” is misguided; use a branch instead if you need to.
So how do you encourage checking in early and often?
You could write a whole chapter on this, but here's a short answer: You can explain the benefits so some developers are internally motivated, or you can make it official policy so that other developers are externally “motivated”. You can leverage the TFS Code Churn tables to automatically monitor activity, or even just view check-ins in Team Explorer, to see how often a developer checks in and how much code has changed. If a developer or contractor insists that they need to wait 1 month to check in their code when “it’s ready”, you’ve got problems, much like if a developer insisted they didn’t need to follow any other policy or good practice.
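As a rough illustration of the monitoring idea, here is a minimal Python sketch that summarizes check-in frequency and size per developer. It does not query TFS. The input is a hypothetical list of (developer, date, lines changed) records, such as you might export from the churn tables or copy out of Team Explorer, and the names and numbers are invented.

```python
from collections import defaultdict
from datetime import date

# Hypothetical sample records: (developer, check-in date, lines changed).
SAMPLE_CHECKINS = [
    ("alice", date(2011, 12, 1), 120),
    ("alice", date(2011, 12, 5), 80),
    ("alice", date(2011, 12, 9), 200),
    ("bob", date(2011, 12, 1), 50),
    ("bob", date(2011, 12, 29), 20000),
]


def checkin_summary(checkins):
    """Per developer: number of check-ins, longest gap in days, largest check-in."""
    by_dev = defaultdict(list)
    for dev, day, lines in checkins:
        by_dev[dev].append((day, lines))

    summary = {}
    for dev, items in by_dev.items():
        items.sort()  # chronological order
        days = [day for day, _ in items]
        gaps = [(later - earlier).days for earlier, later in zip(days, days[1:])]
        summary[dev] = {
            "checkins": len(items),
            "max_gap_days": max(gaps, default=0),
            "largest_checkin": max(lines for _, lines in items),
        }
    return summary


if __name__ == "__main__":
    for dev, stats in checkin_summary(SAMPLE_CHECKINS).items():
        print(dev, stats)
```

A long gap followed by one very large check-in is the pattern worth a conversation; the thresholds for "long" and "large" are a team decision.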

Monday, December 5, 2011

Is development for sissies?

I was reading the book "Tales to make boys out of men" (I have 2 young, ferocious gorilla boys). It was filled with adventurous stories of courage and valor, about men who fought battles in the jungle or trekked through the frost-bitten Antarctic. Then here am I, a software engineer, essentially doing a "desk job" in an air-conditioned office with free coffee.
Sometimes I see people who have two categories: "tough-guy" jobs like fighter pilot, football player, or astronaut, and "sissy" jobs like software engineer sitting behind a desk. What do I tell my impressionable kids?
I see it like this. "Tough-guy" jobs are honorable, and you certainly need them. But don't dismiss a "desk job" as being a sissy job. Many IT engineers need to work with the most ferocious, dangerous, lethal, destructive animal on the planet – other people. People inevitably have competing demands and interests, there are ruthless sharks out there, and any job that must constantly deal with people cannot, by definition, be a sissy job.
Second, developers also must work with the most uncaring and cold-hearted beast ever to exist – the compiler. The compiler doesn't care if you've had a bad day, if your code should work, or if you've spent a hundred hours on a 5-minute task. It has no grace. Such an inhuman vacuum is not the field of sissies.
Furthermore, other people are depending on the IT engineer's work. You could have a million customers using your financial application, or a billion dollars of revenue flowing through your processing system. Hackers attack your system every day. And the system has got to work. To have that sort of responsibility is not sissy-like.
Lastly, software engineering is so complex that you inevitably make mistakes (sometimes really big ones) – and then need to own up to them. That takes courage.
OK, it's still not Rambo, but software engineering is not for the weak.