Sunday, January 24, 2010

Is this code broken?

[This was originally posted at http://timstall.dotnetdevelopersjournal.com/is_this_code_broken.htm]

What constitutes broken code? Everyone agrees that code that crashes in production and threatens to have developers fired is indeed broken. But where's the line? Is the following code broken:

  • The code logs incorrect data in an error log file?
  • The code displays incorrect values in a label (like a mis-formatted date)?
  • If a certain rare case occurs (like a button is pressed at exactly 12:00AM, or an uploaded file size is exactly 1.00 MB), then the code crashes?
  • The code has misleading names for variables and methods? For example, it has a method "IsNumber" that checks only for integers, or "IsLetter" that allows special characters. Say the current program only ever calls these methods with valid data, so the app never crashes.

In all these cases, say the application essentially "works" and handles the main use cases.

The problem is maintenance. Maintaining and extending code is an expensive part of the total cost of ownership. You could spend hours tracking down a single erroneous line of code. Low-quality code (tons of copy & paste, misnamed methods, misleading logic, etc.) is going to cost a fortune to maintain. On the other hand, certain functional errors that have zero impact on the business can perhaps be documented as "known issues" (e.g. the month is formatted in a label with a preceding zero like "03" instead of just "3").

I'd say it comes down to the business, and the code is broken if it costs the business more than it should - whether via maintenance costs or functional errors that hinder the end users. Perhaps the question isn't so much "is this code broken?" as "how can we maximize the business value of this code?"

Wednesday, January 20, 2010

Five Ironies of Unit Testing

[This was originally posted at http://timstall.dotnetdevelopersjournal.com/five_ironies_of_unit_testing.htm]

I am a huge advocate of unit testing. After years of writing tests, and encouraging other devs to write tests, I find five common ironies:

  1. The devs who would most benefit from unit tests are the devs who are least likely to write them - and vice versa. The star devs, who would write the code correctly to begin with, are also the ones most open to unit testing. Likewise, the developers who write low-quality code and shun testing are the ones whose code could benefit the most from it.
  2. Writing unit tests actually saves time - not just in integration testing but also in development - because it stubs out the context, allowing you to immediately jump to the area that needs testing instead of spending 5 minutes setting up the scenario.
  3. Developers often punt on unit testing because "my manager doesn't support it", but unit testing is really an encapsulated development detail that doesn't need managerial support (although of course their support is appreciated).
  4. Many devs generate the unit tests after they write the code ("those ivory-tower architects said we needed tests"), but tests are most beneficial before you write the code because they force you to think about what the code should do, and they make it faster to write the code.
  5. The same teams who don't want to write unit tests are relieved to have such tests on the code they need to maintain.

 

Monday, January 18, 2010

Blogging as a Legacy

[This was originally posted at http://timstall.dotnetdevelopersjournal.com/blogging_as_a_legacy.htm]

I was casually chatting with a dev manager who had worked in the trenches for 25 years. He emphasized how a lot of developers spend their careers without leaving a legacy. You work on a dozen systems, for different companies, and a decade later don't have anything external to show for it. Sure, you've got skills, 10 linear feet of obsolete tech books you've read, and the memories. But it's not as tangible as shipping actual products (like the devs who can say "I helped ship AoE2 - that was me!").

This hit home with me as I reach the 5-year mark for my blog. After 5 years of consistent blogging, I've written 430 posts and received hundreds of comments (many of them educational for me). There are a lot of obvious benefits to blogging, but after writing for years, what really sticks out to me is the legacy of blogging. I started blogging at CSC, blogged all through Paylocity, and continue blogging now at CareerEd. Ironically, my blog topics have evolved from hard-core development, to tech lead, to architecture, to more managerial-related tasks.

I can't show people any of the systems I've done (except for the occasional screenshot from a dusty tech doc), but as appropriate, I can share with them the archive of hundreds of blog posts. It helps motivate me to want to write for another 5 years.

Thursday, January 14, 2010

Three basic communication tips

[This was originally posted at http://timstall.dotnetdevelopersjournal.com/three_basic_communication_tips.htm]

I am certainly no communication or "people" expert, which is probably why wiser people in my life have explicitly offered me three good rules for communicating with others. I think these rules apply in the corporate world as well, so they make a good blog post (because I can't just blog about normal life issues unless I can somehow apply them to software engineering).

Connect the dots - In software programming, we must explicitly spell out every step. Sure, you can refactor and abstract things out, or use third-party software to spare you from writing it - but somehow, every step must be fleshed out in detail. This is great for a robust program, but spelling everything out the same way will drive real people insane (i.e. managers, business analysts, customers, and executives who write your paycheck). They'll just think you're clueless, inexperienced, or a smart aleck. When dealing with actual people, we need to be able to use our background knowledge of the situation to connect the dots.

Don't wait to be asked - It takes effort to ask for something, and people don't like exerting effort, so anytime you can "just know" and do the right thing (perhaps because you know the bigger picture of what they're trying to do), people are going to appreciate it.

  • For example, say you see a simple bug in the code. It's a safe change and the release isn't for a while. The savvy developer doesn't need to ask their manager "can I fix this bug?" - they just do it (and maybe submit a ticket if their company's process requires that). Often just making the fix can be quicker than asking. (Of course, common sense applies: don't go "fixing" mission-critical production code that's out of your scope.)
  • It seems like only junior resources ask "What can I do to help?" - the senior ones already know.

Of course, it's reasonable to proactively notify a manager: "I see A, B, and C. I know you're busy, so I'll assume I should start working on 'B' first because of reasons XYZ. Just let me know if you'd like to switch tasks." This lets your manager reply with a one-word email like "great", and managers like being able to delegate entire tasks with one-word emails.

"Connecting the dots" and "Don't wait to be asked" are related. Think of this as not needing to be micromanaged.

Translate what people say - There is a world of difference between someone's words and what they actually mean. Whether they've made a simple typo, used poor word choice, or are struggling to articulate something, it's a big personal win if you can "see the forest for the trees" and know what they mean. You can do this by leveraging context, knowing where they're trying to go, and being familiar with what it takes to get there.

  • Example: Missing words - A non-technical manager may say "we need to store this dropdown control in the database". They probably mean "...store the value of this dropdown control..."
  • Example: Wrong words - A non-technical manager says "we need a bigger machine". They probably mean "we need a better-performing machine".

Summary

You might say "but this isn't fair - I was doing my job to a 'T', I was technically correct." Ah, but life is not fair. You can be right, or you can be happy. And (as I found out the hard way) if you want to be happy with other people, you'll eventually need to master many things, including these three.

 

Sunday, January 10, 2010

BOOK: Complete MBA for Dummies

[This was originally posted at http://timstall.dotnetdevelopersjournal.com/book_complete_mba_for_dummies.htm]

I've always been intrigued by the non-coding aspects of a project that are necessary for that project to succeed. Much of this includes people and business skills. I keep hearing of co-workers who take business classes, and it sounds fun, but it takes more time than I have right now (building snow-dinosaurs, sandboxes, and Christmas lights for the kids takes a lot of time). So I settled for the next best thing: reading the Complete MBA for Dummies. I was impressed.

The book is a casual 400-page read, and certainly lives up to the "for dummies" genre. It offers an overview of starting a small business, from the basics of management to HR to accounting to marketing and economics. I liked the practical tone.

While a book like this doesn't fundamentally change one's view of business, it is useful to get one to casually think about business-concepts during the normal work day. For each project, it prompts me to ask questions like:

  • "where does the revenue come from?"
  • "who is paying for this project?"
  • "how will this project help the business?"
  • "can this thing I built actually be marketed?"
  • "who are the customers for this product?"

Continually keeping these types of questions in mind also helps a developer relate to business-sponsors, who are the people that ultimately write the developer's paycheck.

 

Thursday, January 7, 2010

Coding is just the tip of the iceberg

[This was originally posted at http://timstall.dotnetdevelopersjournal.com/coding_is_just_the_tip_of_the_iceberg.htm]

I love coding. The more I do software engineering, the more I realize that coding is just the tip of the iceberg. Consider tasks besides coding that are required for a successful project:

  • Identifying a business problem such that business sponsors are willing to pay for the product
  • Recruiting the team to build the project
  • Providing the team the tools to develop the app (hardware & software)
  • Collecting business requirements
  • Coordinating with business partners, such as those providing data that the product will use
  • Designing a functional spec
  • Creating the architectural and technical designs
  • Deciding on build vs. acquire (buy, open-source)
  • Outsourcing part of the project
  • Managing the project
  • Procuring the physical infrastructure that the app is deployed on
  • QA testing the app (functional, integration, user-acceptance, performance, etc.)
  • Deploying the app
  • Writing training manuals for the app
  • Training support staff and users
  • Marketing the app such that people actually use it
  • Supporting the app

From start to finish, actually coding for a project may only be 5% - 10% of the total effort. That means that there's a huge portion of the project that is non-coding, and work in that huge portion can often compensate for difficult coding tasks.

For example, say there is a component that is just difficult to program (it's complex, it's big, it falls outside your expertise, etc.). You could possibly get around coding it yourself by:

  • Buying it or using an open-source alternative (example: use an open-source tool or class library from CodePlex instead of writing it yourself)
  • Training the internal end users around using that feature ("we know the website has a bug, but just don't click the browser back button")
  • Using project management to get it punted, or out of scope
  • Convincing the business sponsor that the feature is not needed ("we don't need to invest all that time making a dancing paper clip assistant")
  • Buying better hardware (example: upgrading hardware for better performance)

The stars who keep delivering successful projects are familiar with this, and they are constantly mitigating challenges in one task by giving up something that doesn't matter from another task.

Sometimes you can solve hard coding problems by just sheer skill and coding right through it. But it's good to be aware of other techniques to work around the problem altogether.

Sunday, December 27, 2009

Estimating database table sizes using SP_SpaceUsed

[This was originally posted at http://timstall.dotnetdevelopersjournal.com/estimating_database_table_sizes_using_sp_spaceused.htm]

One of Steve McConnell's tips from his great book on estimating (Software Estimation: Demystifying the Black Art) is that you should not estimate that which you can easily count. Estimating database table sizes is a great example of this. Sure, on one hand disk space is relatively cheap; on the other hand you want at least a ballpark estimate of how much space your app will need - will the database size explode and no longer fit on your existing SAN?

Here's a general strategy to estimate table size:

1. Determine the general schema for the table

Note the column datatypes that could be huge (like varchar(2000) for notes, or xml, or blob)

2. Find out how many rows you expect the table to contain

Is the table extending an existing table, and therefore proportional to it? For example, do you have an existing "Employee" table with 100,000 records, and you're creating a new "Employee_Reviews" table where each employee has 2-3 reviews (and hence you're expecting 200,000 - 300,000 records)? If the table is completely new, then perhaps you can guess the rowcount based on expectations from the business sponsors.

If the table has only a few rows (perhaps fewer than 10,000 - but this depends), the size is probably negligible, and you don't need to worry about it.

3. Write a SQL script that creates and populates the table.

You can easily write a SQL script to create a new table (and add its appropriate indexes), and then use a WHILE loop to insert 100,000 rows. This can be done on a local instance of SQL Server. Note that you're not inserting the total number of rows you estimated - i.e. if you estimated that the table will contain 10M rows, you don't need to insert 10M rows - rather you'll want a "unit size", which you can then multiply by however many rows you expect. (Indeed, you don't want to wait for 10M rows to be inserted, and your test machine may not even have enough space for that much test data.)

For variable-length data (like strings), use average-sized data. For nullable columns, populate them based on how likely you think they'll be used, but err on the side of more space.

Obviously, save your script for later.

4. Run SP_SPACEUSED

SP_SpaceUsed displays how much space a table is using. It shows results for both the data and the indexes (never forget the index space).

You can run it as simply as:

exec sp_spaceused 'TableTest1'

Now you can get a unit size per row. For example, if the table has 3000KB for data and 1500KB for indexes, and you inserted 100K rows, then the average size per row is (3000KB + 1500KB) / 100,000 = 0.045KB (about 46 bytes). Then multiply that by however many rows you expect.
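The arithmetic is simple enough to script so it can be rerun whenever the sample changes. Here is a minimal Python sketch, assuming you copy the data and index sizes (in KB) from the SP_SpaceUsed output by hand. The function names are hypothetical, and the 10M-row target is just the example figure from step 3.

```python
def kb_per_row(data_kb: float, index_kb: float, sample_rows: int) -> float:
    """Average space per row in KB, counting both data and index space."""
    if sample_rows <= 0:
        raise ValueError("sample_rows must be positive")
    return (data_kb + index_kb) / sample_rows


def projected_size_kb(unit_kb: float, expected_rows: int) -> float:
    """Scale the per-row unit size up to the expected row count."""
    return unit_kb * expected_rows


# Numbers from the example above: 3000KB data, 1500KB indexes, 100K sample rows.
unit = kb_per_row(data_kb=3000, index_kb=1500, sample_rows=100_000)

# Hypothetical target of 10M rows.
total_kb = projected_size_kb(unit, expected_rows=10_000_000)

print(f"{unit:.3f} KB per row")
print(f"{total_kb / 1024 / 1024:.2f} GB for 10M rows")
```

Keeping the data and index sizes as separate inputs makes it harder to forget the index space.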

This may seem like a lot of work, and there are certainly ways to theoretically predict it by plugging into a formula. My concern is that it's too easy for devs to miscalculate the formula (like forgetting the indexes, not accounting for the initial table schema itself, or just slipping up on all the extra steps).

5. Estimate the expected growth

Knowing the initial size is great, but you also must be prepared for growth. We can make educated guesses based on the driving factors of the table size (maybe new customers, a vendor data feed, or user activity), and we can then estimate the size based on historical data or the business's expectations. For example, if the table is based on new customers, and the sales team expects 10% growth, then prepare for 10% growth. Or if the table is based on a vendor data feed, and historically the feed has 13% new records every year, then prepare for 13% growth.

Depending on your company's SAN and DBA strategy, be prepared to have your initial estimate at least include enough space for the first year of growth.
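As a rough illustration of the growth step, here is a small Python sketch that projects a table's size forward. It assumes growth compounds once per year, which is my simplification and not something the business will necessarily confirm. The 450,000KB starting size is a made-up figure, and the function name is hypothetical.

```python
def size_after_growth(initial_kb: float, annual_growth: float, years: int) -> float:
    """Projected size after `years` of compounding growth (0.10 means 10% per year)."""
    if years < 0:
        raise ValueError("years must not be negative")
    return initial_kb * (1 + annual_growth) ** years


initial_kb = 450_000  # hypothetical starting estimate

# The two growth drivers used as examples above.
for label, rate in (("new customers", 0.10), ("vendor data feed", 0.13)):
    first_year = size_after_growth(initial_kb, rate, years=1)
    third_year = size_after_growth(initial_kb, rate, years=3)
    print(f"{label}: {first_year:,.0f} KB after 1 year, {third_year:,.0f} KB after 3 years")
```

If the table grows by a fixed number of rows per period instead of by a percentage, a linear projection would be the better fit.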

6. Add a safety factor

There will be new columns, new lookup and helper tables, a burst of additional rows, maybe an extra index - something that increases the size. So, always add a safety factor.
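How big the safety factor should be is a judgment call. The sketch below uses a default of 25% purely as a placeholder, not as a recommendation. The function name and the input figure are hypothetical too.

```python
import math


def padded_estimate_kb(estimated_kb: float, safety_factor: float = 0.25) -> int:
    """Add a safety margin (0.25 means 25% extra) and round up to a whole KB."""
    if safety_factor < 0:
        raise ValueError("safety_factor must not be negative")
    return math.ceil(estimated_kb * (1 + safety_factor))


# Hypothetical estimate of 495,000KB, padded by the placeholder 25%.
print(padded_estimate_kb(495_000))
```

Applying the safety factor as a separate, final step keeps the measured numbers and the padding visible on their own when someone reviews the estimate later.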

7. Prepare for an archival strategy

Some data sources (such as verbose log records) are prone to become huge. Therefore, always have a plan for archival - even if the plan is that you can't archive (for example, it's a transactional table and the business requires regular transactions on historical data). However, sometimes you get lucky; perhaps the business requirements say that based on the type of data, you only legally need to carry 4 years' worth of data. Or, perhaps after the first 2 years, the data can be archived in a data warehouse, and then you don't worry about it anymore (this just passes the problem to someone else).

Summary

Here's a sample T-SQL script to create the table and index, insert data, and then call SP_SpaceUsed:

USE [MyTest]
GO

if exists (select 1 from sys.indexes where [name] = 'IX_TableTest1')
    drop index TableTest1.IX_TableTest1

if exists (select 1 from sys.tables where [name] = 'TableTest1')
    drop table TableTest1

--=========================================
--Custom SQL table
CREATE TABLE [dbo].[TableTest1](
    [SomeId] [int] IDENTITY(100000,1) NOT NULL,
    [phone] [bigint] NOT NULL,
    [SomeDate] [datetime] NOT NULL,
    [LastModDate] [datetime] NOT NULL
) ON [PRIMARY]

--Index
CREATE UNIQUE NONCLUSTERED INDEX [IX_TableTest1] ON [TableTest1]
(
    [SomeId] ASC,
    [phone] ASC
) ON [PRIMARY]
--=========================================


--do inserts

--number of sample rows to insert (raise this, e.g. to 100000, for a steadier per-row average)
declare @max_rows int
select @max_rows = 1000

declare @i as int
select @i = 1

WHILE (@i <= @max_rows)
BEGIN
    --=============
    --Custom SQL Insert (note: use identity value for uniqueness)
    insert into TableTest1 (phone, SomeDate, LastModDate)
    select 6301112222, getDate(), getDate()
    --=============

    select @i = @i + 1

END

--Get sizes
exec sp_spaceused 'TableTest1'