Friday, June 20, 2008

Natural Language Processing

[This was originally posted at http://timstall.dotnetdevelopersjournal.com/natural_language_processing.htm]

I got very bad marks in grammar during middle school. It wasn't until after I graduated public school that I cared about writing, and hence grammar. Now, I find it quite fascinating, so I was interested to read about a CodePlex project - SharpNLP. In their own words:

SharpNLP is a collection of natural language processing tools written in C#. Currently it provides the following NLP tools:

  • a sentence splitter

  • a tokenizer

  • a part-of-speech tagger

  • a chunker (used to "find non-recursive syntactic annotations such as noun phrase chunks")

  • a parser

  • a name finder

  • a coreference tool

  • an interface to the WordNet lexical database

So, you could type in a paragraph, and it parses that out into the different sentences, and then into different words and parts of speech. Of course, I began to wonder whether this could be extended to be like the FxCop of technical articles. For example, according to the Microsoft Manual of Style for Technical Publications, the rules to merely capitalize a sub-title are non-trivial - there are 10 rules and it takes a full page to explain them. I wonder if, in theory, you could have an NLP (Natural Language Processing) tool run through a subtitle (as an input string) and apply the rules, much like you can parse an arithmetic expression for correct syntax. I started to toy around with it, but it quickly got difficult, so maybe I'll look at it another day.
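To make the idea concrete, here is a minimal sketch in Python of what such a rule check might look like. The three rules in the comment are simplified assumptions for illustration, not the actual rules from the Microsoft Manual of Style, and `SMALL_WORDS`, `capitalize_title`, and `check_title` are hypothetical names. It works from a fixed word list rather than real part-of-speech tags, which is exactly where a tagger like the one in SharpNLP would have to take over.

```python
# Simplified, assumed rule set (not the real style-manual rules):
#   1. Always capitalize the first and last word.
#   2. Lowercase articles, coordinating conjunctions, and short prepositions.
#   3. Capitalize everything else.
SMALL_WORDS = {"a", "an", "the", "and", "but", "or", "nor",
               "at", "by", "for", "in", "of", "on", "to"}


def capitalize_title(title):
    """Return the title rewritten to follow the assumed rules above."""
    words = title.split()
    result = []
    for i, word in enumerate(words):
        is_edge = i == 0 or i == len(words) - 1
        if word.lower() in SMALL_WORDS and not is_edge:
            result.append(word.lower())
        else:
            # Upper-case only the first letter so "FxCop" keeps its casing.
            result.append(word[0].upper() + word[1:])
    return " ".join(result)


def check_title(title):
    """Return True if the title already follows the assumed rules."""
    return title == capitalize_title(title)
```

Calling `check_title` on a subtitle gives a pass-or-fail answer, much like an FxCop rule would. The hard part is everything the word list cannot see: a word like "in" is a preposition in one title and part of a verb phrase in another, and only tagging the sentence can tell them apart.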

 

In the meantime, if you're interested in NLP, check out SharpNLP.

 

Tuesday, June 17, 2008

Is developing a young person's profession?

[This was originally posted at http://timstall.dotnetdevelopersjournal.com/is_developing_a_young_persons_profession.htm]

Talking with another developer the other day, we began discussing whether software engineering is really a young person's profession (we've both been doing this for a while, have families, and aren't jumping to do the 70-hour weeks of boring features with boring technologies). Given the rapid turnover of new technologies, pressure from outsourcing, and demanding projects, one might view the software world as incompatible with an older person, especially someone with a family who doesn't want to spend 70-hour weeks at work.

 

I'm an optimist here, and I think that no - software engineering is certainly not just "a young person's profession". Older developers have a lot of advantages. They:

  • Have more experience, so they usually have better intuition and a broader understanding with which to learn new technologies. For example, someone who already knew J2EE would pick up .Net much more quickly than someone with no computer background.

  • Have more understanding of the purpose of technology. They've seen lots of business applications, so they know what they're trying to do.

  • Have a wider stash of reusable tools and source code to work with

  • Can be wiser about what they invest in learning

  • Have deeper knowledge. New technology is often built on top of old technology. I've seen lots of young "copy and paste" developers become paralyzed when their program does anything abnormal - like throw a COM exception or accidentally encode files in an unexpected format - whereas the older devs have been around and know how to handle that.

While new technologies do come out frequently, there are also many older technologies and concepts that still form the backbone of enterprise apps. HTML, JavaScript, XML, CSS, SQL, code generation, automation, and object-oriented languages like C++ and Java have all been around since the 90s or earlier. A senior developer who already knows these technologies can focus on learning just the new stuff, whereas a young developer still needs to come up to speed on these basics.

 

Of course, I wouldn't say that older developers are necessarily better, only that everyone deserves their chance. There's a lot to look forward to in software engineering, for both old and young developers.

Monday, June 16, 2008

Upward spiral

[This was originally posted at http://timstall.dotnetdevelopersjournal.com/upward_spiral.htm]

We've all heard of the dreaded downward spiral, where you're behind schedule with lots of bugs, so you cut corners and rush sloppy work, only to create more bugs and get further behind. Some folks seem to think this is just a rite of passage for computer programming, but it doesn't have to be that way. Just like there's a downward spiral, there's also an upward spiral, where you do good process and coding upfront, which saves you time, with which you can further improve your tools and process.

 

I'm a big fan of the upward spiral. The best way I see to jump on the upward spiral is to focus your resources on things that last, and avoid spending your resources on things that don't. For example, there's no benefit to you to waste hundreds of hours writing data-access plumbing code. You learn nothing new, you're probably not energized by it, and everyone around you likely just takes plumbing for granted. Things that "last" include your personal and technical knowledge, tools, helper classes, and any other thing that you can take with you to the next project or feature. Things that "don't last" are tedious bug fixes, plumbing code, obsolete technologies, and pointless trivia. These things help with the immediate task, but then are (usually) useless afterwards.

 

Practical ways to spend your resources on things that last:

  • If you're developing on your own time, if at all possible, focus on learning new technology and techniques that you find interesting - as opposed to just plugging away at a boring feature because it might impress your boss. Knowing the new techniques will let you develop much faster later, which will both permanently enhance your developing skills and impress your boss.

  • Try to learn something new each day. Usually if you're working for 8 hours, you can squeeze in a half-hour of experiments that relate to that work. Learning something new each day will quickly add up. Ideally, a developer could happily answer the question "what did you learn today?"

  • Actively pursue features that will teach you something. All else being equal, a good boss will want to give you features you're interested in because they know that you'll then be motivated to do a better job of it. So, aggressively show interest in the features that give you learning opportunities. Consider even investing your own personal time to prepare for it. If you invest two hours on the weekend to understand enough of the background to be qualified on a feature, which then lets you spend two weeks actually programming it hands on, that's a great investment.

I understand that sometimes your current project forces you into a rut, and you're just trying to survive. I still remember several 70-hour-a-week projects from when I used to do consulting. But it's always good to keep your eye on the prize - even if right now sucks, at least be aware of that in order to strive for something better.

Monday, June 9, 2008

Smart vs. Smart Aleck

[This was originally posted at http://timstall.dotnetdevelopersjournal.com/smart_vs_smart_aleck.htm]

An intellectually grueling field like software engineering will attract a lot of smart people. It also attracts smart alecks. There's a big difference.

 

Smart person | Smart aleck
Uses their smarts to help the project succeed. | Tries to get people to think how smart they are. Usually their facts are wrong and their ideas are impractical.
Tries to build positive things up - e.g. create new process, actively solve problems. | Constantly nitpicks trivia and tears things down, without offering alternative solutions.
Ready to actually implement ideas. | Retreats to empirical trivia or theory (perhaps by considering themselves "visionary").
Willing to admit they're wrong in order to find the best solution. | Avoids offering any criteria for falsification because they don't want to be "trapped" or be wrong.
Eager to have their own work reviewed in order to get the best product and learn from others. | Eager to review (and critique) others' work, but reluctant to apply the same standard to themselves.

 

No one likes smart alecks. I don't have a cure, but here are some ways I've found to deal with them:

  • Encourage them to funnel their efforts into something good. For example, say things like "That's an interesting idea, why don't you try building it?" or "That's interesting trivia, but do you see anything with a higher rate-of-return to focus on?"

  • Don't be intimidated - smart alecks often try to intimidate others with big buzzwords or obscure trivia. But you can cut through the buzzwords by asking them to explain it in plain English. Smart alecks are dangerous to a project, so good developers have an obligation to defend the project against the smart aleck's ego.

  • Put the smart aleck in their place, perhaps using a standard pass-or-fail approach with agreed-upon rules. For example, you may say "if your idea is right, then this C# method should not compile, are we agreed?" Nothing like having the compiler itself crush a smart aleck's ego.

  • Fire them. It can be tough, but if they're writing bad code, while constantly distracting others with false alarms via pointless trivia and argumentative questions, they may just be "not a good fit" for the company.

There's a saying, "If you have to tell someone you're a lady, then you aren't one." Likewise, smart people don't need to try impressing or convincing others that they're smart. Usually people just recognize it as a side effect of the value they add.

Sunday, June 8, 2008

The problem with "It's not what you know, it's who you know"

[This was originally posted at http://timstall.dotnetdevelopersjournal.com/the_problem_with_its_not_what_you_know_its_who_you_know.htm]

I remember when job-hunting back in college, lots of business majors would tell me "It's not what you know, it's who you know." Some kids even used it as an excuse to avoid studying in order to go to parties instead ("Why waste time studying pointless knowledge when what really matters is having a strong social network?"). While there is some merit to the idea - i.e. you do want to build your network - this paradigm doesn't apply well to skilled labor that can be objectively measured, like software engineering.

 

If a job doesn't require much skill, such that there are tons of qualified candidates, then of course personally knowing the hiring manager is a competitive edge. From their perspective, if all else is equal, hiring a known acquaintance mitigates risk. But, if a job does require a lot of skill, such that recruiters are actively competing to find that top talent, then they will beat a path to your door. In software engineering, if you have the knowledge, then people will want to know you. It's a two-way street: developers want to be employed, and companies want the best employees.

 

I think of it like talent in the NBA - some players just play better than others (I believe all people are equal, they just have different skills). That's why scouts are running all over the nation, constantly trying to woo the top free agents. If you're the top NBA draft pick, even if you don't know anyone yet, scouts are going to want to know you.

 

Sure, I understand that cronyism and nepotism exist, but in software engineering, such corruption would put that recruiter at a serious competitive disadvantage. Worst case, I'd expect that a corrupt manager's greed would trump their cronyism, and they'd hire the best talent. Anything else would essentially be throwing away money.

Tuesday, May 27, 2008

The new Lake County .Net Users Group

[This was originally posted at http://timstall.dotnetdevelopersjournal.com/the_new_lake_county_net_users_group.htm]

I'm a big fan of user groups - it's great being able to meet other professional developers. That's why I'm excited about a new user group being started in the ChicagoLand area: the Lake County .Net Users Group (LCNUG). It meets at the College of Lake County. Scott Seely, an author and former Microsoft employee, will be kicking it off with a presentation on Windows Workflow Foundation on June 26th. If you live in the northern Chicago suburbs, consider checking out the new LCNUG.

Sunday, May 25, 2008

Performance tips for a faster machine

[This was originally posted at http://timstall.dotnetdevelopersjournal.com/performance_tips_for_a_faster_machine.htm]

We all want faster machines. Slow machines, especially ones that freeze up, constantly interrupt your thought process and can pull you out of the zone. It's not just the extra 20 minutes spread throughout the day, it's also all the time lost refocusing after waiting for a long process. I'm no machine performance expert, but here are tips I've learned.

 

1. Run Defrag.

 

You can run this via the command line, so you can hook it up to a script that runs weekly. This MSDN article explains: "Disk fragmentation slows the overall performance of your system. When files are fragmented, the computer must search the hard disk when the file is opened to piece it back together. The response time can be significantly longer."

defrag c:\ /v /f

 

2. Clean up your hard drive.

 

A crowded hard drive makes your machine run slower - there's just less wiggle room for the operating system. I've heard some suggest that you should have at least 25% free. You'll need two big things for this: (A) a backup drive for offloading infrequently used files, and (B) a tool to find obsolete files. One good, free tool is CCleaner, which detects most of the common spots for dead-weight files. Another tool, SequoiaView, shows all file sizes in a treemap graph so that you can easily see which files are taking up space.
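If you want a quick way to see where a drive stands against that 25% rule of thumb, a few lines of Python can do it. This is only a rough sketch: the function names are made up for illustration, and the 25% default is just the suggestion mentioned above, not a hard limit.

```python
import shutil


def free_fraction(total_bytes, free_bytes):
    """Fraction of the drive that is free, between 0.0 and 1.0."""
    return free_bytes / total_bytes


def has_enough_room(path, minimum=0.25):
    """Return True if the drive holding `path` has at least `minimum` free."""
    # shutil.disk_usage returns a (total, used, free) tuple in bytes.
    usage = shutil.disk_usage(path)
    return free_fraction(usage.total, usage.free) >= minimum
```

On a Windows machine you would call something like `has_enough_room("C:\\")`, and a `False` result is the cue to start offloading files or hunting for dead weight.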

 

3. Clean out your registry

 

If you're continually installing and uninstalling programs, your registry may get bloated, causing big slowdowns. Modifying your registry is dangerous and could irreparably corrupt your entire machine, so back up your registry and machine data first. But, given the potential performance gain, it's still worth making some easy changes. While there are several commercial registry cleaner products out there, CCleaner is free and works well - it plays it safe and only removes the obvious registry errors.

 

4. Adjust your UI settings

 

Windows XP (I haven't touched Vista yet) lets you choose the balance between "pretty UI" vs. "fast UI". The idea is that pretty graphics (shading, rounded corners, transitions, etc.) take extra resources to render. If you're a developer who wants speed and doesn't care about gradient-shaded window panels, you can turn that stuff off: right-click My Computer, go to "Properties > Advanced > Performance > Settings", and adjust for "best performance." This will make everything look like old grey boxes - but it will be faster.

 

5. Kill or Block certain "hog" processes and system startup apps

 

Background services are a big culprit for hogging resources, because these services could always be running. Skim through your Windows Services to make sure that all the currently running ones (and the automatic ones) are ok. If a service doesn't sound familiar, ask your IT department if you can kill it. In addition to services, applications that automatically start up when the machine turns on can make for a slow system startup. CCleaner has an option for this as well, where you can explicitly block unwanted apps from automatically starting up.

 

6. Avoid running too many programs at once

 

This is pretty obvious. In Task Manager, the Performance tab shows your overall CPU usage and Commit Charge (how much memory is committed), and the Processes tab shows which programs are using the most CPU and memory.

 

7. Uninstall the programs you don't need

 

The more stuff on your machine, the slower it will run. For example, if you no longer develop with VS 2003, remove it. This is also a good reason to avoid installing all those games on your poor, overworked PC. (But if your laptop requires a certain game on it to function, that may be understandable)

 

8. Use batch scripts to turn off processes when you don't want them

Sometimes you need that heavy service running in the background, but sometimes you don't. For example, SQL Server can take a lot of resources. Consider having a batch script that starts it up and shuts it down - not just opening and closing SQL Studio, but stopping the actual service. You can use the "net" command in a batch script to start and stop services:

rem Start the services when you need them
net start "SQL Server (SQLSERVER2005)"
net start "Distributed Transaction Coordinator"

rem Stop the services when you are done
net stop "SQL Server (SQLSERVER2005)"
net stop "Distributed Transaction Coordinator"

9. Startup script

I try to avoid re-booting my machine because I lose all my sessions - open windows, loaded files, running applications, etc. One thing that slightly eases the pain is having a batch script (clickable from my desktop) that re-opens all my standard stuff, like Notepad, browsers, Cmd, and Windows Explorer. I don't necessarily want this as part of my startup, but it saves me a minute to just click the batch and have several applications all re-open themselves.
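The same idea can be sketched in Python instead of a batch file, for anyone who prefers that. Everything here is illustrative: `STANDARD_APPS` is a hypothetical list of Windows programs, and `reopen_all` is a made-up helper, not part of any tool mentioned above.

```python
import subprocess

# Hypothetical list of "standard stuff" to re-open; swap in your own.
# Each entry is a command plus its arguments.
STANDARD_APPS = [
    ["notepad.exe"],
    ["explorer.exe"],
    ["cmd.exe", "/c", "start", "cmd.exe"],  # opens a new command window
]


def reopen_all(apps=STANDARD_APPS, launcher=subprocess.Popen):
    """Start each app without waiting for it to exit; return how many started."""
    started = 0
    for command in apps:
        try:
            launcher(command)
            started += 1
        except OSError as error:
            # A missing program should not stop the rest from opening.
            print("Could not start", command[0], "-", error)
    return started
```

Calling `reopen_all()` after a reboot kicks everything off in one go. The `launcher` parameter is only there so the function can be tried out without actually opening any windows.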

 

10. Run Disk Cleanup

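Sometimes your machine may run slowly because of a cluttered or failing disk. Consider running Disk Cleanup to clear out temporary files. Note that Disk Cleanup only frees space; checking for actual disk errors is a separate job for the Windows error-checking tool (chkdsk). This MSDN article describes more (it also mentions freeing up disk space and defragmenting).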

 

Other ideas

There's always more you can do. I found these other articles to be informative reads: