Sunday, October 19, 2008

A New Code Metric: Destroyed Lines Of Code (DLOC)

We've all heard that counting lines of code, or SLOC, is a terrible way to measure a developer's performance. Function point counting is not much better, just slightly different from SLOC. Conventional wisdom says that good developers write fewer lines that get the job done better, and neither SLOC nor function points reward that behavior. (Although I admit SLOC does have its place in comparing two systems written by similar teams of developers.)

I propose a new metric that rewards good coding practice and a simple, brief style: DLOC, or Destroyed Lines of Code. You measure the bad lines of code that you remove from the system. If you can destroy lines of code and the system still works, then those lines were either bad or just misleading bloat. You can also destroy lines of code by using packages that solve your problems - like destroying JDBC calls and replacing them with Hibernate mappings, or destroying your Factory Patterns and Service Locators and replacing them with Spring dependency injection.

DLOC in Action

To illustrate DLOC, here is a system I worked on where I was asked to add some features. The system had an admin web interface backed by database tables: about 3000 lines of code, plus JSPs and 10 database tables. On the first pass, I added a table, then changed all the old JDBC calls to Hibernate (which went faster than I expected). That let me delete some JDBC code, so there were some destroyed lines of code right there. Next, I started really asking users about the app, only to find out that nobody really needed the admin web interface. In fact, they didn't need to edit the data that often at all. They would be perfectly happy with deploying changes at each release instead of making admins use a web interface. So I proposed replacing the database tables with an XML config file with a similar schema, then removing the web interface altogether. The users could edit the XML and deploy it with each release when changes were needed. In the end, I destroyed all the tables, all the JSPs and much of the Java code, saving only the POJOs. I added some XML parsing, and the whole system came to only about 300 SLOC.

If you analyze my performance by SLOC, I earned a terrible score: negative 2700. But by DLOC, I destroyed 2700 lines of code, and the users still got what they wanted. Hidden benefit: new data and features were simple to add to the tiny and simple system.
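As a rough worked example of those two scores, here is a small Python sketch using the approximate sizes from the story (about 3000 SLOC before, about 300 after). The function name `score` is made up for illustration, and treating DLOC as the net reduction is a simplification.

```python
def score(sloc_before: int, sloc_after: int) -> dict:
    """Compare the two views of the same change (illustrative helper)."""
    net_sloc = sloc_after - sloc_before  # what an SLOC-based review sees
    dloc = sloc_before - sloc_after      # net lines destroyed (simplified)
    return {"net_sloc": net_sloc, "dloc": dloc}


# Approximate figures from the admin-interface story.
result = score(sloc_before=3000, sloc_after=300)
print(result)
```

Since some of the final 300 lines were newly written XML parsing, and added lines are not supposed to count, the true number of destroyed lines would be at least the net figure computed here.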

The rules: DLOC Methodology

Any metric needs rules so that performance can be measured fairly, so DLOC needs some rules:

  • Count the lines of code that were destroyed (removed) from the system. Bigger numbers are better (in contrast to golf scores, where lower numbers are better).
  • Destroyed comments also count. Comments can lie, and removing bad comments adds value to your system.
  • Lines of code you added while changing the system count neither for you nor against you. After all, we don't know if those new lines are any good. We already know that we dislike SLOC, so we'll avoid looking at added lines and only count destroyed lines.
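One way to apply these rules to a single file is to diff its old and new text and count only what disappeared. The sketch below is a minimal illustration using Python's standard `difflib`, not an established tool. The name `count_dloc` and the `count_blank` switch are assumptions made for this example: the rules above do not say how to treat blank lines or modified lines, so here a modified line counts as destroyed and blank lines are skipped by default.

```python
import difflib


def count_dloc(old_source: str, new_source: str, count_blank: bool = False) -> int:
    """Count lines of old_source that no longer survive in new_source.

    Assumptions (not part of the rules above): a rewritten line counts as
    destroyed, and blank lines are ignored unless count_blank is True.
    """
    old_lines = old_source.splitlines()
    new_lines = new_source.splitlines()
    matcher = difflib.SequenceMatcher(a=old_lines, b=new_lines, autojunk=False)

    destroyed = 0
    for tag, i1, i2, _j1, _j2 in matcher.get_opcodes():
        # "insert" and "equal" are ignored: added lines count neither way.
        if tag in ("delete", "replace"):
            removed = old_lines[i1:i2]
            if not count_blank:
                removed = [line for line in removed if line.strip()]
            destroyed += len(removed)  # comments are counted like any line
    return destroyed


old = """// open the connection
Connection c = ds.getConnection();
Statement s = c.createStatement();
ResultSet rs = s.executeQuery(sql);
return map(rs);
"""
new = """return session.get(User.class, id);
"""
print(count_dloc(old, new))
```

Summing this over every file touched by a change would give a DLOC figure for the whole change. Whether whitespace-only removals should count is a judgment call that the `count_blank` flag leaves open.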

Obviously this DLOC system can be cheated just as easily as SLOC or function points: I could write a few thousand lines, only to purposely destroy them later. Also, you would expect a low DLOC for a brand new system where you've got to start from scratch (it seems I rarely have this luxury). DLOC is truly interesting on a system that is aging and even more interesting if the system was not constructed quite so perfectly.

Proper DLOC Usage

Truly caring for a software system has led me to apply different techniques to make the software better. I try to use Agile development, test-driven development, and even the Broken Windows theory when I am working on a system. DLOC is more of an observation than a methodology, but I think it's rewarding and fun to measure DLOC while improving a system. So here are some ways you can make your software system better, and increase your DLOC score too:

  • Introduce new technologies into your system: Hibernate, Spring, or even converting from Java 1.4 to Java 5 can reduce complexity and destroy lines of code (e.g. annotations, typed collections).
  • Ask your customers what they like and dislike about the system. You may find that certain parts of the system can be removed or refactored.
  • When you see code that could be improved or destroyed, change it immediately. Don't procrastinate. This is Broken Windows for software.

So improving software is the goal, and DLOC is a fun metric you can use to measure the changes you make. You can use DLOC to prove to yourself that the changes you made had a big impact on the system. Be proud of yourself and please help that poor software become better by destroying those unneeded lines of code. If you don't destroy them, who will?

-Jay Meyer

jmeyer at harpoontech dot com

8 comments:

Rajesh Patel said...

This is a great article. I was just telling a co-worker last week that I enjoyed deleting code almost as much as writing it.

JP said...

Hi,

Nice article. I even enjoy deleting code sometimes.

Anonymous said...

I agree with your basic premise that removing lines of code can help a system. Unfortunately, all metrics can be gamed, and this is no exception.

What happens when they remove all the whitespace in their code? What happens when they remove all the comments? It requires sophisticated parsing to determine if each line contains only one "programming action". And it is currently impossible for a computer to determine if comments were valuable or not.

In the end I believe it comes down to human managers reading the code their team produces and making judgement calls. This metric could inform that decision, but it doesn't hold on its own.

Good post!

Dennis Sellinger said...

A good post and a novel idea. We all like to remove bad code. Unfortunately, when we do have to add functionality we do have to add code.

I think we should ignore the fact that all metrics can be hacked. If people care more about hacking metrics than about using them properly, they are of no use to anyone.

So my problem is, how do you measure destroyed lines of code when we are adding more code? I think a metric is useful only if I can calculate it easily and if I can understand what it is saying.

If the version control system could give a measure of this (like repository churn - # of lines added, modified and deleted) we might be able to make sense of all this. Currently, my SCM (Perforce) does not give me this information (although I have never really made an effort to ask).

It is true that everyone is happy when we can say we deleted a bunch of complex code, so it would be nice to have a measurement for this. So other than calculating SLOC and hoping that it decreases, how do we measure DLOC?

Anonymous said...

http://msr.uwaterloo.ca/

The mining software community has found that removing code is actually much more rare than adding. So looking at code removed is like looking at rare events. So it is often worthwhile.

Jay Meyer said...

Evan,
Well said. As I pointed out in the posting, this metric is not a good measure of productivity either - but it's certainly more fun to measure for your own amusement and sense of accomplishment.

Perhaps in the future when an unenlightened manager berates you about the importance of metrics, you can ask them to measure DLOC as well. Then maybe they can see how too much focus on metrics detracts from the goal - good software.

Unknown said...

Good post.

While it may be difficult to come up with accurate stats, it's still a great mindset to get developers and their managers to embrace.

The inability to codify this is probably a good thing, since hard stats might even have a negative effect, in terms of developers going overboard with perl-like expressions (perl golf).

Another way we kill code and get additional benefit is when we apply the DRY principle.

I think you need a better name though. It's hard to pronounce dloc. How about "kloc"? As in how many lines of code have you kloc'd today?

Anonymous said...

I agree with evan's conclusion - this could be a great methodology that could improve the quality of code, but it could also make it unreadable. There are plenty of high quality code principles that require an additional 2 or 3 lines but increase the readability a lot. Therefore lines of code are a subjective factor of evaluation after all.