Aug 30 2010

Keeping Dry

Published by jfrank at 4:57 pm under coldfusion, open source

Did you ever wonder what is shared between all those CF frameworks? I put together an implementation of the PMD project’s copy-paste detector (CPD) interface for cfm and cfc files. I also plugged it into Hudson, using the DRY plugin to visualize the output, but that is just candy on top.

Basically it works like this:

  1. It goes through your code line by line, character by character, and builds up overlapping hashes of the content.
  2. Based on your tokenizer, it ignores certain parts of the code (like whitespace), so the hashes can handle your 2 tabs vs. my 3 spaces and still show that two sections of code are ‘the same’. The tokenizer I am using is the ‘anyTokenizer’, which is not optimized for CF at all.
  3. Based on a threshold for how many tokens a match has to cover, it builds a report of the duplication in all the code you give it and outputs it in a structured format.
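
To make those three steps concrete, here is a minimal sketch in Python. It is not the PMD code: the names `tokenize` and `find_duplicates` are made up for illustration, the tokenizer is a crude stand-in that only drops whitespace, and a plain dictionary keyed on token windows takes the place of whatever hashing PMD does internally.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Return (token, line_number) pairs; whitespace never becomes a token."""
    tokens = []
    for line_no, line in enumerate(text.splitlines(), start=1):
        # Words and single punctuation characters are tokens (an assumption,
        # not the real tokenizer's rules).
        for match in re.finditer(r"\w+|[^\w\s]", line):
            tokens.append((match.group(), line_no))
    return tokens


def find_duplicates(sources, min_tokens):
    """Group identical windows of min_tokens tokens across all sources.

    sources maps a path to that file's contents. The result maps each
    repeated window to the (path, starting line) pairs where it occurs.
    """
    seen = defaultdict(list)
    for path, text in sources.items():
        tokens = tokenize(text)
        # Overlapping windows: one starting at every token position.
        for start in range(len(tokens) - min_tokens + 1):
            window = tuple(tok for tok, _ in tokens[start:start + min_tokens])
            seen[window].append((path, tokens[start][1]))
    return {window: locs for window, locs in seen.items() if len(locs) > 1}


# Same code, different spacing and indentation.
a = "<cfset x = 1>\n<cfset y = 2>\n"
b = "<cfset   x=1>\n\t\t<cfset y = 2>\n"
dupes = find_duplicates({"a.cfm": a, "b.cfm": b}, min_tokens=6)
for window, locations in dupes.items():
    print(" ".join(window), locations)
```

The two inputs differ only in whitespace, so every six-token window in one file also turns up in the other. Raising `min_tokens` is what keeps a real report from drowning in tiny matches.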

Now presenting, all of the code duplication between the following ColdFusion projects, run with a token threshold of 200:

  • ColdBricks
  • Coldbox
  • ModelGlue
  • coldmock
  • mura
  • MangoBlog
  • coldspring
  • mxunit
  • machii
  • farcry
  • fusebox5

Well… I’m not going to put them all here. It’s a large set.

Here are a few samples:

Farcry and mura both use cfformprotect. Cool.

51,515: <file line="288" path="/home/jfrank/temp/codeprojects/farcry/core/webtop/cffp/cfformprotect/cffpVerify.cfc"/>
51,516: <file line="287" path="/home/jfrank/temp/codeprojects/mura-5.2.2709/www/requirements/cfformprotect/cffpVerify.cfc"/>

BlogCFC and MangoBlog both share xmlrpc bits. Nice.

53,956: <file line="141" path="/home/jfrank/temp/codeprojects/BlogCFC5/client/xmlrpc/xmlrpc.cfc"/>
53,957: <file line="122" path="/home/jfrank/temp/codeprojects/MangoBlog_1.5/api/xmlrpc.cfc"/>

The full results are not quite valid XML, because of encrypted cfm files. They also contain a lot of boilerplate license text, as you can imagine. It would take a minor amount of cleanup to make this output parseable, and removing the licenses would make it much more compact. Also, some of the duplication shown may be intentional, for example generated code or plugin dependencies bundled into more than one project.
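
Until that cleanup happens, one way to get something useful out of the report without an XML parser is to pull out just the `<file .../>` elements with a regular expression. This is a rough sketch, not part of the jar: `file_entries` and `project_of` are hypothetical helpers, the attribute order (`line` then `path`) is assumed from the samples above, and the root directory is simply the one from my machine.

```python
import re

# Matches elements shaped like the samples: <file line="288" path="..."/>
FILE_TAG = re.compile(r'<file\s+line="(\d+)"\s+path="([^"]+)"\s*/>')


def file_entries(report_text):
    """Yield (path, line) for every <file .../> element found in the text."""
    for match in FILE_TAG.finditer(report_text):
        yield match.group(2), int(match.group(1))


def project_of(path, root="/home/jfrank/temp/codeprojects/"):
    """Return the first directory under root, or the path itself if outside it."""
    if not path.startswith(root):
        return path
    return path[len(root):].split("/", 1)[0]


sample = """
<file line="288" path="/home/jfrank/temp/codeprojects/farcry/core/webtop/cffp/cfformprotect/cffpVerify.cfc"/>
<file line="287" path="/home/jfrank/temp/codeprojects/mura-5.2.2709/www/requirements/cfformprotect/cffpVerify.cfc"/>
"""
projects = [project_of(path) for path, _ in file_entries(sample)]
print(projects)
```

Because the regular expression only looks at the `<file>` elements, whatever makes the rest of the report invalid is skipped rather than fixed.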

Want to run it yourself?

Get the current PMD jar, then grab my cfm-cpd.jar and this build file if you’d like to run it with Ant.

Point Ant at a lib directory with those two jars, and run the build against your own code. Don’t forget the DRY plugin if you run Hudson!

(Note: the cfm-cpd jar contains a couple of classes that overlap with PMD’s, and it relies on the jars being loaded alphabetically so that its versions win. This is lame, but so is PMD for making me hard code things in the Ant task!)

2 Responses to “Keeping Dry”

  1. Tim Beadle on 02 Feb 2012 at 6:50 am

    Have you installed this in Eclipse-PMD at all? Just trying to figure out how to add CF support to its copy-paste detection.

  2. jfrank on 06 Feb 2012 at 3:23 pm

    I haven’t tried that. It would be easy to include this as a plugin though. It’s just a matter of extending the basic text version with what to ignore.