Dec 09 2011


Published by jfrank at 3:35 pm under python

About a week before it closed I decided to start working on the Darpa Shredder Challenge. This challenge is to reassemble shredded documents using only the shredded pieces. Here is a quick video depicting an early part of the algorithm that I worked on.

Algorithm Notes

This video is a bunch of visualizations put back to back of the process described below. The algorithm is run against piece 1 of puzzle 1 of the challenge, shown below. You’ll note in my visualization it is upside down, this is simply due to the start of x,y in the graphing application is at the lower left, while the image library gave me x and y starting from upper left.


Since I had very little time, I wanted to use as much existing code as possible. To find edges I used gimp/python-fu to posterize each piece 5 times. Each time that ran, it simplified the colors of the image. I then saved out the green channel. This made the edge finding as simple as finding the uniform green edge pieces and rolling around each piece. The edge I found is shown in the right hand side as black pixels.


Because each piece could match on only part of another piece, I wanted to find good matches based on local shape part matching. I chose to segment the piece by rolling through each location with surrounding bits by distance. The red X represents the current location, surrounded on either side by an arbitrary sized window shown as blue highlight.

Rotation Invariance

Each piece is given to you in a random rotation. So the segments between pieces needed to be fitted together in such a way as to ignore which way they were originally rotated. So next I found a ‘landscape horizon’ using a linear average of the segment. I chose PCA because of the x/y arbitraryness in this problem. This is because ordinary least squares only penalizes on the y term, and I needed total least squares so it would penalize on both terms equally. This is shown as a red line.


After finding the horizon, this becomes the new relative x axis for that segment. I use a linear transformation to rotate to a normalized inward facing view, and then transpose it to be centered at zero zero. This is the lower left portion of the video. Most flat sides are nearly equal to the x axis at that point, with variation easily seen. If you watch this window alone as you see the video it is as if a camera is going around the piece looking in on it with smooth transitions.

Representing Shape

Next I built a representation of surrounding shape at every edge location based on the new normalized x,y from the previous step. Surrounding shape should include other markings page lines and pen marks, but I haven’t included them yet. I converted the rotated Cartesian coordinates to a log polar system, and then used binning to produce a log polar 2d histogram of the surrounding shape. This is the upper right picture. I adjusted my binning to be more sensitive to changes in y at the horizon rather than in the middle because most shapes are fairly close to the x axis after they have been rotated. Each histogram results in 100 total bins, each with a count of the items in the segment. The degrees are left to right, while distance is the vertical dimension. Inspiration for this shape representation came from Shape Matching and Object Recognition.


Using the 100 integer histogram arrays of shape I then clustered them using K means so that I could simplify the comparisons between groups of ordered segments. The result is that I can classify a segment’s shape as a single cluster membership, and an array of segments (a side of a piece) as an ordered list of those cluster assignments. (Not Shown)

Fuzzy Segment-Cluster-Array Matching and Reassembly

The next task is to match segment cluster arrays with others that are closest, indicating a general shape match. (Not Shown) This produces lists of candidate matches to aid in a manual or automatic assembly.

Running Out Of Time

At about this point the contest ended. I am still working on getting the first puzzle completely automatically assembled. I learned a lot, and had a good time but the primary takeaway was this: visualize, visualize and then when in doubt, visualize. Visualization costs a lot, and building pretty graphs for yourself is a complete waste of time if you are a perfect coder. Countless times throughout this process I had implemented a part of it, and moved on but only later came back and visualized what I had actually done. Often that immediately showed what went wrong. One mistake that it helped me find was that the rotation and transposition that I had originally done were rotated correctly but transposed based on my original center, not the rotated center, so the histograms were erratic. Once I could see that the middle was not 0,0 (in the lower left picture) I immediately knew what the issue was. Slowly I came to learn that every minute spent visualizing a problem like this pays off immediately.

No responses yet

Trackback URI | Comments RSS

Leave a Reply