Mar 05 2013

What is the optimal number of FTP files to download at once?

Published by jfrank under internet


You want to download multiple files from an FTP server (or over a similar TCP-based download protocol).


To optimize for throughput you need to download more than one file at a time. The reason is simple: TCP spends a lot of time waiting for ACKs, and FTP clients and servers seem to want to do things very serially. This is probably because of the REST(art) command and keeping the implementation simple to debug. The longer the internet distance (defined by hops/latency, not necessarily physical distance) the more apparent this lack of efficiency is. You may have plenty of bandwidth to spare, but if you spend time waiting it will go unused.  When downloading more than one file you have the chance that your bandwidth is utilized rather than waiting.


How many files are optimal? In my very non-scientific trials, 2 were too few and 10 were too many. With 10 files I seemed to hit stutters, perhaps buffer issues somewhere along the line, which caused noticeable hiccups across all active downloads simultaneously. Other constraints with 10 could be server random read performance or network congestion. With 2 files my bandwidth was underutilized, I assume because I was waiting for ACKs for a significant share of the time.

Adjusting the downloads to 5 seemed to arrive at a semi-optimal max throughput for this case. With other servers or tasks I could see the optimal number being different. That calls for an adaptive algorithm rather than a fixed value. My FTP program asks, “how many files would you like to download simultaneously?” I would like to select “adaptively maximize my throughput.”


  • 2 Files @ 180 – 360 KB/s
  • 5 Files @ 130 – 650 KB/s
  • 10 Files @ 50 – 500 KB/s

The data suggests a curve which could be tested, approximated and then the adaptive algorithm could pick and re-adjust as needed.
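One way such an adaptive choice could work is a simple hill climb: try one more simultaneous download, keep the change if aggregate throughput improves, and stop when it no longer does. This is only a sketch of the idea, not something my FTP program does. `measure_throughput` is a hypothetical callback that would run n downloads for a short interval and report the combined rate, and `start` and `max_files` are assumed parameters.

```python
from typing import Callable


def pick_concurrency(
    measure_throughput: Callable[[int], float],
    start: int = 2,
    max_files: int = 16,
) -> int:
    """Hill-climb toward the number of parallel downloads with the best rate.

    measure_throughput(n) is a hypothetical callback: it should run n
    simultaneous downloads briefly and return the aggregate KB/s observed.
    """
    best_n = start
    best_rate = measure_throughput(best_n)
    while best_n < max_files:
        candidate = best_n + 1
        rate = measure_throughput(candidate)
        if rate <= best_rate:
            break  # throughput stopped improving; keep the previous setting
        best_n, best_rate = candidate, rate
    return best_n
```

To re-adjust as conditions change, the same function could be called again periodically with `start` set to the current value. Real measurements are noisy, so a practical version would likely average several samples per setting before comparing.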


Jun 22 2012

Oscilloscope Test

Published by jfrank under open source

Recently I bought an oscilloscope to help diagnose some problems I’ve been having with a Newhaven Displays model NHD-3.12 blue OLED screen. It uses the SPI interface. As I read the datasheet, it expects MSB-first data and is a mode 3 SPI device.

I am just going to share a bit of what I learned by turning it on and testing various output pins from the microcontroller program running on my Teensy, which is sort of an Arduino. I’ll use Y and B for the upper yellow and lower blue signals respectively. I’ve been working with Oliver Kraus, the author of u8glib, to add support for this display, and this is a test of the initial version of that program. By every diagnostic that I can perform, the C program he sent seems to be doing everything correctly, but the display will not respond.

This is a high level view of the clock pin (Y) and the output pin (B). You can see that the clock doesn’t run continuously as I had assumed, but instead runs only when there is data to be sent. The whole hello world display program takes about 600ms in a loop and waits 500 ms at the end of each data load. The low activity area you see around the center is the pause.

Y is the clock pin during device init; the pin goes high just once as an artifact of power-up. B is the reset pin, which is pulled low and then high again, which should start the display.

Y is again the clock during init, and B is the chip select pin. Chip select for this device should be active low, and it exhibited that behavior. It was only high while data/commands were not being sent, and for a brief period during the reset signal shown in the previous image.

The next four sets of two images show three pins during the initial byte phrases of the Hello World paging loop. Challenge: can you calculate the MSB-first hex values for the output pin? In the first picture of each set, Y is clock and B is the output pin. In the second picture, Y is mode and B is again output. Mode is irrelevant for the challenge, but for the device the mode tells whether to count the byte as an instruction/command (Y low) or as an argument/data (Y high).

Byte 1

Byte 2

Byte 3

Byte 4
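For anyone who would rather check their answer with code, here is a rough sketch of how the decoding could be done. It assumes the two traces have been turned into equal-length lists of 0/1 samples (a hypothetical capture format, for example read off the screen by hand). In SPI mode 3 the clock idles high and the data line is sampled on each rising clock edge.

```python
from typing import List, Sequence


def decode_spi_mode3(clock: Sequence[int], data: Sequence[int]) -> List[int]:
    """Decode MSB-first bytes from sampled clock and data traces.

    clock and data are equal-length sequences of 0/1 samples (an assumed
    format for this sketch). Bits are taken on rising clock edges, which is
    where a mode 3 device samples the data line.
    """
    bits = []
    for prev, cur, bit in zip(clock, clock[1:], data[1:]):
        if prev == 0 and cur == 1:  # rising edge: sample the data line
            bits.append(bit)
    decoded = []
    # Only whole bytes are decoded; leftover bits at the end are ignored.
    for i in range(0, len(bits) - len(bits) % 8, 8):
        value = 0
        for bit in bits[i:i + 8]:
            value = (value << 1) | bit  # the first bit received is the MSB
        decoded.append(value)
    return decoded
```

Printing the result with `[hex(b) for b in decode_spi_mode3(clock, data)]` gives values that can be compared against the init sequence in the datasheet.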


Jun 11 2012

Star Trek Analysis

Published by jfrank under python

After a recent sleep study I asked the doctor if I could take the data with me. He laughed and agreed, after admitting that he had to have his data too when he first took a study. I realized after I came home that I potentially had an interesting set of data, because before I drifted off in the laboratory I finished up an episode of Star Trek: The Next Generation, season 1, episode 18. The doctor immediately noticed that I was watching something by glancing at the EEG and eye tracking chart the next morning. I decided that I would try to correlate the episode with my EEG data. This post is a precursor to that mini-project.

In order to do that I needed something about the video to use as my triggering events, something to correlate against. I decided to analyze the episode frame by frame and compare each frame with the next one to detect the ‘amount of differentness’. I used a simple metric: take the RGB values of each pixel in the frame at time T, then compute the absolute value of the difference between each position in frame T and the corresponding position in frame T+1. As you may guess, the super sharp spikes are scene cuts, where nearly all the pixels have a large difference between T and T+1.
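A minimal NumPy sketch of that metric follows. It assumes each frame is already loaded as a height x width x 3 array of 8-bit RGB values, and it assumes the per-pixel differences are summed into one number per frame pair; the function names are made up for this example.

```python
import numpy as np


def frame_difference(frame_a: np.ndarray, frame_b: np.ndarray) -> int:
    """Sum of absolute per-channel differences between two RGB frames."""
    # Widen from uint8 first so the subtraction cannot wrap around.
    a = frame_a.astype(np.int32)
    b = frame_b.astype(np.int32)
    return int(np.abs(a - b).sum())


def difference_series(frames):
    """Difference between each frame and the next, for an iterable of frames."""
    scores = []
    previous = None
    for frame in frames:
        if previous is not None:
            scores.append(frame_difference(previous, frame))
        previous = frame
    return scores
```

Plotting the list returned by `difference_series` against frame number gives the kind of line plot discussed in this post, with scene cuts standing out as tall single-frame spikes.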

What you see below is a clip from that episode at about 32:25, 10 frames turned into an animated GIF which is trough to trough of the ‘double mountain’ shape in the center of line plot below. This caught my eye as a repetitious and mechanically generated shape so I pulled up the video at that moment and indeed it was Dr. Crusher doing computer analysis of medical data from the doomed denizens of Aldea. By comparison the smooth line to the left of the first cut-spike is her face toward the camera head slowly turning toward the computer.

Next is a two-frame loop from my most memorable scene in this episode. The planet demonstrates its power by hurling the Enterprise through space like a toy.

This is two frames animating, and you can see the ship is moving slightly but the rest of the scene doesn’t change. These two frames are the bottom of the trough most of the way down the slope of the jagged mountain in the center of the plot. This seemed bizarre, and I also found it by looking at the plot below and noticing the jagged edges. After looking at several special effect scenes that have this same characteristic, I realized the jagged spikes in the chart indicate 12 fps (or some other partial frame rate) with interpolation. Some of this effect could be due to the poor quality compressed digital copy that I have, but the visual analysis (some things moving, while everything should be moving) indicates there is a real difference in the broadcast. In this repetition, the star field would freeze every fourth frame while the ship continued to move.

All ~60000 frames (at 24 fps)  are available here.

Full Episode Frame Change Data


May 08 2012

Glass Ceramics Are So Cool

Published by jfrank under Uncategorized

How do you make very hard glass ceramics with low coefficient of thermal expansion (CTE)?

Answer: Take a combination of SiO2+Al2O3 (about 80%-90%), some TiO2+ZrO2 (about 4.5%) and Li2O (also about 5%). Heat it to 1650 C.

Anneal it at 650C for two hours.

Raise the temp slowly to about 680 C and hold it for an hour while crystal nucleation begins, then raise it to about 850 C to transform the glass into a few different crystals, ceramizing it.

Now the CTE of the final product will be the amalgam of the CTEs of the component parts, possibly even producing a material with a negative coefficient! Imagine something that shrinks when you heat it.

I’d love to make some someday; however, the temperatures involved are extreme.

Microstructure and Properties of Li2O-Al2O3-SiO2-P2O5 Glass-Ceramics


Mar 12 2012

Making Games

Published by jfrank under games

“I want to make my cat shoot acorns! Ok?” a grinning sixth grade girl proclaims while pointing to her laptop.

“Sure thing.” I reply, smiling back.

I spent the last three months teaching a class about making video games for Saturday Academy. Working from two schools in north Portland that participate in the SUN after school program, I got to hold conversations like these during each class. During the end of 2011 I began wanting to teach a high school programming course, and when I heard about this opportunity via twitter I knew it wasn’t exactly what I wanted but it was close.

The software environment for the class was predetermined to be Game Maker from YoYo Games. I was skeptical at first, wanting a cross-platform open system, but it turned out to be good for getting students into game programming AND design all in one environment. I started by having each student sketch a character in four positions: run1, run2, stand, and jump. Using these as different sprites for the same object, we animated the characters and built basic platforming games over the course of 8 weeks.

This video is a montage of various game concepts that students came up with. Enjoy!


Jan 01 2012

It’s Not the Critic Who Counts: 2011 Part 3

Published by jfrank under open source

Welcome to Part 3 of my year in review. Part 1 was all about changing things up. Part 2 was about a few projects I did, and here are a few more.

linked in left out

I had a thought and built a… sketch of the concept. The idea is basically LinkedIn for people experiencing homelessness. Instead of professional high fives, they would exchange bits of reputation, like eBay’s, based on how many positive impacts they make in the community, big and small. It is non-monetary, since credit/capital isn’t usually available to that population. The card could function as some sort of ad-hoc resume or a way to gauge trust in someone you meet. I’m not an expert in this area so I don’t know if it’s a good idea, but I wanted to submit it as a sort of working sketch to show to people in the social work field.

Technologies Used: Google App Engine, Gaelyk, Groovy, Java

(un)shredding like a boss

I jumped into the DARPA Shredder Challenge at the last minute. I wrote up a small post about this so I won’t go into too many details. This was a fun chance to apply some of the machine learning skills that I talked about in part 1. I used a clustering algorithm to sort piece segments by similarity before a search algorithm attempted to piece together likely candidate neighbors. Here is a video visualization to make you happy.

Technologies Used: NumPy, Python, Gimp-Fu/Python-Fu

look no hands, brain surgery for a hotplate

I started playing with Node at the end of this year. If you’re going to play with node, you should also be doing something with because why not they are made for each other! I decided to hack a hotplate stirrer that I had acquired for science. For science, you monster. First I plugged a locally made Teensy Arduino clone chip into a solderless breadboard and loaded up a slightly modified Arduinoscope. Using that software oscilloscope and a digital multimeter I mapped the pins of the old Motorola microcontroller (seen right, disconnected). I rebuilt the functionality of the original chip in Arduino’s (easy) C-like environment, and then began improving it.
I added a USB – serial API to the Teensy to control the hardware, and implemented the other side on my laptop in Node. Because that was running on my laptop, it was available on my local wireless network. I chose jQuery Mobile to build a cross-platform, ridiculously simple UI with live events on the slider bars. Socket.IO pushes down the hardware state (rpm and temp) and pushes up commands (change rpm or temp), and the Teensy sketch gives serial state output and waits for commands.
Technologies Used: Node, Socket.Io, Arduino, Corning Hotplate Stirrer, JQuery Mobile
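The laptop side of a serial link like this mostly comes down to parsing state lines and formatting commands. The sketch below shows that shape in Python rather than Node. The wire format here (`rpm=300 temp=80` for state, `rpm 450` for a command) is invented for illustration and is not the format the Teensy sketch actually used.

```python
def parse_state(line: str) -> dict:
    """Parse one hypothetical state line such as 'rpm=300 temp=80'."""
    state = {}
    for field in line.split():
        key, _, value = field.partition("=")
        state[key] = int(value)
    return state


def format_command(name: str, value: int) -> bytes:
    """Build one hypothetical newline-terminated command for the serial port."""
    return f"{name} {value}\n".encode("ascii")
```

Each parsed state dict is what would be pushed down to the browser, and each slider event coming back up would be turned into bytes with `format_command` before being written to the port.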


Dec 31 2011

It’s Not the Critic Who Counts: 2011 Part 2

Published by jfrank under open source, resources

In part one I moved and started new things… here is what else I did.

helicopters that fly upside down

One of my big goals this year was to study machine learning. I worked my way through Andrew Ng’s Stanford Machine Learning course. One of his demos is flying a helicopter upside down using reinforcement learning.

Technologies used: Python’s NumPy, Octave

transformations in 3d

I quickly learned that in order to grasp most machine learning algorithms, an understanding of linear algebra was a must. I put Machine Learning on hold and took Gilbert Strang’s MIT Linear Algebra course next. This guy can describe your haircut before and after in terms of a matrix operation. I earned 83,802 Khan Academy points mostly working through semi-advanced mathematics.

Technologies used: Python’s NumPy, Octave

the basic tools of science

I think everyone should own the basic tools of science; that’s why this year I set out to buy them. I acquired a 1600x microscope, a digital thermometer, a magnetic stirring hotplate, lab glassware and a balance that is sensitive down to a thousandth of a gram. Somehow in modern media portrayals home science equipment has nothing but bad connotations. I wholly reject that idea. Science belongs at home! In a great article Bill Nye put it best:

People who want to make meth will find ways to do it that don’t require an Erlenmeyer flask. But raising a generation of people who are technically incompetent is a recipe for disaster

combine at 80 degrees celsius

Women’s deodorant is bad. Not because it doesn’t work, it does, but because it contains aluminum. For those with sensitive skin, it’s awful. There are also a few potential hazards with applying nano-sized metals that are easily absorbed into the human body. My brother is a PhD chemist, so he helped me with my lab technique. I asked a large chemical company for a sample of sodium stearate (they sent me 2 kilos free!), bought some propylene glycol locally, and made a variant on Old Spice. Colorless, odorless. My wife loved it, but it needs some work.

wingdings and matisse

An Android app that uses Wingdings AND Matisse? You’re thinking “No you didn’t” but in fact I did. I did. In case you forgot these fonts.. let me give you a refresher:

I built a task manager with Android push notifications for a client who really really wanted one. I wanted to build an Android app, so we had a deal. A few weeks later it was done. I don’t know why people complain about Android development, I found it fairly painless.

Technologies Used: Java, Android SDK, Eclipse, PHP (client backend), Google’s Android Push API

no pie for you

A friend of a friend approached me with an idea, and I took it on as a co-founder. We built out a QR powered profile site for hot-rods called After printing up some stickers we got some enthusiastic responses. We even applied for PIE, an incubator here in Portland but we were denied. There was stiff competition, but perhaps they were right. This site never really took off as planned. It costs next to nothing to run, so my partner and I will continue to let it run, but it was a good lesson in the cost of prototyping. I still think the idea is good, especially if adopted by a whole car club for the purposes of a unified car show.

Technologies Used: Google App Engine, Java, Groovy, QR codes

Next Post: Un-Shredding and Electronic Brain Surgery


Dec 30 2011

It’s Not the Critic Who Counts: 2011 Part 1

Published by jfrank under magnolia, open source, python

This has been a big year for me. So big that you all get a year-end recap of it because you are here, reading my blog. Except it’s too big to write in one post. So if you didn’t think you were going to get personal stuff mixed in to this mostly tech blog, now is the time to unsubscribe.

One of my favorite quotes is this:

It is not the critic who counts: not the man who points out how the strong man stumbles or where the doer of deeds could have done better. The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood, who strives valiantly, who errs and comes up short again and again, because there is no effort without error or shortcoming, but who knows the great enthusiasms, the great devotions, who spends himself for a worthy cause; who, at the best, knows, in the end, the triumph of high achievement, and who, at the worst, if he fails, at least he fails while daring greatly, so that his place shall never be with those cold and timid souls who knew neither victory nor defeat – Roosevelt

What this means to me is that I am getting comfortable not with success, but with failure. And with failure, slowly, haltingly, comes some measure of success.

cloud surfing

At the beginning of 2010 at my job at Mentor I enjoyed moving some more systems to AWS from an acquisition’s internal server farms. I closed out six years at Mentor; an excellent chapter in my life that I was sad to see end. I miss my co-workers, and agree with Barney “Twitter is such a poor excuse for seeing them every day.”  When I put my notice in, most were happy for me, some could hardly believe what I was doing. Others didn’t actually believe that I did not have a position to go to at another safe corporation. They kept asking what my real plan was, and I kept replying: “I’m taking a year off to study, to grow, to try new things.”

Technologies used: Bash, Railo, AWS APIs, Python

life changes. no really it does!

quit my job in March,
moved out of downtown in June,
rented a big old house a week later,
became a foster parent in July,
and enrolled as a fake student at PSU.

it slices, it dices. well no, actually it only slices.

I had a poor experience with a cloud dashboard company and built as a response. It makes snapshots for AWS volumes on a schedule. So simple. Happy dance. It broke even almost immediately, and although it doesn’t make tons of money, there is a lot of room here for growth. A customer is currently asking to pay me to develop new features for it, so I may revisit it and roll out new things. I wanted to build this as compartmentalized as possible. One of my design goals was that the front end know as little as possible about the backing AWS services. So it talks exclusively to the middle Python tier, even though it is powerful enough to accomplish both functions. This means that if I needed to I could scale those components separately, and keep the user-facing process in a separate Linux user/group from the process that talks to AWS and has the security keys.

Technologies used: Java Magnolia and Railo templating front end, python web service backend,  AWS SimpleDb storage. Scaleable! Stateless! Cloud!
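The core of a scheduled snapshot job like this can be quite small. The sketch below is not the service’s actual code: the `schedule` and `last_run` dictionaries are hypothetical structures, and it assumes a client object with a `create_snapshot(VolumeId=..., Description=...)` method, which is how boto3’s EC2 client exposes that API call today. Passing the client in keeps the AWS keys in the back-end process, in line with the separation described above.

```python
from datetime import datetime, timedelta


def due_volumes(schedule: dict, last_run: dict, now: datetime) -> list:
    """Return the volume ids whose snapshot interval has elapsed.

    schedule maps volume id -> timedelta between snapshots, and last_run maps
    volume id -> datetime of the previous snapshot. Both are hypothetical
    structures made up for this sketch.
    """
    due = []
    for volume_id, interval in schedule.items():
        previous = last_run.get(volume_id)
        if previous is None or now - previous >= interval:
            due.append(volume_id)
    return due


def snapshot_due(ec2_client, schedule: dict, last_run: dict, now: datetime) -> list:
    """Snapshot every due volume and record when it was taken."""
    taken = []
    for volume_id in due_volumes(schedule, last_run, now):
        # Assumed to behave like boto3's EC2 client create_snapshot call.
        ec2_client.create_snapshot(
            VolumeId=volume_id,
            Description=f"scheduled snapshot {now.isoformat()}",
        )
        last_run[volume_id] = now
        taken.append(volume_id)
    return taken
```

A cron job or a simple loop could call `snapshot_due` every few minutes; volumes that are not yet due are skipped.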

on being a fake student

Being a fake student is the best! When you take a one credit class at PSU you have access to super high speed internet, a research library, and a place to go work. I spent a large part of my year hacking code in Food For Thought cafe, next to artists, musicians, and hippies. But that’s not all. I also have access to the new(ish) Rec Center which has a pool, spa, rock wall and all the normal workout stuff.

Oh, and my one credit class? Yoga. So stressful.

Next post: Basic Tools of Science and Flying Helicopters Upside Down


Dec 09 2011


Published by jfrank under python

About a week before it closed I decided to start working on the DARPA Shredder Challenge. The challenge is to reassemble shredded documents using only the shredded pieces. Here is a quick video depicting an early part of the algorithm that I worked on.

Algorithm Notes

This video is a bunch of visualizations, put back to back, of the process described below. The algorithm is run against piece 1 of puzzle 1 of the challenge, shown below. You’ll note that in my visualization it is upside down. This is simply because the graphing application puts the x,y origin at the lower left, while the image library gave me x and y starting from the upper left.


Since I had very little time, I wanted to use as much existing code as possible. To find edges I used gimp/python-fu to posterize each piece 5 times. Each time that ran, it simplified the colors of the image. I then saved out the green channel. This made the edge finding as simple as finding the uniform green edge pieces and rolling around each piece. The edge I found is shown in the right hand side as black pixels.
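Once a piece has been reduced to a clean foreground/background mask, picking out its outline is simple. The sketch below is a stand-in for the Gimp-based step rather than a copy of it: it assumes the piece is already a boolean NumPy array (True where the pixel belongs to the shred) and marks every piece pixel that touches the background. It returns an unordered mask; walking those pixels in order around the piece would be a separate step.

```python
import numpy as np


def boundary_mask(piece: np.ndarray) -> np.ndarray:
    """Mark piece pixels that touch the background in a 4-neighbourhood.

    piece is a boolean array, True where the pixel belongs to the shred.
    """
    # Pad with background so pixels on the image border count as edge pixels.
    padded = np.pad(piece, 1, constant_values=False)
    interior = (
        padded[:-2, 1:-1]    # neighbour above
        & padded[2:, 1:-1]   # neighbour below
        & padded[1:-1, :-2]  # neighbour to the left
        & padded[1:-1, 2:]   # neighbour to the right
    )
    return piece & ~interior
```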


Because each piece could match on only part of another piece, I wanted to find good matches based on local shape part matching. I chose to segment the piece by rolling through each location with surrounding bits by distance. The red X represents the current location, surrounded on either side by an arbitrary sized window shown as blue highlight.
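The windowing can be sketched as a sliding window over the ordered outline. This version assumes the edge is a list of (x, y) points in order around the piece and sizes the window by point count rather than by measured distance, which is a simplification; `half_width` is an assumed parameter.

```python
def contour_windows(contour, half_width):
    """Yield (index, segment) for every point on a closed, ordered contour.

    contour is a list of (x, y) points in order around the piece. Each
    segment holds half_width points on either side of the current point,
    wrapping around the ends because the outline is a loop.
    """
    n = len(contour)
    for i in range(n):
        segment = [contour[(i + k) % n] for k in range(-half_width, half_width + 1)]
        yield i, segment
```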

Rotation Invariance

Each piece is given to you in a random rotation, so segments from different pieces needed to be fitted together in a way that ignores how they were originally rotated. So next I found a ‘landscape horizon’ using a linear average (a best-fit line) of the segment. I chose PCA because of the x/y arbitrariness in this problem: ordinary least squares only penalizes error in the y term, and I needed total least squares so that it would penalize both terms equally. This is shown as a red line.
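A total least squares line through a set of 2-D points falls out of PCA directly: the first principal direction of the centered points is the line’s direction. Here is a minimal NumPy sketch of that fit (the function name is made up for this example). Unlike ordinary least squares, it handles a vertical run of points without trouble.

```python
import numpy as np


def principal_axis(points: np.ndarray):
    """Fit a line to 2-D points by PCA (total least squares).

    points is an (n, 2) array. Returns (centroid, direction), where direction
    is a unit vector along the axis of greatest variance. Its sign is
    arbitrary, so (1, 0) and (-1, 0) describe the same line.
    """
    centroid = points.mean(axis=0)
    centered = points - centroid
    # Rows of vt are the principal directions, strongest first.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centroid, vt[0]
```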


After finding the horizon, this becomes the new relative x axis for that segment. I use a linear transformation to rotate to a normalized inward facing view, and then translate it to be centered at zero, zero. This is the lower left portion of the video. Most flat sides are nearly equal to the x axis at that point, with variation easily seen. If you watch this window alone as you see the video, it is as if a camera is going around the piece looking in on it with smooth transitions.
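The normalization can be sketched as a shift followed by a rotation. In this version `direction` is the horizon direction for the segment and `center` is the current edge location; both names are assumptions for the example. Shifting by the center before rotating guarantees that the center ends up exactly at (0, 0). Deciding which side of the horizon is "inward" is left out here.

```python
import numpy as np


def normalize_segment(points: np.ndarray, direction: np.ndarray,
                      center: np.ndarray) -> np.ndarray:
    """Rotate points so direction lies along +x, with center at the origin."""
    angle = np.arctan2(direction[1], direction[0])
    c, s = np.cos(-angle), np.sin(-angle)
    rotation = np.array([[c, -s], [s, c]])  # rotation by -angle
    # Subtract the center first so the rotation pivots on it and the current
    # edge location lands exactly on (0, 0).
    return (points - center) @ rotation.T
```

If the shift is applied after the rotation instead, it has to use the rotated center, not the original one, or the segment will not end up centered.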

Representing Shape

Next I built a representation of surrounding shape at every edge location based on the new normalized x,y from the previous step. Surrounding shape should include other markings, such as page lines and pen marks, but I haven’t included them yet. I converted the rotated Cartesian coordinates to a log polar system, and then used binning to produce a log polar 2d histogram of the surrounding shape. This is the upper right picture. I adjusted my binning to be more sensitive to changes in y at the horizon rather than in the middle, because most shapes are fairly close to the x axis after they have been rotated. Each histogram has 100 bins in total, each with a count of the items in the segment. The degrees run left to right, while distance is the vertical dimension. Inspiration for this shape representation came from Shape Matching and Object Recognition.
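A log polar histogram of this kind can be built with `np.histogram2d`. The sketch below uses 10 log-spaced radius bins and 10 evenly spaced angle bins to get 100 counts. That split, the `r_max` cutoff, and the even angle spacing are assumptions for the example; it does not reproduce the extra sensitivity near the horizon described above.

```python
import numpy as np


def log_polar_histogram(points: np.ndarray, r_bins: int = 10,
                        theta_bins: int = 10, r_max: float = 100.0) -> np.ndarray:
    """Count normalized (x, y) points in log-spaced radius and even angle bins.

    Rows of the result are radius bins (near to far); columns are angle bins.
    Points farther away than r_max are ignored.
    """
    x, y = points[:, 0], points[:, 1]
    radius = np.hypot(x, y)
    theta = np.arctan2(y, x)  # in [-pi, pi]
    keep = radius > 0  # the reference point itself has no direction
    r_edges = np.logspace(0, np.log10(r_max), r_bins + 1)
    r_edges[0] = 0.0  # let the innermost bin reach down to the origin
    theta_edges = np.linspace(-np.pi, np.pi, theta_bins + 1)
    hist, _, _ = np.histogram2d(radius[keep], theta[keep],
                                bins=[r_edges, theta_edges])
    return hist.astype(int)
```

Calling `.ravel()` on the result gives the flat 100-integer array form.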


Using the 100-integer histogram arrays of shape, I then clustered them using k-means so that I could simplify the comparisons between groups of ordered segments. The result is that I can classify a segment’s shape as a single cluster membership, and an array of segments (a side of a piece) as an ordered list of those cluster assignments. (Not Shown)
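For reference, k-means itself fits in a few lines of NumPy. This is plain Lloyd’s algorithm, written out as an illustration rather than taken from my code; `k`, `iterations` and `seed` are assumed parameters. Each row of `data` would be one flattened histogram.

```python
import numpy as np


def kmeans(data: np.ndarray, k: int, iterations: int = 50, seed: int = 0):
    """Plain Lloyd's algorithm; returns (centroids, labels)."""
    rng = np.random.default_rng(seed)
    data = data.astype(float)
    # Start from k distinct rows picked at random.
    centroids = data[rng.choice(len(data), size=k, replace=False)]
    labels = np.zeros(len(data), dtype=int)
    for _ in range(iterations):
        # Distance from every row to every centroid, shape (n, k).
        distances = np.linalg.norm(data[:, None, :] - centroids[None, :, :], axis=2)
        labels = distances.argmin(axis=1)
        for j in range(k):
            members = data[labels == j]
            if len(members):  # leave an empty cluster's centroid where it was
                centroids[j] = members.mean(axis=0)
    return centroids, labels
```

The `labels` array is the per-segment cluster membership, so one side of a piece becomes the slice of `labels` covering that side’s segments, in order.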

Fuzzy Segment-Cluster-Array Matching and Reassembly

The next task is to match segment cluster arrays with others that are closest, indicating a general shape match. (Not Shown) This produces lists of candidate matches to aid in a manual or automatic assembly.
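One simple way to score how close two ordered lists of cluster ids are is a sequence similarity ratio. The sketch below uses `difflib.SequenceMatcher` from the Python standard library as a stand-in; it is not the matching I implemented, and it ignores details such as whether a mating edge should be compared in reverse order.

```python
from difflib import SequenceMatcher


def side_similarity(side_a, side_b) -> float:
    """Similarity in [0, 1] between two ordered lists of cluster ids."""
    return SequenceMatcher(None, side_a, side_b, autojunk=False).ratio()


def rank_candidates(target, others):
    """Sort (name, side) pairs by similarity to target, best match first."""
    scored = [(side_similarity(target, side), name) for name, side in others]
    return [name for score, name in sorted(scored, reverse=True)]
```

The ranked names are the list of candidate neighbors that a manual or automatic assembly step would then try first.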

Running Out Of Time

At about this point the contest ended. I am still working on getting the first puzzle completely automatically assembled. I learned a lot and had a good time, but the primary takeaway was this: visualize, visualize, and then when in doubt, visualize. Visualization costs a lot, and building pretty graphs for yourself is a complete waste of time if you are a perfect coder. Countless times throughout this process I implemented a part of it and moved on, and only later came back and visualized what I had actually done. Often that immediately showed what went wrong. One mistake that it helped me find was in the rotation and translation that I had originally done: the segments were rotated correctly but translated based on my original center, not the rotated center, so the histograms were erratic. Once I could see that the middle was not 0,0 (in the lower left picture) I immediately knew what the issue was. Slowly I came to learn that every minute spent visualizing a problem like this pays off immediately.


Feb 01 2011

10 Ideas for How to Administer Windows Servers

Published by jfrank under resources

For the impatient: treat it like Linux.

  1. Don’t run IIS, run Apache, or generically “Don’t run the GUI one, run the daemon one.” Version control your configuration. Version controlling your config in a usable way is far more difficult than it has to be, or impossible, using point and click tools. You’re running a server, not a candy store. It doesn’t need to look good, it needs to be reproducible. Ask someone who uses IIS to show you the changes that have been made in the last couple of weeks, and why each one was made; they can’t. If you version control your config, that’s all in your commit history. Love those log files.
  2. Don’t run FTP, run rsync, and use that to transfer files to and from your server over the internet. But don’t install the Windows rsync server, because that’s not secure. More in point 3.
  3. Install ICW Copssh instead. It is an OpenSSH installation for Windows packaged by ITeF!x Consulting. Next, install the rsync package on top, and then you can ssh/sftp/scp/rsync to the box in the normal Linux fashion, complete with public key authentication. Use the same rsync client package that you installed on top of ICW on the other end to connect to your server from another Windows box.
  4. Never install server applications that are cross platform (i.e. not built only for Windows) in the default location. Programs that are developed to work on multiple platforms will often miss conventions that apply only to Windows paths. Recent default paths for 32 bit programs on newer Windows contain not only spaces but parentheses as well, e.g. C:\Program Files (x86)\My Program Blah\blah. Use your own convention, lower case, and keep it simple: D:\apps\mysql5\ for example.
  5. Install the excellent Sysinternals suite. Microsoft bought it, but this labor of love by Mark Russinovich was developed long before they got their hands on it. Process Explorer in tree view is an excellent alternative to Task Manager. Junction is a great way to get symlink behavior.
  6. Install Notepad++ or use TextPad. Notepad++ is an excellent Notepad alternative because it replaces Notepad inline, so whenever you would have seen Notepad, you’ll just get the better one.
  7. Use Volume Shadow Copy to snapshot your data. It’s nice. Good job Microsoft.
  8. Put both the ICW and Sysinternals binary directories on your system path. This will give you many useful tools to use on the Windows command line. If for example you type ls by accident, you will actually get a file listing back just as you had expected. “tail -f _filename_” will just work. Junctioning (symbolic linking) will be nearby.
  9. Script administrative tasks, and version control your scripts. I don’t have experience with it, but Windows PowerShell allows you to script many things, and if you’re serious about Windows it is the strongest system platform. Good old bat/cmd is also available and still useful for many things.
  10. Last one is a double bonus! Shut down services you aren’t using. The indexing service isn’t helpful if you aren’t searching, for example. Run an antivirus, love those Microsoft system updates, and lock down a firewall to only the ports you need: 22 for ssh, 80 for a website, plus a few others.

