Mar 05 2013

What is the optimal number of FTP files to download at once?

Published by jfrank at 12:37 pm under internet

Assumptions

You want to download multiple files over FTP (or a similar TCP-based download protocol).

Argument

To optimize for throughput you need to download more than one file at a time. The reason is simple: TCP spends a lot of time waiting for ACKs, and FTP clients and servers tend to do things very serially, probably because of the REST(art) command and a desire to keep implementations simple to debug. The longer the internet distance (defined by hops and latency, not necessarily physical distance), the more apparent this inefficiency becomes. You may have plenty of bandwidth to spare, but any time spent waiting is bandwidth unused. Downloading more than one file at a time gives that spare bandwidth a chance to be used instead of sitting idle.
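As an illustration of the idea above, here is a minimal Python sketch of downloading several files at once. The server address is hypothetical, `download_all` and `ftp_fetch` are names I made up, and `ftplib` with a thread pool is just one way to do it, not necessarily what any particular FTP client does internally:

```python
import os
from concurrent.futures import ThreadPoolExecutor
from ftplib import FTP

def download_all(paths, fetch, max_workers=5):
    """Run `fetch` on each remote path concurrently; results come back in input order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(fetch, paths))

def ftp_fetch(path):
    """One ordinary serial FTP download; the parallelism comes from
    running several of these at the same time via download_all."""
    with FTP("ftp.example.com") as ftp:  # hypothetical server
        ftp.login()  # anonymous login
        local = os.path.basename(path)
        with open(local, "wb") as f:
            ftp.retrbinary(f"RETR {path}", f.write)
        return local
```

Calling `download_all(file_list, ftp_fetch, max_workers=5)` would keep five transfers in flight, so while one connection is stalled waiting for ACKs the others can still be moving data.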

Optimal

How many files are optimal? In my very unscientific trials, 2 was too few and 10 was too many. With 10 files I hit stutters, perhaps buffer issues somewhere along the line, causing human-noticeable hiccups across all active downloads simultaneously. Other constraints at 10 could be server random-read performance or network congestion. With 2 files my bandwidth was underutilized, I assume because I spent a significant window of time waiting for ACKs.

Adjusting the downloads to 5 arrived at a semi-optimal maximum throughput for this case. On other servers or tasks I could see the optimal number being different, which calls not for a fixed value but for an adaptive algorithm. My FTP program asks, “how many files would you like to download simultaneously?” I would like to select “adaptively maximize my throughput.”

Data

  • 2 Files @ 180 – 360 KB/s
  • 5 Files @ 130 – 650 KB/s
  • 10 Files @ 50 – 500 KB/s

The data suggest a throughput curve that could be tested and approximated, and the adaptive algorithm could then pick a starting point and re-adjust as needed.
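One simple shape such an adaptive algorithm could take is hill climbing: add a connection while measured throughput improves, back off when it drops. This is a rough sketch under my own assumptions (the function name, the 2–10 bounds taken from the trials above, and the one-step-at-a-time policy are all illustrative), not a tested algorithm:

```python
def adjust_workers(current, prev_throughput, new_throughput, lo=2, hi=10):
    """One step of a hill-climbing adjustment for the number of
    simultaneous downloads: grow while throughput improves, shrink
    when it falls, clamped to the [lo, hi] range."""
    if new_throughput > prev_throughput:
        return min(current + 1, hi)  # still improving: try one more connection
    return max(current - 1, lo)      # got worse: back off one connection
```

The client would re-measure throughput over some window after each adjustment and call this again, which is roughly the “pick and re-adjust as needed” behavior described above.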
