Original idea from [Transfer files from an FTP server to S3 by "Hack N Cheese"](http://hackncheese.com/2014/11/24/Transfer-files-from-an-FTP-server-to-S3/). I moved roughly a terabyte in less than an hour. Granted, I couldn't take advantage of `lftp`'s `--parallel=30` switch because my FTP source limited me to one connection at a time, but `--use-pget-n=N` did seem to help.

* Get a fast Ubuntu 14.04 EC2 box on Amazon for temporary usage (I went with `m1.xlarge`) so data transfers aren't limited by your local bandwidth, at least. I also attached a fat 2TB EBS volume and symlinked it to `/bigdisk` (rough sketch at the end of these notes), and made sure the EBS volume was deleted after I terminated the EC2 box.
* Build [`lftp`](https://github.com/lavv17/lftp) 4.6.4+ (it's not easy to compile, so read the [`INSTALL`](https://github.com/lavv17/lftp/blob/master/INSTALL) file and plow through all your missing dependencies; you'll also need to run `sudo ./configure && sudo make && sudo make install`, since in my case they just wouldn't work without `sudo`). Presently the Ubuntu apt package is at `lftp/trusty,now 4.4.13-1 amd64 [residual-config]`, so uninstall it if you had it previously: the `mirror` options in that version of `lftp` are severely limited and not available until at least 4.6.4. I hope 4.6.4 is available as a stable package by the next time I attempt this. (A build sketch is at the end of these notes.)
* Run all of these in [mosh](https://mosh.mit.edu/) and [tmux](https://tmux.github.io/) window sessions, just in case your connection drops mid-transfer (example at the end of these notes).
* Run this on your EC2 box:

```
lftp -e " \
  debug -t 2; \
  set net:max-retries 3000; \
  set net:timeout 10m; \
  set ftp:charset iso-8859-1; \
  open ftp.yoursite.com; \
  mirror \
    --log log.txt \
    --use-pget-n=1000 \
    --use-cache \
    --continue \
    --loop \
    /your/ftp/remote/path /your/ec2/local/path; \
  exit; \
"
```

* Note: the `lftp` `mirror` command with `--parallel=30` is only possible if your FTP server lets you open 30 simultaneous connections. In my case, I was limited to only 1 connection :(.
* Then `wget` a copy of `s3-parallel-put.py` (my [fork](https://github.com/weftio/s3-parallel-put) if you need regionalized buckets); prerequisites are sketched at the end of these notes.
* Do the parallel S3 put dance: `/your/ec2/local/path$ python s3-parallel-put.py --bucket=weft-wind-data --secure --put=update --processes=50 --content-type=guess --verbose --log-filename=/tmp/s3pp.log /your/ec2/local/path`

Wow, not so bad. Kinda. Except I had to hack a [pull request for `s3-parallel-put`](https://github.com/twpayne/s3-parallel-put/pull/23) to support my bucket, which lives outside the US Standard region; you may need this as well.
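
For reference, here are the sketches promised above. First, roughly what the EBS attach-and-symlink step looks like once the volume is attached to the instance. This is a minimal sketch, not what I ran verbatim: the device name `/dev/xvdf`, the `ext4` filesystem, and the `ubuntu` user are assumptions, so check `lsblk` for your actual device first.

```bash
# Assumes the 2TB EBS volume appeared as /dev/xvdf -- verify with lsblk.
sudo mkfs -t ext4 /dev/xvdf          # format the fresh volume (wipes any existing data!)
sudo mkdir -p /mnt/ebs
sudo mount /dev/xvdf /mnt/ebs        # mount it
sudo chown ubuntu:ubuntu /mnt/ebs    # let the default user write to it
sudo ln -s /mnt/ebs /bigdisk         # symlink so everything above can point at /bigdisk
```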
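
Building `lftp` from source went something like the sketch below. The dependency list is an educated guess for Ubuntu 14.04 (`./configure` will complain about anything still missing), and the exact tarball version is an assumption; grab whatever the current 4.6.x release is.

```bash
sudo apt-get remove lftp                       # drop the too-old 4.4.13 apt package
sudo apt-get install build-essential libgnutls-dev libreadline-dev zlib1g-dev

# Release tarballs live at http://lftp.yar.ru/ftp/ -- the version here is an assumption.
wget http://lftp.yar.ru/ftp/lftp-4.6.4.tar.gz
tar xzf lftp-4.6.4.tar.gz && cd lftp-4.6.4
sudo ./configure && sudo make && sudo make install
lftp --version                                 # expect 4.6.4 or newer
```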
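
The mosh/tmux step, in practice, looks like this; `your-ec2-host` and the session name `xfer` are placeholders.

```bash
mosh ubuntu@your-ec2-host    # SSH replacement that survives flaky local connections
tmux new -s xfer             # named session keeps long transfers alive server-side
# ...run the lftp and s3-parallel-put commands inside this session...
# Detach with Ctrl-b d; reattach any time with:
tmux attach -t xfer
```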
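
And the `s3-parallel-put` prerequisites, sketched under a couple of assumptions: the raw-file path into my fork is a guess (point `wget` at wherever the script actually lives), and the credentials come from the environment, which is how `boto`-based tools like this one authenticate.

```bash
sudo apt-get install python-pip
sudo pip install boto    # s3-parallel-put is built on boto

# The path below is an assumption -- adjust to the actual raw file in the fork.
wget https://raw.githubusercontent.com/weftio/s3-parallel-put/master/s3-parallel-put.py

export AWS_ACCESS_KEY_ID=your-key-id           # boto picks these up automatically
export AWS_SECRET_ACCESS_KEY=your-secret-key
```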