@mikeatlas
Last active September 28, 2024 07:20

Revisions

  1. Mike Atlas revised this gist Aug 30, 2015. 1 changed file with 1 addition and 1 deletion.
    2 changes: 1 addition & 1 deletion sync_ftp_to_s3.md
    Original file line number Diff line number Diff line change
    @@ -31,4 +31,4 @@ lftp -e " \

    * Do the parallel s3 put dance: `/your/ec2/local/path$ python s3-parallel-put --bucket=weft-wind-data --secure --put=update --processes=50 --content-type=guess --verbose --log-filename=/tmp/s3pp.log /your/local/ec2/path`

    Wow, not so bad. Kinda.
    Wow, not so bad. Kinda. Except I had to hack a [pull request for `s3-parallel-put`](https://github.com/twpayne/s3-parallel-put/pull/23) to support my bucket which lives outside the US Standard region - you may need this as well.
  2. Mike Atlas revised this gist Aug 30, 2015. 1 changed file with 3 additions and 1 deletion.
    4 changes: 3 additions & 1 deletion sync_ftp_to_s3.md
    Original file line number Diff line number Diff line change
    @@ -29,4 +29,6 @@ lftp -e " \

    * Then `wget` a copy of `s3-parallel-put.py` (my [fork](https://github.com/weftio/s3-parallel-put) for regionalized buckets if you need to):

    `/your/ec2/local/path$ python s3-parallel-put --bucket=weft-wind-data --secure --put=update --processes=50 --content-type=guess --verbose --log-filename=/tmp/s3pp.log /your/local/ec2/path`
    * Do the parallel s3 put dance: `/your/ec2/local/path$ python s3-parallel-put --bucket=weft-wind-data --secure --put=update --processes=50 --content-type=guess --verbose --log-filename=/tmp/s3pp.log /your/local/ec2/path`

    Wow, not so bad. Kinda.
  3. Mike Atlas revised this gist Aug 30, 2015. 1 changed file with 1 addition and 1 deletion.
    2 changes: 1 addition & 1 deletion sync_ftp_to_s3.md
    Original file line number Diff line number Diff line change
    @@ -27,6 +27,6 @@ lftp -e " \

    * Note: `lftp mirror` command with `--parallel=30` is only possible if your FTP server lets you open 30 simultaneous connections. In my case, I was limited to only 1 connection :(.

    *Then `wget` a copy of `s3-parallel-put.py` (my [fork](https://github.com/weftio/s3-parallel-put) for regionalized buckets if you need to):
    * Then `wget` a copy of `s3-parallel-put.py` (my [fork](https://github.com/weftio/s3-parallel-put) for regionalized buckets if you need to):

    `/your/ec2/local/path$ python s3-parallel-put --bucket=weft-wind-data --secure --put=update --processes=50 --content-type=guess --verbose --log-filename=/tmp/s3pp.log /your/local/ec2/path`
  4. Mike Atlas revised this gist Aug 30, 2015. 1 changed file with 1 addition and 1 deletion.
    2 changes: 1 addition & 1 deletion sync_ftp_to_s3.md
    Original file line number Diff line number Diff line change
    @@ -3,7 +3,7 @@ Original idea from [Transfer files from an FTP server to S3 by "Hack N Cheese"](
    I moved roughly a terabyte in less than an hour. Granted, I couldn't take advantage of `lftp`'s `--parallel=30` switch due to my ftp source limiting me to one connection at a time, but `use-pget-n=N` did seem to help out.

    * Get a fast Ubuntu 14.04 EC2 box on Amazon for temporary usage (I went with `m1.xlarge`) so data transfers aren't limited by your local bandwidth at least. I also attached a fat 2TB EBS volume and symlinked it to `/bigdisk`, and made sure the EBS volume was deleted after I terminated this EC2 box. I hope `lftp` 2.6.4 is available as a stable package by the next time I attempt this.
    * Build [`lftp`](https://github.com/lavv17/lftp) 2.6.4+ (Not easy to compile, so read the [`INSTALL`](https://github.com/lavv17/lftp/blob/master/INSTALL) file and plow through all your missing dependencies - you'll also need to re-run `sudo ./configure && sudo make && sudo make install` if you were in my case, without sudo they just won't work). Presently the Ubuntu apt package is at `lftp/trusty,now 4.4.13-1 amd64 [residual-config]`, so uninstall this if you had it previously since the `mirror` options in that version of `lftp` are severely limited.
    * Build [`lftp`](https://github.com/lavv17/lftp) 2.6.4+ (Not easy to compile, so read the [`INSTALL`](https://github.com/lavv17/lftp/blob/master/INSTALL) file and plow through all your missing dependencies - you'll also need to re-run `sudo ./configure && sudo make && sudo make install` if you were in my case, without sudo they just won't work). Presently the Ubuntu apt package is at `lftp/trusty,now 4.4.13-1 amd64 [residual-config]`, so uninstall this if you had it previously since the `mirror` options in that version of `lftp` are severely limited and not available till at least 2.6.4 version.
    * Run all these in [mosh](https://mosh.mit.edu/) and [tmux](https://tmux.github.io/) window sessions just in case...

    * Run this on your ec2 box:
  5. Mike Atlas revised this gist Aug 30, 2015. 1 changed file with 4 additions and 1 deletion.
    5 changes: 4 additions & 1 deletion sync_ftp_to_s3.md
    Original file line number Diff line number Diff line change
    @@ -6,6 +6,7 @@ I moved roughly a terrabyte in less than an hour. Granted, I couldn't take advan
    * Build [`lftp`](https://github.com/lavv17/lftp) 2.6.4+ (Not easy to compile, so read the [`INSTALL`](https://github.com/lavv17/lftp/blob/master/INSTALL) file and plow through all your missing dependencies - you'll also need to re-run `sudo ./configure && sudo make && sudo make install` if you were in my case, without sudo they just won't work). Presently the Ubuntu apt package is at `lftp/trusty,now 4.4.13-1 amd64 [residual-config]`, so uninstall this if you had it previously since the `mirror` options in that version of `lftp` are severely limited.
    * Run all these in [mosh](https://mosh.mit.edu/) and [tmux](https://tmux.github.io/) window sessions just in case...

    * Run this on your ec2 box:
    ```
    lftp -e " \
    debug -t 2; \
    @@ -24,6 +25,8 @@ lftp -e " \
    "
    ```

    then `wget` a copy of `s3-parallel-put.py` (my [fork](https://github.com/weftio/s3-parallel-put) for regionalized buckets if you need to):
    * Note: `lftp mirror` command with `--parallel=30` is only possible if your FTP server lets you open 30 simultaneous connections. In my case, I was limited to only 1 connection :(.

    *Then `wget` a copy of `s3-parallel-put.py` (my [fork](https://github.com/weftio/s3-parallel-put) for regionalized buckets if you need to):

    `/your/ec2/local/path$ python s3-parallel-put --bucket=weft-wind-data --secure --put=update --processes=50 --content-type=guess --verbose --log-filename=/tmp/s3pp.log /your/local/ec2/path`
  6. Mike Atlas revised this gist Aug 30, 2015. 1 changed file with 1 addition and 1 deletion.
    2 changes: 1 addition & 1 deletion sync_ftp_to_s3.md
    Original file line number Diff line number Diff line change
    @@ -4,7 +4,7 @@ I moved roughly a terrabyte in less than an hour. Granted, I couldn't take advan

    * Get a fast Ubuntu 14.04 EC2 box on Amazon for temporary usage (I went with `m1.xlarge`) so data transfers aren't limited by your local bandwidth at least. I also attached a fat 2TB EBS volume and symlinked it to `/bigdisk`, and made sure the EBS volume was deleted after I terminated this EC2 box. I hope `lftp` 2.6.4 is available as a stable package by the next time I attempt this.
    * Build [`lftp`](https://github.com/lavv17/lftp) 2.6.4+ (Not easy to compile, so read the [`INSTALL`](https://github.com/lavv17/lftp/blob/master/INSTALL) file and plow through all your missing dependencies - you'll also need to re-run `sudo ./configure && sudo make && sudo make install` if you were in my case, without sudo they just won't work). Presently the Ubuntu apt package is at `lftp/trusty,now 4.4.13-1 amd64 [residual-config]`, so uninstall this if you had it previously since the `mirror` options in that version of `lftp` are severely limited.
    * Run all these in mosh and tmux window sessions just in case...
    * Run all these in [mosh](https://mosh.mit.edu/) and [tmux](https://tmux.github.io/) window sessions just in case...

    ```
    lftp -e " \
  7. Mike Atlas revised this gist Aug 30, 2015. 1 changed file with 2 additions and 2 deletions.
    4 changes: 2 additions & 2 deletions sync_ftp_to_s3.md
    Original file line number Diff line number Diff line change
    @@ -3,8 +3,8 @@ Original idea from [Transfer files from an FTP server to S3 by "Hack N Cheese"](
    I moved roughly a terabyte in less than an hour. Granted, I couldn't take advantage of `lftp`'s `--parallel=30` switch due to my ftp source limiting me to one connection at a time, but `use-pget-n=N` did seem to help out.

    * Get a fast Ubuntu 14.04 EC2 box on Amazon for temporary usage (I went with `m1.xlarge`) so data transfers aren't limited by your local bandwidth at least. I also attached a fat 2TB EBS volume and symlinked it to `/bigdisk`, and made sure the EBS volume was deleted after I terminated this EC2 box. I hope `lftp` 2.6.4 is available as a stable package by the next time I attempt this.
    * Build [`lftp`](https://github.com/lavv17/lftp) 2.6.4+ (Not easy to compile, so read the [`INSTALL`](https://github.com/lavv17/lftp/blob/master/INSTALL) file and plow through all your missing dependencies - you'll also need to re-run `sudo ./configure && sudo make && sudo make install` if you were in my case). Presently the Ubuntu apt package is at `lftp/trusty,now 4.4.13-1 amd64 [residual-config]`, so uninstall this if you had it previously since the `mirror` options in that version are severely limited.
    * Run these in mosh and tmux sessions just in case...
    * Build [`lftp`](https://github.com/lavv17/lftp) 2.6.4+ (Not easy to compile, so read the [`INSTALL`](https://github.com/lavv17/lftp/blob/master/INSTALL) file and plow through all your missing dependencies - you'll also need to re-run `sudo ./configure && sudo make && sudo make install` if you were in my case, without sudo they just won't work). Presently the Ubuntu apt package is at `lftp/trusty,now 4.4.13-1 amd64 [residual-config]`, so uninstall this if you had it previously since the `mirror` options in that version of `lftp` are severely limited.
    * Run all these in mosh and tmux window sessions just in case...

    ```
    lftp -e " \
  8. Mike Atlas revised this gist Aug 30, 2015. 1 changed file with 1 addition and 1 deletion.
    2 changes: 1 addition & 1 deletion sync_ftp_to_s3.md
    Original file line number Diff line number Diff line change
    @@ -2,7 +2,7 @@ Original idea from [Transfer files from an FTP server to S3 by "Hack N Cheese"](

    I moved roughly a terabyte in less than an hour. Granted, I couldn't take advantage of `lftp`'s `--parallel=30` switch due to my ftp source limiting me to one connection at a time, but `use-pget-n=N` did seem to help out.

    * Get a fast Ubuntu 14.04 EC2 box on Amazon for temporary usage (I went with `m1.xlarge`) so data transfers aren't limited by your local bandwidth at least. I also attached a fat 2TB EBS volume and symlinked it to `/bigdisk`, and made sure the EBS volume was deleted after I terminated this EC2 box. I hope `lftp` 2.6.4 is available as a stable package by then.
    * Get a fast Ubuntu 14.04 EC2 box on Amazon for temporary usage (I went with `m1.xlarge`) so data transfers aren't limited by your local bandwidth at least. I also attached a fat 2TB EBS volume and symlinked it to `/bigdisk`, and made sure the EBS volume was deleted after I terminated this EC2 box. I hope `lftp` 2.6.4 is available as a stable package by the next time I attempt this.
    * Build [`lftp`](https://github.com/lavv17/lftp) 2.6.4+ (Not easy to compile, so read the [`INSTALL`](https://github.com/lavv17/lftp/blob/master/INSTALL) file and plow through all your missing dependencies - you'll also need to re-run `sudo ./configure && sudo make && sudo make install` if you were in my case). Presently the Ubuntu apt package is at `lftp/trusty,now 4.4.13-1 amd64 [residual-config]`, so uninstall this if you had it previously since the `mirror` options in that version are severely limited.
    * Run these in mosh and tmux sessions just in case...

  9. Mike Atlas revised this gist Aug 30, 2015. 1 changed file with 1 addition and 1 deletion.
    2 changes: 1 addition & 1 deletion sync_ftp_to_s3.md
    Original file line number Diff line number Diff line change
    @@ -3,7 +3,7 @@ Original idea from [Transfer files from an FTP server to S3 by "Hack N Cheese"](
    I moved roughly a terabyte in less than an hour. Granted, I couldn't take advantage of `lftp`'s `--parallel=30` switch due to my ftp source limiting me to one connection at a time, but `use-pget-n=N` did seem to help out.

    * Get a fast Ubuntu 14.04 EC2 box on Amazon for temporary usage (I went with `m1.xlarge`) so data transfers aren't limited by your local bandwidth at least. I also attached a fat 2TB EBS volume and symlinked it to `/bigdisk`, and made sure the EBS volume was deleted after I terminated this EC2 box. I hope `lftp` 2.6.4 is available as a stable package by then.
    * Build [`lftp`](https://github.com/lavv17/lftp) 2.6.4+ needed (pain to compile so read the [`INSTALL`](https://github.com/lavv17/lftp/blob/master/INSTALL) file). Presently the Ubuntu apt package is at `lftp/trusty,now 4.4.13-1 amd64 [residual-config]`
    * Build [`lftp`](https://github.com/lavv17/lftp) 2.6.4+ (Not easy to compile, so read the [`INSTALL`](https://github.com/lavv17/lftp/blob/master/INSTALL) file and plow through all your missing dependencies - you'll also need to re-run `sudo ./configure && sudo make && sudo make install` if you were in my case). Presently the Ubuntu apt package is at `lftp/trusty,now 4.4.13-1 amd64 [residual-config]`, so uninstall this if you had it previously since the `mirror` options in that version are severely limited.
    * Run these in mosh and tmux sessions just in case...

    ```
  10. Mike Atlas revised this gist Aug 30, 2015. 1 changed file with 1 addition and 1 deletion.
    2 changes: 1 addition & 1 deletion sync_ftp_to_s3.md
    Original file line number Diff line number Diff line change
    @@ -1,6 +1,6 @@
    Original idea from [Transfer files from an FTP server to S3 by "Hack N Cheese"](http://hackncheese.com/2014/11/24/Transfer-files-from-an-FTP-server-to-S3/).

    I moved roughly a terabyte in less than an hour. Granted, I couldn't take advantage of `lftp`'s `--parallel=30` switch due to my ftp source limiting me to one connection at a time.
    I moved roughly a terabyte in less than an hour. Granted, I couldn't take advantage of `lftp`'s `--parallel=30` switch due to my ftp source limiting me to one connection at a time, but `use-pget-n=N` did seem to help out.

    * Get a fast Ubuntu 14.04 EC2 box on Amazon for temporary usage (I went with `m1.xlarge`) so data transfers aren't limited by your local bandwidth at least. I also attached a fat 2TB EBS volume and symlinked it to `/bigdisk`, and made sure the EBS volume was deleted after I terminated this EC2 box. I hope `lftp` 2.6.4 is available as a stable package by then.
    * Build [`lftp`](https://github.com/lavv17/lftp) 2.6.4+ needed (pain to compile so read the [`INSTALL`](https://github.com/lavv17/lftp/blob/master/INSTALL) file). Presently the Ubuntu apt package is at `lftp/trusty,now 4.4.13-1 amd64 [residual-config]`
  11. Mike Atlas revised this gist Aug 30, 2015. 1 changed file with 1 addition and 1 deletion.
    2 changes: 1 addition & 1 deletion sync_ftp_to_s3.md
    Original file line number Diff line number Diff line change
    @@ -1,6 +1,6 @@
    Original idea from [Transfer files from an FTP server to S3 by "Hack N Cheese"](http://hackncheese.com/2014/11/24/Transfer-files-from-an-FTP-server-to-S3/).

    I moved roughly a terabyte in less than an hour. Granted, I couldn't take advantage of the `--parallel=30` switch due to my ftp source limiting me to one connection at a time.
    I moved roughly a terabyte in less than an hour. Granted, I couldn't take advantage of `lftp`'s `--parallel=30` switch due to my ftp source limiting me to one connection at a time.

    * Get a fast Ubuntu 14.04 EC2 box on Amazon for temporary usage (I went with `m1.xlarge`) so data transfers aren't limited by your local bandwidth at least. I also attached a fat 2TB EBS volume and symlinked it to `/bigdisk`, and made sure the EBS volume was deleted after I terminated this EC2 box. I hope `lftp` 2.6.4 is available as a stable package by then.
    * Build [`lftp`](https://github.com/lavv17/lftp) 2.6.4+ needed (pain to compile so read the [`INSTALL`](https://github.com/lavv17/lftp/blob/master/INSTALL) file). Presently the Ubuntu apt package is at `lftp/trusty,now 4.4.13-1 amd64 [residual-config]`
  12. Mike Atlas revised this gist Aug 30, 2015. 1 changed file with 1 addition and 1 deletion.
    2 changes: 1 addition & 1 deletion sync_ftp_to_s3.md
    Original file line number Diff line number Diff line change
    @@ -1,4 +1,4 @@
    Original idea from [Transfer files from an FTP server to S3 by "Hack N Cheese](http://hackncheese.com/2014/11/24/Transfer-files-from-an-FTP-server-to-S3/).
    Original idea from [Transfer files from an FTP server to S3 by "Hack N Cheese"](http://hackncheese.com/2014/11/24/Transfer-files-from-an-FTP-server-to-S3/).

    I moved roughly a terabyte in less than an hour. Granted, I couldn't take advantage of the `--parallel=30` switch due to my ftp source limiting me to one connection at a time.

  13. Mike Atlas created this gist Aug 30, 2015.
    29 changes: 29 additions & 0 deletions sync_ftp_to_s3.md
    Original file line number Diff line number Diff line change
    @@ -0,0 +1,29 @@
    Original idea from [Transfer files from an FTP server to S3 by "Hack N Cheese](http://hackncheese.com/2014/11/24/Transfer-files-from-an-FTP-server-to-S3/).

    I moved roughly a terabyte in less than an hour. Granted, I couldn't take advantage of the `--parallel=30` switch due to my ftp source limiting me to one connection at a time.

    * Get a fast Ubuntu 14.04 EC2 box on Amazon for temporary usage (I went with `m1.xlarge`) so data transfers aren't limited by your local bandwidth at least. I also attached a fat 2TB EBS volume and symlinked it to `/bigdisk`, and made sure the EBS volume was deleted after I terminated this EC2 box. I hope `lftp` 2.6.4 is available as a stable package by then.
    * Build [`lftp`](https://github.com/lavv17/lftp) 2.6.4+ needed (pain to compile so read the [`INSTALL`](https://github.com/lavv17/lftp/blob/master/INSTALL) file). Presently the Ubuntu apt package is at `lftp/trusty,now 4.4.13-1 amd64 [residual-config]`
    * Run these in mosh and tmux sessions just in case...

    ```
    lftp -e " \
    debug -t 2; \
    set net:max-retries 3000; \
    set net:timeout 10m; \
    set ftp:charset iso-8859-1; \
    open ftp.yoursite.com; \
    mirror \
    --log log.txt \
    --use-pget-n=1000 \
    --use-cache \
    --continue \
    --loop \
    /your/ftp/remote/path /your/ec2/local/path; \
    exit; \
    "
    ```

    then `wget` a copy of `s3-parallel-put.py` (my [fork](https://github.com/weftio/s3-parallel-put) for regionalized buckets if you need to):

    `/your/ec2/local/path$ python s3-parallel-put --bucket=weft-wind-data --secure --put=update --processes=50 --content-type=guess --verbose --log-filename=/tmp/s3pp.log /your/local/ec2/path`
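
The two steps can be sketched as a small dry-run script. It only prints the commands and runs nothing, and it assumes the placeholder host, paths, and bucket used in the gist. The function names `build_lftp_script` and `build_s3_put_cmd` are hypothetical helpers made up for illustration; the flags are the ones shown in the gist. Note the semicolon after the mirror paths, so that `exit` is parsed as its own command and not as a third argument to `mirror`.

```shell
#!/bin/sh
# Sketch only: prints the commands instead of running them.

# Emit the lftp command string on one line (hypothetical helper).
# $1 = FTP host, $2 = remote path, $3 = local path,
# $4 = pget segment count (defaults to 1000, as in the gist)
build_lftp_script() {
  printf '%s ' \
    "debug -t 2;" \
    "set net:max-retries 3000;" \
    "set net:timeout 10m;" \
    "set ftp:charset iso-8859-1;" \
    "open $1;" \
    "mirror --log log.txt --use-pget-n=${4:-1000} --use-cache --continue --loop $2 $3;" \
    "exit;"
}

# Emit the s3-parallel-put invocation from the gist (hypothetical helper).
# $1 = bucket name, $2 = local path to upload
build_s3_put_cmd() {
  printf '%s\n' "python s3-parallel-put --bucket=$1 --secure --put=update --processes=50 --content-type=guess --verbose --log-filename=/tmp/s3pp.log $2"
}

# Dry run with the gist's placeholder values.
build_lftp_script ftp.yoursite.com /your/ftp/remote/path /your/ec2/local/path
echo
build_s3_put_cmd weft-wind-data /your/ec2/local/path
```

To run the mirror for real, the printed string could be passed along as `lftp -e "$(build_lftp_script ftp.yoursite.com /your/ftp/remote/path /your/ec2/local/path)"`, ideally inside a tmux session as the gist suggests.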