@craigpatten
Last active August 29, 2015 14:17
Revisions

  1. craigpatten revised this gist Mar 18, 2015. 1 changed file with 3 additions and 3 deletions.
    6 changes: 3 additions & 3 deletions readme.md
    Original file line number Diff line number Diff line change
    @@ -2,9 +2,9 @@ The [Hitachi Content Platform (HCP)](http://www.hds.com/products/file-and-conten

    However, when attempting to access objects that are a multiple of 4GB in size, the HCP returns multiple, inconsistent Content-Length headers; one says zero, the other has the correct size. This is an RFC violation and will likely crash any application that isn't a little bit creative in the way it parses these headers.

    When boto attempts a lookup on such an object, it throws an exception, which is an acceptable action because the invalid Content-Length renders the entire response invalid.
    For example, when boto attempts a lookup on such an object, it throws an exception, which is an acceptable action because the invalid Content-Length renders the entire response invalid. The effect of this exception on the application or end-user entirely depends on how boto is used.

    Example boto request:
    Request from boto:

    ```
    HEAD /[...] HTTP/1.1
    @@ -16,7 +16,7 @@ Authorization: AWS [...]
    User-Agent: Boto/2.36.0 Python/2.7.6 Darwin/14.1.0
    ```

    Example response from the HCP:
    Response from the HCP:

    ```
    HTTP/1.1 200 OK
  2. craigpatten revised this gist Mar 18, 2015. 1 changed file with 1 addition and 1 deletion.
    2 changes: 1 addition & 1 deletion readme.md
    Original file line number Diff line number Diff line change
    @@ -33,7 +33,7 @@ Resultant boto stacktrace:

    ```
    Traceback (most recent call last):
    File "./example.py", line 14, in <module>
    File "./hcp-4gb.py", line 14, in <module>
    object = bucket.get_key("[...]")
    File "/Library/Python/2.7/site-packages/boto/s3/bucket.py", line 192, in get_key
    key, resp = self._get_key_internal(key_name, headers, query_args_l)
  3. craigpatten renamed this gist Mar 18, 2015. 1 changed file with 0 additions and 0 deletions.
    File renamed without changes.
  4. craigpatten created this gist Mar 18, 2015.
    14 changes: 14 additions & 0 deletions example.py
    Original file line number Diff line number Diff line change
    @@ -0,0 +1,14 @@
    #!/usr/bin/env python

    import boto, base64, hashlib
    from boto.s3.connection import S3Connection

    server = "your-hcp-endpoint.acme.com"

    hs3_id = base64.b64encode("your-hcp-username")
    hs3_secret = hashlib.md5("your-hcp-password").hexdigest()
    hs3 = S3Connection(aws_access_key_id = hs3_id, aws_secret_access_key = hs3_secret, host = server, debug = 2)

    bucket = hs3.get_bucket("your-bucket-name")

    print bucket.get_key("object-that-is-a-multiple-of-4GB-in-size")
    45 changes: 45 additions & 0 deletions readme.md
    Original file line number Diff line number Diff line change
    @@ -0,0 +1,45 @@
    The [Hitachi Content Platform (HCP)](http://www.hds.com/products/file-and-content/content-platform) provides an interface which pretends to be [Amazon S3](http://aws.amazon.com/s3), so you can access it with packages such as [boto](https://pypi.python.org/pypi/boto). This is nice.

    However, when attempting to access objects that are a multiple of 4GB in size, the HCP returns multiple, inconsistent Content-Length headers; one says zero, the other has the correct size. This is an RFC violation and will likely crash any application that isn't a little bit creative in the way it parses these headers.

    When boto attempts a lookup on such an object, it throws an exception, which is an acceptable action because the invalid Content-Length renders the entire response invalid.

    Example boto request:

    ```
    HEAD /[...] HTTP/1.1
    Host: [...]
    Accept-Encoding: identity
    Date: Wed, 18 Mar 2015 00:48:06 GMT
    Content-Length: 0
    Authorization: AWS [...]
    User-Agent: Boto/2.36.0 Python/2.7.6 Darwin/14.1.0
    ```

    Example response from the HCP:

    ```
    HTTP/1.1 200 OK
    Date: Wed, 18 Mar 2015 00:48:06 GMT
    Server: HCP V7.0.1.17H1004
    ETag: "c9a5a6878d97b48cc965c1e41859f034"
    Last-Modified: Mon, 23 Feb 2015 04:42:36 GMT
    Content-Type: application/octet-stream
    Content-Length: 4294967296
    Content-Length: 0
    ```

    Resultant boto stacktrace:

    ```
    Traceback (most recent call last):
    File "./example.py", line 14, in <module>
    object = bucket.get_key("[...]")
    File "/Library/Python/2.7/site-packages/boto/s3/bucket.py", line 192, in get_key
    key, resp = self._get_key_internal(key_name, headers, query_args_l)
    File "/Library/Python/2.7/site-packages/boto/s3/bucket.py", line 216, in _get_key_internal
    k.size = int(response.getheader('content-length'))
    ValueError: invalid literal for int() with base 10: '4294967296, 0'
    ```

    This fault is present in HCP V7.0.1.17H1004, though it's probably trivial to fix.
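For clients that cannot wait for a fix, the following is a minimal sketch of the "creative" header parsing mentioned above. It is not part of boto or the HCP; `parse_content_length` is a hypothetical helper, and it assumes the only disagreement between the duplicated headers is a spurious zero, as in the response shown earlier. The input is the comma-joined value that the traceback shows `getheader('content-length')` returning.

```python
def parse_content_length(raw):
    """Return one size from a possibly duplicated Content-Length value.

    `raw` is the header value as the client sees it after duplicate
    headers have been joined with commas, e.g. '4294967296, 0'.
    """
    values = set()
    for part in raw.split(","):
        part = part.strip()
        if not part.isdigit():
            raise ValueError("malformed Content-Length: %r" % raw)
        values.add(int(part))

    # Normal case, or duplicates that all agree.
    if len(values) == 1:
        return values.pop()

    # Assumption specific to this HCP fault: one header is a bogus zero
    # and the other carries the real size, so drop the zero.
    nonzero = values - {0}
    if len(nonzero) == 1:
        return nonzero.pop()

    # Anything else is genuinely ambiguous; refuse to guess.
    raise ValueError("conflicting Content-Length values: %r" % raw)


size = parse_content_length("4294967296, 0")
```

Here `size` is 4294967296, which is exactly 2**32 bytes. Rejecting the response outright, as boto does, remains the stricter and safer behaviour; a workaround like this one only makes sense when the server is known to exhibit this specific fault.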