Last active
August 29, 2015 14:17
Revisions
craigpatten revised this gist
Mar 18, 2015. 1 changed file with 3 additions and 3 deletions.

````
@@ -2,9 +2,9 @@ The [Hitachi Content Platform (HCP)](http://www.hds.com/products/file-and-conten

However, when attempting to access objects that are a multiple of 4GB in size, the HCP returns multiple, inconsistent Content-Length headers; one says zero, the other has the correct size. This is an RFC violation and will likely crash any application that isn't a little bit creative in way it parses these headers.

For example, when boto attempts a lookup on such an object, it throws an exception, which is an acceptable action because the invalid Content-Length renders the entire response invalid. The effect of this exception on the application or end-user entirely depends on how boto is used.

Request from boto:

```
HEAD /[...] HTTP/1.1
@@ -16,7 +16,7 @@ Authorization: AWS [...]
User-Agent: Boto/2.36.0 Python/2.7.6 Darwin/14.1.0
```

Response from the HCP:

```
HTTP/1.1 200 OK
````

craigpatten revised this gist
Mar 18, 2015. 1 changed file with 1 addition and 1 deletion.

````
@@ -33,7 +33,7 @@ Resultant boto stacktrace:

```
Traceback (most recent call last):
  File "./hcp-4gb.py", line 14, in <module>
    object = bucket.get_key("[...]")
  File "/Library/Python/2.7/site-packages/boto/s3/bucket.py", line 192, in get_key
    key, resp = self._get_key_internal(key_name, headers, query_args_l)
````

craigpatten renamed this gist
Mar 18, 2015. 1 changed file with 0 additions and 0 deletions.

File renamed without changes.

craigpatten created this gist
Mar 18, 2015.

```python
#!/usr/bin/env python
# Python 2 / boto 2 script: look up an object through the HCP's S3-compatible interface.

import boto, base64, hashlib
from boto.s3.connection import S3Connection

server = "your-hcp-endpoint.acme.com"
# The access key is the base64-encoded username; the secret key is the MD5 hex digest of the password.
hs3_id = base64.b64encode("your-hcp-username")
hs3_secret = hashlib.md5("your-hcp-password").hexdigest()

# debug = 2 turns on boto's verbose HTTP debug output.
hs3 = S3Connection(aws_access_key_id = hs3_id, aws_secret_access_key = hs3_secret, host = server, debug = 2)

bucket = hs3.get_bucket("your-bucket-name")

# get_key() sends the HEAD request whose response carries the duplicate Content-Length headers.
print bucket.get_key("object-that-is-a-multiple-of-4GB-in-size")
```

The [Hitachi Content Platform (HCP)](http://www.hds.com/products/file-and-content/content-platform) provides an interface which pretends to be [Amazon S3](http://aws.amazon.com/s3), so you can access it with packages such as [boto](https://pypi.python.org/pypi/boto). This is nice.

However, when attempting to access objects whose size is a multiple of 4GB (4294967296 bytes in the example below), the HCP returns two inconsistent Content-Length headers: one says zero, the other has the correct size. This violates the HTTP/1.1 specification (RFC 7230, sections 3.3.2 and 3.3.3, which forbid sending differing Content-Length values and tell recipients to treat such a message as invalid), and it will likely crash any application that isn't a little bit creative in the way it parses these headers.

When boto attempts a lookup on such an object, it throws an exception, which is an acceptable action because the invalid Content-Length renders the entire response invalid.

Example boto request:

```
HEAD /[...] HTTP/1.1
Host: [...]
Accept-Encoding: identity
Date: Wed, 18 Mar 2015 00:48:06 GMT
Content-Length: 0
Authorization: AWS [...]
User-Agent: Boto/2.36.0 Python/2.7.6 Darwin/14.1.0
```

Example response from the HCP:

```
HTTP/1.1 200 OK
Date: Wed, 18 Mar 2015 00:48:06 GMT
Server: HCP V7.0.1.17H1004
ETag: "c9a5a6878d97b48cc965c1e41859f034"
Last-Modified: Mon, 23 Feb 2015 04:42:36 GMT
Content-Type: application/octet-stream
Content-Length: 4294967296
Content-Length: 0
```

Resultant boto stacktrace:

```
Traceback (most recent call last):
  File "./example.py", line 14, in <module>
    object = bucket.get_key("[...]")
  File "/Library/Python/2.7/site-packages/boto/s3/bucket.py", line 192, in get_key
    key, resp = self._get_key_internal(key_name, headers, query_args_l)
  File "/Library/Python/2.7/site-packages/boto/s3/bucket.py", line 216, in _get_key_internal
    k.size = int(response.getheader('content-length'))
ValueError: invalid literal for int() with base 10: '4294967296, 0'
```

This fault is present in HCP V7.0.1.17H1004, though it's probably trivial to fix.
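
For clients that cannot wait for a fix on the HCP side, one possible workaround is to normalise the comma-joined header value before converting it to an integer. The sketch below is not part of boto: `parse_content_length` is a hypothetical helper, and its rule of ignoring a stray zero when exactly one non-zero value is present is an assumption based only on the single response shown above. Rejecting the response outright, as boto does, remains the strictly correct behaviour.

```python
def parse_content_length(raw_value):
    """Return one size from a possibly repeated Content-Length value.

    raw_value is the header as joined by the HTTP client,
    for example '4294967296, 0'. (Hypothetical helper, not a boto API.)
    """
    # Split the comma-joined value into its individual candidates.
    candidates = [part.strip() for part in raw_value.split(",") if part.strip()]
    if not candidates:
        raise ValueError("empty Content-Length header")
    if not all(candidate.isdigit() for candidate in candidates):
        raise ValueError("non-numeric Content-Length: %r" % raw_value)

    sizes = set(int(candidate) for candidate in candidates)

    # Repeated but identical values are harmless: collapse them.
    if len(sizes) == 1:
        return sizes.pop()

    # Assumption: when a zero sits alongside exactly one non-zero value,
    # the zero is the spurious header and the non-zero value is the real size.
    nonzero = sizes - {0}
    if len(nonzero) == 1:
        return nonzero.pop()

    # Anything else is genuinely ambiguous, so refuse to guess.
    raise ValueError("conflicting Content-Length values: %r" % raw_value)
```

With the value from the traceback, `parse_content_length('4294967296, 0')` returns `4294967296`, while genuinely conflicting values such as `'1, 2'` still raise `ValueError`.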