Accessing Stream In Job.get_output('body')

May 30, 2024

Sample code

import boto3

glacier = boto3.resource('glacier')
job = glacier.Job(accountID, vaultlist[0], id=joblist[0])
r = job.get_output()
print(r0['body'])

That print only yie

Solution 1:

Here's what worked for me to save a Glacier archive, which came back as a StreamingBody, to a file. In my case the archive was an mp3 file.

import boto3

glacier = boto3.resource('glacier')
job = glacier.Job(accountID, vaultName, jobID)

r = job.get_output()

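# r['body'] is a StreamingBody; read() returns its contents as bytes.
# The with block closes the file (the original f1.close never called close).
with open('my file', 'wb') as f1:
    f1.write(r['body'].read())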
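For large archives, reading the whole body into memory with read() may not be ideal. Here is a minimal sketch of a chunked copy, assuming the body behaves like a binary file object with a read(size) method, as botocore's StreamingBody does. The helper name save_stream, the chunk size, and the demo path are hypothetical choices for illustration, and an in-memory stream stands in for r['body'] so the sketch runs without AWS access:

```python
import io
import os
import shutil
import tempfile


def save_stream(body, path, chunk_size=1024 * 1024):
    """Copy a file-like binary stream to `path` in fixed-size chunks.

    `body` only needs a read(size) method, so an in-memory stream works
    as a stand-in for the real response body.
    """
    with open(path, "wb") as out:
        # copyfileobj reads chunk_size bytes at a time until the stream ends.
        shutil.copyfileobj(body, out, chunk_size)


# Hypothetical demo: a BytesIO object stands in for r['body'].
demo_path = os.path.join(tempfile.gettempdir(), "glacier_demo_archive.bin")
save_stream(io.BytesIO(b"example archive bytes"), demo_path)
```

With the job output from the answer above, the call would look like save_stream(r['body'], 'my file').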

Solution 2:

OK, I couldn't get the other way to work at all, mostly because of my own lack of skills, I'm sure. But I was able to use an HTTP GET to download the inventory into a file, and this is how I did it. You will see lots of lists because I had two vaults with one job each. You could modify this to loop over them in other ways, or just use [0] for both lists if you have one vault and one job. The important part is the sample from Amazon EC2 that I modified to retrieve the inventory from a completed Glacier job.

I know my code is not very well written, but it worked for my one-shot need. I hope this is helpful to others.

import requests, sys, os, hashlib, hmac, json
from datetime import datetime

# ************* REQUEST VALUES *************
method = 'GET'
service = 'glacier'
region = '<YOUR_REGION>'
host = 'glacier.' + region + '.amazonaws.com'
endpoint = 'https://glacier.' + region + '.amazonaws.com'
request_parameters = ''
accountid = '<YOUR_ACCOUNT_ID>'
vaultlist = ["VAULT_ONE", "VAULT_TWO"]
joblist = ['JOB_ID_ONE',
           'JOB_ID_TWO']
rangelist = ['JOB_SIZE_ONE',
             'JOB_SIZE_TWO',]
url0 = "/" + accountid + "/vaults/" + vaultlist[0] + "/jobs/" + joblist[0] + "/output"
url1 = "/" + accountid + "/vaults/" + vaultlist[1] + "/jobs/" + joblist[1] + "/output"
filename = ['archive0.json', 'archive1.json']  # filenames

# Key derivation functions. See:
# http://docs.aws.amazon.com/general/latest/gr/signature-v4-examples.html#signature-v4-examples-python
def sign(key, msg):
    return hmac.new(key, msg.encode('utf-8'), hashlib.sha256).digest()

def getSignatureKey(key, dateStamp, regionName, serviceName):
    kDate = sign(('AWS4' + key).encode('utf-8'), dateStamp)
    kRegion = sign(kDate, regionName)
    kService = sign(kRegion, serviceName)
    kSigning = sign(kService, 'aws4_request')
    return kSigning

# Read AWS access key from env. variables or configuration file. Best practice is NOT
# to embed credentials in code.
access_key = os.environ.get('AWS_ACCESS_KEY')
secret_key = os.environ.get('AWS_SECRET_KEY')
if access_key is None or secret_key is None:
    print('No access key is available via your environment variables.')
    sys.exit()

# Create a date for headers and the credential string
t = datetime.utcnow()
amzdate = t.strftime('%Y%m%dT%H%M%SZ')
datestamp = t.strftime('%Y%m%d')  # Date w/o time, used in credential scope

# ************* TASK 1: CREATE A CANONICAL REQUEST *************
# http://docs.aws.amazon.com/general/latest/gr/sigv4-create-canonical-request.html

# Step 1 is to define the verb (GET, POST, etc.)--already done.

# Step 2: Create canonical URI--the part of the URI from domain to query
# string (use '/' if no path)
canonical_uri = url1

# Step 3: Create the canonical query string. In this example (a GET request),
# request parameters are in the query string. Query string values must
# be URL-encoded (space=%20). The parameters must be sorted by name.
# For this example, the query string is pre-formatted in the request_parameters variable.
canonical_querystring = request_parameters

# Step 4: Create the canonical headers and signed headers. Header names
# and values must be trimmed and lowercase, and sorted in ASCII order.
# Note that there is a trailing \n.
canonical_headers = 'host:' + host + '\n' + 'x-amz-date:' + amzdate + '\n'

# Step 5: Create the list of signed headers. This lists the headers
# in the canonical_headers list, delimited with ";" and in alpha order.
# Note: The request can include any headers; canonical_headers and
# signed_headers list those that you want to be included in the
# hash of the request. "Host" and "x-amz-date" are always required.
signed_headers = 'host;x-amz-date'

# Step 6: Create payload hash (hash of the request body content). For GET
# requests, the payload is an empty string ("").
payload_hash = hashlib.sha256("".encode()).hexdigest()

# Step 7: Combine elements to create the canonical request
canonical_request = method + '\n' + canonical_uri + '\n' + canonical_querystring + '\n' + canonical_headers +\
                    '\n' + signed_headers + '\n' + payload_hash

# ************* TASK 2: CREATE THE STRING TO SIGN *************
# Match the algorithm to the hashing algorithm you use, either SHA-1 or
# SHA-256 (recommended)
algorithm = 'AWS4-HMAC-SHA256'
credential_scope = datestamp + '/' + region + '/' + service + '/' + 'aws4_request'
string_to_sign = algorithm + '\n' +  amzdate + '\n' +  credential_scope + '\n' + \
                 hashlib.sha256(canonical_request.encode()).hexdigest()


# ************* TASK 3: CALCULATE THE SIGNATURE *************
# Create the signing key using the function defined above.
signing_key = getSignatureKey(secret_key, datestamp, region, service)

# Sign the string_to_sign using the signing_key
signature = hmac.new(signing_key, string_to_sign.encode('utf-8'), hashlib.sha256).hexdigest()


# ************* TASK 4: ADD SIGNING INFORMATION TO THE REQUEST *************
# The signing information can be either in a query string value or in
# a header named Authorization. This code shows how to use a header.
# Create authorization header and add to request headers
authorization_header = algorithm + ' ' + 'Credential=' + access_key + '/' + credential_scope + ', ' +\
                       'SignedHeaders=' + signed_headers + ', ' + 'Signature=' + signature

# The request can include any headers, but MUST include "host", "x-amz-date",
# and (for this scenario) "Authorization". "host" and "x-amz-date" must
# be included in the canonical_headers and signed_headers, as noted
# earlier. Order here is not significant.
# Python note: The 'host' header is added automatically by the Python 'requests' library.
# headers = {'x-amz-date':amzdate, 'Authorization':authorization_header}


# Note: Glacier documents the Range value as 'bytes=start-end'
# (for example 'bytes=0-1048575'); the Range header is optional.
headers0 = {'x-amz-date': amzdate,
            'Authorization': authorization_header,
            'x-amz-glacier-version': '2012-06-01',
            'Range': '0 - ' + rangelist[0],
            }
headers1 = {'x-amz-date': amzdate,
            'Authorization': authorization_header,
            'x-amz-glacier-version': '2012-06-01',
            'Range': rangelist[1],
            }
headers = headers1

# ************* SEND THE REQUEST *************
request_url = endpoint + url1
print(url0)
print('\nBEGIN REQUEST++++++++++++++++++++++++++++++++++++')
print('Request URL: ' + request_url + '\n')
print('Headers: ' + json.dumps(headers))
print('Auth : ' + authorization_header + '\n' )
r = requests.get(request_url, headers=headers, stream = True)

print('\nRESPONSE++++++++++++++++++++++++++++++++++++')
print('Response code: %d\n' % r.status_code)
# print(r.text)  # This is in the original sample and useful for debugging, but not if your inventory is large.

# *********** Write it to file ***********
with open(filename[1], mode='w') as f:
    f.write(r.text)
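Once the inventory is saved, it can be inspected as JSON. Here is a minimal sketch, assuming the file holds a Glacier vault inventory in its JSON format, with a top-level ArchiveList whose entries carry ArchiveId, ArchiveDescription, and Size. The helper name summarize_inventory and every sample value below are made up for illustration:

```python
import json


def summarize_inventory(inventory):
    """Return (archive_count, total_bytes, archive_ids) for a parsed inventory."""
    archives = inventory.get("ArchiveList", [])
    total_bytes = sum(a.get("Size", 0) for a in archives)
    archive_ids = [a["ArchiveId"] for a in archives]
    return len(archives), total_bytes, archive_ids


# Made-up sample standing in for the contents of the downloaded file.
sample_text = json.dumps({
    "VaultARN": "arn:aws:glacier:example-region:000000000000:vaults/VAULT_TWO",
    "InventoryDate": "2024-01-01T00:00:00Z",
    "ArchiveList": [
        {"ArchiveId": "EXAMPLE_ID_A", "ArchiveDescription": "first", "Size": 1024},
        {"ArchiveId": "EXAMPLE_ID_B", "ArchiveDescription": "second", "Size": 2048},
    ],
})

count, total_bytes, archive_ids = summarize_inventory(json.loads(sample_text))
print(count, total_bytes, archive_ids)
```

For the file written by the script above, the parsed input would come from json.load on an open handle to filename[1] instead of the sample string.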
