Chapter 6 Amazon Web Services
6.1 AWS Setup
6.1.1 AWS Setup
Go back to fan’s Python Code Examples Repository (bookdown site) or the pyfan Package (API).
6.1.1.1 Installation on Local Machine
First, install Anaconda, Git, and associated programs. You will also need:
- Putty
- access to your .pem key
- the conda AWS environment below
6.1.1.2 Conda AWS Environment
After this setup, you can start and stop instances from the conda prompt.
conda deactivate
conda env list
conda env remove -n wk_aws
conda create -n wk_aws -y
conda activate wk_aws
# Install External Tools
conda install -c anaconda pip -y
# Command line interface
conda install -c conda-forge awscli -y
# Programmatically send JSON instructions with boto3
conda install -c anaconda boto3 -y
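To confirm that the command-line tools installed in wk_aws are available, a quick check (not part of the original setup):
# verify awscli and boto3 inside the wk_aws environment
aws --version
python -c "import boto3; print(boto3.__version__)"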
6.1.1.3 AWS Account set-up
- Sign-up for AWS web services account (can be the same as your Amazon shopping account)
- Register for AWS Educate to get student or faculty voucher.
- The University of Houston is a part of AWS Educate; choose educator or student, and you should hear back within 24 hours with a coupon code.
- UH students can get $100, faculty can get $200.
6.1.1.4 Start an AWS Instance and Link Local to Remote
Amazon has a lot of tutorials. Here is an outline.
- Generate a keypair on AWS (aws guide)
- this gives you a .pem file, which you download and which Amazon also remembers
- local computers with the right .pem file can talk to your AWS instances
- You might need to invoke the chmod command below to set permissions:
chmod 400 "C:/Users/fan/Documents/Dropbox (UH-ECON)/Programming/AWS/fan_wang-key-pair-us_east_nv.pem"
- Launching Instance: Go to your console, choose EC2, choose launch instance, select Amazon Linux Instance (review and launch)
- Instance security: select VPC security group: I have for example: fan_wang_SG_us_east_nv_VPC (edit security group and submit)
- A security group can allow any IP address to access your instance, or just specific ones.
- AWS has a tool that allows only your current IP to access the EC2 instance; see also the CLI sketch at the end of this list.
- Instance access key: Select right keypair (your .pem key), fan_wang-key-pair-us_east_nv (prompted after submitting)
- For SSH in, you can use Putty. aws guide
- tell Putty your AWS instance DNS address and where your pem key is
- Can use a Putty client to enter an EC2 instance
- For SSH, you can also follow the process below:
- open Git Bash (install Putty beforehand)
# start the ssh agent
eval $(ssh-agent -s)
- Tell SSH where pem key is:
ssh-add "C:/Users/fan/Documents/Dropbox (UH-ECON)/Programming/AWS/fan_wang-key-pair-us_east_nv.pem"
- You will find a public DNS address for your aws instance on the AWS user interface page
# ssh git bash command line
# for ubuntu machine
ssh ubuntu@ec2-54-197-6-153.compute-1.amazonaws.com
# for aws linux
ssh ec2-user@ec2-52-23-218-117.compute-1.amazonaws.com
# quit aws instance
# ctrl + D
- If you get Permission denied (publickey), check the following:
- Are you trying to connect with the wrong key? Are you sure this instance is using this keypair?
- Are you trying to connect with the wrong username? ubuntu is the username for the Ubuntu-based AWS distribution, but on some others it is ec2-user (or admin on some Debians; it can also be root or fedora).
- Are you trying to connect to the wrong host? Is that the right host you are trying to log in to?
- You can generally log in like this; note that the instance gets a new public DNS address every time you restart it:
LOCALPEM="C:/Users/fan/Documents/Dropbox (UH-ECON)/Programming/AWS/fan_wang-key-pair-us_east_nv.pem"
IPADD=34.207.250.160
REMOTEIP=ec2-user@$IPADD
ssh-keygen -R $IPADD
ssh -i "$LOCALPEM" $REMOTEIP
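Related to the security-group bullet above, a minimal AWS CLI sketch that opens SSH only to a single IP address; the security group ID and IP address below are hypothetical placeholders:
# allow SSH (port 22) from one IP address only (hypothetical values)
aws ec2 authorize-security-group-ingress --group-id sg-xxxxxxxx --protocol tcp --port 22 --cidr 203.0.113.25/32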
6.1.1.5 Use AWSCLI to Start and Stop an Instance
- Install AWS CLI
- Create individual IAM users
- Follow the instructions to configure your awscli, and provide the access key ID and secret access key when prompted.
- Do not copy and paste the Key ID and Access Key below; they are examples. Type your own in as answers to the config prompts:
# aws configure
AWS Access Key ID [None]: XXXXIOSFODNN7EXAMPLE
AWS Secret Access Key [None]: wXalrXXtnXXXX/X7XXXXX/bXxXfiCXXXXXXXXXXX
Default region name [None]: us-west-1
Default output format [None]: json
- this creates a folder like C:/Users/fan/.aws; inside the folder, this information is stored in configuration files.
# the credentials file
[default]
aws_access_key_id = XXXXIOSFODNN7EXAMPLE
aws_secret_access_key = wXalrXXtnXXXXX7XXXXXbXxXfiCXXXXXXXXXXX
- then when you use aws cli, you will automatically be authenticated
- Start an instance in the console first (or directly from the command line). Stop it; do not terminate it. This instance now has a fixed instance ID. Its public DNS address will change every time you restart it, but its instance ID is fixed and is easily found in the EC2 console. (A sketch for looking up the current DNS address appears at the end of this subsection.)
aws ec2 run-instances --image-id ami-xxxxxxxx --count 1 --instance-type t2.micro --key-name MyKeyPair --security-group-ids sg-xxxxxxxx --subnet-id subnet-xxxxxxxx
- Start an instance
aws ec2 start-instances --instance-ids i-XXXXXXXX
aws ec2 start-instances --instance-ids i-040c856530b2619bc
- Stop an instance
aws ec2 stop-instances --instance-ids i-XXXXXXXX
aws ec2 stop-instances --instance-ids i-040c856530b2619bc
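Because the public DNS changes after each restart, it can be handy to query the current address once the instance is running. A minimal AWS CLI sketch; the instance ID is a placeholder:
# look up the current public DNS name of an instance (placeholder instance ID)
aws ec2 describe-instances --instance-ids i-XXXXXXXX --query "Reservations[0].Instances[0].PublicDnsName" --output text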
6.1.1.6 Set-up SSM on EC2 Instance
To execute command lines etc. remotely on EC2, you need to set up SSM: the AWS Systems Manager Agent (SSM Agent).
The SSM agent is already installed on Amazon Linux.
Error message regarding InvalidInstanceId: the following scenarios can result in this error message:
- The instance ID is invalid
- The instance is in a different region
- The instance is not currently in the Running state
- The instance does not have the AWS SSM agent installed and running.
“You have to create and attach the policy AmazonSSMFullAccess to the machine (thats maybe more broad than you need) but that was why it wasn’t working for me… You do that by clicking on (when selected on the ec2 instance) Action > Instance Settings > Attach/Replace IAM Role then create a role for ec2 that has that permission then attach, should take like 5-10 mins to pop up in SYSTEMS MANAGER SHARED RESOURCES - Managed Instances as mark mentions. – Glen Thompson Sep 20 ’18 at 16:31”
# Start SSM Agent with
sudo systemctl start amazon-ssm-agent
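Once the agent is running and the instance has an appropriate IAM role attached, commands can be sent remotely with boto3. A minimal sketch, assuming your credentials are already configured; the instance ID and region are placeholders:
import boto3

# send a shell command to a managed instance via SSM (placeholder instance ID)
ssm = boto3.client('ssm', region_name='us-east-1')
response = ssm.send_command(
    InstanceIds=['i-XXXXXXXX'],
    DocumentName='AWS-RunShellScript',
    Parameters={'commands': ['echo hello from ssm']})
print(response['Command']['CommandId'])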
6.1.2 AWS Boto3
Go back to fan’s Python Code Examples Repository (bookdown site) or the pyfan Package (API).
6.1.2.1 Basics
Create a local .aws folder under the user directory, for example, containing credential information; this is useful for AWS command-line operations.
# IN C:\Users\fan\.aws
# config file
[default]
region = us-east-1
output = json
# credentials file
[default]
aws_access_key_id = XKIXXXGSXXXBZXX43XXX
aws_secret_access_key = xxTgp9r0f4XXXXXXX1XXlG1vTy07wydxXXXXXX11
Additionally or alternatively, for boto3 operations, store the same information in, for example, a YAML file, so that the appropriate values can be loaded programmatically.
- main_aws_id: 710673677961
aws_access_key_id: XKIXXXGSXXXBZXX43XXX
aws_secret_access_key: xxTgp9r0f4XXXXXXX1XXlG1vTy07wydxXXXXXX11
region: us-east-1
main_ec2_instance_id: i-YYYxYYYYYYx2619xx
main_ec2_linux_ami: ami-0xYYYYYxx95x71x9
main_ec2_public_subnet: subnet-d9xxxxYY
fargate_vpc_name: FanCluster
fargate_vpc_id: vpc-xxx5xYYY
fargate_public_subnet: subnet-e3dYYYxx
fargate_security_group: sg-17xxxxYx
fargate_task_executionRoleArn: ecsTaskExecutionRole
batch_task_executionRoleArn: ecsExecutionRole
fargate_route_table: rtb-5xxxYx25
date_start: 20180701
6.1.2.2 Start Client Service
The various AWS services can be accessed and used programmatically via Boto3. To use a particular service, first start the client for that service: boto3 client.
We load the AWS access key, secret access key, etc. from a YAML file to start the boto3 client. We then start the client for AWS Batch, and then describe a compute environment.
import boto3
import yaml
import pprint
# Load YAML file
= "C:/Users/fan/fanwangecon.github.io/_data/aws.yml"
son_aws_yml = open(son_aws_yml)
fl_yaml = yaml.load(fl_yaml, Loader=yaml.BaseLoader)
ls_dict_yml # Get the first element of the yml list of dicts
= ls_dict_yml[0]
aws_yml_dict_yml
# Use AWS Personal Access Keys etc to start boto3 client
= boto3.client('batch',
aws_batch =aws_yml_dict_yml['aws_access_key_id'],
aws_access_key_id=aws_yml_dict_yml['aws_secret_access_key'],
aws_secret_access_key=aws_yml_dict_yml['region'])
region_name
# Show a compute environment Delete some Personal Information
= aws_batch.describe_compute_environments(computeEnvironments=["SpotEnv2560"])
ob_response 'ResponseMetadata'] = ''
ob_response['computeEnvironments'][0]['ecsClusterArn'] = ''
ob_response['computeEnvironments'][0]['serviceRole'] = ''
ob_response['computeEnvironments'][0]['computeResources']['instanceRole'] = ''
ob_response[=1) pprint.pprint(ob_response, width
## {'ResponseMetadata': '',
## 'computeEnvironments': [{'computeEnvironmentArn': 'arn:aws:batch:us-east-1:710673677961:compute-environment/SpotEnv2560',
## 'computeEnvironmentName': 'SpotEnv2560',
## 'computeResources': {'desiredvCpus': 4,
## 'ec2KeyPair': 'fan_wang-key-pair-us_east_nv',
## 'instanceRole': '',
## 'instanceTypes': ['optimal'],
## 'maxvCpus': 2560,
## 'minvCpus': 0,
## 'securityGroupIds': ['sg-e6642399'],
## 'spotIamFleetRole': 'arn:aws:iam::710673677961:role/AmazonEC2SpotFleetRole',
## 'subnets': ['subnet-d9abbe82'],
## 'tags': {},
## 'type': 'SPOT'},
## 'ecsClusterArn': '',
## 'serviceRole': '',
## 'state': 'ENABLED',
## 'status': 'VALID',
## 'statusReason': 'ComputeEnvironment '
## 'Healthy',
## 'tags': {},
## 'type': 'MANAGED'}]}
6.2 S3
6.2.1 S3 Usages
Go back to fan’s Python Code Examples Repository (bookdown site) or the pyfan Package (API).
6.2.1.1 Upload Local File to S3
A program runs either locally or on a remote EC2 machine inside a docker container. Upon exit, data does not persist in the docker container and needs to be exported to be saved. The idea is to export program images, csv files, json files, etc. to S3 when these are generated, if the program detects that it is being executed on an EC2 machine (in a container).
First, inside the program, detect the platform status. For a Docker container on EC2, Amazon Linux 2 has a platform.release() value of something like 4.14.193-194.317.amzn2.x86_64.
import platform as platform
print(platform.release())
## 10
# This assumes an EC2 instance where "amzn" is in the platform release string
if 'amzn' in platform.release():
    s3_status = True
else:
    s3_status = False
print(s3_status)
## False
Second, on S3, create a bucket, fans3testbucket for example (no underscore in the name is allowed); a boto3 sketch for this follows. Before doing this, set up the AWS Access Key ID and AWS Secret Access Key in the /Users/fan/.aws folder so that boto3 can access S3 from the computer. Upon successful completion of the push (the upload code further below), the file can be accessed at https://fans3testbucket.s3.amazonaws.com/_data/iris_s3.dta.
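A minimal boto3 sketch for creating the bucket (assumes credentials are configured in .aws as above; for regions other than us-east-1, a CreateBucketConfiguration with a LocationConstraint would also be needed):
import boto3

# create the example bucket (bucket names must be globally unique)
s3 = boto3.client('s3')
s3.create_bucket(Bucket='fans3testbucket')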
import boto3
s3 = boto3.client('s3')
spn_local_path_file_name = "C:/Users/fan/Py4Econ/aws/setup/_data/iris_s3.dta"
str_bucket_name = "fans3testbucket"
spn_remote_path_file_name = "_data/iris_s3.dta"
s3.upload_file(spn_local_path_file_name, str_bucket_name, spn_remote_path_file_name)
6.2.1.2 Download File from S3 to Local Machine
On a local computer, download a particular file from S3; here, download back the file we just uploaded to S3.
import boto3
s3 = boto3.client('s3')
spn_local_path_file_name = "C:/Users/fan/Py4Econ/aws/setup/_data/iris_s3_downloaded.dta"
str_bucket_name = "fans3testbucket"
spn_remote_path_file_name = "_data/iris_s3.dta"
s3.download_file(str_bucket_name, spn_remote_path_file_name, spn_local_path_file_name)
6.2.1.3 Download File from S3 to EC2 Machine
On an EC2 machine running, for example, the Amazon Linux 2 AMI: install pip and boto3, then download the file to the data folder.
# ssh into EC2 linux 2 AMI
ssh -i "G:/repos/ThaiJMP/boto3aws/aws_ec2/pem/fan_wang-key-pair-us_east_nv.pem" ec2-user@3.81.101.142
# generate data folder
mkdir data
# install boto3
sudo yum install python-pip python3-wheel && pip install boto3 --user
# try download file using boto3
# go into python
python
Now, inside python, download iris_s3.dta to the data folder under the ec2-user home directory on the EC2 machine.
import boto3
s3 = boto3.client('s3')
spn_ec2_path_file_name = "/home/ec2-user/data/iris_s3_downloaded.dta"
str_bucket_name = "fans3testbucket"
spn_s3_path_file_name = "_data/iris_s3.dta"
s3.download_file(str_bucket_name, spn_s3_path_file_name, spn_ec2_path_file_name)
6.2.1.4 Download File from S3 to Active Docker Container
Working inside an active docker container.
First activate and enter into a docker container:
# inside EC2 AMI Linux 2, start dockers
sudo service docker start
sudo service docker status
# see docker images
docker images
# run docker container and enter inside
docker run -t -i fanconda /bin/bash
# make a data directory and a esti subdirectory
mkdir data
cd data
mkdir esti
# enter python
python
Inside the docker container, boto3 is already installed. The path is different than on EC2 (the root structure is shorter), but the process is otherwise the same.
import boto3
s3 = boto3.client('s3')
spn_container_path_file_name = "/data/esti/iris_s3_downloaded.dta"
str_bucket_name = "fans3testbucket"
spn_s3_path_file_name = "_data/iris_s3.dta"
s3.download_file(str_bucket_name, spn_s3_path_file_name, spn_container_path_file_name)
6.2.1.5 Forward and Backward Slashes
Following up on the uploading example above, suppose a backward slash is used rather than a forward slash; then AWS gets confused about the folder. The object will not appear under the _data folder, but will appear as a file named *_data\iris_s3.dta* directly under the bucket.
Note that folders in S3 are a GUI convention; but still, we want to use forward slashes properly, so any double backslashes that might be generated by default path tools need to be converted to forward slashes.
import os

# This generates a file directly under the bucket, named "_data\iris_s3_slashbackforward.dta":
spn_remote_path_file_name_backslash = "_data\\iris_s3_slashbackforward.dta"
s3.upload_file(spn_local_path_file_name, str_bucket_name, spn_remote_path_file_name_backslash)

# This allows the folder structure to be clickable:
spn_remote_path_file_name_forwardslash = spn_remote_path_file_name_backslash.replace(os.sep, '/')
s3.upload_file(spn_local_path_file_name, str_bucket_name, spn_remote_path_file_name_forwardslash)

# Print slashes
print(f'{spn_remote_path_file_name_backslash=}')
## spn_remote_path_file_name_backslash='_data\\iris_s3_slashbackforward.dta'
print(f'{spn_remote_path_file_name_forwardslash=}')
## spn_remote_path_file_name_forwardslash='_data/iris_s3_slashbackforward.dta'
6.2.1.6 Sync Local Drive with S3 of a Particular File Type
Boto3 does not offer directory upload/download for S3; for that, rely on the aws command line.
To sync a local folder with all files from a particular AWS folder, excluding image files (an aws s3 sync alternative is sketched after these commands):
# CD into a directory
cd /d "G:\S3\fanconda202010\esti"
# Make a new directory matching the S3 directory name
mkdir e_20201025x_esr_medtst_list_tKap_mlt_ce1a2
# cd into the directory just made
cd /d "G:\S3\thaijmp202010\esti\e_20201025x_esr_medtst_list_tKap_mlt_ce1a2"
# copy all results from the s3 folder's subfolders including subfolders, excluding images
aws s3 cp ^
s3://fanconda202010/esti/e_20201025x_esr_medtst_list_tKap_mlt_ce1a2/ . ^
--recursive --exclude "*.png"
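As an alternative sketch, the aws s3 sync command copies only new or changed files with the same exclusion; the bucket and prefix below mirror the example above:
# sync the S3 folder to the current local directory, skipping images
aws s3 sync s3://fanconda202010/esti/e_20201025x_esr_medtst_list_tKap_mlt_ce1a2/ . --exclude "*.png"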
6.3 Batch
6.3.1 AWS Batch Run
Go back to fan’s Python Code Examples Repository (bookdown site) or the pyfan Package (API).
6.3.1.1 Preparing a Docker Image and a Python Function for Batch Array Job
We want to set up a function that can be used jointly with AWS Batch Array. With Batch Array, we can run many simulations concurrently; all simulations might differ only in the random seed for drawing shocks. This requires setting up the proper dockerfile, as well as slightly modifying the python function that we want to invoke.
First, create and push a docker image, see this dockerfile. Following the AWS ECR instructions, this registers a docker image in AWS ECR with a URI: XXXX7367XXXX.dkr.ecr.us-east-1.amazonaws.com/fanconda
The dockerfile has for CMD: CMD ["python", "/pyfan/pyfan/graph/exa/scatterline3.py"]. This runs the function scatterline3.
Second, the scatterline3 function checks if AWS_BATCH_JOB_ARRAY_INDEX is in os.environ. AWS_BATCH_JOB_ARRAY_INDEX, if it exists, is used as a random seed to generate data for the graph. When the function is run in a docker container via Batch, the function saves the graph output to a bucket in AWS S3. Pushing to S3 is achieved by pyfan.aws.general.path.py.
In the batch job, when arrayProperties = {'size': 10}, AWS sets AWS_BATCH_JOB_ARRAY_INDEX to values 0 through 9 across the 10 child tasks of a single batch task. These AWS_BATCH_JOB_ARRAY_INDEX values can be used as different random seeds, and can be used as folder suffixes.
Here, the scatterline3 function generates a graph that is stored, for testing purposes, in the pyfan_gph_scatter_line_rand folder of the fans3testbucket bucket; the saved images have names seed_0.png, seed_1.png, …, one per array index, when arrayProperties = {'size': 10}. A sketch of how a function can read the array index follows.
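A minimal sketch, not the actual scatterline3 code, of how a function can pick up the array index and use it as a seed and folder suffix:
import os

# AWS Batch sets AWS_BATCH_JOB_ARRAY_INDEX for each child of an array job
if 'AWS_BATCH_JOB_ARRAY_INDEX' in os.environ:
    it_seed = int(os.environ['AWS_BATCH_JOB_ARRAY_INDEX'])
else:
    # fall back to a fixed seed when running outside of Batch
    it_seed = 0
spt_subfolder = 'seed_' + str(it_seed)
print(f'{it_seed=}, {spt_subfolder=}')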
6.3.1.2 Register A Batch Job Definition
Given the docker image we created, XXXX7367XXXX.dkr.ecr.us-east-1.amazonaws.com/fanconda, we can use it to register a batch job definition, which specifies:
- computing requirements: memory and cpu: vCpus = 1 and Memory=7168 for example
- which container to pull from (ECR): List the image name: XXXX7367XXXX.dkr.ecr.us-east-1.amazonaws.com/fanconda for example
- job role ARN: arn:aws:iam::XXXX7367XXXX:role/ecsExecutionRole, to allow proper input into and output from the container.
These can be registered programmatically by using boto3: Boto3 Batch Documentation
In the example below, we register a new job definition; this adds pyfan-scatterline3-test-rmd as an additional job definition.
Every time the code below is re-run, a new batch revision number is generated. AWS allows each batch job definition to have potentially hundreds of thousands of revisions.
import boto3
import yaml
import pprint
# Load YAML file with security info
= "C:/Users/fan/fanwangecon.github.io/_data/aws.yml"
srn_aws_yml = open(srn_aws_yml)
fl_yaml = yaml.load(fl_yaml, Loader=yaml.BaseLoader)
ls_dict_yml = ls_dict_yml[0]
aws_yml_dict_yml
# Dictionary storing job definition related information
= {"jobDefinitionName": 'pyfan-scatterline3-test-rmd',
job_dict "type": "container",
"containerProperties": {
"image": aws_yml_dict_yml['main_aws_id'] + ".dkr.ecr." +
'region'] + ".amazonaws.com/fanconda",
aws_yml_dict_yml["vcpus": int(1),
"memory": int(1024),
"command": ["python",
"/pyfan/pyfan/graph/exa/scatterline3.py",
"-A", "fans3testbucket",
"-B", "111"],
"jobRoleArn": "arn:aws:iam::" + aws_yml_dict_yml['main_aws_id'] +
":role/" + aws_yml_dict_yml['batch_task_executionRoleArn']
},"retryStrategy": {
"attempts": 1
}}
# Use AWS Personal Access Keys etc to start boto3 client
= boto3.client('batch',
aws_batch =aws_yml_dict_yml['aws_access_key_id'],
aws_access_key_id=aws_yml_dict_yml['aws_secret_access_key'],
aws_secret_access_key=aws_yml_dict_yml['region'])
region_name
# Register a job definition
= aws_batch.register_job_definition(
response = job_dict['jobDefinitionName'],
jobDefinitionName type = job_dict['type'],
= job_dict['containerProperties'],
containerProperties = job_dict['retryStrategy'])
retryStrategy
# Print response
=1) pprint.pprint(response, width
## {'ResponseMetadata': {'HTTPHeaders': {'connection': 'keep-alive',
## 'content-length': '169',
## 'content-type': 'application/json',
## 'date': 'Tue, '
## '29 '
## 'Dec '
## '2020 '
## '04:02:05 '
## 'GMT',
## 'x-amz-apigw-id': 'YS-JkEZroAMF20g=',
## 'x-amzn-requestid': 'f6129b5e-609f-4561-919c-1552c12fbf48',
## 'x-amzn-trace-id': 'Root=1-5feaaa3d-24bf3db471454a1f31c9bc12'},
## 'HTTPStatusCode': 200,
## 'RequestId': 'f6129b5e-609f-4561-919c-1552c12fbf48',
## 'RetryAttempts': 0},
## 'jobDefinitionArn': 'arn:aws:batch:us-east-1:710673677961:job-definition/pyfan-scatterline3-test-rmd:90',
## 'jobDefinitionName': 'pyfan-scatterline3-test-rmd',
## 'revision': 90}
6.3.1.3 Submit a Batch Array
Given the batch job definition that has been created, also create job queues and related compute environments. Then we can run a Batch Array. Upon submitting the batch array, if you monitor AWS EC2 instances, you should notice potentially many EC2 instances starting up: AWS is starting EC2 instances to complete the batch array jobs.
Create a batch compute environment that uses spot-priced instances, which are much cheaper than on-demand instances. You will need to set proper IAM roles, for example arn:aws:iam::XXXX7367XXXX:role/AmazonEC2SpotFleetRole for the Spot fleet role, and also proper security groups.
When the array_size parameter is equal to 100, that starts 100 child processes, with AWS_BATCH_JOB_ARRAY_INDEX running from 0 through 99; the index can be used directly by the python function by reading it from the OS environment as shown earlier. For demonstration purposes, we only set array_size=3 in the example below.
Outputs from scatterline3 have a timestamp, so each time the code below is run, it generates several new images with the same set of random seeds but a different date prefix. The output S3 folder is public.
import boto3
import yaml
import pprint
import datetime as datetime
# Using the "jobDefinitionName": 'pyfan-scatterline3-test-rmd' from registering
jobDefinitionName = 'pyfan-scatterline3-test-rmd'

# How many child batch processes to start
# child processes differ in: AWS_BATCH_JOB_ARRAY_INDEX
array_size = 3

# job name
timestr = "{:%Y%m%d%H%M%S%f}".format(datetime.datetime.now())
timesufx = '_' + timestr
st_jobName = jobDefinitionName + timesufx

# job queue (needs to design own queue in batch)
st_jobQueue = 'Spot'

# start batch service
# Load YAML file with security info
srn_aws_yml = "C:/Users/fan/fanwangecon.github.io/_data/aws.yml"
fl_yaml = open(srn_aws_yml)
ls_dict_yml = yaml.load(fl_yaml, Loader=yaml.BaseLoader)
aws_yml_dict_yml = ls_dict_yml[0]

# Use AWS Personal Access Keys etc to start boto3 client
aws_batch = boto3.client('batch',
                         aws_access_key_id=aws_yml_dict_yml['aws_access_key_id'],
                         aws_secret_access_key=aws_yml_dict_yml['aws_secret_access_key'],
                         region_name=aws_yml_dict_yml['region'])

# aws batch submit job
dc_json_batch_response = aws_batch.submit_job(
    jobName=st_jobName,
    jobQueue=st_jobQueue,
    arrayProperties={'size': array_size},
    jobDefinition=jobDefinitionName)

# Print response
pprint.pprint(dc_json_batch_response, width=1)
## {'ResponseMetadata': {'HTTPHeaders': {'connection': 'keep-alive',
## 'content-length': '198',
## 'content-type': 'application/json',
## 'date': 'Tue, '
## '29 '
## 'Dec '
## '2020 '
## '04:02:05 '
## 'GMT',
## 'x-amz-apigw-id': 'YS-JoFnDIAMFV6Q=',
## 'x-amzn-requestid': '7c96cbb3-585f-4118-bca8-0eac23a4e3f7',
## 'x-amzn-trace-id': 'Root=1-5feaaa3d-6ded2cfd5219094502659cc2'},
## 'HTTPStatusCode': 200,
## 'RequestId': '7c96cbb3-585f-4118-bca8-0eac23a4e3f7',
## 'RetryAttempts': 0},
## 'jobArn': 'arn:aws:batch:us-east-1:710673677961:job/b7a12a78-f187-423c-ae75-5088e7c2efd4',
## 'jobId': 'b7a12a78-f187-423c-ae75-5088e7c2efd4',
## 'jobName': 'pyfan-scatterline3-test-rmd_20201228220157556998'}
6.3.1.4 Track the Status of a Submitted Batch Array Until it Finished
To automate certain processes, we often need to check and wait for a job to complete. This can be done on the web interface, but it is easier via the boto3 operations describe_jobs and list_jobs.
Given the batch array job we just generated above, first parse out the job ID from the submission response. Then use describe_jobs to check the array size and the status summary, looping until the number of SUCCEEDED plus FAILED child jobs equals the array size. (list_jobs can also be used to inspect the jobSummaryList of child jobs; see the sketch at the end of this subsection.)
import time

# Get Job ID
st_batch_jobID = dc_json_batch_response['jobId']
# Print Job ID
print(f'{st_batch_jobID=}')
## st_batch_jobID='b7a12a78-f187-423c-ae75-5088e7c2efd4'

# While loop to check status
bl_job_in_progress = True
it_wait_seconds = 0
while bl_job_in_progress and it_wait_seconds <= 600:
    # describe job
    dc_json_batch_describe_job_response = aws_batch.describe_jobs(jobs=[st_batch_jobID])
    # pprint.pprint(dc_json_batch_describe_job_response, width=1)
    it_array_size = dc_json_batch_describe_job_response['jobs'][0]['arrayProperties']['size']
    dc_status_summary = dc_json_batch_describe_job_response['jobs'][0]['arrayProperties']['statusSummary']
    if dc_status_summary:
        # check status
        it_completed = dc_status_summary['SUCCEEDED'] + dc_status_summary['FAILED']
        if it_completed < it_array_size:
            bl_job_in_progress = True
            # sleep ten seconds
            time.sleep(10)
            it_wait_seconds = it_wait_seconds + 10
        else:
            bl_job_in_progress = False
        print(f'{it_wait_seconds=}, ArrayN={it_array_size},' \
              f'SUCCEEDED={dc_status_summary["SUCCEEDED"]}, FAILED={dc_status_summary["FAILED"]}, ' \
              f'RUNNING={dc_status_summary["RUNNING"]}, PENDING={dc_status_summary["PENDING"]}, ' \
              f'RUNNABLE={dc_status_summary["RUNNABLE"]}')
    else:
        # empty statusSummary
        bl_job_in_progress = True
        time.sleep(10)
        it_wait_seconds = it_wait_seconds + 10
        print(f'{it_wait_seconds=}, ArrayN={it_array_size}')
## it_wait_seconds=10, ArrayN=3
## it_wait_seconds=20, ArrayN=3,SUCCEEDED=0, FAILED=0, RUNNING=0, PENDING=0, RUNNABLE=3
## it_wait_seconds=20, ArrayN=3,SUCCEEDED=3, FAILED=0, RUNNING=0, PENDING=0, RUNNABLE=0
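As mentioned above, list_jobs can also inspect the child jobs of the array. A minimal sketch, assuming the same aws_batch client and st_batch_jobID from the code above:
# list the SUCCEEDED child jobs of the array job
# (without a jobStatus filter, list_jobs returns only RUNNING jobs)
dc_list_jobs_response = aws_batch.list_jobs(arrayJobId=st_batch_jobID, jobStatus='SUCCEEDED')
ls_job_summary = dc_list_jobs_response['jobSummaryList']
print(f'number of SUCCEEDED child jobs: {len(ls_job_summary)}')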