Data Structures and Cloud Services with Python

A Index and Code Links

A.1 Data Structures links

A.1.1 Section 1.1 Numbers, Strings, Lists and Tuples links

  1. Basic Number Numeric Manipulations: rmd | r | pdf | html
    • Loop over a list of numbers where the first and second digits have different interpretations.
    • py: int(np.floor(it_num/10)) + it_num%10
    • numpy: floor
  2. Define and Unpack Tuple: rmd | r | pdf | html
    • Define/deal multiple variables on the same line
    • Define tuple in python with and without parenthesis, unpack tuple, get subset of elements.
    • Access tuple element and fail to mutate tuple element.
    • py: isinstance(tp_abc, tuple)
  3. List Manipulations and Defaults: rmd | r | pdf | html
    • Conditional statements based on list length and element value.
    • Provide default for element of a list when list does not have that element.
    • py: lambda + join + append() + if len(X) >= 3 and X[2] is not None + if elif else
  4. Python String Manipulation Examples: rmd | r | pdf | html
    • Count unique elements of a string array, generate frequency list.
    • Search for substring, replace string, wrap string.
    • Display and format numeric string with fstring.
    • Change the decimal rounding given a list of estimates and standard error string arrays.
    • py: zip() + upper() + join() + round() + float() + split() + replace() + ascii_lowercase + set()
    • textwrap: fill(st, width = 20)
    • fstring: f + f'{fl_esti_rounded:.{it_round_decimal}f}'
    • random: choice
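
The four entries above can be sketched in a few lines. This is a minimal illustration rather than the linked code itself: the variable names and the helper `get_third` are hypothetical, and `math.floor` stands in for `np.floor` so that the block needs only the standard library.

```python
import math

# 1. Treat the tens digit and the ones digit of a number separately, then add them.
it_num = 47
it_digit_sum = int(math.floor(it_num / 10)) + it_num % 10  # 4 + 7

# 2. Define a tuple without parentheses, unpack it, and fail to mutate it.
tp_abc = 1, "b", 3.0
it_a, st_b, fl_c = tp_abc
try:
    tp_abc[0] = 99  # tuples are immutable, so this raises TypeError
except TypeError as error:
    st_tuple_error = str(error)

# 3. Provide a default when a list has no usable third element.
def get_third(ls_x, default="none-given"):
    if len(ls_x) >= 3 and ls_x[2] is not None:
        return ls_x[2]
    return default

# 4. Nested f-string field: the number of decimals is itself a variable.
fl_esti = 3.14159
it_round_decimal = 2
st_esti = f"{fl_esti:.{it_round_decimal}f}"
```

The nested braces in the last line let the rounding precision be chosen at run time, which is what the `fstring` entry above relies on.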

A.1.2 Section 1.2 Dictionary links

  1. Python Dictionary Examples and Usages: rmd | r | pdf | html
    • Generate a dictionary, loop through a dictionary.
    • List comprehension with dictionary.
    • py: dc = {'key': "name", 'val': 1}
    • copy: deepcopy
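
As a rough sketch of the dictionary entry above, the block below loops over a dictionary inside a list comprehension and then uses `deepcopy` so that a nested list is not shared between the copy and the original. The dictionary contents and the `tags` key are made up for illustration.

```python
from copy import deepcopy

# A small dictionary with a nested (mutable) list as one of its values.
dc_person = {"key": "name", "val": 1, "tags": ["a", "b"]}

# List comprehension over key-value pairs.
ls_pairs = [f"{st_key}={value}" for st_key, value in dc_person.items()]

# deepcopy duplicates nested objects too, so editing the copy leaves the original alone.
dc_copy = deepcopy(dc_person)
dc_copy["tags"].append("c")
```

A plain `dc_person.copy()` would copy only the top level, so appending to `tags` would then change both dictionaries.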

A.1.3 Section 1.3 Numpy Arrays links

  1. Numpy Combine Arrays to Matrix: rmd | r | pdf | html
    • Arrays to matrix.
    • numpy: column_stack() + random.choice() + reshape()
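
A minimal sketch of combining arrays into a matrix, assuming two equal-length one-dimensional arrays; the array names and values are illustrative only.

```python
import numpy as np

ar_a = np.arange(3)       # [0, 1, 2]
ar_b = np.arange(3, 6)    # [3, 4, 5]

# column_stack places each 1-D array in its own column, giving a 3 by 2 matrix.
mt_ab = np.column_stack((ar_a, ar_b))

# reshape fills row by row, so stacking via reshape needs a transpose to match.
mt_via_reshape = np.concatenate((ar_a, ar_b)).reshape(2, 3).T

# Random draws (with replacement) from an existing array.
np.random.seed(123)
ar_draws = np.random.choice(ar_a, size=4)
```

The two matrices agree, but `column_stack` states the intent directly, while `reshape` depends on remembering the row-major fill order.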

A.2 Pandas links

A.2.1 Section 2.1 Panda Basics links

  1. Pandas Generate Dataframes with Random Numeric and String Data: rmd | r | pdf | html
    • Generate a dataframe from arrays.
    • Generate a dataframe with random integers as well as random string variables.
    • np: random.randint() + reshape() + column_stack()
    • pandas: DataFrame()
  2. Python Pandas Conditional Selection of Rows and Columns: rmd | r | pdf | html
    • Select subset of rows or columns based on cell value conditions.
    • pandas: pd.DataFrame() + replace(['Zvcss', 'Dugei'], 'Zqovt') + df.loc[df['c5'] == 'Zqovt']
  3. Dataframe Export as CSV with Automatic File Path and Name: rmd | r | pdf | html
    • Export a pandas dataframe to csv, store automatically in user home download folder.
    • File name contains the variable name, use fstring to get variable name as file string.
    • pandas: df2export.to_csv(spn_csv_path, sep=",")
    • pathlib: home() + joinpath() + mkdir(parents=True, exist_ok=True)
    • fstring: f'{mt_abc=}'.split('=')[0]
    • time: strftime("%Y%m%d-%H%M%S")
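
The automatic file naming in the export entry can be sketched as follows. This assumes Python 3.8 or later for the `=` f-string specifier, and it writes into the system temporary folder (with a hypothetical subfolder name `py4econ_demo`) instead of the home download folder so that the example has no lasting side effects. The resulting path could then be passed to `to_csv()`.

```python
import tempfile
import time
from pathlib import Path

mt_abc = [[1, 2], [3, 4]]  # stand-in for the object to be exported

# f'{mt_abc=}' renders as "mt_abc=[[1, 2], [3, 4]]"; the text before "=" is the name.
st_var_name = f"{mt_abc=}".split("=")[0]

# Timestamp suffix so that repeated exports do not overwrite each other.
st_time = time.strftime("%Y%m%d-%H%M%S")

# Create the target folder (and any missing parents) if it does not exist yet.
srt_folder = Path(tempfile.gettempdir()).joinpath("py4econ_demo")
srt_folder.mkdir(parents=True, exist_ok=True)

spn_csv_path = srt_folder.joinpath(f"{st_var_name}-{st_time}.csv")
```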

A.3 Functions links

A.3.1 Section 3.1 Function Arguments and Returns links

  1. Python Function Data Type Handling: rmd | r | pdf | html
    • Check if parameter is string or integer, conditional execution and exception handling.
    • Check if parameter is string or an integer between some values.
    • py: type + isinstance(abc, str) + isinstance(abc, int) + raise + try except
  2. Tuple and Dictionary as Arguments with args and kwargs: rmd | r | pdf | html
    • Update default parameters with dictionary that replaces and appends additional key-value pairs using kwargs.
    • Pass a dictionary for named arguments to a function.
    • Python function None as default for mutable list argument.
    • python: dict3 = {**dict1, **dict2} + dict1.update(dict2) + func(par1='val1', **kwargs)
  3. Command Line Argument Parsing Positional and Optional Arguments: rmd | r | pdf | html
    • Parse parameters entered via command line to call a python script.
    • Optional and positional arguments of different data types (int, str, etc.).
    • Default values, allowed list of values.
    • argparse: parser.add_argument() + parser.parse_args()
  4. Function value returns: rmd | r | pdf | html
    • Return one or multiple values from function.
    • python: return a, b, c
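
A small sketch that ties several of the entries above together: keyword arguments override defaults through dictionary unpacking, a mutable list argument defaults to `None`, and the function returns several values at once. The function `run_model` and its parameter names are hypothetical.

```python
def run_model(ls_history=None, **kwargs):
    # A None default avoids sharing one list object across separate calls.
    if ls_history is None:
        ls_history = []

    # Later dictionaries win: kwargs replaces matching keys and appends new ones.
    dc_defaults = {"it_periods": 10, "fl_beta": 0.95}
    dc_params = {**dc_defaults, **kwargs}

    ls_history.append(dc_params["it_periods"])

    # Returning several values packs them into a tuple.
    return dc_params, ls_history, len(dc_params)


# Pass a dictionary as named arguments, then unpack the returned tuple.
dc_override = {"fl_beta": 0.9, "st_name": "test"}
dc_params, ls_history, it_count = run_model(**dc_override)
```

Had the signature been `ls_history=[]`, the same list would persist between calls and keep growing, which is why the `None` default is the usual idiom.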

A.3.2 Section 3.2 Exceptions links

  1. Python Raise, Try and Catch Exceptions: rmd | r | pdf | html
    • Raise an Exception in a python function, try and catch and print to string.
    • Trace full exception stack.
    • python: raise + try except + ValueError + TypeError
    • traceback: print_exc()
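
The pattern described above might look like the sketch below. The function `check_share` is hypothetical, and `traceback.format_exc()` is used in place of `print_exc()` so that the stack trace is captured as a string rather than printed.

```python
import traceback


def check_share(fl_share):
    # Raise different exception types for a wrong type and for a wrong value.
    if not isinstance(fl_share, (int, float)):
        raise TypeError(f"fl_share must be numeric, got {type(fl_share).__name__}")
    if not 0 <= fl_share <= 1:
        raise ValueError(f"fl_share must be in [0, 1], got {fl_share}")
    return fl_share


try:
    check_share(1.5)
except ValueError as error:
    st_message = str(error)            # short message only
    st_trace = traceback.format_exc()  # full stack trace as a string
```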

A.4 Statistics links

A.4.1 Section 4.1 Markov Process links

  1. Markov Transition Conditional Probability Check Sum to 1: rmd | r | pdf | html
    • Generate a sample 50 by 50 markov transition matrix.
    • Check row sums for approximate equality to 1.
    • numpy: allclose + reshape + sum
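
One way to carry out the row-sum check is sketched below. The way the sample matrix is generated (uniform draws normalized by row sums) and the seed are assumptions made for illustration, not necessarily what the linked code does.

```python
import numpy as np

rng = np.random.default_rng(123)
it_states = 50

# Draw positive entries, then divide each row by its sum to get conditional probabilities.
mt_raw = rng.random((it_states, it_states))
mt_transition = mt_raw / mt_raw.sum(axis=1).reshape(it_states, 1)

# Floating-point division means row sums are only approximately 1, hence allclose.
ar_row_sums = mt_transition.sum(axis=1)
bl_valid = np.allclose(ar_row_sums, 1.0)

# A matrix with one perturbed row should fail the same check.
mt_bad = mt_transition.copy()
mt_bad[0, 0] += 0.01
bl_bad_valid = np.allclose(mt_bad.sum(axis=1), 1.0)
```

An exact comparison such as `ar_row_sums == 1` can fail on rounding error alone, so a tolerance-based check is the safer test here.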

A.5 Tables and Graphs links

A.5.1 Section 5.1 Matplotlib Base Plots links

  1. Matplotlib Scatter and Line Plots: rmd | r | pdf | html
    • Plot several arrays of data, grid, figure title, and line and point patterns and colors.
    • Plot out random walk and white noise first-order autoregressive processes.
    • matplotlib: subplots() + ax.plot() + ax.legend() + ylabel() + xlabel() + title() + grid() + show()
    • numpy: random.normal() + random.seed() + cumsum() + arange()
  2. Matplotlib Text Plots: rmd | r | pdf | html
    • Print text as figure.
    • matplotlib: ax.text()
    • textwrap: fill()
    • json: dump()

A.6 Amazon Web Services links

A.6.1 Section 6.1 AWS Setup links

  1. AWS Account Set-up and Start Instance: rmd | r | pdf | html
    • Generate keypair on AWS, launch instance, launch security, ssh access, and AWSCLI.
    • ssh: ssh-agent + ssh-keygen + ssh ec2-user@ec2-52-23-218-117.compute-1.amazonaws.com
    • aws: aws ec2 start-instances + aws ec2 stop-instances + systemctl start amazon-ssm-agent
  2. Boto3 Client Service Communications: rmd | r | pdf | html
    • Start AWS services, send requests etc via boto3.
    • boto3: boto3.client(service, aws_access_key_id, aws_secret_access_key, region_name)

A.6.2 Section 6.2 S3 links

  1. AWS S3 Uploading, Downloading and Syncing, Locally, EC2 and in Docker Container: rmd | r | pdf | html
    • From EC2 or local computer upload files to S3 folders.
    • Download sync folders with exclusions between local and S3 folders.
    • Download file from S3 to local computer, an EC2 Linux computer, or into a Docker Container.
    • py: platform.release()
    • boto3: boto3.client('s3') + s3.upload_file() + s3.download_file()
    • os: sep

A.6.3 Section 6.3 Batch links

  1. AWS Batch, Batch Array: rmd | r | pdf | html
    • Set up python function that uses AWS_BATCH_JOB_ARRAY_INDEX.
    • Register batch task and submit batch array tasks using ECR image, and save results to S3.
    • Batch Array status check until success.
    • yaml: load()
    • boto3: client() + register_job_definition(jobDefinitionName, type, containerProperties, retryStrategy) + aws_batch.submit_job(jobName, jobQueue, arrayProperties={'size':10}, jobDefinition) + aws_batch.describe_jobs()
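
Inside a Batch array job, each child task can read `AWS_BATCH_JOB_ARRAY_INDEX` from its environment to decide which piece of work to do. The sketch below shows only that lookup; the helper `get_task_parameter`, the parameter list, and the fallback to index 0 when the variable is absent (for example, in a local test run) are assumptions made for illustration.

```python
import os


def get_task_parameter(ls_params, dc_environ=None):
    """Pick the parameter that belongs to this array task."""
    # Allow a substitute environment to be passed in, which makes local testing easier.
    dc_environ = os.environ if dc_environ is None else dc_environ

    # Environment variables are strings; default to the first task outside of Batch.
    it_index = int(dc_environ.get("AWS_BATCH_JOB_ARRAY_INDEX", 0))
    return ls_params[it_index]


# Ten parameter values, matching an array job of size 10.
ls_fl_beta = [0.90 + 0.01 * it_ctr for it_ctr in range(10)]
fl_beta_local = get_task_parameter(ls_fl_beta, dc_environ={})
fl_beta_task3 = get_task_parameter(ls_fl_beta, {"AWS_BATCH_JOB_ARRAY_INDEX": "3"})
```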

A.7 Docker Container links

A.7.1 Section 7.1 Docker Setup links

  1. Docker Container Set-Up and Run on AWS: rmd | r | pdf | html
    • Install Docker on AWS and build Docker image.
    • Start docker container and run programs inside Docker.
    • aws: ssh + yum update -y + amazon-linux-extras install docker -y
    • docker: service docker start + service docker status + docker build + docker images + docker image prune + docker run -t -i fanconda /bin/bash + python /fanProg/invoke/run.py + docker ps -a + docker system df + docker container ls -a
  2. AWS Docker Elastic Container Registry (ECR) Update and Push: rmd | r | pdf | html
    • Update and push to Elastic Container Registry (ECR) with newly built Docker image.
    • Pull a Docker image from Elastic Container Registry.
    • scp: scp -o StrictHostKeyChecking=accept-new -i
    • aws: aws ecr get-login
    • docker: docker login + docker build + docker tag + docker push + docker pull

A.8 Get Data links

A.8.1 Section 8.1 Environmental Data links

  1. CDS ECMWF Global Environmental Data Download: rmd | r | pdf | html
    • Use the Python API to get ECMWF ERA5 data.
    • Dynamically modify a python API file, run python inside a Conda virtual environment with R-reticulate.
    • r: file() + writeLines() + unzip() + list.files() + unlink()
    • r-reticulate: use_python() + Sys.setenv(RETICULATE_PYTHON = spth_conda_env)

A.9 System and Support links

A.9.1 Section 9.1 Command Line links

  1. Execute Python from Command Line and Run Command Line in Python: rmd | r | pdf | html
    • Run python functions from command line.
  2. Run Matlab Command Line Operations: rmd | r | pdf | html
    • Generate a matlab script and run the script with parameters.
    • subprocess: cmd = Popen(ls_str, stdin=PIPE, stdout=PIPE, stderr=PIPE) + cmd.communicate()
    • decode: decode('utf-8')
    • os: chdir() + getcwd()
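
The `subprocess` pattern listed above can be tried without Matlab installed. In this sketch the child process is the current Python interpreter (via `sys.executable`), which is a substitution made purely so that the example runs anywhere; the command list would differ for a Matlab call.

```python
import sys
from subprocess import PIPE, Popen

# Command and arguments as a list of strings; here a one-line Python program.
ls_str = [sys.executable, "-c", "print('hello from child')"]

# Launch the process with pipes attached, then wait for it and collect its output.
cmd = Popen(ls_str, stdin=PIPE, stdout=PIPE, stderr=PIPE)
by_output, by_error = cmd.communicate()

# The pipes return bytes, so decode them before treating the output as text.
st_output = by_output.decode("utf-8").strip()
st_error = by_error.decode("utf-8").strip()
```

`communicate()` reads both pipes until the child exits, which avoids the deadlock that can occur when reading `stdout` and `stderr` one after the other.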

A.9.2 Section 9.2 File In and Out links

  1. Searching for Programs, Reading and Writing to File Examples: rmd | r | pdf | html
    • Check the path to a particular installed program.
    • Get the parent folder of the current file.
    • Reading from file and replace strings in file.
    • Convert text file to latex using pandoc and clean.
    • py: open() + write() + replace() + [c for b in [[1,2],[2,3]] for c in b]
    • subprocess: call()
    • pathlib: Path().rglob() + Path().stem + Path(spn).parents[1]
    • os: remove() + listdir() + path.isfile() + path.splitdrive() + path.splitext() + path.split()
    • shutil: which()
  2. Python Directory and Folder Operations: rmd | r | pdf | html
    • Join folder names to form absolute path.
    • Folder path slash conversion from system os.sep to forward slash.
    • Generate new folders and files, with existing folder substrings.
    • Generate subfolder recursively.
    • py: open(srt, 'w') + write() + close()
    • os: os.sep + os.listdir() + os.path.abspath() + os.path.abspath(os.path.join(os.sep, 'users', 'fan')) + os.path.join('/', 'c:', 'fa', 'fb') + spn_path.replace(os.sep, '/')
    • pathlib: Path(srt).mkdir(parents=True, exist_ok=True) + [Path(spn).stem for spn in Path(srt).rglob(st)]
    • shutil: shutil.copyfile('/fa/fl.txt', '/fb/fl.txt') + shutil.copy2('/fa/fl.txt', '/fb') + shutil.rmtree('/fb')
    • distutils: dir_util.copy_tree('/fa', '/fb')
  3. Python Yaml File Parsing: rmd | r | pdf | html
    • Parse and read yaml files.
    • yaml: load(fl_yaml, Loader=yaml.BaseLoader) + dump()
    • pprint: pprint.pprint(ls_dict_yml, width=1)
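
Several of the file and folder calls listed above fit into one short round trip. The sketch below works inside a temporary directory so that nothing is left behind; the folder names `fa` and `fb` echo the entries above, and the file content is made up.

```python
import os
import shutil
import tempfile
from pathlib import Path

# Work inside a throwaway root folder.
srt_root = Path(tempfile.mkdtemp())

# Generate nested subfolders recursively; no error if they already exist.
srt_sub = srt_root.joinpath("fa", "fb")
srt_sub.mkdir(parents=True, exist_ok=True)

# Write a small text file into the deepest folder.
spn_file = srt_sub.joinpath("fl.txt")
with open(spn_file, "w") as fl:
    fl.write("hello")

# Recursively search for files and keep the names without extension.
ls_stems = [Path(spn).stem for spn in srt_root.rglob("*.txt")]

# Convert system-specific separators to forward slashes.
st_forward = str(spn_file).replace(os.sep, "/")

# Flatten a list of lists with a nested comprehension.
ls_flat = [c for b in [[1, 2], [2, 3]] for c in b]

# Remove the whole folder tree.
shutil.rmtree(srt_root)
```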

A.9.3 Section 9.3 Install Python links

  1. Basic Conda Setup Instructions: rmd | r | pdf | html
    • Conda and git installations
    • bash: where

A.9.4 Section 9.4 Documentation links

  1. Python Documentation Numpy Doc: rmd | r | pdf | html
    • Numpy documentation examples.