
class: center, bottom

Python for DevOps


Disclaimer

???

  1. We won't talk about objects because it's not a priority for this training.
  2. This is not Algorithms 101: we're not going to cover every Python data type

Objective

Learn how to write Python scripts to solve simple DevOps problems without a Configuration Management tool.

Approach:

  • Learn the basics of the Python language

--

  • Best practices

--

  • Most used modules for scripting/DevOps

--

  • Packaging / Writing CLI tools

--

Why?

--

  1. Python is easy and straightforward. --

  2. Python-based Cfg. Management tools use the things we'll learn here. --

  3. You might not have a Cfg. Management tool available in your environment


How

Extremely hands-on. We learn a language by writing code.

--

We'll use python27 instead of py3 because in the DevOps world py2 still rules.

??? Explain py3 and py2


Language Features

  1. Dynamic language

??? Interpreted code

--

  1. No semicolons, indentation!

--

  1. We love whitespaces

??? We often separate chunks of code with whitespaces to improve readability.

--

  1. snake_case not camelCase. Please don't write camelCase!!!

??? Class naming is a special case.

--

  1. Functions are first-class citizens

Use cases

  1. DevOps and System / Network Administration / SecOps

??? Saltstack, ansible, fabric, openstack

  1. Data Science

--

  1. Scientific and Numeric computing

??? Replacement for MATLAB: numpy, scipy

--

  1. Web Development

??? Instagram backend is python


class: top, right, fit-image layout: false background-image: url(http://localhost:8000/images/zen_of_python.png)


class: left, middle background-position: bottom;

Variables

Python is a dynamic language, we don't have to declare types!

>>> myvar = 5
>>> type(myvar)
<type 'int'>
>>> myvar = 'foo'
>>> type(myvar)
<type 'str'>

>>> myvar = 'this is nice'
>>> myvar.split(' ')
['this', 'is', 'nice']

???

  1. Show the type command on str and int
  2. Show simple arithmetic + * ** multiply string
  3. Data type conversion float(5)
  4. Multiline statement with + \
  5. Multi assignment same line with semicolon
  6. Multiple variable assignment a,b,c = foo, capa, foo
  7. Multiple variable assignment same x=y=z=foo (This is not pointers!!!)
  8. Strings format, % + * https://pyformat.info/
  9. Special characters \n, \t (triple quotes)
  10. Comments with #

List

Lists are the workhorse of python:

# Basic list
my_list = ['foo', 5, 5.0, True]

# Index Lookup
my_list[0]

# Inserting
my_list.append('New Element')

# Removing components - By index or the last one
my_list.pop(1)

# Removing by value
my_list.remove(5)

# Slicing
my_list[0:1]
my_list[-4:]

# the in operator
'foo' in my_list

???

  1. my_list = ['foo', 5, 5.0, True]
  2. lookup my_list[0]
  3. append - my_list.append('New Element')
  4. my_list.pop(1)
  5. my_list.remove(5)

Control Flow

We have if/elif/else. We don't have semicolons. Everything is handled by indentation, and we use ":" to start a block.

if expression:
    statement(s)
elif expression:
    statement(s)
elif expression:
    statement(s)
...
else:
    statement(s)

Logic Conditions

if x and y:
  statement

if a or b:
  statement

Control Flow

Almost Pure English

if something is None:
  statement

if something is not None:
  statement

Ternary:

value_if_true if condition else value_if_false
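
A minimal sketch of the ternary expression in use. The `disk_status` helper and the 90% threshold are hypothetical, picked only to illustrate the syntax:

```python
def disk_status(used_percent):
    # Hypothetical helper: the value before `if` is returned when the
    # condition holds, the value after `else` otherwise
    return "critical" if used_percent > 90 else "ok"


print(disk_status(95))
print(disk_status(40))
```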

Looping :)

Most common way to loop:

for item in ['foo', 'bar', 'biz']:
  print(item)

Range function:

for i in range(10):
    print(i)

Len function might be helpful

mah_list = ['foo', 'bar', 'biz']
for index in range(len(mah_list)):
  print(mah_list[index])
count = 0
while count < 5:
    print(count)
    count += 1

Break on Loop

You're searching for items in a container; when an item is found you want to process it and exit the loop.

If you don't find any items in the container, you want to call a special function.

--

found_items = False
for item in container:
    if search_something(item):
        # Found it!
        process(item)
        found_items = True
        break

if not found_items:
    # Didn't find anything..
    do_something()

Bizarre Python for else:

for item in container:
    if search_something(item):
        # Found it!
        process(item)
        break
else:
    # Didn't find anything..
    not_found_in_container()
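
The for/else pattern above can be tried with a concrete, runnable sketch. The `first_log_file` function and the file names are hypothetical; the point is only that the `else` block runs when the loop ends without `break`:

```python
def first_log_file(filenames):
    """Return the first name ending in .log, or None if there is none."""
    for name in filenames:
        if name.endswith(".log"):
            result = name
            break  # leaving via break skips the else block
    else:
        # Runs only when the loop finished without hitting break
        result = None
    return result


print(first_log_file(["a.txt", "b.log", "c.log"]))
print(first_log_file(["a.txt"]))
```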

Functions

def foo(arg1, arg2):
  do_something

Return is optional; Python returns None if you omit the return statement.

>>> def do_nothing():
...   print("foo")
...
>>> val = do_nothing()
foo
>>> val is None
True

Functions

You can have optional arguments:

def foo(name, caps=True):
  greeting = "Hello {}".format(name)
  if caps:
    print(greeting.upper())
  else:
    print(greeting)
...
>>> foo('bernardo')
HELLO BERNARDO
>>> foo('bernardo', caps=False)
Hello bernardo

Builtin Functions:

len(), abs(), print()

Functions

Returning and documentation:

>>> def stupid_sum(a, b):
...   """ This function sums two arguments and returns the result """
...   return a + b
...
>>> stupid_sum.__doc__
' This function sums two arguments and returns the result '
>>> stupid_sum(1, 1)
2

??? kwargs, *args
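
For the `*args` / `**kwargs` note, a small sketch may help. The `describe_host` function and its arguments are hypothetical:

```python
def describe_host(name, *args, **kwargs):
    # args collects extra positional arguments as a tuple
    # kwargs collects extra keyword arguments as a dict
    return name, args, kwargs


name, roles, options = describe_host("web01", "nginx", "php", port=8080)
print(roles)
print(options)
```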


Exercise 01

Create a function called check_linux_version:

  1. Receive a parameter with kernel information
  2. That returns el5 if the kernel belongs to an Enterprise Linux 5
  3. That returns el6 if the kernel belongs to an Enterprise Linux 6
  4. That returns el7 if the kernel belongs to an Enterprise Linux 7
  5. That returns unkown_version if none of the above match

??? Running python -m unittest discover


Exercise 02

We will write a function similar to the GNU grep program.

The function will:

  1. Receive two parameters pattern and content

pattern is the string we're looking for

content is a string with the file content

The function should return a list with all lines that contain the given pattern

If not found return an empty list


Exception Handling

Errors should never pass silently. ~ Zen of Python

  try:
    x = 5 + 'hey'
  except:
    print("you're stupid!")

--

Pass

Unless explicitly silenced. ~ Zen of Python

  try:
    x = 5 + 'hey'
  except:
    pass

--

Raising Exceptions

  try:
    x = 5 + 'hey'
  except:
    raise CustomCoolError("You're stupid!")

??? class CustomCoolError(Exception): pass

except (TypeError, ZeroDivisionError)

    import re
    import subprocess

    def get_java_version():
        try:
            # `java -version` prints its banner to stderr
            child = subprocess.Popen(['java', '-version'], stderr=subprocess.PIPE)
            _, output = child.communicate()

            regex = re.search(r'(java version).*"(\d+\.\d).*', output.decode('utf-8'))

            if regex is not None:
                return regex.group(2)

        # In case Java isn't found simply return None
        except OSError:
            pass

Exception Handling

Specific errors / Finally

  try:
    x = 5 + 'hey'
  except ZeroDivisionError:
    print("Shall not see")
  finally:
    print("Aloha")
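
Putting the pieces together: a sketch that catches a tuple of specific exceptions, re-raises them as a custom error, and always runs `finally`. `DeployError` and `safe_divide` are hypothetical names; the custom class inherits from `Exception`, which is the usual base for application errors:

```python
class DeployError(Exception):
    """Hypothetical custom error for this example."""


def safe_divide(a, b):
    try:
        return a / b
    except (TypeError, ZeroDivisionError) as error:
        # Both failure modes are re-raised as one custom error
        raise DeployError("cannot divide: {}".format(error))
    finally:
        # Runs whether or not an exception was raised
        print("done")


print(safe_divide(10, 2))
```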

Tuples

A tuple is an ordered sequence of items, just like a list. The only difference is that tuples are immutable: once created, a tuple cannot be modified.

Tuples are used to write-protect data and are usually faster than lists because they cannot change dynamically.

???

  1. Show a tuple
  2. Show a function that returns a tuple
ITEMS = ["banana", "pera", "uva", "salada mista", "mamao"]

def get_uva():
    for i, item in enumerate(ITEMS):
        if item == 'uva':
            return i, item, "bar"
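
A short sketch of the two tuple behaviours mentioned above, unpacking and immutability. The host name and port values are made up for the example:

```python
point = ("web01", 8080)

# Unpacking assigns each item to its own name
host, port = point

try:
    point[1] = 9090
except TypeError as error:
    # Tuples do not support item assignment
    print("immutable: {}".format(error))
```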

Exercise if we have time:

Implement a function that finds the first PATTERN in a file and returns a tuple with the line number and the line content.


Dictionaries

Dictionary is an unordered collection of key-value pairs.

Dictionaries are optimized for retrieving a value by its key, which makes them a good fit for large collections of data. We must know the key to retrieve the value.

>>> foo = {}

???

  1. Create a dict from cmd. Show how to access the value
  2. Insert/Update key
  3. Looping over a dict
  4. Find if key exists with in and has_key
  5. Method keys/iterkeys, values
  6. Show iteritems method
  7. del dict[key]
  8. dict.get(key, default)
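
The operations in the notes can be sketched as follows. The `ports` dict is a hypothetical example. `has_key` and `iteritems` exist only in Python 2, so this sketch sticks to `in` and plain iteration, which work on both versions:

```python
ports = {"ssh": 22, "http": 80}

ports["https"] = 443           # insert a new key
ports["http"] = 8080           # update an existing key

print("ssh" in ports)          # membership test on keys
print(ports.get("ftp", 21))    # default when the key is missing

del ports["ssh"]               # remove a key

for service in sorted(ports):  # looping over a dict yields its keys
    print("{} -> {}".format(service, ports[service]))
```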

JSON

foo = {}

import json

content = json.dumps(foo)

print(content)

with open('fruits.json', 'w') as stream:
    json.dump(foo, stream, indent=4)
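
With an empty dict there is not much to see, so here is a round trip with some data. The `fruits` dict is a made-up example; `json.loads` is the counterpart of `json.dumps`:

```python
import json

fruits = {"banana": 3, "uva": 10}

text = json.dumps(fruits, sort_keys=True)   # dict -> JSON string
restored = json.loads(text)                 # JSON string -> dict

print(text)
```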

Exercises

Given a string, return a dict where the keys are the letters of the string and the values are how many times each letter occurs.

Example: banana

{'b': 1, 'a': 3, 'n': 2}

Python Modules

A module is a file containing Python definitions and statements. The file name is the module name with the suffix .py appended. Within a module, the module’s name (as a string) is available as the value of the global variable __name__. For instance, use your favorite text editor to create a file called fibo.py in the current directory with the following contents:

_MAGIC_VAR="HELLO"

def fib(n):    # write Fibonacci series up to n
    a, b = 0, 1
    while b < n:
        print b,
        a, b = b, a+b

def fib2(n):   # return Fibonacci series up to n
    result = []
    a, b = 0, 1
    while b < n:
        result.append(b)
        a, b = b, a+b
    return result

???

  1. Show how to import the code: import fibo
  2. Show method dir() that shows what was imported
  3. Show how to import only the fib function, and with an alias too
  4. Add some code to fibo.py to show that it will be executed
  5. import * to show that global _MAGIC_VAR won't get imported

Modules

How do we execute fibo and also make the functions available to other scripts without executing when fibo is imported?

--

if __name__ == "__main__":
    import sys
    fib(int(sys.argv[1]))

???

  1. First special variable __name__

Import fibo and show its name: fibo.__name__. Show the current __name__.

  1. Explain that imports can appear anywhere in the code. That's not an antipattern
import sys
sys.path

for path in sys.path:
  print(path)

Module Search Path

When a module named fibo is imported, the interpreter first searches for a built-in module with that name.

If not found, it then searches for a file named fibo.py in a list of directories given by the variable sys.path. sys.path is initialized from these locations:

  1. The directory containing the input script (or the current directory).
  2. PYTHONPATH (a list of directory names, with the same syntax as the shell variable PATH).
  3. The installation-dependent default

??? Move fib file to a custom directory and update pythonpath variable
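
A sketch of the search path in action, assuming a throwaway module named `hello_mod` (hypothetical) written to a temporary directory. Adding that directory to `sys.path` at runtime has the same effect as listing it in PYTHONPATH:

```python
import os
import sys
import tempfile

# Hypothetical module written to a temporary directory
module_dir = tempfile.mkdtemp()
with open(os.path.join(module_dir, "hello_mod.py"), "w") as stream:
    stream.write("GREETING = 'hello'\n")

# Without this line the import below fails with ImportError
sys.path.insert(0, module_dir)

import hello_mod
print(hello_mod.GREETING)
```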


Python Packages

Packages are a way of structuring Python’s module namespace by using dotted module names.

--

For example, the module name A.B designates a submodule named B in a package named A.


Python Packages

A package is a directory containing an __init__.py file.

The __init__.py files are required to make Python treat the directories as containing packages.

sound/                          Top-level package
      __init__.py               Initialize the sound package
      formats/                  Subpackage for file format conversions
              __init__.py
              wavread.py
              wavwrite.py
              aiffread.py
              aiffwrite.py
              auread.py
              auwrite.py
              ...
      effects/                  Subpackage for sound effects
              __init__.py
              echo.py
              surround.py
              reverse.py
              ...
      filters/                  Subpackage for filters
              __init__.py
              equalizer.py
              vocoder.py
              karaoke.py

??? Example of __init__.py file: https://github.com/docker/compose/blob/master/compose/cli/__init__.py

create a package example
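
One way to try this without touching your project tree: build a tiny package in a temporary directory and import from it. The package name `demo_pkg` and its `echo.shout` function are hypothetical:

```python
import os
import sys
import tempfile

root = tempfile.mkdtemp()
package_dir = os.path.join(root, "demo_pkg")   # hypothetical package name
os.mkdir(package_dir)

# __init__.py marks the directory as a package; it may be empty
open(os.path.join(package_dir, "__init__.py"), "w").close()

with open(os.path.join(package_dir, "echo.py"), "w") as stream:
    stream.write("def shout(text):\n    return text.upper()\n")

# Make the parent directory of the package importable
sys.path.insert(0, root)

from demo_pkg.echo import shout
print(shout("hi"))
```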


I/O

Handling Files

f = open("test.txt", 'r')
f.read()
f.close()

???

  1. read, then readlines, then readline
  2. line loop, print
  3. Loop using the file for line in f:
  4. Remove the newline with read().splitlines()
  5. try,finally
  6. with context manager

If you have time do this:

  1. Show the seek example to read the content again. Show the old way (py26), then the new way
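
The reading techniques from the notes, sketched against a sample file created in a temporary directory so the example is self-contained (the file content is made up):

```python
import os
import tempfile

# Create a small sample file so the example is self-contained
path = os.path.join(tempfile.mkdtemp(), "test.txt")
with open(path, "w") as writer:
    writer.write("first\nsecond\n")

# The with block closes the file even if an error occurs
with open(path) as reader:
    lines = reader.read().splitlines()    # no trailing newlines

with open(path) as reader:
    stripped = [line.rstrip("\n") for line in reader]   # line-by-line loop

with open(path) as reader:
    reader.read()
    reader.seek(0)             # rewind to read the content again
    first = reader.readline()
```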

--

Writing

with open("test.txt", mode='w') as writer:
  writer.write("my content\n")
  writer.write("this is nice\n")

???

  1. Writing file
  2. Append a file
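
Writing versus appending, as a sketch: mode `'w'` starts the file from scratch, mode `'a'` adds to the end. The temporary path is an assumption made to keep the example self-contained:

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "test.txt")

with open(path, mode="w") as writer:   # 'w' truncates or creates the file
    writer.write("my content\n")

with open(path, mode="a") as writer:   # 'a' adds to the end
    writer.write("this is nice\n")

with open(path) as reader:
    content = reader.read()
print(content)
```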

Exercise xx

Write a grep script, for real this time! Use your grep engine (Exercise 02) to save some time.

The file name should be exact: grep.py

The script should receive two arguments: first the search pattern, then the file path.

The script should exit with returncode 1 and return a message if:

  1. Both arguments were not provided. Print message: Two arguments are required, pattern and file path.
  2. The file doesn't exist. Print message: File filename doesn't exists

The script should return:

  1. The same way as GNU grep, return each match in a new line.
  2. Nothing if no lines match

Remember, the pythonic way is to ask for forgiveness, not permission!

BONUS: Replace GNU grep with your new script using a shell alias: alias grep="/path/to/grep.py"


Generators

Generators are lazy iterators!

To understand generators we must first understand iterators!


Iterators

Iterators are objects that implement the iterator protocol.

--

Python lists, tuples, dicts and sets are all examples of built-in iterables: calling iter() on them returns an iterator.

--

Python magic method __iter__ and next!

Method next is called whenever global function next() is called.

StopIteration is the exception raised when there is no longer any new value to return, signaling the end of the iteration.

??? __iter__ is the method called on initialization of an iterator. It should return an object that has a next method (in Python 3 this is renamed to __next__).


Iterators

class Fib:                                        
    def __init__(self, max):                      
        self.max = max

    def __iter__(self):                          
        self.a = 0
        self.b = 1
        return self

    def next(self):                          
        fib = self.a
        if fib > self.max:
            raise StopIteration                  
        self.a, self.b = self.b, self.a + self.b
        return fib     

??? When an iterator is used with a for loop, the for loop implicitly calls next()
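
Since the method is called `next` in Python 2 and `__next__` in Python 3, an iterator class can define both to run on either version. A sketch with a hypothetical `Countdown` class:

```python
class Countdown(object):
    """Hypothetical iterator that counts from start down to 1."""

    def __init__(self, start):
        self.current = start

    def __iter__(self):
        return self

    def __next__(self):          # Python 3 name
        if self.current <= 0:
            raise StopIteration
        value = self.current
        self.current -= 1
        return value

    next = __next__              # Python 2 name


print(list(Countdown(3)))
```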


Generators

A generator is a function that returns a generator iterator (just an object we can iterate over) and produces its values with yield.

--

yield returns control from the producer to the caller.

--

The next time the caller calls next(), the producer gets control back.

--

return vs. yield

return implies that the function is returning control of execution to the point where the function was called.

yield however, implies that the transfer of control is temporary and voluntary, and our function expects to regain it in the future.


Generators

Without Generators:

def fib(max):
    numbers = []
    a, b = 0, 1
    while a < max:
        numbers.append(a)
        a, b = b, a + b
    return numbers

--

With generators:

def fib(max):
    a, b = 0, 1
    while a < max:
        yield a
        a, b = b, a + b

--

??? Paste function in shell and show the object and next function Show in a loop.
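
What the shell session might look like as a script. The function is the same generator, with the parameter renamed to `limit` (my choice, to avoid shadowing the built-in `max`):

```python
def fib(limit):
    a, b = 0, 1
    while a < limit:
        yield a              # pause here and hand `a` to the caller
        a, b = b, a + b


numbers = fib(10)
print(next(numbers))         # the body runs up to the first yield
print(next(numbers))         # resumes right after the previous yield
print(list(numbers))         # consumes whatever is left
```

Nothing is computed until a value is requested, which is why generators are called lazy.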


Exercise - Fix grep function

What is wrong with our grep function?

--

Fix your grep script using generators.


Interacting with Underlying OS

import os

Interacting with the File System

Getting file information

??? os.getcwd() os.listdir() os.stat(file) // os.makedirs('foo/bar') os.listdir() // or give path

Modification time example

import datetime
import os

mod_time = os.stat('.gitignore').st_mtime
print(datetime.datetime.fromtimestamp(mod_time))


OS

Environment Variables

os.getenv('PATH')
os.environ.get('PATH', '/usr/bin:/usr/local/bin')

--

Creating a file at home. How can we do that?

??? Example: LOGNAME to collect username

print(os.path.join(os.getenv('HOME'), 'foo.txt'))
print(os.path.expanduser("~/foo.txt"))

Get file os.path.exists()

Remove it os.remove()

os.listdir('.')
os.mkdir('foo')
os.makedirs('foo/bar')
os.rename('foo', 'bar')

File path manipulation

  • os.path.basename -- filename
  • os.path.dirname -- directory
  • os.path.split -- split directory and filename
  • os.path.splitext(f) -- split file and extension

os.uname()

os.walk to use in our exercise

https://docs.python.org/2/library/os.html
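
A sketch of the path helpers and `os.walk` on a small temporary tree. The directory layout is made up for the example:

```python
import os
import tempfile

# Build a small temporary tree: root/foo/file1.sh and root/biz.txt
root = tempfile.mkdtemp()
os.makedirs(os.path.join(root, "foo"))
for relative in ("biz.txt", os.path.join("foo", "file1.sh")):
    open(os.path.join(root, relative), "w").close()

script = os.path.join(root, "foo", "file1.sh")
print(os.path.basename(script))     # file name
print(os.path.splitext(script))     # (path without extension, extension)

# os.walk yields (directory, subdirectory names, file names) per directory
found = []
for current_dir, dirnames, filenames in os.walk(root):
    for filename in filenames:
        full_path = os.path.join(current_dir, filename)
        found.append(os.path.relpath(full_path, root))
print(sorted(found))
```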


Exercise

Write a function that counts the file types found starting from the current working directory, keyed by file extension, plus the number of directories:

foo/
  bar/
    file1.sh
  biz.txt
folder/
  one.txt
{
  "txt": 2,
  "sh": 1,
  "directory": 3 
}

SHUTIL

import shutil

???

  1. Create a file
  2. shutil.copy to /tmp
  3. shutil.move to user home
  4. Create a bunch of temp files
  5. Use shutil.copytree
  6. Remove it using shutil.rmtree
  7. Create a backup using shutil.make_archive
shutil.make_archive('backup', format='zip', root_dir='/Users/bvale/projects/python-devops/exercise01')
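
The steps in the notes, sketched inside a temporary working directory so nothing outside it is touched. All file and folder names here are hypothetical:

```python
import os
import shutil
import tempfile

work = tempfile.mkdtemp()
source = os.path.join(work, "source")
os.mkdir(source)
with open(os.path.join(source, "app.conf"), "w") as stream:
    stream.write("debug=false\n")

# Copy one file, then the whole directory tree
shutil.copy(os.path.join(source, "app.conf"), os.path.join(work, "app.conf.bak"))
shutil.copytree(source, os.path.join(work, "copy"))

# make_archive appends the extension and returns the archive path
archive = shutil.make_archive(os.path.join(work, "backup"), format="zip",
                              root_dir=source)

# Remove the copied tree, including its contents
shutil.rmtree(os.path.join(work, "copy"))
print(archive)
```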

Exercises

  1. Write a script that compresses files older than 7 days in a given folder
./script /tmp/backup
  1. Write a second script that removes all files with a specific extension from a folder:
./script2 /tmp/files bkp

Subprocess :)

The subprocess module allows you to spawn new processes, connect to their input/output/error pipes, and obtain their return codes. This module intends to replace several older modules and functions:

os.system
os.spawn*
os.popen*
popen2.*
commands.*

PEP 324 – PEP proposing the subprocess module

--

Let's give it a try :)

subprocess.call(cmd)

??? subprocess.call(cmd) waits for cmd to finish and returns its return code

  1. Explain result code
  2. Explain list of commands
  3. Explain shell=True with exit and ls and the threats with ; plus rm -rf --no-preserve-root
  4. Explain stdout with a file

Subprocess :)

What if we don't want to write logic to identify errors?

--

subprocess.check_output(cmd)

???

  1. Show the exceptions and the output
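
A sketch of both outcomes of `check_output`: captured output on success, and `CalledProcessError` on a non-zero exit. As before, the child process is the Python interpreter itself, chosen only for portability:

```python
import subprocess
import sys

output = subprocess.check_output([sys.executable, "-c", "print('hello')"])
print(output.decode("utf-8").strip())

try:
    subprocess.check_output([sys.executable, "-c", "import sys; sys.exit(2)"])
    returncode = 0
except subprocess.CalledProcessError as error:
    # A non-zero exit status is turned into an exception
    returncode = error.returncode
print(returncode)
```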

Subprocess

Complex cases. Popen!

https://docs.python.org/2/library/subprocess.html#popen-constructor

??? Explain why Popen is useful, e.g: cwd, env, rc, stdout, stderr

Open process and explain the lifecycle child = subprocess.Popen(["jconsole"])

  1. child.poll()

  2. child.kill()

  3. Open a shell with child = subprocess.Popen("/bin/bash", stdin=subprocess.PIPE, stdout=subprocess.PIPE) and use communicate to send data
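
The `communicate` step can be sketched without a shell. The child below is a hypothetical Python one-liner that upper-cases whatever it reads from stdin:

```python
import subprocess
import sys

# Hypothetical child: a Python one-liner that upper-cases its stdin
child = subprocess.Popen(
    [sys.executable, "-c",
     "import sys; sys.stdout.write(sys.stdin.read().upper())"],
    stdin=subprocess.PIPE,
    stdout=subprocess.PIPE,
)

# communicate() sends the input, waits for exit and returns (stdout, stderr)
stdout, stderr = child.communicate(b"deploy\n")
print(stdout)
print(child.returncode)
```

`stderr` comes back as `None` here because it was not redirected to a pipe.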


Exercise Time

Write a script to deploy a Java web app :)

  1. Install tomcat and jdk8 -- if needed
  2. Start tomcat -- if needed
  3. Deploy your WAR file (I will provide a WAR file if you don't have one)
  4. Reload tomcat
  5. Make your deployer idempotent
docker run -it --cap-add SYS_PTRACE --rm -p 8080:8080 -v "$PWD:/code" bernardovale/pyubuntu

# Execute your deployer inside the container
./deploy.py

Argparse

The argparse module makes it easy to write user-friendly command-line interfaces. The program defines what arguments it requires, and argparse will figure out how to parse those out of sys.argv. The argparse module also automatically generates help and usage messages and issues errors when users give the program invalid arguments.

#!/usr/bin/python
import argparse

parser = argparse.ArgumentParser(description='My program is really cool.')

args = parser.parse_args()

??? In a script, parse_args() will typically be called with no arguments, and the ArgumentParser will automatically determine the command-line arguments from sys.argv.

  1. Positional arguments parser.add_argument('double', type=int, help="Double the given number")
  2. Optional arguments: parser.add_argument("--verbosity", help="increase output verbosity", action="store_true")
  3. Short options: parser.add_argument("-v", "--verbosity", help="increase output verbosity", action="store_true")
  4. Optional with values: parser.add_argument('--divide', type=int)
total = args.double*2
if args.divide:
    total = total / args.divide
print total
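
The fragments above, assembled into one sketch. Wrapping the parser in functions (`build_parser` and `compute`, both names of my choosing) lets us pass an explicit argument list instead of reading `sys.argv`, which makes the script easy to test:

```python
import argparse


def build_parser():
    parser = argparse.ArgumentParser(description="My program is really cool.")
    parser.add_argument("double", type=int, help="Double the given number")
    parser.add_argument("-v", "--verbosity", action="store_true",
                        help="increase output verbosity")
    parser.add_argument("--divide", type=int)
    return parser


def compute(argv):
    # Passing a list makes the parser testable; None would read sys.argv
    args = build_parser().parse_args(argv)
    total = args.double * 2
    if args.divide:
        total = total / args.divide
    return total


print(compute(["21"]))
```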

Exercise Argparse

Write a script using argparse to print files of a given folder.

Each file should be printed in a separate line with the format:

user group file
  1. The first positional argument should be a path
  2. Provide an optional argument --size that accepts one of the values [bytes, KB, MB, GB]. If this argument is used, the program prints each file with its size formatted in the given unit.
  3. Provide an optional argument --machine that prints the fields separated by commas

Example:

./prog.py dist/ --machine --size MB

bvale,staff,4MB,folder1/
bvale,staff,0.1MB,foo.txt
bvale,staff,210MB,folder2/

Virtual Environment

A Virtual Environment is a tool to keep the dependencies required by different projects in separate places, by creating virtual Python environments for them. It solves the "Project X depends on version 1.x but, Project Y needs 4.x" dilemma, and keeps your global site-packages directory clean and manageable.

pip install virtualenv

Creating a virtualenv:

python -m virtualenv name/

Code Dependencies

pip is the official Python tool to handle dependencies.

We usually define a requirements.txt which is a file containing a list of items to be installed using pip install like so:

pip install -r requirements.txt

Requests - HTTP for Humans

Requests allows you to send organic, grass-fed HTTP/1.1 requests, without the need for manual labor. There's no need to manually add query strings to your URLs, or to form-encode your POST data. Keep-alive and HTTP connection pooling are 100% automatic, thanks to urllib3.

import requests
r = requests.get("https://api.chucknorris.io/jokes/random")

???

  1. r.status_code
  2. r.json()['value']

Packaging

Distributing python applications the right way.

Let's write a chucknorris CLI tool

setup.py:

from setuptools import setup  # install_requires is a setuptools feature

setup(
    name='ChuckNorris',
    version='0.1.0',
    author='Bernardo Vale',
    author_email='bvale@avenuecode.com',
    packages=['chucknorris'],
    scripts=['bin/chucknorris'],
    description='ChuckNorris cli quotes generator.',
    long_description=open('README.txt').read(),
    install_requires=[
        "requests >= 1.1.1",
    ],
)

??? A Python installation has a site-packages directory inside the module directory. This directory is where user installed packages are dropped.

  1. Create the setup.py in the root folder
  2. bin folder and chucknorris module folder

Final Project - Exercise

Using your new awesome Python skills write a CI tool to replace Jenkins (lol)

Your challenge is to write a script that will execute a simple pipeline (already designed) inside our Git repository:

https://github.com/bernardoVale/devops-scripting.git

Inside the repository, we've added a pipeline.yml file. Your CI tool will look up this file to execute the requested pipeline, which you will implement in your tool.

At a glance, your script needs to:

  1. Fetch git code (https://github.com/bernardoVale/devops-scripting.git)
  2. Parse the pipeline.yml
  3. Run selected pipeline

Pipeline.yml Format

The pipeline file has only three main hashes.

Branch: Repository branch that should be used to execute the code.

Tasks: Saved commands that can be used to create a pipeline

Pipelines: Group of ordered tasks

Script Arguments

  1. Pipeline name, E.g: build
  2. Git repository URL

Examples

Assuming your script name is pipeline - You're free to give a name to your CI tool :)

Should fetch git code and execute build pipeline:

./pipeline build https://bitbucket.org/ac-recruitment/scripting-helloworld.git

Deprecated slides

sys

This module provides access to some variables used or maintained by the interpreter and to functions that interact strongly with the interpreter. It is always available.

import sys

??? sys.argv


Classes

Not going to talk about classes


PEP8

PEP8 flake8 = pep8 + pyflakes pyflakes = tox


Regular Expression

Regular expressions are handled by the re module: import re

Most functions follow the same argument structure:

  1. Pattern
  2. String
  3. Flags

--

FINDALL

Return a list with all matches

import re
re.findall(r'\d+', "Bernardo has 25 years old and he's a "
"DevOps Engineer at AvenueCode")

???

Findall \d = any digit

+ = One or more occurrences

Regular Expression

Compiling a regular expression

regex = re.compile(r'https?://')
regex.match("https://avenuecode.com")

??? Demonstrate re.match() = Return match object or None

-- Searching for a pattern

print(re.match(r'avenuecode', "https://avenuecode.com"))
print(re.search(r'avenuecode', "https://avenuecode.com"))

??? .group() to return the matched string
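
The difference between the two calls, as a sketch: `re.match` only looks at the beginning of the string, while `re.search` scans all of it.

```python
import re

url = "https://avenuecode.com"

at_start = re.match(r"avenuecode", url)     # only matches at position 0
anywhere = re.search(r"avenuecode", url)    # scans the whole string

print(at_start)
print(anywhere.group())
```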

--

Find all occurrences

test = """
https://avenuecode.com
https://avenuecode.com.br
https://acdc.avenuecode.com
https://jenkins.avenuecode.com"""
print(re.search(r'avenuecode', test))
print(re.findall(r'avenuecode', test))

Regular Expression

Replacing stuff!

re.sub(r'avenuecode', 'ac4success', "https://avenuecode.com")

-- Grouping

regex = re.match(r'(http|https)://(\w*)\.(\w.*)', "https://jenkins.avenuecode.com", flags=re.MULTILINE)
regex.group(1)