Python Intermediate

Topics

data types in depth
- strings and string processing
- bytes
- sequences
object oriented programming and classes
advanced use of control structures
creating lists and dictionaries via comprehensions
throwing and catching exceptions
modules and packages
package management in Python

Data types in Python

Basic data types

int
float
bool
NoneType
string

Collections

list
tuple
dict

Other data types

decimal
complex
bytes, bytearray
set, frozenset
NamedTuple
...

None

None is a Singleton:

there is only ever a single instance of it inside a running Python program
multiple variables may refer to that same instance

Comparisons via "is"

The keyword is checks whether two references / names refer to the same object.

a = [1, 2]
b = a
x = [1, 2]

a == b # True
a is b # True
a == x # True
a is x # False

None and "is"

As None is a singleton we can check for it via is None:

if a is None:
    print("a is None")

bool

True or False

a = True
if a:
    print('hello')

bool

Internally False behaves almost like 0 and True behaves almost like 1

False + True # 1

Numbers

Operations with numbers

Integer division: 10 // 3
Remainder: 10 % 3
Power: 2 ** 3

Underscores in literals

to help us read long numbers:

earth_circumference = 40075017
earth_circumference = 40_075_017

int

integers of arbitrary size

int

Other numeral systems:

a = 42 # decimal
b = 0b101010 # binary
c = 0o52 # octal
d = 0x2a # hexadecimal

e = int('101010', 2)
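The conversion also works in the other direction. A minimal sketch using the built-in functions bin(), oct() and hex() as well as format specifications; the variable names are only chosen for illustration:

```python
n = 42

# integer -> string in another numeral system (with prefix)
binary = bin(n)    # '0b101010'
octal = oct(n)     # '0o52'
hexa = hex(n)      # '0x2a'

# without the prefix, via format specifications
plain_binary = f"{n:b}"   # '101010'
plain_hex = f"{n:x}"      # '2a'

# string -> integer, with an explicit base
print(int("101010", 2), int("52", 8), int("2a", 16))   # 42 42 42
```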

float

64 bit float

a = 2.3
b = .2
c = 6e23
d = float('nan')
e = float('inf')

float

rounding errors: some numbers cannot be represented exactly as floating point numbers; they will always be approximations

examples in the decimal system: 1/3, 1/7, π

examples in the binary system (i.e. floats): 1/10, 1/5, 1/3, π

example: π + π evaluates to 6.2 when using decimal numbers with two significant digits (3.1 + 3.1); a more exact result would be 6.3

example: 0.1 + 0.2 evaluates to ~ 0.30000000000000004 when using 64 bit floats

float

0.1 + 0.2 == 0.3
# False

import math

math.isclose(0.1 + 0.2, 0.3)
# True

float

IEEE 754: standardized floating point arithmetic

Python mostly acts as defined in the standard

deviation from the standard: in some cases Python will throw exceptions where the standard produces a result - e.g. 1.0/0.0

Special numbers in IEEE 754:

inf and -inf (infinite values)
nan (not-a-number: undefined / unknown value)
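The behavior of these special values can be tried out directly. A short sketch; the names inf and nan are just local variables chosen for this example:

```python
import math

inf = float("inf")
nan = float("nan")

print(inf > 10**100)      # True: inf is larger than any finite number
print(-inf < -10**100)    # True
print(inf - inf)          # nan: the result is undefined

# nan is not equal to anything, not even to itself
print(nan == nan)         # False
print(math.isnan(nan))    # True: the reliable way to check for nan

# deviation from the standard: division by zero raises an exception
try:
    1.0 / 0.0
except ZeroDivisionError as e:
    print("exception:", e)
```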

complex

a = 2 + 3j

Augmented assignment

For binary operators there are so-called augmented assignments:

a = a + 1

short form (augmented assignment):

a += 1

other operations: -=, *=, ...

Character encodings

Unicode characters

Unicode: catalog of over 100,000 international characters, each with a unique identifying name and number (usually written in hexadecimal)

examples:

K: U+004B (Latin capital letter K)
?: U+003F (Question mark)
ä: U+00E4 (Latin small letter a with a diaeresis)
€: U+20AC (Euro sign)
🙂: U+1F642 (Slightly smiling face)

tables of all Unicode characters

Character encodings

Character encoding = mapping of characters to bit sequences

ASCII: encodes the first 128 Unicode characters, can represent characters like A, !, $, space, line break
Latin1: encodes the first 256 Unicode characters, can represent ASCII characters and characters like ä, á, ß, §
UTF-8, UTF-16, UTF-32: encode all Unicode characters

A character encoding is necessary in order to write text to disk or transfer it over the network

Character encodings

Examples in ASCII / Latin1 / UTF-8:

! ↔ 00100001
A ↔ 01000001
a ↔ 01100001

Examples in Latin1:

Ä ↔ 11000100

Examples in UTF-8:

Ä ↔ 11000011 10000100
🙂 ↔ 11110000 10011111 10011001 10000010
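These bit sequences can be reproduced in Python via str.encode(). A small sketch; to_bits is a helper written only for this example and not part of the standard library:

```python
def to_bits(data):
    """Return the bytes in data as space-separated groups of 8 bits."""
    return " ".join(f"{byte:08b}" for byte in data)

print(to_bits("A".encode("ascii")))     # 01000001
print(to_bits("Ä".encode("latin1")))    # 11000100
print(to_bits("Ä".encode("utf-8")))     # 11000011 10000100
print(to_bits("🙂".encode("utf-8")))    # 11110000 10011111 10011001 10000010
```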

UTF-8

In many areas (in particular on the web) UTF-8 has become the standard text encoding

In UTF-8 the first 128 Unicode characters can be encoded in just 8 bits

All other characters need either 16, 24 or 32 bits

UTF-32

UTF-32 encodes the Unicode code points directly

Depending on the area of application the byte order may differ (big endian or little endian)

example:

🙂 (U+1F642) ↔ 00 01 F6 42 (big endian) or 42 F6 01 00 (little endian)
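Both byte orders can be checked with the codecs "utf-32-be" and "utf-32-le". A brief sketch, assuming Python 3.8 or newer (for the separator argument of bytes.hex()):

```python
smiley = "\U0001F642"  # 🙂

print(hex(ord(smiley)))                       # 0x1f642 (the code point)
print(smiley.encode("utf-32-be").hex(" "))    # 00 01 f6 42
print(smiley.encode("utf-32-le").hex(" "))    # 42 f6 01 00
```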

Line breaks

Line breaks can be represented by the characters LF (line feed, U+000A) and / or CR (carriage return, U+000D)

LF: Standard on Linux, macOS
CRLF: Standard on Windows, in network protocols like HTTP

In string literals LF is often represented by \n and CR is represented by \r

Strings

In Python 3 strings are sequences of Unicode characters

String literals

Examples:

a = "test"
b = 'test'

Multi-line string literals

a = """this
is a multi-line
string literal.
"""

Escape sequences

Some characters may be entered via so-called escape sequences:

a = "He said:\n\"Hi!\""

Escape sequences

\' → '
\" → "
\\ → \
\n → Line Feed (line separator on Unix)
\r\n → Carriage Return + Line Feed (line separator on Windows)
\t → Tab
\xHH or \uHHHH or \UHHHHHHHH → Unicode-Codepoint (hexadecimal)

Raw Strings

If backslashes in a string should not be interpreted as escape sequences, we can use a raw string (prefix r):

path = r"C:\documents\foo\news.txt"

This can be useful when writing Windows paths and regular expressions

String methods

.lower()
.upper()

String methods

.startswith(...)
.endswith(".txt")

String methods

.center(10)
.ljust(10)
.rjust(10)

String methods

.strip()
.split(' ')
.splitlines()
.join()
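A short sketch of how some of these methods can be combined; the sample text is made up for this example:

```python
line = "  Alice, Bob,Charlie  \n"

# remove leading / trailing whitespace (including the line break)
cleaned = line.strip()
print(cleaned)                           # Alice, Bob,Charlie

# split at commas, then strip each part
names = [name.strip() for name in cleaned.split(",")]
print(names)                             # ['Alice', 'Bob', 'Charlie']

# join is called on the separator string
print(" | ".join(names))                 # Alice | Bob | Charlie

print("a\nb\r\nc".splitlines())          # ['a', 'b', 'c']
print("[" + "hi".center(6) + "]")        # [  hi  ]
print("report.txt".endswith(".txt"))     # True
```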

Exercise: formatting Othello

sources:

Exercise: formatting Othello

input:

  Rodorigo. Neuer tell me, I take it much vnkindly
That thou (Iago) who hast had my purse,
As if y strings were thine, should'st know of this

target:

Rodorigo. Neuer tell me, I take it much vnkindly           1
That thou (Iago) who hast had my purse,                    2
As if y strings were thine, should'st know of this         3

Exercise: formatting Othello

tasks:

remove leading whitespace of each line
add a line number to the end of each line and place the line number right-aligned at character 70
(place line numbers only in every 5th line)
(instead of placing line numbers at character 70, use the longest line as a reference)

String formatting

String formatting = placing values in strings

Methods:

greeting = "Hello, " + name + "!"

greeting = f"Hello, {name}!"

String formatting: methods

city = 'Vienna'
temperature = 23.7

# rather obsolete
'weather in %s: %f°C' % (city, temperature)

'weather in {0}: {1}°C'.format(city, temperature)
'weather in {}: {}°C'.format(city, temperature)
'weather in {c}: {t}°C'.format(c=city, t=temperature)

f'weather in {city}: {temperature}°C'

Format specification

.4f: four digits after the decimal point
.4g: four significant digits

print(f"Pi is {math.pi:.4f}")
# Pi is 3.1416
print(f"Pi is {math.pi:.4g}")
# Pi is 3.142

Format specification

>8: right-aligned (total width 8)
^8: centered
<8: left-aligned

print(f"{first_name:>8}")
print(f"{last_name:>8}")

    John
     Doe

Format specification

combination:

print(f"{menu_item:<12} {price:>6.2f}$")

Burger        11.90$
Salad          8.90$
Fries          3.90$
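A runnable sketch that produces this kind of output, assuming the menu is stored as a list of (name, price) tuples; the variable names menu and lines are chosen for illustration:

```python
menu = [("Burger", 11.9), ("Salad", 8.9), ("Fries", 3.9)]

lines = []
for menu_item, price in menu:
    # name: left-aligned, width 12
    # price: right-aligned, width 6, two digits after the decimal point
    lines.append(f"{menu_item:<12} {price:>6.2f}$")

print("\n".join(lines))
```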

Format specification

further options:

Python String Format Cookbook by Marcus Kazmierczak

Lists

Lists are mutable sequences of objects; they are usually used to store homogeneous entries (entries of the same type and structure)

primes = [2, 3, 5, 7, 11]

users = ["Alice", "Bob", "Charlie"]

Operations on lists

The following operations will also work on other sequences - e.g. tuples, strings or bytes

accessing elements (via index): users[2]
accessing multiple elements (sublist): users[2:4]
concatenation: users + users
repetition: 3 * users
length: len(users)
for loop: for user in users:
membership test (in): if 'Tim' in users:

Operations on lists - mutations

Lists can be mutated directly (while strings and tuples can't be):

appending: users.append("Dan")
inserting: users.insert(2, "Max")
removing the last element: users.pop()
removing an element by index: users.pop(2)

Sorting lists

sorting by default order (alphabetically for strings)

l.sort()

sorting by custom order:

by string length
by number of occurrences of the letter "a"

l.sort(key=len)

def count_a(s):
    return s.count("a")
l.sort(key=count_a)
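A runnable variant of the above, with a made-up word list. It additionally uses the built-in sorted() and the reverse parameter:

```python
words = ["banana", "fig", "avocado", "kiwi"]

# sorted() returns a new list and leaves the original unchanged
print(sorted(words, key=len))                # ['fig', 'kiwi', 'banana', 'avocado']
print(sorted(words, key=len, reverse=True))  # ['avocado', 'banana', 'kiwi', 'fig']

def count_a(s):
    return s.count("a")

# .sort() changes the list in place; the sort is stable, so words
# with the same number of "a"s keep their previous relative order
words.sort(key=count_a)
print(words)   # ['fig', 'kiwi', 'avocado', 'banana']
```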

Exercises

shuffling cards
list of prime numbers
insertion sort

Tuples

Creating tuples

Entries are separated by commas, usually surrounded by round brackets.

empty_tuple = ()
single_value = ('Thomas', )
single_value = 'Thomas',
two_values = ('Thomas', 'Smith')
two_values = 'Thomas', 'Smith'

Unpacking of tuples

time = (23, 45, 0)

hour, minute, second = time

swapping variables:

a, b = b, a

Bytes

when reading from storage media or reading network responses we may have to deal with bytes: sequences of integers in the range of 0 to 255 (8 bits)

bytes may represent images, text, data, ...

Hexadecimal notation

bytes are often written in hexadecimal notation instead of decimal:

1_dec = 1_hex
9_dec = 9_hex
10_dec = a_hex
15_dec = f_hex
16_dec = 10_hex
17_dec = 11_hex
31_dec = 1f_hex
32_dec = 20_hex

Hexadecimal notation

hexadecimal literals in Python:

1 = 0x1
9 = 0x9
10 = 0xa
15 = 0xf
16 = 0x10
17 = 0x11
31 = 0x1f
32 = 0x20

Creation

creating bytes from a list of numbers:

a = bytes([0, 64, 112, 160, 255])
b = bytes([0, 0x40, 0x70, 0xa0, 0xff])

creating bytes from a byte string literal:

c = b"\x00\x40\x70\xa0\xff"

ASCII values can be included directly (\x40 = "@", \x70 = "p"):

d = b"\x00@p\xa0\xff"

Bytes

Standard representation in Python:

print(bytes([0x00, 0x40, 0x70, 0xa0, 0xff]))

b'\x00@p\xa0\xff'

Where possible, bytes will be represented by ASCII characters; otherwise their hex code will be shown

Bytes and Strings

Bytes will often hold encoded text

If we know the encoding we can convert between bytes and strings:

'ä'.encode('utf-8')
# b'\xc3\xa4'

b'\xc3\xa4'.decode('utf-8')
# 'ä'
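If the wrong encoding is used, the conversion either yields wrong characters or fails with an exception. A short sketch of both cases:

```python
data = "ä".encode("utf-8")     # b'\xc3\xa4'

# decoding with the wrong encoding may silently yield wrong characters ...
print(data.decode("latin1"))   # Ã¤

# ... or fail if the bytes are not valid in that encoding
try:
    data.decode("ascii")
except UnicodeDecodeError as e:
    print("cannot decode:", e)

# the reverse direction can fail, too: Latin1 has no Euro sign
try:
    "€".encode("latin1")
except UnicodeEncodeError as e:
    print("cannot encode:", e)
```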

Sequences

Python sequences consist of other Python objects

examples:

lists
tuples
strings
bytes

Operations on sequences

accessing an element (via index): s[2]
accessing multiple elements: s[2:4]
concatenation: s + t
repetition: 3 * s
length: len(s)
for loop: for el in s:
membership test (in): if el in s:

Operations

Accessing elements

users = ['mike', 'tim', 'theresa']

users[0] # 'mike'
users[-1] # 'theresa'

Operations

Changing elements

(if the sequence is mutable)

users = ['mike', 'tim', 'theresa']

users[0] = 'molly'

Operations

Accessing multiple elements

users = ['mike', 'tim', 'theresa']

users[0:2] # ['mike', 'tim']

Operations

Concatenation

users = ['mike', 'tim', 'theresa']

new_users = users + ['tina', 'michelle']

Operations

Repetition

users = ['mike', 'tim', 'theresa']

new_users = users * 3

Operations

Length

users = ['mike', 'tim', 'theresa']

print(len(users))

Operations

for loop

users = ['mike', 'tim', 'theresa']

for user in users:
    print(user.upper())

Dictionaries

Dictionaries are mappings of keys to values

person = {
    "first_name": "John",
    "last_name": "Doe",
    "nationality": "Canada",
    "birth_year": 1980
}

Dictionaries

Accessing entries

person["first_name"] # "John"

Dictionaries

Iterating over dictionaries

for entry in person:
    print(entry)

This will yield the keys: "first_name", "last_name", "nationality", "birth_year"

Since Python 3.7 the keys will always remain in insertion order

Dictionaries

Iterating over key-value-pairs:

for key, value in person.items():
    print(f'{key}, {value}')

Operations on dictionaries

d = {0: 'zero', 1: 'one', 2: 'two'}

d[2]
d[2] = 'TWO'
d[3] # KeyError
d.get(3) # None
d.setdefault(2, 'n')
d.setdefault(3, 'n')

d.keys()
d.items()

d1.update(d2)
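The effect of these operations, shown as a runnable sketch with the same dictionary d (the entries passed to update are made up for this example):

```python
d = {0: "zero", 1: "one", 2: "two"}

print(d.get(3))              # None
print(d.get(3, "unknown"))   # unknown (custom fallback value)

# setdefault only sets the value if the key does not exist yet
print(d.setdefault(2, "n"))  # two (existing entry is kept)
print(d.setdefault(3, "n"))  # n (new entry is created)

# update copies all entries from another dictionary,
# overwriting existing keys
d.update({3: "three", 4: "four"})
print(d)   # {0: 'zero', 1: 'one', 2: 'two', 3: 'three', 4: 'four'}

print(list(d.keys()))        # [0, 1, 2, 3, 4]
```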

Valid keys

Any hashable object can act as a dictionary key; in practice these are immutable objects such as strings, numbers or tuples of immutable values. The most common types of keys are strings.
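A small sketch with tuples as keys; the board dictionary is a made-up example (positions on a tic-tac-toe board):

```python
# tuples of numbers can be used as keys
board = {(0, 0): "X", (1, 1): "O"}
print(board[(1, 1)])   # O

# a list is mutable and cannot be used as a key
try:
    board[[2, 2]] = "X"
except TypeError as e:
    print("invalid key:", e)
```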

Exercises

vocabulary trainer
todo list

Object-oriented programming and classes

Object orientation in Python: "Everything is an object"

a = 20

a.to_bytes(1, "big")

"hello".upper()

Types and instances

message = "hello"

type(message)

isinstance(message, str)

Classes

Classes may represent various things, e.g.:

a message inside an e-mail program
a user of a website
a car in a racing game
a shopping basket in an online shop
a bank account
a data set to be analysed
...

Classes

The definition of a class usually encompasses:

a "data structure" (attributes)
a "behavior" (methods)

Classes

example: class TextIOWrapper can represent a text file (is created when calling open())

attributes:

closed
encoding
mode (e.g. r=read, w=write)
name (filename)
...

methods:

close()
read()
write()
...

Classes

example: class BankAccount

"data structure" (attributes)
"behavior" (methods)

Defining classes

class MyClass():

    # the method __init__ initializes the object
    def __init__(self):
        # inside any method, self will refer
        # to the current instance of the class
        self.message = "hello"

instance = MyClass()
instance.message # "hello"

Private attributes and methods

Attributes and methods that should not be used from the outside are prefixed with _

We're all consenting adults here: https://mail.python.org/pipermail/tutor/2003-October/025932.html

Example: class "Length"

a = Length(2.54, "cm")
b = Length(3, "in")

a.unit
a.value
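One possible way to implement such a class, as a sketch only: the conversion table _FACTORS and the method to() are assumptions made for this example and are not prescribed by the usage shown above.

```python
class Length:
    # conversion factors to metres (assumption: only these units are supported)
    _FACTORS = {"m": 1.0, "cm": 0.01, "in": 0.0254}

    def __init__(self, value, unit):
        if unit not in self._FACTORS:
            raise ValueError(f"unknown unit: {unit}")
        self.value = value
        self.unit = unit

    def to(self, unit):
        """Return a new Length converted to the given unit (hypothetical method)."""
        metres = self.value * self._FACTORS[self.unit]
        return Length(metres / self._FACTORS[unit], unit)


a = Length(2.54, "cm")
b = Length(3, "in")

print(a.value, a.unit)             # 2.54 cm
c = b.to("cm")
print(round(c.value, 2), c.unit)   # 7.62 cm
```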

Exercise: classes "TodoList" and "Todo"

tdl = TodoList("groceries")

tdl.add("milk")
tdl.add("bread")

print(tdl.todos)
tdl.todos[0].toggle()

tdl.stats() # {"open": 1, "completed": 1}

Inheritance and composition

often we can use some class(es) as the basis for another class

e.g.:

User class as the basis of the AdminUser class
TicTacToeGame as the basis of TicTacToeGameGUI

Inheritance and composition

inheritance: an AdminUser is a User

composition: TicTacToeGameGUI could use TicTacToeGame in the background

common mantra: composition over inheritance: don't overuse inheritance

Inheritance

inheritance:

class User():
    ...

class AdminUser(User):
    ...

the AdminUser class automatically inherits all existing methods from the User class
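A runnable sketch of this relationship; the attribute name and the methods greet and delete_user are hypothetical and only serve as illustration:

```python
class User:
    def __init__(self, name):
        self.name = name

    def greet(self):
        return f"Hello, {self.name}!"


class AdminUser(User):
    # additional method that only admins have
    def delete_user(self, other):
        return f"{self.name} deleted {other.name}"


admin = AdminUser("Alice")   # __init__ is inherited from User
user = User("Bob")

print(admin.greet())              # Hello, Alice! (inherited from User)
print(admin.delete_user(user))    # Alice deleted Bob
print(isinstance(admin, User))    # True: an AdminUser is a User
```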

Composition

composition:

class TicTacToeGame():
    ...

class TicTacToeGameGUI():
    def __init__(self):
        self.game = TicTacToeGame()

Inheritance

example of inheritance - database model in Django:

from django.db import models

class Question(models.Model):
    question_text = models.CharField(max_length=200)
    pub_date = models.DateTimeField('date published')

Code style

PEP8

standard styleguide for Python code

official document: https://www.python.org/dev/peps/pep-0008/

cheatsheet: https://gist.github.com/RichardBronosky/454964087739a449da04

Code formatters

black
autopep8
yapf

In VS Code config: "python.formatting.provider": "black"

Code formatters

input:

a='Hello'; b="Have you read \"1984\"?"
c=a[0+1:3]

output via black:

a = "Hello"
b = 'Have you read "1984"?'
c = a[0 + 1 : 3]

Python philosophy, Zen of Python

Quotes from the zen of Python (full text via import this):

Explicit is better than implicit.
Readability counts.
Special cases aren't special enough to break the rules.
There should be one-- and preferably only one --obvious way to do it.

Docstrings

Docstrings = Strings that describe functions / classes / modules in more detail

comments in a function: help programmers who develop that function

docstring of a function: help programmers who use that function

Docstrings

Example:

def fib(n):
    """Compute the n-th Fibonacci number.

    n must be a nonnegative integer
    """
    ...

Viewing docstrings

help(fib)
help(round)
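A complete version of fib that can be passed to help(). The function body is one possible implementation, assuming that counting starts with fib(0) == 0:

```python
def fib(n):
    """Compute the n-th Fibonacci number.

    n must be a nonnegative integer
    """
    # assumption: fib(0) == 0, fib(1) == 1
    a, b = 0, 1
    for _ in range(n):
        a, b = b, a + b
    return a

print(fib(10))        # 55

# the docstring is stored on the function object;
# help(fib) displays it together with the signature
print(fib.__doc__)
```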

Debugging

Breakpoints can be set to pause execution at a certain point.

Possibilities to set breakpoints:

in VS Code: click next to the line number
directly in Python Code via breakpoint() (since Python 3.7)

Executing in VS Code:

Symbol "Run and Debug" in the left toolbar
select "create a launch.json file" - "Python" - "Python File"

via Debug - Start Debugging or F5.

Debugging

Continuing manually:

proceed until the next breakpoint:
- Continue in VS Code
- c for continue in the Python debugger
end debugging:
- Stop in VS Code
- q for quit in the Python debugger

Debugging

Continuing manually:

proceed to the next line:
- Step Over in VS Code
- n for next in the Python debugger
proceed to the next line - potentially following function calls
- Step Into in VS Code
- s for step in the Python debugger
run the current function to its end:
- Step Out in VS Code
- r for return in the Python debugger

Debugging

Examining values in VS Code:

local variables in the variables widget
watch custom expressions in the watch widget

Printing values in the Python debugger via p:

p mylist
p mylist[0]

Control structures

if
loops
- while
- for ... in ...
- for ... in range()
try ... except ...

if

From a previous example:

if age_seconds < 1000000000:
    print("You are less than 1 billion seconds old")
else:
    print("You are older than 1 billion seconds")

Conditions

When using conditions for if / while we usually use expressions that evaluate to boolean values.

However, we can also use other types:

a = 0
if a: ...

name = input("enter your name")
if name: ...

products = []
if products: ...

These types are converted to boolean values before being used as criteria for the if condition.

Conditions

Any value may be used as a condition in Python. Most values will be "truthy".

The following built-in values are considered "falsy" - calling bool(...) on them will return False:

False
0, 0.0
None
empty collections / sequences ("", [], (), {}, set())
(before Python 3.5: datetime.time(0, 0, 0))

Conditions

Not "pythonic":

name = input("Enter your name:")
if name != "":
    ...

"pythonic" version:

name = input("Enter your name:")
if name:
    ...

Chaining comparisons

checking if age lies in the range of 13-19:

13 <= age and age <= 19

short version:

13 <= age <= 19

checking if a and b are both 0 (short version):

a == b == 0

if expressions

An expression that evaluates to one of two possibilities based on a boolean criterion

size = 'small' if length < 100 else 'big'

in other languages this could be written as:

// JavaScript
size = length < 100 ? 'small' : 'big';

For loops

For loops with tuple unpacking

Recap: tuple unpacking

time = (23, 45, 0)

hour, minute, second = time

For loops with tuple unpacking

Enumerating list items:

l = ['Alice', 'Bob', 'Charlie']

for i, name in enumerate(l):
    print(f'{i}: {name}')

Enumerate returns a data structure that behaves like this list:

[(0, 'Alice'), (1, 'Bob'), (2, 'Charlie')]

For loops with tuple unpacking

Listing directory contents (including subfolders) via os.walk:

import os

for directory, dirs, files in os.walk("C:\\"):
    print(f"{directory} {files}")

C:\ []
C:\PerfLogs []
C:\Program Files []
C:\ProgramData []
...

Continue

keyword continue: similar to break, but only skips the rest of the current iteration

example:

import os

for name in os.listdir("."):
    if not name.endswith(".txt"):
        # skip all files that are not .txt files
        continue
    # process .txt files here

Comprehensions

List comprehensions

List comprehensions enable the creation of lists based on existing lists

In other programming languages this is often done via map and filter

List comprehensions

Transforming each entry:

names = ["Alice", "Bob", "Charlie"]

uppercase_names = [name.upper() for name in names]

result:

["ALICE", "BOB", "CHARLIE"]

List comprehensions

Filtering:

amounts = [10, -7, 8, 19, -2]

positive_amounts = [amount for amount in amounts if amount > 0]

result:

[10, 8, 19]

List comprehensions

Generic syntax:

new_list = [new_entry for entry in old_list]

new_list = [new_entry for entry in old_list if condition]
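Transforming and filtering can be combined. The sketch below also shows the equivalent for loop and the equivalent via map and filter; the variable names are chosen for illustration:

```python
amounts = [10, -7, 8, 19, -2]

# transforming and filtering in one step
doubled_positive = [amount * 2 for amount in amounts if amount > 0]
print(doubled_positive)   # [20, 16, 38]

# equivalent for loop
result = []
for amount in amounts:
    if amount > 0:
        result.append(amount * 2)

# equivalent via map and filter
via_map = list(map(lambda a: a * 2, filter(lambda a: a > 0, amounts)))

print(doubled_positive == result == via_map)   # True
```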

Dictionary comprehensions

colors = {
  'red': '#ff0000',
  'green': '#008000',
  'blue': '#0000ff',
}

m_colors = {color: colors[color][1:] for color in colors}

m_colors = {
    name: value[1:] for name, value in colors.items()
}

Exercises

todo list: add functionality to remove completed entries
bank account: get separate lists of all withdrawals / deposits

Exceptions

Types of exceptions

AttributeError, IndexError, KeyError
NameError
TypeError
ValueError
IOError (an alias of OSError in Python 3)
ZeroDivisionError
...

Exercise: try and trigger all of the above exceptions

Catching exceptions

age_str = input("Enter your age")
try:
    age = int(age_str)
except ValueError:
    print("Could not parse input as number")

Catching exceptions

age_str = input("Enter your age")
try:
    age = int(age_str)
except ValueError as e:
    print("Could not parse input as number")
    print(e)
    print(e.args)

Catching exceptions

Catching multiple types of exceptions:

try:
    file = open("log.txt", encoding="utf-8")
except FileNotFoundError:
    print("could not find log file")
except PermissionError:
    print("reading of file is not permitted")
except Exception:
    print("error when reading file")

Catching exceptions

Using finally:

file = None
try:
    file = open("log.txt", "w", encoding="utf-8")
    file.write("abc")
    file.write("def")
except IOError:
    print("Error when writing to file")
finally:
    # file is still None if open() itself failed
    if file is not None:
        file.close()

Catching exceptions

Using else:

try:
    file = open("log.txt", "w", encoding="utf-8")
except IOError:
    print("could not open file")
else:
    # only reached if open() succeeded; no errors expected here
    file.write("abc")
    file.write("def")
    file.close()

Python philosophy: EAFP

LBYL: Look before you leap

EAFP: It's easier to ask for forgiveness than permission

(example: parsing numbers)
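A sketch of both approaches for parsing numbers; the function names parse_age_lbyl and parse_age_eafp are made up for this comparison:

```python
def parse_age_lbyl(text):
    # LBYL: check first, then convert
    if text.strip().isdigit():
        return int(text)
    return None

def parse_age_eafp(text):
    # EAFP: just try it and handle the failure
    try:
        return int(text)
    except ValueError:
        return None

print(parse_age_lbyl("42"), parse_age_eafp("42"))      # 42 42
print(parse_age_lbyl("abc"), parse_age_eafp("abc"))    # None None

# the manual check does not cover all valid inputs
print(parse_age_lbyl("-3"), parse_age_eafp("-3"))      # None -3
```

With EAFP the rules for valid input are left to int() itself instead of being duplicated in a separate check.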

Raising exceptions

raise ValueError('test')

Re-raising caught exceptions

try:
    ...
except ClientError as e:
    if "DryRunOperation" not in str(e):
        raise

Custom exceptions

We can define custom exceptions as subclasses of other exception classes:

class MoneyParseException(Exception):
    pass

raise MoneyParseException()
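A sketch of how such an exception might be used. The function parse_money and its input format ("<amount> <currency>") are assumptions made for this example:

```python
class MoneyParseException(Exception):
    pass

def parse_money(text):
    """Parse a string like '12.50 EUR' into (amount, currency)."""
    parts = text.split()
    if len(parts) != 2:
        raise MoneyParseException(f"expected '<amount> <currency>': {text!r}")
    try:
        amount = float(parts[0])
    except ValueError as e:
        # "from e" keeps the original exception as the cause
        raise MoneyParseException(f"invalid amount: {parts[0]!r}") from e
    return amount, parts[1]

print(parse_money("12.50 EUR"))    # (12.5, 'EUR')

try:
    parse_money("twelve EUR")
except MoneyParseException as e:
    print("could not parse:", e)   # could not parse: invalid amount: 'twelve'
```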

Custom modules: advanced

Module as a directory:

- foo/
  - __init__.py

# __init__.py
a = 1
b = 2

Custom modules

Module as a directory with separated definitions:

- foo/
  - __init__.py
  - _a_mod.py
  - _b_mod.py

# __init__.py
from foo._a_mod import a
from foo._b_mod import b

Resolving imports

To see all search paths for imports:

import sys
print(sys.path)

Compilation of modules

Imported modules will be saved in a compiled form, making subsequent loading of the modules faster.

Compiled versions will be saved in the folder __pycache__

Module name and entrypoint

inside an imported module, the variable __name__ gives its name

if a Python file was run directly instead of being imported, its __name__ will be "__main__"

if __name__ == "__main__":
    print("this file was run directly (and not imported)")

Package versions and virtual environments

Package versions

installing a package via pip:

pip install cowsay

installing a specific version:

pip install cowsay==6.1

installing a compatible version (this could also install versions 6.2, 6.3, etc. - if they are available):

pip install cowsay~=6.1

Virtual environments

virtual environments: allow for installing different dependencies and dependency versions for different projects

Virtual environments

creating a virtual environment (typically named ".venv"):

python -m venv .venv

will create a new folder ".venv/" which contains the virtual environment

Virtual environments

activating an environment on Windows:

./.venv/Scripts/activate

deactivating a venv:

deactivate

if necessary: enable execution of local scripts on Windows - from an admin terminal:

Set-ExecutionPolicy -ExecutionPolicy RemoteSigned

Dependency lists

project dependencies (requirements) can be listed in a requirements.txt file

example requirements.txt:

cowsay~=6.1
requests~=2.30

Dependency lists

installation from requirements.txt:

pip install -r requirements.txt

Sharing and publishing packages

Python code can be "packaged" and shared with other users / other projects

Sharing and publishing packages

steps:

create a project with a specific directory structure
define project and packaging data in pyproject.toml
create a distribution package file from it (.whl or .tar.gz)
share with others
- upload to PyPI
- share the package directly
install it via pip

Sharing and publishing packages

more detailed resource:

https://packaging.python.org/en/latest/tutorials/packaging-projects/

Functions

Arbitrary number of Arguments (args / kwargs)

def foo(*args, **kwargs):
    print(args)
    print(kwargs)

foo("one", "two", x="hello")
# args: ("one", "two")
# kwargs: {"x": "hello"}

args will be a tuple, kwargs will be a dictionary
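Two small sketches of functions that make use of this; total and describe are hypothetical names for this example:

```python
def total(*numbers):
    # numbers is a tuple of all positional arguments
    return sum(numbers)

def describe(name, **details):
    # details is a dictionary of all additional keyword arguments
    parts = [f"{key}={value}" for key, value in details.items()]
    return f"{name}: " + ", ".join(parts)

print(total())              # 0
print(total(1, 2, 3))       # 6
print(describe("Alice", age=30, city="Vienna"))   # Alice: age=30, city=Vienna
```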

Arbitrary number of Arguments (args / kwargs)

Task: recreate range() by using a while loop

Unpacking of parameter lists

numbers = ["one", "two", "three"]

# equivalent:
print(numbers[0], numbers[1], numbers[2])

print(*numbers)
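The same works with our own functions, and dictionaries can be unpacked into keyword arguments via **. A sketch with a made-up function make_time:

```python
def make_time(hour, minute, second=0):
    return f"{hour:02}:{minute:02}:{second:02}"

values = (23, 45)
options = {"hour": 7, "minute": 5, "second": 30}

# * unpacks a sequence into positional arguments
print(make_time(*values))     # 23:45:00

# ** unpacks a dictionary into keyword arguments
print(make_time(**options))   # 07:05:30
```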

Global and local scope

global / nonlocal

change the behavior of assignments
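A minimal sketch of both keywords; counter, increment_global and make_counter are names invented for this example. Without global / nonlocal, the assignments inside the functions would refer to new local variables (and += would fail with an UnboundLocalError):

```python
counter = 0

def increment_global():
    global counter      # the assignment refers to the module-level variable
    counter += 1

def make_counter():
    count = 0
    def increment():
        nonlocal count  # the assignment refers to the variable of make_counter
        count += 1
        return count
    return increment

increment_global()
increment_global()
print(counter)          # 2

next_id = make_counter()
print(next_id(), next_id(), next_id())   # 1 2 3
```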

Global and local scope

Example: rock, paper, scissors

import random
wins = 0
losses = 0
def play_rock_paper_scissors():
    player = input("rock, paper or scissors?")
    opponent = random.choice(["rock", "paper", "scissors"])
    if player == opponent:
        pass
    elif (
        (player == "rock" and opponent == "scissors")
        or (player == "paper" and opponent == "rock")
        or (player == "scissors" and opponent == "paper")
    ):
        global wins
        wins += 1
    else:
        global losses
        losses += 1
while input("play? (y/n)") != "n":
    play_rock_paper_scissors()
print(f"won: {wins}, lost: {losses}")

Global and local scope

A better alternative to the global keyword is often to create a class:

import random
class RockPaperScissors():
    def __init__(self):
        self.wins = 0
        self.losses = 0
    def play(self):
        ...
    def run(self):
        while input("play? (y/n)") != "n":
            self.play()
        print(f"won: {self.wins}, lost: {self.losses}")

Object references and mutations

Recap: What will be the value of a after this code has run?

a = [1, 2, 3]
b = a
b.append(4)

Object references and mutations

An assignment (e.g. b = a) assigns a new (additional) name to an object.

The object in the background is the same.

Object references and mutations

The statement b = a creates a new reference that refers to the same object.

Operations that create new references:

assignments (b = a)
function calls (myfunc(a) - a new internal variable will be created)
insertions into collections (e.g. mylist.append(a))
...

Object references and functions

Passing an object into a function will create a new reference to that same object (call by sharing).

def foo(a_inner):
    print(id(a_inner))

a_outer = []
foo(a_outer)
print(id(a_outer))
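A consequence of call by sharing: mutating the passed object is visible to the caller, while assigning a new object to the parameter name is not. A sketch with two hypothetical functions:

```python
def append_item(items):
    items.append(4)      # mutates the object that was passed in

def rebind(items):
    items = [0, 0, 0]    # only rebinds the local name to a new object

a = [1, 2, 3]

append_item(a)
print(a)   # [1, 2, 3, 4]

rebind(a)
print(a)   # [1, 2, 3, 4] (unchanged)
```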

Side effects and pure functions

pure functions: functions that only interact with their environment by receiving parameters and returning values

side effects: actions of a function that change the environment

Side effects and pure functions

common side effects:

changing entries in self in an object method
input / output (interacting with the disk / network / ...)

side effects that are usually avoided:

changing arguments that were passed in
setting / changing global variables

Pure functions

advantages of pure functions:

easier to describe / reason about
easier to reuse
easier to test

Side effects

example of a suboptimal function that changes an argument (formats):

import os

def list_files_by_formats(path, formats):
    if "jpg" in formats:
        formats.append("jpeg")
    files = []
    for file in os.listdir(path):
        for format in formats:
            if file.endswith("." + format):
                files.append(file)
                break
    return files

Side effects

formats = ["jpg", "png"]

print(list_files_by_formats(".", formats))

print(formats)

# will print: ["jpg", "png", "jpeg"]

Side effects

more "correct" implementation:

def list_files_by_formats(path, formats):
    if "jpg" in formats:
        formats = formats.copy()
        formats.append("jpeg")
    # ...

Side effects

this would be an anti-pattern (a function that modifies arguments):

mylist = [2, 1, 3]

sort(mylist)
print(mylist)
# [1, 2, 3]

Side effects

actual possibilities for sorting in Python:

pure function:

print(sorted(mylist))

method that modifies data:

mylist.sort()
print(mylist)

Mutating default arguments

Unexpected behavior in Python when default arguments are mutated:

def list_files_by_formats(path, formats=["gif", "jpg", "png"]):
    if "jpg" in formats:
        formats.append("jpeg")
    # ...

list_files_by_formats(".")
# formats: ["gif", "jpg", "png", "jpeg"]

list_files_by_formats(".")
# formats: ["gif", "jpg", "png", "jpeg", "jpeg"]

(web search: mutable default arguments)
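A common way to avoid this is to use None as the default value and to create the list inside the function. A simplified sketch: add_jpeg is a hypothetical function that only contains the part dealing with formats:

```python
def add_jpeg(formats=None):
    if formats is None:
        # a new list is created on every call
        formats = ["gif", "jpg", "png"]
    else:
        # work on a copy so the caller's list is not changed
        formats = formats.copy()
    if "jpg" in formats:
        formats.append("jpeg")
    return formats

print(add_jpeg())   # ['gif', 'jpg', 'png', 'jpeg']
print(add_jpeg())   # ['gif', 'jpg', 'png', 'jpeg'] (same result again)

my_formats = ["jpg"]
print(add_jpeg(my_formats))   # ['jpg', 'jpeg']
print(my_formats)             # ['jpg'] (unchanged)
```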

Python versions

Python 2 vs Python 3

Strings and Bytes

major change in Python 3:

strict separation of text (strings) and binary data (bytes)

in Python 2: data types bytes, str and unicode

Print

Python 2:

print "a",

Python 3:

print("a", end="")

Division

Python 2:

10 / 3    # 3

Python 3:

10 / 3    # 3.3333333333333335
10 // 3   # 3

range

in Python 2: range() returns a list, xrange() returns an object that uses less memory

in Python 3: range() returns an object that uses less memory

input

in Python 2: input() will evaluate / execute the input, raw_input() returns a string

in Python 3: input() returns a string

future imports

Getting some of the behavior of Python 3 in Python 2:

from __future__ import print_function
from __future__ import unicode_literals
from __future__ import division

Python-Future

Compatibility layer between Python 2 and Python 3

Enables supporting both Python 2 and Python 3 from the same codebase