Python/programming

Lecture 2 of 3

Mathijs de Bruin (mathijs@mathijsfietst.nl),  January 7, 2014
The WSL Institute for Snow and Avalanche Research SLF

Available online

Feel free to click along!

http://tinyurl.com/slf-python-day2

Program - day 2

  1. Typical gotchas; references, immutables
  2. Best practices
    • Classes, methods and functions; use them!
    • Code formatting; PEP8
    • Don’t repeat yourself; DRY
    • Organizing your code; minimize globals, clean out imports
    • Comments, Docstrings: documentation
  3. Structuring code, working with modules
  4. Packaging, creating reusable code

Mutable and immutable types

Quite simple

  • Mutable types can be changed
  • Immutable types cannot

But: there's a catch.

Mutable types

  • Lists:
    [1, 2, 3] 
  • Dictionaries:
    {'key': 'value'}
  • Sets:
    set([1,2,3])

Mutable types

Example


>>> # Create a list and a second reference to it
>>> ls = [1, 2, 3]
>>> ls_copy = ls
>>> # Append something to the original list
>>> ls.append(5)
>>> ls == ls_copy
True
>>> # ls_copy is not a real copy: it references the same object, which is updated
>>> ls_copy
[1, 2, 3, 5]
>>> # Create a new list and rebind ls to it
>>> new_ls = [5, 6]
>>> ls = new_ls
>>> # ls_copy still refers to the old list
>>> ls == ls_copy
False
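
When an independent list is needed, the copy has to be made explicitly. A minimal sketch; the names `original`, `alias` and `independent` are only illustrative, and `list()` makes a shallow copy, so nested mutable items would still be shared:

```python
original = [1, 2, 3]
alias = original              # second reference to the same object
independent = list(original)  # new list holding the same items

original.append(5)

print(alias)                    # [1, 2, 3, 5]
print(independent)              # [1, 2, 3]
print(alias is original)        # True
print(independent is original)  # False
```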
                    

Immutable types

  • Strings:
    'blah'
  • Tuples:
    (1, 2, 3)
  • Frozen sets:
    frozenset([1, 2, 3])

Immutable types

Example


>>> bikini_string = 'pink'
>>> marble_string = bikini_string
>>> # This creates a new object
>>> bikini_string += ' green'
>>> # Note: for immutable types, a += b is equivalent to a = a + b
>>> bikini_string
'pink green'
>>> marble_string
'pink'
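
The catch is that `+=` behaves differently for mutable types. A small sketch comparing a string with a list; the variable names are made up for illustration, and `is` checks whether two names refer to the same object:

```python
word = 'pink'
same_word = word
word += ' green'                # builds a new string and rebinds 'word'
print(word is same_word)        # False
print(same_word)                # pink

numbers = [1, 2]
same_numbers = numbers
numbers += [3]                  # extends the existing list in place
print(numbers is same_numbers)  # True
print(same_numbers)             # [1, 2, 3]

numbers = numbers + [4]         # builds a new list and rebinds 'numbers'
print(numbers is same_numbers)  # False
print(same_numbers)             # [1, 2, 3]
```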
                    

Object References

Actually, variables are references to objects.

Object References

Arguments are passed to functions and methods as references to objects; neither the objects nor the variables are copied.


def try_to_change_list_contents(the_list):
    print 'got', the_list
    the_list.append('four')
    print 'changed to', the_list

outer_list = ['one', 'two', 'three']

print 'before, outer_list =', outer_list
try_to_change_list_contents(outer_list)
print 'after, outer_list =', outer_list
                    

Credits: Blair Conrad on StackOverflow

Object References

Consequence: mutable objects can be modified from within methods and functions.


before, outer_list = ['one', 'two', 'three']
got ['one', 'two', 'three']
changed to ['one', 'two', 'three', 'four']
after, outer_list = ['one', 'two', 'three', 'four']
                    

Object References

However

Assigning to a parameter only rebinds the local name; the caller's reference is untouched:


def try_to_change_list_reference(the_list):
    print 'got', the_list
    the_list = ['and', 'we', 'can', 'not', 'lie']
    print 'set to', the_list

outer_list = ['we', 'like', 'proper', 'English']

print 'before, outer_list =', outer_list
try_to_change_list_reference(outer_list)
print 'after, outer_list =', outer_list
                    

Yields:


before, outer_list = ['we', 'like', 'proper', 'English']
got ['we', 'like', 'proper', 'English']
set to ['and', 'we', 'can', 'not', 'lie']
after, outer_list = ['we', 'like', 'proper', 'English']                                 
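
If the caller's list really has to be replaced, two common options are to change the contents in place with slice assignment or to return the new list and let the caller rebind it. A sketch with hypothetical function names:

```python
def replace_list_contents(the_list):
    # Slice assignment changes the existing object instead of rebinding
    # the local name, so the caller sees the new contents.
    the_list[:] = ['and', 'we', 'can', 'not', 'lie']


def build_new_list():
    # Returning a new object leaves the decision to the caller.
    return ['and', 'we', 'can', 'not', 'lie']


outer_list = ['we', 'like', 'proper', 'English']
replace_list_contents(outer_list)
print(outer_list)   # ['and', 'we', 'can', 'not', 'lie']

other_list = ['we', 'like', 'proper', 'English']
other_list = build_new_list()
print(other_list)   # ['and', 'we', 'can', 'not', 'lie']
```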

Bad code

This is what it looks like.

Grouping code

Classes, methods and functions

  • Isolating dependencies
  • Reducing scope

Leads to

  • Better overview
  • Increased simplicity
  • Fewer bugs

Functions

When there's a simple input ➞ output relation


import random

def random_integer(start, end=15):
    """
    Return a random integer between start and end (both inclusive),
    the default end being 15.
    """
    print 'Returning integer between %d and %d' % (start, end)

    result = random.randint(start, end)

    return result
                    

>>> random_integer(5)
Returning integer between 5 and 15
6
>>> random_integer(5, 5)
Returning integer between 5 and 5
5
>>> # Optionally: name parameters
>>> random_integer(start=5, end=5)
Returning integer between 5 and 5
5
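
Default values combine with the reference behaviour from the first part of the lecture: a default is created once, when the function is defined, so a mutable default is shared between calls. A sketch with hypothetical function names, showing the usual `None` workaround:

```python
def add_reading_shared(value, readings=[]):
    # The default list is created once and reused on every call
    readings.append(value)
    return readings


def add_reading(value, readings=None):
    # Create a fresh list on each call when none is passed in
    if readings is None:
        readings = []
    readings.append(value)
    return readings


first = add_reading_shared(1)
second = add_reading_shared(2)
print(second)           # [1, 2]
print(first is second)  # True

print(add_reading(1))   # [1]
print(add_reading(2))   # [2]
```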
                    

Classes

Grouping functions ('methods') around objects


class Food(object):
    """ Something you can eat. """

    # Class attributes are shared across instances
    destination = 'mouth'

    def __init__(self, name):
        """
        The initialization method is called whenever
        instances are created.
        """
        # Store the name property on the object
        self.name = name

        # Create the object unused
        self.used = False

    def use(self):
        """ Use this particular item, marking it as used. """

        print 'Using %s in %s' % (self.name, self.destination)

        # Instance attributes are specific to one instance
        self.used = True
                    

For similar types of data and related behaviour.

Classes

Creating food, and using it too


>>> lunch = Food(name='banana')
>>> lunch.used
False
>>> lunch.use()
Using banana in mouth
>>> lunch.used
True
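
The difference between shared class attributes and per-instance attributes can be sketched as follows. This is a trimmed-down version of the Food class above; the values 'blender' and 'lunchbox' are made up for illustration:

```python
class Food(object):
    """ Something you can eat. """

    # Class attribute, shared across instances
    destination = 'mouth'

    def __init__(self, name):
        self.name = name


banana = Food('banana')
apple = Food('apple')

# Changing the class attribute is visible through every instance
Food.destination = 'blender'
print(banana.destination)   # blender
print(apple.destination)    # blender

# Assigning through an instance creates an instance attribute that
# shadows the class attribute for that instance only
apple.destination = 'lunchbox'
print(apple.destination)    # lunchbox
print(banana.destination)   # blender
print(Food.destination)     # blender
```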
                    

PEP8

Official Python Style Guide

Because code is read more often than it is written.

PEP8

Indentation

Four (4) spaces, no tabs!


if sugar == sweet:
    # Note: there are spaces here!
    print 'Better go to the dentist!'
                    

PEP8

Line length

At most 79 characters on a single line


print (
    'When longer lines are required, simply use ()\'s for concatenation '
    'but watch out with tuples!'
)
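
The tuple warning refers to commas: adjacent string literals inside parentheses are joined into one string, but a comma between them produces a tuple. A short sketch with illustrative variable names:

```python
message = (
    'Adjacent string literals are joined '
    'into a single string.'
)

oops = (
    'A comma between the literals ',
    'creates a tuple instead.'
)

print(type(message).__name__)   # str
print(type(oops).__name__)      # tuple
print(len(oops))                # 2
```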
                    

PEP8

Imports

  • Always at the top of a file
  • Import statements on their own line, not:
    import os, sys
  • Avoid wildcard imports:
    from mountain import *
  • Group in order:
    1. Python standard library imports
    2. Third party imports
    3. 'Local' project imports

PEP8

Imports; example

From more fundamental to less fundamental.


import os
import sys

# Numpy is more fundamental than matlab
import numpy

# Group utils with the same base together
from matlab import polyplot, somethingelse
from matlab.utils import UselessUtilClass

# '.' imports from the current package (local directory)
from .utils import FunkyPlotHelper
                            

PEP8

Whitespace

Required and forbidden whitespace.


# Like this
my_dict = {
    'key': 'value',
    'otherkey': 'othervalue'
}
value = value + 5
do_something(1, 2, 3)
spam(ham[1], {eggs: 2})

# Not like this
my_dict = {'key' : 'value' , 'otherkey': 'othervalue'}
value=value+5
do_something(1   ,  2 ,   3 )
spam( ham[ 1 ], { eggs: 2 } )
                    

Refer to PEP8

PEP8

Comments

Write for reading!


def my_function():
    """ What does it do, what does it return? """

    # Note: empty functions require a 'pass' statement
    pass

def other_function():
    """
    This text is a bit longer so it needs to be spread over
    multiple lines. Note that the quotes are on a line by themselves.
    """

    # Now doing something useful
    x = do_something_useful()

    # Use inline comments sparingly
    x = x + 1 # Don't do this
                    

PEP8

Naming conventions

Go for descriptive!


# Good
max_value = 5
min_value = 3

class MyClass(object):
    def square_number(self, n):
        return n*n

# Bad
a = 5
B = 3

class my_Class(object):
    def doit(self, n):
        return n*n
                    

PEP8

Checks will lead the way!


$ pip install pep8
$ pep8 neat_code.py
neat_code.py:59:80: E501 line too long (96 > 79 characters)
neat_code.py:65:80: E501 line too long (90 > 79 characters)
neat_code.py:82:80: E501 line too long (89 > 79 characters)
neat_code.py:91:80: E501 line too long (80 > 79 characters)
neat_code.py:103:80: E501 line too long (81 > 79 characters)
neat_code.py:133:80: E501 line too long (113 > 79 characters)
neat_code.py:136:80: E501 line too long (103 > 79 characters)
                    

DRY

Don't Repeat Yourself

Don't do stuff more than twice.
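
A sketch of what this can look like in practice. The temperature readings and the function name are hypothetical; the point is that the conversion formula lives in exactly one place:

```python
# Repetitive: the same formula is typed out for every value
snow_k = -4.0 + 273.15
air_k = -7.5 + 273.15
ground_k = 0.0 + 273.15


# DRY: one function, applied to all values
def celsius_to_kelvin(celsius):
    """ Convert a temperature from degrees Celsius to Kelvin. """
    return celsius + 273.15


readings_c = {'snow': -4.0, 'air': -7.5, 'ground': 0.0}
readings_k = {}
for name, celsius in readings_c.items():
    readings_k[name] = celsius_to_kelvin(celsius)

print(readings_k['ground'])   # 273.15
```

If the formula ever needs to change, only `celsius_to_kelvin` has to be edited.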

Organizing code

  • Write for reading
  • Group functionality
  • Limit globals
  • Clean out imports
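
Limiting globals mostly means passing values in as parameters instead of reading them from module level. A minimal sketch; the names and the 30 cm threshold are made up for illustration:

```python
# Hidden dependency: the result depends on a module-level name that any
# other code may change.
threshold_cm = 30


def is_deep_using_global(depth_cm):
    return depth_cm > threshold_cm


# Explicit dependency: everything the function needs is in its signature
def is_deep(depth_cm, threshold_cm=30):
    return depth_cm > threshold_cm


print(is_deep(45))                    # True
print(is_deep(45, threshold_cm=50))   # False
```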

Comments and Docstrings

Care about your successors!

Be verbose, be very verbose!

  • Explicit over implicit
  • What does it do (in human language)?
  • How do I use it?
  • Examples, examples, examples!
  • Ideal: 1/3 comment/code ratio
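
One way to combine documentation and examples is a docstring containing an interpreter session, which the standard library's doctest module can check. A sketch with a hypothetical `average_depth` function:

```python
import doctest


def average_depth(depths):
    """
    Return the average of a list of snow depths.

    The depths can be in any unit, as long as they all use the same one.

    Example:

    >>> average_depth([1.0, 2.0, 3.0])
    2.0
    """
    return sum(depths) / len(depths)


# Runs the example in the docstring; prints a report only on failure
doctest.run_docstring_examples(
    average_depth, {'average_depth': average_depth}
)
```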

Modules

They're easy!

  1. Add an empty __init__.py to any folder:
    $ touch __init__.py
  2. Create a Python file with some functions or classes, for example fromagerie.py.
  3. Done! Locally import using:
    import fromagerie
    or
    from fromagerie import MyClass
  4. Create a package and install it for global availability.
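
The first three steps can be tried out in a throwaway directory. In this sketch the folder name `cheese_shop` and the contents of `MyClass` are assumptions made up for illustration, and the import goes through the folder name rather than from inside the folder:

```python
import os
import sys
import tempfile

# Create a temporary parent directory containing the folder 'cheese_shop'
base_dir = tempfile.mkdtemp()
package_dir = os.path.join(base_dir, 'cheese_shop')
os.mkdir(package_dir)

# Step 1: an empty __init__.py marks the folder as importable
open(os.path.join(package_dir, '__init__.py'), 'w').close()

# Step 2: a Python file with a class
with open(os.path.join(package_dir, 'fromagerie.py'), 'w') as module_file:
    module_file.write(
        'class MyClass(object):\n'
        '    smell = "strong"\n'
    )

# Step 3: make the parent directory findable, then import
sys.path.insert(0, base_dir)
from cheese_shop.fromagerie import MyClass

print(MyClass.smell)   # strong
```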

Modules

Some tips

  • Group similar classes, functions
  • Create a utils.py for the rest
  • Split
    • Data processing (calculating/simulation)
    • Controlling (starting/stopping stuff, user interface, configuration)
    • Helpers (data conversion, utils, ...)

Modules

Example project 'Snowball'


snowball --- __init__.py
          |- main.py
          |- utils.py
          |- climate.py
          `- water --- __init__.py
                    `- phases.py
                    

Project Snowball

main.py

Contains 'main' functionality typically used in external applications. Could also be named core.py or snowball.py.

No clear naming convention here - just the fact that only minimal work should be done from the 'main' module.

Project Snowball

utils.py

Non-specific util functions or classes that are otherwise ungroupable.

Examples: data conversion, common data structure manipulation, file reading wrappers.

Project Snowball

climate.py

Contains climate-related classes and functions.

In this case a WeatherProbe.

Project Snowball

water

Subpackage grouping code related to water physics, such as the phase-transition logic in water/phases.py, which contains WaterVapor, IceCrystal and SnowFlake.

Packaging

Creating Eggs (Python packages)

  • Self-contained 'libraries' or modules
  • Installable, upgradable
  • Versioned
  • Distributable through:
    • PyPI
    • HTTP/FTP
    • VCS; GitHub/SVN/Mercurial
    • Files

Reference: The Hitchhiker’s Guide to Packaging

Project layout

Example project 'Snowball'


snowball --- setup.py
          |- README.rst
          |- LICENSE.rst
          `- snowball --- __init__.py
                       |- main.py
                       |- utils.py
                       |- climate.py
                       `- water --- __init__.py
                                 `- phases.py
                    

Packaging

The beginning and end:

setup.py


from setuptools import setup, find_packages

setup(
    name='snowball',
    version='0.1dev',
    packages=find_packages(),
    license='BSD',
    long_description=open('README.rst').read(),
)
                    

Distributing

Packaging and installing

Creating a distribution (tarball)

$ python setup.py sdist

Install without publishing


$ pip install -e git+https://github.com/dokterbob/snowball.git#egg=snowball
$ pip install http://some.domain/snowball-0.1.tar.gz
                    

Distributing

Publishing

Registering it for PyPI

$ python setup.py register

Uploading and publishing

$ python setup.py sdist bdist_wininst upload

Installing

$ pip install snowball

Licensing

Use BSD, it's simple and great!

Redistribution and use in source and binary forms, with or without modification, are permitted provided that the following conditions are met:

1. Redistributions of source code must retain the above copyright notice, this list of conditions and the following disclaimer.

2. Redistributions in binary form must reproduce the above copyright notice, this list of conditions and the following disclaimer in the documentation and/or other materials provided with the distribution.

3. Neither the name of the SLF nor the names of its contributors may be used to endorse or promote products derived from this software without specific prior written permission.

Reference: Full 3-clause BSD license

Licensing

Other licenses

  • GNU General Public License (GPL): 'forces' freedom
  • LGPL: 'forces' a little less freedom
  • AGPL: GPL for hosted applications
  • MIT, Mozilla, Apache: similar to BSD

Reference: choosealicense.com

README.rst

Minimal documentation

  • What is it?
  • How to use it?
  • Getting started
  • Requirements
  • License
  • Author/maintainer

reStructuredText files

README.rst and LICENSE.rst


Title of file
=============

Subtitle
--------
* List item 1
* List item 2

`Link <https://www.google.com>`_
                    

Similar to Markdown.

See: reStructuredText Primer

Thanks!

Any questions?