Sane Python Packaging



I received a couple of iMessages:

Write a blog post about sane package/module management development in a python app

I can't find one

So let's give it a try!

As an example, let's take something I made for an interview once: a little API->Postgres data engine app.

It pulls some data requested by the interviewer: information about congressional bills containing some query string, as well as profiles of their sponsors: https://github.com/bhtucker/sunroof

Developing

Let's pretend you want to collaborate on this and you're on OSX.

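First, get virtualenvwrapper. It's especially worth it if you're going to do some Python 2 and some Python 3. Install it globally, then run mkvirtualenv sunroof.

Now clone the repo and, from its root, run pip install -e . (the trailing dot means "the current directory"). This registers your package with your virtualenv's site-packages as an editable install that points back at your checkout, so edits to the source take effect without reinstalling. You can verify this with something like: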

ls ~/.virtualenvs/sunroof/lib/python2.7/site-packages/sunroof*
/Users/bhtucker/.virtualenvs/sunroof/lib/python2.7/site-packages/sunroof.egg-link

As long as you're in your virtualenv, you'll be able to import your library and its modules.
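If you'd rather check from inside Python than with ls, here's a minimal sketch for Python 3. The helper name module_origin is made up for illustration, and "json" is just a stdlib stand-in so the snippet runs anywhere; inside the virtualenv you'd pass "sunroof" instead.

```python
import importlib.util


def module_origin(name):
    """Return the file a top-level module would be imported from, or None."""
    spec = importlib.util.find_spec(name)
    return spec.origin if spec is not None else None


# "json" is a stand-in; swap in "sunroof" once the package is installed.
print(module_origin("json"))

# A name that isn't installed has no spec, so the helper returns None.
print(module_origin("surely_not_an_installed_package"))
```

For an editable install, the path printed should point into your checkout rather than into site-packages.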

Modules

You're going to implement a feature, and you're going to do it in a nice module, yay!

mkdir sunroof/services
touch sunroof/services/__init__.py

That's a package: a directory with an __init__.py. Each .py file you put inside it is a module.

I'd usually then make a file for my feature, such as cool.py, and write something cool there.

That module could have imports like:

import math

import sqlalchemy

from sunroof.models import bills, legislators

(FYI, stdlib -> third-party libraries -> your own package is the canonical import ordering, per PEP 8)

Then you write something cool.

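def _some_private_cool():
    # Placeholder: just references the imported tables so the import is used.
    assert (bills, legislators)


def something_cool():
    """Stuff"""
    math.ceil(2)
    _some_private_cool()
    # Placeholder: references sqlalchemy so the import is used.
    assert sqlalchemy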

This is now usable from wherever your outer package is installed.

Simply: from sunroof.services.cool import something_cool

You can even let an object be importable from the services layer by adding to your sunroof/services/__init__.py:

from .cool import something_cool

Enabling this type of jazz:

(sunroof) Bensons-MacBook-Air:sunroof bhtucker$ cd 
(sunroof) Bensons-MacBook-Air:~ bhtucker$ python
Python 2.7.10 (default, Oct 23 2015, 19:19:21) 
[GCC 4.2.1 Compatible Apple LLVM 7.0.0 (clang-700.0.59.5)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> from sunroof import services
>>> services.cool.something_cool
<function something_cool at 0x102e6bb18>
>>> services.something_cool
<function something_cool at 0x102e6bb18>
>>> from sunroof import services
>>> services.cool
<module 'sunroof.services.cool' from '/Users/bhtucker/code/github/sunroof/sunroof/services/cool.pyc'>
>>> import sunroof.services
>>> sunroof.services.cool
<module 'sunroof.services.cool' from '/Users/bhtucker/code/github/sunroof/sunroof/services/cool.pyc'>

So now we believe we can at least write modules and objects and make sure Python is finding them the way we want.
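If you want to poke at the __init__.py re-export without touching sunroof, here's a rough, self-contained sketch. It builds a throwaway package in a temp directory; the name demo_pkg and the body of something_cool are invented for the demo, and putting the temp directory on sys.path stands in for pip install -e .

```python
import importlib
import os
import sys
import tempfile

# Build a throwaway package (demo_pkg is a hypothetical name) in a temp dir.
root = tempfile.mkdtemp()
services_dir = os.path.join(root, "demo_pkg", "services")
os.makedirs(services_dir)

files = {
    os.path.join(root, "demo_pkg", "__init__.py"): "",
    os.path.join(services_dir, "cool.py"): (
        "def something_cool():\n"
        "    return 'cool'\n"
    ),
    # The re-export: lift something_cool up to the services layer.
    os.path.join(services_dir, "__init__.py"): (
        "from .cool import something_cool\n"
    ),
}
for path, body in files.items():
    with open(path, "w") as handle:
        handle.write(body)

# Stand-in for installing the package: make the temp dir importable.
sys.path.insert(0, root)
importlib.invalidate_caches()

from demo_pkg import services

# Both spellings reach the very same function object.
same_object = services.something_cool is services.cool.something_cool
print(same_object)
```

Note that services.cool exists even though nothing assigns it explicitly: importing a submodule binds it as an attribute on its parent package.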

Avoiding Circular Dependencies

But you can't turn around and write this into models.py, because cool.py already imports from sunroof.models:

from sunroof.services import cool

There needs to be some possible ordering in which Python can load the modules with all of their imports satisfied. If models needs services and services needs models, no such ordering exists.
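To see the failure in isolation, here's a small sketch for Python 3 that writes two modules that import from each other and then tries to load one. The module names demo_models and demo_services are made up; they mimic the models.py / services relationship rather than reproduce the real sunroof code.

```python
import importlib
import os
import sys
import tempfile

# Two hypothetical modules that each need a name from the other.
root = tempfile.mkdtemp()
sources = {
    "demo_models.py": (
        "from demo_services import cool\n"
        "TABLE = 'bills'\n"
    ),
    "demo_services.py": (
        "from demo_models import TABLE\n"
        "\n"
        "def cool():\n"
        "    return TABLE\n"
    ),
}
for filename, body in sources.items():
    with open(os.path.join(root, filename), "w") as handle:
        handle.write(body)

sys.path.insert(0, root)
importlib.invalidate_caches()

try:
    # demo_models starts loading, pulls in demo_services, which asks for
    # demo_models.TABLE before demo_models has gotten far enough to define it.
    import demo_models
    cycle_error = None
except ImportError as exc:
    cycle_error = exc

print(type(cycle_error).__name__)
print(cycle_error)
```

Swapping the two lines in demo_models.py so TABLE is defined first would make this particular import succeed, but relying on statement order like that is fragile, which is why a one-directional dependency is the saner fix.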

So if I'm making my database connection in main, but I want to query in services without passing the connection in to every interface, I need a singleton. Services can't just import the connection from main, since main imports services.

Singletons live inside some module. Often you can find them in an __init__.py like the one I outlined above. (Here's an example where I do that to make a Twitter API client available throughout a package.)

We can modify sunroof to make its database session available this way.

In our outer __init__.py, we add:

from sqlalchemy.orm.session import Session

session = Session()

Then, in our main function:

    congress_engine = engine.create_engine(SQLALCHEMY_DATABASE_URI)
    ...
    session.bind = congress_engine

Happily, we can now import this session into any module that wants to actually do database stuff. So we could go back to cool.py and add:

from sunroof import session

def get_sponsor_ids():
    # Each result row is a 1-tuple, so unpack the single sponsor_id column.
    return [
        x[0] for x in session.query(bills.c.sponsor_id.distinct()).all()
    ]

We can import get_sponsor_ids into main and use it there without any circular dependency problems, because cool.py only reaches up to the package's __init__.py and never imports main.
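Here's the same late-binding idea boiled down to one runnable file, as a sketch rather than the real thing. It uses the stdlib's sqlite3 instead of SQLAlchemy and Postgres, the Session class is a tiny made-up stand-in (not SQLAlchemy's Session), and the table rows are invented sample data.

```python
import sqlite3


class Session(object):
    """Hypothetical stand-in: created unbound, bound later by main()."""

    def __init__(self):
        self.bind = None

    def query(self, sql):
        if self.bind is None:
            raise RuntimeError("session is not bound to a database yet")
        return self.bind.execute(sql).fetchall()


# In the package layout this line would live in the outer __init__.py.
session = Session()


def get_sponsor_ids():
    # Would live in services/cool.py. It looks up the shared session at call
    # time, so it doesn't matter that the session was unbound at import time.
    rows = session.query(
        "SELECT DISTINCT sponsor_id FROM bills ORDER BY sponsor_id"
    )
    return [row[0] for row in rows]


def main():
    # Would live in main: create the connection, then bind the singleton.
    connection = sqlite3.connect(":memory:")
    connection.execute("CREATE TABLE bills (id INTEGER, sponsor_id TEXT)")
    connection.executemany(
        "INSERT INTO bills VALUES (?, ?)",
        [(1, "A001"), (2, "B002"), (3, "A001")],  # invented sample rows
    )
    session.bind = connection
    return get_sponsor_ids()


sponsor_ids = main()
print(sponsor_ids)
```

The detail that makes this work is that main mutates the shared object (session.bind = ...) instead of rebinding the name. If main did session = Session() itself, every module that had already imported the original object would still be holding the old, unbound one.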

Other issues?

I'm curious what real world Python circular dependency problems you can get into that don't lend themselves to either of these approaches. Once you get an application going you rarely have these issues, but obviously that can be a big blocker!

Get at me on twitter @lavabenson