robb.re

My Understanding of Pyramid Traversal

Posted on February 9, 2015

I always forget how this works, so here is a brief reminder to self. This is not intended to be a useful tutorial for a newcomer to Pyramid; it is just a dump of my thoughts to refresh my memory.

First thing to note: traversal seems to be designed for situations where the entire tree is known up front, which doesn’t quite fit our (Lost Property’s) database-backed setup.

Second thing to note: traversal is a pretty nice feature. Most Python web frameworks pass arguments to a view callable, at which point the view uses those arguments to look up an appropriate object from a datastore of some kind and then perhaps makes some permission checks. Pyramid’s traversal system can perform the object lookup (and permission checks if you are using auth) before passing that object to the view as its argument. This has a few nice side effects: it removes the boilerplate code of getting an object and returning a 404 if no matching object exists, and it makes view callables (potentially) more reusable. Since a view is not coupled to retrieval of a specific object, you can reuse it with many different objects.
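To make the contrast concrete, here is a framework-free sketch. Everything in it (the DRINKS store, the NotFound exception, both view functions) is a hypothetical name made up for illustration and not part of Pyramid:

```python
# Hypothetical in-memory datastore, for illustration only.
DRINKS = {"tea": {"name": "tea", "hot": True}}


class NotFound(Exception):
    """Stand-in for a framework's 404 response."""


def conventional_view(request, drink_id):
    # The view itself has to fetch the object and handle the missing case.
    drink = DRINKS.get(drink_id)
    if drink is None:
        raise NotFound(drink_id)
    return "Detail page for %s" % drink["name"]


def traversal_style_view(context, request):
    # The lookup already happened elsewhere; the view only sees the object.
    return "Detail page for %s" % context["name"]
```

The second function never mentions the datastore, which is what makes it reusable with any object that has a name.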

Using traversal

Traversal makes use of classes often, but not always, referred to as a Resource. For the purposes of this document we’ll use the term Resource too.

A Pyramid application that uses traversal gets a default root Resource, but this can be overridden. Per route, a factory can be supplied via the factory argument to config.add_route; globally, a root_factory callable, which returns an appropriate Resource, is passed to the Configurator object:

config = Configurator(root_factory=Root)

A Resource is a dict-like class that has a reference to its parent (which would be None for the case of the root Resource in the tree):

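from pyramid.config import Configurator


class ExampleResource(object):
    __name__ = ''
    __parent__ = None

    def __init__(self, request):
        self.request = request

    def __getitem__(self, key):
        if key == 'tea':
            resource = Beverage(self.request)
            # Tell the child who its parent is so the tree can be walked upwards.
            resource.__parent__ = self
            return resource
        # Returning None makes None the context; raising KeyError is the
        # usual way to signal a missing child.
        return None


class Beverage(object):
    __name__ = 'beverage'
    __parent__ = None

    def __init__(self, request):
        # Needed because ExampleResource calls Beverage(self.request).
        self.request = request


config = Configurator(root_factory=ExampleResource)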
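The point of __name__ and __parent__ is that any Resource can find its way back to the root. A minimal sketch of that idea, without Pyramid: the Node class and path_of function are hypothetical names for illustration (Pyramid ships its own helpers for this, such as pyramid.traversal.resource_path, and this toy is not a substitute for them).

```python
class Node(object):
    """Minimal location-aware resource: knows its name and its parent."""

    def __init__(self, name, parent=None):
        self.__name__ = name
        self.__parent__ = parent
        self._children = {}

    def add(self, name):
        # Create a child that points back at this node.
        child = Node(name, parent=self)
        self._children[name] = child
        return child

    def __getitem__(self, key):
        # A missing key raises KeyError, just like a dict.
        return self._children[key]


def path_of(resource):
    """Rebuild the URL path by following __parent__ back to the root."""
    names = []
    while resource is not None:
        names.append(resource.__name__)
        resource = resource.__parent__
    return "/".join(reversed(names)) or "/"


root = Node("")
tea = root.add("drinks").add("tea")
print(path_of(tea))   # /drinks/tea
print(path_of(root))  # /
```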

An example of how this works can be seen if we use an autovivifying dictionary (one that creates nested dictionaries on first access):

import collections

def vivify():
    return collections.defaultdict(vivify)

Now if we import (or define) this function in the REPL:

>>> di = vivify()
>>> di['lemonade']['was']['a']['popular'] = 'drink'
>>> di['lemonade']['was']['a']['popular']
'drink'

We can define a root_factory that uses this dict like so:

def root_factory(request):
    return di

When a request is made, Pyramid splits that request’s PATH_INFO into segments and turns them into a chain of dictionary-style lookups, starting at the current Resource. Remember, as alluded to before, that a route can be registered with its own factory, so the current Resource may not be the root Resource but instead the one produced by the factory of the route that matches that PATH_INFO. This means that a request to /lemonade/was/a/popular gets converted to root_factory(request)['lemonade']['was']['a']['popular'] and, using the dict above, a context (in this case the string 'drink') is returned. This is then passed to your view callable:

def beverage_view(context, request):
    assert context == 'drink'
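The lookup loop can be sketched in a few lines. This is a simplified stand-in written from my understanding of the behaviour, not Pyramid’s actual traverser: it assumes no virtual roots, no special view selectors and no URL decoding, and the traverse function and Leaf class are hypothetical names.

```python
def traverse(root, path):
    """Return (context, view_name, subpath) for a URL path."""
    segments = [s for s in path.split("/") if s]
    context = root
    for i, segment in enumerate(segments):
        try:
            getitem = context.__getitem__
        except AttributeError:
            # Not a container: stop here, the segment becomes the view name.
            return context, segment, segments[i + 1:]
        try:
            context = getitem(segment)
        except KeyError:
            # Unknown child: same outcome as above.
            return context, segment, segments[i + 1:]
    # Every segment was consumed, so the view name is empty.
    return context, "", []


class Leaf(object):
    """Hypothetical leaf resource with no __getitem__."""

    def __init__(self, value):
        self.value = value


tree = {"lemonade": {"was": {"a": {"popular": Leaf("drink")}}}}
context, view_name, subpath = traverse(tree, "/lemonade/was/a/popular")
print(context.value, repr(view_name), subpath)  # drink '' []
```

A plain dict is used for the tree here rather than the autovivifying one, so that unknown keys raise KeyError instead of silently creating new branches.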

If Resource.__getitem__ returns None, what happens next is determined by the application’s configuration. When using an authentication policy (which in turn demands the use of an authorisation policy) you will most likely see a 403 response. I’m yet to work on a Pyramid project that doesn’t use auth, so at one point I thought that this was Pyramid’s default behaviour. What actually happens is that None is returned as the context object and, before the context is passed to a permission-protected view, the authorisation policy checks for an __acl__ attribute that lists the privileges associated with this particular context. Of course None doesn’t have an __acl__ attribute, which in turn means no privileges are granted on this context. No privileges means that the view cannot be shown for this context, and so a 403 is returned. When not using an authentication policy, None will be passed to your view callable as context.
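A toy version of that ACL check shows why None ends up denied. This is only a sketch of the idea: the permits function and Page class are hypothetical, and the ACL entries use plain strings where Pyramid uses its own constants and principals.

```python
def permits(context, principals, permission):
    """Walk from the context up through __parent__ and return the verdict
    of the first matching ACL entry. Deny when nothing matches."""
    resource = context
    while resource is not None:
        for action, principal, perm in getattr(resource, "__acl__", ()):
            if principal in principals and perm == permission:
                return action == "Allow"
        resource = getattr(resource, "__parent__", None)
    return False


class Page(object):
    __parent__ = None
    __acl__ = [("Allow", "group:editors", "view")]


print(permits(Page(), ["group:editors"], "view"))  # True
print(permits(None, ["group:editors"], "view"))    # False
```

With None as the context there is nothing to walk and no __acl__ to consult, so the answer is always "deny".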

If Resource.__getitem__ raises KeyError then the traversal of the current path is halted. This means that Resource itself is used as the context instead of whatever object would have been found at Resource['some_key'], and the key that failed becomes the view name. More often than not (on the projects I’ve worked on at least) this results in a 404, as KeyError would indicate that an attempt has been made to traverse to a Resource that is not part of your tree and so there will not be a view of that name configured to handle it.

Traversal can also stop when a non-final Resource in the tree raises AttributeError due to not having a __getitem__ method. When this happens the current object is used as context. Note that pyramid’s traversal system does something akin to:

fn = obj.__getitem__
next = fn(segment)

which means that you cannot trigger the aforementioned AttributeError behaviour by raising that exception inside of the __getitem__ method: AttributeError is only caught around the attribute lookup, before the __getitem__ method is actually called, so an AttributeError raised inside the method propagates as an ordinary error. Raising KeyError inside __getitem__ will achieve the exact same behaviour as a missing __getitem__.
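The difference is easy to see with a single traversal step written out with its two separate try blocks. This is a sketch of the structure as I understand it; the step function and the three small classes are hypothetical.

```python
def step(ob, segment):
    """One traversal step, mirroring the two-phase lookup shown above."""
    try:
        getitem = ob.__getitem__  # only the attribute lookup is guarded here
    except AttributeError:
        return ("stop", ob)
    try:
        return ("descend", getitem(segment))
    except KeyError:              # the call itself only guards KeyError
        return ("stop", ob)


class NoGetitem(object):
    pass


class RaisesKeyError(object):
    def __getitem__(self, key):
        raise KeyError(key)


class RaisesAttributeError(object):
    def __getitem__(self, key):
        raise AttributeError(key)


print(step(NoGetitem(), "x")[0])       # stop
print(step(RaisesKeyError(), "x")[0])  # stop
```

Calling step(RaisesAttributeError(), "x") does not return "stop": the AttributeError escapes, because nothing around the call catches it.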

When all of the path elements have been consumed, the view name is said to be ''. It is also possible for unconsumed elements of the path to remain after the context has been selected; the first of these is used as the view name. As a contrived example, if we were to add a component to the above URL, giving /lemonade/was/a/popular/edit, traversal would still end in the same place (the key popular) and return the same context, but the view name would be edit. (Strictly, this needs the object at popular to stop traversal, either by having no __getitem__ or by raising KeyError. A bare str has a __getitem__ that raises TypeError for a string key, so a real tree would use a small class there.)

This brings us to a nice feature of Pyramid’s views: there is not a one-to-one relationship between a URL (or route in Pyramid parlance) and a view callable. Most Python web frameworks define views and routes as a pair, where a URL is defined (perhaps with dynamic components) and then mapped to one view callable. Pyramid supports this pattern via its routes system, but it’s also possible to do things differently using the predicate system.

If an incoming request is matched (either via the Resource tree or an explicitly defined route), more than one view may have been defined that could handle it. Which view is ultimately used is determined by Pyramid’s predicate system: when a view is registered, several criteria can be used to specify which requests it should match. A simple example is request_method: you can define two views that respond to /login, one for GET requests and the other for POST requests. This neatly sidesteps the boilerplate of:

if request.method == 'POST':
    # do something
    return SomeResponse()
return TheOtherResponse()

that we would see if the same callable handled both GET and POST.
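The idea behind predicate-based selection can be shown with a toy registry. This is not how Pyramid implements it (there the request_method argument to view_config does the work); the VIEWS list, the view decorator and the dict-shaped request are all hypothetical.

```python
VIEWS = []  # hypothetical registry of (request_method, callable) pairs


def view(request_method):
    """Toy decorator: register a callable for a single HTTP method."""
    def decorator(func):
        VIEWS.append((request_method, func))
        return func
    return decorator


@view("GET")
def login_form(request):
    return "show the form"


@view("POST")
def login_submit(request):
    return "check the credentials"


def dispatch(request):
    # Pick the first registered view whose predicate matches the request.
    for method, func in VIEWS:
        if method == request["method"]:
            return func(request)
    raise LookupError("no view for %s" % request["method"])


print(dispatch({"method": "GET"}))   # show the form
print(dispatch({"method": "POST"}))  # check the credentials
```

Each callable stays small because the branching on method lives in the dispatcher rather than in the view body.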

Getting back to the view name example, we can define two views, one named and one unnamed:

from pyramid.view import view_config

@view_config(renderer='templates/beverage_detail.html')
def detail(context, request):
    pass

@view_config(name='edit', renderer='templates/beverage_edit.html')
def edit(context, request):
    pass

Because one view is named and the other is not, a URL such as /lemonade/was/a/popular/edit will look for a view named edit whereas /lemonade/was/a/popular/ will look for the unnamed view. Both views receive the same context object as their first argument.

There are many ways in which Pyramid can select a view for a matched request, so consult the docs for more information, but to quickly illustrate another:

@view_config(context=int,
             renderer='templates/beverage_int.html')
def int_view(context, request):
    pass

@view_config(context=str,
             renderer='templates/beverage_str.html')
def str_view(context, request):
    pass

By providing the context argument it is possible to define different views to use based on the class (or interface) of the context object. Perhaps handling int and str differently isn’t the best example but what if you had a traversal system that returned User objects? You could potentially return a different class of User from your resource depending on the current user’s privilege level and from there use a different view callable for each different kind of user. This could make it easy to, for example, always keep admin-only actions away from a limited user or perhaps always redirect suspended users to a warning page.
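The User idea might look something like this in miniature. Again this is a toy dispatcher and not Pyramid’s machinery: the three user classes, the CONTEXT_VIEWS list and dispatch_on_context are hypothetical, and the list order stands in for the specificity rules Pyramid applies itself.

```python
class User(object):
    pass


class AdminUser(User):
    pass


class SuspendedUser(User):
    pass


# Hypothetical registry: most specific classes first.
CONTEXT_VIEWS = [
    (AdminUser, lambda context, request: "admin dashboard"),
    (SuspendedUser, lambda context, request: "account suspended"),
    (User, lambda context, request: "regular dashboard"),
]


def dispatch_on_context(context, request):
    # The class of the context, not the URL, decides which view runs.
    for cls, view in CONTEXT_VIEWS:
        if isinstance(context, cls):
            return view(context, request)
    raise LookupError("no view for %r" % type(context))


print(dispatch_on_context(AdminUser(), None))      # admin dashboard
print(dispatch_on_context(SuspendedUser(), None))  # account suspended
print(dispatch_on_context(User(), None))           # regular dashboard
```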