Monday, April 16, 2012

Neo-Rpython

Restricted Python (Rpython) has all the limitations you would expect from a static language, and they are not very hard to adapt to. However, some key language features remain missing, such as managed attributes and operator overloading.
As you will see in this post, meta-programming can easily lift these limitations and redefine what Rpython is.



The most direct type of meta-programming is to simply write Python code that generates other Python code, and then eval or exec the new code; I use this technique in this post. Another technique is to parse and modify byte-code directly, using a library like BytePlay.
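As a minimal sketch of the generate-and-exec approach, the snippet below builds the source text of a getter/setter pair and executes it to get real functions. The names TEMPLATE, make_accessors and Point are made up for illustration and are not part of PyPy or of the code behind this post.

```python
# Hypothetical template: source text for one getter/setter pair.
TEMPLATE = '''
def get_{name}(self):
    return self._{name}

def set_{name}(self, value):
    self._{name} = value
'''

def make_accessors(name):
    """Generate accessor source for `name`, exec it, return the functions."""
    source = TEMPLATE.format(name=name)
    namespace = {}
    exec(source, namespace)  # run the generated code in a fresh namespace
    return namespace['get_' + name], namespace['set_' + name]

class Point(object):
    pass

# Attach the generated functions to the class as ordinary methods.
Point.get_x, Point.set_x = make_accessors('x')

p = Point()
p.set_x(3)
```

The generated functions behave like hand-written methods, so code that is not allowed to use "property" can still call get_x and set_x explicitly.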

The most powerful meta-programming technique is to use PyPy's flow-graph and modify it directly. The interactive translator (pypy.translator.interactive.Translation) parses the byte-code of a function into a flow-graph. The flow-graph has a very simple design that was laid out almost a decade ago, in October 2003, at a PyPy sprint in Berlin.

The flow-graph is composed of blocks, operations, and links. Its primary feature is that it is easy to modify in place and easy to traverse to find the type of each instance. This provides a way to find the class of an instance and introspect it to check for managed attributes and operator overloading methods like __call__, __getitem__, and __setitem__.
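To make the three pieces concrete, here is a small stand-in model of blocks, operations and links, plus a traversal that visits every operation once. These classes are simplified assumptions written for this example; they only mimic the shape of a flow-graph and are not PyPy's own objects.

```python
class Operation(object):
    """Stand-in for one operation: result = opname(*args)."""
    def __init__(self, opname, args, result):
        self.opname, self.args, self.result = opname, args, result

class Link(object):
    """Stand-in for an edge from one block to the next."""
    def __init__(self, target):
        self.target = target  # the Block this link jumps to

class Block(object):
    """Stand-in for a straight-line run of operations with outgoing links."""
    def __init__(self, operations, exits=()):
        self.operations = list(operations)
        self.exits = list(exits)

def iter_operations(start):
    """Yield every operation reachable from start, visiting each block once."""
    seen, pending = set(), [start]
    while pending:
        block = pending.pop()
        if id(block) in seen:
            continue
        seen.add(id(block))
        for op in block.operations:
            yield op
        pending.extend(link.target for link in block.exits)

# Two blocks joined by one link.
last = Block([Operation('add', ['v2', "'bar'"], 'v3')])
start = Block([Operation('simple_call', ['A'], 'v0'),
               Operation('getattr', ['v0', "'myattr'"], 'v2')],
              exits=[Link(last)])
names = [op.opname for op in iter_operations(start)]
```

Because each block keeps its operations in a plain list, replacing or inserting operations while walking the graph is an ordinary list edit.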

Example - Managed Attributes



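import pypy.translator.interactive

class A(object):
    # Run as plain Python, this setter would recurse through the property;
    # it becomes a normal attribute write once the property is deleted
    # from the class (step 3 below).
    def set_myattr(self,v): self.myattr = v
    def get_myattr(self): return self.myattr
    myattr = property( get_myattr, set_myattr )

def func(arg):
    a = A()
    a.myattr = 'foo'        # goes through set_myattr
    s = a.myattr            # goes through get_myattr
    a.myattr = s + 'bar'
    return 1

# Parse the byte-code of func into the initial flow-graph.
T = pypy.translator.interactive.Translation( func )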


Initial FlowGraph


When a Translation instance is created, it parses the byte-code into the initial flow-graph. At this stage the flow-graph needs to be static, but it is not yet required to be strict-Rpython.

v0 = simple_call((type A))
v1 = setattr(v0, ('myattr'), ('foo'))
v2 = getattr(v0, ('myattr'))
v3 = add(v2, ('bar'))
v4 = setattr(v0, ('myattr'), v3)


Trying to translate the above flow-graph will fail at the annotation stage, throwing an error that "myattr" degenerated to SomeObject.
The degeneration error is caused by the class using "property( get_myattr, set_myattr )" to manage the "myattr" attribute.
To make this work we must do the following:

  1. traverse the initial flow-graph and check all setattr/getattr operations

  2. modify the flow-graph in place to make it strict-Rpython compatible

  3. delete the "property" from the class.



Modified FlowGraph


The flow-graph below has been modified to make it strict-Rpython.
The "setattr(v, myattr, foo)" operation is replaced by two operations:

  1. get a pointer to the setter method

  2. call the method passing "foo"



v0 = simple_call((type A))
v5 = getattr(v0, ('set_myattr'))
v1 = simple_call(v5, ('foo'))
v6 = getattr(v0, ('get_myattr'))
v2 = simple_call(v6)
v3 = add(v2, ('bar'))
v4 = simple_call(v5, v3)
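
The rewrite from the initial to the modified flow-graph can be sketched in plain Python. This is only an illustration under loud assumptions: Op is a made-up stand-in for a flow-graph operation, variables and constants are plain strings, and rewrite_properties is a hypothetical helper, not the code used for this post. Unlike the listing above, the sketch fetches the accessor again for every access instead of reusing v5.

```python
class Op(object):
    """Stand-in for one flow-graph operation: result = opname(*args)."""
    def __init__(self, result, opname, *args):
        self.result, self.opname, self.args = result, opname, list(args)

class A(object):
    def set_myattr(self, v): self.myattr = v
    def get_myattr(self): return self.myattr
    myattr = property(get_myattr, set_myattr)

def rewrite_properties(ops, classes):
    var_class = {}   # variable name -> class it was instantiated from
    found = set()    # (class, attribute) pairs that are properties
    out = []
    for op in ops:
        if op.opname == 'simple_call' and op.args[0] in classes:
            var_class[op.result] = classes[op.args[0]]
        # Step 1: check every getattr/setattr against the instance's class.
        prop = None
        if op.opname in ('getattr', 'setattr'):
            cls = var_class.get(op.args[0])
            prop = cls.__dict__.get(op.args[1]) if cls else None
        if not isinstance(prop, property):
            out.append(op)   # nothing to change for this operation
            continue
        # Step 2: fetch the accessor method, then call it.
        accessor = prop.fget if op.opname == 'getattr' else prop.fset
        method = 'm%d' % len(out)   # fresh variable name
        out.append(Op(method, 'getattr', op.args[0], accessor.__name__))
        out.append(Op(op.result, 'simple_call', method, *op.args[2:]))
        found.add((cls, op.args[1]))
    # Step 3: delete the properties from their classes.
    for cls, name in found:
        delattr(cls, name)
    return out

ops = [Op('v0', 'simple_call', 'A'),
       Op('v1', 'setattr', 'v0', 'myattr', "'foo'"),
       Op('v2', 'getattr', 'v0', 'myattr'),
       Op('v3', 'add', 'v2', "'bar'"),
       Op('v4', 'setattr', 'v0', 'myattr', 'v3')]
new_ops = rewrite_properties(ops, {'A': A})
```

Once the property is gone, "self.myattr = v" inside set_myattr is an ordinary attribute write, which is why the class can keep its original accessor bodies.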


The flow-graph is now ready to pass the next steps in the translation process: annotation and rtyping.
Using this same technique of operation swapping, it becomes easy to add support for operator overloading.

Example - Operator Overloading



class A(object):
    def __getitem__(self, index): return self.array[ index ]
    def __setitem__(self, index, value): self.array[ index ] = value
    def __init__(self): self.array = [ 100.0 ]

def func(arg):
    a = A()
    a[0] = a[0] + a[0]
    return 1


Initial FlowGraph



v0 = simple_call((type A))
v1 = getitem(v0, (0))
v2 = getitem(v0, (0))
v3 = add(v1, v2)
v4 = setitem(v0, (0), v3)


Modified FlowGraph



v0 = simple_call((type A))
v5 = getattr(v0, ('__getitem__'))
v1 = simple_call(v5, (0))
v2 = simple_call(v5, (0))
v3 = add(v1, v2)
v6 = getattr(v0, ('__setitem__'))
v4 = simple_call(v6, (0), v3)
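
The same operation swapping can be sketched for getitem and setitem. As before this is a hedged illustration: operations are plain (result, opname, args) tuples, and SPECIAL and rewrite_operators are hypothetical names chosen for this example rather than anything provided by PyPy. The sketch looks up the special method before every call, where the listing above fetches __getitem__ once and reuses it.

```python
class A(object):
    def __getitem__(self, index): return self.array[index]
    def __setitem__(self, index, value): self.array[index] = value
    def __init__(self): self.array = [100.0]

# Hypothetical table: flow-graph operation name -> special method name.
SPECIAL = {'getitem': '__getitem__', 'setitem': '__setitem__'}

def rewrite_operators(ops, classes):
    """Swap getitem/setitem on known instances for explicit method calls."""
    var_class, out = {}, []
    for result, opname, args in ops:
        if opname == 'simple_call' and args[0] in classes:
            var_class[result] = classes[args[0]]
        cls = var_class.get(args[0])
        method = SPECIAL.get(opname)
        if cls is None or method is None or not hasattr(cls, method):
            out.append((result, opname, args))   # leave unchanged
            continue
        bound = 'm%d' % len(out)                 # fresh variable name
        out.append((bound, 'getattr', (args[0], method)))
        out.append((result, 'simple_call', (bound,) + tuple(args[1:])))
    return out

ops = [('v0', 'simple_call', ('A',)),
       ('v1', 'getitem', ('v0', 0)),
       ('v2', 'getitem', ('v0', 0)),
       ('v3', 'add', ('v1', 'v2')),
       ('v4', 'setitem', ('v0', 0, 'v3'))]
new_ops = rewrite_operators(ops, {'A': A})
```

Supporting another overloaded operator in this sketch would only need one more entry in the SPECIAL table.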


source code
