Monday, September 30, 2013

virtual worlds without scalability limits

There are many 3D virtual world libraries and engines that claim to be high performance and highly scalable, often they tout being able to scale to 100's of simultaneous users per-server, this is basically nothing. These solutions fail because they use the classic client/server model and centralized database. Take for example EveOnline, they have one of the most optimized MMO systems, yet even with huge amounts of RAM and a SSD backed SQL database, they only scale to tens of thousands of concurrent users.

The Solution - Peer2Peer Networking

The solution to scale without limits is pretty simple, drop the client/server model and use the peer2peer model like bittorrent. I was surprised to find only a few projects have tried to use this technique: Solipsis, VAST, and NICTA's research project.

Voronoi Overlay Network

Vast has a smart solution to manage communication between peers that can do spatial queries in the same amount of time regardless of how many peers are connect to the network. This amazing research was pioneered primarily by Shun Yun Hu starting way back in 2004. Do not miss reading his research papers.

Friday, September 27, 2013

PythonScript - Optimizations

Python is a very dynamic language and not easy to optimize. Ryan Kelly has a solution for this, he is the creator of library called Promise. Promise allows you to declare your functions are: invariant, constant, pure, and sensible. This removes many of the dynamic features of Python to improve performance using runtime byte-code optimizations. Using similar restrictions we can optimize the output of the PythonScript compiler to gain much higher performance.

PythonScript already performs quite well on purely numeric code, because this is translated into pure Javascript without any extra function calls. The Browser's JIT is able to unbox the purely numeric code and run it many times faster than regular Python. With non-numeric code PythonScript performs quite badly, running at least five times slower than regular Python. Each attribute access is wrapped with the get_attribute PythonJS function to properly emulate Python's dynamic model where an attribute can be class-level or instance-level. Making things slower still, each function call requires significant overhead in order to emulate Python's calling conventions, packing and unpacking: args, keyword args, variable args, variable keyword args.


To speed up attribute access we can use a new class decorator @inline, this tells the PythonScript compiler to bypass get_attribute and directly return the attribute from the instances __dict__. The compiler is smart enough to know that if the attribute is not found on the instance, and the class has defined a custom __getattr__ method, to directly inline and call __getattr__. The compiler knows which attributes are to be found on the instance by introspecting its methods, and checking for each time an attribute is assigned to self.

assert isinstance

In order for the PythonScript compiler to perform these inline optimizations, it needs to know which class an instance is. When a class instance is created, it knows that variable name is the given class, however if the instance is passed to a function that returns it and assigns it to a new variable, the compiler will no longer know what type it is. The workaround to manually define the type of an instance is using assert isinstance( a, SomeClass ). Another way to declare the type of a variable is using PythonScript's special var() function and using keyword arguments like this: var(x=SomeType), this feature was added in this commit.

class A:

a = A()
b = a ## for this simple case, the compiler still knows that "b" is an instance of A

def opaque(x):
  return x

c = opaque( b )  ## the compiler is not sure what "c" is
assert isinstance(c, A) ## this informs the compiler that "c" is an instance of A
... ## any following code using "c" will have inline optimizations.

Closure Compiler

In my fork of PythonScript I have changed it to be compatible with the Closure compiler. Using its "Advanced Optimizations" option, it is able to inline the body of simple functions, this provides another level of in-lining that can speed up the code even more. Closure is able to compress and optimize the code in other ways as well.


The test code below compiled with PythonScript+Closure and run in Google Chrome, is faster than PyPy by two times, and 32X faster than normal Python.
def somefunc():
  return 1

class A:
  def __init__(self):
    self.x = 1
    self.y = 2
    self.z = 3

  #def __getattr__(self,name):
  def __getattr__():
    ## a method that takes no arguments tells PythonScript
    ## to optimize the call for no arguments
    return random()

a = A()

now = time()
r = 0.0
i = 0
while i < 1000000:
  r += a.some_dynamic_attribute
  i += somefunc()

print( time()-now )

Monday, September 23, 2013

PythonScript - Operator Overloading

In the previous post I introduced Amirouche's awesome PythonScript. In this post I will explain how I modified the first-stage compiler to support simple operator overloading for attribute access "." and subscript "[]" operators. This allows you to write classes using the special methods: __getattr__ and __getitem__.

PythonScript compilation to JavaScript is a two stage process. The first stage parses the original Python code and converts it into a lower subset of Python that matches the semantics of JavaScript, so called: "PythonJS". The second stage takes PythonJS code (still valid Python) and converts it into JavaScript.
To add operator overloading only required changing the first stage of the compiler, see my commits: [1], [2]. The AST NodeVistor has been changed to keep track of: class names, special methods, and instances of the classes. The logic that inspects the AST and tracks class instances is too simple at the moment, and only understands: "a = SomeClass()". Then, as the rest of the AST is traversed, any references to "a" that use attribute access "." or subscript "[]" will be replaced with a PythonJS style function. In the example below, assume that class A has defined a special __getattr__ method.


a = A()
b = a.x + a.y


a = get_attribute(A, "__call__")();
b = __A___getattr__([a, "x"]) + __A___getattr__([a, "y"]);

Sunday, September 22, 2013

In-browser Python

There are at least fourteen different in-browser implementations of Python. Dan Stromberg has put together a list that compares their basic features, here. Some of these implementations are simple AST compilers that translate a subset of Python into a subset of Javascript. Other implementations, like Skulpt, try to fully reimplement the Python language, intepreter and VM, and standard library.

Translation Test

a = 1
while a < 100:
  a += 1


Skulpt has one of the most complete implementations of Python, but this comes at a high price in terms of performance. Emulating the Python interpreter and VM in javascript produces alot of overhead. The translation test above translates into the following javascript.
Sk.execStart = new Date();
var $scope0=(function($modname){var $blk=0,$exc=[],$gbl={},$loc=$gbl,$err=undefined;$gbl.__name__=$modname;Sk.globals=$gbl;try { while(true){try{ switch($blk){

case 0: /* --- module entry --- */
// line 1:
// a=1
// ^

Sk.currLineNo = 1;
Sk.currColNo = 0
Sk.currFilename = '<stdin>.py';

$loc.a=new Sk.builtin.nmber(1,'int');
// line 2:
// while a < 100:
// ^

Sk.currLineNo = 2;
Sk.currColNo = 0
Sk.currFilename = '<stdin>.py';

if (Sk.execStart === undefined) {Sk.execStart=new Date()}if (Sk.execLimit != null && new Date() - Sk.execStart > Sk.execLimit) {throw new Sk.builtin.TimeLimitError(Sk.timeoutMsg())}$blk=1;/* jump */continue;throw new Sk.builtin.SystemError('internal error: unterminated block');

case 1: /* --- while test --- */
var $loadname1=$loc.a!==undefined?$loc.a:Sk.misceval.loadname('a',$gbl);var $compareres2=null;var $compare3=Sk.builtin.bool(Sk.misceval.richCompareBool($loadname1,new Sk.builtin.nmber(100,'int'),'Lt'));$compareres2=$compare3;var $jfalse4=($compare3===false||!Sk.misceval.isTrue($compare3));if (Sk.execStart === undefined) {Sk.execStart=new Date()}if (Sk.execLimit != null && new Date() - Sk.execStart > Sk.execLimit) {throw new Sk.builtin.TimeLimitError(Sk.timeoutMsg())}if($jfalse4){/*test failed */$blk=4;continue;}if (Sk.execStart === undefined) {Sk.execStart=new Date()}if (Sk.execLimit != null && new Date() - Sk.execStart > Sk.execLimit) {throw new Sk.builtin.TimeLimitError(Sk.timeoutMsg())}$blk=4;/* jump */continue;throw new Sk.builtin.SystemError('internal error: unterminated block');

case 2: /* --- after while --- */return $loc;throw new Sk.builtin.SystemError('internal error: unterminated block');

case 3: /* --- while body --- */
// line 3:
//   a += 1
//   ^

Sk.currLineNo = 3;
Sk.currColNo = 2
Sk.currFilename = '<stdin>.py';

var $loadname6=$loc.a!==undefined?$loc.a:Sk.misceval.loadname('a',$gbl);
var $inplbinop7=Sk.abstr.numberInplaceBinOp($loadname6,new Sk.builtin.nmber(1,'int'),'Add');

if (Sk.execStart === undefined) {Sk.execStart=new Date()}if (Sk.execLimit != null && new Date() - Sk.execStart > Sk.execLimit) {throw new Sk.builtin.TimeLimitError(Sk.timeoutMsg())}$blk=1;/* jump */continue;throw new Sk.builtin.SystemError('internal error: unterminated block');

case 4: /* --- done --- */
var $jfalse5=($compareres2===false||!Sk.misceval.isTrue($compareres2));
if (Sk.execStart === undefined) {Sk.execStart=new Date()}if (Sk.execLimit != null && new Date() - Sk.execStart > Sk.execLimit) {throw new Sk.builtin.TimeLimitError(Sk.timeoutMsg())}if($jfalse5){/*test failed */$blk=2;continue;}if (Sk.execStart === undefined) {Sk.execStart=new Date()}if (Sk.execLimit != null && new Date() - Sk.execStart > Sk.execLimit) {throw new Sk.builtin.TimeLimitError(Sk.timeoutMsg())}$blk=3;/* jump */continue;throw new Sk.builtin.SystemError('internal error: unterminated block');} }catch(err){if ($exc.length>0) { $err = err; $blk=$exc.pop(); continue; } else { throw err; }} } }catch(err){ if (err instanceof Sk.builtin.SystemExit && !Sk.throwSystemExit) { Sk.misceval.print_(err.toString() + '\n'); return $loc; } else { throw err; } } });
$scope0(new Sk.builtin.str('__main__')); 

As we can clearly see above, Skulpt is not a practical option for performance critical applications.


PythonScript by Amirouche is a Python to Javascript compiler that can output very efficient javascript. This project is still in the early stages, but it has already has good support for a subset of Python. Compared to the other translators, PythonScript has simplest design, least lines of code, and is easy to modify. Amirouche states on his blog he implemented the first version in less than 25 hours! PythonScript is able to translate same test code used in the Skulpt test above into this fully optimized javascript.
a = 1;
while (a < 100) {
  a += 1;
Note that Amirouche's original PythonScript is missing support for inplace assignment, but this was very easy to add, get my fork here.