Py4J 0.6 Released!

Py4J 0.6 has just been released.

This release includes the following new (and great) features:

  • New exception, Py4JJavaError, that enables Python client programs to access the instance of the Java exception thrown in the Java client code.
  • Improved Py4J setup: warnings are no longer displayed when installing Py4J.
  • Bug fixes and API additions.

In case you did not notice, Py4J moved to GitHub, so contributing is now easier than ever!

I plan to do at least one more release (0.7) before Py4J leaves beta and moves to 1.0.

About Py4J
Py4J enables Python programs running in a Python interpreter to dynamically access Java objects in a Java Virtual Machine. Methods are called as if the Java objects resided in the Python interpreter and Java collections can be accessed through standard Python collection methods. Py4J also enables Java programs to call back Python objects. Py4J is distributed under the BSD license.

Py4J and Exceptions

Py4J 0.6 is almost ready to be released, thanks to Jakub L. Gustak who submitted important bug reports, feature requests, and patches. I have been trying to polish Py4J in the latest releases to make the API more consistent and predictable and the biggest “feature” of 0.6 will no doubt be how Py4J treats Exceptions.

Currently, exceptions can be raised in four places: (1) in the Py4J Python code, (2) in the Py4J Java code, (3) in the Java client code, and (4) in the network stack. An exception might be raised in the Py4J code if the client code is not correct, for example, if the client tries to call from Python a Java method that does not exist. Before 0.6, Py4J raised a Py4JError in cases 1, 2, and 3, and a Py4JNetworkError (a subtype of Py4JError) in case 4. Moreover, if the exception was raised on the Java side, the Java stack trace was copied, as a string, into the Py4JError.

There are two issues with this approach. First, the client does not have access to the exception instance on the Java side, and this exception may have some important fields and methods that can help the error recovery. Second, it is very difficult for the client to determine at runtime the source of the error.

Starting from 0.6, Py4J will raise three types of exceptions: Py4JNetworkError in case #4, Py4JJavaError in case #3, and Py4JError in cases #1 and #2. Py4JNetworkError and Py4JJavaError will be subtypes of Py4JError (so a client can implement a catch-all). Py4JJavaError will also have a method that returns the instance of the Java exception, and Py4JError will still display the Java stack trace for case #2.
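To make the hierarchy concrete, here is a minimal sketch of how client code could tell the three cases apart. The classes below are stand-ins defined locally so the example runs without a JVM; they are not imported from Py4J, and the accessor name get_java_exception is an assumption made for illustration only.

```python
# Stand-in classes mirroring the hierarchy described above.
# They are illustrative only and are NOT the real Py4J classes.
class Py4JError(Exception):
    """Cases 1 and 2: error raised inside the Py4J Python or Java code."""


class Py4JNetworkError(Py4JError):
    """Case 4: error raised in the network stack."""


class Py4JJavaError(Py4JError):
    """Case 3: exception thrown by the Java client code."""

    def __init__(self, message, java_exception):
        super().__init__(message)
        # In real use this would be a reference to the Java exception object.
        self._java_exception = java_exception

    def get_java_exception(self):  # hypothetical accessor name
        return self._java_exception


def classify(error):
    """Report where an error came from, checking the most specific types first."""
    try:
        raise error
    except Py4JJavaError as e:
        return "java client code: %s" % e.get_java_exception()
    except Py4JNetworkError:
        return "network"
    except Py4JError:
        # Catch-all: works because the two classes above are subtypes.
        return "py4j"
```

The order of the except clauses matters: because both specialized errors are subtypes of Py4JError, the catch-all has to come last or it would swallow the other two cases.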

Stay tuned for 0.6!

Py4J code moves to GitHub

It is now official. Py4J code has been moved to GitHub. The issues have been transferred there too, thanks to a handy trac-to-github Python script. I will be shutting down the Trac instance on SourceForge. The web site, mailing list, and alternate download site will remain on SourceForge (Py4J has always been available on PyPI as well).

The decision to move to a Distributed Version Control System (DVCS) was a no-brainer. I tried Mercurial for another open source project I work on, Qualyzer, and DVCSs are clearly superior when it comes to merging and enabling collaboration. The speed boost is also quite welcome (thank you, local commits).

The main question was thus: Mercurial or Git? Actually, the real question was: Bitbucket or GitHub? This was not an easy decision, but in the end, the numerous outages and bugs of Bitbucket and the higher number of potential collaborators on GitHub convinced me to go with GitHub. In addition, GitHub seems speedier and they developed a tool, hg-git, that allows Mercurial users to work with GitHub repositories.

I must say that I was disappointed with the issue tracker at GitHub, which is slightly inferior to the Bitbucket tracker (only tags, no milestones, no automatic reference between commits and issues except when closing issues, no complex queries, etc.), and I just learned that their wiki engine was only recently moved to an implicit Git repository: I believe Bitbucket wikis have always been backed by a Mercurial repository… Anyway, I just wanted to say this because I have read many posts bashing Bitbucket in favor of GitHub and I did not want to sound like one of them…

Py4J 0.5 Released!

Oh Yeah! It’s another release of Py4J and there are many interesting features that made their way in:

  • The ability to import packages (e.g., java_import(gateway.jvm, 'java.io.*'))
  • Support for pattern filtering in JavaGateway.help() (e.g., gateway.help(obj,'get*Foo*Bar'))
  • Support for automatic conversion of Python collections (list, set, dictionary) to Java collections.
  • Two Eclipse features: one embeds the Py4J Java library. The other provides a default GatewayServer that is started when Eclipse starts. Both features are available on the new Py4J Eclipse update site: http://py4j.sourceforge.net/py4j_eclipse
  • New module decomposition: there are no more mandatory circular dependencies among modules.
  • Have a look at the Trac 0.5 milestone for a detailed description of the issues fixed in this release!
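To illustrate the pattern filtering mentioned in the list above, here is a rough sketch of how glob-style filtering of member names can work. It uses Python's standard fnmatch module and a hypothetical helper, filter_members; it is an assumption about the matching semantics, not Py4J's actual implementation, and the member names are made up.

```python
from fnmatch import fnmatchcase


def filter_members(names, pattern=None):
    """Return member names matching a glob pattern, sorted alphabetically.

    Hypothetical helper: '*' matches any run of characters, and matching
    is case-sensitive. With no pattern, every name is returned.
    """
    if pattern is None:
        return sorted(names)
    return sorted(name for name in names if fnmatchcase(name, pattern))


# Made-up member names standing in for the methods of a Java object.
members = ["getFooBar", "getMyFooAndBar", "setFoo", "getBar", "toString"]
matches = filter_members(members, "get*Foo*Bar")
print(matches)
```

Only names that start with "get", contain "Foo", and end with "Bar" survive the filter, which is what makes a pattern such as 'get*Foo*Bar' handy on classes with hundreds of methods.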

This release marks the return to a predictable and regular schedule: one release every two months. I am particularly proud that I could make this release given my crazy and stressful schedule these last few days 😉

I should restate that I have used Py4J every day since version 0.4, and most of the missing features that made it painful to use are now implemented. The next release will focus on source code cleanup (making it more pythonic), a few corner-case features that are difficult to implement but that will make Py4J a killer application, and better unit test organisation. As usual, comments are welcome and if there is any feature you would like to see in the next release, now would be a good time to make your voice heard!

Py4J 0.4 released

Py4J 0.4 has just been released! New features include:

  • Polishing of existing features: fields can be set (not just read), None is accepted as a method parameter, methods are sorted alphabetically in gateway.help(), etc.
  • Java exception stack traces are now propagated to the Python side.
  • Renamed the interfaces member in Callback classes to implements.
  • Internal refactoring to adopt clearer terminology and make Py4J protocol extensible.
  • Many bug fixes: most are related to the callback feature.
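As a sketch of the implements member mentioned in the list above, a Python callback class declares the Java interfaces it implements in a nested Java class. The layout follows the convention shown in the Py4J documentation, but the interface name com.example.ExampleListener and the notify method are hypothetical, and the exact layout in 0.4 may have differed.

```python
class PythonListener(object):
    """Python object that Java code could call back through Py4J."""

    def __init__(self):
        self.received = []

    def notify(self, event):
        # Would be invoked from the Java side through the declared interface.
        self.received.append(event)
        return "ack: %s" % event

    class Java:
        # Fully qualified names of the Java interfaces this class implements.
        # The name below is a placeholder, not a real Py4J interface.
        implements = ["com.example.ExampleListener"]


listener = PythonListener()
reply = listener.notify("started")
```

Without a running gateway the class is just plain Python, which makes callback logic easy to unit test before wiring it to the Java side.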

Since the beginning of Py4J, I have been trying to release early and often and to release on a regular basis (2 months). Unfortunately, because I have been extremely busy during the summer (hey, we moved to our new condo!), I have made the mistake of pushing back the release date in the hope that it would give me time to complete all the tickets for 0.4.

This is not how it should work. The next release is due in 6 weeks and even if there is only one new bug fix or one new feature, it will be released at that date.

In the meantime, you should go ahead and download Py4J!

Py4J 0.4 Release Delayed

Unfortunately, I must delay the 0.4 release because I have other priorities for now. The tentative release date is by the end of August.

The SVN head, available at https://py4j.svn.sourceforge.net/svnroot/py4j/trunk, contains the most important new features and bug fixes of 0.4 so do not hesitate to use the code from the repository. In fact, I use this code every day for the tool I’m developing for my Ph.D.

As always, comments are welcome!

Py4J 0.3 is now available!

I just released Py4J 0.3, a major milestone. This release adds support for more Java collections such as array and set, but more importantly, it enables Java objects to call Python objects that implement a Java interface.

While working on Py4J, I constantly pushed the hard features and the nasty bug fixes to milestone 0.3. Let’s just say that I’m happy this release is now behind me and I am promoting Py4J to beta because all major features are now implemented and relatively well tested.

Enabling callback required a major rearchitecture of Py4J internals and I had to implement a multi-threaded and multi-socket RMI-like server on both the Python and the Java sides. Fortunately, this is not the first time I am involved in shady businesses like this (I remember a graduate software fault tolerance course where I created a redundant framework relying on an orgy of sockets, threads, and processes).

Another area that caused me some worries was garbage collection in Python and in Java. It took me dozens of hours to reach a good level of understanding of Python's handling of garbage collection, weak references, and circular references. I would have preferred to have spent these hours on Team Fortress 2, but in the end, I am glad I learned something valuable.
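For readers curious about the interplay between weak references and circular references, here is a small sketch. It is unrelated to Py4J's internals and assumes CPython, where reference counting frees an object as soon as its last strong reference disappears; other interpreters may behave differently. The Node class is made up for the example.

```python
import gc
import weakref


class Node(object):
    """Tiny object that can point to another node (made up for this example)."""

    def __init__(self):
        self.other = None


def weakref_without_cycle():
    """The weak reference dies as soon as the last strong reference is gone."""
    node = Node()
    ref = weakref.ref(node)
    del node
    return ref() is None


def weakref_with_cycle():
    """Two nodes referencing each other survive until the cycle collector runs."""
    was_enabled = gc.isenabled()
    gc.disable()  # keep the automatic collector from running mid-example
    try:
        a, b = Node(), Node()
        a.other, b.other = b, a
        ref = weakref.ref(a)
        del a, b
        alive_before = ref() is not None  # the cycle keeps both nodes alive
        gc.collect()
        dead_after = ref() is None  # the collector has broken the cycle
    finally:
        if was_enabled:
            gc.enable()
    return alive_before, dead_after
```

The second function shows why cycles matter for a bridge between two runtimes: an object that looks unreachable may linger until the cycle collector runs, so any cleanup tied to its destruction is delayed as well.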

Getting this kind of code to work is hard and I expect it will take a few more releases to polish it and to uncover bugs, but I believe the time I took to design the new architecture on paper was worth it and the initial tests are green 🙂

Now, stop reading this post and go download Py4J 0.3!

Welcome to Py4J’s development blog

Welcome to Py4J’s development blog. We will post news and stories about Py4J here. Because Py4J is written in Java and Python, expect a fair amount of ramblings about the difference between these two languages 🙂

If you want to comment on Py4J or discuss the direction the project is taking, do not hesitate to post a comment on this blog, write an email to the mailing list, or file a feature request.