Py4J 0.9 Released

Py4J 0.9 has just been released on pypi and maven central.

This is a backward-compatible release with many features and bugfixes:

  • Python side: constructor parameters have been deprecated in favor of GatewayParameters and CallbackServerParameters. This was necessary because the number of configuration options is growing fast. Old parameters will be supported until Py4J 1.0 (at least two more minor versions).
  • Python side: IDEs and interactive interpreters such as IPython can now get help text/autocompletion for Java classes, objects, and members. This makes Py4J an ideal tool to explore complex Java APIs (e.g., the Eclipse API). Thanks to @jonahkichwacoders.
  • Python side: the callback gateway server (necessary for Java to call back Python functions) can be daemonized and can be started after the main JavaGateway is started.
  • Python side: py4j.java_gateway.launch_gateway now has a cleaner implementation that discards stdout and stderr output by default. It is also possible to redirect the output from these channels to separate files, deques, or queues. Thanks to @davidcsterratt for finding the root cause and working on the fix.
  • It is now possible to install Py4J from git with pip: pip install git+https://github.com/bartdag/py4j.git. This has been working for some time but was never officially announced before.
  • The Eclipse components of Py4J have been moved to another repository. Existing forks and pull requests can still use the @before-eclipse-split branch until Py4J reaches 1.0. Fixes won’t be backported to this branch, but pull requests will be merged by the main maintainer to @master if requested.
  • Major cleanup of Python source code to make it fully flake8 (pep8 + pyflakes) compliant. It should be easier to contribute now.
  • Major test cleanup effort to make Python tests more reliable. Testing Py4J is difficult because there are many versions of Python and Java to test and Python 2.6 lacks many interesting test features. Effort to make tests even more robust will continue in the next milestone.
  • We introduced a contributing guide and an implicit contributor license agreement that indicates that anyone contributing to Py4J keeps the copyright of the contribution but gives a non-revocable right to license the code using Py4J’s license (3-clause BSD). The copyright statement has been changed to “Copyright (c) 2009-2015, Barthelemy Dagenais and individual contributors. All rights reserved.” to make it clear that individual contributors retain copyrights of their contributions. An AUTHORS.txt file has been added to the repository to keep track of contributors: if your name is not in the file and you have contributed to Py4J, do not hesitate to write on the mailing list or open a pull request.
  • Cleaned up the doc that was referring to broken links or refactored classes. Long-time users may want to review the advanced topics page.
  • Added support for Python Wheels.
  • We have a new website: https://www.py4j.org
  • We have a new blog: https://blog.py4j.org
  • Eclipse features have moved to: http://eclipse.py4j.org
  • We have a new mailing list
  • github 0.9 milestone
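
The move from constructor parameters to GatewayParameters and CallbackServerParameters, together with the deferred, daemonized callback server, might look roughly like the sketch below. It assumes a Java GatewayServer is already listening on the given port; the function name connect_with_parameters and the port values are chosen for illustration only, and py4j is imported inside the function so that the snippet can be loaded without a JVM at hand.

```python
def connect_with_parameters(gateway_port=25333, callback_port=25334):
    """Connect to an already running Java GatewayServer (assumed to exist).

    The function name and the default ports are illustrative choices.
    """
    # Imported lazily so the module loads even where py4j is not installed.
    from py4j.java_gateway import (
        CallbackServerParameters,
        GatewayParameters,
        JavaGateway,
    )

    # New style: connection options live in a GatewayParameters object
    # instead of being passed as individual constructor arguments.
    gateway = JavaGateway(
        gateway_parameters=GatewayParameters(port=gateway_port))

    # The callback server is started after the main gateway, as a daemon,
    # so it should not keep the Python process alive on its own.
    gateway.start_callback_server(
        callback_server_parameters=CallbackServerParameters(
            port=callback_port, daemonize=True))
    return gateway
```

Calling connect_with_parameters() without a running JVM is expected to fail with a connection error, so treat this as a starting point rather than a complete program.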

If you see any backward-incompatible changes, do not hesitate to file a bug report (https://github.com/bartdag/py4j/issues/new).

This release has been made possible by the generous contributions of many users. Every bug report, patch, pull request, idea, and piece of help on the mailing list (thanks @agronholm) is greatly appreciated.

Py4J 0.8.2.1 Released

Hi,

this is a minor bugfix release.

This release includes the following bugfixes:

  • Fixed constructors not being able to receive proxies (Python classes implementing Java interfaces).
  • Restored Java 6 compatibility in the compiled jar file.
  • Fixed unit tests for JDK 8.
  • Added a few extra paths to find_jar_path.

The details of the specific issues for Py4J 0.8.2.1 are available on GitHub.

Installing Py4J is one pip away: pip install py4j

Again, feel free to contact me or to write a feature request on GitHub!

For more information about Py4J:

Cheers,
Barthélémy

Py4J 0.8.1 Released

Hi,

this is a minor release featuring the long-awaited availability of Py4J on the Maven Central repository.

This release includes the following changes:

  • Fixed a bug in type inference when interface hierarchy is deeper than abstract class hierarchy.
  • Added a utility method is_instance_of in py4j.java_gateway to determine if a JavaObject is an instance of a type.
  • Released Py4J in the central Maven repository. I’m still waiting for a final confirmation from Sonatype before they enable the automatic syncing of Py4J Maven releases.

The details of the specific issues for Py4J 0.8.1 are available on GitHub.

Installing Py4J is one pip away: pip install py4j

Again, feel free to contact me or to write a feature request on GitHub!

For more information about Py4J:

Cheers,
Barthélémy

Py4J 0.8 Released

After two years, Py4J 0.8 has just been released!

Although I merged pull requests and fixed bugs during these two years, my Ph.D., wedding, and new job made it difficult to find the time to make a proper release. All happy events that made my busy life busier 🙂

This release includes the following new features:

  • Major fix to the Java byte[] support. Thanks to @agronholm for spotting this subtle but major issue and thanks to @fdinto from The Atlantic for providing a patch!
  • Ability to fail early if the py4j.java_gateway.JavaGateway cannot connect to the JVM.
  • Added support for long primitives, BigDecimal, enum types, and inner classes on the Java side.
  • Set saner log levels.
  • Many small bug fixes and API enhancements (backward compatible).
  • Wrote a section in the FAQ about security concerns and precautions with Py4J.
  • Added support of Travis-CI and cleaned up the test suite to remove hardcoded paths.

The specific issues are discussed on GitHub.

Installing Py4J is one pip away: pip install py4j

Again, feel free to contact me or to write a feature request on GitHub!

For more information about Py4J:

Cheers,
Barthélémy

 

Py4J Backlog, Bytes, and Open Source

Since my Ph.D. thesis is being printed right now, I thought I could give a status update on Py4J.

One Py4J contributor/user reported a problem with how Py4J handles byte arrays almost a year ago. Because Py4J was treating byte arrays like any other array (i.e., as a reference), access to individual cells in the arrays was costly (one roundtrip per access). Byte arrays are special beasts because when you go down to the level of bytes, you usually want the raw power and the hanging rope that come with it: you certainly don’t want the programming language or a particular library to stand in your way. Because Py4J uses a String protocol (e.g., newlines are used as separators), transferring raw bytes would require a lot of modifications and would introduce a special case that would need more code than the usual case.

I thus implemented a naive solution that just shifted each byte by 8 bits, to make sure that I could still use my dear newlines. The same person came back to me a few months later though, and introduced me to the concept of UTF-16 surrogates and how Java did not like these special pairs of characters, even in UTF-8, the default encoding for Py4J.

I boosted the priority of this issue, but because I had started a new job and I was trying to finish my thesis during the weekends (advice: this is the fastest way to end up in an asylum), I had neither the time nor the strength to find a solution. Fortunately, a contributor from The Atlantic made a nice Christmas present to Py4J users: he implemented a fix using Base64 and opened a pull request. I merged the pull request in January, but I’m still fighting with some test glitches caused by the differences between Python 2 and Python 3. The Open Source community has been very kind to me and I have been fortunate to receive significant contributions from Py4J users in the past (Python 3 support anyone?). Because I am working for a company that is sympathetic to open source contributions, I will make sure in the near future that the effort behind the various Py4J patches was not in vain.
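
To make the byte problem concrete, here is a small sketch in plain Python. It is not Py4J’s actual code: naive_shift_encode is only my guess at what “shifting by 8 bits” amounts to, and the two Base64 helpers are illustrative names. It shows that the shifted characters avoid newlines but can land in the surrogate range (0xD800 to 0xDFFF), which UTF-8 refuses to encode, whereas Base64 yields a newline-free ASCII line that round-trips cleanly.

```python
import base64


def naive_shift_encode(data: bytes) -> str:
    """Hypothetical reconstruction: one character per byte, shifted by 8 bits."""
    return "".join(chr(b << 8) for b in data)


def base64_encode_line(data: bytes) -> str:
    """Encode bytes as a single ASCII line (b64encode inserts no newlines)."""
    return base64.b64encode(data).decode("ascii")


def base64_decode_line(line: str) -> bytes:
    """Recover the original bytes from a Base64 line."""
    return base64.b64decode(line.encode("ascii"))


# A newline byte, a byte that maps into the surrogate range, and 0xFF.
payload = bytes([0x0A, 0xD8, 0xFF])

shifted = naive_shift_encode(payload)
try:
    shifted.encode("utf-8")
    shift_is_utf8_safe = True
except UnicodeEncodeError:
    # 0xD8 << 8 == 0xD800, a lone surrogate that UTF-8 cannot represent.
    shift_is_utf8_safe = False

line = base64_encode_line(payload)
round_trip = base64_decode_line(line)
```

The price of Base64 is size: every 3 bytes become 4 characters, which seems an acceptable trade for keeping the newline-delimited protocol unchanged.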

There are currently 5 open issues that I need to close before releasing 0.8, but all issues have some work in progress so I am confident that I will go through this backlog soon. After that, I will try to come back to a regular release cycle.

Py4J 0.7 Released!

Py4J 0.7 has just been released!

This release includes the following new features:

  • Major refactoring to support Python 3. Thanks to Alex Grönholm for his patch.
  • The build and setup files have been totally changed. Py4J no longer requires Paver to build and everything is done through ant. The setup.py file only uses distutils.
  • Added support for Java byte[]: byte arrays are passed by value and converted to bytearray or bytes.
  • Py4J package name changed from Py4J to py4j.
  • Bug fixes in the Python callback server and unicode support.

The specific issues are discussed on GitHub.

Installing Py4J is one pip away: pip install py4j

Although I’m still using Py4J every day, I do not need many more features so future development will be mostly driven by feature requests and bug reports. Feel free to contact me or to write a feature request on GitHub!

Cheers,
Barthélémy

 

P.S.

I shall blog soon about a small and hidden feature that I introduced in the Py4J eclipse default server this morning and that made it into 0.7…

Py4J 0.7 soon to be released!

Just a quick post to announce that 0.7 will be released as soon as pypi renames the py4j package from Py4J to py4j (note the uppercase letters in the first name 🙁). The release will introduce compatibility with Python 3.0, thanks to the huge effort of a generous contributor! This version also fixes a few bugs reported by users.

Stay tuned!

Py4J 0.6 Released!

Py4J 0.6 has just been released.

This release includes the following new (and great) features:

  • New exception, Py4JJavaError, that enables Python client programs to access the instance of the Java exception thrown in the Java client code.
  • Improved Py4J setup: warnings are no longer displayed when installing Py4J.
  • Bug fixes and API additions.

In case you did not notice, Py4J moved to github so contributing is now easier than ever!

I plan to do at least another release (0.7) before Py4J leaves beta and moves to 1.0.

About Py4J
Py4J enables Python programs running in a Python interpreter to dynamically access Java objects in a Java Virtual Machine. Methods are called as if the Java objects resided in the Python interpreter and Java collections can be accessed through standard Python collection methods. Py4J also enables Java programs to call back Python objects. Py4J is distributed under the BSD license.

Py4J and Exceptions

Py4J 0.6 is almost ready to be released, thanks to Jakub L. Gustak who submitted important bug reports, feature requests, and patches. I have been trying to polish Py4J in the latest releases to make the API more consistent and predictable and the biggest “feature” of 0.6 will no doubt be how Py4J treats Exceptions.

Currently, exceptions can be raised in four places: (1) in the Py4J Python code, (2) in the Py4J Java code, (3) in the Java client code, and (4) in the network stack. An exception might be raised in the Py4J code if the client code is not correct, for example, if the client tries to call from Python a Java method that does not exist. Before 0.6, Py4J raised a Py4JError in cases 1, 2, and 3 and a Py4JNetworkError (a subtype of Py4JError) in case 4. Moreover, if the exception was raised on the Java side, the Java stack trace was copied, as a string, into the Py4JError.

There are two issues with this approach. First, the client does not have access to the exception instance on the Java side, and this exception may have some important fields and methods that can help the error recovery. Second, it is very difficult for the client to determine at runtime the source of the error.

Starting from 0.6, Py4J will raise three types of exceptions: Py4JNetworkError in case #4, Py4JJavaError in case #3, and Py4JError in cases #1 and #2. Py4JNetworkError and Py4JJavaError will be subtypes of Py4JError (so a client can implement a catch-all). Py4JJavaError will also have a method that will return the instance of the Java exception, and Py4JError will still display the Java stack trace for case #2.
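
Here is a sketch of how client code could take advantage of this hierarchy. The three classes below are local stand-ins that only mirror the relationships described above; they are not Py4J’s implementation, and the java_exception attribute, the classify helper, and the sample failures are assumptions made for illustration. The point is the order of the except clauses: the subtypes come first, and the Py4JError catch-all comes last.

```python
class Py4JError(Exception):
    """Stand-in base class: errors inside Py4J itself (cases 1 and 2)."""


class Py4JNetworkError(Py4JError):
    """Stand-in for errors in the network stack (case 4)."""


class Py4JJavaError(Py4JError):
    """Stand-in for exceptions thrown by the Java client code (case 3)."""

    def __init__(self, message, java_exception):
        super().__init__(message)
        # Assumed accessor for the Java-side exception instance.
        self.java_exception = java_exception


def classify(call):
    """Run call() and report where a failure came from."""
    try:
        return ("ok", call())
    except Py4JJavaError as error:
        # Most specific first: the Java exception instance is available.
        return ("java client code", error.java_exception)
    except Py4JNetworkError:
        return ("network", None)
    except Py4JError:
        # Catch-all for the remaining Py4J errors.
        return ("py4j", None)


def java_failure():
    raise Py4JJavaError("call failed", "java.lang.IllegalStateException")


def network_failure():
    raise Py4JNetworkError("connection refused")


def py4j_failure():
    raise Py4JError("method does not exist")


results = [classify(f) for f in (java_failure, network_failure, py4j_failure)]
```

Putting the Py4JError clause first would swallow both subtypes, which is exactly the ambiguity the new hierarchy is meant to remove.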

Stay tuned for 0.6!

 

Py4J code moves to GitHub

It is now official. Py4J code has been moved to GitHub. The issues have been transferred there too thanks to a handy trac-to-github python script. I will be shutting down the trac instance on SourceForge. The web site, mailing list, and alternate download site will still remain on SourceForge (Py4J has always been available on pypi as well).

The decision to move to a Distributed Version Control System (DVCS) was a no-brainer. I tried mercurial for another open source project I work on, Qualyzer, and DVCSs are clearly superior when it comes to merging and enabling collaboration. The speed boost is also quite welcome (thank you local commit).

The main question was thus: mercurial or git? Actually, the real question was, bitbucket or github? This was not an easy decision, but in the end, the numerous outages and bugs of bitbucket and the higher number of potential collaborators on github convinced me to go with github. In addition, github seems speedier and they developed a tool, hg-git, that allows mercurial users to work with github repositories.

I must say that I was disappointed with the issue tracker at github which is slightly inferior to the bitbucket tracker (only tags, no milestones, no automatic reference between commit and issues except when closing issues, no complex queries, etc.) and I just learned that their wiki engine was only recently moved to an implicit git repository: I believe bitbucket wikis have always been backed by a mercurial repository… Anyway, I just wanted to say this because I have read many posts bashing bitbucket over github and I did not want to sound like one of them…