
Friday, August 3, 2012

Bootstrapping Trial in Python 3

Initially, I tried an "extensive" approach to porting Twisted - picking a certain error and fixing it in every module. Unfortunately, as I've found out, this isn't very practical: not only is Twisted a large code base, it's also very old. While updating this crufty code might have been doable, Twisted also requires that all changes to code be tested (and I think this is very nice!*). This has been enforced quite strictly in the last few years, but of course, the code using the really old, Python 3-incompatible idioms is the same code which has no tests. As such, to make any sort of substantial change I would also need to write tests. This proved to be a little too much, and Itamar suggested I consider a more "intensive" approach - fixing Twisted a module at a time, starting with the core.

In this I also meandered slightly, but after discussing it with exarkun on IRC, we concluded it would be best to pick a file with tests, run it under Python 3 and fix the failures which arise. This is in line with Twisted's programming paradigm, test-driven development, and is a very comfortable way of working. The idea, of course, was to start with modules which have no dependencies on the rest of Twisted, and then work "down" the dependency tree as individual modules are ported. While this sounds ideal, I've encountered two problems: the minor one is that Twisted depends on itself a lot, and it's hard (although not impossible) to identify modules which do not use any others; the major issue is the test runner itself, Trial.

Trial is Twisted's (quite versatile) framework for testing, based on the standard unittest module. Over time, the TestCase class was completely rewritten (though in a compatible way) to support various features which make testing easier. Now, for a file to be imported under Python 3, it isn't enough for the file itself to be syntax-compatible with Python 3: all of its imports need to be compatible too. Each test subclasses twisted.trial.unittest.TestCase, and the twisted.trial.unittest module is very large and unfortunately imports a large chunk of Twisted itself (notably, twisted.internet.reactor, but also half the twisted.python package). Therefore, it's impossible for me to actually run the tests, as I need Trial, Trial needs other things, and none of this is compatible with Python 3. I tried writing a large patch to at least make Trial importable, but it was rejected (and for good reason, I now think). Obviously, the huge patchset would need to be broken into smaller tickets, but preferably in a logical way.

Luckily, the solution came via the official unittest module - if I only change the test case to import from the official library, rather than from Trial, it will work! Then a simple ``python3.2 -m unittest twisted.test.test_whatever`` runs the tests. I have successfully used this method for several simpler files, but I fear the low-hanging fruit is gone - as was to be expected, many test files do use functionality provided only by Trial's TestCase. I am still trying to "pick around" here and there, and have also submitted tickets which do not fix a specific module, but just a single issue (e.g. removing __cmp__ in t.p.versions, removing uses of UserDict). It is clear, however, that this approach will not lead me to my immediate goal - running Trial itself under Python 3.
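To illustrate the two ideas above, here is a minimal sketch of a test module that imports TestCase from the standard library rather than from Trial, and tests a class whose __cmp__ has been replaced by rich comparison methods. The Version class below is a hypothetical stand-in written for this example, not the actual code in t.p.versions, and the test names are made up as well.

```python
import operator
import unittest  # the standard library module, not twisted.trial.unittest


class Version(object):
    """Hypothetical stand-in for a version object; not Twisted's real class."""

    def __init__(self, major, minor, micro=0):
        self.major = major
        self.minor = minor
        self.micro = micro

    def _key(self):
        # Tuples compare element by element, which matches version ordering.
        return (self.major, self.minor, self.micro)

    def _compare(self, other, op):
        # Returning NotImplemented lets Python try the other operand.
        if not isinstance(other, Version):
            return NotImplemented
        return op(self._key(), other._key())

    # Python 3 ignores __cmp__, so each comparison is spelled out instead.
    def __eq__(self, other):
        return self._compare(other, operator.eq)

    def __ne__(self, other):
        return self._compare(other, operator.ne)

    def __lt__(self, other):
        return self._compare(other, operator.lt)

    def __le__(self, other):
        return self._compare(other, operator.le)

    def __gt__(self, other):
        return self._compare(other, operator.gt)

    def __ge__(self, other):
        return self._compare(other, operator.ge)

    def __hash__(self):
        # Defining __eq__ disables the default hash in Python 3.
        return hash(self._key())


class VersionTests(unittest.TestCase):
    def test_ordering(self):
        self.assertTrue(Version(1, 2) < Version(1, 10))
        self.assertTrue(Version(2, 0) >= Version(1, 9, 9))

    def test_equality(self):
        self.assertEqual(Version(1, 0), Version(1, 0, 0))
        self.assertNotEqual(Version(1, 0), Version(1, 0, 1))
```

Saved as a module, a file like this can be run with the same ``python3.2 -m unittest`` invocation, and it only works as long as the tests stick to methods that the standard TestCase provides.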

And this is where I currently am: my goal is to bootstrap Trial, to make it runnable in Python 3, which will make running tests (and, by extension, fixing relevant failures) much easier. The "pick a test_file and fix it" method cannot bring me there and I've been unable to think of a better alternative. One idea was to use an alternative TestCase implementation (I tried testtools, which unfortunately isn't compatible with Twisted's tests as-is); using a different runner wouldn't help, as the modules would still need to be imported. Another idea is to provide some sort of temporary class, which would extend TestCase from the official unittest library with the specific methods I'm lacking; this class would then be deleted as soon as it's possible to run Trial itself. This doesn't strike me as a very clean approach, but it might be the only plausible one, unless someone has a different suggestion...
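A rough sketch of what such a temporary class might look like follows. The class name BootstrapTestCase, the Settings object and the tests are hypothetical; the helper names assertIdentical, assertNotIdentical and patch are modelled on helpers that Trial's TestCase offers, but the bodies here are simplified guesses rather than Trial's implementation.

```python
import unittest


class BootstrapTestCase(unittest.TestCase):
    """Hypothetical stopgap base class, to be deleted once Trial itself runs."""

    def assertIdentical(self, first, second, msg=None):
        # Identity check ("is"), as opposed to assertEqual's "==".
        if first is not second:
            self.fail(msg or "%r is not %r" % (first, second))
        return first

    def assertNotIdentical(self, first, second, msg=None):
        if first is second:
            self.fail(msg or "%r is %r" % (first, second))
        return first

    def patch(self, obj, attribute, value):
        # Swap the attribute now and restore the original after the test.
        original = getattr(obj, attribute)
        setattr(obj, attribute, value)
        self.addCleanup(setattr, obj, attribute, original)


class Settings(object):
    """Hypothetical object whose attribute a test wants to replace."""
    timeout = 30


class ExampleTests(BootstrapTestCase):
    def test_identity(self):
        marker = object()
        self.assertIdentical(marker, marker)
        self.assertNotIdentical(marker, object())

    def test_patch(self):
        self.patch(Settings, "timeout", 1)
        self.assertEqual(Settings.timeout, 1)
```

A test module would then subclass this class instead of twisted.trial.unittest.TestCase until Trial becomes importable, at which point switching back should only require changing the base class.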

In the meantime, I'm focusing on fixing what I can (even if it doesn't directly lead to supporting Trial) and making more "general" changes, to reduce the size of further patches (but there will be at least a couple of big ones, there's no avoiding it). In fact, I've been focusing on making tickets as small as possible, to ease the review burden, though I've still got plenty awaiting review: any help on this front would be much appreciated. I've also tried reviewing other tickets, to ease the general burden, though in the one case where I actually "passed" a review I had to revert the change, so I'm trying to be more careful about it now.

*While I do find it very nice, I do have some issues with this policy and I feel that a few carefully thought-out exceptions would have been very helpful in my project. More thoughts on this in a future blog post.

Monday, June 18, 2012

Another year, another GSoC!

Well, this blog post took long enough, but I'm happy to announce that I've been accepted once again for the Google Summer of Code, this year for Twisted ("an event-driven networking engine written in Python"). What's more, my project is essentially the same as last year - porting Twisted to Python 3, or at least getting as close as possible (my actual proposal is available on Google Docs). Unfortunately, compared to last year, my school load was much higher this time around, so I've done much less work than I would have liked.

So far, I've mostly focused on fixing the warnings thrown when running the test suite with "-3", taking care of most of the trivial ones (e.g. has_key, apply, classic division). Currently, I'm looking into replacing "buffer()", which was a built-in but is removed in Python 3. While the work is similar, the workflow for getting changes in is quite different from SymPy. Twisted uses an SVN repository and Trac for issue tracking; each change must be done in a separate branch and have a corresponding ticket. SymPy uses the classic GitHub + git workflow, with pull requests and reviews in the online interface. Now, I got too used to git to just give it up easily (especially as the Twisted workflow almost requires additional tools on top of vanilla svn), and this guide was very useful in setting up git svn. Although I'm getting used to the review process (e.g. changesets are reviewed in total, not per-commit), I still find the GitHub (and git) model more productive - it streamlines review and allows small, atomic commits to be made (although I've been trying to keep my changes as small as possible, each such small change requires the opening of another ticket and creation of another SVN branch, so there's a point at which it's too much effort to do it all). Still, Twisted is unlikely to change, so I will have to adapt - at least I can follow my own practices in my own repo.
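For readers unfamiliar with these warnings, here is a small sketch of the kind of rewrite each one calls for. The names settings and add are made up for the example, and memoryview is only one possible replacement for buffer(); note that it needs Python 2.7 or later, so it would not cover 2.6 by itself.

```python
# Each section shows the old Python 2 idiom in a comment, followed by a
# spelling that works on both Python 2 and Python 3.

settings = {"port": 8080}
# Old: settings.has_key("port")
has_port = "port" in settings


def add(a, b=0):
    return a + b


args = (1,)
kwargs = {"b": 2}
# Old: apply(add, args, kwargs)
total = add(*args, **kwargs)

# Old: 7 / 2 relied on "classic" division, which truncates for ints on
# Python 2 but returns a float on Python 3. Where truncation is intended,
# floor division says so explicitly on both versions.
half_rounded_down = 7 // 2

data = b"hello world"
# Old: buffer(data, 6, 5), i.e. offset 6 and length 5
view = memoryview(data)[6:11]
tail = view.tobytes()
```

The first three rewrites are mechanical, which is why they count as the trivial ones; the buffer() case depends on how the result is used afterwards, so each call site has to be looked at individually.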

The next step is deciding which porting strategy to ultimately pursue. Some Twisted developers suggest a single code-base strategy (py2 and py3 compatible), I personally favor a single code base which relies on 2to3 for py3k compatibility, and there was even an attempt at a dual code base, by Antoine Pitrou. While I feel that the dual code-base approach is the least likely to succeed, as it places a high burden on the maintainer (and the effort has indeed stalled), the code already there will be helpful in my own work. Often, the changes made in py3k code can be reused in the "main", Python 2 code with little or no modification. Still, all approaches deserve investigation and my mind is still open to other ideas. Twisted currently supports Python 2.6+, which makes my job easier. The final piece of good news is that I may be able to get some sort of help (or at least support) from Canonical, as part of their plan to install only Python 3 in the next desktop Ubuntu release.