It wasn’t an easy decision. It was fraught with peril and disagreement. Still, it was staring us in the face. We were going to have to rewrite Emma.
It was 2008 and the application was, in parts, five years old. There were a lot of opinions on how and when, but no one really disagreed that the application needed to be regrounded. The application had been written years before on a custom, monolithic PHP framework that was showing its age in all the worst ways. It relied on outdated libraries and was difficult to maintain and even more difficult to expand. We wanted features like internationalization, less reliance on the wasteland of PEAR, and a more consistent environment across the various problem spaces in which we work (application development, backend processing, database scripting, server administration tools, etc.).
So, the most dangerous question in software development came up: What language should we use?
Energy, optimism and lots of decisions
The application had long been using PHP and the PEAR libraries. It was a PHP5 application, but only just: an emergency effort to make it PHP5-compatible had been undertaken sometime in the spring or summer of 2008. That fall brought with it a breath of coolness hinting at the despair of winter. What we felt instead was energy and optimism at the thought that this rewrite might right many of the wrongs in the PHP application.
It’s not 100 percent clear to me now exactly why, but I know that using Python was one of the first ideas we had. Sure, Ruby (and Rails) were on the table. Java was eliminated early on because none of us wanted to learn it. Rails was clearly the front-running web framework in the Ruby world, and we wrestled with some early Ruby code.
Then, we found Django and its principles and architecture really resonated with the team. More than that, we found the ecosystem of tools and developers was growing to the point that it might rival the explosion Rails had seen. What interested us about using Python in addition to Django was the hope that our web application code might be able to look like and use some of the same code that our background processes were using. Python is a great language for scripting server-side processes. Having that code be more like what we’re writing for the web application seemed like a good goal.
Late in 2008, we released the first bit of Django/Python code to the application. I rewrote the user-facing part of our import process in Python. We hooked together this new Django app with the existing PHP app. There were several major hurdles: session management, templating/customization support, handling database communications and deployment, among others.
Four hurdles to jump
The first problem we had to solve was getting all of the PHP application’s authentication and authorization mechanisms working on the Python side of the application. We certainly didn’t want folks importing lists into accounts they didn’t own. We quickly found that we could make PHP and Python share the same session data with some serialization/deserialization code. That worked extremely well, and we are still doing it today.
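To give a feel for what sharing session data involves, here is a minimal sketch of reading a PHP session from Python. It is not the code we run in production. It assumes PHP’s default `php` session handler, which stores each entry as `name|serialized-value`, and it handles only scalars, strings and arrays (no objects). The function names and the sample session keys are made up for illustration.

```python
def _read_value(data: bytes, pos: int):
    """Parse one PHP-serialized value at pos; return (value, next_pos)."""
    tag = data[pos:pos + 1]
    if tag == b"N":                       # N;
        return None, pos + 2
    if tag in (b"i", b"d", b"b"):         # i:42;  d:1.5;  b:1;
        end = data.index(b";", pos)
        raw = data[pos + 2:end].decode("ascii")
        if tag == b"i":
            return int(raw), end + 1
        if tag == b"d":
            return float(raw), end + 1
        return raw == "1", end + 1
    if tag == b"s":                       # s:<byte length>:"<bytes>";
        colon = data.index(b":", pos + 2)
        length = int(data[pos + 2:colon])
        start = colon + 2                 # skip the colon and opening quote
        text = data[start:start + length].decode("utf-8")
        return text, start + length + 2   # skip the closing quote and semicolon
    if tag == b"a":                       # a:<count>:{<key><value>...}
        colon = data.index(b":", pos + 2)
        count = int(data[pos + 2:colon])
        pos = colon + 2                   # skip the colon and opening brace
        result = {}
        for _ in range(count):
            key, pos = _read_value(data, pos)
            result[key], pos = _read_value(data, pos)
        return result, pos + 1            # skip the closing brace
    raise ValueError(f"unsupported PHP type tag {tag!r} at offset {pos}")


def decode_php_session(data: bytes) -> dict:
    """Decode a session payload written by PHP's default 'php' handler."""
    session, pos = {}, 0
    while pos < len(data):
        bar = data.index(b"|", pos)       # the entry name ends at the first '|'
        name = data[pos:bar].decode("utf-8")
        session[name], pos = _read_value(data, bar + 1)
    return session


# Hypothetical session contents, as PHP would write them to its session store.
raw = b'account_id|i:42;user|a:2:{s:4:"name";s:3:"ann";s:5:"admin";b:0;}'
session = decode_php_session(raw)
```

With the session decoded into a plain dictionary, the Python side can check something like `session["account_id"]` before it lets an import proceed. Note that PHP string lengths are byte counts, which is why the sketch works on `bytes` rather than `str`.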
The second issue we faced was how to make the user interface of the new Python application look and feel like the old PHP application. A lot of work went into that first imports application to replicate much of our templating/customization architecture. We routinely use different stylesheets and HTML for different types of users, and we hide or show different parts of the app depending on permissions. All of those flags and switches had to be supported. Django’s templating engine made it pretty easy, though we had to redo some of the architecture of the templates when the new UI project arrived in 2009.
The third big hurdle we had to jump in this PHP-to-Python conversion was how to talk to a large and complicated legacy database. We use PostgreSQL in production and are heavy users of its table inheritance. Our schema is very large and we store a lot of data. Additionally, we shard horizontally and assign particular accounts to particular shards. When we first started using Django (version 0.96/7), there was very little support in the Django ORM for these types of setups. We also wanted our background processing code to use some of the same database code, so requiring the entire Django stack for those scripts seemed counter-productive. We chose to use (and still use) the excellent SQLAlchemy library. That library has only gotten better in the two years we’ve used it. We’ve navigated many thorny problems, like dynamically creating classes based on tables whose structure the code doesn’t know in advance, and switching the connected database on an existing database session. While not using the Django ORM has limited some of the aspects of Django that we might like to employ, the reuse of the SQLAlchemy code in which we’ve invested has really paid off.
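Two of the ideas above, routing an account to its shard and building a class from a table the code has never seen, can be sketched in a few lines. This is only an illustration under loud assumptions: it swaps in the standard library’s `sqlite3` and `namedtuple` in place of SQLAlchemy and PostgreSQL so that it runs on its own, and the shard names, account map, table and helper functions are all hypothetical.

```python
import sqlite3
from collections import namedtuple

# Hypothetical shards: two in-memory databases standing in for real servers.
SHARDS = {
    "shard_a": sqlite3.connect(":memory:"),
    "shard_b": sqlite3.connect(":memory:"),
}
# Hypothetical assignment of account ids to shards.
ACCOUNT_TO_SHARD = {1: "shard_a", 2: "shard_b"}


def connection_for(account_id):
    """Return the connection for the shard that holds this account."""
    return SHARDS[ACCOUNT_TO_SHARD[account_id]]


def reflect_row_class(conn, table):
    """Create a row class at runtime from whatever columns the table has."""
    if not table.isidentifier():
        raise ValueError(f"suspicious table name: {table!r}")
    # PRAGMA table_info yields one row per column; index 1 is the column name.
    columns = [row[1] for row in conn.execute(f"PRAGMA table_info({table})")]
    if not columns:
        raise LookupError(f"no such table: {table}")
    return namedtuple(table.title().replace("_", ""), columns)


def fetch_all(account_id, table):
    """Read every row of a table from the account's shard as typed rows."""
    conn = connection_for(account_id)
    row_class = reflect_row_class(conn, table)
    return [row_class(*row) for row in conn.execute(f"SELECT * FROM {table}")]


# Demo data: the same table exists on both shards with different contents.
for conn in SHARDS.values():
    conn.execute("CREATE TABLE members (id INTEGER, email TEXT)")
SHARDS["shard_a"].execute("INSERT INTO members VALUES (1, 'a@example.com')")
SHARDS["shard_b"].execute("INSERT INTO members VALUES (7, 'b@example.com')")

rows = fetch_all(2, "members")  # account 2 lives on shard_b
```

The calling code never names a shard or a column list. It asks for an account’s data and gets back objects whose attributes mirror the table as it exists at that moment, which is the property that matters when the schema is large and not fully known to the code.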
The fourth issue we tackled was how to maintain this new codebase and deploy it to our servers. Around the same time we picked up Python, the entire development team switched its source control from CVS to Git. It’s been a fantastic boon to our productivity and happiness, and it gave us some better options for moving all this new code to the server. We initially started down the road with some home-rolled Capistrano scripts. The process was manual and tedious, and you really had to know what you were doing. Eventually, Kevin, our Platform Development Lead, wrote an application that can deploy any Git branch from any Git repo to any server with the click of a button in the browser. This has let QA be confident about what an application build contains and has made it easier to include new servers in our release pool as they come online.
The Zen of Python
That’s how we got started in our conversion to Python from PHP. We began an initiative recently to eliminate the last bits of PHP from the user-facing codebase. We should complete that in the next three months. It’s exciting and liberating. It means that we’ve grown up, we’ve entered a new phase and our horizons have expanded. This choice was never fueled by anyone’s dislike of PHP (though none of us really want to go back to PHP). We always saw this choice as trying to use better tools that eliminated some of the hurdles to development that existed for far too long.
There have been at least a couple of downsides to this choice. It’s really hard to hire quality Python developers in Nashville. Our expansion into our Portland office helped us recruit some great talent, but we still end up teaching Python to many of our new developers. That means a longer ramp-up time for new hires. We’ve also found we’re learning along with the Python community at large how to solve certain problems. So, sometimes we are unwillingly on the bleeding edge. As a developer, that’s not necessarily a terrible place to be. As a company, it can be challenging, but it’s not insurmountable.
Since we started this project in late 2008, Django has grown up a lot. Python has grown into a very popular language, especially in Silicon Valley. Several really good PHP frameworks have come to the fore in that time as well. New Python frameworks exist that we are exploring for work outside our main application. The options are much more numerous now than they were then, but we made a good choice in that place and time and have benefitted from our pursuit of the Zen of Python.
Expect a follow-up post in a few months once we complete the conversion, when I’ll share some of our successes and some of the solutions we create along the way. I hope to outline how this conversion will have us poised for some big changes in our ability to build new features and deliver them to our users.