Finding a way to handle a user’s session consistently as they move from the PHP side of the app to the Python side and back was a challenge during our conversion process.
We knew that the authentication and authorization tools we had built in PHP would probably be among the last things we changed during the conversion. Authentication is a significant thing to shift in a running application, and it seemed prudent to wait until a larger share of the app was in Python before changing it. At that point, we could more easily leverage things like the Django user and permissions system.
One of the decisions we made about our PHP-to-Python conversion project was that we didn’t want to try to rebuild the app completely in Python and then one day flip the switch to activate the Python version and deactivate the PHP version. Instead, we wanted to release the Python parts as we went and have them live side-by-side. To this end, we needed a way for authenticated users to move between the two apps seamlessly. That required sharing the authenticated session data easily.
The original PHP app used file-based session storage. I know, I know, but that’s the way it was. We knew that needed to change to something that was more easily accessible to the Python app. We chose to use memcached to store our sessions, along with a database backend to which we could fail over should memcached not be available. We might also fail over if the site was churning sessions so quickly as to see memcached start dropping active sessions out of memory, though that’s never come close to happening. We don’t have that kind of traffic profile.
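The cache-first, database-fallback idea described above can be sketched in a few lines of Python. This is only an illustration, not the code we run: `DictBackend`, `FailoverSessionStore`, and `SessionBackendUnavailable` are hypothetical names, and the in-memory dictionaries stand in for memcached and the database.

```python
class SessionBackendUnavailable(Exception):
    """Raised by a backend that cannot be reached (hypothetical)."""


class DictBackend:
    """In-memory stand-in for memcached or a database table."""

    def __init__(self, available=True):
        self.data = {}
        self.available = available

    def get(self, key):
        if not self.available:
            raise SessionBackendUnavailable()
        return self.data.get(key)

    def set(self, key, value):
        if not self.available:
            raise SessionBackendUnavailable()
        self.data[key] = value


class FailoverSessionStore:
    """Write to both backends; read from the cache first, then the database."""

    def __init__(self, cache, database):
        self.cache = cache
        self.database = database

    def write(self, session_id, payload):
        # The database copy is the durable one, so write it first.
        self.database.set(session_id, payload)
        try:
            self.cache.set(session_id, payload)
        except SessionBackendUnavailable:
            pass  # The cache is an optimisation; losing it is not fatal.

    def read(self, session_id):
        try:
            payload = self.cache.get(session_id)
        except SessionBackendUnavailable:
            payload = None
        if payload is not None:
            return payload
        # Cache miss (evicted entry) or cache outage: use the database copy.
        payload = self.database.get(session_id)
        if payload is not None:
            try:
                self.cache.set(session_id, payload)  # Repopulate the cache.
            except SessionBackendUnavailable:
                pass
        return payload
```

With this shape, an evicted or unreachable cache costs one extra database read rather than a logged-out user.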
We implemented this new session storage backend in PHP by writing a custom CachedSessionHandler class whose methods we registered as callbacks through PHP’s built-in session_set_save_handler function. We found some side benefits from doing this, which included being able to correctly encode and decode UTF-8 data that might need to be stored in the session and being able to build in the database backup capacity I mentioned.
In practice, registering the handler is as easy as having code like the below run early in every request, certainly before session_start() is called.
$session = new CachedSessionHandler();
session_set_save_handler(
    array(&$session, 'open'),
    array(&$session, 'close'),
    array(&$session, 'read'),
    array(&$session, 'write'),
    array(&$session, 'destroy'),
    array(&$session, 'gc')
);
With this new memcached/database session storage, we set about figuring out what data would be stored in that session and in what format. PHP isn’t terribly friendly about letting you modify the way it serializes data into its session, so the challenge was to have Python be able to read and write the PHP serialization. To do that, we used Scott Hurring’s excellent classes that allow Python to read/write PHP’s serialization.
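To make the format concrete, here is a minimal sketch of a decoder for PHP's default session serialization, where each entry is written as `name|<serialized value>`. It is not Scott Hurring's code and covers only nulls, integers, booleans, floats, strings, and arrays. The function names are ours for illustration, it assumes the default `session.serialize_handler = php`, and it raises on objects rather than skipping them.

```python
def _parse_value(data: bytes, pos: int):
    """Parse one PHP-serialized value at pos; return (value, next_pos)."""
    kind = data[pos:pos + 1]
    if kind == b"N":                       # N;
        return None, pos + 2
    if kind in (b"i", b"b", b"d"):         # i:42;  b:1;  d:1.5;
        end = data.index(b";", pos)
        raw = data[pos + 2:end].decode("ascii")
        if kind == b"i":
            return int(raw), end + 1
        if kind == b"b":
            return raw == "1", end + 1
        return float(raw), end + 1
    if kind == b"s":                       # s:<byte length>:"<bytes>";
        colon = data.index(b":", pos + 2)
        length = int(data[pos + 2:colon])
        start = colon + 2                  # skip the colon and opening quote
        text = data[start:start + length].decode("utf-8")
        return text, start + length + 2    # skip closing quote and semicolon
    if kind == b"a":                       # a:<count>:{<key><value>...}
        colon = data.index(b":", pos + 2)
        count = int(data[pos + 2:colon])
        pos = colon + 2                    # skip the colon and opening brace
        result = {}
        for _ in range(count):
            key, pos = _parse_value(data, pos)
            value, pos = _parse_value(data, pos)
            result[key] = value
        return result, pos + 1             # skip the closing brace
    raise ValueError(f"unsupported type {kind!r} at offset {pos}")


def session_decode(data: bytes) -> dict:
    """Decode a PHP session payload of the form name|value name|value ..."""
    session = {}
    pos = 0
    while pos < len(data):
        bar = data.index(b"|", pos)
        name = data[pos:bar].decode("utf-8")
        value, pos = _parse_value(data, bar + 1)
        session[name] = value
    return session
```

PHP arrays come back as Python dictionaries here, including list-like arrays, whose keys are the integers 0, 1, 2, and so on.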
These classes did 99 percent of the work, but we added a bit of code to handle things like objects, which don’t have a clear serialization algorithm and simply need to be skipped. The existence of those objects in our session was a bug that we eventually fixed anyway.
We use these serialization classes by overriding Django’s session handling in our settings.py:
SESSION_COOKIE_NAME = 'PHPSESSID'
SESSION_ENGINE = 'cached_db_session'
cached_db_session contains a SessionStore class in Python similar to the PHP CachedSessionHandler above, but it uses the PHPSerialize and PHPUnserialize classes to do things like this:
session_dictionary = PHPUnserialize().session_decode(content.encode('utf-8'))
Because we’re using completely custom session handling on both sides, you’ll see we need to account for the differences in the way encoding is handled in PHP and Python.
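One place this kind of difference shows up is string length: PHP's serialization records the length of a string in bytes, while Python's `len()` on a `str` counts characters. The helper below is a hypothetical illustration of that one detail, not part of our session code, and it assumes the data is stored as UTF-8.

```python
def php_serialize_string(text: str) -> bytes:
    """Serialize a Python str the way PHP's serialize() writes a string."""
    raw = text.encode("utf-8")
    # The length prefix must be the UTF-8 byte count, not len(text).
    return b's:%d:"%s";' % (len(raw), raw)


name = "José"
print(len(name))                   # 4 characters
print(len(name.encode("utf-8")))   # 5 bytes
print(php_serialize_string(name))  # b's:5:"Jos\xc3\xa9";'
```

Using the character count instead would produce a length prefix that PHP cannot match against the stored bytes, so the session would fail to unserialize on the PHP side.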
We’ll eventually dump the PHP authentication handling completely and modify the cached_db_session.py code to stop worrying about PHP serialization and just use Django’s default serialization. We’re also looking at moving sessions into something slightly less volatile than memcached, perhaps Redis. We’d more than likely keep the database backup of those sessions unless we felt really comfortable with our Redis server redundancy model.
With those custom session handling classes in place and the excellent work of Scott Hurring’s PHPSerialize and PHPUnserialize classes, we were able not only to have PHP and Python share session information, but also to build a faster and more reliable session handling system as part of our move to Python. This has been a good, solid transitional step for us, but we look forward to using Django’s user and permission system and its attendant session handling.