Chris Graham

The structure of an immersive theatre app

You Me Bum Bum Train is an immersive theatre experience that I first became involved with in January 2016, during their previous show run. Details about the show are kept deliberately secret (an NDA covers most of it), but the general premise is that you as a patron (known in the show as a “passenger”) travel through a series of different scenarios back to back, each evoking a different experience or feeling. The show attracts an army of volunteers in the thousands and is a huge undertaking from a personnel standpoint and, as I found out, a technical one.

When I first got involved, I was mostly working backstage during the show and helping with the “get out” (the tearing down of the sets at the end of the show). During this time, I learned about the unique app that helps keep the show running smoothly. Unfortunately, this app was built on a cloud service that got shut down shortly after the last run ended… so I volunteered to help build a new version for the next show. This iteration ended up being used in the most recent 2024/2025 run, some 8 years later!

This post outlines some of the requirements of the software, the challenges those requirements brought, and what I ended up building to solve them. If you’d rather just read the key learnings, you can skip to Summary / Learnings.

If you’re more curious about what the show is, or how to get involved yourself, there are a few links at the bottom of this post.

Requirements

As mentioned, a lot of what goes on as part of the show is secret, but the general requirements I was working to looked something like this:

The last couple of requirements above were added in response to the surprise dropping of support for the previous version of the software when its service shut down, and to the issues caused by it being hosted on the internet and so being prone to breaks in service. Everything else was an existing requirement.

Development Process

For much of the project I was the sole developer of the software, although I had a few other people supporting me by answering my many questions about how things should work. This meant a lot of the software stack and approaches taken fell into one of two categories:

As a result, I fell into a development process and stack similar to the work I do at Global, although because the app was developed entirely in my free time the timescale was very much stretched, with work generally happening in sessions of an hour or so a week, plus the occasional intense coding session when I had some time off and no plans.

The app also evolved quite a lot over the 8 years, going through fairly major changes as patterns were tried and failed, or new libraries and frameworks came about.

Version 1

The first version of the software looked something like this by the end:

The system ended up using an event model to track everything going on within the show, and a state machine, of sorts, to handle the transitions between states based on those events. Each event was sent from the React app to the backend via the WebSocket following a user action; the backend then recorded it in the database before relaying it out to the other clients.

Using this approach meant that everything reacted pretty much instantly to any event and was fairly resilient. If an app ever needed to be refreshed or restarted, the APIs provided the means to rebuild the state by replaying the events.
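
To make the pattern concrete, here’s a rough sketch of it in TypeScript. The event names and state shape are made up for illustration (the real ones are under the NDA), but the mechanics are the same: a pure function folds events into state, and rebuilding after a restart is just a reduce over the stored history.

```typescript
// Illustrative only: the real event and state shapes are covered by the NDA.
type ShowEvent =
  | { type: "passengerBoarded"; passengerId: string; at: number }
  | { type: "sceneCompleted"; passengerId: string; sceneId: string; at: number };

interface ShowState {
  passengers: Record<string, { currentScene: string | null }>;
}

const initialState: ShowState = { passengers: {} };

// The "state machine, of sorts": a pure reducer from (state, event) to state.
function reduce(state: ShowState, event: ShowEvent): ShowState {
  switch (event.type) {
    case "passengerBoarded":
      return {
        passengers: {
          ...state.passengers,
          [event.passengerId]: { currentScene: null },
        },
      };
    case "sceneCompleted":
      return {
        passengers: {
          ...state.passengers,
          [event.passengerId]: { currentScene: event.sceneId },
        },
      };
  }
}

// Rebuilding state after a refresh is just replaying the stored events.
function replay(events: ShowEvent[]): ShowState {
  return events.reduce(reduce, initialState);
}
```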

Offline capability was provided by each app having its own version of the state machine and the ability to update it locally from its own events, so it would behave as though everything was changing as normal, just without persisting to the backend. On reconnecting, the events would be forwarded in bulk to sync back up.
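
Continuing the sketch above, that offline behaviour boils down to applying events to the local state machine immediately and buffering them until the connection comes back (the reconnect hook here is an assumption, not the app’s actual API):

```typescript
// Sketch: apply events locally straight away, queue them while offline,
// and flush the queue in bulk once the WebSocket reconnects.
const pending: ShowEvent[] = [];
let online = false;
let localState = initialState;

function dispatch(event: ShowEvent, socket: WebSocket) {
  localState = reduce(localState, event); // the UI reacts immediately
  if (online) {
    socket.send(JSON.stringify(event));
  } else {
    pending.push(event); // persisting to the backend is deferred
  }
}

function onReconnect(socket: WebSocket) {
  online = true;
  // Forward the buffered events in bulk to sync back up.
  for (const event of pending.splice(0)) {
    socket.send(JSON.stringify(event));
  }
}
```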

Problems

While the setup above satisfied all the requirements, a few issues started to show up when it was put to the test:

State Differences

Because the app and the backend both had a state machine, but written in different languages (Python vs. JavaScript), subtle differences in the agreed state would sometimes sneak in due to things like slightly different timestamp handling, or just differences in language behaviour.

This was not a deal breaker, but it was a big enough issue to get spotted on many occasions.
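
One way to sidestep this class of bug (a sketch of the general idea, not what the app did at the time) is to normalise timestamps to epoch milliseconds at the boundary, so both sides compare plain numbers rather than each parsing date strings by their own rules:

```typescript
// Sketch: keep timestamps as epoch milliseconds everywhere so no state
// machine ever has to parse a date string with language-specific rules.
function toEpochMillis(value: string | number): number {
  if (typeof value === "number") return value;
  const millis = Date.parse(value);
  if (Number.isNaN(millis)) {
    throw new Error(`Unparseable timestamp: ${value}`);
  }
  return millis;
}
```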

WebSocket Reliability

Django Channels was fairly new to me, being one of the pieces of tech I’d not used at Global and had wanted to try. It turned out that this inexperience made for a less reliable setup than I’d previously had with NodeJS sockets. That said, I’m sure Django Channels would have been up to the task if I’d spent more time understanding how to use it properly.

App Separation Issues

With several screens for different user roles sharing one app, there were some separation issues, both with state for the different screens sharing a central Redux setup, and from a plain “who should be able to see what” standpoint.

Next Steps

These three main issues led me on to building a second major version, mostly from scratch, with new requirements added:

I also had a few other upgrades in mind:

Version 2

The second and (currently) final version of the software became this:

The event model was solid, so that was kept, along with a single version of the state machine, now written in TypeScript, which reduced the Django/Python side to a simple API wrapper around the database.

The new BFF middleman acted as the source of truth for the state machine, and also kept a snapshot of the current state for each of the different app roles, so they could instantly restore from the one source rather than replaying events through their own state machine.
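
A stripped-down sketch of that snapshot behaviour, reusing the reducer from the earlier sketch and the ws package (the message shapes are illustrative, and the real BFF kept a snapshot per app role rather than the single global one shown here):

```typescript
import { WebSocketServer, WebSocket } from "ws";

// Sketch of the BFF: it owns the state machine, and a newly connected
// client gets a snapshot of current state instead of an event replay.
const wss = new WebSocketServer({ port: 8080 });
let state = initialState;

wss.on("connection", (ws) => {
  // Instant restore: one snapshot message, no replay needed client-side.
  ws.send(JSON.stringify({ type: "snapshot", state }));

  ws.on("message", (data) => {
    const event = JSON.parse(data.toString()) as ShowEvent;
    state = reduce(state, event); // one source of truth for transitions
    // Relay the event to every other connected client.
    for (const client of wss.clients) {
      if (client !== ws && client.readyState === WebSocket.OPEN) {
        client.send(JSON.stringify(event));
      }
    }
  });
});
```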

The React apps still communicated via WebSockets, but with a NodeJS server now in the mix it could be the more familiar JavaScript WebSockets on both ends.

To maintain the offline capability, the apps still ran their own state machine, but because it came from a shared library the code was identical on both sides, so the desync issues from Version 1 didn’t manifest.
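
The fix was structural rather than clever: one shared module exports the state machine, and every runtime imports it. Something like this, with a hypothetical package layout (shapes simplified; see the earlier reducer sketch for a fuller version):

```typescript
// packages/state-machine/index.ts — the single shared implementation.
export type Event = { type: string; at: number };
export type State = { lastEventAt: number | null };

export const initialState: State = { lastEventAt: null };

export function reduce(state: State, event: Event): State {
  return { lastEventAt: event.at };
}
```

Both the BFF and the React apps import `reduce` from this one package, so identical events are guaranteed to produce identical state everywhere.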

Summary / Learnings

Having gone through two major versions of the software, and having now had it operate successfully through an entire run of the show lasting approximately six months, these have been my major learnings:

Don’t Duplicate Important Logic

Having the state machine in two different places was a major pain, and I’m not sure what I was thinking trying to make it work seamlessly. If you need to split logic across different places, try to at least keep it in the same language, so that potential differences like comparisons and timestamp parsing don’t trip you up.

Keeping Things In Sync Is Really Hard

More than just the state machine language split issues, keeping everything in sync was a real challenge, and there are still some smaller bugs that occur now and again due to the precise ordering and arrival of events under weird conditions. That being said, I’m not sure what I could have done better here without over-complicating things.

Type Safety Is Really Useful

Having TypeScript rather than plain JavaScript in the second version highlighted all kinds of issues that could have become bugs in the code. In the BFF especially, which needed to be really stable, avoiding issues due to missing or null data probably saved countless hours.
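
A tiny illustration of the kind of thing the compiler catches (with strictNullChecks on; the shapes are again made up):

```typescript
interface Passenger {
  currentScene: string | null;
}

function sceneLabel(passenger: Passenger): string {
  // Without this check, TypeScript flags the dereference below as
  // possibly null at compile time; plain JavaScript would only fail
  // at runtime, mid-show.
  if (passenger.currentScene === null) {
    return "Not yet in a scene";
  }
  return `Scene ${passenger.currentScene.toUpperCase()}`;
}
```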

Know What You’re Getting Yourself Into

This was no small project, and what I thought would be a fun little thing to work on turned into a years-long project that required me to push some other projects aside. I’d still do it again, as I learnt a lot and tackled all sorts of interesting challenges along the way. In hindsight, I should definitely have thought a bit more about what the scope of the project might be before committing.

The website of the You Me Bum Bum Train show itself

A review from The Independent of the 2024/2025 show

An interview from The Guardian with the show’s creators