OnBook
2025-02-25
A Multi-player Full-body Mixed-reality Rehearsal Tool
Inspiration
Our team started from our desired tech stack. We met during team formation because we all wanted to build a multiplayer, mixed reality app on either Quest 3 or Apple Vision Pro.
Thematically, we decided to use mixed reality technology to create a tool for theatre makers and live performers. Modern-day theatre productions are notoriously under-resourced and run on very tight schedules; performers usually get only one chance to work on the staging of a scene with their director and fellow actors before they are expected to have every detail memorized. We wanted to make it easy for performers to record the ideal version of a scene's staging immediately after working with their director, and to go back and review that staging individually throughout the production.
"On-Book" is a term borrowed from live theatre production. It refers to the practice of having someone, usually the assistant to the stage manager, following along the rehearsal while keeping track of the scripted lines and blocking notes.
What it does
OnBook allows theatre makers to record and play back their scenes in mixed reality. When recording, an actor wears a Quest 3 headset in passthrough mode and runs through a scene with their scene partner. On playback, the actor's recorded movement and audio are replayed on a 3D avatar alongside an avatar captured from their scene partner. For this iteration of OnBook, we focused on this core functionality along with basic playback controls.
In addition to the core features we fully developed, we designed features that would allow theatre makers to annotate their scene recordings, integrate digital props, re-stage their blocking based on the dimensions of a new venue, and capture their realistic expressions.
How we built it
- Colocation setup
- Avatar recording and playback/pause/resume
- Networking large data files
- Networking small data files
- Meta's SDK v71, Building Blocks, Colocation, Movement SDK, Photon Fusion
The core of our project is an Android APK built from a Unity 6 project. Within Unity, we used Meta's SDK v71, including the interaction examples, Building Blocks, and more from Meta's packages and samples.
Our Unity project also uses Photon Fusion for networking alongside Meta's colocation and shared anchor packages. We use the latest Colocation Discovery feature, which is only available in Meta's SDK v71 or later.
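For the "small data" networking in the list above, playback commands such as play, pause, and resume only need to reach every colocated headset at roughly the same moment. A Photon Fusion RPC on a NetworkBehaviour is one straightforward way to do that; the sketch below is a minimal illustration with hypothetical class and method names, not our exact scripts.

```csharp
using Fusion;
using UnityEngine;

// Minimal sketch: broadcast play/pause commands to every colocated headset
// over Photon Fusion. PlaybackSync and RPC_SetPlayback are illustrative names.
public class PlaybackSync : NetworkBehaviour
{
    // Any peer may send, and every peer (including the sender) receives, so
    // both headsets start or pause the recorded scene at the same timestamp.
    [Rpc(RpcSources.All, RpcTargets.All)]
    public void RPC_SetPlayback(bool playing, float startTime)
    {
        Debug.Log($"Playback {(playing ? "resumed" : "paused")} at t={startTime:F1}s");
        // Hand off to the local playback system here, e.g. seek the avatars
        // and audio to startTime, then play or pause.
    }
}
```

The heavier motion-capture and audio recordings are too large for messages like this, so they go through the web server described next.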
The Unity project sends motion capture data to and retrieves it from our web server, which is backed by Firebase. We store the motion capture data in Firebase for later access, and we store the actors' audio files on the server as well.
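As a rough picture of what gets sent, each recording can be reduced to a list of timestamped frames of per-bone transforms, serialized to JSON, and posted to the server. The sketch below is an illustration under simplifying assumptions: the /recordings endpoint, the frame types, and the RecordingUploader component are all hypothetical stand-ins for our actual data model.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;
using System.Text;
using UnityEngine;
using UnityEngine.Networking;

// Simplified shape of a recorded clip; the real data carries more detail.
[Serializable] public class BoneSample { public string bone; public Vector3 position; public Quaternion rotation; }
[Serializable] public class MocapFrame { public float time; public List<BoneSample> bones = new List<BoneSample>(); }
[Serializable] public class MocapClip  { public string sceneName; public List<MocapFrame> frames = new List<MocapFrame>(); }

public class RecordingUploader : MonoBehaviour
{
    // Hypothetical endpoint on the web server that forwards clips to Firebase.
    const string UploadUrl = "https://example.com/recordings";

    public IEnumerator Upload(MocapClip clip)
    {
        string json = JsonUtility.ToJson(clip);
        using (var request = new UnityWebRequest(UploadUrl, UnityWebRequest.kHttpVerbPOST))
        {
            request.uploadHandler = new UploadHandlerRaw(Encoding.UTF8.GetBytes(json));
            request.downloadHandler = new DownloadHandlerBuffer();
            request.SetRequestHeader("Content-Type", "application/json");
            yield return request.SendWebRequest();

            if (request.result != UnityWebRequest.Result.Success)
                Debug.LogError($"Upload failed: {request.error}");
        }
    }
}
```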
GitHub: https://github.com/srcJin/OnBook
Challenges we ran into
- Experimental Movement SDK capture and playback of data
- Multiplayer networking for colocation
- Putting it all together in such a short amount of time
Colocation requires certain permissions and a setup process to connect the app to a Meta organization and account, which was an initially confusing hurdle. We weren't 100% sure whether we needed to deploy to a release channel, but we did so anyway, because we believe you need at least one build on an alpha, beta, or release channel for colocation to work.
Colocation across two headsets can be difficult without the proper deployment process. Luckily, Meta's SDK and Building Blocks help ease that process.
Sometimes we couldn't tell whether colocation was working because we had not built an error pop-up, and when positions or rotations were slightly off, we weren't sure why. We suspect that some of our later demo areas had too many continuous black surfaces, which could have made it difficult for the headsets to scan common reference points.
Storing and sending bone transform positions and rotations can be very data-heavy. A humanoid body has a lot of bones, so we needed to figure out how to reduce the amount of data we store. One useful piece of advice for our team was to not send positions for every bone: for finger bones, for example, only the rotation is needed, because their positions are determined by the parent bone. Selective bone updates like this let us trim the motion-capture data we send and receive.
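As an illustration of that idea, the sketch below stores position plus rotation only for the hips and rotation only for every other humanoid bone, reading from a Unity humanoid Animator. The component and struct names are ours for illustration; our actual capture goes through Meta's Movement SDK, but the selective-update idea is the same.

```csharp
using System.Collections.Generic;
using UnityEngine;

// Sketch of selective bone capture: position + rotation for the hips,
// rotation only for every other humanoid bone (fingers, arms, legs, ...).
public class SelectiveBoneRecorder : MonoBehaviour
{
    [SerializeField] Animator animator;   // humanoid avatar driven by body tracking

    public struct BonePose
    {
        public HumanBodyBones bone;
        public Quaternion localRotation;
        public bool hasPosition;          // true only for the root (hips)
        public Vector3 position;
    }

    public List<BonePose> CaptureFrame()
    {
        var frame = new List<BonePose>();
        for (var bone = HumanBodyBones.Hips; bone < HumanBodyBones.LastBone; bone++)
        {
            Transform t = animator.GetBoneTransform(bone);
            if (t == null) continue;      // the rig may not map every humanoid bone

            bool isRoot = bone == HumanBodyBones.Hips;
            frame.Add(new BonePose
            {
                bone = bone,
                localRotation = t.localRotation,
                // Child bones inherit their position from the parent, so only
                // the hips need a position; every other bone is rotation-only.
                hasPosition = isRoot,
                position = isRoot ? t.position : Vector3.zero
            });
        }
        return frame;
    }
}
```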
Even though Meta's XR Simulator can simulate colocation with two instances of the simulator open, the camera placement can be unpredictable, so the only way we could be confident that things were working properly was to deploy to the headsets.
Meta's XR Simulator does support Apple Silicon Macs, but when running two simulator instances while iteratively changing Unity scripts, the simulator sometimes came back with slightly different camera positioning. Mac users occasionally had to quit and restart Unity.
Sometimes Meta's XR Simulator would keep drifting in the direction of a held WASD key if you held right-click while alt-tabbing to a different screen.
Accomplishments that we're proud of
- Concept design process
- Probably used more of Meta's SDKs than any other team at the hackathon
What we learned
- Time management for hackathons, especially this one. We were all first-time hackers.
- Combining different technologies is difficult in a time-sensitive hackathon.
What's next for OnBook
There's a lot of potential for OnBook to synchronize audio and motion capture better. The timing of the two streams should match, so that reviewing a rehearsal matches what happened in real life.
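One possible approach, which we have not implemented, is to treat the recorded audio as the master clock and sample the motion data at the audio's current playback time each frame, so the avatars can never drift away from the voice track. The IMocapPlayer interface below is a hypothetical seam for whatever component poses the avatar.

```csharp
using UnityEngine;

// Hypothetical interface for anything that can pose the recorded avatar at a
// given timestamp (e.g. by interpolating between stored frames).
public interface IMocapPlayer
{
    void ApplyPoseAtTime(float seconds);
}

// Sketch: drive motion playback from the audio clock so the two streams
// cannot drift apart.
public class SyncedPlayback : MonoBehaviour
{
    [SerializeField] AudioSource voiceTrack;   // the actor's recorded audio
    IMocapPlayer mocapPlayer;

    void Awake() => mocapPlayer = GetComponent<IMocapPlayer>();

    void Update()
    {
        if (voiceTrack == null || mocapPlayer == null || !voiceTrack.isPlaying) return;

        // AudioSource.time is the current playback position in seconds; using
        // it as the single source of truth keeps body motion locked to the voice.
        mocapPlayer.ApplyPoseAtTime(voiceTrack.time);
    }
}
```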
We could improve leg movement with more interpolation, and smooth out arm movement whenever hand tracking is lost.
The actors or director should be able to export their motion capture as animation files or videos for future use.
There should be a scrubbing tool for moving through the timeline of the rehearsal, which we did not get a chance to implement.
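A scrubber is mostly a seek function over the recorded frames: map a slider value in [0, 1] to a timestamp, find the two frames that bracket it, and blend between them. The sketch below assumes the simplified MocapFrame shape from the earlier upload example and is only an outline of how we might build it.

```csharp
using System.Collections.Generic;
using UnityEngine;

// Sketch of timeline scrubbing: given a normalized slider value, find the two
// recorded frames that bracket that time plus a blend factor for interpolation.
public static class TimelineScrubber
{
    public static (int earlier, int later, float blend) Seek(List<MocapFrame> frames, float normalized)
    {
        float duration = frames[frames.Count - 1].time;
        float target = Mathf.Clamp01(normalized) * duration;

        // Binary search for the first frame at or after the target time.
        int lo = 0, hi = frames.Count - 1;
        while (lo < hi)
        {
            int mid = (lo + hi) / 2;
            if (frames[mid].time < target) lo = mid + 1; else hi = mid;
        }

        int later = lo;
        int earlier = Mathf.Max(0, later - 1);
        float span = frames[later].time - frames[earlier].time;
        float blend = span > 0f ? (target - frames[earlier].time) / span : 0f;

        // Callers would Lerp positions and Slerp rotations between the two
        // frames, then hold that pose until the user releases the slider.
        return (earlier, later, blend);
    }
}
```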
We could add a prop system to enhance the experience for scenes that are heavily prop-dependent.
There could be more tools to fine-tune remote acting rehearsals, which we focused on less than co-located rehearsals.
We could gather more animation data by capturing facial movements, which we focused on less than body movement data.