
Shotblockr was a prototype exploring how 3D tools and generative AI can work together to streamline the storyboarding process.
I developed it as part of the Mondovision animated film project, where I was struggling to direct how the AI should generate my videos from character references and scene images. I was looking for a stripped-down version of Blender — something anyone could use to set a scene — because, in the end, spatial control felt more natural and precise than prompting.
Along the way, I ran into safeguards embedded in both commercial and open-source solutions that prevented characters in a scene from being replaced with real people — even in clearly fictional, not particularly realistic 3D contexts. The AI was also not very good at reproducing spatial positions precisely.
This raised an interesting question: are these tools actually optimized to follow a user’s detailed instructions, or to produce satisfying results by composing those instructions with a much broader set of embedded references and constraints?
These limitations could likely have been mitigated with more advanced open-source or ControlNet-like approaches. But I was able to finish the film and ultimately left the project where it stood.
It was my second vibe-coding project, developed with Replit, using BabylonJS for the 3D layer and the Flux Kontext image generation model.
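The prototype's actual code isn't reproduced here, but the core idea, turning a scene blocked out in 3D into spatial conditioning for an image model, can be sketched roughly. The types and function names below are hypothetical, not Shotblockr's real API:

```typescript
// Hypothetical sketch: serialize a blocked-out 3D scene (as one might set up
// in BabylonJS) into a structured text prompt for an image generation model.
// All names here (ShotDescription, describeShot, etc.) are illustrative.
type Vec3 = { x: number; y: number; z: number };

interface SceneEntity {
  name: string;   // a character reference label
  position: Vec3; // world-space position set in the 3D editor
}

interface ShotDescription {
  camera: { position: Vec3; target: Vec3 };
  entities: SceneEntity[];
}

// Turn the spatial layout into a textual conditioning string,
// so the layout comes from the 3D scene rather than from hand-written prompts.
function describeShot(shot: ShotDescription): string {
  const cam = shot.camera;
  const entityLines = shot.entities.map(
    (e) => `${e.name} at (${e.position.x}, ${e.position.y}, ${e.position.z})`
  );
  return [
    `Camera at (${cam.position.x}, ${cam.position.y}, ${cam.position.z})` +
      ` looking at (${cam.target.x}, ${cam.target.y}, ${cam.target.z}).`,
    ...entityLines,
  ].join(" ");
}
```

In a setup like this, the generated description (or a richer structured equivalent) would accompany the character reference images sent to the image model, which is what makes moving objects in the viewport feel more precise than rewording a prompt.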
July 2025