Visual Metaphors as the New Dynamic Storytelling Interface for Programming
Team member names
Warren Koch (Senior Programmer and Artist)
REDACTED (Senior Programmer and Engineer)
Short summary
Thanks to progress in Visual Programming, text-based code now maps cleanly onto visual graphs of function nodes. These graphs are a significant improvement, but we think there’s yet another tier. With AI, we now have the ability to map code to dynamically generated stories with interactive controls (example shown above). These could be animated, bespoke, and designed to show a metaphor of what’s going on under the hood at any level of detail, from multiple angles: a personalized interface tuning itself to best aid your understanding.
While this might seem like a huge leap, the fundamental operations are no longer complicated. AI makes this kind of bespoke recontextualization cheap, fast, and easy for any problem; the only challenge ahead is finding stable-enough tiers of quality we can reliably hit, so this can be relied upon as a basic interface. The exploration takes the form of trying out different patterns of interaction to find the best ways to represent data and processes: a conversation between a user and an AI. A protocol.
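To make that conversation format concrete, here is a minimal sketch of the protocol loop, assuming nothing beyond a generic LLM call. The `MetaphorSession` class and the `ask_model` stand-in are our own illustrative names, not an existing API:

```python
# Minimal sketch of the interaction protocol: a user and an AI negotiate a
# visual metaphor for a piece of code. `ask_model` is a hypothetical stand-in
# for any LLM call; everything else is plain Python.
from dataclasses import dataclass, field

@dataclass
class MetaphorSession:
    code: str                      # the program fragment being visualized
    detail: str = "overview"       # level of detail requested by the user
    history: list = field(default_factory=list)  # past proposals and feedback

    def propose(self, ask_model) -> str:
        """Ask the model for a metaphor description at the current detail level."""
        prompt = (
            f"Describe a visual metaphor ({self.detail} detail) for:\n{self.code}\n"
            f"Prior feedback: {self.history}"
        )
        proposal = ask_model(prompt)
        self.history.append(proposal)
        return proposal

    def refine(self, feedback: str) -> None:
        """Record user feedback so the next proposal tunes itself to them."""
        self.history.append(f"user: {feedback}")

# Usage: loop until the picture matches the user's mental model.
if __name__ == "__main__":
    fake_model = lambda p: f"[model answer for: {p[:40]}...]"  # placeholder
    session = MetaphorSession(code="def add(a, b): return a + b")
    print(session.propose(fake_model))
    session.refine("show the inputs as physical objects")
    session.detail = "step-by-step"
    print(session.propose(fake_model))
```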
What is the existing target protocol you are hoping to improve or enhance?
- We hope to improve Visual Programming to be much more lively, intuitive at a glance, natural, and game-like - with downstream improvements to text-based programming, operating system applications, and the way we interact with our computers (and each other!) on any technical or complex task
- Intuition: We’ll use AI tech to dynamically combine the stylish design patterns of video game user interfaces with typically dull, dreary visual programming interfaces, to hopefully make something useful AND visually intuitive
What is the core idea or insight about potential improvement you want to pursue?
- Visual and tactile programming have proven to be some of the most accessible and enjoyable ways to interact with computers, especially for less technically-fluent people
- Video games have demonstrated ways to preserve complex operation logic behind intuitive and fun visual metaphors for almost any kind of task
- Interfaces like Scratch, Unreal Engine, and ComfyUI have done the heavy lifting of mapping textual representations of programs to visual ones (a node-graph metaphor), but it’s time to take this further
- Using new AI capabilities to generate bespoke images in seconds, we can dynamically adapt the visual representations of objects and processes to metaphorically represent the underlying data (see the sketch after this list)
- This can be adaptive to any scale of detail or fidelity, and tunable to the user’s preferences and the wider context of the system being represented
- Understanding and mapping out this process connects programming methods to real world visual protocols and vice-versa, opening countless avenues for new insights between the two
- Visualizations become Actions. Actions become Visualizations
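As a toy illustration of the bullets above, the sketch below maps each operation in a pipeline to a candidate visual metaphor and picks the best fit for a theme. The candidate table and keyword-overlap scoring are placeholder assumptions; a real system would ask an LLM to score the fit:

```python
# Sketch of the core mapping idea: each operation gets a pool of candidate
# visual metaphors, and we pick the one closest to the user's chosen theme.
# All names and candidates here are illustrative.
CANDIDATES = {
    "load_data":   ["opening a crate", "drawing water from a well", "harvesting a field"],
    "transform":   ["milling grain", "forging metal", "cooking ingredients"],
    "save_output": ["shelving a jar", "shipping a parcel", "planting a sapling"],
}

def pick_metaphor(op: str, theme: str) -> str:
    """Choose the candidate metaphor sharing the most words with the theme.

    Simple keyword overlap stands in for LLM-scored fit.
    """
    theme_words = set(theme.lower().split())
    def score(candidate: str) -> int:
        return len(theme_words & set(candidate.lower().split()))
    return max(CANDIDATES.get(op, ["a generic machine"]), key=score)

# Example: a farming theme pulls the pipeline toward field imagery.
for op in ["load_data", "transform", "save_output"]:
    print(op, "->", pick_metaphor(op, "cozy farming field harvest"))
```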
What is your discovery methodology for investigating the current state of the target protocol?
- Examine existing workflows for the popular visual programming application ComfyUI (e.g. https://comfyworkflows.com/) and use those as primary “easy” examples for our mapping process (a workflow-survey sketch follows this list)
- Examine existing popular github codebases/libraries in text form and apply this method to them (“hard” mode)
- Examine existing OS applications and unexpected real world examples of complex systems and apply this method to them too (“wild” mode)
- Aim for small prototype niche demos, then general (80%) applicability, then rigorous/exhaustive applicability
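For “easy” mode, the survey tooling can start very small. Here is a sketch that loads a ComfyUI workflow JSON and tallies its node types, the first candidates for metaphor design. ComfyUI exports two JSON shapes; the field names used here ("nodes"/"type" for UI exports, "class_type" for the API format) are assumptions to verify against real files:

```python
# Sketch for the "easy mode" survey: load a ComfyUI workflow JSON and count
# the node types we need metaphors for.
import json
from collections import Counter

def node_types(path: str) -> Counter:
    with open(path) as f:
        data = json.load(f)
    if isinstance(data, dict) and "nodes" in data:        # UI export format
        types = [n.get("type", "?") for n in data["nodes"]]
    else:                                                 # API format, keyed by node id
        types = [n.get("class_type", "?") for n in data.values()]
    return Counter(types)

# Usage: the most common node types are the first metaphors worth designing.
# print(node_types("workflow.json").most_common(10))
```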
In what form will you prototype your improvement idea?
- We aim to build a rough map of the most accessible and useful visual metaphors we can apply to various example processes. With the help of AI prompts, we may be able to make this exhaustive and auditable at a consistently high expected quality
- Using LLM prompt-based programming, we then build metaphor selection protocols in the form of questions and answers (sketched after this list), and, using generative image AIs, we build a series of intermediate visual representations before outputting a final image (see AI QR codes for examples of how we can dynamically shape such interfaces)
- We will tune this general method by applying it to a wide variety of example programs/applications/problems and attempt to produce various valid visual representations which try to balance accurate representation of the underlying data with visual and metaphorical elegance
- Deliverables will take the form of individual example demonstrations, limited-scope open source code extensions to ComfyUI, or a generally-applicable open source program
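A minimal sketch of the question-and-answer metaphor selection protocol described above, with `ask_llm` and `generate_image` as hypothetical stand-ins for whichever models we end up using; the question list is illustrative:

```python
# Sketch of the metaphor-selection protocol: a fixed question list is put to
# an LLM about the process being visualized, and the answers are assembled
# into an image-generation prompt.
QUESTIONS = [
    "What real-world activity moves data the same way this process does?",
    "What objects best represent the inputs and outputs?",
    "What visual style fits the user's stated preferences?",
]

def build_image_prompt(process_description: str, ask_llm) -> str:
    answers = [ask_llm(f"{q}\nProcess: {process_description}") for q in QUESTIONS]
    # Intermediate representation: the Q&A pairs, auditable before rendering.
    for q, a in zip(QUESTIONS, answers):
        print(f"Q: {q}\nA: {a}\n")
    return "Illustration of " + "; ".join(answers)

# Usage (with a placeholder model):
prompt = build_image_prompt(
    "resize and denoise an image batch",
    ask_llm=lambda q: f"[answer to: {q[:30]}...]",
)
# image = generate_image(prompt)   # hypothetical image-model call
```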
How will you field-test your improvement idea?
- Sharing example mappings with the SoP community for protocol-savvy feedback
- Friends and family demos for non-tech-savvy feedback
- Colleagues and friends for programming-savvy feedback
Who will be able to judge the quality of your output?
- Primary aim is to make complex systems understandable and intuitive to laymen. Thus, we hope to be judged by the public: on social media, on Reddit, and at Summer of Protocols. “That’s pretty cool” would be a good barometer.
- Coaxing the harder-core nerds might be more difficult; they’ve suffered so long glued to the bastions of eye strain and egotism that are complex textual git repos. It will take many more steps before they can be convinced of a softer future. One day we’ll top HackerNews though - you’ll see.
- We are mostly doing the work of condensing visual programming courses and methods into a generalized protocol for AIs to dynamically start whittling down complexity and represent any problem in a visual language
- Thus: anyone from a Visual Programming background would make a great judge
  - These folks (VPL - Visual Programming Lab)
  - Or these (Scratch)
  - Or any developers from the ComfyUI community!
- Let’s be honest though, the real judge will be GPT5-or-whatever analyzing our solutions with a UX designer hat. You guys just get the final product
How will you publish and evangelize your improvement idea?
- Release an open source plugin module for the popular AI programming application ComfyUI implementing dynamic metaphorical interface generation
- Release general open source code and documentation making steps towards generalized application of this method across various operating systems and applications
- Blog posts and articles with demonstration images and videos showing the mapping process and examples of the method, shared to social media
What is the success vision for your idea?
- Create a scalable general protocol for mapping any complex system to a set of visual metaphors and analogies
- Create a library showing the mapping options available for any given base process (exhaustively, if possible)
- Create working code demonstrating the method in a popular programming ecosystem
- Specific target: mock up or modify ComfyUI node interfaces into visual metaphors of the underlying node computation
- Specific bonus target: make ComfyUI interfaces fluidly tune themselves to pick the visual metaphor closest to the underlying data (sketched below). For example: a classic workflow prompting “picture of a cow in a field” would be dynamically redesigned so that the visual interface for each program node in the workflow uses a Stardew Valley-esque farming metaphor for each step in the processing logic, before producing the original program output (in this case, a cow picture). This entails live bespoke generation tuned to fit templating parameters and program scope - but we’re confident it’s quite doable. It just might take some work to reach high quality in the general case!
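A sketch of that bonus target’s control flow, assuming nothing about ComfyUI internals: we walk a workflow’s node list and ask a model to reskin each step in the chosen theme, while the underlying computation runs unchanged. The node names and the placeholder model are illustrative:

```python
# Sketch of the bonus target: reskin each node of a pipeline in a theme.
# The computation itself is untouched; only labels/imagery change.
WORKFLOW = ["LoadCheckpoint", "CLIPTextEncode", "KSampler", "VAEDecode", "SaveImage"]

def reskin(workflow, theme: str, ask_model):
    """Return a themed description per node, suitable as an icon/image prompt."""
    skins = {}
    for node in workflow:
        skins[node] = ask_model(
            f"In a {theme} setting, what everyday activity corresponds to the "
            f"'{node}' step of an image pipeline? Answer in five words."
        )
    return skins

# Example: prompting "picture of a cow in a field" with a farming theme.
fake = lambda q: "[themed description]"  # placeholder model
for node, skin in reskin(WORKFLOW, "Stardew Valley farm", fake).items():
    print(f"{node}: {skin}")
```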