AI can't read your mind…but iteration is expensive
The challenge of working with large language models on complex projects.
AI capabilities are rapidly improving. It's difficult to predict exactly how good AI will become, or on what timeframe. Regardless of how sophisticated AI becomes, one fundamental challenge persists: communication. Despite rapid AI progress, effectively communicating with LLMs remains a significant and under-appreciated problem that will arguably hold back the value of AI, no matter how good it becomes.
This challenge isn't unique to AI collaboration. Communication becomes a bottleneck whenever we work on complex projects with multiple "intelligent" actors involved (i.e. people or AI). Translating our thoughts with perfect fidelity is fundamentally impossible short of some hypothetical mind-reading technology. Even then, our thoughts are often more superficial and inconsistent than we realize.
Every attempt to convey a complex idea inevitably loses something in translation. Whether communicating with humans, computers, or AI agents, perfectly transferring your mental model to another actor is an unrealistic expectation. The recipient's inherent biases and assumptions, combined with ambiguities in your explanation, guarantee some level of misinterpretation.
With AI systems, these biases and assumptions often stem directly from their training data. If you have experimented with prompt engineering you have likely encountered this limitation firsthand.
When you describe a task for AI to complete, the system makes assumptions about your instructions and how to implement them based on patterns in its training data. For instance, when asked to write tests for JavaScript code, AI systems will often default to Jest, a popular testing framework heavily represented in training datasets, even if that constraint is never stated. This is usually a good thing, as it represents shared context between you and the AI, but if that context is not appropriate, the assumption will cause problems.
Even when explicitly instructed not to use Jest, AI models frequently revert to outputs that include Jest. The statistical weight of Jest in the training data effectively overrides your specific instructions, demonstrating how deeply ingrained these patterns become.
This pattern repeats across all kinds of communication. It's practically impossible to provide sufficiently detailed specifications for even moderately complex ideas. And since the AI's assumptions are not visible to you, you cannot easily predict how the AI will interpret your request.
Even if you could somehow painstakingly articulate every necessary detail, the recipient must also perfectly process and integrate all that information. At some point the level of detail becomes so great that it cannot all be held in memory at once. Even if that were not a problem, specifying anything at this level of detail while navigating unknown assumptions is practically impossible.
This creates a significant obstacle whenever we collaborate on complex projects. Communication becomes a bottleneck filled with hidden challenges that only become apparent after a misstep reveals a misunderstanding.
There are two primary approaches we can use to tackle this challenge: iteration and tools.
Despite our wish to simply describe a task to AI and have it execute flawlessly without further guidance, this rarely succeeds for complex tasks. As complexity increases, so does the likelihood of AI diverging from our intended path. The longer we wait to validate results, the further off-track the work becomes.
Iteration provides tremendous value in managing this risk. By breaking tasks into smaller components and validating results after each stage, we can ensure the AI remains aligned with our goals.
However, this approach comes at a cost. You must invest time validating the AI's output, which can undermine the efficiency gains you were seeking. If validation takes as much time as performing the original task yourself, the AI adds no value. In fact, the process can become more expensive once you account for both the AI costs and your own time.
Now, it is possible that AI could become so powerful and inexpensive that simply giving it vague instructions and letting it try to accomplish a complex task is still worthwhile. In such scenarios, even mostly incorrect outputs might deliver enough value to justify the attempt, especially if the AI occasionally produces excellent results quickly and cheaply.
This scenario seems unlikely, however. Historically, more powerful AI models have commanded higher prices. Expecting dramatically more capable AI at lower costs would require simultaneous breakthroughs in technology, business models, and operational efficiency. This is possible, but improbable, in the near term.
Even with hypothetical ultra-powerful, ultra-affordable AI, the communication challenge should not be underestimated. Complex projects like software development involve countless potential misunderstandings that compound over time, making quality outcomes from casual instructions highly unlikely.
This is where tooling becomes crucial. Imagine you're fortunate enough to receive an AI-generated solution that's 80% aligned with your vision. The challenge then becomes articulating how to refine that complex result to match your ideal more closely.
Trying to discuss complex ideas is very difficult if you don’t have ways to isolate specific aspects of that complex idea so that both parties know what exactly is being discussed. See also: Software Development with AI: Isolation and Abstraction.
Imagine trying to edit a book without being able to refer to a chapter, page, or sentence in that book. It would be extremely difficult to avoid further miscommunications, each producing work that must then be fixed or discarded.
The right tools dramatically simplify this process. When a tool enables navigating complex systems to isolate specific components for collaboration, you substantially reduce the complexity of the interaction. Both actors can focus on a smaller amount of context and discuss it more easily.
More importantly, tools can provide interfaces for iterating on complex systems beyond text-based communication. For user interface design, as an example, you need visual tools that display the actual design rather than just text descriptions. You need to identify specific elements (isolation), and ideally make quick, verifiable edits to that design.
Collaborating on complex projects like software applications will never be trivial. Even with hypothetical mind-reading technology that could extract your vision and execute it, you would likely evolve your thinking as you experienced the result and gathered feedback.
For instance, you might overlook edge cases or user experience variations. This is particularly likely when building complex applications with numerous potential states. Complex projects inevitably require frequent iteration, whether you are collaborating with humans, AI, or both.
We should focus significant effort on creating tools that allow us to easily isolate specific parts of a complex project, see and interact with those isolated parts more easily, and make changes to those isolated parts that can be easily validated. While this won't eliminate iteration costs entirely, it will significantly reduce them, making complex collaborations with both people and AI substantially more effective.
This is why we’re building CodeYam. We’re creating a tool that deconstructs software projects down to individual functions. We demonstrate how these functions operate by testing them with various data scenarios, capturing results as either data outputs or, for front-end functions like React components, as screenshots or interactive components on a simple static website. This approach simplifies validation of, collaboration around, and AI-assisted modifications to complex software projects.
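The core idea can be sketched in a few lines. This is an illustrative sketch only, not CodeYam's actual API: the function, scenario names, and data are all invented. It shows the general pattern of isolating a single function and exercising it against several data scenarios, capturing one result per scenario for a human or AI to validate:

```javascript
// Hypothetical function isolated from a larger project (invented for illustration).
function formatPrice(amount, currency) {
  if (typeof amount !== 'number' || Number.isNaN(amount)) return 'N/A';
  return new Intl.NumberFormat('en-US', { style: 'currency', currency }).format(amount);
}

// Data scenarios: each one captures a distinct state worth validating.
const scenarios = [
  { name: 'typical purchase', args: [19.99, 'USD'] },
  { name: 'zero amount', args: [0, 'EUR'] },
  { name: 'invalid input', args: [NaN, 'USD'] },
];

// Run every scenario and record the output alongside its name, so each
// result can be reviewed in isolation rather than buried in a full app.
const results = scenarios.map(({ name, args }) => ({
  name,
  output: formatPrice(...args),
}));

console.log(results);
```

Because each scenario is named and its output captured, a reviewer can spot a regression in one state (say, the invalid-input case) without re-exercising the whole application, which is the isolation-and-validation loop described above.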
CodeYam helps software teams increase confidence in code changes by making it easier to discover if a change is having the intended impact, or introducing bugs or side effects, via software simulations. These simulations provide test coverage, help teams understand the state of and any changes to their application, and can be used for documentation, demos, and collaboration. To learn more, please contact us or join the waitlist.