
Behind the Scenes of AI-Generated Chats: Prompt Engineering

Summary

Cleo is an AI financial assistant app that offers spending insights, saving tips, budgeting help, and credit-score building through chat. It previously relied on intent classification and pre-written responses; it now leverages AI-generated chat for more engaging and dynamic conversations. I've been working on a set of internal tools that make it easier for Content Designers to create those chats.

Introduction

Chatbots have come a long way. They're no longer limited to scripted responses. Today, they can hold intelligent conversations thanks to large language models (LLMs).


But what most users see as a simple chat interface is actually powered by a complex system, with the true magic lying in prompt engineering. This is the key to making these conversations feel so natural.


In this article, I'll take you behind the scenes of Cleo's LLM-powered chat system, exploring the intricate process of prompt engineering that allowed us to elevate our chat quality.


The Pre-LLM Era: Static and Stiff Chats


Before LLMs became integral to our chat system, we relied on a set of predefined messages triggered by specific keywords. These messages, although functional, often failed to deliver the nuanced, human-like responses users expect.


The chat experience was rigid, occasionally sounding unnatural or even a bit "dumb." It was clear that our system needed an upgrade, one that could generate more dynamic and contextually appropriate responses.
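For a sense of how rigid that was, here is a minimal sketch of a keyword-triggered flow. It is purely illustrative, not Cleo's actual implementation, and the canned replies are invented: anything outside the scripted keywords falls back to a stock answer.

```python
# Minimal sketch of a keyword-triggered chat flow (illustrative, not Cleo's code).
CANNED_RESPONSES = {
    "budget": "Your weekly budget is looking good!",
    "spending": "Here's a breakdown of your recent spending.",
    "credit score": "Let's look at ways to build your credit score.",
}

def reply(user_message: str) -> str:
    text = user_message.lower()
    for keyword, response in CANNED_RESPONSES.items():
        if keyword in text:
            return response
    # Anything the keyword list doesn't cover gets a generic fallback.
    return "Sorry, I didn't quite get that. Could you rephrase?"

print(reply("How's my spending this month?"))  # matches "spending"
print(reply("Can I afford a holiday?"))        # falls through to the fallback
```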



The Game-Changer: Introducing LLMs


The introduction of LLMs marked a significant turning point. Because they can understand and generate text based on nuanced prompts, the quality of our chat responses improved massively. But this leap forward didn't happen overnight. It required us to build a robust tool that would allow our content designers to craft precise prompts: text or instruction sets that guide the LLM in generating the desired responses.



What Is Prompt Engineering?


At the core of LLM-driven chat systems is the concept of prompt engineering. A prompt is essentially a piece of text or a set of instructions given to the LLM to trigger a specific response or action.


Think of it as a conversation starter that tells the model what you want it to do. The effectiveness of the LLM's response is heavily dependent on the quality of the prompt, making prompt engineering a critical skill.
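As a rough illustration of what a prompt looks like in practice, here is a hedged sketch using the OpenAI Python SDK. The SDK, the model name, and the persona text are stand-ins chosen for illustration, not Cleo's actual stack or copy.

```python
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

# The prompt: instructions that steer how the model behaves and responds.
prompt = (
    "You are a friendly financial assistant. "
    "Answer in two short sentences, in a warm, slightly cheeky tone."
)

response = client.chat.completions.create(
    model="gpt-4o-mini",  # illustrative model name
    messages=[
        {"role": "system", "content": prompt},
        {"role": "user", "content": "How much did I spend on takeaway this week?"},
    ],
)

print(response.choices[0].message.content)
```

Changing even a few words in that instruction text can noticeably shift the tone and shape of the reply, which is why prompt quality matters so much.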


Learn more about Prompt Engineering here

Crafting a good prompt involves more than just writing instructions: it's about understanding the nuances of language, predicting the LLM's interpretation, and iterating based on feedback.


The better the prompt, the better the output, which directly impacts the user experience.



The Evolution of Our Prompt Engineering Tool


When we first started, our prompt engineering tool was rudimentary: a basic input field where content designers could write prompts, with minimal options for testing or iteration. To test a prompt, designers had to manually input it into the system, run tests across devices, and track the results in spreadsheets. This process was time-consuming, unsustainable, and not scalable.




Realizing the need for a more sophisticated solution, we embarked on a journey to refine our tool. The first step was conducting user research to understand the needs and pain points of our technical team. With minimal documentation available on designing for AI, we collaborated closely with our engineers to align on the goals of the tool.



Building a Scalable, User-Friendly Tool


Our goal was to create a tool that allowed for quick experimentation and easy iteration. We wanted content designers to be able to play around with prompts, test variants, and track changes, all within a controlled environment.


The first iteration of our upgraded tool included an input field with a side panel for adjusting parameters like temperature, which controls how random or predictable the LLM's responses are.
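To make the temperature control concrete, here is a small sketch that runs the same prompt at two temperature settings. It again uses the OpenAI SDK and an illustrative model name rather than the internal tool itself; lower values give more predictable wording, higher values more varied phrasing.

```python
from openai import OpenAI

client = OpenAI()

prompt = "Write a one-sentence nudge encouraging the user to check their weekly budget."

for temperature in (0.2, 1.0):
    # Lower temperature -> more deterministic wording; higher -> more varied output.
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # illustrative model name
        messages=[{"role": "user", "content": prompt}],
        temperature=temperature,
    )
    print(f"temperature={temperature}: {response.choices[0].message.content}")
```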




But as we continued to iterate, it became clear that we needed more advanced features. We added an output section and a preview panel where designers could see how their prompts would translate in the actual chat environment. This allowed for real-time adjustments and a more streamlined workflow.


The Power of Templates and Cost Efficiency


One of the most significant insights we gained was the repetitive nature of certain prompt types. For example, prompts related to specific characters or tones of voice were often reused. To address this, we introduced templates, a feature that not only saved time but also reduced costs.

In the world of AI, every word in a prompt is broken down into one or more "tokens," and each token has an associated cost. By using templates, we minimized the number of tokens needed, making the process more cost-efficient.
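As a hedged sketch of the idea, the snippet below renders a hypothetical tone-of-voice template and estimates its token count with the tiktoken library. The template copy, the cl100k_base tokenizer choice, and the per-token price are illustrative assumptions, not Cleo's real values.

```python
import tiktoken

# Hypothetical template: the fixed instruction is written once and reused,
# with only the small variable parts changing per conversation.
TONE_TEMPLATE = (
    "You are a friendly financial assistant. Keep replies under two "
    "sentences and use a {tone} tone. The user's first name is {name}."
)

def render(template: str, **values: str) -> str:
    return template.format(**values)

prompt = render(TONE_TEMPLATE, tone="playful", name="Sam")

# Rough token count; cl100k_base is the tokenizer used by several OpenAI models.
encoding = tiktoken.get_encoding("cl100k_base")
token_count = len(encoding.encode(prompt))

PRICE_PER_1K_TOKENS = 0.0005  # illustrative price, not a real quote
print(f"{token_count} tokens, roughly ${token_count / 1000 * PRICE_PER_1K_TOKENS:.6f} per call")
```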


Previewing, Testing, and Iterating


To ensure the quality of our prompts before deploying them, we built a comprehensive testing environment within the tool. Designers can now preview responses generated by different language models, test variants for A/B testing, and make real-time adjustments.


We also introduced a feature for previewing how text replacements and templates would look in the final chat, providing a clear picture of the user experience.
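A minimal sketch of that kind of variant testing is shown below. The prompt copy, model names, and the OpenAI SDK are all assumptions for illustration; the real tool wraps this in a visual preview rather than a script.

```python
from openai import OpenAI

client = OpenAI()

# Two prompt variants a content designer might A/B test (hypothetical copy).
VARIANTS = {
    "A": "Tell the user their top spending category this week in one upbeat sentence.",
    "B": "Summarise the user's top spending category this week and add one saving tip.",
}
MODELS = ["gpt-4o-mini", "gpt-4o"]  # illustrative model names

results = []
for model in MODELS:
    for variant, prompt in VARIANTS.items():
        response = client.chat.completions.create(
            model=model,
            messages=[{"role": "user", "content": prompt}],
            temperature=0.7,
        )
        results.append((model, variant, response.choices[0].message.content))

# Review the outputs side by side before promoting a variant.
for model, variant, text in results:
    print(f"[{model} | variant {variant}] {text}")
```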


A Clean, Focused Interface


After several rounds of feedback and refinement, we cleaned up the interface to focus on what matters most: the content.


The tool now features expandable areas for input, single-sample previews, LLM response previews, and evaluator test reports. This organisation allows content designers to concentrate on writing effective prompts without unnecessary distractions.


The Final Takeaway


From a design perspective, there are countless resources on how to design with AI, but far fewer on how to design for AI. This project has been truly enlightening in understanding the complexity behind chat systems and their integration with AI.


It has deepened my appreciation for the intricate work that goes into creating seamless, user-friendly experiences in this space.