You have 832ms. Make some AI magic.
This was the boundary. The constraint that seemed almost impossible to reach.
But we did it. I have a strong background in real-time data, so synchronizing data values across great distances in less than 250ms is not foreign to me. Doing it with generative AI is a very different beast. The challenge can be summed up like this:
… and in less than 832ms.¹

It was a long journey. The project is extremely secretive and will continue well into 2025. Still, the people who comprise our small AI team and the vendors we worked with can be celebrated for achieving this performance milestone.

Deepgram’s technical support team floors me. They will answer questions on Sunday morning at 7 a.m. on Slack. Who does that? I won’t go off on a rant expressing my admiration for the Deepgram platform, but I cannot avoid reflecting on the architecture and flexibility of its Voice API for building real-time agents. We are deeply intimate with this API, and Deepgram was gracious with their time and patience as we became thoroughly familiar with its customization options.

This project is researched, built, and tested in Replit—100%. It was a slow start. Some of us were contemplating AWS tools and services. We explored Cursor and Hugging Face Spaces. And, of course, VS Code and Copilot were in the mix. The cool thing is that we never forced anyone on the team to adopt Replit—it happened organically. Replit won on merit, and it did so without Replit Agent. I called it in 2021. I told everyone who would listen to get on Replit to transform development productivity. I wrote about the advantages of Replit a few times and mentioned it frequently, in passages like this one:
We now use Claude exclusively in our Replit development and documentation work. Our entire project and content management happen exclusively in Coda. But we took it further by building Coda apps that automatically test and rank AI outcomes. Packs, of course, play a significant role. We use Coda as a testing and measuring platform for all AI work. The latency and AI performance chart shown above aggregates many tests that push analytics into Coda through webhooks.

The Team

One of the best teams I have ever worked with. John Gibbs (aka Dr. Know It All), Vanessa Fitzgerald, Ben Flanders, Roya Barhaloo, and Lance King—all professionals, learning very fast and adapting to all that is real-time AI.

The Tech

This project is secretive, with lots of NDAs and several intra-organizational dependencies. Still, what can be divulged through shallow abstractions may be of interest.
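None of the production code can be shared, but the measurement loop described above is easy to sketch. Below is a minimal, hypothetical Python example, not our implementation: it times a stand-in speech-to-text call and pushes the sample into Coda through a webhook-invoked automation. The doc ID, rule ID, field names, and the `transcribe_chunk` stub are all placeholders.

```python
"""Hypothetical sketch: time one pipeline stage and push the sample
into Coda through a webhook-invoked automation. IDs and field names
are placeholders, not the production setup."""
import time
from datetime import datetime, timezone

import requests

CODA_TOKEN = "YOUR_CODA_API_TOKEN"  # placeholder
WEBHOOK_URL = (
    "https://coda.io/apis/v1/docs/YOUR_DOC_ID"
    "/hooks/automation/YOUR_RULE_ID"  # placeholder doc and rule IDs
)


def record_latency(stage: str, latency_ms: float) -> None:
    """Send one latency sample (in milliseconds) to the Coda webhook."""
    payload = {
        "stage": stage,
        "latency_ms": round(latency_ms, 1),
        "recorded_at": datetime.now(timezone.utc).isoformat(),
    }
    resp = requests.post(
        WEBHOOK_URL,
        json=payload,
        headers={"Authorization": f"Bearer {CODA_TOKEN}"},
        timeout=10,
    )
    resp.raise_for_status()


def transcribe_chunk(audio_chunk: bytes) -> str:
    """Stand-in for the real-time speech-to-text call; not the real pipeline."""
    time.sleep(0.12)  # simulate ~120 ms of work
    return "partial transcript"


if __name__ == "__main__":
    start = time.perf_counter()
    transcribe_chunk(b"\x00" * 3200)  # one 100 ms frame of 16 kHz, 16-bit PCM
    elapsed_ms = (time.perf_counter() - start) * 1000
    record_latency("speech-to-text", elapsed_ms)
```

On the Coda side, the receiving automation rule can unpack the JSON payload into a table row; aggregate enough of those rows and you get a latency chart like the one above.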
This project is not yet finished, but I am already looking to the future, hoping our next client will hire us to make some AI magic in less than ten seconds.

¹ Two of the bars in the chart are asynchronous processes that overlap with the rest of the pipeline. If the bars seem to add up to more than 832ms, remember that concurrent stages don’t sum; adding them together would misread the data.
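To make the footnote concrete: when two stages run concurrently, they contribute the longer of the two durations to the end-to-end figure, not their total. A toy Python illustration, with invented stage names and timings rather than our measurements:

```python
import asyncio
import time


async def stage(name: str, duration_s: float) -> None:
    """Simulate one pipeline stage that takes `duration_s` seconds."""
    await asyncio.sleep(duration_s)


async def main() -> None:
    start = time.perf_counter()
    # Two asynchronous stages launched together: 0.3 s and 0.5 s.
    await asyncio.gather(stage("enrichment", 0.3), stage("synthesis", 0.5))
    elapsed_ms = (time.perf_counter() - start) * 1000
    # Prints roughly 500 ms, not 800 ms: the wall clock follows the
    # critical path, so per-stage bars cannot simply be added up.
    print(f"end-to-end: {elapsed_ms:.0f} ms")


asyncio.run(main())
```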
© 2024 Bill French