Episode 15

Mastering the Dance of Customer Conversations with AI: Insights from Eugene Mandel

Episode Description

Welcome to this fascinating podcast episode where we have the pleasure of diving into the world of AI with Eugene Mandel, the Head of AI at Loris.ai. Mandel, a natural language understanding expert and serial entrepreneur, offers intriguing insights into how AI is reshaping customer conversations in the business world. He shares his journey from a software engineer to an AI pioneer and discusses the future of AI in customer communication.

Here are the top five key points from this episode:

🚀 Eugene Mandel's career transition from senior software engineer to co-founding a social communications (VoIP) startup and eventually to working on AI in customer conversations.

💡 The unique mindset required for success in a startup, especially while managing a probabilistic product like AI.

🤖 AI's transformative impact on customer support - how AI can be used to recognize conversational markers, create scorecards for customer conversations, and provide context for better customer interaction understanding.

🔮 Eugene's vision of AI's future potential - creating guided summaries, improving accuracy in conversation representation, and enabling non-tech experts to create models for customer conversation recognition.

🔬 An in-depth look at managing experiments and transitioning them into products, with insights into the potential of AI replacing human agents in customer support.

Get ready for an enlightening exploration of AI's influence on business and customer communication!

(0:00:11) - AI Implementation & Startup Experiences (14 Minutes)
Eugene Mandel, the head of AI at Loris.ai, is a passionate natural language understanding expert who has been a founder of multiple companies in the space. We explore Eugene's journey from being a senior software engineer to working with machine learning for HR departments, and his co-founding of the voice-over-IP company Jaxtr. We also discuss the challenges of engineering problems, and the trials and successes of transitioning experiments into products.

(0:13:47) - Startups, AI, and Unusual Mindset (6 Minutes)
Eugene Mandel, the Head of AI at Loris.ai, shares his experience in building startups and how he has come to work with human language and data science. He talks about the unique mindset it takes to be successful in a startup, and how he has seen the development of AI in language rapidly evolve in recent years. Eugene also discusses what makes product managing a probabilistic product different from a deterministic one, and how building trust with users and understanding what is good or bad can make or break a product. Finally, he talks about his current role at Loris AI and how they are using AI to help conversations between two individuals and in call centers.

(0:20:02) - AI in Business Conversations and Support (11 Minutes)
Eugene Mandel's experience building startups and his decision to move into the field of human language and data science are explored. AI is being used to help customer support departments assess the quality of their support, create a culture of continuous improvement, and train their agents better. Eugene's analogy of a conversation in a business context being like a dance, where different moves, sequences, and how to respond to particular questions can be defined, is discussed. Additionally, conversational markers and how AI can be used to recognize them and surface them into metadata to help CX departments improve their conversations, are discussed.

(0:31:26) - Managing Experiments and Transitioning to Products (14 Minutes)
Eugene Mandel, the Head of AI at Loris AI, approaches customer conversations, product management and engineering management with a timeline, distinguishing between delivery and experimentation. He explains how to manage experiments in order to turn them into products. Eugene also shares his insights on how AI can be used to improve customer support, and how AI might eventually replace the need for real live human agents.

(0:45:07) - From Experiment to Product (10 Minutes)
Eugene Mandel, the Head of AI at Loris AI, discusses the challenge of creating scorecards for customer conversations that can be evaluated automatically. He talks about prompt crafting and experimenting with AI models to extract goals and problems from customer messages. The importance of giving AI models the right context and definitions to better understand customer conversations is discussed. Finally, the implications of AI-assisted content creation and what it means for ownership are considered.

(0:54:59) - AI Models in CX (11 Minutes)
We explore how to create custom scorecards for evaluating customer conversations and the implications of allowing clients to inject prompts into these conversations. We discuss the importance of providing context that the clients cannot change, breaking the prompt into small atomic parts, and following the Unix philosophy of small and composable tasks. We also explore the real-world implications of these conversations, the importance of putting guardrails in place, and how user feedback can be used to reinforce learning. Finally, we discuss the possibility of building a machine that non-tech subject matter experts could use to create models to recognize customer conversations.

(1:05:53) - Guided Summary and Predictions for AI (7 Minutes)
Guided summaries can be created by asking specific questions and constraining the output to provide contextual representation and actionable insights. ChatGPT can be used to experiment with different summaries, and AI can be used to create summaries that are more accurate and representative of the conversation. Scorecards can be created to evaluate customer conversations, along with the implications of allowing clients to inject prompts into these conversations.

(1:12:54) - Improving the Logic of LLMs (1 Minute)
We discuss the potential for natural language understanding (NLU) to improve logic and reasoning in the next year, the possibilities this could open up, and how conversations with customers and product management can be improved. We also explore how scorecards can be created to evaluate customer conversations and how structured summaries can provide contextual representation and actionable insights.

0:00:11 - Tyler Wells
Welcome to the Data Chaos Podcast. I'm your host, Tyler Wells. On today's episode, I have a conversation with Eugene Mandel. Eugene is the head of AI at Loris.ai, the natural language AI platform that automatically analyzes your customer conversations. Driven by his passion for natural language understanding, Eugene has been a founder of multiple companies in the space. He's transitioned from being a lead data scientist to heading up product. This conversation went deep into the world of AI and its practical implementation. Listeners will hear how experiments can turn into products and what happens when they don't. I didn't want this discussion to end and really, really look forward to our next conversation. I hope you all enjoy it as much as I did, so sit back, relax and enjoy the conversation. Excellent, Eugene, welcome to the Data Chaos Podcast. I really appreciate you joining me today, on this Friday, for a nice conversation. Hi, Tyler, thank you, excited to be here. Well, good, before we start, I was looking through. Obviously, I've known you for a number of years.

0:01:22 - Eugene Mandel
A very large number.

0:01:24 - Tyler Wells
A very large number, right, and it was funny. So I was going through your background and I was like hey, I want to start with FaceTime and move forward. I printed out your resume from LinkedIn. You don't have FaceTime on there.

0:01:35 - Eugene Mandel
I think I'm getting to an age where you kind of start keeping only several last gigs on LinkedIn.

0:01:44 - Tyler Wells
Yeah, that's funny. Obviously, for us it was a long time ago, and for listeners wondering, did these guys work at FaceTime like the Apple product? No, it wasn't. We actually worked at a company called FaceTime Communications, where we did at one point auditing of instant messaging and compliance, and before that we were basically a chat-based call center similar to LivePerson. Right, yeah, that was us for a while. You were predominantly a software engineer there and, looking at your resume, it's pretty incredible. You've been a three-time co-founder. You were in data engineering even before data engineering was cool. You've been, let's see here. You were at Great Expectations as head of product. You were at Directly as the lead data scientist. You were at Jawbone as the principal data engineer. You're now at Loris as the head of, well, when I talked to you last year, the head of product, but now, probably very exciting, you are now the head of AI.

0:02:50 - Eugene Mandel
Yeah, there's a whole story there as well, yeah.

0:02:53 - Tyler Wells
Well, where do we want to start? Let's start with this. How about you when you and I worked together, you know, back when there was still MS-DOS and all kinds of other crazy stuff? You're a senior software engineer writing Java, and what was the transition after leaving FaceTime? What got you, because it looks like you went right into data, kind of like?

0:03:14 - Eugene Mandel
walk me through some of that, you know it's really funny because sometimes your story is kind of planned upfront, and I know some people that actually are able to do that, and hats off to them. For me it's usually not true. For me the story kind of emerges more when I'm looking back, because I'm usually following my interests, right. So, background: I studied computer science, software engineering. When we used to work together it was mostly Java, backend software, but I was always interested in data and in human language, and, looking back, those two interests run through everything I did. Because right after FaceTime I actually did go to a startup that was doing early machine learning for language in the area of HR. So basically parsing resumes, parsing position descriptions, matching people, building applicant tracking software with machine learning for HR departments. I was doing engineering there, I was doing algorithms there.

And I think the third interest of mine was that I was always interested in what people want. For me it's still a mystery, figuring out what makes people tick, what they want in terms of product. And I'd say that I was a product manager before I was called a product manager. Especially when you're working in startups, it's very difficult to divorce yourself from what this thing that you write is actually going to do for some person, delivering particular value, solving a particular problem. And especially when I was a co-founder, actually after that, of a telephony-over-IP company called Jaxtr, there I was, well, yeah, you did a bunch of stuff there, but correct me if I'm wrong.

0:05:18 - Tyler Wells
There's a little bit of, I think, a little bit of fame here on the Jaxtr one. I think I was reading somewhere that y'all did the very first Google Voice integration.

0:05:28 - Eugene Mandel
Yeah, that was a very nice hack. It was amazing. It was like playing with the SIP protocol. It still exists, right? I think so. Yeah, okay, good, I just got kind of further from voice lately, so, so that exists.

0:05:43 - Tyler Wells
SIP is still around, of course, I mean it's still prevalent everywhere. And Google Voice is still largely there. And then there's also Fi now. So yeah, those are, they haven't gone bye-bye, they haven't gone dinosaur on us yet. Cool.

0:05:56 - Eugene Mandel
So Jaxtr was extremely exciting because that was my first startup as a co-founder. There were three co-founders: one was more on the business and VC side, another one was a subject matter expert from telephony, and I was the technical co-founder, and that was an absolutely crazy ride. I remember that in the days when Facebook was signing up about 100,000 people a day, we were signing up, I hope I'm not lying, somewhere between 50,000 and 60,000 people a day. So there was this class of engineering problems that people usually say, well, you know, it's a good problem to have, let's not even think about premature optimization. I agree, but in this case we had to deal with all of those problems in the database, in the back end, in caching, in fraud, all of those almost interview-style problems we had to solve. Like, what if? Well, no, no, it's not "what if", this is happening.

0:06:59 - Tyler Wells
This is now, this is yeah actually it's a funny story.

0:07:05 - Eugene Mandel
So, the company that you worked for, Twilio, right. So with Jaxtr, it was a crazy ride because basically we figured out how to make international calling cheap by using normal telephony, the POTS, right at the point of origination and termination, and then placing most of the call over IP. And we had a lot of numbers in all countries, right. So you basically mapped a number in one country to a number in another country and it placed the call. This way we were growing extremely fast. Actually, our VCs wanted to grow us fast. We did go through a lot of capital. That was a very Silicon Valley kind of story.

0:07:48 - Tyler Wells
Well, it wasn't.

0:07:49 - Eugene Mandel
Silicon Valley. So, after all, and of course, you know, the thinking was, well, you just worry about growing, it will be fine. Our CEO was Konstantin Guericke from LinkedIn, so all the ideas about viral marketing got kind of imported from there. It was really working so well. Users were inviting users, everything was growing. Free minutes were kind of the currency, actually we called them jax, kind of our own currency. Everything was just humming, including the servers.

And then 2007 and 2008 happened. And of course, I don't know who it was who said, well, guys, you probably should monetize. And it's like, okay, yes, but we're growing, and unfortunately we could not monetize fast enough. In the end the company was sold, but not a good financial outcome. However, and here's why I mentioned Twilio to you, I remember that in the very end we were kind of sitting thinking, okay, obviously this is not working because we're running out of money, but we built something so valuable here. What can we do?

And one of the things that we built was a platform that allowed, let's call them normal software engineers, to write phone applications. Right, because the problem was that I was hiring engineers, and I could not hire telco engineers because they thought that we were not serious, like, "you're just software people; I worked at a telco, who are you?" Right? And I couldn't hire normal software engineers because they didn't know the protocol. So we kind of wrote this API that was basically, okay, start a call, connect this, play this, hang up, right.

And with that we were able to hire normal, experienced software engineers to write telco applications without understanding the underlying protocols of telco. And in one of the sessions when we were discussing what we could do about it, we were saying, well, you know, we did it, and I bet there will be companies that will probably want something like that too. It didn't go anywhere, unfortunately. So, of course, as usual, ideas die with us and execution is everything. But I still have, in my Evernote, a screenshot of a whiteboard that basically described this idea.

0:10:18 - Tyler Wells
So I wonder how close that screenshot was to the sort of famous pizza boxes that Twilio had from their early days of architecture. There's some Twilio lore you may or may not know about; it's probably still in headquarters or somewhere, if they still have offices up in San Francisco. But they had these pizza boxes where they had drawn early architectural diagrams, and it'd be interesting to take yours and see how close it was, because if that was 2008, 2009, that was probably still before Twilio and Jeff Lawson had even started it.

If I remember correctly. As we know, again, ideas really die with us.

0:11:02 - Eugene Mandel
And yeah, selling is hard.

Yeah, and since then I kind of kept going into either co-founding companies or, well, being somewhere early. Actually, once I tried a larger startup. Well, I don't know if at that point, technically, Jawbone was even a startup anymore, right, because it was like Series something, some kind of large letter in the alphabet. We were probably close to 400 people. Yeah, but I felt that even that was probably a bit too big for me, because I just absolutely love that startup phase where everything is together: the ideas, the technology, the figuring out who needs it, and if anybody needs it, and how we can help them. Also, I kind of gravitated towards B2B, or at least business use cases, because I think I gave up on the hope of figuring out what consumers want. Simply because consumers, people, yeah, no, businesses, they've got something to solve, right.

0:12:11 - Tyler Wells
So they're a little more set in their ways. Whereas humans are finicky, it's like, you know, put a finger in the air, which way is the wind blowing? Well, that's cool today and it sucks tomorrow, right? Exactly.

0:12:23 - Eugene Mandel
It's like what does it take to write a popular song or create a popular movie or popular, you know, to write a popular novel? I'm not saying that there is no algorithm. I'm saying that I that the algorithm is not known to me. However, in B2B you actually can figure out what people want, what people value, how much it's worth to them. Everything is much more well logical.

0:12:49 - Tyler Wells
Yeah, trying to describe people as logical? Not so much. Businesses, probably, I could see that, I could definitely see that more. So it was funny, you mentioned you love that early-stage startup and you like the going from, as you've probably heard it coined, zero to one. I heard it stated the other day a little bit differently, which was an interesting analogy, which was: all companies are born dead and it's up to you to bring them alive. And so when you start something from nothing, you have nothing, there's no heartbeat, and heartbeat being revenue. There's nothing to sell, there's no product. So that maybe is the body, however you want to call it, if we're trying to stick with the human analogy. But it was like everything is born dead and you're given basically some resources in terms of capital and people, and it's up to you to bring that business alive.

So, I should agree.

0:13:39 - Eugene Mandel
I usually literally say this to team members, right: okay, if we're a startup, by default we are dead. Okay, let's go from there. And doing a startup requires this very unusual mindset, right, where you have to hold two opposite ideas in your mind simultaneously. On the one hand, you know that, yeah, by default you are dead, you are a startup, the statistics are against you, and in the same minute you have to be optimistic, you have to be determined, you have to be hopeful, happy, that usually helps to produce good work. Very opposite ideas, and for me it took some time to figure out how to do it, but I feel like it serves me so well everywhere beyond work. It's just a good mindset.

0:14:31 - Tyler Wells
Yeah, I mean it's a tough mindset, right? Everything is against you. Everything, all statistics, selling, building something, people, everything is just stacked up against you. But at the same time you love it so much, you enjoy how hard it is, you enjoy that challenge. You have that fear of failure. You have that drive to say, I don't care that they say it can't be done, I'm going to make sure it is done, and I'm going to figure out how to make it be done and ship this thing and build it and just get it out there.

0:15:04 - Eugene Mandel
Exactly, yes. So since then, I mean, it's been mostly startups and more and more getting into data, machine learning and particularly human language. Those are definitely the threads that are going through everything I do. It's been extremely interesting for me because human language always was at the forefront of AI. What we see in the last several years is truly just amazing, right. So, well, maybe we can talk a bit about that, and maybe then about product management, right?

But so my last job before my current thing was head of product for Great Expectations, which is open source software for data quality, and I absolutely love the team. Again, it's a startup, it's great. However, it wasn't really about human language, and when I felt like I was kind of done with one of, I think it's called a tour of duty, at that startup, where we got to a particular point, I knew that I wanted to really get back into natural language, especially because I've seen what is going on. I've seen that the development is no longer linear, it's jumps, right. Basically, things that four or five years ago were either impossible to do with AI in language, or you had to be Google with hundreds of PhDs and unlimited money, suddenly are becoming possible with a kind of small and smart team. And now the challenge is not so much coming up with the next foundational model but figuring out how to productize the capabilities that today's models give you. And this actually is related to product management, right, because the more I started working with data and AI, the more I got into product management, because product managing, let's call it a deterministic product, like normal, more traditional software, and product managing a probabilistic product are related skills, but it's not the same skill.

Right, because how you treat specs, how you treat errors, how you treat expectations, how you communicate, everything changes. It's so important, for example, in an ML-based product, to build trust with your user, right, because every model has errors, of course. How you position it can change whether you succeed or not. Because, well, let's say, in any model you probably define some kind of precision and recall, right? So, first of all, explaining to the users what it means, understanding what you can expect and what's good, what's bad. Is 0.8 precision good or not? Well, it depends, right? If you are creating an autopilot for a self-driving car, it would probably be lethal. If you are writing a game, it's probably good.
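To ground the precision and recall point, here is a minimal Python sketch (illustrative only, not Loris code; the labels and threshold judgments are made up) showing how the two numbers are computed for a binary classifier and why the same 0.8 can be fine in one product and unacceptable in another.

```python
# Minimal sketch (not Loris code): computing precision and recall for a
# binary classifier, to ground the "is 0.8 good?" discussion.

def precision_recall(y_true, y_pred):
    """Return (precision, recall) for binary labels (1 = positive class)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall

# Hypothetical evaluation set: did the customer threaten to escalate?
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 1, 0, 0, 1, 1, 0]

p, r = precision_recall(y_true, y_pred)
print(f"precision={p:.2f} recall={r:.2f}")
# Whether a number like 0.75 or 0.80 is "good" depends on the product:
# fine for surfacing coaching opportunities, nowhere near enough for a
# safety-critical system like a self-driving car.
```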

0:18:21 - Tyler Wells
It's close enough.

0:18:23 - Eugene Mandel
Yeah, close enough, exactly Close enough right.

0:18:26 - Tyler Wells
Yes, no, I mean, it's interesting because, as your career has weaved, right, it's always brought you back to this core technology, which is around natural language and the usage of natural language, and now you're in a position, at Loris, your current company, to really apply that, with the latest models, the latest generation of this tech, to something that is fundamentally human to begin with, which is conversations between two individuals in a call center.

Exactly, yeah. So let's dig into that a little bit more, because you went there, you joined Loris AI as head of product. I know we're jumping around here a little bit, but I want to talk about Loris because I think this is the thing you're probably super passionate about, as I can tell. You were there as head of product. We had talked a while ago; we were looking at building insights, data, everything else like that. You were dealing predominantly with that, and now AI explodes on the scene. A year and a half ago, maybe almost two years ago, it was creeping in, but it wasn't quite there. Now it's here, it's in full force. You're now head of AI at Loris. How has all of that fundamentally changed your thinking and what you do?

0:19:39 - Eugene Mandel
maybe just a couple of words about, like, what we do, right?

So I think, in any startup, you can describe what you do on at least two levels: the pure business level, who we serve and how, and then beyond the business, why is it interesting, even if we didn't make any money. I hope, and it's true, that we do make some money, so it definitely helps. And those things are related. So, on the business level, it's whenever you call support and you hear the "well, this call may be recorded for quality and training purposes." We are the quality and training purposes, but not only for calls, also for chat and email. Basically, we help CX departments, so customer support departments, get interesting metadata about their conversations with their customers in order to assess the quality of their support, create some kind of culture of continuous improvement, and train their agents better. But training their agents is only one of the possible improvements, right? Because sometimes you have to change your product, change your policies, update your content, and there is no one silver bullet there. So that's on the business level.

On the beyond-business level, the analogy that I use is, I usually ask people, well, how much do you know about tango? And usually most people don't know much about tango, and I ask, well, imagine I show you a video clip of tango and I ask you to commentate. I don't know much about tango, so I would probably say, well, it looks amazing, people like to dance and it's very cool. However, if you ask the same question of somebody who is a dance instructor or an expert, for every second in that video they would tell you what is going on, because they understand that this is a sequence of well-defined moves. Every move you can describe, you can evaluate whether the sequences make sense, and it becomes very, very interesting. By the way, sometimes hearing commentators describe anything, whether it's boxing or tango or anything else, even if I don't know it well, it's just fascinating. That's the same thing we want to do to conversations, to business conversations.

Right, because a conversation in a business context is again, it's a dance, where you can define different moves, you can define which sequences make sense, you can define how you respond to a particular question, because the wonderful thing about customer support is that almost every customer support conversation that you can imagine has probably happened before. This repeatability gives you this power, and for us to define those moves, this language of conversations, this science of conversations, really, and apply it to business, that's of extreme interest to me.

0:22:45 - Tyler Wells
No, that's very cool. I don't think I've ever heard conversations between a support rep and the person calling in with the problem described as a dance in the form of tango. But largely it is, especially if you probably break it down to the repeatability, to the scientific nature of it. It's slightly rehearsed on one side, less rehearsed on the other side, but it's taken place in nature before. It's definitely happened.

0:23:11 - Eugene Mandel
It's not new.

0:23:12 - Tyler Wells
It's not new in nature, and because it's taken place, patterns can be developed, and the person on the side that's a little more rehearsed can now be trained to be better at helping that novice on the other side either better articulate their problem or solve their problem.

0:23:30 - Eugene Mandel
Yes. So, just for example, we have this notion of what we call conversational markers. These are interesting situations that happen in customer support conversations, for example, when somebody is asking, well, can I talk to your manager? Of course, it happens every day, right, or threatening a lawsuit, or threatening a bad review, or just expressing confusion: oh sorry, I don't get what you're saying, can you repeat this? All of those situations are interesting.

CX people would love to know when they happen, because you can analyze them and they can guide what you improve. And doing it themselves with keywords is completely untenable, because think of how many ways you can express some kind of confusion. Hundreds. Human language is so unbelievably rich, and with machine learning you can train a model that can recognize all of those markers, surface them into metadata and reports, and now you can start answering questions like, well, let's see which topics are correlated more with confusion or anger. These are capabilities that you simply didn't have before, unless you force your analysts to read or listen to, I don't know, thousands of transcripts.
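As an illustration of the conversational-marker idea, here is a small Python sketch (my own illustration, not Loris's pipeline; `detect_markers` is a crude keyword stand-in for the trained model Eugene describes) of tagging messages with markers and rolling them up into per-topic metadata, so you could ask which topics correlate with confusion or escalation.

```python
# Illustrative sketch only: tag each customer message with conversational
# markers and roll them up into per-topic metadata.
from collections import Counter, defaultdict

# Stand-in for a trained classifier; in practice an ML model would recognize
# the hundreds of ways a marker can be phrased, not a keyword list.
def detect_markers(message: str) -> set[str]:
    markers = set()
    text = message.lower()
    if "manager" in text or "supervisor" in text:
        markers.add("escalation_request")
    if "don't get" in text or "confused" in text:
        markers.add("confusion")
    if "lawsuit" in text or "lawyer" in text:
        markers.add("legal_threat")
    return markers

def markers_by_topic(conversations):
    """conversations: iterable of dicts like {"topic": str, "messages": [str, ...]}"""
    rollup = defaultdict(Counter)
    for convo in conversations:
        for msg in convo["messages"]:
            for marker in detect_markers(msg):
                rollup[convo["topic"]][marker] += 1
    return rollup

sample = [
    {"topic": "billing", "messages": ["I don't get why I was charged twice."]},
    {"topic": "billing", "messages": ["Let me talk to your manager."]},
    {"topic": "shipping", "messages": ["Where is my package?"]},
]
print(dict(markers_by_topic(sample)))
```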

0:24:56 - Tyler Wells
Yeah, the Mechanical Turk, and that doesn't sound like it scales too well. So, but okay, you were there starting off as head of product. What was the move? Was it your move to say, hey, AI is so important, this needs to stand alone? Because I would venture to think that, as head of product at Loris, you're kind of overseeing everything. What was the decision to say, hey, I'm going to create this entire role called head of AI? What necessitated that, I guess?

0:25:29 - Eugene Mandel
So, several things. You actually kind of said one of them, right, that AI is developing at such a speed and it's so core to our success that we need somebody who would focus on that. But it's not the only thing. We are also a distributed company. I'm in California, we have an office in Manhattan and we have an office in Tel Aviv. Most of our engineering and data science is in Tel Aviv, which is 10 hours ahead of me. I'm naturally an early riser, my first call usually starts at 6am, and that gives me probably three, four hours of overlap, but it's still not enough, right? So it's always been kind of difficult. However, I was doing it; of course, the hours were getting longer and longer and more stressful.

In the meantime, we recruited an amazing director of product in Tel Aviv, Ronan, and hi Ronan, if you're listening, so incredible. And at some point those needs combined, right. First of all, I would love to be able to focus more on AI, because with AI, when you do product in ML, you cannot stay kind of high level. You must get into the data, you must get into the models. It's very difficult to just talk about needs without getting your hands dirty. And, on the other hand, in the meantime Ronan was fully inhabiting the position, building all the relationships and already leading everything. So we just went with it. He became head of product, I became head of AI, and things are really going better on all of those fronts.

0:27:17 - Tyler Wells
David, can you? Can you really I mean, can you really stay high level when it comes to AI? Are you dealing with prompts? Are you doing prompt engineering, kind of where? Where's your interface with the LLMs as the head of AI?

0:27:32 - Eugene Mandel
So, at first, when I was head of product, I think it took me seven or eight months before I actually installed all the Python machinery on my laptop, which was a first, right, not because I'm not used to getting my hands dirty, but because I was so busy with all the non-programming stuff. Now I do have everything, but again, I'm not playing a data scientist here; I'm trying to preserve this very fine balance. I understand the business, I understand the users, I talk to users, I need to, but I also need to understand the models that we're using. We use all kinds of NLP, right. Of course we use LLMs, and of course we do experiment with OpenAI, and of course we are trying to get an LLM in-house, but we also use all the models from before that. It's actually very interesting that most of the building blocks that now came together in LLMs we already used for years, but now people are just much more aware of this. So I have to be able to see what the model predicts.

Primarily, I deal a lot with definitions and validation of errors, right, because with language, defining the question is, well, it's a cliche, but it's true, half of the problem. So let's take just a couple of examples. One of the models that we have is a sentiment model, right. For every customer message that they say or type, we assess on a five-point scale whether it's negative or positive, and usually what the sentiment is targeted at: the company, the agent, or something else. People kind of say, okay, sentiment, not a problem, but when you think about what sentiment actually means, you encounter more questions than answers. Because, is sentiment emotion? Well, in customer support?

Not exactly, because if you are sad, that's not necessarily what I'm going for. It's about two things: one, satisfaction with your experience with the company, and two, your emotional state. Why? Well, satisfaction with the company, obviously, because that's how a CX department evaluates itself. And emotional state in order to manage an effective and efficient conversation. Because if the customer is very upset, I, as a professional conversationalist, the agent, just like, I guess, a hostage negotiator, before I just give you the information, I have to come up with some kind of de-escalation, right, empathize.

0:30:20 - Tyler Wells
Empathize, Empathize, empathy, something yeah.

0:30:24 - Eugene Mandel
Exactly, but empathizing is not just saying I feel your pain. It's not necessarily apologizing, because if you're apologizing, it's very easy to fall into apologizing too much, and then the way it comes through is actually more negative than positive. It's human psychology, right? So this is just an example of defining what sentiment means and then, whenever our model spits out predictions, reviewing them, discussing them: is it true, is it false? So I don't necessarily deal with the core technology of the models; it's more dealing with data. I cannot imagine an AI product manager, let's call it this way, that does not do it. If you don't look at data, I'm not sure how you can product manage an ML-based product.
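A hypothetical schema, assuming nothing about Loris's real model output, for the two pieces described here: a five-point sentiment score per customer message plus the target of that sentiment, and the kind of eyeballing loop a PM might run over predictions.

```python
# Hypothetical schema (an assumption, not Loris's actual model output):
# a five-point sentiment score per customer message, plus the target.
from dataclasses import dataclass
from typing import Literal

@dataclass
class SentimentPrediction:
    message: str
    score: int                                    # 1 = very negative ... 5 = very positive
    target: Literal["company", "agent", "other"]  # who/what the sentiment is aimed at

def review(predictions: list[SentimentPrediction]) -> None:
    """The kind of review a PM does: print predictions for human judgment."""
    for p in predictions:
        print(f"[{p.score}/5 -> {p.target}] {p.message}")

review([
    SentimentPrediction("You charged me twice, this is ridiculous.", 1, "company"),
    SentimentPrediction("Thanks, you explained that really clearly.", 5, "agent"),
])
```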

0:31:12 - Tyler Wells
Just walk me through, what does a day look like for you as the head of AI at Loris? Now you've got your laptop, you've installed all of this tooling, probably in Python, you're probably writing Python, you're dealing with all the data. But what does that day-to-day look like? And what does a good day look like in terms of outcomes?

0:31:33 - Eugene Mandel
So, well, I definitely try to talk to customers as much as possible, right, because with AI you can do so much now, but figuring out what actually can help and what can be done in a satisfactory way, that's a lot of work. So half of my day is probably with customers or our customer-facing team. The other half of my day is with our data scientists: long conversations, review sessions where we review our models. We make sure that we agree on definitions, on how we test, as always with AI. You can kind of have a big dream, but how can we build something minimal, how can we create a sequence? And of course, it's product management, right. So with product people and engineers I have to make sure that we do create some kind of timeline and ETAs. Things must be predictable at some point, right, which is a major challenge, because that's actually been one of the recent changes that we made.

So in product management or project management or engineering management, you always talk about ETAs, right, you always try to come up with estimates, and I started understanding that in some projects, of course, this is the way to manage, absolutely, and you can get better at estimation. However, you have to identify whether your project is delivery or an experiment, especially in ML. And if a project looks like an experiment, you know there's this expression: if you ask a dumb question, you will get a dumb answer. So on a completely experimental project, asking for an ETA or for an estimate actually might be a dumb question, and of course data scientists will try to give you an answer and then they will blow through this estimate. It's not their fault, because sometimes, if the variance is very, very high, you don't know, so you can't pretend to know. I'm not trying to defend the position of, oh great, so if it's an experiment, we'll just keep going, and maybe we'll go for a year or 10 years or until we run out of money. We cannot do that, right.

0:33:45 - Tyler Wells
Right, no, I was going to say you also have a business to run that requires capital or cash to be made. So my question coming out of that is I get it like I like to treat things as an experiment. Everything's an experiment until it's not, but at what point does an experiment either become product or get killed.

0:34:02 - Eugene Mandel
Because at some point.

0:34:03 - Tyler Wells
like you said, the experiment can't go on forever. Otherwise you're just burning money at probably a ridiculous rate and it's not bringing back in revenue, which we all need in order to survive. So what does that look like? What gets killed? How does it turn into a product?

0:34:19 - Eugene Mandel
So you manage the experiment kind of like scientists manage experiments, right, because they have grants and they have money and time constraints as well. So you come up with, okay, we're asking a very big question. Answering this question might take months and months and months. Let's create a string of questions, a string of hypotheses, starting from the very immediate and the smallest. And then, instead of an estimate, it's a slightly different question: how can we budget answering this question, four days or five days? Okay, so define the question and agree with the data scientists. Agree on how we answer, agree on our definitions, agree on risks. In four days or five days, whatever we said, we have another review, and three things might have happened: either the hypothesis has been validated, great, go to the next question, or it was truly disproved, or we're saying we actually need more time. So it's kind of managing this time in very small increments. And, of course, as your confidence goes up, you start getting closer and closer to productization. I actually have a good example, I think.

Right, so part of what we do is QA for customer support conversations. Every CX org has QA that picks a very small number of conversations. They have a scorecard of questions, of different behaviors that the agent must perform, and they manually verify: okay, did the agent greet the customer by name? Did the agent empathize with the customer's problem? Did the agent blah, blah, blah, right? This way you maybe review, I don't know, 2% or 3% of your conversations, and it's very demanding and, you know, it's not great. We want to automate it, right? And of course we had this idea that, well, if we give a transcript to an LLM, can it answer the same questions? Actually, by the way, we are a distributed company, but this idea literally appeared in a conversation that we had in Manhattan, when we were actually together over lunch.

0:36:30 - Tyler Wells
Well, I was going to say, probably every call center is thinking to themselves, okay, there's this LLM, there's this natural language processing, when can I get rid of these agents? Take all of this training data, right? Is that what it sounds like? That's where you're going, right? At what point is the LLM, or a model, going to completely replace a real live human agent? That sounds like kind of where you're starting to go, right?

0:36:57 - Eugene Mandel
So, well, we are not doing chatbots, at least not right now. We're not doing any chatbots. I think to be a winner in this space, eventually you kind of have to cover the entire surface, from chatbots to analyzing conversations to everything, but you don't have to do it all at once, right? I actually have a blog post on our blog that kind of says what I think a customer support org will look like in several years, and I don't think agents will disappear. I actually believe, and I know that it's a somewhat controversial point, that instead of completely chatbots, whether chatbot-like voice or actual chat, or completely agents, there will be a very interesting picture where there will be a very small number of agents.

So, let's say, if you had, let's say, 500 agents, now you're down to 50. But people that are probably high-skilled, they're sitting in front of a very large monitor. It's kind of like you know, traffic controllers and they're looking at conversations that are being handled by technology and they kind of jump in when the technology gets stuck. Either the technology itself says, okay, human touch is required, or there is a signal from the customer that you know.

I want to speak to a human. Because even the same conversation doesn't have to be all bot or all human, right? In any conversation, and this is back to this science of conversation and moves, some parts of the conversation are very scripted, you don't have to be there, right, and some parts of conversations can be emotionally charged or very complicated, and that's for me the human. But imagine that this kind of agent, or support analyst, I don't know, I think they probably will not even be called agents, I don't know what this role will be called, just jumps in and jumps out.

0:38:58 - Tyler Wells
Well, I mean, the world that I think we're going to get to, which we're probably not far from, right? So, say you go from the 50 down to 10, and the first day on your job you're going to speak to train a model on your voice so it sounds exactly like you, and then when a call comes in, it's not talking to you, the agent, it's talking to your model at first, because there's no video, so they don't know. And that person, like you said, is looking at the air traffic control thing, watching all the calls that have come in to Agent A. We'll just call her Agent A for lack of a name, or Agent Sally. So Agent Sally's voice has been trained into the model, Agent Sally's AI is answering all of these calls, and if Agent Sally sees something going totally sideways, she jumps in seamlessly because the voice is exactly the same. Is that kind of where things are going?

0:39:49 - Eugene Mandel
Possibly. And, well, we're saying voice, but actually in the last several years there has been a change, right, where it more and more goes away from voice and toward kind of chat-like interfaces. Right, because, just like, I hate those.

0:40:04 - Tyler Wells
No, they're terrible. I think I've never, I have yet to have a good experience. When you said chatbot, there was a part of me that cringed, because when I think chatbot, I think of that shitty robotic voice, like "Hello Eugene, please enter your passcode now, your passcode has been..." Those things just suck.

0:40:26 - Eugene Mandel
Oh, no, no, I'm talking about chat Right Like.

0:40:30 - Tyler Wells
I don't even want to do that. There are just times, there are so many things, like anything you have to do with your credit card or finances, right, you don't want to do a chatbot, you want to talk to a human. Your airline gets screwed up, you want to talk to a human. And so I'm kind of thinking to myself, if you've got an LLM, if you've got a model that you can train on an agent's voice, and the vast majority of those inbound calls are simple, they're handled by the trained models and that person's voice, but it's not really them. And then, when something does go really, really bad, that person is able to dial in, connected directly, and now the AI is out of it, it's the real human now, but it sounds exactly the same.

0:41:14 - Eugene Mandel
Most likely yes, and that actually brings up a very interesting question: should the customer know whether they're talking to technology or to a human? It's not an easy question, by the way, right, because I was dealing with AI for customer support a couple of companies ago and there we actually had to deal with this, and back then at least my opinion was that it's better to be upfront, right, because when a customer thinks that they're talking to a human but it's actually a bot, then any slip causes this "wait, I've been cheated here" reaction, right.

0:41:53 - Tyler Wells
So does it become an ethical question? Is it a question of ethics, or is it a question of trust, or they kind of go hand in hand.

0:42:01 - Eugene Mandel
I think they go hand in hand, right, because in customer support, actually, one of the reasons I really like this field is that the goals of the company and the customer, in the best possible scenario, are not actually against each other. You remember when I was talking about this analogy of tango and the science of conversations? I actually used to use boxing as the analogy before that, and then it hit me that, wait, technically it's a good analogy, but in boxing, well, it's there.

0:42:35 - Tyler Wells
You're punching each other. You're punching, you're getting punched in the face.

0:42:39 - Eugene Mandel
In tango? In tango, both dancers actually have the same goal: to have a really good number.

0:42:45 - Tyler Wells
Yeah, the boxing one is a little more adversarial, and you don't want adversarial in customer support. Yes, but it was interesting when you brought up the point: do you tell the customer or not tell the customer that they're first talking to AI? And it made me think, I remember reading a study somewhere that said people talk horribly to their Alexa. They're super rude to their Alexa, even though Alexa sounds exactly like a human voice, but they know that it's not, and so they're super rude to it. And their kids were picking up on the rudeness and were talking with this rude behavior. So I tend to wonder, if you told the customer that, hey, you're dialing into our support center, you're going to be talking to an automated AI agent that sounds exactly like a human, but at some point a real human may come in, would they just be completely rude in the very beginning? And how would they know, if they sounded exactly the same?

0:43:43 - Eugene Mandel
Oh, that's such a good question. So, first of all, yeah, I did hear about that with Alexa. I actually think that if you say, and actually this is great, if you say that currently you're talking to technology or to an automation, but the agent is right there at the fingertips and might get in at any time, I think they would actually treat it as a human and would not be rude. Yeah, and you actually achieve this kind of best of both worlds.

0:44:14 - Tyler Wells
Yeah, I would hope. I would hope because if all of a sudden they started off in this horrible manner, demeaning manner, degrading manner, and then they don't realize it's the agent, all of a sudden the agent is like whoa, wait a second, what the you know? They all of a sudden start firing back at the person because they're like, oh, I'm talking to a real person, I'm sorry.

0:44:34 - Eugene Mandel
And then they do exactly. Oh, I didn't know you were a human. I'm sorry.

0:44:40 - Tyler Wells
Yeah, you sound exactly the same. Well, no, I'm, I'm. I put you know, sally, the AI over here and now I'm really Sally and you're hurting my feelings. Exactly, yes, yeah, because the AI just won't have any.

0:44:51 - Eugene Mandel
The AI doesn't care, you know, but Sally's going to be like, my God. But this is a whole family of those unbelievably interesting questions that the current state of AI brings in, right? Obviously, with ChatGPT and everything, more people start using it. What does it mean, right? Starting from, let's say, if you wrote a blog post and you let ChatGPT edit it, is it still yours? Well, yeah, I think that's not controversial, right? But if it wrote most of it and you edited it, and then you agreed with it, is it still yours? I don't know, maybe the answer is it doesn't even matter, right? But we never had to answer questions like that.

0:45:36 - Tyler Wells
This is true. I've kind of gone with "it slightly doesn't matter," because it's made my life obviously more efficient and simpler. If I'm having to write copy, hey, I give it a bunch of ideas and I'm like, go give me three paragraphs on this that I'm going to put up on LinkedIn. Largely it comes out and it's like, that's okay, it kind of gets it part of the way there. Then I go in and edit it myself and make it sound maybe a little bit more like me. So it's this hybrid thing, and I don't know if I would be putting a little asterisk down there, like, I wrote part of this, and so did, you know, Llama version 6.5, I don't know. But yeah, it's an interesting time to be alive and a strange place to think about what is real versus what is not, and what is mine versus the AI or the LLM and the way that it was programmed or the model it was trained on.

0:46:32 - Eugene Mandel
Oh, absolutely yeah, and I think like being at the kind of in between people that develop this and people that consume this, it's like like I've been like having so much fun, right, because it's you have to both make it happen. You have to make sure that people understand it, you have to make sure that people expect something like something reasonable from this. I mean, it's been amazing, yeah.

0:46:55 - Tyler Wells
Definitely. So let's get back, we were talking about a real-world example where something started off as an experiment, and hopefully we can have one that started as an experiment and then turned into a product, or how do you kill it, or when do you decide to kill it? We got off on a fun tangent there to talk about some of the crazier future. But go ahead, sir.

0:47:17 - Eugene Mandel
Let's talk about one that is turning into a product as we speak, right? Okay. So we talked about how every CX department does QA. They have their own scorecard, and in the scorecard, because I saw so many scorecards from different clients and prospects, you start seeing that every CX department comes up with their own. But if you analyze them, probably a third, maybe half of the scorecard is kind of touching on the same things. It's not knowledge specific to the company.

But what does it mean to be a good professional conversationalist? Like your empathy, your managing of the conversation, being concise, being clear, being empathetic. And then we started thinking, well, first of all, why do they each create those scorecards on their own? What if we came up with a scorecard that would just handle this, you know, empathy, I call it a "good conversationalist" scorecard? And what if we actually evaluated conversations against it automatically using an LLM? So it started.

Then we started trying, right? So first of all, you need to assemble a list of interesting questions and validate them with possible clients, right? So, as always: is it worth doing, is it possible to do, and then how much will it take to actually do it? From very early interviews it started being clear that, yes, they would accept it and it's very valuable. Is it possible? Well, we kind of started playing with ChatGPT, just giving it anonymized transcripts, asking questions like: here's a transcript, did the agent apologize appropriately? Did the agent acknowledge feelings? Did the agent do this or that? And ChatGPT started, in its own way, giving answers, sometimes incredibly insightful, sometimes completely bonkers, and sometimes in between.
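A rough sketch of the kind of experiment described: hand an anonymized transcript to an LLM and ask scorecard questions one at a time. `call_llm` is a placeholder for whatever chat-completion client is used, and the questions and prompt wording are illustrative, not Loris's actual scorecard.

```python
# Sketch only: evaluate one transcript against a few scorecard questions.
# `call_llm` is a hypothetical stand-in for a chat-completion client.

SCORECARD_QUESTIONS = [
    "Did the agent greet the customer by name?",
    "Did the agent acknowledge the customer's feelings?",
    "Did the agent apologize appropriately (without over-apologizing)?",
]

PROMPT_TEMPLATE = """You are evaluating a customer support conversation.
Answer the question with YES, NO, or NOT_APPLICABLE, then one sentence of
evidence quoted from the transcript. Do not speculate beyond the transcript.

Transcript:
{transcript}

Question: {question}
"""

def evaluate_transcript(transcript: str, call_llm) -> dict[str, str]:
    """Ask each question separately to keep answers small and checkable."""
    results = {}
    for question in SCORECARD_QUESTIONS:
        prompt = PROMPT_TEMPLATE.format(transcript=transcript, question=question)
        results[question] = call_llm(prompt)  # constrain and parse in real use
    return results
```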

0:49:14 - Tyler Wells
Right.

0:49:15 - Eugene Mandel
That hallucinating.

0:49:16 - Tyler Wells
It was hallucinating. That's what they call it, right? Hallucinations.

0:49:21 - Eugene Mandel
Yes, reading those answers when it hallucinates, it's just amazing, especially when you try to understand what's going on. When you ask it to extract goals and problems from customer messages, it's very easy for it to immediately start climbing all the way up the Maslow pyramid of needs. Somebody says, okay, I want a refund, right, so I guess the goal of the ticket is, I want a refund. But then it comes up with: the customer's goal is managing their finances efficiently, and not getting a refund is a problem preventing them from achieving it. I was like, oh, no, no, no. If I don't stop you now, you will probably say that the goal of the customer is self-actualization, all the way up the pyramid.

0:50:12 - Tyler Wells
So we started experimenting with prompt engineering.

0:50:18 - Eugene Mandel
Okay, by the way, I don't like calling prompt engineering engineering, because engineering implies something much more repeatable and organized, but it's like a prompt.

0:50:29 - Tyler Wells
Yeah, okay, it's prompt crafting. Yeah, it's prompt crafting.

0:50:34 - Eugene Mandel
Yeah, exactly, or prompt creation. So we started experimenting, right: basically, how much freedom to give it, what context, what definitions to give it, giving it answers, giving it examples, and slowly it started working. But then we saw that we had come up with maybe close to 30 questions that would be good to put on the scorecard as we started developing prompts. One of the questions in the experiment was, well, is it possible to get consistent answers? And let's watch ourselves and see how long it takes to develop this per question. And we saw that, okay, rolling out all 30-plus, that's not really feasible. Let's identify the four, five, six most important ones to roll out as an actual feature.

We identified them by testing them on multiple customers, and this actually moved from an experiment to a feature that is already on the roadmap. We want to release it most likely next month, and of course, beyond the model, you have to think about product managing it in all other ways, right? Will the clients be able to edit them? Will the clients be able to pick and choose which line items they want? If they disagree with the decision of the model, can they mark it up? What will we do with this markup? How do we think about priority, what expectations do we create? But yeah, that was a really good example of moving from experimentation to a product. And it's challenging, right, because as a PM you always have to talk to,

well, in my case as a head of product or head of AI, I also have to talk to my CEO, right? And managing those expectations is not easy either, because you have to be very clear about what we know, what we don't know, where we can create an estimate and where we cannot. So you could say that it's still product management, but it's definitely an update on how normal, regular product management is done.

0:52:48 - Tyler Wells
Now that makes a lot of sense.

Let's take a step back; you said something that was interesting to me.

It's something that we're thinking about too: the flexibility of letting your customers inject prompts, right? Writing their own, or being able to add to it, a little more autonomy.

So it's not just a black box. That's something we've thought a lot about as well, because at one point, it might have been me going down this path with my co-founder, Nico, as we were thinking about how we can use it to better help people create metrics, insights, everything else on top of their data. I was thinking about it from the standpoint of: hey, here are the metrics I have, just go do this magic and bring back a bunch of stuff. That was designing a black box, whereas Nico was more of the mindset of: hey, there are things that we should pre-populate in those context windows that are not going to change, but really allow them the flexibility to provide their own prompt crafting, I stayed away from calling it engineering, to feed into the context and into the LLMs. What have your thoughts predominantly been on that?

0:54:04 - Eugene Mandel
So, when it comes to this good-conversation scorecard, for now we are not even taking the risk of letting clients inject their own prompt, like we wanted to, right? Because that was one of the first really happy ideas: imagine if they can create their own line item and we'll just be able to automatically evaluate it on hundreds of conversations. Nirvana, amazing. And then we saw how many guardrails you have to build. If I cannot predict what they will say, and I'm not there to ask "what do you mean by appropriate to respond or not?", the LLM will not figure it out either. So for now we're going with a predefined scorecard, but it's defined in a way that captures the commonalities from many, many scorecards that we've seen. It's a very interesting compromise. When we start experimenting with more custom things, I would probably do two things. First, yes, provide a lot of context that the clients cannot change, and part of this context would probably explain how to interpret the injected prompt. And the second thing would be to break the prompt into very small atomic parts, which is actually something that we are about to do now.

So let's say one of the line items in the scorecard is: did the agent appropriately greet the customer? Appropriately greeting the customer means, I guess, if the name of the customer is known from the beginning, you definitely should greet them by name. If it's an anonymous chat, obviously it's just "hi there." If it starts as an anonymous chat and then you authenticate yourself, you probably should use their name somewhere later. And we're talking about the simplest of possible behaviors, and already there are three branches going all possible ways. So: creating very small line items, like "greeting" (just greet, I don't care how), "greeting by name," "using the customer's name somewhere in the conversation," very small tasks, almost like following the Unix philosophy of small, composable tasks, and then letting the client pick and choose what makes sense for them.
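
[Editor's note: a rough sketch of the "small composable line items" idea, composing a scorecard from atomic checks in the Unix-philosophy spirit Eugene describes. The item names, prompts, and structure are hypothetical, not Loris's actual schema.]

```python
# Compose a scorecard from small, atomic line items that clients can pick and choose.
from dataclasses import dataclass

@dataclass(frozen=True)
class LineItem:
    key: str     # stable identifier clients select by
    prompt: str  # the narrow question the model is asked per conversation

# A library of atomic checks; each one does exactly one small thing.
LINE_ITEM_LIBRARY = {
    "greeted":          LineItem("greeted", "Did the agent greet the customer at all?"),
    "greeted_by_name":  LineItem("greeted_by_name", "Did the agent greet the customer by name?"),
    "used_name_later":  LineItem("used_name_later", "Did the agent use the customer's name later in the conversation?"),
    "acknowledged_feelings": LineItem("acknowledged_feelings", "Did the agent acknowledge the customer's feelings?"),
}

def build_scorecard(selected_keys: list[str]) -> list[LineItem]:
    """Let a client compose their own scorecard from the predefined atoms."""
    return [LINE_ITEM_LIBRARY[k] for k in selected_keys]

# e.g. a client that runs fully anonymous chats might skip the by-name items:
# scorecard = build_scorecard(["greeted", "acknowledged_feelings"])
```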

0:56:41 - Tyler Wells
Because I guess, I mean, obviously the big fear here is if you open it up too much and you allow them to inject a bunch of prompts, and the AI decides to eat a handful of mushrooms and provide back a bunch of hallucinations, and the agent just starts regurgitating those hallucinations as if they're correct answers. You could have a really bad experience with that particular agent because of the AI going off the rails, and you can't really allow that.

0:57:15 - Eugene Mandel
Exactly, yes. These line items, this scorecard, it's basically QA, evaluating agents, right? So it's kind of high stakes, because in the end those QA scores become part of people's bonuses, or whether they're retained or not retained. So you can't allow too much looseness. It's not exactly autopilot in a self-driving car, but it's not a game either.

0:57:44 - Tyler Wells
No, people's livelihoods are at stake, and then that company's reputation is at stake. Additional revenue could be at stake. I mean, there's a whole bunch of real-world, tangible things that could be compromised if that thing goes off the rails or doesn't work the way it's supposed to. Interesting. So there must be an incredible amount of testing and guardrails you're having to put in to prevent that. And then if it does go off, what do you do to detect it?

0:58:12 - Eugene Mandel
So this is something we're working on: guardrails, yes, and what you do to detect if it went off the rails. And it doesn't apply only to this, because we have multiple models built in-house: models that predict CSAT, models that infer sentiment, models that infer what we call conversation markers, kind of interesting situations in conversations, and models that predict contact drivers, like what was the main reason for this call, chat, or email. Every one of those models has some kind of error rate, sometimes because of problems with definitions, sometimes because of problems with training sets, sometimes because of inherent problems with the models themselves, and sometimes because the reality around the model changes, what's called concept drift. This is something we are actively working on now.

You obviously need some kind of monitoring, you need retraining, and you need robust user feedback. Now, robust user feedback is where it becomes very interesting, because you have to work with UX people. We have an incredible designer on the team and she's very good at figuring out those micro-interactions, and it's important to make sure that when users give you feedback on your models, they don't perceive it as "oh, I'm working as an annotator for this provider." No, it's data exhaust, right? Ideally they're doing something that they're interested in, and when they disagree with a prediction, they fix it for their own purpose, but for us it becomes great feedback.
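
[Editor's note: a hedged sketch of the feedback idea above, where a user's correction, made for their own purposes, is quietly recorded as a labeled example for retraining. The event shape, function names, and append-to-file storage are illustrative placeholders.]

```python
# Turn "the user fixed a prediction for their own purposes" into retraining data.
import json
import time

def record_correction(conversation_id: str, model_name: str,
                      predicted_label: str, corrected_label: str,
                      path: str = "feedback_log.jsonl") -> None:
    """Store a user's correction as an implicitly labeled training example."""
    event = {
        "ts": time.time(),
        "conversation_id": conversation_id,
        "model": model_name,
        "predicted": predicted_label,
        "corrected": corrected_label,  # this becomes the training label
    }
    with open(path, "a") as f:
        f.write(json.dumps(event) + "\n")

# In the UI, this would fire when the user edits, say, a contact-driver tag
# while doing their normal work -- they never see an "annotation" task.
```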

1:00:00 - Tyler Wells
Reinforcement, basically reinforcement learning, right? Yes, yeah.

1:00:05 - Eugene Mandel
Yeah, and this is what I think we all discovered and understand now: the product of data science in our company is not just the models, it's the machine for building and improving the models. It's a very important distinction, right? Because it's not that data scientists work with everything done in the dark and then the model is integrated into the product, great, okay, but what next? How can you give feedback, how can you retrain the model, how can you keep it on guardrails? So it's the machine for building new models, especially in language.

I think a lot of interesting models, not base models, but features, can be defined by people who are not data scientists and don't necessarily even work for our company. If we talk about conversation markers, let's say somebody gets an idea: oh, you know what, I saw that sometimes customers blame agents, like "I think you don't understand me," and we don't have a conversation marker like that. Imagine that they start experimenting: can I train the machine to recognize this situation? Do a search, come up with some messages, say okay, this is a positive example, this is a negative example, and train it. You actually can create a machine that builds those models in response to a non-technical subject matter expert user. Now that's the holy grail, right?
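
[Editor's note: a minimal sketch of the "machine that builds models from a subject matter expert's examples" idea, where the expert supplies positive and negative messages and a small text classifier is fit on them. It uses scikit-learn; the marker name and example messages are made up, and this is not Loris's actual pipeline.]

```python
# Fit a tiny conversation-marker classifier from expert-chosen examples.
from sklearn.pipeline import make_pipeline
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression

positive = [  # expert-chosen examples of a "customer feels misunderstood" marker
    "I don't think you understand what I'm asking",
    "You're not listening to me at all",
]
negative = [  # messages where the marker is absent
    "Thanks, that solved my problem",
    "Can you tell me when my order will arrive?",
]

texts = positive + negative
labels = [1] * len(positive) + [0] * len(negative)

marker_model = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)),
                             LogisticRegression())
marker_model.fit(texts, labels)

# Ideally this flags the marker on a new, unseen message:
print(marker_model.predict(["I feel like you don't get my issue"]))
```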

1:01:42 - Tyler Wells
That's yeah, cause I didn't say even sometimes you may not want your scientist training those models on something like that, because they're they're far away from the customer, they're far away from the experience. So you could imagine if you took your most, your most sort of proficient CX agents, folks with the most experience, and said, okay, what are the things that you, what can you extract out of their minds to train better models and allow them to provide that input that can, and then it, then it actually goes out and does all of the training.

1:02:12 - Eugene Mandel
That would be really cool. Exactly, I believe that. From a 500-agent call center you will go to a 50-agent call center, and the remaining 50 agents, well, let's call them support analysts. That could actually be their job, right? Being these conversation engineers, coming up with: when the customer says this, actually this is the appropriate response, let's train this, let's verify this. It's not exactly the same skill as an agent, but it's not a data scientist, it's not a data analyst. It's somewhere in between, a very capable subject matter expert.

1:02:49 - Tyler Wells
Yeah, they're more like behavioral scientists. They're different. They understand the human psyche and human nature, and they're probably more grounded in the humanity of it than the science or the math behind it.

1:03:05 - Eugene Mandel
Exactly, yes. Every time you try to come up with an ML-based product where humans and machines collaborate in some business use case, one of the most interesting questions is how to create the division of labor. What work is best done by machines? What work is best done by humans? The products that get this question right are the ones more likely to succeed, and for us it's still an open question.

1:03:31 - Tyler Wells
Got it. So, speaking of open questions, what are you most excited about that's coming up, say in the next six months, out of Loris? Anything where you're just like, this is going to be groundbreaking, or anything you can talk about?

1:03:44 - Eugene Mandel
Things I can talk about. Okay, so maybe I can't talk about specific features, just kind of our philosophy, you know, being a responsible team member, but I can talk about the flavor of it. There are a lot of companies that are AI in CX or AI for CX. When I think about what's different about us, the words I usually use are "opinionated modeling of the domain." We're probably not going to come up with a new algorithm. We have extremely smart data scientists, but that's not our business, right?

Opinionated modeling of the domain is about defining the questions. Just a short preview: one of the things we do is detect contact drivers. Given a conversation, companies usually ask agents to manually create a disposition tag, like what was the main topic of this conversation: was it about a refund, a cancellation of a subscription, a problem signing up for something, so they can analyze it. We do that automatically today: you give us a lot of data, we cluster it and define, okay, here are the topics, and then, based on that, we build a classifier, and now you don't have to ask your agents to do it manually. That works now.

However, when we cluster those topics, anything can become a topic, so we started developing a theory: what does "contact driver" actually mean? It's not just any topic. Usually, when you contact customer support, you talk in the language of goals and problems: what you want to achieve, or maybe what you wanted to achieve, what you're asking for and why, and the things that prevented you from doing it. So what we're working on now, while improving this contact driver automation, is first extracting goals and problems from the raw messages, and only then sending them to the classifier or to auto-clustering.
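
[Editor's note: a sketch of the two-step idea described above, first distilling the customer's goal and problem from the raw messages, then handing that distilled text to a downstream classifier or clustering step. It assumes the OpenAI Python client; the prompt wording, model name, and classifier are placeholders, not Loris's actual system.]

```python
# Two-stage contact-driver pipeline: extract goal/problem first, classify second.
from openai import OpenAI

client = OpenAI()

def extract_goal_and_problem(raw_messages: str) -> str:
    """Distill raw customer messages into a short goal/problem statement."""
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # placeholder model name
        messages=[
            {"role": "system",
             "content": "From the customer's messages, state in one sentence each: "
                        "(1) what the customer is trying to achieve, and "
                        "(2) what is preventing them. Ignore everything else."},
            {"role": "user", "content": raw_messages},
        ],
        temperature=0,
    )
    return response.choices[0].message.content

def predict_contact_driver(raw_messages: str, classifier) -> str:
    """Classify the distilled goal/problem text instead of the raw transcript."""
    distilled = extract_goal_and_problem(raw_messages)
    return classifier.predict([distilled])[0]  # e.g. "refund", "cancel_subscription"
```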

This, for me, is an example, my favorite example recently, where we didn't come up with any kind of new base model.

Yet helping the model pay attention to particular parts of the text improves the quality of the output immensely. From my time at Great Expectations, I'm very much used to that: one of the most bang-for-the-buck things you can do for your model is give it cleaner, more specific data. So this is just a preview: most of our features will involve this opinionated modeling of the domain of customer support conversations, cleaning data, helping models pay attention to a particular thing in the conversations, and going from there. Actually, one more related thing. When people talk about alignment, they always talk about summaries, and this is one of my pet peeves, because when people ask, well, can ChatGPT create a good summary of something, it makes me very annoyed, because I don't believe there is such a thing as just a good, like the best, summary of a text, because whether I would agree with that summary...

Yes, thank you. Whether I agree with a summary is driven by what you care about. If you have a movie, you can summarize it in 100 different ways: you can summarize it around the plot, you can summarize how it makes you feel, you can summarize the level of excitement it generates. Many different things, right?

1:07:30 - Tyler Wells
I was gonna say, I agree that a summary, to me, seems like a very personal thing. A summary is an interpretation, right? It's an interpretation of either real events that took place in the world, or a summarization of characters or words from a book or a movie, anything like that. But it's all subject to interpretation. I mean, I have to do this for the podcast.

I take the transcripts and run them through some AI and say, you know, give me the summary of this. And I can't tell you how many times I read it and think, that doesn't even sound like my conversation. It sounds like this completely alien thing that has maybe a few talking points that we got to. Half the time I have to rewrite it or just throw it out completely, because I feel it doesn't represent what I would expect a summary of the conversation to be. It'll be interesting when I take this conversation here and run it through the same AI that I use.

1:08:37 - Eugene Mandel
I would love to see that.

1:08:39 - Tyler Wells
I will share it with you. I tell you what, I'll share it with you before I publish it and we'll see. It'll be interesting to see: is it really what we talked about? Does it actually capture our intent? Does it capture the emotion? Does it capture everything that took place? Or is it just this robotic "well, I see some keywords here, I'm just going to put it together," whatever?

1:09:01 - Eugene Mandel
Exactly. So we have this approach to summarization of customer support conversations, I call it a guided summary. It's basically more like a question-and-answer game. We'll probably build our own, but I always say ChatGPT is awesome for just experimenting with things, when I think about the spec, when I think about whether it's even possible. And I thought about: what does it mean to summarize a customer support call? You have maybe five or six questions that you definitely want addressed. What was this call about? Why did the customer contact us? Was it urgent? How did they feel about it? Did we get to a resolution? If not, was there some kind of follow-up that was agreed upon? How satisfied was the customer with the call? Maybe a couple of others. So even with ChatGPT, when you ask it to summarize the call while addressing those points, suddenly the quality of the summary that comes out is like 10 times better.
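
[Editor's note: a minimal sketch of the guided-summary idea, asking the model to summarize while addressing a fixed set of questions rather than producing a generic summary. It assumes the OpenAI Python client; the question list and model name are illustrative.]

```python
# Guided summary: constrain the summary with explicit questions to address.
from openai import OpenAI

client = OpenAI()

GUIDING_QUESTIONS = [
    "Why did the customer contact us?",
    "Was the issue urgent?",
    "How did the customer feel about it?",
    "Did we reach a resolution?",
    "If not, what follow-up was agreed upon?",
    "How satisfied did the customer seem at the end?",
]

def guided_summary(transcript: str) -> str:
    """Summarize a support conversation while addressing the guiding questions."""
    bullet_list = "\n".join(f"- {q}" for q in GUIDING_QUESTIONS)
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # placeholder model name
        messages=[
            {"role": "system",
             "content": "Summarize the customer support conversation, making sure "
                        "to address each of these points:\n" + bullet_list},
            {"role": "user", "content": transcript},
        ],
        temperature=0,
    )
    return response.choices[0].message.content
```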

1:10:06 - Tyler Wells
I like that. I should play with that, because just saying "write a summary" is pretty open-ended, right? You're leaving it open to a lot of interpretation and leeway, whereas if you constrain it, the outcome becomes far more useful and far more actionable, especially with CX. You want that summary to provide contextual representation with actionable insights that somebody may follow up on, that you can understand.

Is it solved? Is it not solved and did I leave the customer in the lurch? Is my customer happier or angrier now? Is my brand reputation taking a hit? There are all these things you want to look at and understand, which also come into the scoring, of course, and everything else like that. But yeah, I could see how that's interesting. I should try that too. I'll take our conversation when I'm done here, at some point in the next week, and first just say "create a summary." Then I'll put in five questions that I want answered as part of that summary, and we can play around with it.

1:11:12 - Eugene Mandel
I think a really good way to model this is to imagine how a human conversation about the same topic would happen, and then put that in a prompt. Imagine one agent is starting their shift and needs to get the picture from the agent who is ending theirs: okay, I'm going to take over, can you summarize this call that I need to follow up on? You would probably start asking those questions: what was it about? Was it resolved? If it was resolved, maybe I don't even need to know more; if it wasn't, then what else? So it's possible that in different use cases, for different people, the most effective summary of the same call can be slightly different, or maybe not even slightly different, very different.

1:12:02 - Tyler Wells
Probably very different, yeah, depending on who's going to read it, who's going to interpret it, and who's going to want to do something with it or, like you said, just cast it aside. Super interesting. Well, Eugene, I appreciate you hanging out with me on a Friday afternoon. I know you've got another call coming up, and I think we could keep going for a while, delving into all types of fun stuff, but let's do one last question here, and we're going to go for predictions. Any large predictions of where AI will be, let's say, this time next year, 12 months from now? Anything that you're sort of like, ah, this is going to happen, anything that you're thinking about?

1:12:41 - Eugene Mandel
Wow, right now, making predictions about AI might be a bit of a fool's errand, but why not?

1:12:48 - Tyler Wells
It's Friday. It's Friday, you know, it's a nice and fun day. Exactly.

1:12:54 - Eugene Mandel
I think within the year. Okay, so what current LLMs are not particularly good at is true reasoning, real logic. I think, and this is somewhere between "think" and "hope," that, while I'm not saying they will achieve human-level awareness or anything like that, a year from now we will have an LLM whose ability to reason, to really build pretty long chains of logical arguments, will be significantly better, different by degrees from what we see today, and this will yet again open up a new family of use cases that it will be possible to build. And in a year I might be completely wrong, and no logic will be there, but the style will be better.

1:13:49 - Tyler Wells
I do hope we don't go a year without talking again. I'm sure we will either get back on here or at least catch up here in the near future. But again, very much appreciate you hanging out with me today on the podcast. This was a ton of fun and one of these days we will do it again. So enjoy your weekend and we'll talk soon.

1:14:08 - Eugene Mandel
Tyler, thank you so much. It's been really, really fun, thank you.

