Democratizing Data Access: Insights into Hightouch with Kashish Gupta

Episode Description

In the latest episode of the Data Chaos podcast, I had the privilege of hosting Kashish Gupta, the co-founder and co-CEO of Hightouch, a reverse ETL, data activation, and composable customer data platform SaaS company. 🧑‍💼💻

Key conversation highlights include:

🔍 The complexities of managing chaotic customer data

🔧 Hightouch's engineering approach to creating a schema-agnostic product

🌐 The benefits of Snowflake for data management

🔨 The complexities of building a Customer Data Platform (CDP)

🧩 Hightouch's 220 unique data connectors to help customers synchronize their data to various SaaS tools, leveraging strong abstraction frameworks to simplify the process.

🎯 Hightouch's latest product line, the Customer 360 Toolkit

This episode is a gold mine for anyone looking to understand and navigate the chaotic world of customer data platforms. 💡
Don't miss out on the enlightening insights and practical tips shared by an industry expert. Tune in now! 🎧🎙️

(0:00:11) - Chaos and Entropy of Customer Data (7 Minutes)
I chat with Kashish Gupta, co-founder and CEO of Hightouch, a reverse ETL data activation and composable customer data platform SaaS company. We discuss the chaos and entropy of data in organizations, the difficulty in building a CDP or customer data platform, and the complexities of creating a true customer 360. Kashish shares his insights on the entropy of customer data and how Hightouch has built over 220 connectors to help customers sync data to their SaaS tools. He also talks about the ORM framework, strong abstraction frameworks, and how they enable developers to build integrations with minimal effort.

(0:07:12) - Data Abstraction and Schema-Agnostic Product (9 Minutes)
We discuss the challenges of data synchronization for modern and legacy products, and how Hightouch took an engineering approach to build a fully schema-agnostic product. We also explore the benefits of building a semantic layer and the ability for marketers to build visual queries without writing SQL. Lastly, we consider the implications of companies that don't have this technology and how it limits their ability to serve customers who are willing to pay for their services.

(0:16:10) - Snowflake Benefits for Data Management (9 Minutes)
Kashish Gupta and I discussed the potential of dynamic tables to unlock new use cases in Snowflake, the advantages of using Snowflake for streaming in and out of data, and how Hightouch has helped customers solve complex engineering problems. We also explored how writing data back to Snowflake can have a positive impact on democratizing data access for teams across an organization.

(0:24:55) - Build CDP With Hightouch and Identity Resolution (7 Minutes)
We explore how many companies are able to build a Customer Data Platform (CDP) in-house using existing engineering infrastructure such as Airflow jobs, AWS hosting, and Lambda functions. For those companies that are able to build a true CDP, it typically takes two years and a team of at least 10 people. The majority of companies don't have the engineering resources to build a CDP and instead rely on ad hoc engineering work to maintain and update data pipelines. Hightouch's Customer 360 Toolkit was launched to help companies take this engineering work off their plate, providing them with identity resolution and a closer 360 view without the hassle of maintenance and reliability work. We also discuss how Hightouch's solution works natively in Snowflake and Databricks, without the need for data to flow to Hightouch, and the potential of dynamic tables to unlock new use cases in Snowflake.

(0:31:57) - Data Integration in SaaS (4 Minutes)
We continue our conversation with Kashish Gupta, co-founder and CEO of Hightouch, exploring the benefit of an abstraction layer to streamline data integration. Hightouch acts as the logic layer that helps transform data, eliminating the need for manual ETL processes. We examine the potential of dynamic tables to unlock new use cases in Snowflake, providing customers with a richer, continually updating data set. We also consider how the market is shifting to recognize the importance of APIs not just for data transfer, but also to instruct tools to do different things.

(0:36:00) - Building Trust With Customer Data (6 Minutes)
We discuss how Hightouch took an engineering approach to build a fully schema-aware UI for customers to gain insights into their data and how they sync it from one place to another. The importance of giving customers visibility into each step in the data pipeline is discussed, as well as how Hightouch provides an in-depth look at each row for debugging and capturing metrics. The potential of dynamic tables to unlock new use cases in Snowflake, as well as the advantages of using Snowflake for streaming ingestion, is explored. The benefit of an abstraction layer to streamline data integration and how it can help companies build a Customer Data Platform (CDP) in-house is also discussed.

(0:42:15) - Learnings From Launching an Identity Product (6 Minutes)
We examine the importance of customer feedback and how Hightouch took the feedback they received and turned it into a product offering. We explore the challenges that came with building an abstract solution that could work for any company, and the realization that it was possible. We discuss how the product is helping reduce redundant work and the overwhelmingly positive customer feedback that has followed the product launch. Finally, we look at the excitement of learning from customer use cases and continuing to iterate and improve the product.

0:00:11 - Tyler Wells
Welcome to the Data Chaos Podcast. On today's episode, I have a conversation with Kashish Gupta. Kashish is the co-founder and CEO of Hightouch, a reverse ETL, data activation, and composable customer data platform SaaS company. Prior to starting Hightouch, Kashish has been a venture investor and a product manager, and has started two other companies. We dig into the chaos and entropy of data in organizations, the difficulty in building a CDP, or customer data platform, and how achieving a true customer 360 is largely unachievable. So sit back, relax, and enjoy the conversation. Kashish, welcome to the Data Chaos Podcast. It is great to have you here. I am thrilled to be talking to you today. We are not only customers of Hightouch, but we are also fans of Hightouch over at Propel. I think this is going to be a great conversation.

0:01:05 - Kashish Gupta
Hey, Tyler, thanks for having me.

0:01:07 - Tyler Wells
Absolutely. It's great to have you here. So big week for you this week, coming off a $38 million fundraise. That's amazing news, especially in this climate today.

0:01:18 - Kashish Gupta
Thanks, Tyler. I mean, the part that we're really excited about is actually the new identity resolution product that we launched. A fundraise is always good, but those are less important metrics than really what we're building for our customers.

0:01:33 - Tyler Wells
Absolutely. So we're going to talk about that a little more. Without a doubt, no, the customer-centric view is where it's at. That's why we're doing this to begin with, so it's definitely a lot of fun there. So let's talk a little bit. Something we talked about in our intro call was entropy and chaos in customer data. What have you seen in your career, specifically at Hightouch, since you deal with data, a lot of data? Why is there so much entropy? Why is it so chaotic out there for everybody?

0:02:00 - Kashish Gupta
Yeah, it's because enterprises are complex. So there's a saying that we have, which is: everyone is trying to build their customer 360, their full view of the customer journey, that kind of stuff in their business, and everyone that is realistic will realize that's never going to happen. There is no such thing as a customer 360. You're only going to get to 320 or 350, but you're never going to have the full view of the customer because of how complex the journey is. We've been fortunate enough to work with some of the largest brands, and when you look at bigger and bigger companies, they actually have multiple products, multiple websites, multiple mobile apps, so it's very difficult for them to track their user profiles across all of those. So actually, folks will say data chaos or data entropy is a bad thing. I don't think it's a bad thing. I think that's just how the world works.

0:02:47 - Tyler Wells
It's a very normal thing.

0:02:49 - Kashish Gupta
Exactly, it's a normal thing. In fact, it's like in science: you can't say entropy is solvable, entropy is just part of life. I think that's the same thing.

0:02:57 - Tyler Wells
No, I agree, I think it's up to us to help solve that. So, at Hightouch specifically, you started off as reverse ETL. That's right, reverse ETL. You now have over 200 connectors across all of that, and the biggest thing you're trying to do is take all of that data and put it where customers need it, and now you're kind of helping them tame the chaos that exists in all of that data. Talk a little bit more about those 200 connectors. That had to be a huge amount of engineering effort to build and maintain. Can we dig into that a little bit more?

0:03:34 - Kashish Gupta
Yeah, sure. So just context for the audience: Hightouch, well, we started in 2020 as a reverse ETL platform. So, a very simple way to get data from any database into any SaaS tool. All you do is you give us a SQL query, we pull the results of that SQL query, and then we help you map the columns in your SQL query to the columns in your SaaS tool, and we will get the data there. And so we'll run all of the API calls, retries, rate limits, everything to get that data into your different SaaS tools. And it could be a Salesforce, a Marketo, it could be a Facebook or Google ad tool, it could be an email tool like Iterable or Braze, really any SaaS tool that you have in your business. And the whole idea, the vision, the beauty of it was, and it's intuitive: if I update my SQL query, all of my SaaS tools should update with that new definition. So, let's say I tweak the definition of customer propensity score: I can now update all my SaaS tools with that new definition immediately, in, like, laser-fast time, with no extra engineering effort on my part. So that's why it was so exciting for the market, because they were really excited about being able to use their data warehouse as the central control plane for all their different SaaS vendors across their business, of which we now have so many that it becomes more critical over time. That's what we started with. You're totally right. In that journey, we've now built exactly, like, 220 connectors, all destination connectors that help our customers sync data to their SaaS tools.
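The flow Kashish describes (run a SQL query, then map its columns onto the fields of a SaaS tool) can be sketched in a few lines. This is only an illustration, not Hightouch's implementation: the in-memory SQLite database stands in for a warehouse, and the table, column, and destination field names are all made up.

```python
import sqlite3

# Hypothetical stand-in for a warehouse; a real setup would query Snowflake or similar.
warehouse = sqlite3.connect(":memory:")
warehouse.execute("CREATE TABLE users (email TEXT, propensity REAL)")
warehouse.executemany(
    "INSERT INTO users VALUES (?, ?)",
    [("a@example.com", 0.91), ("b@example.com", 0.42)],
)


def run_model(conn, sql):
    """Run the user's SQL query and return each row as a dict keyed by column name."""
    cursor = conn.execute(sql)
    columns = [description[0] for description in cursor.description]
    return [dict(zip(columns, row)) for row in cursor.fetchall()]


def map_rows(rows, column_mapping):
    """Rename warehouse columns to destination field names."""
    return [
        {dest_field: row[src_col] for src_col, dest_field in column_mapping.items()}
        for row in rows
    ]


# Warehouse column -> destination field (both sides are invented for this sketch).
mapping = {"email": "Email", "propensity": "Propensity_Score"}
rows = run_model(warehouse, "SELECT email, propensity FROM users ORDER BY email")
payload = map_rows(rows, mapping)
print(payload)
```

Changing the SQL query changes `payload` for every destination that uses the same model, which is the "update the definition once" property discussed above.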

The reason we were able to do this is because in the early days, we had really strong abstraction frameworks for our developers to build integrations. So I think one thing that Hightouch has as just a leg up against other folks in the market is really how much we've invested in our frameworks, and I'll just give you a couple of examples of that. We have what we call an ORM framework, so ORM is object relationship model, and we build most of our destinations under this framework. So our engineers are actually only writing a few functions in that framework, and then many things are done for them. So the actual sync logic is done, retries is done, the dead letter queue is done for them.
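To make the division of labor concrete, here is a minimal sketch of that kind of framework, assuming a connector author implements a single `upsert` function while a shared engine owns retries and the dead letter queue. The class names, the retry count, and the toy destination are assumptions for illustration, not Hightouch's actual framework.

```python
from abc import ABC, abstractmethod


class Destination(ABC):
    """Hypothetical connector interface: the only code a connector author writes."""

    @abstractmethod
    def upsert(self, record: dict) -> None:
        ...


class SyncEngine:
    """Shared engine: retries and dead-lettering are written once for all destinations."""

    def __init__(self, destination, max_attempts=3):
        self.destination = destination
        self.max_attempts = max_attempts
        self.dead_letter_queue = []  # (record, error message) pairs that never succeeded

    def sync(self, records):
        succeeded = 0
        for record in records:
            for attempt in range(1, self.max_attempts + 1):
                try:
                    self.destination.upsert(record)
                    succeeded += 1
                    break
                except Exception as error:  # a real engine would back off and classify errors
                    if attempt == self.max_attempts:
                        self.dead_letter_queue.append((record, str(error)))
        return succeeded


class FlakyCrm(Destination):
    """Toy destination: fails the first call per id and always rejects records without an id."""

    def __init__(self):
        self.seen = set()
        self.stored = []

    def upsert(self, record):
        if "id" not in record:
            raise ValueError("missing id")
        if record["id"] not in self.seen:
            self.seen.add(record["id"])
            raise ConnectionError("transient failure")
        self.stored.append(record)


engine = SyncEngine(FlakyCrm())
synced = engine.sync([{"id": 1}, {"name": "no id"}, {"id": 2}])
print(synced, engine.dead_letter_queue)
```

The two records with ids succeed on their second attempt, and the record without an id ends up in the dead letter queue after three failed attempts.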

We generate the UI automatically for every destination that falls under this framework, and the reason we're able to do that is because Sean, one of our first front-end engineers in the early days, actually built this thing that we created internally. It's called FormKit. The idea is that our developer puts into a JSON the different attributes of this destination and what they need in the front end, and then the front end is automatically generated in a very modular way. So the reason we can build 220 destinations is because a very small subset of those destinations actually require front-end work, and almost none of them require a sync engine, because the sync engine is shared. So I think we got quite lucky, and I wouldn't say it's luck. I would say, really, credit goes to our CTO, Josh, and then our architect, Kevin Lin, for really building those incredible frameworks in the early days.
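The JSON-to-front-end idea can be illustrated with a small sketch. The spec shape, field types, and text rendering below are invented for this example and say nothing about FormKit's real schema; the point is only that one generic renderer can serve every destination that supplies a declarative description.

```python
import json

# Hypothetical declarative description of a destination's settings form.
FORM_SPEC = json.loads("""
{
  "destination": "example_crm",
  "fields": [
    {"key": "api_key", "label": "API key", "type": "secret", "required": true},
    {"key": "object", "label": "Object", "type": "select", "options": ["Contact", "Lead"]},
    {"key": "batch_size", "label": "Batch size", "type": "number", "default": 100}
  ]
}
""")


def render_field(field):
    """Turn one field description into a line of a plain-text form."""
    required = " *" if field.get("required") else ""
    label = f"{field['label']}{required}"
    if field["type"] == "select":
        return f"{label}: ( {' | '.join(field['options'])} )"
    if field["type"] == "secret":
        return f"{label}: [hidden input]"
    return f"{label}: [{field.get('default', '')}]"


def render_form(spec):
    """Generic renderer shared by every destination."""
    return [render_field(field) for field in spec["fields"]]


form_lines = render_form(FORM_SPEC)
print("\n".join(form_lines))
```

Adding a new destination in this scheme means writing another JSON document, not another form.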

0:06:27 - Tyler Wells
It's hard engineering there, definitely without a doubt. So if you think about all of those in the beginning, did you ever track any metrics of how long it would take the team to go from say like, okay, you've got the first 10 connectors to the second cohort of another 20 to 50, and were you ever looking at those metrics and saying, okay, how is this getting better and faster for time to market for a new connector?

0:06:48 - Kashish Gupta
We didn't do a good job of tracking it. But I can tell you that in the early days when you could get away with shipping any code, it would sometimes take us a day or two per connector. Then we got much more rigorous. We had enterprise customers. We had to have everything locked tight. It became a week or so per destination. Then, as our frameworks improved, we actually started being able to ship fast again. But it really depends, because there's destinations like NetSuite or SAP that might take you multiple months to develop. Other ones might take you only a couple of days, and so it just really depends. That's actually the interesting part of data, which is that some types of tools accept data in a very clear format in a very easy way, and others might be completely abstract. They might just be a database in themselves, like NetSuite is, which is why it takes so much more effort to sync data to NetSuite.

0:07:40 - Tyler Wells
Do you think it was harder with some of those legacy products? I look at NetSuite and SAP. I mean, yes, they're still out there, but they're probably not as modern as a number of the other pieces that you support there. Were those always more difficult to do just because of the legacy nature of them?

0:07:56 - Kashish Gupta
Yes, exactly. Oftentimes it's also correlated with how on-prem these are or how secure they are. So, authing in NetSuite, there's a double-step auth that you have to do, so it doesn't fit under a normal auth framework. I don't want to blame it on just being legacy software. It's also sometimes that the security need for those kinds of software is just really high.

0:08:19 - Tyler Wells
Vastly different. Yeah, definitely very different, though I could see that. Super interesting. With those abstractions, it's a big investment. There's an investment that Hightouch had to make from an engineering perspective into those abstractions. Obviously, you could have just said, hey, we're just going to churn these things out and not worry about it, maybe take a less architectural or hardcore engineering approach to this. But you said those abstractions, I think at one point we talked, have helped out in a number of ways beyond just adding additional connectors. How have those abstractions impacted what you can do and build at Hightouch internally, obviously?

0:09:00 - Kashish Gupta
Yeah, so this goal is actually tied to both abstraction and engineering culture. But a lot of engineering teams back away from building kind of like the end vision. They want to incrementally build towards that. They want to build the MVP, which is really smart. Right, you want to ship for speed, but in certain instances the customer is best served by kind of like the entire vision.

For us in this case it was to build a fully schema-agnostic product. So in most products, when you use them, you have to kind of send that product your data in that product's format, and you have to constrain your data to their format. That's the most common paradigm in SaaS. We wanted to build something that was completely schema-agnostic, because every single customer we work with is very different from each other, and so in the early days folks said you can't build a product that works for B2B teams as well as B2C teams. Their data looks really different. B2C people have users, B2B people have companies with users underneath, and so that's why we built a schema-agnostic product from day one, and I can kind of show you a little bit of what that looks like. But you'll see that it's really difficult to imagine in advance what something like this would be, but over time it becomes more and more clear that we need something like this. So, like, just to quickly show you, being able to give our users these nice frameworks.

Your tables might be users, contacts, and orders. Someone else's might be companies, SaaS contracts, and something else.

So the model in every company is completely different, and it's actually not possible, in my opinion, for SaaS tools to follow a very strict data framework anymore. I think they need to be somewhat schema-less, and I think that's one of the trade-offs that we made in the early days.

We said we're going to build a sync engine, but it's going to be fully schema-agnostic. So you bring your warehouse schema to us and then you bring your SaaS schema to us as well. So we'll pull from Salesforce what is the table in Salesforce that we're syncing to. We'll pull from your warehouse the same thing and we'll help you match those up together. And then eventually, we launched a product for business users to build these queries without writing SQL, and the way to do that is we gave them a schema builder as well, where they could tie together the join columns between different tables, which would instruct our query builder how to turn their visual queries into SQL. And so, effectively, we built a semantic layer. We weren't calling it that, but we built a semantic layer way back in 2021. Tons of customers are on this and that's exactly why their IC marketers are able to build visual queries in our product, because this semantic layer informs our query builder on how to run those queries in the warehouse.
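A toy version of that idea: the "semantic layer" is just a record of which columns join which tables, and a query builder uses it to turn a point-and-click condition into SQL. Everything here (the `Relationship` type, the table names, the single supported condition) is a hypothetical simplification, not Hightouch's schema builder, and SQLite stands in for the warehouse.

```python
import sqlite3
from dataclasses import dataclass


@dataclass(frozen=True)
class Relationship:
    """One join declared in the schema builder: child.child_column -> parent.parent_column."""
    parent_table: str
    parent_column: str
    child_table: str
    child_column: str


RELATIONSHIPS = [Relationship("users", "id", "orders", "user_id")]


def build_audience_sql(base_table, related_table, min_count, relationships):
    """Translate 'base rows with at least min_count related rows' into SQL.

    Identifiers come from the trusted relationship config, never from end-user input.
    """
    rel = next(
        r for r in relationships
        if r.parent_table == base_table and r.child_table == related_table
    )
    return (
        f"SELECT p.{rel.parent_column} FROM {rel.parent_table} AS p "
        f"JOIN {rel.child_table} AS c ON c.{rel.child_column} = p.{rel.parent_column} "
        f"GROUP BY p.{rel.parent_column} "
        f"HAVING COUNT(*) >= {int(min_count)}"
    )


# "Users with at least 2 orders", as a marketer might click together visually.
audience_sql = build_audience_sql("users", "orders", 2, RELATIONSHIPS)

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER)")
conn.execute("CREATE TABLE orders (user_id INTEGER)")
conn.executemany("INSERT INTO users VALUES (?)", [(1,), (2,)])
conn.executemany("INSERT INTO orders VALUES (?)", [(1,), (1,), (2,)])
audience = conn.execute(audience_sql).fetchall()
print(audience_sql)
print(audience)
```

Because the join columns live in the relationship config, the same builder works whether a customer's tables are users and orders or companies and contracts.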

And so that was one of the things where folks said that we were crazy and that we were like taking on too much engineering burden and this just doesn't make sense to build such a complex product when you're not even sure that folks will use it. But we had really high certainty that marketers needed this. We had talked to so many of them that we invested the resources really upfront to build it the right way.

0:11:55 - Tyler Wells
But probably an easy way to inform yourselves on whether or not customers are going to use it is: look at the engineering work that a customer would have to do if you didn't support this. So if you weren't able to actually give them this sort of holy grail of coming in and saying, hey, bring your schemas, we don't care, bring us whatever you want, what would that conversation look like if you said, well, here's what we support?

0:12:24 - Kashish Gupta
That's, honestly, a great point. So another thing to think about is a lot of the companies that don't have that. They end up telling their customers: you can't POC this, we're going to sell you 100 hours of professional services, we're going to help you implement this, we're going to hold your hand. And, as a result, they're only able to serve the biggest companies in the world, and their software is pigeonholed really to only serve companies that can pay them a lot of money, which actually doesn't democratize access to this kind of infrastructure. It only builds it for the biggest companies.

And so being able to provide this kind of schema-agnostic infrastructure, where customers themselves, without talking to us, instruct our product on what schema to have, opens up the gates for smaller companies to have access to this great infrastructure that otherwise they'd have to have hundreds of millions of dollars of revenue to be able to afford. So that's what excites us as founders. It's like, of course, we want to make visual querying available for marketers, but sometimes even data people don't want to write the SQL query. It might take them an hour to write the query when you could just point and click, boom, it's done. And we've seen that a lot. We've surprisingly seen really smart data people want to use our visual audience builder and succeed in using it, because it's just easier than writing the query themselves.

Yeah it doesn't have to be amazing.

0:13:38 - Tyler Wells
I was going to say, no, it had to be amazing the first time you put it in the hands of a customer, and to watch them say, well, wait, I don't have to do dbt on this? Wait, I don't have to put some sort of intermediary transformation in between my system and your system in order to gain all the benefit of Hightouch? You're just like, no, no, just send us the data. I mean, that had to be some magic there.

0:14:01 - Kashish Gupta
Exactly, yeah, and like people just feel unlocked, Like I think people are now getting tired of being really caught up in how to do things the right way, like how should I do this in the right way for the next 10 years? Versus I just need to get the thing done. If it works, I'll optimize it later. And so oftentimes they really enjoy that. And I'm not saying that we shouldn't be building dbt models out of these things. Oftentimes we should.

Sometimes you want to write it to memory. That way, when you run the query next time, it will be much faster, right? So for certain use cases, you actually do want to modularize it, you want to write it to memory, you want to index it, and there are reasons why we should be building models. But we should be building models when we want to reuse something, not when we only want to try it one time, see if it works, and if it does work, then we'll build the model. Right, it's just a different mentality of how quickly do you want to get things done and how perfect you want to be on step one.

0:14:50 - Tyler Wells
No, absolutely. I think that makes a lot of sense. You do a lot with Snowflake. Let's talk a little bit about that ecosystem. I know you were just recently at the 2023 Snowflake Summit. You spoke out there. What excites you specifically, in conjunction with Hightouch and Snowflake, that you can do for your customers? Or their customers and yours, which are joint customers in this case.

0:15:18 - Kashish Gupta
The list is so long. I'll give you maybe the more interesting idea here, which is that Snowflake is really on track to become the operational center of a business, not just for analytics queries. So reverse ETL five years ago would not have made that much sense, because your query would run overnight, right? If you run your query once a day, your data is getting synced once a day, which means you're running suppressions on your users for the ad campaigns once a day. Because Snowflake has made it so easy to run these queries faster, without doing tons of engineering work to optimize those queries, you can actually run your suppression campaign every hour.

It's pretty cheap. It runs very efficiently. The engineering work needed to run that kind of operation is really low. So the combination of Fivetran, Snowflake, and Hightouch actually gets you a data stack that functionally can support many operational parts of your business. And so when I think about what excites me, there's all these business things. I could say, oh, the more people that use Snowflake, the more people that'll be ready to do reverse ETL. That's obviously true for our business, but when I think as an engineer, it's the fact that running queries faster and in real time is becoming easier for everyone and we're able to abstract so many things away from our engineers.

Do we need to write an index on this? Like, should we worry about B trees, all these things that you'd have to worry about in the past. I feel like the same type of tooling that used to be created for developers 10 years ago. All the developer tools came up. We're now getting those for data people, and so the infrastructure layer is actually just a lot easier to manage now. And so that's what excites me. One thing I see happening pretty soon is more streaming in and out of Snowflake, and once we can get streaming. I think we can really think of how do we use Snowflake as obviously not a Kafka replacement, but in some ways can we actually just read and write from Snowflake immediately.

Dynamic tables, for example, are just, like, I want to know how they work and if they are going to be able to work as fast as we're hoping. But that's just going to be an incredible way to do diffs in Snowflake.
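For readers unfamiliar with why diffs matter to a sync tool: sending only what changed since the last run is what keeps frequent syncs cheap. The sketch below shows the idea on two in-memory snapshots keyed by primary key. It is a generic illustration with made-up data and does not describe how Snowflake dynamic tables or Hightouch compute changes internally.

```python
def diff_snapshots(previous, current):
    """Compare two {primary_key: row} snapshots and return (added, changed, removed)."""
    added = {key: row for key, row in current.items() if key not in previous}
    removed = {key: row for key, row in previous.items() if key not in current}
    changed = {
        key: row
        for key, row in current.items()
        if key in previous and previous[key] != row
    }
    return added, changed, removed


# Hypothetical query results from the last sync run and from the current one.
last_run = {1: {"email": "a@example.com", "score": 0.4}, 2: {"email": "b@example.com", "score": 0.7}}
this_run = {1: {"email": "a@example.com", "score": 0.9}, 3: {"email": "c@example.com", "score": 0.2}}

added, changed, removed = diff_snapshots(last_run, this_run)
print(added, changed, removed)
```

Only the three affected rows would need API calls to the destination; unchanged rows cost nothing.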

0:17:18 - Tyler Wells
Yeah, we see that as one of those big announcements to come out this year, dynamic tables. I know those have been out for a little while, but they're starting to get a lot of adoption, a lot of traction. Now we're seeing a lot of noise out there, the different posts on Medium and everything else. People are talking about them. We're looking forward to kicking the tires on those as well. I think they're going to unlock a lot of additional use cases that previously you would have to run services for in order to accomplish, which is additional cost and PagerDuty schedules and everything else like that, that you can now just do inside of Snowflake.

0:17:51 - Kashish Gupta
Do you know any examples of folks that are successful with dynamic tables and what they're doing with it?

0:17:56 - Tyler Wells
We've not seen it yet, I've only seen the demos. We're super interested in trying them ourselves. So today we're a very event-driven architecture. This is at Propel, and all of our events are flowing through a Kinesis Data Firehose into Snowpipe, landing inside of Snowflake, but then we're using dbt, which is now running in a Fargate container, to do all the transformation and, I would say, decoration and enhancement of those events, and then that lands in another table inside of Snowflake. What we want to do is cut out having to run that additional container and that additional logic that we have to maintain, and see if we can transfer all of that into Snowflake and run it in there in the form of a dynamic table.

0:18:45 - Kashish Gupta
That's cool. Wow, that's going to be awesome.

0:18:48 - Tyler Wells
We're hoping that's going to cut down on the spend but also just cut down on the time for that data to then be available, because we also use that data back in the Propel product for a number of visualizations and in-product insights that are built into our console.

0:19:03 - Kashish Gupta
Yeah, this concept of, like, I mean it's an age-old concept. We've always thought about how we incrementally run our queries, but I think, yeah, I'm just quite excited to see how this pans out for the market. We've also had a lot of interest in just kind of writing back data to Snowflake as well. So a lot of folks will say, hey, the models I built in Hightouch are kind of this modeling layer that you've given me. Again, we're not a modeling layer. We very much still use dbt as the modeling layer for all of our customers if we can.

But when we think about the fact that people are getting insights on their customers in Hightouch, and sometimes they'll create traits on their customers, like, show me the total orders from this customer, they'll run the aggregation in Hightouch in the visual audience builder, and then they'll say, man, I wish I could write this back to my warehouse. And so we provide that, right? And so in some ways we're actually helping write back good data that is less chaotic to the Snowflake instance. And one thing I think about all the time is that that's actually the risk in becoming a martech vendor for us, because if we only support marketing, other folks in the company might want that table too. They might want the orders-over-time table per customer to use for other things, and so it's really important for at least data products to think about any insight that we gather: how can we write that back in a way that's now usable for the analytics team or the attrition team or the data science team?

0:20:28 - Tyler Wells
No, that's a great point there, because you're running this job that's providing value, and most of the time, just because it's providing value to, say, the marketing team, there's also a ton of value that could be gleaned by a bunch of other teams, maybe even by your customer as well. That's like you talked about, democratizing data. It's that access: how does everyone get access to it, but get access to it in a safe and secure manner. I like the idea of writing things back. Let's talk about some of your customers. You've got a ton of big logos out there that are using Hightouch. Any ones in particular that you have helped solve some really sort of hairy engineering problems, that have now just gotten much easier for them now that they've used Hightouch?

0:21:15 - Kashish Gupta
Yeah. So one thing, I mean, we see all the time: there's a lot of noise in the market that says buy this product or buy that product. Then there's all these enterprise evaluation decisions. I have to go through a 12-month decision process for what to buy. I like thinking about that a little bit less, but unfortunately that is what a lot of our customers face. It's not that they have an engineering problem to solve and then they can just go solve it and make things work within their business. Oftentimes it's going through much lengthier sales processes, having to talk to multiple different vendors, etc. One example I can share is Warner Music, which was evaluating a CDP for three years in a row. Fundamentally, the reason they didn't want to buy a CDP is because they felt: we have a CDP. We have Snowflake, we have our data in there. I've already done stitching of my user profiles in Snowflake. I know what my user data looks like. I even have Spotify data in Snowflake, so why would I buy a CDP and then send my data there?

Actually, over time, what we've realized by talking to tons of these bigger companies is that that's the reason why CDPs, like the general customer data platforms, didn't take off in the early days and even now have not really taken off. It's because folks are not excited about having two sources of truth, their data warehouse as well as their CDP. They're also not excited about giving all the data away to a CDP. That was the original reason why we started Hightouch. It was to make it possible to just use your data warehouse as your CDP, because you already have the data in there. That's what worked really well at both Warner Music and PetSmart. They both had a really good understanding of their customer and a pretty solid customer 360. I think the thing that these two teams did really well is they didn't wait for perfect data in order to get work done. They waited for data that was good enough, then shared it with the marketing team and at least drove the business forward. That's why you see them really building their e-com business faster than most brands, because they were able to get unblocked on the data side pretty fast.

The engineering problems we help them with are things like, in the early days, simply: I want to run hundreds of campaigns to my ads tools. I know exactly what they should look like, and we gave them a way to auto-generate these syncs. Just using YAML and JSON, you can very simply instruct Hightouch to create syncs to different ad tools. It could be, let's say you're Warner Music, you support 4,000 artists, so you have 4,000 different ad accounts with different ad budgets. You want to be able to auth all those different ad accounts and then send different data and different syncs of data to those ad accounts. So you need to be able to instruct Hightouch in a very programmatic way.
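As a rough picture of what "instructing a tool programmatically" can look like, the sketch below generates one sync definition per artist from a list and serializes it as JSON. The config keys, model name, and account ids are all hypothetical; this is not Hightouch's actual configuration format.

```python
import json

# Hypothetical input: one ad account per artist (a real list might have thousands of entries).
ARTISTS = [
    {"artist": "artist_a", "ad_account_id": "acct-001"},
    {"artist": "artist_b", "ad_account_id": "acct-002"},
]


def build_sync_config(entry):
    """Produce one declarative sync definition; every key here is invented for illustration."""
    return {
        "name": f"{entry['artist']}-ads-audience",
        "model": "fans_by_artist",
        "filter": {"artist": entry["artist"]},
        "destination": {"type": "ads", "account_id": entry["ad_account_id"]},
        "schedule": {"interval_minutes": 60},
    }


sync_configs = [build_sync_config(entry) for entry in ARTISTS]
print(json.dumps(sync_configs[0], indent=2))
```

Because the definitions are generated, adding the 4,001st artist is a new row in the input list, not a new hand-built sync.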

So we built a programmatic way for them to instruct Hightouch. We built the data pipelines that go to all these ad syncs, and then we gave the marketers a UI to be able to build customer segmentation. So it was many different problems. I think maybe the easiest way to describe this succinctly is that a really good data person wants to be spending their time building foundational models for their company, like what is a user, what is revenue, what is an order, and less time writing one-off queries for the marketing team. So we helped them set up this audience builder to give to their business team, which allows them to not worry about the one-off queries and then spend more time on the core work. That was really the biggest unlock in the early days, especially in 2021, where now a Warner is able to spend all their time doing, for example, identity resolution, stitching Spotify data, stitching Eventbrite data, all of these things in-house, and then let their marketers run queries on those things using Hightouch.
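As a rough illustration of the programmatic approach described above, the sketch below generates one sync definition per artist ad account and serializes the result as JSON. It is a minimal sketch under stated assumptions: the `SyncConfig` fields, the `build_syncs` helper, and the sample account IDs are hypothetical and are not Hightouch's actual configuration schema.

```python
import json
from dataclasses import dataclass, asdict


@dataclass
class SyncConfig:
    # All field names here are hypothetical, chosen only for illustration.
    name: str
    destination: str
    ad_account_id: str
    model: str
    schedule_minutes: int


def build_syncs(artists, destination="ads_tool", schedule_minutes=60):
    """Return one sync definition per artist ad account."""
    return [
        SyncConfig(
            name=f"{artist['slug']}-audience-sync",
            destination=destination,
            ad_account_id=artist["ad_account_id"],
            model=f"audiences.{artist['slug']}",
            schedule_minutes=schedule_minutes,
        )
        for artist in artists
    ]


# Two sample artists; a real list could hold thousands of entries.
artists = [
    {"slug": "artist-a", "ad_account_id": "act_001"},
    {"slug": "artist-b", "ad_account_id": "act_002"},
]

syncs = build_syncs(artists)
manifest = json.dumps([asdict(sync) for sync in syncs], indent=2)
print(manifest)
```

The point of generating the definitions from a list is that adding an artist becomes a data change rather than a hand-built sync.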

0:24:54 - Tyler Wells
So it's interesting to think: you went into some of these companies and they had their data in Snowflake or different places, and they felt they had built a CDP themselves, in a sense. When you go into a situation like that, you're maybe not doing a full replace, but you're saying, hey, Hightouch is going to give you better identity resolution, it's going to give you a better 360 view, it's going to give you all these things. But what I'm always curious about: when you come in, what was the amount of engineering resources and infrastructure that these folks were having to run to build their CDP? It had to be pretty extensive. It was something that probably took years to build. And you're coming in and you're saying, well, with Hightouch, we can do all this for you now.

0:25:46 - Kashish Gupta
Yeah, I have to be very honest: the folks that were building data pipelines were not building crazy infrastructure. They were using Airflow jobs, any sort of orchestration, like just hosting pipelines in AWS and hitting them once in a while, like Lambda functions. So it's quite interesting. You would expect that they did tons of engineering infrastructure work and, yes, that is true at the biggest companies. I can't name them by name, but they did build a true CDP in-house. They built an audience builder, segmentation, something very reliable and consistent. For those it usually took two years and an engineering team of at least 10 people. That's the standard, I would say.

But the majority of companies actually said, hey, you know what? We don't have those engineering resources, so we're not going to build it. We're just going to build pipelines, we're going to maintain them and update them every month because they're going to break, but we're just going to do the 80-20 of just getting the pipelines. And so that was most of the companies that we've worked with. We go in and we find that they have all these pipelines they don't want to maintain. APIs are updating downstream and they have to continue updating these pipelines.

A marketer might ask for a new column, and now they suddenly have to add more code to be able to sync that new column. And it was just a ton of ad hoc engineering work rather than a big platform that they built once and for all. And that's actually why a lot of the folks said, hey, we want to bring in someone to take this work off our plate, because it's not the building that's hard, it's the maintaining and the reliability. And what if it breaks? I have no idea that it broke until my marketer tells me, dude, my campaign's not running, what's happening here?

0:27:18 - Tyler Wells
So they essentially get it part of the way there and so it's, like you said, 80-20. So they may be 75, 80% of the way there. It's doing its job, but it's not quite fulfilling the promise of a 360, full identity resolution. And then Hightouch comes in and says, hey look, we can give this to you with the data you have, not necessarily requiring new engineering resources, no new PagerDuty schedules. Don't worry about the reliability, the resiliency. We've got all of that. And, by the way, now you're at full 360.

0:27:54 - Kashish Gupta
Exactly, yeah, and the full 360 piece is, again, not a promise that anyone should be making, and nor are we. So we just launched this product on Wednesday. It's called Customer 360 Toolkit, but that's really like a wrapper for the actual product that we're starting with, which is identity resolution, and that is something that will help people get closer to Customer 360. It's not nearly gonna get you the full Customer 360. But the whole idea is that you have user rows in your table for your mobile app, then you also have them in your table for your web app, then you have the anonymous users for your websites, and then you might have all the other things, like maybe you get data from partners for transaction data, you get data from your brick and mortar stores for transaction data, and so you might have like five or six different places that are like user tables, and you might even have tons more if you're unlucky, right.

So these are the kinds of problems most of our customers face, which is that they don't have a unified ID for what is a customer. I can't just say, unified ID number 126, show me all the data I have on this customer, and that's what identity resolution does. It stitches together all these different tables. We let you instruct us on which tables have which data and how you want them to join, and then we build a unified profile: across anonymous browsing history and logged-in browsing history, one unified profile of the customer with one unique ID. And that's what we build. We build that canonical model for our customers and we write it back to their warehouse, and that helps them get closer to customer 360.
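The stitching idea described above can be sketched with a small union-find over shared identifiers: any two records that share an identifier (a device ID, an email) end up under the same unified ID. This is a minimal in-memory sketch under stated assumptions, not how Hightouch implements it (the conversation says the real product runs inside the warehouse); the table names, identifier types, and the `resolve_identities` function are hypothetical.

```python
from collections import defaultdict


def resolve_identities(rows):
    """Group records that share any identifier under one unified ID.

    rows: list of (source_table, record_id, {identifier_type: value}).
    Returns {unified_id: [(source_table, record_id), ...]}.
    """
    parent = {}

    def find(node):
        # Union-find lookup with path halving.
        parent.setdefault(node, node)
        while parent[node] != node:
            parent[node] = parent[parent[node]]
            node = parent[node]
        return node

    def union(a, b):
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_a] = root_b

    # Link every record to each non-empty identifier it carries.
    for source, record_id, identifiers in rows:
        record_node = ("record", source, record_id)
        find(record_node)
        for id_type, value in identifiers.items():
            if value:
                union(record_node, ("id", id_type, value))

    # Assign unified IDs in order of first appearance.
    unified_ids = {}
    profiles = defaultdict(list)
    for source, record_id, _ in rows:
        root = find(("record", source, record_id))
        uid = unified_ids.setdefault(root, len(unified_ids) + 1)
        profiles[uid].append((source, record_id))
    return dict(profiles)


rows = [
    ("web_anonymous", "anon-9", {"device_id": "d-1"}),
    ("mobile_app", "m-17", {"device_id": "d-1", "email": "ana@example.com"}),
    ("store_transactions", "t-204", {"email": "ana@example.com"}),
    ("web_app", "w-3", {"email": "bo@example.com"}),
]

profiles = resolve_identities(rows)
print(profiles)
```

Here the anonymous web visit joins the mobile record through the shared device ID, and the store transaction joins through the shared email, so three source rows collapse into one profile while the unrelated web user stays separate.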

0:29:24 - Tyler Wells
And is this launched for all warehouses or specifically for Snowflake first?

0:29:28 - Kashish Gupta
Yes, it's actually launched for both Snowflake and Databricks, and we're working on rolling it out for the other warehouses as well, and it works natively in both those applications. So the thing that's really awesome is that there's no data that really has to flow to Hightouch. We can build you this canonical model in your warehouse without seeing any of that data, and we don't want to see that data. We don't want you to have to go through large infosec hurdles in order to be able to build your user model, and so we built it in a way that's really not for business users, but for data folks to build this kind of model on their warehouse without any sort of infrastructure concern.

0:30:07 - Tyler Wells
Nice. Now, did you get access to the Snowpark Container Services? Is that running in there now, or is it running more as a native app?

0:30:14 - Kashish Gupta
This one's more like a native app. We're not yet looking at containers, but we have seen some companies be super successful on it and, honestly, a huge shout out to the Snowflake team. When they launched the container services, we were super impressed. It's really fully like any sort of Kubernetes container; it just really seems like it should work. So we're gonna probably test it out at some point and just see what it looks like, just to know, but that was honestly a very impressive launch for Snowflake.

0:30:43 - Tyler Wells
Yeah, I mean definitely. From my perspective at the '23 Summit, that was the most impressive piece to me. They talked a ton about bringing apps to the data versus data to the apps. They're reversing that, just like y'all reversed the ETL, and they've got the two ways to do that. They're bringing it through native apps, but now they're actually making it, I think, even more robust by giving you that full container, so a company can come in and build their software directly into a container, like a Kubernetes cluster, ship that right alongside the existing customer's implementation, and not have to deal with all of the security implications. It's sort of like, great, maybe no more SOC 2. That would be really nice.

0:31:26 - Kashish Gupta
Yeah, exactly, and I think, to be honest, we've shared this vision with them for a very long time, since the very early days. One thing we saw happening goes exactly back to the schema-agnostic piece that I mentioned before. It's that you want your software to be built based on what your data looks like. You don't want your data to have to conform to the constraints of your software, and so that's why we're super pumped about data apps and this concept of bringing the app to the data. I do feel like the reason why there's so much extra and unnecessary work in SaaS is because of all of these integration problems, and over time we might actually see some of these integration problems go away.

So in the very early days we shared this vision with Snowflake. We said, hey, what if you didn't have to worry about how data gets into Salesforce or into any of your other SaaS tools, because Hightouch would just provide the abstraction layer for you, and we're very excited to continue being part of that vision with them. We don't necessarily have to download the data from Snowflake and get it into Salesforce over time. If Salesforce and other companies build a better app layer that integrates with the data, we can just be the translation, right. We can help transform that data, provide the N of over 300 example cases of how this data should be transformed, and we can become the logic layer instead. We don't actually need to be ETL specifically, right. So I think it's a net benefit for the whole market that we're having to worry less about where the data is and more just about how we are using it.

0:32:56 - Tyler Wells
Because you become the glue, right. So you're the glue between these pieces, and a large amount of what you do, especially with the new feature launch, identity resolution, is that data is writing right back into that customer's Snowflake. And so again, you talked about how you're not having to move that data around anymore. You're giving it all of the richness and transformations that are needed, and then it's right there available for them. And then now, I'm assuming, with dynamic tables it's going to be available, possibly even faster, and continuously updating, and so now you can start, you know, layering on sort of the Snowpark streams, and there's a whole bunch of, I think, exciting opportunities to just continue to make that better as those apps are getting closer and closer to the data and are able to do more with it.

0:33:39 - Kashish Gupta
I totally agree, and one thing we should think about too is that it's not just Snowflake. Snowflake will build this, and I'm sure other folks will have to build it too, and I'm sure, if they become big enough, that SaaS companies will also have to conform to these new standards. That's exciting. You need someone in the market to kind of push everyone forward. I feel like right now that's happening and we're seeing that day by day or month by month, and in like three or four years this will just seem obvious and people will be used to this kind of new architecture.

0:34:10 - Tyler Wells
Yeah, I think I'm focusing on Snowflake a little bit too much right now and fanboying too much, because we were just there yesterday filming our Powered by Snowflake video. I saw the Hightouch one not too long ago; you guys did one as well, which I was showing everybody inside the team today to go out and watch. And so I'm kind of, you know, still coming off that 2023 Summit excitement and seeing a lot of the opportunity, and then you actually go on site and start recording a video with them and it's just like, okay, this ecosystem is amazing. But I agree also, we can't ignore Databricks, you know, we can't ignore a bunch of the other warehouses out there. Yes, they've got 8,000 plus customers, but there's a lot of companies in the world. So we've got to move beyond that.

0:34:55 - Kashish Gupta
Look at some other things, yeah, and I think it just encourages innovation for everyone.

And I don't know how to say this without bragging, but I feel like we've sort of done that for the market as well, where tons of folks have now started building reverse ETL. So people ask us all the time, right, are you worried the SaaS vendors are going to build reverse ETL? Are you worried that your competitors are going to build reverse ETL? And before, there were maybe two companies that called themselves reverse ETL. Now there's over 10. And so for that reason, I feel like we've also really pushed the market to work in this way and we're proud of that.

I think you actually want folks to copy you because it shows the market that there was a good reason to copy. It actually tells the market and other technologists that there is a right way to do this and the whole rest of the vendors have kind of accepted the right way to do this. And so I hope that over time we even think of APIs not as ways to transfer data, but as just ways to instruct tools to do different things. Like, I might want to run my campaigns through the API. If I want to get the user data in there, I think of that more like ETL.

0:36:00 - Tyler Wells
So one thing I was curious about: when you started off with reverse ETL, customers are coming to you and trusting that the data is going to move where it's supposed to move. You know, the number of records; there's insight that has to be available for that data when those jobs are taking place. Tell me a little bit about what the early-day customer demands were for insights that you had to give back to them. Was that core to the initial product design? Was it something that came later? Because obviously I'm trusting you with my data. I'm wanting to move my data from my production Postgres over here into Snowflake. I'm going to have questions like how long did it take? How many rows got moved? So some sort of insights there. Can you tell me a little bit about that? I mean, obviously I come from the world of customer-facing analytics, so I'm very curious what your own journey was there at Hightouch.

0:36:55 - Kashish Gupta
Yeah. So we had a very strong perspective, literally from day one, that we didn't want to build a black box. One thing that we've struggled with in many other SaaS tools that we've used is that it's so difficult to debug what's going on, and you have to reach out to their support team to figure it out. To know just simple things, like how many of my rows were rejected versus accepted, you have to go talk to someone or you may never know, and that was really frustrating for us. I mean, in the early days, folks said, hey, I'm not going to use your product because I have all these things that I could build in-house. It'll give me more visibility and more certainty that the thing is working. And so we got that full list from them. There were like 20 different things and we just built all of them. We said, you know what, if that's what it takes, in six months I can build every single one of these and give you a perfectly in-house-like experience, but as SaaS. And so we were very strongly against being a black box, and all the things I described, like how many rows were synced, all this stuff was available in our UI. Later on in the company life cycle we started actually writing those metrics back to the warehouse so people can run their own queries on them.

But since day one we've always had it, and let me just tell you how deep we go. So we'll show you: here's how many rows were in the query. Here's how many are changed or added, that are net new since the last run. For everything that's changed or added we're going to run the sync. Now in the sync step you can see: here's how many rows we're trying to sync. Here's the HTTP request we're making. Here's the return from the SaaS API. Was it a 200 or not? And the exact JSON response from the API. So you can see, for every single row you send, what the diff was and what the SaaS API is returning.

You can see in aggregate which types of errors are coming up most often. So say you're getting an email-malformatted error most often in the sync. That means your email format is probably not constrained properly in your SQL query. You can probably fix that, and then we'll even show you: here are the rows that were rejected, they'll run on the next query run. That could be in five minutes, that could be next hour, but they'll run again, and if they get rejected again, they'll run again, so you can see literally the waterfall, exactly what's happening with your data. You'll get alerts in Slack or in PagerDuty if something gets messed up. And so, because we gave people that visibility, the question of "oh, I could build this in-house better" just completely vanished, and that's really why, in 2021, we were capturing the market faster than other folks. It's because of developer experience.
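The two observability steps described above, diffing the current query result against the last run and then aggregating destination errors, can be sketched as follows. This is a minimal sketch under stated assumptions: the row shapes, the `email_malformatted` error label, and the `diff_rows` and `summarize_responses` helpers are hypothetical and do not reflect Hightouch's internals.

```python
from collections import Counter


def diff_rows(previous, current):
    """Compare two query results keyed by primary key.

    Returns (added, changed): rows that are net new, and rows whose
    contents differ from the previous run.
    """
    added = {key: row for key, row in current.items() if key not in previous}
    changed = {
        key: row
        for key, row in current.items()
        if key in previous and previous[key] != row
    }
    return added, changed


def summarize_responses(responses):
    """Summarize per-row API results.

    responses: list of (primary_key, http_status, error_type_or_None).
    Returns (rejected_keys, error_counts); rejected rows are the ones
    that would be retried on the next run.
    """
    rejected = [key for key, status, _ in responses if not 200 <= status < 300]
    error_counts = Counter(
        error for _, status, error in responses if not 200 <= status < 300
    )
    return rejected, error_counts


previous = {1: {"email": "a@example.com"}, 2: {"email": "b@example.com"}}
current = {
    1: {"email": "a@example.com"},
    2: {"email": "b2@example.com"},
    3: {"email": "not-an-email"},
}

added, changed = diff_rows(previous, current)

# Simulated destination responses for the rows that were synced.
responses = [(2, 200, None), (3, 400, "email_malformatted")]
rejected, error_counts = summarize_responses(responses)

print(sorted(added), sorted(changed), rejected, error_counts.most_common(1))
```

Only rows 2 and 3 are sent, because row 1 is unchanged; the aggregate error count is what points back at the malformed email in the source query.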

0:39:17 - Tyler Wells
And it just gives them that trust.

It was a big thing that we saw during our time at Twilio when we built the Voice Insights product. Folks would build these, you know, real-time voice workloads on us in the form of call centers or chatbots or anything else like that, but they had to understand what was happening. They had to understand when things were failing or when things were going right or how long calls were taking. And without those insights, we had customers coming to us saying, hey, we're flying blind. We know that you are creating calls for us, but without us having the level of visibility we need, we don't feel comfortable expanding on the platform or bringing new workloads on here. And we found for us it was really about the retention and trust that we were then able to build and garner with those customers by giving them access to that data, so they could understand how well things were performing. They could understand why something broke. When it broke, they would know: was it something Twilio did, or was it that we had a network issue, or somebody closed the browser tab? There were all these things.

And once they had that visibility, which I'm sure is the same for you, they become more and more comfortable to continue to bring you workloads, because they know that Hightouch is gonna do the right thing, that Hightouch is going to expose the correct data to understand the performance of what they're trying to get done to solve their larger use case.

0:40:40 - Kashish Gupta
Yeah, exactly, and I think it's like what we were talking about in the beginning, like how long does it take to ship a destination? Folks want to know that if this is generally available, not in beta, I can trust this thing with my job, right, because we are in their production data pipelines that are oftentimes powering production use cases like ERPs or their email automation. If anything goes wrong and it's our fault, that would be disastrous and we would actually be quite upset with ourselves, right, and so that's why we have all these different fail-safes built into the system. For example, let's say an engineer ships some update to the system.

We have all these different fail-safes built in so that it doesn't affect any of the running syncs, so that even before a sync retries or the next sync runs, the new code only ships if it works properly. That's all these testing frameworks we have to make sure that people's syncs can't get broken and that they can't accidentally send an email twice, because we have seen tools in the past that sent an email twice really, really struggle to keep their customers, because that's just one of the worst things you can do. So, anyways, I think we've heard the horror stories enough that we really cared about consistency here. And because some of the brands we work with might be sending hundreds of millions of data points per day, sometimes billions per week, we just have to make sure that when we sell this trust, it's not just something on paper; it's got to be real and 100% backed up.

0:42:10 - Tyler Wells
No, I love that you mentioned horror stories. A question I always like to ask here: you've been around data a lot. You've obviously dealt with a ton of data. You've seen the good, the bad, the ugly, the chaotic. What about the biggest regret? Any big regrets around data that sort of stand out to you? It's like, ah, wish we hadn't done that, or wish maybe you hadn't done that, something along those lines that you can share with the listeners.

0:42:37 - Kashish Gupta
Interesting. Well, as a company, and again, I don't know if it's a regret, it's more like: we launched this identity resolution product this week, right, and we frankly should have launched it sooner. Many of our customers were telling us, hey, we need this thing. So it took us a while. I think generally our company does a great job of taking customer feedback and acting on it, but here we were like, huh, maybe we should stick to our principles of being a data activation company. We thought we should just be sending data from A to B, segmenting it, helping people understand it, but we were not going to build models for them. That's something that should be done in dbt, it's something that should be done in SQL, and we kept having this framework.

Part of the reason we had that perspective is because we didn't want to provide a cookie cutter solution and we didn't feel comfortable providing this kind of abstract solution that would work for any business, because we didn't know if it was possible.

We just genuinely did not know if it was even physically possible to make this.

So we waited, and we didn't do the discovery for quite a while, until the requests really added up. Then we did the discovery and we said, you know what? I think we can build this. It's gonna be tough, but we're gonna build it in an abstract way that's gonna work for all types of businesses. We shipped it and now we feel great about it, but the overwhelmingly positive customer feedback that we've gotten after shipping it tells us that we're late and we should have done it sooner, and that we shouldn't really have shied away from doing transformations for companies, because we're not providing an abstract transformation product. We're just providing a transformation product that does one specific use case that almost every single business in the world has to do at some point in their life cycle. So we're helping reduce redundant work in a more verticalized way. And I guess the regret is just that we should have done that sooner, because folks were telling us this is something important for them.

0:44:16 - Tyler Wells
Yeah, I mean, that's always a nice signal, especially when they're telling you, hey, I'm already doing that dbt stuff, but no, we want to do it with you and we're actually willing to pay you for it. You know, that's an amazing signal. Now you say, okay, sure, you're maybe a little bit late, but now you've got that in the marketplace. You launched it on Wednesday and it sounds like you've already gotten, as you just said, a lot of positive feedback, with some of your biggest customers now saying, hey, let's talk about this, show me a little bit more. I'm sure you're probably doing the roadshow now, right, demonstrating the power of what can be done. What's next here? What's got you excited now that you've launched it, aside from customer uptake?

0:44:54 - Kashish Gupta
Yeah, I mean, what excites us really, honestly, is learning from all their use cases, getting deep in their data models and just understanding how crazy it is and how complex it is, and then continuing to iterate and improve the product because, again, typical identity resolution was something that was really just users and their known or unknown events.

That's basically it.

The kinds of things that we will see, and that customers will show us because they feel comfortable showing us, are going to be vastly more complex, and that's just going to teach us so much more about their data models and what problems we might want to solve for them in the future.

So I think we just have really enjoyed this kind of open relationship with our customers, where our offering is flexible enough that they bring us their hardest problems. We see those hardest problems and we help solve them, and then the product gets better and better over time. We're starting to meet some of the largest companies now, and in the past I didn't feel comfortable telling them we could solve their problems, because it would be outlandish to think that you could solve such a complex problem. But now we have a lot more confidence in that, and I think just going into those rooms and having confidence that we can actually help here, that we would not be making things up, is inspiring, and it makes me feel like we can really help folks. So I think that's hopefully inspiring to the rest of the team as well.

0:46:07 - Tyler Wells
Absolutely. I love the progression of the company from reverse ETL to the realization that your customers want this help, they need this help with identity resolution, and the 220 connectors. I mean, y'all have built a ton of amazing tech, and I continue to be a fan and look forward to seeing what Hightouch does next. Really appreciate you taking the time today to sit down with me. I know you've got to be incredibly busy after, one, closing the round, even though that was probably two months ago, and then, two, launching an amazing product like identity resolution and now starting to have those conversations with customers. So for you to take the time and sit down with me is incredible. I appreciate the conversation. I think it's been a lot of fun, I've learned a lot and I hope we get to do it again someday.

0:46:53 - Kashish Gupta
Yeah, thanks, Tyler, thank you for having me and, honestly, I just want to give a quick shout out to our engineering team, because for these kinds of things to happen all in parallel, it's because of how good they are and how willing they are to, again, tackle those complex problems rather than just greedily optimizing interim solutions. And it's a very weird thing to say, because folks will say, oh, Hightouch is shipping a new product every month, and so people would assume that they're kind of just inch-deep type products. But I really just feel like this is the first engineering team I've seen that will think about the long term, really condense that down into a very quick roadmap and ship that immediately. If any of the audience wants to meet our CTO co-founder and just kind of learn how he's built that kind of culture, I definitely encourage y'all to reach out, because we love talking about this kind of stuff, and we are hiring senior back-end engineers if anyone's interested in joining.

0:47:49 - Tyler Wells
I mean, engineering culture is so important for a company. It's so important to have that great leadership in there, and it definitely sounds like you have it. That's exciting, and I definitely encourage listeners to reach out, talk to the CTO, and obviously they're hiring. So have a look at their website. Where should they go to look at all your jobs that are available?

0:48:09 - Kashish Gupta
Yeah, it's just hightouch.com/careers.

0:48:12 - Tyler Wells
Easy enough. Kashish, I appreciate it very much. Look forward to possibly doing this again someday.

0:48:18 - Kashish Gupta
Thanks, Tyler, see you soon.
