In this Fireside Chat from Annual 2018, Anurag Gupta (VP, Amazon Web Services) and Paul Hsiao (General Partner, Canvas Ventures) discuss the current proliferation and importance of customer data, trust issues that come with obtaining that data, and what businesses can do to be more thoughtful around data collection in general.
Announcer: Please welcome Paul Hsiao, Partner at Canvas Ventures, and Anurag Gupta, VP at Amazon Web Services.
Paul Hsiao: Hey, guys. How’s everyone doing today? Thanks for having us. I’ve always wanted to be a talk show host. I can’t see you guys all, but thanks, Anurag, for giving us the opportunity.
My name’s Paul Hsiao, General Partner at Canvas. I joined the venture business in 2003. The first two investments I worked on were Salesforce and Tableau. It’s really come a long way. I sit on a few SaaS boards today.
So privileged and excited to have Anurag here. For those of you who don’t know him, he joined AWS in 2011. In a short span of six years, these guys have built a multi-billion ARR business.
We’ll get into how they did it. It’s really exciting to have you here.
Anurag Gupta: Thanks.
Paul: How did you get to AWS?
Anurag: I was at a startup, which we sold to another company — you’re on the board, of course — I spent a couple of years there, to get the handcuffs off.
At that point, I couldn’t do another startup for personal reasons, so it was a question of where to go next. It’s easy to look back and figure out which company you would have loved to have joined 10 years ago.
The question for me was, 10 years from now, what’s the company I would wish I had joined today? Amazon and AWS really were at the top of that list.
Paul: Bezos talks about regret minimization. It sounds like you went through a very similar thought process there, too. What led you to AWS?
Anurag: I’m a systems guy. If you look at systems, the interesting points are the transitions, like mainframe to minicomputer, minicomputer to PC, PC to Internet, Internet to cloud and mobile.
When you look at those transitions, basically the whole stack has an opportunity to change. I saw the opportunity to change the stack.
Paul: It’s been astonishing, the depth of the innovation that you guys have introduced, and the pace of product releases as well. More importantly, I’m surprised by the business model. You guys have turned a lot of non-consumption clients into mainstream customers.
The question is about, how did you guys approach this? What was the insight there that, from such a short period of time, you could actually come into the business and build such an off-ramp?
Anurag: You’re familiar with our working backwards process? Maybe I can fill you guys in a little bit.
What we do is, before we write a line of code, we work on the press release and FAQ — Frequently Asked Questions — document. Basically, there are two problems when you’re building something, what people need and whether it’s possible. Whether it’s possible takes a lot of time to figure out.
What people need is actually a simpler problem. Particularly for engineers, it’s very easy to get wrapped up in the technology. That’s very different from figuring out what people want.
For these big services, we like to go into large spaces, because you’re not going to get 90 percent of some small space. You want to do well enough so, even if you mess up, you’re going to do pretty well, and it’s a meaningful business.
Then, you know that you’re going to go into a space where you’re going to be feature-poor compared to what people are used to, at least when you launch. That’s the general nature of a minimum viable product.
You have to look for a space of non-consumption. One of the things when we started Redshift was this notion that, “Enterprise storage, the growth rate is about four times the pace of data warehousing. Why is that?”
You then get into figuring out, with your customers, why they’re storing data that they’re not analyzing. Now, that’s a strange thing, right?
What we figured out is that it wasn’t simple enough. It was too expensive and was too slow to put more data into these systems. That leads us to a somewhat different model than what they have today.
Paul: Gosh, congrats on the success. If I track you for the last six years, it’s more than 12 products, from Redshift to EMR to Elasticsearch and now, most recently, Neptune. I think about how we sit on boards where we dream about building 100 million or 150 million in ARR in six years.
It’s incredible that you guys have built multi-billion ARR in the same time frame — basically from scratch — with a thousand reach, growth rate from…
Anurag: We don’t break out anything below the AWS level, but we’re pretty happy with where we are.
Paul: I worship these guys, if it’s not obvious to you all. What surprised you?
Anurag: What surprises me is that when I first joined AWS, it was a much smaller team than it is today, obviously, but the people in leadership roles are much the same. The ability to scale without going outside has been really impressive to see.
Paul: How is that? I sit on boards of some of these SaaS-enabled marketplaces. They are really powered by the services that you provide, from QuickSight and others.
I think about Transfix. It’s incredible what you have done, and yet you held the headcount the same. I’m just curious what it is about AWS that enabled you to do so?
Anurag: We think about scaling all the time. The key thing there for us is really working through mechanisms.
Jeff has this famous statement that good intentions don’t work. People don’t come into the office expecting to mess up. If they do, you’ve got bigger problems.
The question is really how do you create mechanisms so that you can make things that are scalable? Examples of that would be, you set a goal. Maybe it’s a revenue target.
You’re going to have trackable output metrics that drive that. You might have input metrics that drive those output metrics. Maybe you recurse a couple of times to say, “OK, treat that input metric as an output metric. What drives that one?”
That lets you watch the metrics. Then you can focus on the things that are doing really well, so you can double down on them, and on the things that are doing really poorly, so you can take your attention, which is a fixed resource, and apply it there.
The things that are in the middle, that’s fine. You don’t have to pay that much…
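The mechanism described above (a goal, output metrics, and input metrics that are themselves treated as outputs) can be sketched as a small tree walk. This is only an illustration, not AWS's actual tooling: the `Metric` class, the `attainment` ratio, the 0.8 and 1.2 thresholds, and all the numbers are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Metric:
    """One node in a goal tree: an output metric and the input metrics that drive it."""
    name: str
    attainment: float  # actual / target, so 1.0 means exactly on target (assumed convention)
    inputs: list["Metric"] = field(default_factory=list)


def flatten(metric: Metric) -> list[Metric]:
    """Recurse through the tree, treating each input metric as an output in its own right."""
    found = [metric]
    for child in metric.inputs:
        found.extend(flatten(child))
    return found


def needs_attention(root: Metric, low: float = 0.8, high: float = 1.2) -> dict[str, list[str]]:
    """Split metrics into those to double down on and those to fix; the middle is ignored."""
    metrics = flatten(root)
    return {
        "double_down": [m.name for m in metrics if m.attainment >= high],
        "fix": [m.name for m in metrics if m.attainment <= low],
    }


# Hypothetical numbers, purely for illustration.
revenue = Metric("revenue", 0.95, [
    Metric("new_signups", 1.30, [Metric("trial_starts", 1.25)]),
    Metric("retention", 0.70, [Metric("support_response_time", 0.60)]),
    Metric("average_order_size", 1.00),
])

report = needs_attention(revenue)
print(report)
```

Metrics that land between the two thresholds are deliberately left out of the report, which mirrors the point that attention is a fixed resource.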
Paul: Is there an example of a service that you guys have applied that methodology to?
Anurag: Really, all of them. It’s across the board.
Paul: Across the board, from Aurora to Athena, you guys have done similar things with it?
Paul: Got it. Switching topics a little bit, you have an amazing vantage point on the transformation that is happening in the stack that many companies here build on. Just curious about what you see in the transformation right now.
Anurag: I’d look at the growth of data. I think that’s been a really big deal.
My estimate is data grows tenfold every five years. That’s basically three orders of magnitude every 15 years. If you have a petabyte now, you’re going to have an exabyte in 15 years. If you have a terabyte now, you’re going to have a petabyte in 15 years. That’s a lot.
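The arithmetic behind that estimate can be checked in a few lines. This is a minimal sketch that takes the tenfold-every-five-years figure as given; the function name and the decimal unit constants are assumptions made for the example.

```python
def growth_factor(years: float, factor: float = 10.0, period_years: float = 5.0) -> float:
    """Multiplier on data volume after `years`, assuming `factor`-fold growth per period."""
    return factor ** (years / period_years)


# Decimal storage units, in bytes.
TB = 10 ** 12
PB = 10 ** 15
EB = 10 ** 18

fifteen_year_factor = growth_factor(15)   # three five-year periods
annual_factor = growth_factor(1)          # implied year-over-year multiplier

print(f"15-year growth: {fifteen_year_factor:,.0f}x")
print(f"implied annual growth: {annual_factor:.2f}x per year")
print(f"1 TB today is about {TB * fifteen_year_factor / PB:.0f} PB in 15 years")
print(f"1 PB today is about {PB * fifteen_year_factor / EB:.0f} EB in 15 years")
```

Tenfold over five years works out to a bit under 60 percent growth per year, compounding to a thousandfold over 15 years.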
That actually changes how you approach things. It used to be that you built these monolithic systems. Because of the growth of data alone, we’ve had to go to distributed systems, things like Hadoop, etc.
That’s also, in the future, going to drive a move towards things like serverless, going to multi-tenant systems, and pay for what you use, in very small request bursts. We could talk a little bit about that.
Paul: Totally. The notion that data — that 10x every five years — I’d encourage you to name that Gupta’s Data Law.
Paul: On the venture side, we’ve followed Moore’s Law for many years. That’s been the defining technology curve. What’s interesting is that, with the data increasing 10x, there’s a lot one could do.
How do you drive the innovation based on the data that we’re all collecting?
Anurag: Data’s like potential energy. It doesn’t do you any good if you just store it away somewhere.
It used to be that voice-recognition systems were garbage. If you ever had a wrist that was hurt and you tried to use one of those systems, every word would have a mistake in it. Now, suddenly they all work.
Why is that? It’s really two factors. One is that we have enough data to provide a meaningful corpus as a training set. The other is that we have inexpensive compute power that we can use to run the training models on and make them better, and better, and better.
I think that’s generally true across a lot of AI. AI’s been a pretty big change in the industry. We’re still in the hype cycle on it today, but it’s going to be transformative.
The other thing that I’ve seen is that people’s expectations change. You start off, and you’re saying to yourself, “Wow, isn’t it amazing? I can talk to my phone and have it do something.”
Two weeks later, you’re like, “How ridiculous is it that I’m on this voice conference, and I have to tap things in while I’m sitting here trying to drive my car, rather than just being able to tell it who I’m trying to talk to.”
Paul: That intelligence, how you build into the software stack, makes a difference.
Anurag: That’s right.
Paul: As we talk about the data increasing so much, what are these data? Are these business data? What are you seeing? What are we collecting here?
Anurag: At the highest level, the Internet itself is a video-serving platform with a small amount of data going on it. Just focusing on the data, there’s this…
We used to think about data in terms of things like business transactions. That’s a tiny fraction of it now. It’s all machine-to-machine now, or it’s Internet-scale consumer, and so forth.
The real question still is, what do you do with it? I can give you an example. Let’s think about, if I were building an expense reporting app five years ago, what would I do? I’d obviously make it mobile first. I’d do gestures. Maybe nowadays I’d do voice. I’d focus on productivity, and so forth.
That’s all the basic meat and potatoes, if you will, of building something. It’s not particularly defensible, because anybody else can do the same thing, particularly when they have my app to go and copy.
Paul: Software development stack is so easy now, everyone could have these mobile…
Anurag: That’s right. What Jeff would say is, “OK, you’ve built a castle. Now, the barbarians are coming to attack your castle. You know what you need? You need a moat around your castle.”
The question is, what’s your moat? Data presents a moat, because there’s a flywheel for it. The more data you have, the more value you can provide from it, and then the more data you get. Those are positive things.
An example of that, going back to expense reporting: if you think about it, expense reporting could be viewed as a productivity tool, but it’s something more than that. It’s actually a measurement of transactions or interactions between producers and consumers, much like Amazon.com is.
If I’m a consumer, wouldn’t it be great, when I go to Kansas City, to find out which restaurants people like me use? Very similar to finding a book.
If I’m a producer, where do I stack-rank relative to other people, and what could I be doing to make it better?
If I’m a company trying to provide this app, how do I do automated acceptance of trivial transactions to prevent people from dealing with them? Or understand fraud, where people are doing something where I can collect that?
What’s great about that domain in SaaS is that you don’t have data for one company, you have data for a lot of companies. There’s a positive cycle there, that the more data I have, the more useful it is, the more data I have.
Yeah, it’s a real shift away from process.
Paul: It is. Having watched this industry for — gosh, how many? — 20, 30 years…There was the semiconductor, that sort of innovation drove a lot of…
Then, we put the ERP CRM on the last, what? SaaS is coming up 20 years old now. We’ve gone to the cloud transformation. We’ve gone to the mobile transformation.
It’s harder to build that moat on SaaS. We see the white spaces taken up. The pitches that come into our shops are…
You target a particular vertical industry, like farming, or whatever the vertical industry might be, or you target a market segment, like SMB or mid-market, at the start.
We think about how you make that SaaS more intelligent. How do you build something that self-learns? That training data is something we think a lot about. Can you actually incorporate the proprietary data into that SaaS stack?
Paul: If you look at the capabilities that Athena and all these have provided, it feels like everyone needs a data lake strategy. What is a data lake?
Anurag: A data lake is, basically, once you have a lot of data, you need to put it into a central location. In AWS that might be S3, which is our central repository. I think of S3 as Chicago O’Hare. Everything can get into it, everything can get out of it.
You want that data to be in some standardized open format, so that the tools that you use externally or the tools that we provide, all of it can access it. You can’t afford to have your data be trapped, because it’s way too expensive to move it around, change it, transform it, or whatever.
Then you need a portfolio of different data analytics capabilities around it. In our case, we have serverless SQL in Athena. We have Hadoop in EMR. We have data warehousing in Redshift, and so on, so on, so on.
The real goal for us is to be able to provide something, so you just throw your data in, and the system will go and process it, understand it, and so forth. We’re not 100 percent of the way there, but it matters.
You need to secure that data lake, of course. Then, around the basic core of data analytics, you need some form of AI, so you can really gain incremental value off of it.
Paul: One of the things that we all struggle with, and I think entrepreneurs do as well, is the difficulty of finding data scientists, including data engineers. I’m just curious about what you’ve seen and your recommendation on that as well.
Anurag: I see a lot of people on LinkedIn who are suddenly data scientists.
Anurag: In some sense, the question is what you need. I don’t know that it’s about data science and how many PhDs you can stack from CMU or someplace like that. It’s really about figuring out the value you’re trying to give to your customer.
It’s very much back to that working backwards process, of going and figuring out what am I trying to do? If I’m trying to build a recommendation engine, then there are particular things I need. I need someone who knows recommendation engines.
The great thing is that most of the algorithms are well-known, published, and available as open source. A lot of us in the infrastructure industry are providing our technology stacks as open source.
The pace of innovation is really significant. What people really need, right now, is advice on what to do. Once they have a measure of advice, the building part of it isn’t dramatically different. I don’t know that you need in-house advice, necessarily. Depending on the size of your company, of course.
Paul: One thing that we see is, because the capabilities coming from you are so…Even though you’re humble to say, “Not a full stack,” it’s really very complete. If someone wants to actually have the data lake, they can come to you. If they want to integrate the data into the SaaS, they can do that as well.
What’s interesting is, then the moat — going back to the moat conversation — is what’s the proprietary dataset that the individual customers have that can fit into the machine-learning model, that will give them an edge on whatever they can do for their customer?
Anurag: That’s actually a really important point. A lot of people, when they set up their initial contracts with their customers, they don’t include what rights that they have to the customer’s data. That’s an important question.
We all end up on these systems, whether they’re consumer or business, where every so often, the EULA changes. You just have to hit accept after paging through eight pages before you can do anything.
A large part of that is because people need better access to the data, not because they’re interested in invading your privacy, just because they want to share it, aggregate it, figure out how to provide a better experience.
The degree to which a company can set those things up easily — quickly, and so forth — removes those internal barriers. The value really is in aggregating a multitude of people together.
Paul: We used to think about what’s the IP rights to software? What is the IP stack?
Nowadays, we ask a lot of questions to the companies that we fund. What’s your data rights? When your customers give you data, or when you’re collecting data from different sources, do you have rights to that?
We’ve been surprised how some of the large SaaS companies — as much as the talent-based machine-learning talent that they have applies to it — they actually can look into their own customers’ data to actually apply intelligent…
Anurag: You have to be very careful about it.
Paul: It’s the privacy things…
Anurag: It’s a trust thing. You can’t afford to break customer trust ever.
If you’re a customer-centric company, like we aspire to be at Amazon, you have to be very thoughtful about what you’re doing and why you’re doing it, and that you’re doing it in such a way that it really benefits the person who’s providing you that information, not doing it in any sort of negative way.
Paul: Totally. Shifting a little bit, we’re seeing that so much is global in nature now. Just curious, what are you seeing from that perspective?
A lot of customers want to have these multi-master datasets everywhere. Historically, that’s been a really difficult thing to do. I’m just curious about what you guys are doing on that front?
Anurag: We’re working on it. Projects are underway. We announced them at the most recent re:Invent, but it is a challenging problem.
People have an expectation for low-latency communication. If I’m playing Angry Birds in San Francisco and I go to London, I expect to still be able to start that game from where I left it off. What’s true in consumer is equally true in enterprise.
It’s a funny thing that enterprise tends to trail consumer. Think about messaging apps, and the cost of them on the enterprise side versus the value you get just by using, say, Facebook, and the pace of innovation on the consumer, and the simplicity of the experience, and the fact that it’s free — for me as a consumer, at least.
Paul: When you think about your product, and people are thinking about…We always talk about having product strategy. Coming back to the data rights a little bit — even that global in nature — how do you think about data strategy?
What’s the tactical advice that you give to people when they…Is it just even having that conversation on data strategy? When you meet your customers, what is it that you want them to think about?
Anurag: That’s a really interesting question. I, frankly, don’t know. I’m not a deeply strategic guy.
My basic notion is, “Do what my customers want and aggressively go and keep iterating.” That way, I don’t have to think forward five years. I can think forward one scrum. Maybe a little bit longer than that, if it comes to the question of something that’s just fundamentally hard to do. It’s really just a question of understanding what your customers want from that side.
From a data rights side, it’s a question of understanding what one might want to do, and making sure that you handle the privacy considerations, particularly with the European Union, GDPR and all of that, which is not just an EU issue. You want to be able to do that for any customer, anywhere.
Paul: Maybe another way to come at the question is, if you think about the capabilities — the 12, 15 products you have — what is it that people…?
We’ve gone from batch to real-time. We’ve gone global in nature, in the depth of the pipeline and data lake. What is it that people don’t yet appreciate about how they should use these products, that would turn them into a business advantage?
Anurag: What’s increasingly important is bringing simplicity for people. I announced a bunch of stuff at the past re:Invent — AWS’s user conference — in November.
I was pretty surprised that the thing that most attracted people in the data space was serverless. This notion that I’m just going to send requests, and the servers spin up, spin down, resize, whatever. Why would someone care about that? You can build that kind of thing on your own, if you had to.
The benefit is that we are increasingly in a world…10 years ago, we were in a world where you had nine back-end engineers to every front-end engineer. That’s flipped over. Now you’ve got one back-end engineer, two or three ops engineers, and the rest of them are front-end engineers.
Which is awesome if you’re an apps person, because the front end is where you’re providing value.
Paul: You’ve taken the complexity out already on the back end.
Anurag: That’s right. The degree to which we can provide that simplification matters a great deal. It’s not just about AWS. There’s so many API companies that are out there now.
Previously, if I were creating an application, I’d have to burn four engineers just on how to do payments. Now I do an API call, and that’s amazing.
Paul: We like to think of it as the API application network. How do you build before, compared with how they tied everything together.
Anurag: I think the next interesting thing is going to be how the SaaS companies themselves become API companies. How do I integrate my expense reporting thing directly into somebody else’s application, so it’s largely invisible?
We think that having the screen time in front of the customer is valuable, but it’s really the data that’s valuable. Attention’s a scarce resource, and there are only so many applications I’m willing to learn.
We all probably installed 50 applications on our iPhones — or whatever — when we first got them. Now, we’re using four or five.
Paul: Coming back to how you add intelligence back into the software stack, the transformation we’re talking about, going from SaaS to intelligent SaaS, and how you build that data moat in there: are there good practices that you’re seeing people follow right now?
Anurag: One thing is that the world moves to SaaS.
Anurag: It maybe arguably has already, but there’s an awful lot that’s still sitting in enterprise applications, sitting on premise. People don’t want to work and be…
Those applications basically treat you like a clerk. You have to spend a month learning how to do order entry. I don’t want to spend a month learning how to use order entry, or doing time off, or any of the things that I end up having to do.
I really want small apps that are single-purpose. I want them to have a trivial interface. I want them to be gesture- or voice-oriented to the degree possible. I want them not to bother me if they can do the thing themselves.
That all turns into AI on the back end and also automated integrations into APIs between applications, so that they’re not asking me to carry a piece of data from one app to another, almost like the iPhone used to be.
It was transformative to say, “Oh, I just read an email, or I press on something that it recognized as a phone number, and I’m placing a call.”
That doesn’t seem to exist in applications today, that deeper integration. In part, because I think everyone wants to own the screen.
Paul: They do. It creates a really interesting opportunity, even in an enterprise app space.
Anurag: Particularly if you try to monetize the API call. That’s where serverless comes in. If you can go to something where I’m not charging you on a per-month basis, or on a per-year subscription basis, I’m charging you for what you use. That’s been an interesting move.
A large part of the innovation in the cloud was this move from TCO, provision to peak, to pay by the hour. If you look at something like Lambda, which is our serverless compute offering in AWS, it’s pay-per-request in hundred-millisecond chunks.
If you think about a hundred milliseconds, that has the same ratio to an hour as an hour has to three years, roughly. That’s the model.
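A quick check of that comparison, as a sketch: it assumes 365-day years and uses variable names invented for the example.

```python
MS_PER_HOUR = 60 * 60 * 1000      # milliseconds in one hour
HOURS_PER_YEAR = 365 * 24         # ignoring leap days

# How many 100 ms billing chunks fit in an hour.
chunks_per_hour = MS_PER_HOUR / 100

# How many hours fit in three years.
hours_in_three_years = 3 * HOURS_PER_YEAR

print(f"100 ms chunks per hour: {chunks_per_hour:,.0f}")
print(f"hours in three years:   {hours_in_three_years:,}")
print(f"ratio of the two:       {chunks_per_hour / hours_in_three_years:.2f}")
```

The two figures are 36,000 and 26,280, so they are within about 40 percent of each other, which is consistent with the "roughly" in the statement: the step from hourly billing to 100 ms billing is about as large as the step from a three-year commitment to hourly billing.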
We talked a little bit about non-consumption as a place to go. The other interesting thing to do, from a business model perspective, is over-consumption. People paying for something they’re not using.
Anurag: It’s just useless. For example, if I had an ETL tool and I’d paid for it, and I’m running my ETL jobs 8 hours a day, there’s 16 hours where it’s sitting idle. Why am I paying for it?
SaaS vendors have the opportunity to interleave usage. Yeah, you have to provision to peak just as I have to provision to peak, but my customer doesn’t and your customer doesn’t, which is awesome.
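The idle-capacity and interleaving argument can be illustrated with a toy calculation. Everything here is hypothetical: the three customers, their non-overlapping 8-hour windows, and the assumption that one unit of capacity runs one ETL job at a time.

```python
def utilization(busy_hours_per_day: float) -> float:
    """Fraction of a 24-hour day in which a dedicated tool is actually working."""
    return busy_hours_per_day / 24


def peak_concurrency(windows: list[tuple[int, int]]) -> int:
    """Largest number of jobs running in the same hour.

    Each window is (start_hour, end_hour) within one day, end exclusive.
    """
    load = [0] * 24
    for start, end in windows:
        for hour in range(start, end):
            load[hour] += 1
    return max(load)


# Hypothetical customers in different time zones, each running ETL for 8 hours.
customer_windows = [(0, 8), (8, 16), (16, 24)]

dedicated_capacity = len(customer_windows)            # one tool per customer
shared_capacity = peak_concurrency(customer_windows)  # vendor provisions to the combined peak

print(f"single-customer utilization: {utilization(8):.0%}")
print(f"dedicated units: {dedicated_capacity}, shared units: {shared_capacity}")
```

In this idealized case the vendor serves three customers with the capacity one of them would otherwise have bought. Real workloads overlap, so the saving is smaller, but the vendor still only provisions to the combined peak rather than the sum of individual peaks.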
The challenge becomes how to make money at those price points, but that’s all about volume. One of the things we talk about a lot in Amazon is how to run high-volume, low-margin businesses and to really pay attention to the underlying efficiencies.
That also builds you a moat, because the larger you are and the more you can take out of the efficiencies, that only comes with scale. That’s another positive advantage, just as data is.
Paul: You guys have brought the pricing to a level that’s so much lower than everything else, and made it on-demand. It created volume…
Anurag: It’s based, in some sense, on the intrinsic sense of optimism, that the demand is there if I can find a price point and a product that people can afford.
Paul: Cool. Thank you so much for coming here and sharing your thoughts. A round of applause for Anurag here.