Building Notion AI: Lessons Learned and Myths Busted with Simon Last, Notion Co-Founder and CTO

Join us for an insightful conversation with Simon Last, co-founder of Notion, as we explore the groundbreaking AI integrations transforming knowledge management. Discover how Notion is harnessing AI to elevate productivity and creativity in the digital workspace.

Episode Description

In this episode of the AI Native Dev podcast, host Guy Podjarny sits down with Simon Last, co-founder of Notion, to discuss the company's innovative journey in AI integration. Simon shares the pioneering efforts that have positioned Notion as a leader in AI-driven productivity tools. From the early adoption of GPT-4 to the development of the AI Writing Assistant, AI Autofill, and the Q&A feature, he discusses the vision, challenges, and future of AI in transforming digital collaboration. Discover how Notion's agile "tiger team" and rigorous evaluation processes drive continuous improvement in AI capabilities, and learn from Simon's insights into fine-tuning models, building trust in AI, and a future vision of automating tedious tasks so people can focus on strategic, high-level activities.

Chapters

1. [00:00:00] Introduction to the Podcast and Guest
2. [00:01:00] The Birth of Notion AI
3. [00:04:00] Key AI Products Introduced by Notion
4. [00:07:00] Organizational Structure and Team Dynamics
5. [00:09:00] Evaluation and Testing of AI Capabilities
6. [00:13:00] The Role of Fine-Tuning and Model Selection
7. [00:16:00] Building Trust in AI
8. [00:21:00] The Future Vision for Notion AI
9. [00:41:00] Discussion on AI in Software Development
10. [00:51:00] Conclusion and Closing Remarks

The Birth of Notion AI

The journey of Notion AI began in October 2022 when the team gained early access to GPT-4. Simon Last described this as a turning point, stating, "Playing with GPT-4 was the trigger for me. Oh my God, this thing is actually really useful now." This realization led to the immediate development of Notion's first AI product, an AI writing assistant launched in February 2023. The assistant can write, edit, and insert text, offering various pre-packaged prompts for improving writing.

This initial foray into AI was not just about incorporating a trendy technology but about fundamentally transforming how users interact with digital content. The AI writing assistant was designed to be intuitive, allowing users to seamlessly integrate it into their existing workflows. This integration was made possible by understanding the core needs of users—efficiency, accuracy, and ease of use. Simon Last emphasized that the AI's role was to complement human creativity, not replace it, by taking over repetitive tasks and allowing users to focus on more strategic activities.

Key AI Products Introduced by Notion

Notion introduced three main AI products: AI Writing Assistant, AI Autofill, and Q&A. The AI Writing Assistant allows users to write and edit text with ease, using custom prompts. Simon Last explained, "There were prepackaged actions and then also you could just type whatever you wanted." The AI Autofill feature lets users fill out database columns with AI-generated content from a user-written prompt, useful for summarizing or translating content. The Q&A product, launched in November 2023, indexes all of a workspace's Notion content and uses embeddings to power a chatbot that answers users' questions. Simon Last noted, "We built an embedding index over all of Notion, and then you could ask questions and it's a chat bot."

Each of these products serves a distinct but complementary purpose in the Notion ecosystem. The AI Writing Assistant boosts productivity by streamlining content creation, while AI Autofill automates data entry processes, reducing manual workload. The Q&A feature leverages advanced natural language processing to provide accurate and contextually relevant answers, making information retrieval faster and more intuitive. Together, these products exemplify Notion's holistic approach to integrating AI across its platform, enhancing its utility and user experience.
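The retrieval mechanics behind Q&A can be made concrete with a short sketch. The code below is illustrative only, not Notion's implementation: embed() is a stand-in for whatever embedding API is used, and cosine-similarity ranking over an in-memory index is one common choice among several.

```typescript
// Minimal sketch of embedding-based retrieval for a Q&A feature.
// embed() is a hypothetical placeholder, not a real SDK call.

type Doc = { id: string; text: string; embedding: number[] };

declare function embed(text: string): Promise<number[]>; // assumed embedding API

function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0, normA = 0, normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Rank indexed documents by similarity to the question and take the top k;
// the winners become grounding context for the chat model's answer.
async function retrieve(question: string, index: Doc[], k = 5): Promise<Doc[]> {
  const queryEmbedding = await embed(question);
  return index
    .map((doc) => ({ doc, score: cosineSimilarity(queryEmbedding, doc.embedding) }))
    .sort((a, b) => b.score - a.score)
    .slice(0, k)
    .map(({ doc }) => doc);
}
```

At Notion's scale the index would live in a purpose-built vector store rather than in memory, but the retrieve-then-answer shape is the same.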

Organizational Structure and Team Dynamics

The AI development at Notion started with a small, agile "tiger team," which Simon Last believes is crucial for rapid innovation. "It's good to have a small group of people that can move really fast," he stated. As the AI efforts expanded, the team grew to about 20 people, organized into subgroups focusing on indexing, UX, and modeling. Despite challenges in democratizing AI across teams, Simon Last emphasized the importance of enabling more teams to work with AI.

This approach not only fosters innovation but also promotes a culture of collaboration and knowledge sharing. By embedding AI specialists within various teams, Notion ensures that AI expertise permeates the organization, leading to more cohesive and integrated product development. Simon Last highlighted the importance of maintaining a balance between centralized AI expertise and distributed innovation, allowing all teams to leverage AI while benefiting from a shared foundation of knowledge and resources.

Evaluation and Testing of AI Capabilities

Evaluating AI products is challenging, with Simon Last highlighting the need for a repeatable evaluation system. Notion focuses on robust logging and dataset creation to track and address failures. Simon Last explained the importance of deterministic evaluations: "For the situations that you test, you need to make sure that those work and they don't regress." This approach allows Notion to continuously improve their AI capabilities.

The evaluation process involves rigorous testing, using both synthetic and real-world data to simulate various scenarios. This iterative method ensures that AI models are not only accurate but also resilient to changes and capable of adapting to new data inputs. By prioritizing empirical testing and data-driven insights, Notion can refine its AI models, enhancing their reliability and performance over time.
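As a rough illustration of the deterministic side of that testing, a regression suite can be as simple as a set of exact inputs paired with checkable properties of the output. Everything below is a hypothetical sketch; runInference() stands in for the product's actual model call.

```typescript
// Sketch of a deterministic regression eval: each known failure becomes a
// test case with an exact input and a property the output must satisfy.

type EvalCase = {
  name: string;
  input: string;
  expect: (output: string) => boolean; // deterministic check, no model grading
};

declare function runInference(input: string): Promise<string>; // assumed model call

const regressionSet: EvalCase[] = [
  {
    name: "summary is valid JSON",
    input: "Summarize this page as JSON: ...",
    expect: (output) => {
      try { JSON.parse(output); return true; } catch { return false; }
    },
  },
];

// Rerun every collected failure; a FAIL means a previously fixed case regressed.
async function runEvals(cases: EvalCase[]): Promise<void> {
  for (const c of cases) {
    const output = await runInference(c.input);
    console.log(`${c.expect(output) ? "PASS" : "FAIL"}  ${c.name}`);
  }
}
```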

The Role of Fine-Tuning and Model Selection

Simon Last shared insights into the complexities of fine-tuning models, noting that it often complicates the development process. "You're making your job like a hundred times harder," he remarked. Instead, Notion prefers in-context learning and leveraging the latest models to maintain product stability while adapting to new technological advancements.

Rather than relying heavily on custom-trained models, Notion opts to use the most advanced models available, integrating them into its products in a way that aligns with user needs. This strategy allows Notion to stay at the forefront of AI innovation without the overhead of extensive model training and maintenance. Simon Last emphasized the importance of flexibility and adaptability in AI development, recognizing that the landscape is rapidly evolving.
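The practical appeal of in-context learning shows up clearly in code: instructions and worked examples live in the prompt, so adopting a newer model is a configuration change rather than a retraining cycle. The sketch below is illustrative; the model name and examples are placeholders, not anything Notion ships.

```typescript
// Sketch of in-context learning: the task definition and a few worked
// examples are assembled into the prompt at request time.

const MODEL = "latest-frontier-model"; // hypothetical; swap when a better model ships

const fewShotExamples = [
  { input: "Meeting notes from 2024-03-01 ...", output: "Action items: ..." },
];

function buildPrompt(taskInstructions: string, userInput: string): string {
  const examples = fewShotExamples
    .map((ex) => `Input: ${ex.input}\nOutput: ${ex.output}`)
    .join("\n\n");
  return `${taskInstructions}\n\n${examples}\n\nInput: ${userInput}\nOutput:`;
}
```

Because no weights change, the same prompt and eval suite can be pointed at a new model the day it is released.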

Building Trust in AI

Trust is a critical aspect of AI adoption. Notion employs strategies such as user verification and transparent actions to build trust. Simon Last stated, "We show you this little pop up and the default is no, but you can opt in to sharing data with us." Citations and visualizations in the Q&A product also help users verify answers, fostering trust in the AI's outputs.

Transparency is key to trust, and Notion is committed to clear communication with its users about how their data is used and protected. By providing users with the option to opt in to data sharing, Notion respects user privacy while still gathering valuable insights for improvement. Additionally, visual cues and citations in AI-generated content allow users to assess the reliability of the information provided, enhancing their confidence in the tool's accuracy.

The Future Vision for Notion AI

Looking ahead, Simon Last envisions AI automating tedious tasks, allowing humans to focus on higher-level work. He sees AI as a primitive, integral to Notion's mission of enabling custom software creation. "Our goal as a company is to try to break the pattern of these like rigid vertical SaaS tools," he shared. This vision positions Notion AI to significantly impact knowledge work, elevating productivity and innovation.

As AI continues to evolve, Notion aims to harness its potential to transform how individuals and organizations manage information. By automating routine processes, Notion empowers users to allocate their time and energy towards more strategic, creative pursuits. This shift not only enhances productivity but also fosters a more dynamic and adaptable work environment, where AI serves as a powerful ally in achieving business goals.

Full Script

**Simon Last:** [00:00:00] So for Notion, we think a lot about like primitives or building blocks. Our goal as a company is to try to break the pattern of these like rigid vertical SaaS tools and instead reconstruct them from their underlying primitives and allow users to make their own custom software.

So things like, like a relational database or a table view or different blocks within a page. I think of AI as another primitive in the toolbox that lets you do really useful automations on top. So yeah, we have AI primitive concepts. So the writing, the AI autofill is a great example.

That's a primitive where you can write a custom prompt and it can do whatever you want.

**Simon Maple:** You're listening to the AI Native Dev brought to you by Tessl.

**Guy Podjarny:** [00:01:00] Hello everyone. Welcome back to the AI Native Dev. Today we have an opportunity to talk to one of the makers of probably one of the most advanced AI adopters in the industry right now, which is Simon Last, the co-founder of Notion. Simon, thanks for coming onto the show.

**Simon Last:** Hey, Guy. Really happy to be here.

**Guy Podjarny:** Simon, there, there's a lot for us to cover but I'm thankful that you're joining because I really perceive Notion as one of the pioneers in the product side of making AI really a fundamental part of the of the product. And And in useful ways, but also some ways that feel like a little bit weird and experimental, which I love.

Like I like that it's a bit of a combo of the two, and in the spirit of sharing and learning as a community, it's great to just tap a little bit into the do's and don'ts and the learnings and perspectives that you might have accumulated in the process. So I guess before we dive into the specific questions and try to tie them in and understand how they relate to software development, tell us a little bit about what Notion AI is. I think people [00:02:00] will be well familiar with Notion as a modern knowledge base for the organization, but tell us a bit about Notion AI.

**Simon Last:** Yeah. So we got started working on it. It was back in October 2022. We had just gotten early access to GPT-4. And it was something that I'd been closely watching for a while.

And then playing with GPT-4 was the trigger for me: oh my God, this thing is actually really useful now. It has a bunch more world knowledge. It can actually, like, follow complex instructions. We immediately got to work then on building our first product, which was an AI writing assistant.

And we launched that in February '23. So basically it can write and edit text, insert new content, and then there's a bunch of pre-packaged prompts for different common actions, like improving your writing.

**Guy Podjarny:** And this one was focused on sort of an inline creation or more like it was always with some command, right?

With sort of a slash.

**Simon Last:** So yeah, so there were prepackaged actions and then also you could just type whatever you wanted.

**Guy Podjarny:** So similar to the combo, I guess, that we're seeing today as well with some of the code completion and chat interfaces that we see in Cursor or [00:03:00] in the coding tools.

**Simon Last:** Yeah. It's still in the product today. It's very similar to the Cursor inline replace or insert. And then the second product we worked on we call AI Autofill, and it's basically you give the AI a prompt on an entire column of a database. And it will, like, fill it out for all the rows of that column. So you can fill out a summary, you could do like a translation of a different column. You could extract some information from the document, something that was pretty useful. And then the third product that we shipped in November '23 was Q and A. So basically we built an embedding index over all of Notion, and then you could ask questions, and it's a chat bot.

That one took us a bit longer just because there's obviously a lot more technical stuff involved around indexing all of Notion, especially at our scale, it's a lot of data, and then figuring out like how to retrieve it and answer from it.
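To make the autofill primitive concrete, here is a sketch of its basic shape: one user-written prompt applied to every row of a column. The types and the completeText() call are hypothetical placeholders, not Notion's API.

```typescript
// Sketch of a column-autofill primitive: apply one prompt across all rows.

type Row = { id: string; cells: Record<string, string> };

declare function completeText(prompt: string): Promise<string>; // assumed model call

async function autofillColumn(
  rows: Row[],
  targetColumn: string,
  userPrompt: string // e.g. "Summarize this row" or "Translate to French"
): Promise<void> {
  for (const row of rows) {
    // Give the model the user's instruction plus the row's other cells as context.
    const context = JSON.stringify(row.cells);
    row.cells[targetColumn] = await completeText(`${userPrompt}\n\nRow data: ${context}`);
  }
}
```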

**Guy Podjarny:** And then I'm a big fan of the column capabilities of it.

It's used quite heavily in Tessl in a variety of spots as a summarizer. Sorry, but I interrupted you. So third product and

**Simon Last:** yeah. And then [00:04:00] since then, throughout 2024, we've been working on what we call AI connectors, so indexing not just Notion but external apps, and evolving Q and A to be like a true universal search product. And then we've also been adding more capabilities to the chat bot to turn it into more like an assistant. So it can now create and edit pages. And it can do all of this using retrieval. So you could do something like, go find me information about this customer and then draft me a customer proposal, and it'll fill it out in the right format.

**Guy Podjarny:** So those are kind of actions that complement the search, and they're, again, kind of Notion-wide as opposed to something specific. And I'm curious, organizationally, like you describe these as different products, you call them, but clearly a user experiences them as capabilities within the product, within maybe the sort of the one experience that is Notion.

How did you organize for this, I guess both when the penny dropped or, when you got that exposure, is that kind of a commando force that's sort of put to the [00:05:00] side to tackle it, or not? And also as you evolved, are these still dedicated teams outside the main perimeter?

**Simon Last:** Yeah. So I'm a big fan of starting like a tiger team for a new initiative. I think it's good to have a small group of people that can move really fast and have a clear, focused mandate. So that's how it started. And that's how it was until probably the launch of the original writer.

Then we created a more formal team around it with a manager. And over the course of the past year and a half, we've been growing the team. It's probably about 20 people now. And there's different kinds of like subgroups. There's a whole group working on the kind of indexing ingestion.

We have a group focused on UX and a group focused on the modeling eval aspects. So it's still mostly centralized, although we've been trying to democratize it a little bit. So we want more teams to be using AI. I don't know. I just think there's so much low hanging fruit.

I can think of a new idea every day and be like, oh, I wish somebody would work on that. It feels like this crazy technology just dropped [00:06:00] upon us and it can do all these, like, insane things, if you have the idea and you know how to iterate on it.

So yeah, we were trying to democratize a little bit more. We have two or three other teams also working on AI stuff now. There's been some challenges with that, which, we can chat about if you want.

**Guy Podjarny:** Yeah. I'm very curious to dive in because indeed some things are the same and you just want to tap into them.

And some things are quite substantially different. But I guess still on the organization, do you think about the team that is originally building it as a platform team? How did you consider that division now that you've been building this for a couple of years and you have, it sounds like, multiple teams working on it?

How do you think about the shared infrastructure, if you will, between those, or the shared learnings or evaluation methods, as we'll touch on in a sec? How do you think about, I guess, enabling these other teams to adopt AI more easily?

**Simon Last:** Yeah, that's a good question.

Yeah. Initially we didn't think of it that way, it was more just like an end to end product team where the only goal is just to ship useful stuff. And then we're learning along the way how to do evals, and then our goal is to make it more of a platform team.

So we're trying [00:07:00] to take what we build and expose it as something reproducible. I would say it's actually pretty tricky because in many ways we don't really do any training of models. It's all about taking the best models out there and packaging them into the product.

And making everything work well, which involves setting up your prompt, doing logging the right way, doing evals, all that stuff. And it's interesting, most of it is actually, there's not that much that's technically challenging, I would say, about the platform aspect.

A lot of it is more about the best practice of how you do it and the knowledge of what steps you take and even how to think about it. Yeah it's a tricky thing to get other people to work on without context. What we've had most success with actually is when someone new wants to work on AI we've had a lot of success with just having them join the AI team temporarily.

And then they just join our standups and get in the weeds with us every day and then quickly pick up all the context around like how to do evals, how to iterate on prompts and yeah, we found it challenging the other way where we just like, give them, like the code pointers [00:08:00] to, all the different layers of the stack.

**Guy Podjarny:** Yeah, here's the thing that would run the inference or whatever it is, but

**Simon Last:** Yeah, and the code isn't that complicated. I would say the complexity is actually more in like the best practice of how you do things and the mental model of how to even think about it.

**Guy Podjarny:** Yeah. Yeah. I think that makes perfect sense. I actually had Patrick Debois was a kind of DevOps luminary on the podcast. And we talked about how the analogies work from the DevOps era. And a lot of it is indeed about platform teams and reuse, but also about embedding and about someone walking a mile in your shoes a little bit and learning both for empathy as you might get annoyed later on when the product doesn't work quite as you want, but for skill sharing and in security and DevSecOps, we use the same approach.

I think we're running out of letters we can add into the DevSecOps kind of combo. I think we need to find a new strategy there, but I think the approaches themselves are working quite well. I love that approach and it makes perfect sense to me. Maybe let's indeed I want to talk a bit more about the skills and those gaps, but maybe let's first describe what has been repeated oftentimes in this probably said most succinctly by Des at Intercom in an early episode here is that it's really hard [00:09:00] to know whether your product works when you're helping on AI. So I guess, what has your experience been when you're building these capabilities, how do you think about evaluation, about testing, about knowing whether it hits a threshold, but also knowing whether it hasn't regressed.

**Simon Last:** Yeah, it's super hard. And it's a really different experience than the pre AI world. Most of my experiences in building Notion building products, and it's been honestly really painful.

It's very exciting and fun and like I'm pretty obsessed with the new technology, but it's very painful and I often miss the old world. In the old world I could have an idea and it might take me longer than I thought, but I could definitely ship it.

But in the AI world, I can often have an idea and then be surprised that it didn't work in some way I didn't expect. I would say, I guess the way I think about it is, it's a twofold challenge. One is, for the situations that you test, you need to make sure that those work and they don't regress.

And then there's this nebulous space of things that you haven't even tested; it [00:10:00] can be arbitrary. And then, filling enough of that space so that you're confident that it's going to match the distribution of what users are going to request. Enough that you're reasonably confident.

You can never, like, fully match it. Yeah, I guess the way I think about that is, yeah, you need like a repeatable system or engine around this, and the key pieces are something like: you need to set up good logging such that interactions can be called up again. If it fails in some way, you can go look at that.

And it's extremely important to be able to exactly reproduce the failure situation. The log has to be an exact reproduction of the error. And then you should be able to just rerun that inference with the exact same input. That's really important.

If you can't do that you're totally screwed.
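The requirement Simon describes, that a log must be an exact reproduction of the failing inference, roughly implies capturing every input to the model call, as in this hypothetical sketch (the field names are illustrative):

```typescript
// Sketch of replayable inference logging: the log stores the fully rendered
// prompt and all sampling parameters, so the failure can be rerun exactly.

interface InferenceLog {
  timestamp: string;
  model: string;
  prompt: string;      // the exact rendered prompt, not a template reference
  temperature: number;
  output: string;      // what the model actually returned
}

declare function callModel(
  model: string,
  prompt: string,
  temperature: number
): Promise<string>; // assumed model call

// Rerunning with the identical inputs reproduces the failing inference,
// which is what makes a thumbs-down debuggable after the fact.
async function replay(log: InferenceLog): Promise<string> {
  return callModel(log.model, log.prompt, log.temperature);
}
```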

**Guy Podjarny:** And that creates probably some challenges. Like I fully relate to the full logging even more so than in regular instrumented systems. How do you work around the kind of the privacy concerns or some of the user data concerns around it?

**Simon Last:** Yeah. So we take that really seriously. What we do is we have [00:11:00] an opt-in early access program. So when you use Notion AI, we show you this little pop-up, and the default is no, but you can opt in to sharing data with us. And we don't use it for training.

It's just for evaluation. So basically that allows us to see the logs, like if you thumbs down, we can actually see the log with the input. And then we can add it to an eval data set, and we segregate out the prod data to make sure that it's not contaminated with other sources. But yeah, that's been extremely helpful, to let people opt in, and our user base is large enough that even if a pretty small percentage opt in, it's still quite a lot of data.

**Guy Podjarny:** So logging is one

**Simon Last:** yeah, so logging is one so yeah, I would say the next step is around collecting the failures into data sets. That's pretty important. So just like organizing them in some reasonable way such that for the given task, you have some data set or data sets that are like, these are all the cases that we care about and all the known, previous failures.

And then the next bit is being able to evaluate that a given failure is still happening or not, and that can be really [00:12:00] hard. There's many ways to do evals. I think about it as, you can have deterministic or non-deterministic evals. Definitely best to use deterministic when you can, and then you can use like a model-graded eval when not.

For model-graded evals, they work best when it's as concrete and specific as possible. I've seen people fail a lot doing model-graded evals when it's too generic or the task is too difficult. I think I said this in the last one, but if your model-graded eval is not, like, robustly easy for the model, then you have to have an eval for your eval.

And you just created like a new stack of problems,

**Guy Podjarny:** an infinite loop. So I relate to the data piece. So the data sounded like, going further back, you have to create a set of test cases for the things you want to work and then a set of cases for problematic ones.

I guess that's synthetic at the beginning, when you build a feature, and then you need to have access to real world scenarios. And you curate out of that a data set of failure cases, with the proper user permissions, like amidst the users that opted in to help improve the system.

I guess, do you [00:13:00] accumulate good cases over there as well? Does the data set mostly grow in the negative sense of don't do this? Or is it the same one? Is it basically you're like, here's a scenario, here's the correct and here's the incorrect answer?

**Simon Last:** I tend not to care about good cases that much.

I think the main focus is, cause what's the actionable value of that? Maybe it's nice to have a few of them just as a sanity check. But to me, the main goal is to collect concrete regressions, and then fix them, and then make sure that they stay fixed.

And then over time, you're ratcheting up: all these things that used to not work are now working, and then you're growing this data set of things that continue to work. And that's ideal. You'll often find issues that you just can't really fix.

And so that's just part of the yeah, we'll come back to that.

**Guy Podjarny:** That's really interesting because my product works at this percentage of Yeah,

**Simon Last:** The models have limits. There's many issues that we can't fix, and we're not a foundation model company, so it's like, all right, we'll just wait till the next model comes out.

Hopefully it'll fix this one. So you want some kind of eval for that. And it could be just like a human-graded eval, although that'll quickly get out of control. Then the really key next thing is [00:14:00] you need to enter this loop where you find some new regression, you add it to your data set, then you change the prompt in some way.

Then you rerun the examples of the data sets and then you need to be able to decide whether it's like better or worse. And ideally that's as automated as possible. And then, ship that change. And then there's just this constant loop of, new example, improve the prompt.

**Guy Podjarny:** Improve the product. And I think on the eval side, Notion is pretty open or flexible in what you can do with it. There are a lot of degrees of freedom to it. It's easy to think about non-deterministic evaluations that you can have in Notion as you're looking at text and unstructured data and the like. What's an example of a structured evaluation that you can do in a Notion AI capability?

**Simon Last:** Yeah. A lot of them, like anything to do with formatting the output in some way, maybe it's XML, maybe it's JSON. That's ideal. I think about, if you're having some issue, the best way to solve it is by solving it with a validator. Make it so that the invalid output is [00:15:00] impossible by, structuring the output in some way where you can deterministically say that, the bad output is incorrect.

So that's like the best way to solve any problem. And it's the easiest way to, to evaluate it. Another deterministic eval that we use all the time is I love using mini classifiers as part of the flow. Classifiers are really great and by classifier, I just mean some inference that outputs, like an enum.

Like a set of possible values. Classifiers are really great because they're really easy to eval because

**Guy Podjarny:** you classify the result of the product's activity. Like you classify the output from the LLM.

**Simon Last:** No, actually as part of the flow. So for example, one thing we need to do as part of the chat experience is decide whether to search or not.

Just as an example. So it's should I search or not search? Yes or no, basically. There's other classifiers that have more than two possible values, but yeah, those are really great in large part because they're so easy to evaluate. Because you just have this ground truth output and then you just compare, the actual versus expected.

It's really great for that reason. You can collect a big data set, and you can get a score.
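A hypothetical sketch of such a mini classifier and its eval, with an enum output ("search" | "no_search") compared against hand-labeled ground truth; the names are illustrative, not Notion's code:

```typescript
// Sketch of a mini classifier in the flow and its deterministic eval.

type Decision = "search" | "no_search";

declare function classify(message: string): Promise<Decision>; // assumed constrained model call

const groundTruth: { message: string; expected: Decision }[] = [
  { message: "What's our holiday policy?", expected: "search" },
  { message: "Make that paragraph shorter", expected: "no_search" },
];

// Because the output space is an enum, scoring is just actual vs. expected,
// yielding a single accuracy number to track across prompt and model changes.
async function score(): Promise<number> {
  let correct = 0;
  for (const { message, expected } of groundTruth) {
    if ((await classify(message)) === expected) correct++;
  }
  return correct / groundTruth.length;
}
```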

**Guy Podjarny:** Yeah. And [00:16:00] I guess what has been the driver behind the choice not to fine tune, or actually, you know what, I might be jumping to conclusions here. You said you're not training foundation models of it.

Do you find yourself fine tuning the model?

**Simon Last:** Yeah. So we've tried fine tuning extensively. I've personally banged my head on it a lot. I think I need to ask OpenAI, but I feel like I'm, if not the highest, maybe like 99th percentile in number of fine-tunes created.

Yeah, we try fine tuning quite a lot. I would say, yeah, the big issue with fine tuning is that you're making your job like a hundred times harder, at least. Because, I need to collect this whole dataset. The actual fine tuning process can be difficult.

Especially because it's slower. You have to wait for things to train. And then it's really hard to debug issues. You're basically, like, creating this black box where, as you do successive fine tuning runs, any new examples can poison the entire data set.

You can mess up the model. So it really requires you to have extremely good [00:17:00] evaluations. And it's really hard to debug issues. So I've had issues with fine tuning where, you know, I literally would spend weeks trying to figure out what the problem was. It's just extremely hard.

Yeah, I would say I'm honestly not that bullish on companies outside of the foundation model companies doing fine tuning. And if I like meet a startup and they say, they're doing fine tuning. I actually now think of that as like a negative update. It's like a lack of

**Guy Podjarny:** experience.

You buy into the promise, but it doesn't actually

**Simon Last:** justify itself in the

**Guy Podjarny:** real world,

**Simon Last:** And I got bit by this bug too. Like, it sounds so cool, and I think as engineers we want to have more control. We want to do a technically cool thing, you know, the technically, like, powerful thing. And I was totally susceptible to that, and it was really fun to do it. But I don't know.

I'm a huge fan of in-context learning. Like, it's gonna get better. Yeah, there's a lot of reasons to do it. Another big reason not to fine tune is, if you're not a foundation model company, you really want to just be using the best model at any given time.

And the progress is so fast. And so if something doesn't work [00:18:00] now, there's a good chance it's going to work in the near future. And if you're fine tuning, you're really just locking yourself into this slow, complicated process that makes it hard to update. In-context learning is amazing because if there's a new model, I can switch the next day.

**Guy Podjarny:** And I guess in that case, that was my next question is like how often do you try different models and how rigorous a testing process do you feel is necessary to make that switch. Cause those are probably, new features are new features.

You control the timeline. A new model comes along. I guess you still control the timeline of how quickly you, you roll it out. How do you think about the trade off between wanting to tap into the new hotness versus, I don't know, potentially slow or how confident, do you feel about, Hey, like the next one came out, can I sort of slot that in and get it out within the week?

**Simon Last:** Yeah, that's a good question. I would say the speed at which you can do it is really dependent on the quality of your evals. How good are your evals, like, how good are they at telling you confidently that this change will improve the overall experience? Yeah, I would say that's one of the reasons why evals are so [00:19:00] important.

Probably, changing prompts is more important than changing models. That's going to happen much more often. But the same evals work for both, right? Yeah, we try to, yeah, we've definitely invested a lot in really making sure the tools are right and the evals are really good.

But also, I would say, as a products company, I think it's pretty important not to be just, like, jerked around constantly by new models coming out. I think it's important to, like, think about really understanding the task that you uniquely care about as a product. Your alpha should be to really deeply understand the task and what it means to produce a good result, and the product experience around that.

And not just like watching new capabilities and, releasing them as they come out. I think, it's really about you and your product and like thinking about the models as like a means to an end to enable that. And very often a new model like just it's like shiny and cool, but it's like maybe it just doesn't really get you much [00:20:00] benefit for the tasks that you care about.

**Guy Podjarny:** And I guess, is there an example of distilling? You mentioned a couple of times now the importance of being super crisp about what it is that you need: either what it is that you want for your task and from the model, or, in the evaluation, actually asking the model for something very concrete.

Is there any, like, evolution example of, hey, when we started with feature or product X, Y, Zed in Notion we did this, and then we learned that we have to trim it this way, that jumps to mind?

**Simon Last:** You're asking about making the task more specific.

**Guy Podjarny:** Yeah. I'm just I guess I'm curious what has been the real world experience as you were building within Notion, these capabilities of a case where making it more crisp or more succinct has yielded better results.

**Simon Last:** Yeah, in general, yeah, like the model doesn't know what you want until you tell it, right? Yeah, I think a big one, Q&A, is probably a really big example. Obviously, you know, in our very first experiment, you just give it the search results and then tell it to produce an answer or something like that. I think a big evolution there [00:21:00] is coming up with a more specific rubric which just explains in detail what we care about and what it means for an answer to be good in our product experience.

And then literally telling the model that rubric in the prompt is, yeah, super helpful.

**Guy Podjarny:** And I guess just to make sure everybody listening is aligned on it, the search versus chat: search is the more traditional search of your knowledge base that Notion has had for quite a while.

It probably involves some amount of AI there as well for ranking or at least some algorithmic ranking within the findings.

**Simon Last:** Oh, we actually completely rebuilt that stack for the Q&A. The key technical unlock there is the embeddings. Our previous core search stack was, like, Elasticsearch, full text fuzzy search.

It doesn't really work if you just use that and try to do Q and A, maybe works, but yeah, the really magical experience comes from rebuilding your stack to use embeddings.

**Guy Podjarny:** So today when you say search, you mean the two parts are both kind of LLM-inspired: one is embeddings, to pull up the relevant records from the vector database, and then [00:22:00] subsequently process those wisely, knowing what it is that a good answer looks like, to produce the result.

**Simon Last:** Yeah. Yeah. That's the Notion AI, yeah, the Q and A experience.

**Guy Podjarny:** Got it. So I guess let's talk a little bit about the people building the stuff. We talked about the frustrations, the set of the new skills, how does that change, if at all, the skillset of the individuals building and you've already spoken about the fact that there's a learning and so not a, if you're just a software engineer and you pick it up right now, it's going to take some getting used to, and you'll need to change some of your ways.

Does it change anything around who would be good at it? The traits you might be looking for in a hire?

**Simon Last:** Yeah, that's a really good question. And I think it definitely does. Yeah, I think of a few things. So one is, especially on the more producty side, if you're looking for someone that's going to be able to lead an AI product team, you definitely need someone that's going to be very okay with a lot of uncertainty and really pushing to think through every day, are we doing the right [00:23:00] thing?

I think that's, yeah, it's so important, because basically every time we do one of these projects, the idea that we had at the beginning and the end are often super different. And you have to really just be putting in the energy every day to think about, okay, are we today still doing the right thing?

**Guy Podjarny:** Are the changes due to like misassessment of what the AI can and cannot do? Is it how the user might want to use it? I guess what do you feel drives the more substantial evolution versus software?

**Simon Last:** Yeah. Yeah. It comes from multiple places. Yeah. One is definitely what the AI can and cannot do.

And then actually morphing the experience to what the AI does well, to the thing you discovered as well. I think that's pretty important. Another big element is there are no product paradigms really around it.

There's a few things, we have the AI chat concept now, there's a few concepts now, but there's just not that much. So you have to invent things. And then when you have people actually using them, you can easily discover a lot of things like, the user wasn't using this [00:24:00] the way I expected.

So yeah, there's a lot of uncertainty around just, like, inventing the product concept. It's such a fuzzy, weird technology. For any given goal, if you think, I want to build this feature for this goal, I can think of 10 different ways to do it. You have to navigate that. Another big one is, an ideal feature, I think, doesn't require people to change their habits. It's just embedded in their existing habits. That's really great if you can get that. Otherwise, you have this double challenge of, in addition to making your thing work, you have to actually get people into your flow.

And it's particularly hard for us because Notion already does so many different things. So yeah, that's a really big one. And it's not always obvious at the beginning. So yeah, I would say the general umbrella there is just a lot of iteration speed and being capable of dealing with that uncertainty.

On the kind of modeling ML side, I would say one of the biggest predictors of success is just like how often do they touch reality? Like how often do they like try that [00:25:00] idea and run that experiment and then the kind of loop on that because it's such a weird technology.

Like it has to be so empirically driven. You can have an idea, but you're not gonna get anywhere unless you try it, and you should try it really quickly and then move on to the next idea. Not just at the project level, but, you know, on a day to day level of, like, different ways to solve each problem. So yeah, that's the really key thing there.

It's just like they have to just be able to just really quickly, empirically iterate. And if you don't do that, it's not going to work.

**Guy Podjarny:** That's great advice. And I think what I find really interesting is that it actually is very grounded in product. None of the skills you mentioned are, hey, it'll be useful if they have a master's in machine learning or, in general, kind of mileage with working with it. I relate to that. I don't know, like I come from software development and from product. Also, it's hard to find good software engineers that also happen to have ML backgrounds. But do you find those are advantageous? Like when you see someone with a resume, or when you look at the people in your team that feel like they're the best or have been the best at adapting to [00:26:00] it?

Does a machine learning background help?

**Simon Last:** Yeah, I think not necessarily. It's funny, I'm not sure. I mean, we definitely have some really great folks on the team that have an ML background. I've definitely met people where they had an ML background, but it felt like it was actually harming them, especially if the background is, like, from the pre language model era.

**Guy Podjarny:** Yeah. The symbolic AI

**Simon Last:** Yeah, it's just a different, you can easily get trapped into small ways of thinking. A lot of the concepts do still apply and it is helpful to apply those, but I don't think I look for that as one of the primary things. In fact, if someone has a strong ML background, I have to suss out, like, how locked in are they to the previous paradigm.

But yeah, I will say that if you have a strong ML background and you're into the new tech and have a strong open-mindedness, then yeah, it can be a huge strength.

**Guy Podjarny:** Yeah, it's a great combination for it. So I guess maybe one last question still on the dev side of it. How strongly did you adopt, in Notion, not just in the AI team the use of AI in [00:27:00] development, any, maybe starting with you do you use any of the Cursor or the AI test generations or docs, whatever, what's your tool of choice, if any, in the AI dev space?

**Simon Last:** Yeah. These days I'm using Cursor. It's really good. In particular, I really like their autocomplete model. It's just extremely good, in that it can actually produce arbitrary edits around the code. And I often find myself just, like, tab, tabbing. I think there's a meme on Twitter I saw a few weeks ago.

It's just, like, the new engineer experience: just tab. That's pretty funny. Yeah. I think, yeah, Cursor is really great. And I use Claude, ChatGPT, or more Claude, for coding these days.

**Guy Podjarny:** And then just in the chat to ideate and yeah.

**Simon Last:** Yeah. Yeah. And yeah write code for me.

**Guy Podjarny:** And what about the rest of the team? Are the team, is there a standard, when you think about Notion as an organization, have you pushed to, do you push people to use it? Did you try to standardize on anything?

**Simon Last:** Yeah, we definitely do try to push people to use the tools we think are most productive.

We have most engineers using Cursor now. And then we're experimenting with, we're doing like proof of concepts [00:28:00] with a bunch of startups doing various things. I wouldn't say there's anything crazy successful to call out there. Yeah. There's a lot of startups working on just coding agent type things.

And yeah, I think we're working on a few, like for some refactoring projects. And yeah, I would say nothing quite works yet still. I was actually just playing with Devin again. I'll try Devin every couple of months, cause they're always improving, and it's gotten a lot better, but it still didn't do my favorite eval example.

**Guy Podjarny:** Yeah, I'm curious now, what's your favorite eval example? Is it just that?

**Simon Last:** Oh, it's an app. Yeah. So it's a pretty simple app, but it touches all the pieces that I think are interesting. So it's a calorie tracking app. All it is is a table with the food and the amount of calories, and then there's an input, and you can just type whatever you want, and then it's supposed to extract out each food item and the calories and add it to the table.

So it involves a database, a backend, a frontend, and then it also has to write a prompt and call into a language model to do the extraction. And last time I tried, it [00:29:00] actually did get the initial version. And then I asked for just one small feature change and then it didn't work from there.

They've made huge improvements and it's like super impressive, but yeah I feel like we're not quite there yet still.

**Guy Podjarny:** Yeah. I want to shift gears a little bit and talk a bit about trust in AI. And then maybe we do come back a little bit to this like future of both development and Notion AI itself.

I think in AI, a lot of the challenge, with sort of users is learning and knowing what to do, what can they do with that sort of empty chat box or actions, as you pointed out, like learning new workflows is also quite hard, but I think another challenge is trust. You're putting something in.

There's the simple example of, I don't know if it hallucinated something, but also, like, how do you know if it succeeded in a thing, or if the answer is correct? It could hallucinate, but it could also just be wrong in that sort of path. How do you think about trust in Notion AI?

Is there anything you've learned around helping users figure out when they can trust an answer in Q and A more than elsewhere? And again, some [00:30:00] of it is, true or legit, your product did or didn't work, but also some of it is just user adaptation and learning new skills. How do you think about that?

**Simon Last:** Yeah. Yeah. That's something we think about a lot. I would say the most obvious tools are, one, if the AI is taking an action, have the user confirm that and be able to see, in a nicely visualized way, what action is being performed. I think that's just a really key thing. So even for our original AI writing, the first product that we came out with, it never just edits your doc.

It's always producing a diff, and then you preview the diff and you can accept it or cancel. And if you want both, there's like an insert-below button and then you get both. You sidestep a lot of those problems just by letting users visualize.
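The confirm-before-apply pattern Simon describes can be sketched as follows; the types are illustrative rather than Notion's data model:

```typescript
// Sketch of diff-style confirmation: the AI proposes an edit, and nothing
// touches the document until the user decides.

type Proposal = { original: string; proposed: string };

function applyDecision(
  doc: string,
  p: Proposal,
  decision: "accept" | "cancel" | "insert-below"
): string {
  switch (decision) {
    case "accept":
      return doc.replace(p.original, p.proposed); // replace in place
    case "insert-below":
      return doc.replace(p.original, `${p.original}\n${p.proposed}`); // keep both
    case "cancel":
      return doc; // document untouched
  }
}
```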

**Guy Podjarny:** Your attendance, eventually.

So the user decides. You just prompt them to Yeah,

**Simon Last:** I think that's, that's the most critical tool. And that's extremely useful; it sidesteps a lot of it. Just let the user decide if they like it or not. And it especially makes sense for the writing task.

Cause it's a fuzzy task. Like it might, it might have partially done it right. And that's still useful. But it's not something you'd want [00:31:00] to like immediately send an email about. And then you mentioned Q and A, I think, yeah, super important thing there is just citations and visualizing those in a good way.

So it's grounded and then, and

**Guy Podjarny:** yeah. And how often do people use those? Do people when you look at the at the stats, do people actually go off and click those links and make sure that you didn't make it up? Or is it more. Hey, it's actually needed because we get it wrong often enough that we actually still want users to correct us.

Or is it typically, hey, it's just like for the warm and fuzzy, but we're pretty good at this.

**Simon Last:** People click the citations quite a bit. I think it's more than like a web search tool because, you actually might want to go see more information. It is like an internal doc that somebody wrote.

But I would say, yeah, it really depends on the nature of the question. There's a class of questions where, you know, it can just really robustly answer them. Some kind of factual question, like do we have Presidents' Day off or something?

Yeah, if it's written somewhere, it's going to figure that out and it's going to cite it. There's not much chance of it, like, messing up on that. And then if you see that answer, yes, we have it off, and [00:32:00] then here's the page and it says 2024 benefits.

A user's probably not gonna click through on that. But yeah, Q and A at the limit is an arbitrarily hard task. And especially when we get to our connectors, pulling from many sources, people can ask kind of complicated questions. One thing we're working on right now that I'm really excited about is a GitHub connector, so we can actually see your code and your PRs.

And that's really fun because a non-technical person can ask a technical question, and then you get this really interesting answer where it's weaving together something from the code, something from Slack that somebody said, and then something from some Notion docs or tasks that somebody wrote.

And then, yeah, it's doing a much harder task at that point. It's getting information from multiple sources, and then there's many potential ways for it to make reasoning mistakes. It might be conflating two things that are actually similar.

**Guy Podjarny:** How do you handle those situations then? Like, you're presenting an answer. Is there an element of, like, confidence or something like that? Have you considered adding that to the product, of, hey, I'm 70 percent sure this is right?

**Simon Last:** Yeah. Yeah. We do definitely prompt it about that.

I think that's pretty [00:33:00] important. We want the model, we're mostly using language, we don't have like an indicator. I think you can go a pretty long way using language, telling it that it needs to communicate uncertainty when it's there, and the models are getting pretty decent at doing that.

**Guy Podjarny:** And would it sometimes say it doesn't know?

**Simon Last:** Yeah, definitely. Yeah, if it's not there, it doesn't know. Yeah, that's part of the prompt too. We wanted it to have a reasonably high bar.

**Guy Podjarny:** A bit of a side question on it, but like, how big do the prompts get? Like, all of that, so the system prompt.

**Simon Last:** Pretty big. We're definitely pushing the limits. Especially the Q&A prompt, definitely, also the writing prompts. We're definitely pushing the limits of what the models can do, especially because we want it to work in all the cases, right? And so over time, we've incorporated all these interesting corner cases people find that we don't want to regress on.

Yeah, I feel like we're maxing out the prompt, not in terms of the actual context length, the context lengths have gotten pretty long, but in terms of the model's ability to actually attend over complex instructions.

So the system instructions are quite long and there's many examples.

And then also there's the actual user task, which can have a lot of additional context. I often feel like [00:34:00] after a few, like maybe weeks of iteration or something and like finding all the regressions, it can actually be pretty hard to improve more beyond that. So we often will find cases where it's okay, the engineer is working on it.

They'll add it to the system prompt: in this case, don't do this, or something. But then it just doesn't work. There's too much going on already.

**Guy Podjarny:** Instructions. Yeah. I guess like humans, if you're going to give me too many instructions, eventually I might technically remember it, but won't pull it up at the right time.

**Simon Last:** Yeah. Yeah. So we often have to just like prioritize and just be like, all right this case, maybe the only way to solve it is just for it to attend over that instruction somehow, and it's just not working. So then at that point, unless we have some more creative idea, we have to wait.

**Guy Podjarny:** And do you ever engage users, and have you considered doing so, in the thought process of, hey, to answer this Q and A, I pulled information from these seven locations and I've blended them together, and get user feedback on it? Maybe I'll broaden even that question, which is, Notion is a very powerful tool and it has probably a lot of regular users, but it also has some pretty real power [00:35:00] users who can really squeeze a lot more out of the product because they know how to make the most of it.

Is there a power user of Notion AI?

**Simon Last:** Yeah. Yeah. So for Notion, we think a lot about primitives or building blocks. Our goal as a company is, to try to break the pattern of these like rigid vertical SaaS tools and instead reconstruct them from their underlying primitives and allow users to make their own custom software.

So things like a relational database or a table view or different blocks within a page. I think of AI as another primitive in the toolbox that lets you do really useful automations on top. So yeah, we have AI primitive concepts. So the writing, the AI autofill is a great example.

That's a primitive where you can write a custom prompt and it can do whatever you want. And then our power users basically are writing these really long complex prompts, to do something crazy. And then, the way that has impact for us is then, their complex prompt gets packaged as a Notion template, which then can be distributed to other people.

**Guy Podjarny:** Yeah, interesting. So I guess I'm [00:36:00] taking two things. One is it's not a Notion AI power user; it's a Notion power user, and AI is part of their kind of toolkit now, one of the primitives they can understand and make the most of. And then two is that within Notion AI you're looking for capabilities to make available to those users when they really want to double down, and prompts are the core version of it.

Have any of these prompts eventually made it back to the product, of, hey, these users figured these things out, with the proper permission?

**Simon Last:** I think so. Yeah. We've definitely several times had, in like beta testing a feature, like for example for the AI writing product, the way it works is we have these kind of predefined prompts that we control. And there were definitely several examples of people saying, hey, I want to do this, it's not quite working. And then we make the prompt, and then we'll optimize it, like, much more than they could.

**Guy Podjarny:** Yeah. One more question on the trust side. A lot of the conversations I've had here on the podcast came down to this kind of garbage in, garbage out type reality, where maybe indeed in support cases, like at Intercom, if your knowledge base is incorrect, [00:37:00] then your support bot will give incorrect answers because it's relying on that information. Or talking to Jason Warner at poolside, we talked a little bit about how do you overcome bad coding practices that you might have in the code and sift those out?

I imagine within any organization that has enough maturity with Notion, you're going to have a lot of documents that are out of date. They no longer represent kind of reality or truth. How do you think about, I don't know if it's ranking, or identifying the better, the correct data, the good data from the bad one?

**Simon Last:** Yeah. No, yeah, that's extremely important. That's something we work on a lot. Yeah, very similar to code. People often don't update the old thing. It's probably even worse than code. Knowledge bases are often, like, very neglected. Yeah. You almost touched on it, yeah. I think there's two steps where this can happen, or where you can solve this problem.

One is in some kind of ranking or filtering step. We definitely do a lot there. We want to make sure that we're ranking the highest confidence documents using all the signals that we can, like, how recently was the document [00:38:00] edited? Maybe where is it located?

Is it connected to other documents? Who wrote it? We try to use all these, like, signals, and you can get quite far with that. Even just when the document was last edited or viewed is, like, a very powerful signal. And then, yeah, the second step is giving it all to the model and having it decide.

And the models have gotten like much better at that. I think, yeah, our case is different than like the Intercom case, cause the Intercom case, you're sending this message to a customer autonomously, so that's pretty high stakes. I think for our Q and A product. It's a bit lower stakes. It's all internal information and then we're giving

**Guy Podjarny:** users,

**Simon Last:** Our prompt is more about just like presenting to you all the relevant information in a really distilled actionable way.

And so if there's, like, multiple sources that say conflicting things, we want our answer to just say that, whereas in the Intercom case, you probably wouldn't want that. So it's a bit of a different,

**Guy Podjarny:** you'd have to conclude what is the answer you actually want to,

**Simon Last:** yeah, exactly. Yeah.

Yeah. If it says, if one document said the product is shipping this month and one says the other month, the answer should just say both. Like it just says, in the most recent document it says this, but in this document it said that. [00:39:00] And that's really useful for people.

**Guy Podjarny:** Yeah. And I guess the how when you say you give it all to the LLM, that includes then the metadata of probably obviously like dates, the documents were written and things like that. But does it cope with more metadata as well with things like. view, like visit counts and I don't know, like organizational charts of like seniority, if that's more of an authority or like affinity to a domain, or do you end up going down those routes and are those impactful?

**Simon Last:** Yeah, I would say, yeah, like I'm a bit less bullish on, like, showing it some kind of rank score or something like that. I don't think it would be able to reason that well over that, but I don't know. I think of it like, if you were showing a person a stack of documents and you gave them all day to figure out, like, what's the answer.

What would you give them? Like, when the document was last edited, for example, feels like a great one, right? Cause it's a very transparent signal. And it's actually useful for the answer itself. Like the model will actually often directly use that information in the answer.

It will say, [00:40:00] "this document was edited a month ago" or something like that. But I'd be wary of showing it non-transparent, black-boxy information, because that can easily steer it in the wrong direction.

If the answer is wrong but somehow your system produced an affinity score of 0.9 or whatever, and the model uses that information, I don't think that's going to work very well.

**Guy Podjarny:** I heard you say on another podcast that you're better off trying to stay close to the way the model was trained, to the data the model might have seen, right? And I guess that aligns: date stamps are probably something the models saw a lot in their training data, while view counts probably weren't frequently there.

**Simon Last:** Yeah, and especially any kind of novel score that you're producing, it certainly would never have seen before.
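As a companion to the ranking sketch above, here is what the second step, giving everything to the model with transparent metadata, might look like. The prompt format, field names, and the conflicting sample documents (echoing the shipping-date example earlier) are assumptions for illustration, not Notion's actual prompt:

```python
from datetime import date

def format_source(title: str, last_edited: date, author: str, body: str) -> str:
    """Render one retrieved document with transparent, human-readable metadata.
    A model can reason over (and cite) a date; it cannot meaningfully reason
    over an opaque score like affinity=0.9."""
    return (
        f"### {title}\n"
        f"Last edited: {last_edited.isoformat()} by {author}\n\n"
        f"{body}\n"
    )

# Two deliberately conflicting sources:
context = "\n".join([
    format_source("Q3 Launch Plan", date(2024, 5, 2), "PM team",
                  "The product ships in June."),
    format_source("Old Roadmap", date(2023, 11, 10), "PM team",
                  "The product ships in March."),
])

prompt = (
    "Answer the question using the sources below. If sources conflict, "
    "say so explicitly and note which document was more recently edited.\n\n"
    + context
)
```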

**Guy Podjarny:** Yeah. So for the last section here, let's talk a little bit about being ambitious, which I've heard you refer to: when you think about AI, you need to be ambitious around where it's headed and [00:41:00] plan for it. Let me start by asking a bit of a meta question, and then we'll maybe try to get a glimpse into your ambition with AI. When you build an AI solution, you have this duality: on one hand, you want to fit into the current flow, as you pointed out here as well, because users don't really know what they want or how to use this.

So you have to fit into their current realities. But you also want to be ambitious: you want to think about the future and something entirely different, because when you stop and think about it from first principles, there might be a better way to do it.

I guess, how do you think about the two and how do you prioritize between these two tracks as you build AI capabilities?

**Simon Last:** So just to understand, it's like you're asking about incrementally changing existing behaviors and,

**Guy Podjarny:** Yeah. So maybe let me reiterate a bit. I have this mental model that thinks about the level of change and the level of trust that you need for any AI solution.

The solutions that get the most adoption today are the ones that require little change and little trust. They are [00:42:00] things that are very easily human-verifiable, and therefore easy to adopt, but they also hit a bit of a local maximum because they work within the existing workflow.

They don't change anything around it. The solutions that presumably have the greatest potential are ones that assume high trust, so you allow autonomous actions and the like, and you're willing to adapt the way you work to make the most of the new capabilities.

But those are hard to adopt, because you don't really have that trust yet and you haven't changed how you work. So that's, as you pointed out, a double ask of the user. But my theory, or my perspective here, is that it is a local maximum if you limit yourself to existing workflows.

So you have to think a little bit ahead. I guess what I'm asking is, one, do you agree? You might not; this is just theory, not at all proven. And two, how do you think about the two? I'm sure you have a lot of longer-term thoughts about where this is all headed.

Do those basically go into the parking lot, or do you try to build them [00:43:00] into the product in some wilder bets that you run in parallel?

**Simon Last:** Yeah, I don't think those things necessarily need to be separate. When I'm thinking about a new product, I have two considerations.

One is: what is the capability and the value being created, and does it actually work? The second is around how you onboard people into it and expose them to it. And I think if you're clever about it, you can be both familiar, in their workflow, and also giving them a new capability.

Even an autonomous one. You should be able to connect it to something they're already doing, and also propose a new behavior, and then incrementally get them into the more autonomous thing. If you design it cleverly, it doesn't feel like a fundamental correlation;

it just happens to be correlated in that way, but if you're clever about it, it doesn't need to be. The best example is probably code autocomplete: originally Copilot, and then Cursor's, which is extremely good.

It's [00:44:00] actually doing quite a bit for you, and scaling up what it does for you. I think you can ambiently show people a new experience inside something they're already doing. That's the ideal.

**Guy Podjarny:** Yeah, to get the adoption. I get it.

I've got all sorts of thoughts about the code piece, because I do think there's a move from editing the code, even if you can edit the code much better, to a different source of truth.

But that's a different conversation. Well said, though, around making it a journey of incremental steps that lead the user to that destination.

So with that, what is the future for Notion AI? If you could educate everybody, walk them through this path over a bunch of years, what do you envision as the destination for Notion AI, or how would it transform Notion?

**Simon Last:** I would say the North star for me for AI is basically, can you automate all the things that people don't want to be doing, right?

All the cumbersome, tedious tasks that people do every day, and ultimately lift humans up into a higher level of abstraction where they're doing higher-leverage, more fulfilling activities. To me, that's the North [00:45:00] Star of what's possible with AI. For Notion, I think about it this way: we're basically a knowledge-work system, right?

People do all kinds of different knowledge-work tasks on it. You might be an engineer managing your team's knowledge base, tasks, and projects; maybe a product manager doing your product roadmap and giving updates; you might be a sales leader making a custom CRM. And so much of what people do in Notion is these tedious tasks, right?

Notion's core concept is the database, like a relational database, and a lot of what people are doing is essentially manual data entry: manually reading from and writing to it. There's a huge opportunity there. I see it as: let's lift up one level of abstraction. Tools like the database should become an implementation detail, the AI, or some workflow that involves AI, should automate all those tedious steps, and the human should be uplifted to observing the final outputs, steering [00:46:00] them, and figuring out which high-level tasks matter most to us.

**Guy Podjarny:** And is it a curation exercise? If you can reliably extract information from its original source, listen to a conversation and capture it, or, with your connectors, pick it up from other systems and put it into Notion, then is Notion as a knowledge base about being a curated knowledge base?

Because humans have chosen what goes in and what doesn't, or verified it in the process, as opposed to just having firehose access at any time?

**Simon Last:** Yeah, in the pre-AI world, for any use case, Notion is the repository for it and the way that you read and write to it.

I think in the AI world, it's also the host for the workflow that directly does the job to be done that the data is trying to accomplish. For example, say you're an engineering team. One of the things engineering teams do is manage their tasks and incoming bugs: triage [00:47:00] them, figure out which bugs are important, which ones you're going to work on this week.

When they're finished, maybe you want to notify the right people, maybe tell the customer it was fixed. That whole end-to-end workflow should be automated. Then the engineer working on that is thinking less about: there's this ticket that came in,

is it a clear bug? Do I need repro steps? Should I send it back to the customer? The level of abstraction is a little bit higher, more of a strategy: let's encode into the system which classes of bugs are most important for us to fix, when something is high priority and should be flagged to us, and when it's low priority and should be backlogged.

And then the AI is actually making those micro decisions and managing the database for you.
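A minimal sketch of what "encoding which classes of bugs matter" could look like, with an AI making the per-ticket micro-decisions Simon describes. The policy fields, routing labels, and the `classify` hook are hypothetical, not Notion's implementation:

```python
from dataclasses import dataclass
from typing import Callable, Tuple

@dataclass
class TriagePolicy:
    """The team's encoded strategy: which bug classes matter most."""
    high_priority: set[str]     # e.g. {"data-loss", "security"}
    auto_backlog: set[str]      # e.g. {"cosmetic", "typo"}

def triage(report: str, policy: TriagePolicy,
           classify: Callable[[str], Tuple[str, bool]]) -> str:
    """Route one incoming bug report.

    `classify` stands in for an LLM call that returns the bug class and
    whether the report already contains reproduction steps."""
    bug_class, has_repro = classify(report)
    if not has_repro:
        return "ask-reporter-for-repro-steps"
    if bug_class in policy.high_priority:
        return "flag-to-team-now"
    if bug_class in policy.auto_backlog:
        return "backlog"
    return "weekly-triage-queue"
```

The humans set the policy once, at the strategy level; the AI applies it ticket by ticket and keeps the database up to date.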

**Guy Podjarny:** And I guess, how do you think about that evolution for Notion? There might be a lot of tedious data-entry work done in Notion, but Notion is also really good at that. Or rather, it's a much more delightful experience to enter that data into Notion than it is into other tools.

I dare say that's one of Notion's competitive advantages compared to some tools. So if [00:48:00] that's the case, is there a strategy change you have to apply? To say, hey, of course we'll keep maintaining this, but it's actually more important now, for instance, to invest in workflows than in... I might be getting into secret-sauce material over here.

**Simon Last:** So I think both are still important. At the limit, for the engineering task-triage use case I just talked about, you could just ask a coding agent, "make this entire thing for me," and it would make a Postgres database for your tasks, then go connect to the Zendesk API and input them, right? That will work at the limit; it doesn't quite work now. But even in that world, you still want a convenient data container and a permission model that is end-user accessible.

I think that would still be much preferred. I don't want my tasks to be in some Postgres database with a one-off custom UI built on top of it. Who has access to it? Have I verified that? Is it also building me a whole permission model where I can see who has access, and is it tested?

Right. [00:49:00] So I still think that world is really important. Even when the AI is automating 90 percent of the workflow, you still want to go in and see the current task list, look at the backlog, and maybe spot one that shouldn't have been backlogged.

And then go add it to the current sprint or whatever it is. So that part stays important. I just think the balance will shift a bit more toward the workflows, but this highly user-accessible data container is still extremely important, and we're going to keep investing in it.

That's the foundation, and then the workflows are reducing the tedious work.

**Guy Podjarny:** I absolutely relate to that. I oftentimes say that you don't want to be in a place in which, when you ask the LLM to create a tic-tac-toe game for you, it has to create an operating system.

You want to be able to build on reusable components that are good, shared jargon between different users. You want to go from one company to another and see familiar interfaces, especially for things that are not necessarily true or false; they're just tastes,

just preferences, but [00:50:00] at least you can create commonalities and norms.

**Simon Last:** So in the category of knowledge work, I think Notion conveniently makes a really good data container. What do you need for knowledge work? You need a database, definitely multiple databases related to each other.

You want unstructured documents as well. And then you want them all wrapped up in a container that's convenient for users and for the AI, with a clear permission model so that we can trust what the AI has access to read or write.
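As a rough illustration of that container idea, here is a toy data model: related databases, unstructured documents, and a permission check an AI workflow could be gated on. All names are assumptions, not Notion's schema:

```python
from dataclasses import dataclass, field

@dataclass
class Permission:
    principal: str   # a user or group id
    access: str      # "read" or "write"

@dataclass
class Database:
    name: str
    rows: list[dict] = field(default_factory=list)
    relations: list[str] = field(default_factory=list)  # names of related databases

@dataclass
class Container:
    documents: list[str] = field(default_factory=list)  # unstructured pages
    databases: list[Database] = field(default_factory=list)
    permissions: list[Permission] = field(default_factory=list)

    def allows(self, principal: str, access: str) -> bool:
        """Gate an AI workflow's reads and writes on the container's permissions."""
        return any(p.principal == principal and p.access == access
                   for p in self.permissions)
```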

**Guy Podjarny:** Yeah, makes a lot of sense. So maybe one last question: what is your wishlist in terms of the future of development?

When you think about AI-powered software development, if you cast your eyes to the future, what would you like to have?

**Simon Last:** Yeah. Similarly to the knowledge-work thing, I think of it as successive layers of abstraction: abstracting away tedious work and getting me into higher-level modes of operating. Autocomplete is the most basic layer, and the new Cursor one took me from auto-completing a single line to doing [00:51:00] semi-complicated refactors just by tabbing through the document.

Then, if you actually chat with the AI, you can ask for a function, basically, or "write some tests for me." And then successive layers of abstraction: can it do the micro-task I'm trying to do right now, that 30-minute task?

Can it actually design and spec this entire part of the code, write all the tests, and implement it? At some point, I would love for my level of abstraction to be more like a technical product manager than a coder,

**Guy Podjarny:** right?

**Simon Last:** Where all I'm doing is getting rigorous about what the requirements are and roughly how it should work, then letting it figure out how to actually implement it and test it, and prove to me that what it wrote actually works. And then, when I have a feature request, I can just communicate it in some clear way. That would be incredible.

We're not quite there yet, but that sounds good.

**Guy Podjarny:** Yeah, we're one of those working on it; we'll get back to you. Simon, thanks for all the insights and the sharing here. It was a pleasure having you on the podcast.

**Simon Last:** Yeah, it was super fun. Thanks for having me.

**Guy Podjarny:** And thanks everyone for tuning in. I [00:52:00] hope you join us for the next one.

**Simon Maple:** Thanks for tuning in. Join us next time

**Guy Podjarny:** on the AI Native Dev

**Simon Maple:** brought to you by Tessl.

Podcast theme music by [Transistor.fm](https://transistor.fm/?via=music).