The Question is the Answer
LLMs as Interviewers
It seems like most of the current use cases for LLMs have to do with summarization and RAG. Both involve an LLM answering questions, for a user, about a piece of text. Either "summarize the following..." or "reference the following document...answer the following question...".
Flipping that paradigm -- using LLMs to generate questions for a user to answer -- could unlock a huge amount of value.
Setting the Scene
I think we can all agree that good documentation is vital. Maybe you're documenting your code, maybe you're documenting a business process, or maybe you're taking scratch notes to remember what you just did. It's vital for the other people on your team. It's vital for the person who replaces you when you leave your company. It's even vital for you in three months, when you've forgotten how something works.
The act of documenting can be helpful, too: the writing process can help you solidify your thinking and commit things to memory.
But, all that being said, good documentation takes time -- time to make, time to organize, and time to maintain. Sometimes it's worth it, and sometimes it isn't. What if we could use LLMs to help bridge that gap?
Talk to Me Goose
A still from Twin Peaks where Agent Dale Cooper is recording a note for his secretary, Diane.
Now imagine -- rather than writing up, organizing, and distributing documentation -- you just had to record quick, extemporaneous notes.
They don't have to be recorded; they could also be written. The key is that they're quick -- and most of the time it's faster to speak than to write -- so you don't have to break out of the flow of your work.
This tool could have access to some context about you (e.g. your role, your projects, your team, your calendar), some context about what you're doing (e.g. "you're currently working in an IDE, on repo X, and you have file Y open"), and the context of previous notes you've created. Using that information, it could generate a small update / note / log-entry that could be queried later (e.g. an auto-generated update for your stand-up) or used to update other documentation (I'm picturing something like an event-sourced real-time database -- where these notes are the events and the documentation is generated by replaying those updates).
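To make that event-sourcing framing a little more concrete, here's a minimal sketch of what I mean. Everything in it -- the Note shape, the replay_notes function, the generated log format -- is invented for illustration, not a spec for any real tool.

```python
from dataclasses import dataclass

@dataclass
class Note:
    timestamp: str  # when the note was captured
    context: str    # e.g. "repo X, file Y, line 42" or "calendar: standup"
    text: str       # the quick, extemporaneous note itself

def replay_notes(notes: list[Note]) -> str:
    """Regenerate documentation by replaying the note-events in order."""
    lines = ["Project log (generated)"]
    for note in sorted(notes, key=lambda n: n.timestamp):
        lines.append(f"- {note.timestamp} ({note.context}): {note.text}")
    return "\n".join(lines)

# The notes are the stored artifact; the document below is derived from them
# and can always be rebuilt (or rebuilt differently) later.
notes = [
    Note("2024-06-01T10:03:00Z", "repo X, file parser.py", "Fixed the off-by-one in the pager."),
    Note("2024-06-01T14:20:00Z", "calendar: standup", "Deploy is blocked on the staging cert renewal."),
]
print(replay_notes(notes))
```

The point is that the individual notes stay cheap to capture, and the heavier artifacts (stand-up updates, documentation) are derived from them after the fact.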
If you're thinking this sounds like just another note-taking or activity tracking app, I get it, but I think there's a key difference. The goal is not to create a comprehensive record of everything you've done or every site you've visited. The artifact being stored is the note. At most the contextual data would help enrich the model's understanding in the moment. So you could say something like "this email" or "the error on this line" and the model could understand what you mean.
The Five(-ish) Whys
So far we've just discussed a context-aware note-recorder. What about the interview side of things?
This is where I think LLMs have the potential to shine. Up to this point we've been relying on LLMs to answer our questions, but what if instead they asked the questions?
Back to our examples: imagine you just fixed a bug in your codebase and you're logging a note. You start recording. You highlight a line of code and explain that there was an error in it. Your AI note-taker can see the repo you're working on, the file you're in, and the line you have highlighted. But there's still missing information. What was the error? How did you fix it? What are the implications?
What if, rather than a static note-taker, it could be an active participant? It could review the information you've given it, look for gaps in its knowledge, and generate a list of questions to fill in those gaps. You answer the questions. It updates its understanding. And the process can continue until it's happy or until you feel like it has enough information.
You're having a conversation -- like you would with another person -- and the LLM is trying to understand what you're explaining.
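As a rough sketch, that loop might look something like the following. The ask_llm helper and the prompts are placeholders I've made up -- substitute whatever model API you actually use.

```python
def ask_llm(prompt: str) -> str:
    # Placeholder: wrap your chat-completion API of choice here.
    raise NotImplementedError

def interview(initial_note: str, context: str, max_rounds: int = 3) -> str:
    """Let the model ask clarifying questions, then condense the result."""
    note = f"{context}\n\n{initial_note}"
    for _ in range(max_rounds):
        question = ask_llm(
            "You are helping document a change. Given the note below, ask the "
            "single most important clarifying question, or reply DONE if "
            "nothing critical is missing.\n\n" + note
        )
        if question.strip().upper() == "DONE":
            break  # the model is "happy" -- no critical gaps left
        answer = input(f"{question}\n> ")  # the human stays in the loop
        note += f"\n\nQ: {question}\nA: {answer}"
    # Fold the Q&A transcript back into a concise, structured log entry.
    return ask_llm("Rewrite this note and Q&A as a concise log entry:\n\n" + note)
```

The max_rounds cap is doing a lot of work here -- knowing when to stop is just as important as knowing what to ask, which comes up again below.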
The process doesn't have to be triggered only by your quick status update. You could also go to the bot and let it ask you a bunch of questions. Imagine you want a more in-depth understanding of a process -- you could be "interviewed" by a bot. The mechanics might be similar, but the goals differ: in the previous example you don't want to break your flow, so the bot only asks a couple of follow-up questions for clarity; here its goal is to build out a more comprehensive, structured piece of documentation.
That's a Dumb Question
What could go wrong?
At this point it may sound like I'm suggesting we turn loose a pestering LLM that never sleeps and has nothing better to do than ask you a million questions. There's a lot about this that could go wrong.
Let's go over just a couple of possible issues.
Q: Am I just going to be subjected to an infinite stream of bad questions?
If you are, then it won't work. Just as important as its ability to know what questions to ask is its ability to know when to stop asking them.
It needs some situational awareness: it has to gauge which questions will fill the critical gaps in its knowledge, prioritize those questions, and know when to stop asking.
That can be a moving target. If you're giving a short update, maybe it only has time for one or two clarifying questions. Or maybe no questions. Maybe it just needs to make an internal note to follow up later, when you have more time.
If you have more time, then it can really start to fill in gaps in its knowledge.
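One way to picture that, again just as a sketch: give the interviewer an explicit question budget based on the situation, and have it rank candidate questions by how critical the missing information is. The modes, numbers, and scoring here are all invented for illustration.

```python
from dataclasses import dataclass

@dataclass
class CandidateQuestion:
    text: str
    criticality: float  # 0.0 = nice to know, 1.0 = the note is useless without it

# How many questions the bot is allowed to ask in each situation.
QUESTION_BUDGET = {"quick_note": 1, "normal": 3, "interview": 10}

def pick_questions(candidates: list[CandidateQuestion], mode: str) -> list[str]:
    budget = QUESTION_BUDGET.get(mode, 0)
    ranked = sorted(candidates, key=lambda q: q.criticality, reverse=True)
    # Ask only the most critical questions now; anything below the cutoff
    # gets parked as an internal note to follow up on later.
    return [q.text for q in ranked[:budget] if q.criticality >= 0.5]
```

However the gating is actually implemented (a budget, a score threshold, or just a prompt telling the model to be sparing), the point is that "when to stop" is an explicit part of the design, not an afterthought.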
Q: Is this just going to be used to automate away my job?
Again (as with all AI tools), maybe. But it has the potential to help facilitate a lot of information sharing.
You could always strip out all of the comments and tests from your code as job security. But you probably don't do that, because you don't want to live like that and you're better off collaborating and empowering yourself and the people you work with.
Q: What happens when the AI misunderstands something crucial?
While the LLM can do a lot of the legwork, it would still require some human intervention to verify, adjust, and correct the information being stored.
Ideally the system would be set up to go back and adjust as it gets new information. That way, if you provide a correction or add a note manually, the new information doesn't have to contradict what was recorded before -- it can simply be incorporated into the system's "mental model".
Q: Won't this create more work? Why not just write the notes myself?
You can just write the notes yourself. The goal is for the tool to help you fill in those notes quickly, but if it isn't helpful, you shouldn't have to use it.
Q: What about privacy concerns?
Privacy is tough for AI tools with access to your computer. Just ask Microsoft Recall.
But the tool could have degrees of access. And while providing access to your computer could be helpful, it certainly isn't necessary.
OpenAI, for example, is experimenting with giving the ChatGPT desktop app access to other applications.
Conclusion
Good documentation takes time to write and even more time to maintain. Often it doesn't get done. And often a lot of key information stays locked in people's heads.
What if we flipped the current paradigm of LLMs answering questions and instead had them ask the questions?
An LLM could build and maintain a knowledge base for your company or project. As you make changes, it sees those changes and updates the documentation accordingly.
It would likely need human intervention at times -- to make corrections or to add larger notes without waiting for the LLM to ask the right questions -- but having it focused on documentation could lead to documentation that's more complete and up to date, while at the same time freeing up humans to write more code and be more productive.
The goal isn't to replace careful documentation; it's to capture information that typically goes undocumented and to lower the barrier to keeping that documentation up to date.