Hospitality - Chatbots Out, Agents In - Matija Marijan (Turneo)

Authors: Elena Vrabie and Adriana Spulber
"Hi, we're a team of 12 looking for recommendations for a team-building weekend. Something outdoors and indoors, nothing too extreme, but that will bring us together. We arrive this Friday."
"Perfect! I'd suggest starting with a private beach barbecue, followed by a half-day of workshops in our meeting room, then an afternoon of tree climbing and a group dinner to close it out. Want me to block it and send confirmation back?"
No concierge. No human typed that. Just NEO, doing in one exchange what most hotel chatbots would hand off to an overworked staff member replying the next morning.
Matija Marijan’s entrepreneurial journey started early, when he founded Entrio, an online ticketing and event management platform. He then took his skills into the corporate world as a strategy consultant at PwC, and later joined the leadership team behind online travel agency Expedia's $15B+ B2B division. Today he is CEO and co-founder of Turneo. His path from founder to global corporate professional and back to experimental startup founder gives him a hard-earned view of why hospitality has resisted digital transformation for so long, and what it will actually take to change it.
Turneo started as a plug-and-play experience commerce platform, helping hotels offer and manage in-destination activities directly to their guests. Four years in, Matija, his co-founder Fran, and their team have now launched NEO, an AI agent platform built specifically for hotels, pre-integrated into core systems, and trained on thousands of real guest conversations.
In this interview, Matija reveals what it takes to build agentic AI in a low-tech vertical: why older models produced great demos but failed in the real world, why fine-tuning on proprietary data is largely a distraction, and how NEO eliminates hallucinations not through guardrails but through reasoning. We also make a bet on history repeating itself, drawing a parallel to how Google never managed to replace Booking.com.
UV: You launched Turneo 4 years ago, and you just announced NEO at the beginning of 2026. What convinced you that hospitality, a tech-resistant industry, was ready for the next AI wave?
Matija Marijan: When we started, we didn't fully appreciate how slow hospitality is at adopting technology. But I've learned to appreciate it. Most hospitality companies are right to be slow because they run complex businesses, and if the technology is not working perfectly, it creates operational chaos. If you're a startup and something goes wrong, you can recover fairly quickly. But if you have hundreds or thousands of staff following procedures and technology messes it up, it's a lot worse.
When thinking about NEO and what to do with AI, we obviously had this worry about when hospitality would be ready. But two fundamental things convinced us it has to happen:
1. Hotels are in the business of serving guests. If guests are adopting AI in their trip planning and booking, you can't stay ignorant of that. Adoption of products like ChatGPT and Claude has seen such rapid growth over the last two and a half years that if you're running a hotel, you can't be saying "Do we really need it?"
2. AI is becoming much easier to use than traditional software. Training staff on SaaS products is actually hard - think complex navigation, multiple levels deep, 700 buttons, and exact procedures to follow. What's brilliant about the new models is they're so good at understanding what the user is asking and figuring out which of those 700 buttons the guest wanted to click, except instead of clicking, it's conversational. You don't need to be extremely techie; you just need to eloquently say something, and AI handles everything else.
We launched a couple of weeks ago, and we are now getting feedback, iterating, and refining exactly where the biggest fit is. The cool thing with AI is you can iterate a lot quicker, as you're not building interfaces, just conversations.
UV: There's a common frustration with customer-facing tools, where the bot fails users and just redirects them to a human. Is that a problem you see a lot?
MM: If you open Claude or ChatGPT, you expect it to be great. If you're on a website and you open a chatbot, you expect it to be bad. That's because the pace of development of AI companies has been so quick.
If you used ChatGPT 3.5, the one from two and a half years ago, today, you would be like, "This is so bad! How did we think this was great?" And unfortunately, a lot of chat products are even worse than ChatGPT 3.5, while products that are now on the market are so much better than that. They're kind of on par with the conversation you could be having with Claude or Claude Cowork.
But it will take us a little bit of time to adopt the right expectations. We made a conscious decision in our UI (user interface) to not make it look like a chatbot, but more like ChatGPT or Claude, because we think it creates a different frame of mind. You're like, "Okay, this is going to be more capable than a chatbot that's just going to redirect me to a human."
UV: Can you walk us through building with AI when things change so fast?
MM: Traditional chatbots, the previous generation products, were built on very hard-coded rules. These products were not very smart. They were kind of good at reading and writing, but not at reasoning. So you would have to give them a lot of rules.
It's counterintuitive, but that's how it works. If you give it no rules, it would be kind of okay. If you give it a lot of rules, it becomes good at what it has a rule for, and terrible at where it doesn't have one. There's a whole generation of products built with that premise.
In a way, this is what happened with AI - you can throw all of that technology away. The models are so good that you need to give them almost no rules. You give them access to context and to tools for reaching external providers, such as sending an email or calling an API. Then, you let the model figure it out.
It sounds incredible, but it's almost like giving it to a human and saying: "Here are all the tools you can use; someone is going to ask you a question, so go solve the problem." Models evolved from being pretty stupid but sounding smart, to actually being pretty smart.
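To make the "context plus tools" idea concrete, here is a minimal sketch of a tool registry. It is not NEO's implementation: the tool names, the `Tool` class, and `stub_model_choose` are all hypothetical. In a real system the model itself would read the tool descriptions and pick one; the keyword check below is only a stand-in so the sketch runs without an LLM.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass
class Tool:
    name: str
    description: str  # what the model would read to decide when to use the tool
    run: Callable[[str], str]


def send_email(request: str) -> str:
    # Placeholder: a real tool would call an email provider here.
    return f"email queued: {request}"


def call_booking_api(request: str) -> str:
    # Placeholder: a real tool would call an external booking API here.
    return f"booking API called: {request}"


# Hypothetical registry of everything the agent is allowed to do.
TOOLS: Dict[str, Tool] = {
    tool.name: tool
    for tool in [
        Tool("send_email", "Send an email to hotel staff", send_email),
        Tool("call_booking_api", "Check or reserve an activity", call_booking_api),
    ]
}


def stub_model_choose(request: str, tools: Dict[str, Tool]) -> str:
    """Stand-in for the model's decision (assumption: an LLM would do this)."""
    text = request.lower()
    if "book" in text or "reserve" in text:
        return "call_booking_api"
    return "send_email"


def handle(request: str) -> str:
    # The application only dispatches; it does not encode per-question rules.
    tool_name = stub_model_choose(request, TOOLS)
    return TOOLS[tool_name].run(request)
```

Calling `handle("Please book tree climbing for 12")` routes to the booking tool, while a question with no booking intent falls through to the email tool. The point of the design is that adding a capability means registering one more tool, not writing more rules.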
I was initially skeptical because we tested it and it just didn't work; we had to create so many rules. Then, about six months ago, we started getting results. But there were still edge cases where it would fail miserably.
With the new models, if you set it up in the right way, you can no longer break it. You can throw anything at it, and it will handle it extremely well. And if it cannot solve something, it will gracefully send you somewhere else, not to an FAQ.
In our case, if the model can't answer, it will email the hotel, ask the reception, and when reception answers, it will get back to the guest and update its knowledge base. It's kind of self-learning. That was not really possible even a few months ago, because the models were not smart enough yet.
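The escalation loop described above can be sketched roughly as follows. This is an illustration under stated assumptions, not Turneo's code: the `Concierge` class, its method names, and the in-memory `outbox` are hypothetical, and exact-match lookup on the question text stands in for whatever retrieval a real agent would use.

```python
from typing import Dict, List


class Concierge:
    """Toy agent that asks reception when it lacks an answer, then remembers it."""

    def __init__(self) -> None:
        self.knowledge: Dict[str, str] = {}      # question -> known answer
        self.pending: Dict[str, List[str]] = {}  # question -> guests waiting
        self.outbox: List[str] = []              # emails "sent" to reception

    @staticmethod
    def _key(question: str) -> str:
        # Simplification: normalize text instead of doing semantic matching.
        return question.strip().lower()

    def ask(self, guest_id: str, question: str) -> str:
        key = self._key(question)
        if key in self.knowledge:
            return self.knowledge[key]
        # Unknown: remember who asked and email reception.
        self.pending.setdefault(key, []).append(guest_id)
        self.outbox.append(f"To reception: a guest asked '{question}'")
        return "I don't know yet. I've asked reception and will get back to you."

    def reception_replied(self, question: str, answer: str) -> Dict[str, str]:
        key = self._key(question)
        self.knowledge[key] = answer         # update the knowledge base
        waiting = self.pending.pop(key, [])  # guests to notify
        return {guest_id: answer for guest_id in waiting}
```

The first guest to ask an unknown question triggers one email to reception. Once `reception_replied` is called, that guest gets the answer, and any later guest asking the same thing is answered immediately without another email.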
UV: What is actually agentic about NEO?
MM: The new models are agentic in themselves. In our previous version, we had to build the agent intelligence ourselves. What people tend to talk about with agentic AI is the ability for the agent to reason, and usually, it requires multiple steps. With the first models, whatever you gave them, they would just spit out the answer. Imagine interviewing someone, and they just start talking right away; they're probably not going to give a great answer.
The first step of reasoning is to analyze and write a summary, like a good candidate who says, "Let me think about it", and then presents the answer. It's this idea that the agent goes through certain steps that humans would do to answer a question, rather than just spitting out the first thing.
However, new models do that by default. So if you try to add agentic loops on top of a new model, you're actually hurting it, because the new model already has these loops inside. You don't need to spend so much time as a startup creating them. What you need to spend time on is figuring out how to leverage that.
If you built a product six months or a year ago, you did so much work, but Claude - whether it's Sonnet, Opus, or Haiku (Claude’s models) - is way better now, and that work is, I would say, wasted. You just need to rebuild it, knowing that this model is smarter and better at reasoning than whatever you built yourself. It's inherently agentic, and you just give it the right context, the right tools, the right data, and let it be smart.
UV: Why was hospitality a harder use case to crack?
MM: We tried so many things and were never very happy, because models were good for demos, but hospitality deals with the real world, and the real world is very complex. In the SaaS industry, if a chatbot tells you to go to an FAQ page and there's no page, that's a little bit annoying. In hospitality, if AI tells you to go to a restaurant and you go there and there's no restaurant, that's worse. Or you tell someone they should totally climb this mountain with their children, and it turns out to be a really bad idea… the consequences are a lot worse. That's why whenever we tried to build things on older models, it just didn't work.
We kept trying, but we couldn't get it to work, so we didn't launch. We finally got one version to work, and then the new models came in, and we were like, "This is way better." We had a period where we didn't want to demo the old one to anyone anymore because it felt so much worse. We just had to rebuild it. I suspect the people at Anthropic and OpenAI feel something similar when they build a new model; they don't want anyone using the old one anymore.
UV: If you could go back four years with all this knowledge about chatbots, AI, and agents, what would you tell a technical founder considering a similar move today?
MM: I believe that young companies that are still in the process of establishing themselves can greatly benefit from AI. However, even larger companies, which may seem unlikely to leverage AI, have successfully created impressive AI products. AI is a transformational technology, and as long as you're utilizing it effectively, you can build something truly valuable.
If someone had started a company four years ago, it probably wouldn't have been about AI, because the models were just starting to become capable. But they would have fairly quickly moved to do something around it. It depends on the industry, though, as some industries like law and coding took off earlier, while hospitality, finance, and consulting haven't really taken off yet. I think it's just because the models were not capable enough, and the risk of getting it wrong was too big.
Four years ago, you wouldn't have done anything. Some people jumped on it a year later because they saw that models could solve their problems. We tried, we couldn't get it to work, we went back to something else, and now we finally see that models can solve it. Timelines are not important. You just need to see what the underlying capabilities are. But you need to play with the models, you need to be building products often. That's the only way to test the capabilities.
And you need to go deep, because a lot of AI products can look incredible but only handle 80% of things, and the other 20% just fail miserably. You need to do the work to get above that 80-90% threshold to something good enough for your industry. In some industries, 85% is good enough; in others, 99% is not good enough.
If you're in an industry where AI hasn't yet taken off, it may be because the current capabilities of AI models do not meet the necessary level of reliability. However, it's only a matter of time before that changes. It's advisable to experiment with AI today, and if it doesn't succeed, try again in three to six months, as progress is inevitable. I believe there will be no industry where AI cannot replace a significant amount of human work, particularly when it comes to digital tasks.
UV: There's a lot of talk about feeding AI the right data, and hotels have a lot of data they're not using. What has been your experience?
MM: I don't know if this is a hot take or not, but I think training data sets are next to useless because models are just too good. There's been this belief, like two years ago, that you take the core model, post-train it with your own data, and make it better. But by the time you post-train the models on your data, these AI labs are going to come up with a new model and blow everything you've done away. You will be just wasting your time.
There's this founder of a company called Manus, which got bought by Meta for around 2 billion dollars. He said something similar. They tried post-training models, which took them four months, and by the time they were done and thought "This is 2% better than the benchmark," a new model came out showing benchmarks 30% better. They did that twice and then realized there's just no point, because you cannot compete on core intelligence with these AI labs.
What you can do instead is focus on context. It's not about training the models, it's about injecting the right data at the right point in time so that AI can reason on it. If hotels have unique data about customers or destinations, it's not about training the model to use it better. It's about giving that data to the model at the right point in time, in the exact right amount, so it can make a better decision. And there's a huge amount of value there.
The good thing about that approach, which we've also taken, is that when a new model comes, you don't need to retrain anything. You just give the new model the data and see whether it needs more or less. If you're a small company, that's what you can do. If you're Meta, you could argue that you can post-train models better than someone else. But if you don't have hundreds of millions of dollars in funding, I don't think it's realistic that you can train models better than these labs, even with the best data sets. Companies like OpenAI and Anthropic are paying hundreds of millions of dollars to experts, lawyers, and consultants to train those models to be better than anyone else.
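As a rough illustration of "injecting the right data at the right point in time, in the exact right amount", the sketch below picks only the facts relevant to the guest's message and caps how many go into the prompt. Everything here is an assumption for illustration: the function names, the sample facts, and especially the word-overlap scoring, which is a crude stand-in for real retrieval.

```python
from typing import List, Set


def _words(text: str) -> Set[str]:
    # Lowercase and strip simple punctuation so "riding." matches "riding".
    return {w.strip(".,!?").lower() for w in text.split()}


def select_context(message: str, facts: List[str], max_facts: int = 3) -> List[str]:
    """Return at most max_facts facts that share words with the message."""
    message_words = _words(message)
    scored = []
    for fact in facts:
        overlap = len(message_words & _words(fact))
        if overlap > 0:  # facts with no overlap are never injected
            scored.append((overlap, fact))
    # Highest overlap first; the sort is stable, so ties keep input order.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [fact for _, fact in scored[:max_facts]]


def build_prompt(message: str, facts: List[str]) -> str:
    """Assemble the text handed to the model: selected data, then the guest message."""
    lines = ["Relevant hotel data:"]
    lines += [f"- {fact}" for fact in select_context(message, facts)]
    lines += ["", f"Guest: {message}"]
    return "\n".join(lines)
```

Nothing here is model-specific, which is the practical upside of the approach: when a new model arrives, the same selection step feeds it, and only `max_facts` needs revisiting to see whether the model wants more or less.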
UV: How are you approaching training data and personalization with NEO?
MM: With NEO, imagine you're staying somewhere, and AI is either proactively or reactively making recommendations. What it's going to do is use a bunch of tools to look up your profile in the hotel system, your profile in Turneo, and your previous conversations with AI. It pulls all of that relevant data. It can also look at hotel data about what guests typically like, but this is all textual data; it's not retraining the model, it's just saying "Hey, guests typically recommend this."
It puts all that data into context and applies it to you. So if you say you're interested in romantic things or horse riding, AI is going to decide which data points are relevant to that conversation. You're not giving it the whole data set; you're telling it these are the places where you can look up data, and this is the type of data you can find there. Then AI queries those data sets, finds the relevant information, decides if it's going to use it, and comes up with an answer. It might say: "Hey, yesterday you told me you're coming with your boyfriend, and you like nature - I think you're going to love horse riding, we have two horses, and other couples liked it too."
You're not training the model. You're just giving it data, and the model is smart enough to understand that, given your circumstances and what's available, this is the best recommendation for you.
UV: A lot of enterprises get stuck in the integration layer, where tools hallucinate, data is hard to incorporate, and pilots never reach adoption. How did you navigate that?
MM: Integrations have been a massive pain, always, whether it's AI or not. What we've done is pre-integrate. We have systems that we pre-integrate to, so hotels don't need to do the custom integration. That's one of the benefits of using a product like ours versus a general AI chatbot.
We integrate into hospitality industry software; it's already done, we've tested it, and we know what data is right. We've built the data models to extract the right data, because another problem with integrations is that if you extract all the data, you give too much to the model, and you confuse it.
Similar to a human, if I gave a human your whole life history and said to recommend something, they'd be like, "I don't know, there are hundreds of pages." So we've done that work, and that's where vertical AI companies have an advantage. We've done it for PMSs (Property Management Systems), for our own system, and we're doing it now for CRMs (Customer Relationship Management systems).
On hallucinations, we've had problems with them, but not so much with the new models. The way we've tackled it is that you need to give AI options of what it can do, and once you give it options and let it decide, it's unbelievably good at not doing other things.
If our product goes through all the tools at its disposal and cannot come up with an answer, it will tell the guest: "I used all the tools at my disposal and couldn't find the answer. Do you want me to reach out to the hotel reception?" If you don't give it access to tools and just give it a knowledge base, AI tends to hallucinate because it didn't actually reason through it; it just reads something and comes up with the best interpretation.
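The "exhaust the tools, then offer reception" behavior can be sketched as a simple fallback chain. This is a hedged illustration, not how NEO works internally: the lookup functions and their sample answers are made up, and a real agent would let the model decide which tools to try rather than looping in a fixed order. Only the fallback wording comes from the interview.

```python
from typing import Callable, List, Optional

# A lookup tool returns an answer, or None when it has nothing to offer.
Lookup = Callable[[str], Optional[str]]

FALLBACK = (
    "I used all the tools at my disposal and couldn't find the answer. "
    "Do you want me to reach out to the hotel reception?"
)


def opening_hours(question: str) -> Optional[str]:
    # Hypothetical tool with made-up sample data.
    if "restaurant" in question.lower():
        return "The restaurant is open 18:00-22:00."
    return None


def activities(question: str) -> Optional[str]:
    # Hypothetical tool with made-up sample data.
    if "tree climbing" in question.lower():
        return "Tree climbing runs daily at 14:00."
    return None


def answer(question: str, tools: List[Lookup]) -> str:
    """Try each tool; never invent an answer when all of them come back empty."""
    for tool in tools:
        result = tool(question)
        if result is not None:
            return result
    return FALLBACK
```

The key property is that an unanswerable question, such as `answer("Is there a helipad?", [opening_hours, activities])`, produces the explicit offer to contact reception instead of a guess.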
This is a little bit of a black box. You try things and see what works, and sometimes you don't even know exactly why it works. Even researchers can't exactly explain why LLMs can do something in a certain way. There's a lot of trial and error. But in the new version, we don't see hallucinations at all. Combining tools, context, and new models, we've largely eliminated them.
UV: So the human-in-the-loop has evolved: the AI looks through all the data, can't find the answer, and asks the hotel. But the AI still retrieves and delivers the information back to the user; the hotel doesn't answer directly?
MM: Exactly. And that's a different approach from most chat products, where if the agent can't answer, it either goes into a loop and frustrates you, or says, "I'm going to connect you to a human now." We didn't like that, because you could be talking to a chatbot at any time of day, and the reception is not always open. And if you're traveling, you're not going to keep a chat open waiting for a human - it's mostly mobile anyway.
So we built a system where we're not going to have a human jump into the chat. We ask a human, and it might take them 12 hours to respond. But then the agent comes back to you, and you can continue the conversation with the agent. Just because it couldn't answer one thing doesn't mean it can't answer your next question. In the old system, it would send you to a human, and that made sense because agents couldn't answer so many things. But now it hurts the conversation; you want to continue, but you're waiting for a human who might come back in 10 hours, and then you just start a new conversation. It gets complicated.
So if it can't answer something, it treats humans as teammates. Imagine you're responding to a chat and someone asks something you don't know. You'd tell the user, "Sorry, I need to ask my colleague, I'll get back to you." You wouldn't say, "I'm going to transfer your chat to my colleague, who does not know what you've discussed and might reply in 20 hours." That's what everyone hates.
We don't even call this human-in-the-loop, because human-in-the-loop traditionally means AI needs approval to take an action. We built that in the first version and found it creates too much friction. Humans cannot keep up with approving AI because AI is too quick. We do have it in B2B settings, like "Hey, are you sure you want to delete this document?" But on the guest side, we've completely got rid of it, because it creates friction that might save 0.01% of conversations that could go wrong, but hurts the other 99.99%.
UV: How do you plan to adapt NEO when things are changing so fast? Are you experiencing a tech challenge right now where you're waiting for the technology to catch up?
MM: The only way you can keep up is if you use the technology. You need to be using the most advanced AI tools to see what is possible, and then match the patterns and see if that solves any of the problems your users are having. A lot of what we're doing is not some unique insight; it's more like we use Claude Code, we use Cowork, we use other tools, we see what they can do, and then think: "We have a similar problem, can we take the same approach and apply it?"
In terms of a current tech challenge: voice. I think it's gone really far, but across a lot of languages, it's not there yet. We're holding off on voice because outside of English, AI is not good enough and not fast enough. Take the use case where AI emails the reception. In theory, it could also call the reception, but if AI is constantly calling and the receptionist is like, "Oh God, AI is calling me", it's going to be slow.
So we're holding off. I can't see a world where voice in an international setting does not become good enough that we use it; it's just right at this edge where it's useful, but it's also going to bring in friction, and relying on chat and email is zero friction and 100% useful still. But the second we see that problem solved somewhere, we'll be quickly adding it, because all of these technologies are available to everyone. Once you have the core reasoning and the data, whether it's chat, voice, or whatever else, it's all easily replaceable.
UV: Where do horizontal LLM platforms like OpenAI and Anthropic become good enough competitors? What do you think about defensibility as the foundation models improve?
MM: I'll draw a parallel to what happened pre-AI, especially in the travel industry. There's always been this idea that Google was going to destroy online travel agencies like Expedia and Booking. But the truth is, Google is really good at owning the front of the funnel: search. For them to go and do all the integrations, all the messy work to actually work with hotels, contracting, all of that - it doesn't make sense. They're already great at something they can horizontally apply to any industry.
I think it's going to be very much the same with LLMs. They have these amazing horizontal capabilities, and they might try some vertical B2B plays, but they'll probably find it's not worth their time. They need to compete with other LLMs on a horizontal layer for better models and reasoning. If they get bogged down trying to integrate with hotel systems, it doesn't help the horizontal play at all. And in AI, unlike Google, you have very fierce competition. They can't get distracted trying to integrate with a hundred PMSs or a hundred legal systems.
So I think you're going to have vertical AI companies that own the integrations, have the best tools and workflows, integrate into legal or hospitality software, and use the best horizontal models to do it. History often repeats itself.
On that note, we actually shifted to Anthropic's models and were blown away. We didn't want to shift because we had credits on OpenAI, and we thought the models were the same. Then we shifted, and we were like, "How is it possible that it's so much better?" It unlocked a lot of capabilities. On one part of our core product, we reduced the codebase from 8,000 to 2,000 lines, getting rid of things that were confusing Anthropic's models, to let them shine.
