Predicting trends is never an exact science; it's more of an art. Yet I'm eager to take on the challenge. The good news? At the intersection of Interpreting and Technology, I don't anticipate any dramatic upheavals, apocalyptic scenarios, or seismic disruptions. Change, after all, is a gradual process: evolution unfolds over time, it doesn't happen overnight. Here are my trends for 2025.
AI Tools Becoming the New Normal for Professional Interpreters
Interpreters, like professionals in many fields, are increasingly embracing AI tools to streamline their workflows. With AI tools becoming more advanced, widely accessible, and often affordable or free, 2025 is shaping up to be the year when their adoption becomes normalized and generalized. No more excessive talk about it: just simple adoption when it makes sense (and there is a lot that makes sense). This trend has been driven in part by community evangelists, at universities and in professional associations, who have highlighted both the possibilities and the limitations of AI tools over the past few years. The time is ripe for harvesting the fruits. From a technological perspective, the biggest winners in this trend will likely be general-purpose applications, such as speech recognition apps and chatbots, which can be used to create glossaries, training materials, and more. With a little inventiveness, or a few suggestions from expert users, suitable results can be obtained to support various interpreting tasks. For free. Anytime.
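To make the glossary workflow concrete, here is a minimal, illustrative sketch in plain Python. All names and the stopword list are my own placeholders, not any particular product's API: in practice a speech recognition app would supply the transcript, and a chatbot would translate and define the extracted terms.

```python
import re
from collections import Counter

# Tiny illustrative stopword list; a real workflow would use a fuller one,
# or simply ask a chatbot to filter function words.
STOPWORDS = {"the", "a", "an", "of", "and", "to", "in", "is", "for", "on", "that"}

def candidate_terms(transcript: str, top_n: int = 5) -> list[str]:
    """Extract the most frequent content words from a transcript as
    glossary candidates (a crude stand-in for smarter term extraction)."""
    words = re.findall(r"[a-zA-Z][a-zA-Z-]+", transcript.lower())
    counts = Counter(w for w in words if w not in STOPWORDS)
    return [term for term, _ in counts.most_common(top_n)]

# Pretend this transcript came from a speech recognition app:
transcript = (
    "The delegation discussed tariffs and tariffs again, then subsidies, "
    "subsidies, subsidies, and the arbitration procedure for tariffs."
)
print(candidate_terms(transcript, top_n=2))  # → ['tariffs', 'subsidies']
```

The point is not the toy word counter but the pipeline shape: transcribe, extract, then hand the candidate list to an LLM for translations and definitions.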
The Consolidation of (the few) Computer-Assisted Interpreting Tools
Unlike general-purpose applications, CAI tools have been designed specifically for professional interpreters. These tools will continue to diversify. Some will focus on traditional tasks like glossary management, while others will specialize in AI-powered features, such as automatic term suggestions during live interpretation. Hybrid tools that combine traditional functionalities with AI-driven automation will probably be the winners in this small-scale race for adoption. It is important to note the word "small": the number of available CAI tools is, and will remain, very limited. There isn't enough critical mass in the space to drive the development of new applications, as the prolonged development and eventual non-launch of promising projects like SmartTerp demonstrate. Projects like my InterpretBank continue to exist thanks to their long history and the reference value they offer. New ones, like CymoNote, exist thanks to the incredible dynamics of the Chinese AI and interpreting market. However, it is hard to imagine new players emerging on the horizon.
Within CAI tools, the principle of the professional in the cockpit dominates: empowering human interpreters by allowing them to decide how best to use AI features that are increasingly powerful and sophisticated. CAI will continue to be seen as a valuable productivity tool, with its most impactful applications likely found in tasks surrounding interpretation, such as preparation and training. Technological improvements are on the horizon. For example, glossary creation will benefit from the multilingual and contextual capabilities of large language models (LLMs), while improved machine translation will streamline interpreters' workflows, for example in the translation of preparatory documents.
Training is obviously an area where CAI tools will have a strong impact: natural-sounding synthetic voices will enhance training materials by simulating speeches or dialogues tailored to specific assignments. Imagine practicing with a speech that mirrors the exact topic and tone of an upcoming meeting. These productivity tools will be vital for maintaining high-quality interpretation in an industry where AI interpreting is gaining ground.
Unlike general-purpose applications, which excel in variety and quantity, CAI tools have the advantage of being tailored to immediate professional use, designed in close collaboration with specific users and their needs. They also offer a single, clear point of information about data confidentiality, a topic that is rightly gaining importance.
Increased adoption and higher quality of AI Interpreting
The adoption of AI interpreting will continue to grow, driven by better tools and improved, ubiquitous accessibility. While free options for casual use and consecutive interpreting have been available for years, AI-powered simultaneous interpreting remains complex and is unlikely to become free. In other words, AI simultaneous interpreting will remain confined to professional contexts rather than casual use.
The next major leap in simultaneous machine interpreting will likely involve integrating LLMs as core translation engines (see for example the scientific paper by Gaido et al. or my post Rethinking Machine Translation: Understanding, Reformulating, and Translating). These models promise improved contextual understanding, a pivotal feature for high-quality speech translation. On top of this, at least in theory, they might open up new opportunities to drastically reduce translation latency in speech-to-speech systems without compromising quality (something I will personally devote much of my effort to in 2025). However, the trade-off between latency and quality will remain a significant challenge for real-life applications. Let's be clear: while LLMs are advancing, 2025 is unlikely to mark their widespread adoption in commercial applications. The industry's cautious approach favors proven technologies over untested innovations. But we might at last see some first adoptions.
Shift of Speech Translation Research to Real-World Scenarios
Computer science research in speech translation is increasingly focusing on real-world scenarios and challenges rather than only on foundational research. Take one example to illustrate this ongoing shift. Papi et al. discuss an astonishing limitation of past research in simultaneous speech translation: most research had focused on human pre-segmented speech, simplifying the task and overlooking significant challenges. They propose a systematization of processes and terminologies. Acknowledging such shortcomings is no small feat: it reflects the maturity of speech technology research and its researchers. This maturity is expected to significantly improve real-world applications in the coming years and to increase the impact of the discipline on the applications we will build and commercialize. These are the technologies that will power the applications millions of people will use tomorrow.
Advances of End-to-End Systems for AI interpreting (no adoption)
Currently, most real-world AI interpreting applications rely on cascading systems, in which speech recognition, machine translation, and speech synthesis are separate processes. These systems are more mature, flexible, and capable of supporting a broader range of languages, but they have intrinsic limitations. End-to-end systems, which combine these processes into a single model, are promising but still lag behind in quality and in their readiness to be used at scale in the real world.
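The cascading architecture described above can be sketched schematically. Every function body below is a placeholder (no real ASR, MT, or TTS engine is called, and the names are my own); the point is the shape of the pipeline: three independent, swappable components with text as the intermediate representation, which is what makes cascades flexible but also lets errors and latency accumulate between stages.

```python
def speech_recognition(audio: bytes) -> str:
    """Placeholder ASR: a real system would return a transcript."""
    return "guten morgen"          # pretend this is what the audio said

def machine_translation(text: str, src: str, tgt: str) -> str:
    """Placeholder MT: a real system would translate src -> tgt."""
    toy_dictionary = {("de", "en"): {"guten morgen": "good morning"}}
    return toy_dictionary[(src, tgt)][text]

def speech_synthesis(text: str) -> bytes:
    """Placeholder TTS: a real system would return synthesized audio."""
    return text.encode("utf-8")    # stand-in for an audio buffer

def cascade(audio: bytes, src: str, tgt: str) -> bytes:
    # Each stage consumes the previous stage's output unchanged,
    # so an ASR error propagates through MT and TTS untouched.
    transcript = speech_recognition(audio)
    translation = machine_translation(transcript, src, tgt)
    return speech_synthesis(translation)

print(cascade(b"...", "de", "en"))  # → b'good morning'
```

An end-to-end system would collapse this whole chain into a single model mapping source audio directly to target audio, removing the text bottleneck but also giving up the per-stage flexibility that makes cascades dominant today.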
In 2025, end-to-end systems will continue to make progress in terms of quality, particularly at tech giants like Meta (see for example the description of the Seamless project or the long-form scientific paper here), which can bring the necessary data and immense computational power to bear. However, mainstream adoption is unlikely until these systems achieve comparable quality and ease of deployment and scaling. Smaller startups face significant challenges in this space due to the immense amounts of data required to train such models. This might eventually change, but it is unlikely to happen very soon.
Despite these hurdles, the long-term trajectory points toward end-to-end systems becoming the dominant technology. Achieving this will likely require entirely new paradigms to address data scarcity and quality limitations. Or approaches that are not specifically designed for speech translation…
Rise of Generalist Models in Speech Translation
While specialist models for end-to-end or cascading systems remain the focus of research and commercial development, generalist models are paving a new path for speech translation. Generalist models, such as ChatGPT, are not specifically designed for speech translation but demonstrate excellent capabilities in conversational speech-to-speech translation. This trend underscores the broader evolution of AI toward versatile, all-encompassing systems that are accessible and widely applicable. And this might be the real future of speech translation: general systems outperforming systems designed specifically for the task. That would be a paradigm shift for the entire space, and, in my opinion, a very plausible scenario.
Improvement of On-the-Edge Systems for Maximum Confidentiality
For speech recognition and speech translation, cloud systems reign supreme today. However, the ability to deploy and run these systems on conventional computers, without any data exchange, is steadily advancing. The community has made significant efforts to reduce model sizes, enabling them to run on standard computers even without specialized hardware. This trend is expected to continue, addressing critical concerns around data privacy and confidentiality.
While complex applications—such as a digital boothmate for professional interpreters or a speech translation system for immigration officials—may not yet be feasible to run on personal computers by 2025, this capability is on the horizon and approaching faster than we might expect.
New normal: Humans and Machines sharing the Stage of Multilingual Communication
Despite technological advancements, 2025 will not be the year that machine interpreting replaces human interpreters. As I have written on multiple occasions, this dichotomy is premature and quite dull, at least for the short and middle term: even with AI on a par with humans, humans will not become obsolete in this space. AI speech applications are improving at light speed, but they are still far from the performance of high-quality professionals. Machine interpreting will therefore complement human expertise, expanding access to language services in contexts where human interpreters are unavailable, out of budget, or where machine performance rivals that of average professionals. AI will be adopted by interpreters at large, both through generalist tools and through specialized CAI tools. If AI can support and enhance people's daily work, it is imperative to use it for the benefit of the community.
In summary, 2025 will bring exciting developments in machine interpreting, including the integration of LLMs and a stronger focus on real-world challenges. However, the full potential of these technologies will take years to materialize, and human interpreters will remain indispensable for high-quality communication in the foreseeable future.
With AI improving so rapidly, how do you think it will impact the skill sets required from human interpreters? Will there be a shift towards more technical expertise, or will the core interpreting skills remain at the forefront?
The skill sets will basically stay the same.
The rise of AI tools in interpretation is clear, but what challenges do you think lie ahead in blending machine efficiency with the human touch in complex, high-stakes settings like international summits? I’m curious about the future of collaboration between human interpreters and AI in these critical contexts.