Discussions on the impact of translation and interpreting technologies in the field of multilingual communication are more important today than ever. Technology is improving rapidly and will continue to do so in the years to come. Consequently, its use in a multitude of contexts is bound to increase. While in most cases the use of speech technologies will become a simple, worry-free commodity, there are areas, such as highly regulated markets, where their use might be classified as high-risk.
In the context of live speech translation, high-risk scenarios refer to situations where the accuracy and reliability of communication are critical, and any misinterpretation or misunderstanding could have significant and potentially severe consequences. These scenarios typically involve areas such as:
- Judicial Settings: Courtrooms, legal proceedings, and law enforcement interactions where misunderstandings can affect the outcome of trials, the rights of individuals, and the administration of justice.
- Medical Environments: Hospitals, clinics, and other healthcare settings where accurate communication is essential for diagnosing conditions, administering treatment, and ensuring patient safety.
- Emergency Services: Situations involving emergency responders, disaster relief, and crisis management where clear and precise communication can be a matter of life and death.
- Diplomatic Affairs: International negotiations, treaties, and diplomatic communications where misinterpretations can lead to diplomatic conflicts or failures in negotiations.
In these scenarios, the stakes are high because errors in translation can lead to legal injustices, medical errors, loss of life, diplomatic tensions, or other serious outcomes. It must be noted that the stakes are equally high when no translation is available at all.
The above considerations apply to any kind of translation agent, whether a human or a computer. While there are some provisions regarding human interpreters, albeit very general ones, the novelty of speech technologies has not allowed stakeholders enough time to reflect on and define the appropriate use of such technology. A round of discussions, research, and applications is necessary. What do decision-makers need to know, and what steps must be taken to ensure that speech technologies, such as machine interpreting, are used effectively and responsibly?
Here are some key points that stakeholders will need to keep in mind when starting to discuss this topic, followed by three simple principles that may guide the work ahead of us in the years to come.
Technological Evolution: The shift from rule-based to neural machine translation has significantly improved the accuracy and capacity of machine translation systems. Similarly, speech recognition technologies have evolved to better transcribe speech into text, in an increasing variety of languages and dialects. The advent of Large Language Models is again pushing technology further, and the quality of speech translation systems is likely to increase in the years to come, for example thanks to emerging abilities such as grounding translation in the communicative event and understanding cultural references.
Practical Applications: Speech translation systems are used by the general public to overcome language barriers while traveling, listening to podcasts, etc. In sensitive and professional settings, speech translation systems are increasingly used due to a shortage of human translators and interpreters, especially for, but not limited to, less common languages, or to reduce the costs and increase the availability of translation services. These technologies are also used in hospitals and police stations, to name just a few, to fill these gaps and to provide basic language support. These settings can clearly be considered high-risk scenarios, depending on the scope of the conversation. There is a difference, for example, between a person requesting an appointment and a doctor-patient consultation that needs to be translated live. The implications of using such technologies in potentially critical cases are many, and mostly unexplored.
Legal and Ethical Concerns: The use of speech technologies in high-risk scenarios raises several legal and ethical challenges:
- Accuracy and Reliability: Speech translation systems can make errors, especially in culturally sensitive contexts. These errors can have significant legal consequences, such as incorrect translations leading to contradictory statements in asylum cases. The rapid advancements in AI and the hype surrounding it often create unrealistic expectations among decision-makers. Grounding expectations in the reality of the technology’s current capabilities is an important step towards more responsible decision-making processes.
- Confidentiality: In many cases, using speech technologies involves sharing sensitive information with private companies, which may compromise confidentiality. This is especially true in the case of cloud-based solutions, which represent the standard today. While on-premises solutions may be available in the future, they are simply not there today. In the case of cloud solutions, confidentiality issues can be addressed by enforcing specific data policies with the service provider. Such policies can be vetted by third-party certification bodies.
- Accountability: Language and translation are intrinsically complex phenomena. The issue of accountability is not well defined, even in human interpretation (verba volant, scripta manent). This extends to high-risk scenarios, such as judicial and medical communication, as demonstrated by the fact that hospitals often use underqualified personnel or even family members as interpreters. In high-risk scenarios, addressing the accountability of AI systems must occur within the wider context of the rules governing the use of human interpreters.
Challenges in regulated markets: Regulated markets require systems and solutions to meet minimum standards through certifications, examinations, or other means. This applies both to human labor and to machine translation systems. Unlike many devices used in the judiciary or hospitals that undergo rigorous testing and vetting, there is still no certification available for this emerging technology. At the time of writing, we lack sufficient knowledge about whether and how it is possible to provide clear error margins or accuracy thresholds for speech translation systems.
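To make the idea of an error margin concrete, here is a minimal Python sketch of one way such a figure could be reported. It is an illustration under stated assumptions, not an established certification method: it assumes that human evaluators have judged a sample of translated segments and counted those containing a critical error, and it uses the standard Wilson score interval to put bounds on the true error rate. The function names and the 5% threshold are hypothetical.

```python
import math


def wilson_interval(errors: int, total: int, z: float = 1.96) -> tuple[float, float]:
    """Wilson score interval for an observed error proportion.

    errors: number of evaluated segments judged to contain a critical error
    total:  number of segments evaluated (assumed to be a random sample)
    z:      normal quantile; 1.96 corresponds to roughly 95% confidence
    """
    if total <= 0:
        raise ValueError("total must be positive")
    if not 0 <= errors <= total:
        raise ValueError("errors must be between 0 and total")
    p = errors / total
    denom = 1 + z**2 / total
    centre = (p + z**2 / (2 * total)) / denom
    half = z * math.sqrt(p * (1 - p) / total + z**2 / (4 * total**2)) / denom
    return max(0.0, centre - half), min(1.0, centre + half)


def meets_threshold(errors: int, total: int, max_error_rate: float) -> bool:
    """Hypothetical acceptance rule: pass only if the UPPER bound of the
    interval stays at or below the allowed error rate."""
    _, upper = wilson_interval(errors, total)
    return upper <= max_error_rate


if __name__ == "__main__":
    low, high = wilson_interval(errors=5, total=100)
    print(f"error rate between {low:.3f} and {high:.3f}")
    print("acceptable at 5%:", meets_threshold(5, 100, max_error_rate=0.05))
```

The sketch also shows why small evaluation samples are a problem: with only 100 judged segments, even zero observed errors leaves an upper bound of several percent. What counts as a "critical error" is exactly the kind of question that remains open.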
Principles to focus on in the coming years
While the use of speech technology in non-regulated markets will be shaped by its adopters and users based on their needs, perceptions, and evaluation criteria, responsible adoption in high-risk scenarios and regulated markets will require profound knowledge, mutual agreements, and regulations. Many discussions need to be initiated, and significant work lies ahead. Here, I identify three high-level principles, or calls to action, for the discussions to come:
- Increase awareness among stakeholders about the potentials and limitations of speech translation technologies. This is the basis for allowing responsible decision-making. Judging by my daily conversations with various stakeholders, the amount of work to be done here is enormous.
- Clearly and unequivocally identify what high-risk scenarios are. While a simple categorization could be based on whether a use case involves decision-making or is purely informative, a more nuanced system of categories should be developed. This is not an easy task. Real-life scenarios are never binary, and a multidimensional categorization system will be necessary.
- Advance efforts to define standards and certifications for speech translation systems based on their specific use cases. While most use cases will not require particular certification, high-stakes scenarios and use in regulated markets will. This work requires both technical and communicative expertise, which is not commonly found in single individuals or institutions. There are also theoretical issues to solve: what constitutes good quality?
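The second principle above contrasts a binary categorization with a multidimensional one. The following Python sketch illustrates the difference. Everything in it is a hypothetical assumption made for illustration: the dimensions (whether a decision is involved, the worst plausible consequence, whether a qualified human reviews the output), the scoring rule, and the tier names are not a proposed standard.

```python
from dataclasses import dataclass
from enum import IntEnum


class Severity(IntEnum):
    """Hypothetical scale for the worst plausible consequence of an error."""
    MINOR = 1
    SERIOUS = 2
    CRITICAL = 3


@dataclass(frozen=True)
class UseCase:
    name: str
    involves_decision: bool   # does the exchange feed into a decision?
    worst_case: Severity      # worst plausible outcome of a mistranslation
    human_in_loop: bool       # is a qualified human reviewing the output?


def binary_tier(uc: UseCase) -> str:
    """The simple scheme: decision-making versus purely informative."""
    return "high" if uc.involves_decision else "low"


def multidimensional_tier(uc: UseCase) -> str:
    """An illustrative scoring rule combining several dimensions."""
    score = int(uc.worst_case)
    if uc.involves_decision:
        score += 1
    if uc.human_in_loop:
        score -= 1
    if score >= 4:
        return "high"
    if score >= 2:
        return "elevated"
    return "low"


if __name__ == "__main__":
    cases = [
        UseCase("appointment request", False, Severity.MINOR, False),
        UseCase("public safety announcement", False, Severity.SERIOUS, False),
        UseCase("doctor-patient consultation", True, Severity.CRITICAL, False),
    ]
    for uc in cases:
        print(uc.name, binary_tier(uc), multidimensional_tier(uc))
```

In this toy example the binary scheme rates the public safety announcement as low risk because no decision is involved, while the multidimensional rule flags it as elevated because of its potential consequences. Choosing the real dimensions and weights is the hard part and requires agreement among stakeholders.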
While speech translation aims to offer a scalable accessibility solution, particularly in non-critical situations, significant challenges arise in high-stakes scenarios and regulated markets. To mitigate risks and enhance the benefits of automatic speech translation in such critical use cases, substantial efforts are required. A community of experts must be trained, and new knowledge must be generated to address these challenges effectively.
Image by https://st-benchmark.github.io