In August, Meta introduced its AI translation model, SeamlessM4T, which supports nearly 100 languages for text and 36 for speech. Now, with an upgraded “v2” architecture, the tech giant is extending the tool to make conversational translations more spontaneous and expressive, two qualities that have been missing from authentic cross-language communication.
The first addition is “SeamlessExpressive,” a feature designed to carry the speaker’s vocal expression over into the translated speech. This includes pitch, volume, emotional tone (excitement, sadness, or whispers), speech rate, and pauses. By removing the robotic sound associated with translated speech, the feature could change both everyday conversations and content production. The supported languages are English, Spanish, German, French, Italian, and Chinese, although Italian and Chinese are missing from the demo page at the time of writing.
The second addition is “SeamlessStreaming,” a feature that begins translating a speech while the speaker is still talking, so that others receive the translated content sooner. There is still a brief latency of just under two seconds, but listeners no longer have to wait for the speaker to complete a sentence. Meta acknowledges that differing sentence structures across languages make this difficult, which is why it developed an algorithm dedicated to analyzing partial audio input. This algorithm decides whether there is enough context to begin generating translated output or whether it should continue listening.
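The "keep listening or start translating" decision can be pictured with a toy sketch in Python. This is not Meta's algorithm, which works on audio and is learned by the model. The names `stream_translate`, `toy_context_rule`, and `toy_translate` are hypothetical, the input is text chunks rather than audio, and the rule for "enough context" (three chunks, or a chunk ending a sentence) is an assumption chosen purely for illustration.

```python
from typing import Callable, Iterable, Iterator, List


def stream_translate(
    chunks: Iterable[str],
    has_enough_context: Callable[[List[str]], bool],
    translate: Callable[[List[str]], str],
) -> Iterator[str]:
    """Emit partial translations while input is still arriving."""
    buffer: List[str] = []
    for chunk in chunks:
        buffer.append(chunk)  # "listen": take in one more piece of input
        if has_enough_context(buffer):
            yield translate(buffer)  # "speak": emit output for the buffered input
            buffer = []
    if buffer:
        # The speaker has stopped, so flush whatever is left.
        yield translate(buffer)


def toy_context_rule(buffer: List[str], min_chunks: int = 3) -> bool:
    """Assumed rule: enough context after 3 chunks or at the end of a sentence."""
    return len(buffer) >= min_chunks or buffer[-1].endswith((".", "?", "!"))


def toy_translate(buffer: List[str]) -> str:
    """Stand-in for a real translation model: just upper-cases the text."""
    return " ".join(buffer).upper()


if __name__ == "__main__":
    incoming = ["I", "will", "call", "you", "tomorrow."]
    for piece in stream_translate(incoming, toy_context_rule, toy_translate):
        print(piece)
```

In this sketch the first piece of output appears after three words instead of after the full sentence. A real system faces a harder trade-off: waiting longer gives more context for languages whose word order differs, while emitting earlier keeps the delay low.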
Meta’s latest additions to its “Seamless Communication” suite look particularly noteworthy, and they appear to go beyond the mobile interpreter tools offered by companies like Google and Samsung. Meta has not said when the public will gain access to these new features, but it seems plausible that the company will eventually build them into its smart glasses, making those glasses even more practical.