How to speak Internet in multiple languages

There was a time when everyone on earth spoke the same language and, thus united, decided to build a tower so that they could reach heaven. They progressed so rapidly on the tower that the Lord, the God of Israel, is said to have said, “Truly the people are one and they have one language, and this is what they begin to do; now nothing they want to do will be held back from them.”

Yahweh had to stop the building of the tower or else the mortals would consider themselves equal to God. So he created many languages, dividing the people into different linguistic groups that could not understand a word spoken by one another. Work on the tower came to a halt and everyone went off in their own small groups, never to unite again.

There are more than 7,000 languages in the world today, and content created in one language is incomprehensible to speakers of another. With the Internet becoming the largest repository of information on the planet, the fact that the world speaks so many languages is becoming a real challenge. Even though English speakers account for just 16% of the world's population, English accounts for more than 60% of the top 10 million websites on the Internet. On the other hand, despite the fact that China has the largest number of Internet users, only 1.4% of the top 10 million websites use Chinese. Hindi, the third most spoken language in the world, accounts for just 0.1%.

This may all be temporary. The Internet was invented in the English-speaking world, so it stands to reason that most of its content is in English. As Internet users become more linguistically diverse, this is already showing signs of change. In 1996, 80% of Internet users spoke English. By 2010, this had decreased to 27.3%. Today, 12 times more people in China and 25 times more people in the Arabic-speaking world use the Internet than in 1996. It seems inevitable that the language of online content will follow suit.

But creating a more linguistically representative Internet is not the solution we are looking for. That would require us to create new content in a wide range of languages, when what we need is to make the information we already have understandable to a wider range of people. We need translation technology that can consistently (and with a high degree of accuracy) ensure that content in one language is understandable to speakers of another, so that it no longer matters in which language the content was created.

Nowhere does this need to be resolved more urgently than in India. With over 3,000 languages, the only way to ensure that our development goals reach every nook and corner of the country is to ensure that no content is out of anyone's reach just because of the language they speak.

Earlier this year, the Indian government launched Bhashini, a digital public platform for languages, designed to use artificial intelligence (AI) and associated technologies, such as speech-to-speech translation, to deliver digital content in all Indian languages. To achieve this on a large scale, we need to create huge training datasets of text and speech in several Indian languages. The Bhashini project has launched Bhasha Daan, a crowdsourcing platform through which volunteers can support the project by contributing speech in languages they are familiar with, by translating texts into languages they understand, or by labelling images.

As innovative as it is, it’s unlikely to be enough. To solve a problem of this magnitude, we need to work with large datasets of annotated information that accurately cross-references speech or text in one Indian language to many others. This will allow us to build AI models and train them to translate quickly.

An obvious source of such data is the archives of All India Radio and Doordarshan, which have been producing content in multiple regional languages at the same time for decades. Past recordings of daily news stories alone would give us comparable samples of nearly identical material spoken in many different languages. I have no doubt that many other sources of similar cross-language data exist in the private sector.

A potential obstacle to this is copyright law, which prohibits material from being used without the permission of the copyright owner. Several countries have amended the fair-use provisions of their copyright statutes to make an exception for data analysis, allowing the non-commercial use of such material for the purpose of creating training datasets. India should consider similar amendments to encourage more innovation in the field of language translation.

Douglas Adams, one of my favorite science fiction writers, introduced a typically irreverent literary device in his Hitchhiker’s Guide to the Galaxy series of books to solve the problem of translation on a galactic scale. In his fictional universe, everyone has a Babel fish, a symbiotic creature that lives in your ear and translates all communication signals into a language you can understand. As a result, every species, including the most exotic extraterrestrials, can understand each other.

I’ve always thought it would be nice to have a Babel fish in my ear wherever I travel. Maybe Bhashini will make it possible.

Rahul Matthan is a partner at Trilegal and also has a podcast called Ex Machina. His Twitter handle is @matthan.
