Centre for Modern Languages and Literature (CMLL)

Activities Report

Talk on Machine Learning and the Google Crowdsource App

Collaboration with: Google Crowdsource
Date: 22 August 2017
Venue: UTAR, Sg Long Campus
Participants: 56



In collaboration with Google Crowdsource, the Centre for Modern Languages and Literature (CMLL), which is parked under the Faculty of Creative Industries (FCI), and the Department of Soft Skills Competency (DSSC) organised a talk on “Machine Learning and the Google Crowdsource app” at Sungai Long Campus on 22 August 2017.


The invited speaker was Reshma Sanghvi Dilip Kumar, Community Manager at Google India, who takes a particular interest in the world's languages. According to Reshma, language plays an important role because it enables communication, which is essential to humans.
“It would be unreasonable to assume that everyone speaks English. Fifty-five percent of the content on the internet is in English and the rest is made up of other languages; however, only five percent of people speak English,” said Reshma.


She pointed out that the language barrier has become a major problem, especially for those who love to travel. “It is true that passports and planes can take you anywhere in the world, but it will be difficult to travel without understanding the language. In order to bridge this gap, Google has introduced many products over the years. If you have ever read a translation in your native language that has been translated from another language, you will understand that some of the translations are made word for word, and there are a lot of idiomatic phrases that cannot be translated because they only make sense in that language. Things like humour and plays on words are often translated incorrectly.”
“This is where Google Crowdsource comes in, opening a new frontier in which crowdsourced information assists in improving the machine learning that makes Google apps like Translate, Maps, Photos and Keyboard better,” she explained.


Google Crowdsource is an app that asks users to validate and verify sets of data, which in turn ‘teaches’ or ‘educates’ the machine to provide better translations in the future. It is a way of providing the machine with resources that are not only reliable but also widely used, in as many languages as possible.


The app has eight categories: image label verification, which enables the machine to identify objects better; entity detection, which allows the machine to detect the right image; sentiment evaluation, which teaches the machine the meaning of each sentence and how one feels while reading it; handwriting recognition, which helps the machine understand and recognise individuals' handwriting; translation, which helps the machine translate from English to a native language and vice versa; translation validation, which helps validate translations made by users globally; landmarks, which validates signage; and maps translation validation, which validates landmarks entered by users.
CMLL Acting Chairperson Maxwell Sim Yik Seng commented, “Overall, the event was successful as it brought a fair number of staff and students to come and get to know about crowdsourcing as a new way of gathering information. This has rather positive implications for how crowdsourcing can be a model of knowledge leveraging and building in the future workplaces our students are heading to, as well as allowing the faculty to leverage crowdsourcing as a resource for teaching and learning. The most obvious benefit is that the students and faculty were able to get to know about the possibilities for the true democratisation of knowledge building through crowdsourcing, especially in language translation and language research, which is one of the key research outcomes of the CMLL.”