Agenda

28 Oct 2016, 14:30

Computational Linguistics and Artificial Intelligence

European Centre for Living Technology - San Marco 294

Are they Collaborating for Language Understanding?

Speaker: Rodolfo Delmonte, Ca' Foscari University

Abstract

Language Understanding has become the central issue that both big companies and small spin-offs are now trying to tackle in order to produce new products that allow users to interact in natural language with an avatar or chatbot. However, AI and CL/NLP seem to keep taking different and diverging directions. AI has rediscovered neural networks, which have become recurrent, and has "invented" Deep Learning, suggesting that machines can now learn anything, including human language, simply by processing Big Data, i.e. the enormous quantities of text that current technology allows us to explore and manipulate. CL and NLP, on the contrary, remain focused on the whole series of linguistic problems that arise whenever a machine tries to move from the surface word level of analysis to the higher discourse and text level. Up to now, statistical approaches to sentence-level analysis have produced reasonable results, which however only account for syntactic structure. The possibility of modelling linguistic phenomena recurring at discourse/text level is still looming far away, and there is a long way to go before we understand how human language really works. So is it reasonable to continue working on the fine-grained structure of language, or should we rather throw NLP and CL away and opt for a blind approach that is completely "machine-driven"? An example comes from TTS, that is, Text-To-Speech systems. What we have in our smartphones or at the train station is the result of a choice made in the '90s, when the idea of using language modelling swept through all the research areas related to speech. The fine-grained approach was the so-called articulatory synthesis, which was abandoned in favour of a statistically driven approach. Now synthetic voices sound really human, but they cannot be modified at will to produce emotion, unless different language models are made available to the TTS system.
The idea of improving the ability to understand what is being spoken has also been abandoned: the statistical approach requires the TTS system to blindly choose the best sequence available in the database. Will it be possible to join efforts towards a truly hybrid architecture? Something similar has been attempted by IBM Watson, the system that managed to beat the best human champions at Jeopardy!, the quiz game. However, that effort was worth 30 million dollars and involved some 3,000 CPUs organised in 90 servers with 16 terabytes of RAM and 4 terabytes of storage. It will take many years to produce something similar in a single affordable computer, and many more to shrink that computer into a smartphone.
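The "language modelling" approach mentioned above can be illustrated with a toy n-gram model, the classic statistical technique: sequences are scored purely from counts over a corpus, with no linguistic analysis at all. The corpus sentences and the smoothing scheme below are illustrative assumptions, not taken from the talk.

```python
from collections import defaultdict

def train_bigram(corpus):
    """Count unigram and bigram frequencies from tokenised sentences."""
    uni = defaultdict(int)
    bi = defaultdict(int)
    for sent in corpus:
        tokens = ["<s>"] + sent + ["</s>"]
        for w in tokens:
            uni[w] += 1
        for a, b in zip(tokens, tokens[1:]):
            bi[(a, b)] += 1
    return uni, bi

def score(sentence, uni, bi, vocab_size):
    """Sentence probability under the bigram model, with add-one smoothing."""
    tokens = ["<s>"] + sentence + ["</s>"]
    p = 1.0
    for a, b in zip(tokens, tokens[1:]):
        p *= (bi[(a, b)] + 1) / (uni[a] + vocab_size)
    return p

# Tiny illustrative corpus (an assumption for the example).
corpus = [["the", "train", "leaves", "now"],
          ["the", "train", "is", "late"]]
uni, bi = train_bigram(corpus)
V = len(uni)

# A word order seen in training outscores a scrambled one: the model
# prefers frequent sequences "blindly", without understanding them.
seen = score(["the", "train", "leaves", "now"], uni, bi, V)
scrambled = score(["now", "leaves", "train", "the"], uni, bi, V)
print(seen > scrambled)
```

This is exactly the trade-off the abstract describes: the model ranks familiar surface sequences highly without any notion of meaning, which is what makes statistical systems robust in practice yet hard to steer at the discourse level.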

Language

The event will be held in Italian.

Organiser

ECLT

Link

http://www.unive.it/nqcontent.cfm?a_id=203588
