Reverse Engineering Infant Language Acquisition

Infants learn their first language at an impressive speed. During the first year of life, even before they start to talk, infants converge on the consonants and vowels of their language and start segmenting continuous speech into words. Such performance is very difficult to achieve for adults learning a second language. Yet infants manage it effortlessly, without explicit supervision, while immersed in a complex and noisy environment. In addition, infants do not seem to follow a logical order (sounds, then words, then sentences) as adults would; rather, they start learning all of these linguistic levels in parallel.

The aim of this project is to decipher this puzzling learning process by applying a ‘reverse engineering’ approach, i.e., by constructing an artificial language learner that mimics the learning stages of the infant. We use engineering and applied mathematics techniques (signal processing, automatic speech recognition, natural language processing, machine learning) on large corpora of child–adult verbal interactions in several languages. We develop psychologically plausible (unsupervised) and biologically plausible (bio-inspired) algorithms that can discover linguistic categories (words, syllables, phonemes, features). The validity of these algorithms is then tested in infants using behavioral techniques (eye tracking) or noninvasive brain imaging (Near-Infrared Spectroscopy, EEG).
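To give a flavor of what ‘discovering words without supervision’ can mean, here is a toy sketch (not the project’s actual algorithms) of statistical word segmentation in the spirit of Saffran-style experiments: a continuous syllable stream is cut wherever the forward transitional probability between adjacent syllables drops. The stream, threshold, and function name are purely illustrative.

```python
from collections import Counter

def segment_by_tp(syllables, threshold=0.75):
    """Insert word boundaries where the forward transitional
    probability P(next | current) falls below `threshold`."""
    pair_counts = Counter(zip(syllables, syllables[1:]))
    unigram_counts = Counter(syllables[:-1])  # every syllable that has a successor
    words, current = [], [syllables[0]]
    for prev, nxt in zip(syllables, syllables[1:]):
        tp = pair_counts[(prev, nxt)] / unigram_counts[prev]
        if tp < threshold:          # weak transition: likely a word boundary
            words.append("".join(current))
            current = []
        current.append(nxt)
    words.append("".join(current))
    return words

# Toy 'speech stream' built from three nonsense words (bidaku, padoti, golabu)
stream = ("bi da ku pa do ti go la bu bi da ku go la bu pa do ti "
          "bi da ku pa do ti go la bu").split()
print(segment_by_tp(stream))  # recovers the three words, in order of occurrence
```

Within-word transitions in this stream have probability 1, while between-word transitions are lower, so the threshold cleanly separates them; real speech, of course, is far noisier than this toy.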

To learn more about this topic, see the ERC-funded BOOTPHON Project or the Publications tab. If you are interested in an internship (engineering school, master, PhD) or a post-doc position, click on the Jobs tab.

Human / Machine Benchmarking

Increasingly powerful machine learning systems are being incorporated into real-life applications (e.g., self-driving cars, personal assistants), even though they cannot be formally verified, statistically guaranteed, or even explained. In these cases, a well-defined empirical approach to evaluation may provide insights into how these algorithms function, offer some control over them, and reveal differences between artificial and natural cognition.

Several approaches exist to evaluate the ‘cognitive’ abilities of machines, from the subjective comparison of human and machine performance (Turing, 1950) to application-specific metrics (e.g., in speech, word error rate). A more recent idea consists in evaluating an AI system in terms of its abilities, i.e., functional components within a more global cognitive architecture (Mueller 2010). Psychophysical testing can offer batteries of tests using simple tasks that are easy for humans or animals to understand (e.g., judging whether two stimuli are same or different, or judging whether one stimulus is ‘typical’), and that can be made selective for a specific component and sensitive to rare, difficult, or adversarial cases. Evaluations of learning rate, domain adaptation, and transfer learning are simple applications of these measures.
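A same/different psychophysical task of this kind can be scored without any task-specific training, for instance with an ABX-style discrimination test: given two items A and B from different categories and a third item X from A's category, the system is correct when it judges X closer to A than to B. The sketch below, with illustrative names and a simple Euclidean distance, assumes stimuli are already encoded as feature vectors.

```python
import numpy as np

def abx_score(a_items, b_items):
    """Fraction of (A, B, X) triplets where X, drawn from A's
    category, is closer to A than to B (Euclidean distance)."""
    correct = total = 0
    for i, a in enumerate(a_items):
        for j, x in enumerate(a_items):
            if i == j:              # X must be a different token than A
                continue
            for b in b_items:
                correct += np.linalg.norm(x - a) < np.linalg.norm(x - b)
                total += 1
    return correct / total

# Toy example: two well-separated clusters of 2-D 'representations'
rng = np.random.default_rng(0)
cat1 = rng.normal(0.0, 0.1, size=(5, 2))
cat2 = rng.normal(5.0, 0.1, size=(5, 2))
print(abx_score(cat1, cat2))  # 1.0 for clusters this well separated
```

A score of 0.5 corresponds to chance; the appeal of such a metric is that the very same triplet task can be given to human listeners, making human and machine scores directly comparable.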

We develop datasets and evaluation tests designed to directly compare human and machine abilities in the areas of unsupervised language learning (see the Zero Resource Speech Challenge) and visual common-sense reasoning (see the Intuitive Physics Benchmark).