PhD Proposals
Supervisor: Professor M J Russell
Integration of Speech Recognition and Eye-Tracking
To exploit speech fully as a modality in human-machine interaction, it is likely that speech will need to be properly integrated with other modalities. One such modality is the focus of the user’s gaze, that is, what the user is looking at. The goal of this project is to investigate the integration of speech and eye-tracking data for improved speech recognition and understanding. The project will concentrate on a particular type of application, for example analysis and manipulation of maps, where the focus of the user’s gaze is clearly relevant. The project will include investigation of techniques for using focus of gaze and knowledge of image content to predict what the user is likely to say, and the development of computer models for integrating speech and eye movement. The project will be undertaken in collaboration with the School of Psychology, which has considerable expertise in the area of speech and eye-tracking.
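As an illustration only, one simple way to let focus of gaze and image content shape the prediction of what the user will say is to rescale a recogniser's word prior by how much gaze attention falls on the map regions that each word names. The sketch below is a hypothetical toy, not a method proposed by the project: the function name, the `boost` parameter, the region identifiers and all the numbers are assumptions made for this example.

```python
def gaze_weighted_prior(base_prior, gaze_weights, labels_by_region, boost=5.0):
    """Rescale a word prior so that words naming gazed-at regions become more likely.

    base_prior:       dict word -> prior probability (e.g. from a language model)
    gaze_weights:     dict region id -> share of recent gaze time (sums to 1)
    labels_by_region: dict region id -> set of words describing that region
    boost:            hypothetical tuning constant controlling the gaze influence
    """
    scores = {}
    for word, p in base_prior.items():
        # Total gaze weight of every region whose labels include this word.
        attention = sum(
            w for region, w in gaze_weights.items()
            if word in labels_by_region.get(region, ())
        )
        scores[word] = p * (1.0 + boost * attention)
    total = sum(scores.values())
    # Renormalise so the result is again a probability distribution.
    return {word: s / total for word, s in scores.items()}


# Toy map with two labelled regions; the user looks mostly at region "r1".
base_prior = {"river": 0.25, "road": 0.25, "forest": 0.25, "town": 0.25}
gaze_weights = {"r1": 0.8, "r2": 0.2}
labels_by_region = {"r1": {"river"}, "r2": {"road"}}

posterior = gaze_weighted_prior(base_prior, gaze_weights, labels_by_region)
```

In this toy case "river" ends up more probable than "road", which in turn is more probable than the words for regions the user is not looking at. A real system would need a principled probabilistic model and a treatment of the timing between eye movements and speech.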
Automatic Classification of Air Pollution Particles
This project will be conducted in collaboration with the Division of Environmental Health and Risk Management, The University of Birmingham. The Institute has a large quantity of spectroscopy data for various particles, which are of interest in the context of air pollution. Automatic classification of these particles would allow monitoring of pollution levels, early detection of pollution and early identification of potential sources. The goal of the project is to classify air particles automatically based on spectroscopy, or similar, data. The project will begin with the evaluation of a range of pattern recognition techniques on this data. Based on the knowledge gained in this phase, new techniques will be developed which, for example, might model the time-evolution of particular types of pollution.
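To give a flavour of the kind of baseline pattern recognition technique that could be evaluated in the first phase, the sketch below implements a nearest-centroid classifier in plain Python. It is an assumption-laden toy: the three-channel "spectra", the class names `class_a` and `class_b`, and the function names are all invented for illustration and do not come from the project's data.

```python
import math


def fit_centroids(spectra, labels):
    """Compute the mean spectrum (centroid) of each class."""
    sums, counts = {}, {}
    for x, y in zip(spectra, labels):
        if y not in sums:
            sums[y] = [0.0] * len(x)
            counts[y] = 0
        sums[y] = [s + v for s, v in zip(sums[y], x)]
        counts[y] += 1
    return {y: [s / counts[y] for s in sums[y]] for y in sums}


def classify(spectrum, centroids):
    """Assign a spectrum to the class whose centroid is closest (Euclidean)."""
    return min(centroids, key=lambda y: math.dist(spectrum, centroids[y]))


# Synthetic toy data: each "spectrum" is a list of three channel intensities.
train_spectra = [
    [0.9, 0.1, 0.0],
    [0.8, 0.2, 0.1],
    [0.1, 0.2, 0.9],
    [0.0, 0.1, 0.8],
]
train_labels = ["class_a", "class_a", "class_b", "class_b"]

centroids = fit_centroids(train_spectra, train_labels)
predicted = classify([0.7, 0.2, 0.1], centroids)
```

A baseline like this mainly serves as a reference point against which more capable models, including ones that capture time-evolution, could be compared.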
Emergence of Dialogue in Simulated Populations
In order to build effective systems for human-machine interaction it is necessary to develop powerful and computationally useful models of dialogue. The majority of current dialogue systems are either ‘passive’ and based on descriptive models of dialogue, or very simple (e.g. the “if you want x then press 1” style of menu). One of the essential ingredients absent from these models is the goal-driven motivation which leads the participants to enter into the dialogue and to carry it to a conclusion. The objective of this project is to conduct research into the creation of artificial environments in which dialogue between individuals (or agents) emerges automatically in response to the need to achieve particular complex goals, rather than as a result of pre-programming. The work will build on similar research into the emergence of communicative behaviour in such environments. The project will draw on research in ‘artificial life’, genetic algorithms and evolutionary programming, and artificial neural networks.
Following an Articulatory Dance
For a machine to understand what someone is saying, it needs to listen carefully, but also to have an intuitive notion of the way the sounds are being produced. The properties of the acoustic signal we hear vary widely for different vowels and consonants, depending on movements of the tongue, jaw, lips, etc. However, the mapping from the acoustic to the articulatory space is highly non-linear, making it hard to follow the movements. The goal of this project is to address this challenge using a neural network and an appropriate dynamical model to track articulatory gestures. Potential applications, other than automatic speech recognition, include creating a "talking head" mimic of the speaker and aiding phonetic-acoustic investigations.
Last updated by Jonathan Mangnall on 25th May 2001.