The amount of audio data being generated has increased dramatically over the past decade, ranging from user-generated content and recordings in audiovisual archives to sensor data captured in urban, natural or domestic environments. The need to detect and identify sound events in soundscapes, and to recognise the context of an audio recording, has led to the emergence of a new field of research: acoustic scene analysis. Applications of acoustic scene analysis include sound recognition technologies for smart homes and smart cities, security and surveillance, audio retrieval and archiving, ambient assisted living, and automatic biodiversity assessment.

However, current sound recognition technologies cannot adapt to different environments or situations. Likewise, context is typically characterised by a single label for an entire audio stream. This fails to account for multiple acoustic factors and ever-changing environments, for example when recording with hand-held devices in complex and noisy acoustic scenes.

This project will address these shortcomings by investigating and developing novel signal processing and machine learning methods that integrate context and sound recognition, applied to continuous audio streams in urban, natural and domestic environments. It will advance the area of acoustic scene analysis through adaptive methods for joint context and sound recognition and by modelling context as multiple time-varying factors. The potential of the proposed approach will be demonstrated by applying sound recognition technologies to smart homes in collaboration with the project partner and by organising a public evaluation task on the topic.

This project is supported by the UK Engineering and Physical Sciences Research Council under grant no. EP/R01891X/1.