TK140: Multiple Speaker Localization in a Smart Room
Thesis > Central Library of Shahrood University > Electrical Engineering > MSc > 2010
Authors:
Mohamad Hesam mahmodi nejad [Author], Hosein Marvi [Supervisor], Alireza Ahmadifard [Advisor]
Abstract: Recent advances in computer technology and in speech and language processing have made new forms of human-machine communication and computer assistance for human activities feasible. In particular, interest has grown considerably in developing challenging new applications for indoor environments equipped with multiple multimodal sensors, known as smart rooms. This thesis investigates the analysis of spontaneous multi-party speech; the goal is to estimate where the various speakers are located while they talk. The speed, versatility, and robustness of the proposed techniques are tested on a variety of real indoor recordings, including multiple moving speakers as well as seated speakers in meetings. Optimized implementations are provided in most cases. First, by combining the hyperbolae produced by time delay estimation (TDE) between several microphone pairs with head orientation information, a new acoustic multi-speaker localization function, called the OPROD-PHAT function, is proposed, and a grid-based multiple-speaker localization method is implemented. For location estimation of multiple moving speakers, a new approach is proposed in which the power of the cross-correlation function is used to find the number of active sources in each time frame. After the loudest source present is found by maximizing the energy of a steered beamformer, the contribution of that source is removed and the process is repeated to localize the remaining sources. To reduce the impact of background noise and to speed up processing, the physical space is discretized into a few sectors, and for each time frame an automatic threshold selection system based on the EM algorithm determines which sectors contain active acoustic sources. Then the location of the speakers in each active sector is determined using the LI method. Finally, the proposed algorithms are evaluated. The simulation results show superior performance of the proposed system.
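The idea of a grid-based search over time-delay hyperbolae can be illustrated with a minimal sketch. This is not the thesis's OPROD-PHAT function, which the abstract does not define; it is a plain least-squares match between measured and predicted time differences of arrival on a 2-D grid. The room size, grid step, microphone layout, speed of sound, and the names `predicted_tdoa` and `localize_on_grid` are all assumptions made for illustration.

```python
import itertools
import math

SPEED_OF_SOUND = 343.0  # m/s, assumed room-temperature value


def predicted_tdoa(point, mic_a, mic_b):
    """Arrival time at mic_a minus arrival time at mic_b, in seconds."""
    return (math.dist(point, mic_a) - math.dist(point, mic_b)) / SPEED_OF_SOUND


def localize_on_grid(mics, measured, room=(4.0, 3.0), step=0.1):
    """Return the grid point whose predicted TDOAs best match `measured`.

    mics: list of (x, y) microphone positions in metres.
    measured: dict mapping (i, j) microphone index pairs to TDOAs in seconds.
    """
    nx = int(round(room[0] / step)) + 1
    ny = int(round(room[1] / step)) + 1
    best_point, best_cost = None, math.inf
    for ix, iy in itertools.product(range(nx), range(ny)):
        point = (ix * step, iy * step)
        # Each pair constrains the source to one hyperbola; the summed squared
        # mismatch is smallest where the hyperbolae of all pairs intersect.
        cost = sum(
            (predicted_tdoa(point, mics[i], mics[j]) - tdoa) ** 2
            for (i, j), tdoa in measured.items()
        )
        if cost < best_cost:
            best_point, best_cost = point, cost
    return best_point


# Hypothetical setup: four microphones in the room corners, one static speaker.
mics = [(0.0, 0.0), (4.0, 0.0), (4.0, 3.0), (0.0, 3.0)]
speaker = (1.5, 2.0)
pairs = list(itertools.combinations(range(len(mics)), 2))
# Noise-free "measurements" synthesized from the true position.
measured = {(i, j): predicted_tdoa(speaker, mics[i], mics[j]) for i, j in pairs}
estimate = localize_on_grid(mics, measured)
```

With noise-free delays the estimate falls on the grid point closest to the true speaker position. In a real recording the delays would come from a time delay estimator and contain errors, so a finer grid or a more robust cost would likely be needed.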
Keywords:
#multiperson localization #time delay of arrival (TDOA) #head orientation #microphone array
Keeping place: Central Library of Shahrood University