A. Mesaros, T. Heittola, A. Eronen, and T. Virtanen, Acoustic Event Detection in Real-Life Recordings, European Signal Processing Conference (EUSIPCO), pp.1267-1271, 2010.

T. Heittola, A. Mesaros, T. Virtanen, and A. Eronen, Sound Event Detection in Multisource Environments using Source Separation, Workshop on Machine Listening in Multisource Environments (CHiME), pp.36-40, 2011.

C. Cotton and D. Ellis, Spectral vs. Spectro-Temporal Features for Acoustic Event Classification, IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp.69-72, 2011.

K. J. Piczak, Environmental sound classification with convolutional neural networks, 2015 IEEE 25th International Workshop on Machine Learning for Signal Processing (MLSP), pp.1-6, 2015.
DOI : 10.1109/MLSP.2015.7324337

P. Paatero, Least squares formulation of robust non-negative factor analysis, Chemometrics and Intelligent Laboratory Systems, vol.37, issue.1, pp.23-35, 1997.
DOI : 10.1016/S0169-7439(96)00044-5

D. D. Lee and H. S. Seung, Learning the parts of objects by nonnegative matrix factorization, Nature, vol.401, pp.788-791, 1999.

V. Bisot, R. Serizel, S. Essid, and G. Richard, Acoustic scene classification with matrix factorization for unsupervised feature learning, 2016 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp.6445-6449, 2016.
DOI : 10.1109/ICASSP.2016.7472918

T. Komatsu, T. Toizumi, R. Kondo, and Y. Senda, Acoustic Event Detection Method using Semi-Supervised Non-Negative Matrix Factorization with a Mixture of Local Dictionaries, IEEE AASP Challenge on Detection and Classification of Acoustic Scenes and Events, 2016.

C. Knapp and G. Carter, The generalized correlation method for estimation of time delay, IEEE Transactions on Acoustics, Speech, and Signal Processing, vol.24, issue.4, pp.320-327, 1976.
DOI : 10.1109/TASSP.1976.1162830

P. Aarabi, Self-localizing dynamic microphone arrays, IEEE Transactions on Systems, Man and Cybernetics, Part C (Applications and Reviews), vol.32, issue.4, pp.474-484, 2002.
DOI : 10.1109/TSMCB.2002.804369
URL : http://www.apl.utoronto.ca/publication/i/aarabi_tsmc_02.ps.gz

C. Blandin, A. Ozerov, and E. Vincent, Multi-source TDOA estimation in reverberant audio using angular spectra and clustering, Signal Processing, vol.92, issue.8, pp.1950-1960, 2012.
DOI : 10.1016/j.sigpro.2011.09.032
URL : https://hal.archives-ouvertes.fr/inria-00630994

A. Jourjine, S. Rickard, and O. Yilmaz, Blind separation of disjoint orthogonal signals: demixing N sources from 2 mixtures, 2000 IEEE International Conference on Acoustics, Speech, and Signal Processing. Proceedings (Cat. No.00CH37100), pp.2985-2988, 2000.
DOI : 10.1109/ICASSP.2000.861162

S. Rickard and F. Dietrich, DOA estimation of many W-disjoint orthogonal sources from two mixtures using DUET, Proceedings of the Tenth IEEE Workshop on Statistical Signal and Array Processing (Cat. No.00TH8496), 2000.
DOI : 10.1109/SSAP.2000.870134

M. I. Mandel and D. P. Ellis, EM Localization and Separation using Interaural Level and Phase Cues, 2007 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics, pp.275-278, 2007.
DOI : 10.1109/ASPAA.2007.4392987
URL : http://www.ee.columbia.edu/ln/labrosa/proceeds/waspaa/2007/paper/0026.pdf

M. I. Mandel, R. J. Weiss, and D. P. Ellis, Model-Based Expectation-Maximization Source Separation and Localization, IEEE Transactions on Audio, Speech, and Language Processing, vol.18, issue.2, pp.382-394, 2010.
DOI : 10.1109/TASL.2009.2029711
URL : http://www.ee.columbia.edu/%7Eronw/pubs/taslp09-messl.pdf

H. Viste and G. Evangelista, Binaural Source Localization, Digital Audio Effects (DAFx) Conference, pp.145-150, 2004.

M. Raspaud, H. Viste, and G. Evangelista, Binaural Source Localization by Joint Estimation of ILD and ITD, IEEE Transactions on Audio, Speech, and Language Processing, vol.18, issue.1, pp.68-77, 2010.
DOI : 10.1109/TASL.2009.2023644

A. Deleforge, F. Forbes, and R. Horaud, Variational EM for binaural sound-source separation and localization, 2013 IEEE International Conference on Acoustics, Speech and Signal Processing, pp.76-80, 2013.
DOI : 10.1109/ICASSP.2013.6637612
URL : https://hal.archives-ouvertes.fr/hal-00823453

J. Woodruff and D. Wang, Binaural Localization of Multiple Sources in Reverberant and Noisy Environments, IEEE Transactions on Audio, Speech, and Language Processing, vol.20, issue.5, pp.1503-1512, 2012.
DOI : 10.1109/TASL.2012.2183869

D. Fitzgerald, M. Cranitch, and E. Coyle, Non-negative tensor factorisation for sound source separation, IEE Irish Signals and Systems Conference 2005, 2005.
DOI : 10.1049/cp:20050279

A. Ozerov and C. Févotte, Multichannel Nonnegative Matrix Factorization in Convolutive Mixtures for Audio Source Separation, IEEE Transactions on Audio, Speech, and Language Processing, vol.18, issue.3, pp.550-563, 2010.
DOI : 10.1109/TASL.2009.2031510

R. M. Parry and I. A. Essa, Estimating the spatial position of spectral components in audio, Independent Component Analysis and Blind Signal Separation (ICA), pp.666-673, 2006.

S. Lee, S. H. Park, and K. Sung, Beamspace-Domain Multichannel Nonnegative Matrix Factorization for Audio Source Separation, IEEE Signal Processing Letters, vol.19, issue.1, pp.43-46, 2012.
DOI : 10.1109/LSP.2011.2173192

J. Traa, P. Smaragdis, N. D. Stein, and D. Wingate, Directional NMF for joint source localization and separation, 2015 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), 2015.
DOI : 10.1109/WASPAA.2015.7336944

F. Keyrouz, W. Maier, and K. Diepold, Robotic binaural localization and separation of more than two concurrent sound sources, 2007 9th International Symposium on Signal Processing and Its Applications, pp.1-4, 2007.
DOI : 10.1109/ISSPA.2007.4555468

K. Youssef, K. Itoyama, and K. Yoshii, Identification and Localization of One or Two Concurrent Speakers in a Binaural Robotic Context, 2015 IEEE International Conference on Systems, Man, and Cybernetics, pp.407-412, 2015.
DOI : 10.1109/SMC.2015.82

H. G. Okuno and K. Nakadai, Robot audition: Its rise and perspectives, 2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp.5610-5614, 2015.
DOI : 10.1109/ICASSP.2015.7179045

W. G. Gardner and K. D. Martin, HRTF measurements of a KEMAR, The Journal of the Acoustical Society of America, vol.97, issue.6, pp.3907-3908, 1995.
DOI : 10.1121/1.412407

N. Roman, D. Wang, and G. J. Brown, Speech segregation based on sound localization, The Journal of the Acoustical Society of America, vol.114, issue.4, pp.2236-2252, 2003.
DOI : 10.1121/1.1610463
URL : http://www.cis.ohio-state.edu/~dwang/papers/RWB.ijcnn01.pdf

C. Viña, S. Argentieri, and M. Rébillat, A spherical cross-channel algorithm for binaural sound localization, 2013 IEEE/RSJ International Conference on Intelligent Robots and Systems, pp.2921-2926, 2013.
DOI : 10.1109/IROS.2013.6696770

C. Févotte and J. Idier, Algorithms for Nonnegative Matrix Factorization with the β-Divergence, Neural Computation, vol.23, issue.9, pp.2421-2456, 2011.
DOI : 10.1162/NECO_a_00168

J. Durrieu, G. Richard, B. David, and C. Févotte, Source/Filter Model for Unsupervised Main Melody Extraction From Polyphonic Audio Signals, IEEE Transactions on Audio, Speech, and Language Processing, vol.18, issue.3, pp.564-575, 2010.
DOI : 10.1109/TASL.2010.2041114
URL : http://perso.telecom-paristech.fr/~grichard/Publications/TSALP_Durrieu10.pdf

D. Bouvier, N. Obin, M. Liuni, and A. Roebel, A source/filter model with adaptive constraints for NMF-based speech separation, 2016 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp.131-135, 2016.
DOI : 10.1109/ICASSP.2016.7471651
URL : https://hal.archives-ouvertes.fr/hal-01294681

A. Ozerov, E. Vincent, and F. Bimbot, A General Flexible Framework for the Handling of Prior Information in Audio Source Separation, IEEE Transactions on Audio, Speech, and Language Processing, vol.20, issue.4, pp.1118-1133, 2012.
DOI : 10.1109/TASL.2011.2172425
URL : https://hal.archives-ouvertes.fr/inria-00536917

H. Sawada, H. Kameoka, S. Araki, and N. Ueda, Multichannel Extensions of Non-Negative Matrix Factorization With Complex-Valued Data, IEEE Transactions on Audio, Speech, and Language Processing, vol.21, issue.5, pp.971-982, 2013.
DOI : 10.1109/TASL.2013.2239990

L. Benaroya, F. Bimbot, and R. Gribonval, Audio source separation with a single sensor, IEEE Transactions on Audio, Speech and Language Processing, vol.14, issue.1, pp.191-199, 2006.
DOI : 10.1109/TSA.2005.854110
URL : https://hal.archives-ouvertes.fr/inria-00544949

T. Carpentier, M. Noisternig, and O. Warusfel, Twenty Years of Ircam Spat: Looking Back, Looking Forward, International Computer Music Conference (ICMC), pp.270-277, 2015.
URL : https://hal.archives-ouvertes.fr/hal-01247594

V. Zue, S. Seneff, and J. Glass, Speech database development at MIT: TIMIT and beyond, Speech Communication, vol.9, issue.4, pp.351-356, 1990.
DOI : 10.1016/0167-6393(90)90010-7

D. B. Dean, S. Sridharan, R. J. Vogt, and M. W. Mason, The QUT-NOISE-TIMIT corpus for the evaluation of voice activity detection algorithms, Interspeech, pp.3110-3113, 2010.

S. S. Stevens and E. B. Newman, The Localization of Actual Sources of Sound, The American Journal of Psychology, vol.48, issue.2, pp.297-306, 1936.
DOI : 10.2307/1415748

R. A. Butler, The bandwidth effect on monaural and binaural localization, Hearing Research, vol.21, issue.1, pp.67-73, 1986.
DOI : 10.1016/0378-5955(86)90047-X

E. Vincent, H. Sawada, P. Bofill, S. Makino, and J. Rosca, First Stereo Audio Source Separation Evaluation Campaign: Data, Algorithms and Results, International Conference on Independent Component Analysis and Blind Source Separation (ICA), pp.552-559, 2007.
DOI : 10.1007/978-3-540-74494-8_69
URL : https://hal.archives-ouvertes.fr/inria-00544199