Robust sound event detection in bioacoustic sensor networks


Authors: Vincent Lostanlen aff001; Justin Salamon aff002; Andrew Farnsworth aff001; Steve Kelling aff001; Juan Pablo Bello aff002
Author affiliations: Cornell Lab of Ornithology, Cornell University, Ithaca, NY, United States of America aff001; Music and Audio Research Laboratory, New York University, New York, NY, United States of America aff002; Center for Urban Science and Progress, New York University, New York, NY, United States of America aff003
Published in: PLoS ONE 14(10)
Category: Research Article
doi: 10.1371/journal.pone.0214168

Abstract

Bioacoustic sensors, sometimes known as autonomous recording units (ARUs), can record sounds of wildlife over long periods of time in scalable and minimally invasive ways. Deriving per-species abundance estimates from these sensors requires detection, classification, and quantification of animal vocalizations as individual acoustic events. Yet, variability in ambient noise, both over time and across sensors, hinders the reliability of current automated systems for sound event detection (SED), such as convolutional neural networks (CNNs) in the time-frequency domain. In this article, we develop, benchmark, and combine several machine listening techniques to improve the generalizability of SED models across heterogeneous acoustic environments. As a case study, we consider the problem of detecting avian flight calls from a ten-hour recording of nocturnal bird migration, recorded by a network of six ARUs in the presence of heterogeneous background noise. Starting from a CNN yielding state-of-the-art accuracy on this task, we introduce two noise adaptation techniques, respectively integrating short-term (60 ms) and long-term (30 min) context. First, we apply per-channel energy normalization (PCEN) in the time-frequency domain, which applies short-term automatic gain control to every subband in the mel-frequency spectrogram. Second, we replace the last dense layer in the network with a context-adaptive neural network (CA-NN) layer, i.e., an affine layer whose weights are dynamically adapted at prediction time by an auxiliary network taking long-term summary statistics of spectrotemporal features as input. We show that PCEN reduces temporal overfitting across dawn vs. dusk audio clips, whereas context adaptation on PCEN-based summary statistics reduces spatial overfitting across sensor locations. Moreover, combining them yields state-of-the-art results that are unmatched by artificial data augmentation alone.
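The per-subband automatic gain control described above can be illustrated with a minimal NumPy sketch of PCEN. It follows the commonly published form of the operation (a first-order smoother per subband, followed by adaptive gain division and root compression). The function name `pcen`, the parameter defaults, and the random test spectrogram are illustrative assumptions, not the settings used in the article.

```python
import numpy as np

def pcen(E, s=0.025, alpha=0.98, delta=2.0, r=0.5, eps=1e-6):
    """Per-channel energy normalization (illustrative sketch).

    E: non-negative mel-frequency spectrogram, shape (n_subbands, n_frames).
    s: smoothing coefficient of the first-order low-pass filter.
    alpha: gain control strength; delta, r: bias and exponent of compression.
    All default values are assumptions chosen for illustration.
    """
    E = np.asarray(E, dtype=float)
    M = np.empty_like(E)
    M[:, 0] = E[:, 0]  # initialize the smoother with the first frame
    for t in range(1, E.shape[1]):
        # Short-term running average of energy, computed per subband
        M[:, t] = (1.0 - s) * M[:, t - 1] + s * E[:, t]
    # Adaptive gain control, then dynamic range compression
    return (E / (eps + M) ** alpha + delta) ** r - delta ** r

# Hypothetical input: 8 subbands, 50 frames of positive energies
rng = np.random.default_rng(0)
E = rng.uniform(1.0, 2.0, size=(8, 50))
out = pcen(E)
```

With `alpha=1.0`, multiplying the input by a constant gain leaves the output nearly unchanged, which is the sense in which PCEN adapts to slowly varying loudness in each subband.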
We release a pre-trained version of our best performing system under the name of BirdVoxDetect, a ready-to-use detector of avian flight calls in field recordings.
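The context-adaptive layer mentioned in the abstract, an affine layer whose weights are produced at prediction time by an auxiliary network, can be sketched as follows. This is a generic hypernetwork-style illustration under stated assumptions: the names `init_params` and `context_adaptive_affine`, the single hidden layer, and all dimensions are hypothetical, and the formulation used in the article may differ.

```python
import numpy as np

def init_params(d_in, d_ctx, d_hidden, d_out, seed=0):
    """Random parameters for the auxiliary network (hypothetical sizes)."""
    rng = np.random.default_rng(seed)
    return {
        "U": rng.normal(size=(d_hidden, d_ctx)),          # context -> hidden
        "c": np.zeros(d_hidden),
        "V_w": rng.normal(size=(d_out * d_in, d_hidden)),  # hidden -> weights
        "V_b": rng.normal(size=(d_out, d_hidden)),         # hidden -> bias
        "shape": (d_out, d_in),
    }

def context_adaptive_affine(x, z, params):
    """Affine layer whose weights and bias depend on the context vector z.

    x: short-term features, shape (d_in,)
    z: long-term summary statistics, shape (d_ctx,)
    """
    # Auxiliary network: one hidden layer applied to the context
    h = np.tanh(params["U"] @ z + params["c"])
    # The auxiliary network outputs the parameters of the main layer
    W = (params["V_w"] @ h).reshape(params["shape"])
    b = params["V_b"] @ h
    # Main layer: affine in x for any fixed context z
    return W @ x + b

params = init_params(d_in=4, d_ctx=3, d_hidden=5, d_out=2)
x = np.array([1.0, -0.5, 0.25, 2.0])
z_a = np.array([0.1, 0.2, 0.3])    # context from one hypothetical sensor
z_b = np.array([-0.3, 0.0, 0.4])   # context from another hypothetical sensor
y_a = context_adaptive_affine(x, z_a, params)
y_b = context_adaptive_affine(x, z_b, params)
```

The same input `x` yields different outputs under `z_a` and `z_b`, while for a fixed context the layer remains an ordinary affine map of `x`.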

Keywords:

Acoustics – Animal flight – Animal migration – Bioacoustics – Bird flight – Memory recall – Neural networks – Ambient noise


Article published in the journal

PLOS One


2019, Issue 10