STOCHASTIC ADAPTIVE NEURAL ARCHITECTURE SEARCH FOR KEYWORD SPOTTING

Abstract : Keyword spotting, i.e., identifying keywords in a real-time audio stream, is mainly solved by applying a neural network over successive sliding windows. Because the task is difficult, baseline models are usually large, resulting in high computational cost and energy consumption. We propose a new method called SANAS (Stochastic Adaptive Neural Architecture Search), which adapts the architecture of the neural network on the fly at inference time, so that small architectures are used when the stream is easy to process (silence, low noise, ...) and bigger networks are used when the task becomes more difficult. We show that this adaptive model can be learned end-to-end by optimizing a trade-off between prediction performance and the average computational cost per unit of time. Experiments on the Speech Commands dataset [1] show that this approach achieves a high recognition level while being much faster (and/or more energy-efficient) than classical approaches in which the network architecture is static.
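
The trade-off mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `tradeoff_objective`, the weight `lam`, the additive form of the objective, and all loss and cost values are assumptions made for illustration only. It compares a static big network with a hypothetical adaptive policy that uses a small network on easy windows.

```python
from statistics import mean


def tradeoff_objective(task_losses, costs, lam):
    """Average prediction loss plus lam times the average compute cost.

    Assumed form, for illustration only: one loss and one cost per
    sliding window, combined additively with a weight lam.
    """
    if len(task_losses) != len(costs):
        raise ValueError("need one loss and one cost per window")
    return mean(task_losses) + lam * mean(costs)


# Hypothetical stream of 4 windows: two easy ones (e.g. silence) followed
# by two hard ones. All numbers below are made up for the example.
SMALL_COST, BIG_COST = 1.0, 10.0
losses = [0.1, 0.1, 0.2, 0.2]

# Adaptive policy: small network on easy windows, big network on hard ones.
adaptive_costs = [SMALL_COST, SMALL_COST, BIG_COST, BIG_COST]
# Static policy: the big network on every window.
static_costs = [BIG_COST] * 4

lam = 0.01
adaptive = tradeoff_objective(losses, adaptive_costs, lam)
static = tradeoff_objective(losses, static_costs, lam)
print(adaptive < static)
```

In this toy setting both policies are assumed to reach the same prediction loss, so the adaptive one scores better purely because its average cost per window is lower. In practice the weight on cost would control how aggressively small architectures are preferred.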

Cited literature [21 references]

https://hal.sorbonne-universite.fr/hal-02063698
Contributor : Olivier Schwander
Submitted on : Monday, March 11, 2019 - 1:49:44 PM
Last modification on : Friday, July 5, 2019 - 3:26:03 PM
Long-term archiving on : Wednesday, June 12, 2019 - 2:49:42 PM

File

ICASSP_2019.pdf
Files produced by the author(s)

Identifiers

  • HAL Id : hal-02063698, version 1
  • ARXIV : 1811.06753

Citation

Tom Véniat, Olivier Schwander, Ludovic Denoyer. STOCHASTIC ADAPTIVE NEURAL ARCHITECTURE SEARCH FOR KEYWORD SPOTTING. ICASSP 2019 - International Conference on Acoustics, Speech, and Signal Processing, May 2019, Brighton, United Kingdom. ⟨hal-02063698⟩

Metrics

Record views : 33
File downloads : 39