Sound texture synthesis via filter statistics

Josh H. McDermott, Andrew J. Oxenham, Eero P. Simoncelli
2009 2009 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics  
Many natural sounds, such as those produced by rainstorms, fires, or insects at night, consist of large numbers of rapidly occurring acoustic events. We hypothesize that humans encode these "sound textures" with statistical measurements that capture their constituent features and the relationship between them. We explored this hypothesis using a synthesis algorithm that measures statistics in a real sound and imposes them on a sample of noise. Simply matching the marginal statistics (variance, kurtosis) of individual frequency subbands was generally necessary, but insufficient, to yield good results. Imposing various pairwise envelope statistics (correlations between bands, and autocorrelations within each band) greatly improved the results, frequently producing synthetic textures that sounded natural and that listeners could reliably recognize. The results suggest that such statistical representations could underlie sound texture perception, and that the auditory system may use fairly simple statistics to recognize many natural sound textures.
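The measurement half of the approach described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the brick-wall FFT filter bank, the band edges, the lag count, and every function name below are assumptions chosen for brevity, and the step that imposes the measured statistics on noise is not shown.

```python
import numpy as np

def subbands(x, sr, edges):
    """Split x into frequency bands with ideal FFT masks (an assumed stand-in filter bank)."""
    spectrum = np.fft.rfft(x)
    freqs = np.fft.rfftfreq(len(x), d=1.0 / sr)
    bands = []
    for lo, hi in zip(edges[:-1], edges[1:]):
        mask = (freqs >= lo) & (freqs < hi)
        bands.append(np.fft.irfft(spectrum * mask, n=len(x)))
    return np.array(bands)

def envelope(band):
    """Amplitude envelope: magnitude of the analytic signal, computed via the FFT."""
    n = len(band)
    weights = np.zeros(n)
    weights[0] = 1.0
    if n % 2 == 0:
        weights[n // 2] = 1.0
        weights[1:n // 2] = 2.0
    else:
        weights[1:(n + 1) // 2] = 2.0
    return np.abs(np.fft.ifft(np.fft.fft(band) * weights))

def kurtosis(v):
    """Fourth central moment divided by the squared variance (3 for a Gaussian)."""
    d = v - v.mean()
    return np.mean(d ** 4) / np.mean(d ** 2) ** 2

def autocorr(v, max_lag):
    """Normalized autocorrelation of v at lags 1..max_lag."""
    d = v - v.mean()
    denom = np.dot(d, d)
    return np.array([np.dot(d[:len(d) - k], d[k:]) / denom
                     for k in range(1, max_lag + 1)])

def texture_stats(x, sr, edges, max_lag=4):
    """Marginal subband statistics plus pairwise envelope statistics."""
    bands = subbands(x, sr, edges)
    envs = np.array([envelope(b) for b in bands])
    return {
        "variance": bands.var(axis=1),                      # per-band marginal
        "kurtosis": np.array([kurtosis(b) for b in bands]), # per-band marginal
        "env_corr": np.corrcoef(envs),                      # correlations between bands
        "env_autocorr": np.array([autocorr(e, max_lag) for e in envs]),  # within each band
    }

# Usage on one second of white noise; the band edges (in Hz) are hypothetical.
rng = np.random.default_rng(0)
sr = 16000
noise = rng.standard_normal(sr)
stats = texture_stats(noise, sr, edges=[100, 400, 1600, 6400])
```

Running `texture_stats` on a recorded texture and on a noise sample gives two sets of statistics whose differences a synthesis procedure would then need to reduce; how that adjustment is carried out is left open here.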
doi:10.1109/aspaa.2009.5346467 dblp:conf/waspaa/McDermottOS09 fatcat:5nqq4foy5bfb5dqjranakyipmm