Using Mutual Proximity To Improve Content-Based Audio Similarity

Dominik Schnitzer, Arthur Flexer, Markus Schedl, Gerhard Widmer
2011 Zenodo  
DISCUSSION AND FUTURE WORK The authors find it very exciting to see the potential for improvements that one of the most basic content-based audio similarity algorithms still offers without any modification  ...  AUDIO SIMILARITY This work uses the basic algorithm from Mandel and Ellis [15] to compute audio similarity. To compute the features we use 25 MFCCs for each 46ms of audio with a 23ms hop size.  ... 
doi:10.5281/zenodo.1417979 fatcat:gfsj66gcvbdxvepvosr4dh57na

A Mirex Meta-Analysis Of Hubness In Audio Music Similarity

Arthur Flexer, Dominik Schnitzer, Jan Schlueter
2012 Zenodo  
3) plus improved results using "mutual proximity" (rows mp, see section 5.3).  ...  To sum up, mutual proximity (MP) is able to decisively improve the hubness situation while not changing the overall performance in audio similarity.  ... 
doi:10.5281/zenodo.1417864 fatcat:63ysc45zbndojdmxukh47qo2xi

Location-Aware Music Artist Recommendation [chapter]

Markus Schedl, Dominik Schnitzer
2014 Lecture Notes in Computer Science  
To this end, we use a novel standardized data set of music listening activities inferred from microblogs (MusicMicro) and state-of-the-art techniques to extract audio features and contextual web features  ...  Current advances in music recommendation underline the importance of multimodal and user-centric approaches in order to transcend limits imposed by methods that solely use audio, web, or collaborative  ...  Data Representation To represent the music content, we use state-of-the-art audio music feature extractors proposed in [8], which constitute a reference in music feature extraction for similarity-based  ... 
doi:10.1007/978-3-319-04117-9_19 fatcat:uf36ygltonevvevsyujsuog7mm

Mutual proximity graphs for improved reachability in music recommendation

Arthur Flexer, Jeff Stevens
2017 Journal of New Music Research  
We show that mutual proximity graphs yield much better graph connectivity resulting in improved reachability compared to knn graphs, mutual knn graphs and mutual knn graphs enhanced with minimum spanning  ...  We present mutual proximity graphs, which are an alternative to knn and mutual knn graphs, and are able to avoid hub vertices having abnormally high connectivity.  ...  This similarity measure is based on timbre information computed from the audio.  ... 
doi:10.1080/09298215.2017.1354891 pmid:29348779 pmcid:PMC5750815 fatcat:w4rwguirxnd3tkoggnyriqm3ee

Retrieving what's relevant in audio and video: statistics and linguistics in combination

Anthony Davis, Philip Rennert, Robert Rubinoff, Tim Sibley, Evelyne Tzoukermann
2004 Open research Areas in Information Retrieval  
In text-based searching, the user can easily skim over passages where search terms are highlighted and easily find the boundaries of relevant content.  ...  Consider the requirements for an information retrieval system that renders timed media searchable with efficiency similar to that of text.  ...  Finally, the mutual information model is exploited to determine how similar the contexts of two terms are.  ... 
dblp:conf/riao/DavisRRST04 fatcat:laloq5surnejzbtndd5fzk2g4q

Coincidence, Categorization, and Consolidation: Learning to Recognize Sounds with Minimal Supervision [article]

Aren Jansen, Daniel P. W. Ellis, Shawn Hershey, R. Channing Moore, Manoj Plakal, Ashok C. Popat, Rif A. Saurous
2019 arXiv   pre-print
By training a combined sound embedding/clustering/classification network according to these criteria, we achieve a new state-of-the-art unsupervised audio representation and demonstrate up to a 20-fold  ...  , (ii) a clustering objective that reflects our need to impose categorical structure on our experiences, and (iii) a cluster-based active learning procedure that solicits targeted weak supervision to consolidate  ...  Our prior work in temporal proximity-based metric learning [11] was a direct attempt to leverage coincidence for audio representation learning.  ... 
arXiv:1911.05894v1 fatcat:bf5ldupba5f7bax5zjt5573dry

Retrieval and browsing of spoken content

C. Chelba, T.J. Hazen, M. Saraclar
2008 IEEE Signal Processing Magazine  
As data availability increases, the lack of adequate technology for processing spoken documents becomes the limiting factor to large-scale access to spoken content.  ...  Text-based search is the most active area, with applications that range from Web and local network search to searching for personal information residing on one's own hard-drive.  ...  The audio content and text metadata can also be used jointly for further improvements in retrieval performance.  ... 
doi:10.1109/msp.2008.917992 fatcat:4ybsmkb3a5cunabtowypksshky

Trial Realization of Human-Centered Multimedia Navigation for Video Retrieval

Miki Haseyama, Takahiro Ogawa
2013 International Journal of Human-Computer Interaction  
By using these functions, users can find their desired video contents more quickly and accurately than with the conventional retrieval schemes since our system can provide new pathways to the desired contents  ...  (iii) adaptive visualization for users to be guided to their desired contents.  ...  Specifically, from "the law of proximity", the proposed system can provide similar video contents between neighboring retrieval times, i.e., it enables accurate content retrieval.  ... 
doi:10.1080/10447318.2012.692316 fatcat:tkqkf5qzjbc4lpwhcz6n3ozejm

Temporal Proximity induces Attributes Similarity [article]

Arun Kumar, Karan Aggarwal, Paul Schrater
2018 arXiv   pre-print
Second, we present an induced similarity metric in temporal proximity driven by user tastes and third, we show that this induced similarity can be used to learn items pairwise similarity in attribute space  ...  Users consume their favorite content in temporal proximity of consumption bundles according to their preferences and tastes.  ...  On the other hand, content based methods [18] attempt to find items similar in content to previously liked item by a user.  ... 
arXiv:1810.08747v1 fatcat:2i7a6m5vcrgmhcibzulgowpgvy


Christian Frisson, Stéphane Dupont, Willy Yvart, Nicolas Riche, Xavier Siebert, Thierry Dutoit
2014 Proceedings of the 9th Audio Mostly on A Conference on Interaction With Sound - AM '14  
AudioMetro combines a new content-based information visualization technique with instant audio feedback to facilitate this part of their workflow.  ...  We show through user evaluations by known-item search in collections of textural sounds that a default grid layout ordered by filename unexpectedly outperforms content-based similarity layouts resulting  ...  The SoundTorch content-based audio browser has been designed by Heise et al. [15, 16].  ... 
doi:10.1145/2636879.2636880 dblp:conf/audio/FrissonDYRSD14 fatcat:qjocubi4wrg27m4m6qea2cmdce

A Multimedia Search And Navigation Prototype, Including Music And Video-Clips

Geoffroy Peeters, Frédéric Cornu, Christophe Charbuillet, Damien Tardieu, Juan José Burred, Marie Vian, Valérie Botherel, Jean-Bernard Rault, Jean-Philippe Cabanal
2012 Zenodo  
Audio feature extraction In order to decrease the total computation time, autotagging based on training and search-by-similarity are based on the same audio features front-end.  ...  to content-based estimation algorithms but may be difficult to understand by users − to a purely application oriented definition.  ... 
doi:10.5281/zenodo.1417760 fatcat:rntgn53s2re2baw5k3uclcft7a

Acoustic Fingerprints for Access Management in Ad-Hoc Sensor Networks

Pablo Perez Zarazaga, Tom Backstrom, Stephan Sigg
2020 IEEE Access  
Voice user interfaces can offer intuitive interaction with our devices, but the usability and audio quality could be further improved if multiple devices could collaborate to provide a distributed voice  ...  However, the robustness of these systems is partially based on the extensive duration of the recordings that are required to obtain the fingerprint.  ...  MUTUAL INFORMATION-BASED QUANTIZATION The goal of the generated fingerprints is to remain as similar as possible in matching cases and differ from each other when the audio samples do not match.  ... 
doi:10.1109/access.2020.3022618 fatcat:gbp6vmukdjdtznpdlkjbatdkgq

Sound Pryer: Adding Value to Traffic Encounters with Streaming Audio [chapter]

Mattias Östergren
2004 Lecture Notes in Computer Science  
Through field trial we found that user appreciated the concept, but the prototype needs some improvements, foremost in terms of audio playback.  ...  The Sound Pryer is a peer-to-peer application of mobile wireless ad hoc networking for PDAs with the intent of adding value to mundane traffic encounters.  ...  Acknowledgements We would like to thank Oskar Juhlin for your support and input to the project and particularly your contributions to the sections on adding value to traffic encounters.  ... 
doi:10.1007/978-3-540-28643-1_71 fatcat:7js4qn3q2felnlgf5qe6d22cpq

Detecting conversing groups with a single worn accelerometer

Hayley Hung, Gwenn Englebienne, Laura Cabrera Quiros
2014 Proceedings of the 16th International Conference on Multimodal Interaction - ICMI '14  
Our work differs significantly from previous approaches, which have tended to rely on audio and/or proximity sensing, often in much less crowded scenarios, for estimating whether people are talking together  ...  Our approach estimates each individual's social actions and uses the co-ordination of these social actions between pairs to identify group membership.  ...  ACKNOWLEDGEMENTS Thanks to Matthew Dobson, Claudio Martella, and Maarten van Steen (VU University of Amsterdam) for the use of their wearable sensors and assistance during the data collection.  ... 
doi:10.1145/2663204.2663228 dblp:conf/icmi/HungEQ14 fatcat:rwodm7nfdrbqbktr2hyc677ody

Speech Retrieval [chapter]

Ciprian Chelba, Timothy J. Hazen, Bhuvana Ramabhadran, Murat Saraçlar
2011 Spoken Language Understanding  
In this chapter we discuss the retrieval and browsing of spoken audio documents.  ...  The primary technical challenges of speech retrieval lie in the retrieval system's ability to deal with imperfect speech recognition technology that produces errorful output due to misrecognitions cause  ...  and browse audio content.  ... 
doi:10.1002/9781119992691.ch15 fatcat:o36ulm7kh5dxvhm6alb4yz3qvy