Harmonium Models for Semantic Video Representation and Classification [chapter]

Jun Yang, Yan Liu, Eric P. Xing, Alexander G. Hauptmann
2007 Proceedings of the 2007 SIAM International Conference on Data Mining  
Based on a class of bipartite undirected graphical models named harmonium, our approach represents video data as latent semantic topics derived by jointly modeling the transcript keywords and color-histogram  ...  Combining the two ideas into the same framework, we propose a probabilistic approach for video classification using intermediate semantic representations derived from the multi-modal features.  ...  tool for video representation and classification.  ... 
doi:10.1137/1.9781611972771.34 dblp:conf/sdm/YangLXH07 fatcat:tfarwp7x6bavfce4xsiloatjie

Harmonium Models for Semantic Video Representation and Classification

Jun Yang, Yan Liu, Eric P. Xing, Alexander Hauptmann
Based on a class of bipartite undirected graphical models named harmonium, our approach represents the video data as latent semantic topics derived by jointly modeling the transcript keywords and color-histogram  ...  Combining the two ideas into the one framework, we propose a probabilistic approach for video classification using intermediate semantic representations derived from multi-modal features.  ...  Conclusion We have described two bipartite undirected models for semantic representation and classification of video data.  ... 
doi:10.1184/r1/6606041.v1 fatcat:t25eqmthvnc6zm3lainefzus4u

Harmonium Models for Video Classification

Jun Yang, Rong Yan, Yan Liu, Eric P. Xing
2008 Statistical analysis and data mining  
Combining the two ideas into the one framework, we propose a series of probabilistic models for video representation and classification using intermediate semantic representations derived from multi-modal  ...  Our models are among the few attempts at using undirected graphical models for representing and classifying video data.  ...  Conclusion We have described three undirected graphical models for semantic representation and classification of video data.  ... 
doi:10.1002/sam.103 fatcat:nngnpdfpe5fwvio5dk2dydvbhe

Mining Associated Text and Images with Dual-Wing Harmoniums [article]

Eric P. Xing, Rong Yan, Alexander G. Hauptmann
2012 arXiv   pre-print
We present empirical results on the applications of this model to classification, retrieval and image annotation on news video collections, and we report an extensive comparison with various extant models  ...  We specialized our model to a dual-wing harmonium for captioned images, incorporating a multivariate Poisson for word-counts and a multivariate Gaussian for color histogram.  ...  A dual-wing harmonium for text and images To model video streams, which contain both text and image information, in the following we outline a dual-wing harmonium (DWH) model based on a text submodel and  ... 
arXiv:1207.1423v1 fatcat:ttuy6fgoj5gvxky3lle6emqaim

A novel dual wing harmonium model aided by 2-D wavelet transform subbands for document data mining

Haijun Zhang, Tommy W.S. Chow, M.K.M. Rahman
2010 Expert systems with applications  
A novel dual wing harmonium model that integrates multiple features including term frequency features and 2-D wavelet transform features into a low dimensional semantic space is proposed for the applications  ...  of document classification and retrieval.  ...  Gehler for his patient explanation to their work. Also, we must thank Xiyuan Lu for discussions on wavelets and thank Jun Yang for sending original codes of their work.  ... 
doi:10.1016/j.eswa.2009.11.088 fatcat:ik5kea37gjgj7ii2qtuu3pae6i

Visual trajectory analysis via Replicated Softmax-based models

Xiaogang Chen, Qixiang Ye, Jialing Zou, Ce Li, Yanting Cui, Jianbin Jiao
2014 Signal, Image and Video Processing  
In this paper, we apply the Replicated Softmax model to visual trajectory representation and analysis problems.  ...  Experiments on trajectory classification and trajectory route analysis are conducted to demonstrate the effectiveness of the proposed model.  ...  video indexing and semantic scene understanding [1] [2] [3] [4] [5] .  ... 
doi:10.1007/s11760-014-0671-2 fatcat:okibyqrubzhppl7ehwgmn33llu

Mining Relationship Between Video Concepts using Probabilistic Graphical Models

Rong Yan, Ming-yu Chen, Alexander Hauptmann
2006 2006 IEEE International Conference on Multimedia and Expo  
For large scale automatic semantic video characterization, it is necessary to learn and model a large number of semantic concepts.  ...  In this paper, we describe various multi-concept relational learning approaches via a unified probabilistic graphical model representation and propose using numerous graphical models to mine the relationship  ...  GRAPHICAL MODEL REPRESENTATIONS FOR VIDEO CONCEPTS Many multi-concept learning approaches can be concisely represented in form of probabilistic graphical models that express dependencies among random variables  ... 
doi:10.1109/icme.2006.262458 dblp:conf/icmcs/YanCH06 fatcat:ze4nvxwtmbc2flxnc35htjnwmm

Large-Margin Predictive Latent Subspace Learning for Multiview Data Analysis

Ning Chen, Jun Zhu, Fuchun Sun, E. P. Xing
2012 IEEE Transactions on Pattern Analysis and Machine Intelligence  
Finally, we extensively evaluate the large-margin latent MN on real image and hotel review datasets for classification, regression, image annotation, and retrieval.  ...  Learning salient representations of multiview data is an essential step in many applications such as image classification, retrieval, and annotation.  ...  ACKNOWLEDGMENTS The authors thank the reviewers for their helpful comments. Part of this work was done while Ning Chen was a  ... 
doi:10.1109/tpami.2012.64 pmid:22392706 fatcat:4vizascr4jddxjnr443sncjn7e


2017 International Journal of Recent Trends in Engineering and Research  
For example, images can be used to find semantically relevant textual information.  ...  crossmedia hashing (SCMH), which uses continuous word representations to capture the textual similarity at the semantic level and use a deep belief network (DBN) to construct the correlation between different  ...  The proposed model fuses multiple data modalities into a unified representation, which can be used for classification and retrieval.  ... 
doi:10.23883/ijrter.2017.3365.aeikk fatcat:6dmfmfsmtbaejale6t63ts7may

Extracting Semantics from Multimedia Content: Challenges and Solutions [chapter]

Lexing Xie, Rong Yan
2008 Signals and Communication Technology  
We start with a system overview of the five major components that extract and use semantic metadata: data annotation, multimedia ontology, feature representation, model learning and retrieval systems  ...  amounts of training data, and finally leveraging media semantics in retrieval systems.  ...  [116] described several approaches for mining the relationship between video concepts with several probabilistic graphical model representations.  ... 
doi:10.1007/978-0-387-76569-3_2 fatcat:jul6fw7esfaurct6erjnvpcq6q

CMU Informedia's TRECVID 2005 Skirmishes

Alexander G. Hauptmann, Robert V. Baron, Michael G. Christel, R. Concescu, Jiang Gao, Qin Jin, Wei-Hao Lin, J.-Y. Pan, Scott M. Stevens, Rong Yan, J. Yang, Y. Zhang
2005 TREC Video Retrieval Evaluation  
At TRECVID 2005, CMU participated in the low-level feature extraction task, the semantic concept feature extraction task, automatic, manual and interactive search tasks and the BBC stock footage challenge  ...  ACKNOWLEDGMENTS This work was supported in part by the Advanced Research and Development Activity under contract numbers H98230-04-C-0406 and NBCHC040037.  ...  [6] explicitly modeled the linkage between various semantic concepts via a Bayesian network that implicitly offers an ontology semantics underlying the video collection. Snoek et al.  ... 
dblp:conf/trecvid/HauptmannBCCGJL05 fatcat:qxocospbfvfere34yyacwicrs4

Learning Image-Text Associations

Tao Jiang, Ah-Hwee Tan
2009 IEEE Transactions on Knowledge and Data Engineering  
Specifically, we present two learning methods for discovering the underlying associations between images and texts based on small training data sets.  ...  Another method uses a neural network to learn direct mapping between the visual and textual features by automatically and incrementally summarizing the associated features into a set of information templates  ...  Jun Yang at Carnegie Mellon University for providing MATLAB source code of the dual-wing harmonium model.  ... 
doi:10.1109/tkde.2008.150 fatcat:zckxj6vssbhabe3wspzkgj35vm

A Survey of Multi-View Representation Learning [article]

Yingming Li, Ming Yang, Zhongfei Zhang
2017 arXiv   pre-print
This paper introduces two categories for multi-view representation learning: multi-view representation alignment and multi-view representation fusion.  ...  most appropriate tools for particular applications.  ...  It consists of three parts: compositional language model M L : T → T f , deep video model M V : V → V f , and a joint embedding model (47) where V f and T f are the output of the deep video model and  ... 
arXiv:1610.01206v4 fatcat:xsi7ufxnlbdk5lz6ykrsnexfvm

Text feature extraction based on deep learning: a review

Hong Liang, Xiao Sun, Yunlei Sun, Yuan Gao
2017 EURASIP Journal on Wireless Communications and Networking  
Selection of text feature item is a basic and important matter for text mining and information retrieval. Traditional methods of feature extraction require handcrafted features.  ...  Deep learning can automatically learn feature representation from big data, including millions of parameters.  ...  Acknowledgements This work is supported by the Fundamental Research Funds for the Central Universities (Grant No.18CX02019A).  ... 
doi:10.1186/s13638-017-0993-1 pmid:29263717 pmcid:PMC5732309 fatcat:bqyk3wddqbebdfeki72myn5p2y

MedLDA: A General Framework of Maximum Margin Supervised Topic Models [article]

Jun Zhu, Amr Ahmed, Eric P. Xing
2009 arXiv   pre-print
Supervised topic models utilize document's side information for discovering predictive low dimensional representations of documents. Existing models apply the likelihood-based estimation.  ...  In this paper, we present a general framework of max-margin supervised topic models for both continuous and categorical response variables.  ...  National Key Foundation R&D Projects 2003CB317007, 2004CB318108 and 2007CB311003; and Basic Research Foundation of Tsinghua National TNList Lab.  ... 
arXiv:0912.5507v1 fatcat:xcv25naanrfwzl42iea3mm552q