26,829 Hits in 5.2 sec

A merged molecular representation learning for molecular properties prediction with a web-based service

Hyunseob Kim, Jeongcheol Lee, Sunil Ahn, Jongsuk Ruth Lee
2021 Scientific Reports  
The key to our model is learning structures with adjacency-matrix embedding and learning logic that can infer descriptors via Quantitative Estimation of Drug-likeness prediction in pre-training.  ...  In this paper, we propose a new self-supervised method to learn SMILES and chemical contexts of molecules simultaneously in pre-training the Transformer.  ...  Pre-training approaches also appear to solve molecular property prediction in drug discovery.  ... 
doi:10.1038/s41598-021-90259-7 pmid:34040026 fatcat:wzn37x4ipjgqnemq63iwq4nrle
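The adjacency-matrix idea in the entry above can be illustrated with a minimal sketch: build a symmetric 0/1 adjacency matrix from a bond list and flatten it for an embedding layer. The molecule and atom indices here are invented for illustration; this is not the paper's actual pipeline.

```python
def adjacency_matrix(n_atoms, bonds):
    """Return an n_atoms x n_atoms 0/1 adjacency matrix from (i, j) bond pairs."""
    mat = [[0] * n_atoms for _ in range(n_atoms)]
    for i, j in bonds:
        mat[i][j] = 1
        mat[j][i] = 1  # chemical bonds are undirected
    return mat

# Toy three-atom chain (e.g. C-C-O as atoms 0, 1, 2)
mat = adjacency_matrix(3, [(0, 1), (1, 2)])
flat = [x for row in mat for x in row]  # flattened vector for an embedding layer
```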

TURL: Table Understanding through Representation Learning [article]

Xiang Deng, Huan Sun, Alyssa Lees, You Wu, Cong Yu
2020 arXiv   pre-print
In this paper, we present TURL, a novel framework that introduces the pre-training/fine-tuning paradigm to relational Web tables.  ...  Specifically, we propose a structure-aware Transformer encoder to model the row-column structure of relational tables, and present a new Masked Entity Recovery (MER) objective for pre-training to capture  ...  ACKNOWLEDGMENTS We would like to thank the anonymous reviewers for their helpful comments.  ... 
arXiv:2006.14806v2 fatcat:zob4kpoe7nglbd2lrtmo3ngakq

Short Text Classification Based on Latent Topic Modeling and Word Embedding

Peng LI, Jun-Qing HE, Cheng-Long MA
2017 DEStech Transactions on Computer Science and Engineering  
To classify short and sparse text accurately is always a basic need for dealing with information efficiently.  ...  By discovering the latent topics in the related data crawled from the web, topic distribution can describe the text content in general.  ...  Various kinds of pre-trained word embeddings were used with different training methods and different web resources, such as Wikipedia and Google News.  ... 
doi:10.12783/dtcse/aice-ncs2016/5647 fatcat:teazta5fivdzhjdmgit6kzlkim

Automatically Discovering Surveillance Devices in the Cyberspace

Qiang Li, Xuan Feng, Haining Wang, Limin Sun
2017 Proceedings of the 8th ACM on Multimedia Systems Conference - MMSys'17  
We achieve real-time and non-intrusive web crawling by leveraging network scanning technology.  ...  Discovering surveillance devices is a prerequisite for ensuring high availability, reliability, and security of these devices.  ...  ACKNOWLEDGMENTS We would like to thank the anonymous reviewers for their insightful and detailed feedback.  ... 
doi:10.1145/3083187.3084020 dblp:conf/mmsys/LiFWS17 fatcat:qjupcrleb5eqhdpcj2rfo6ficu

Classifying News Media Coverage for Corruption Risks Management with Deep Learning and Web Intelligence

Albert Weichselbraun, Sandro Hörler, Christian Hauser, Anina Havelka
2020 Proceedings of the 10th International Conference on Web Intelligence, Mining and Semantics  
The research presented in this paper introduces the Integrity Risks Monitor, an analytics dashboard that applies Web Intelligence and Deep Learning to English- and German-speaking documents for the task  ...  Domain experts created a gold standard dataset compiled from Anglo-American media coverage on corruption cases that has been used for training and evaluating the classifier.  ...  The conducted experiments provide the following insights: (i) self-trained word embeddings clearly outperformed pre-trained embeddings (Section 3.2), since they seemed to be better adapted to the application  ... 
doi:10.1145/3405962.3405988 fatcat:nzutrdthpjam7isklcxt7di5fi

ServeNet: A Deep Neural Network for Web Services Classification [article]

Yilong Yang, Nafees Qamar, Peng Liu, Katarina Grolinger, Weiru Wang, Zhi Li, Zhifang Liao
2020 arXiv   pre-print
Automated service classification plays a crucial role in service discovery, selection, and composition. Machine learning has been widely used for service classification in recent years.  ...  To demonstrate the effectiveness of our approach, we conduct a comprehensive experimental study by comparing 10 machine learning methods on 10,000 real-world web services.  ...  BERT model as the embedding layer, which is a pre-trained language model and can transform a word of a description to an n-dimensional vector according to the contexts of words.  ... 
arXiv:1806.05437v3 fatcat:o6vgiw4izve4fc7afyn36w7r4y

Embedded Bi-directional GRU and LSTM Learning Models to Predict Disasters on Twitter Data

A. Bhuvaneswari, J. Timothy Jones Thomas, P. Kesavan
2019 Procedia Computer Science  
In this paper, embedded bi-directional GRU and LSTM learning models are applied for disaster event prediction, using deep learning techniques to categorize the tweets.  ...  The dataset sample is exclusively pre-trained with word embeddings like WordNet [16]. In Recurrent Neural Networks, all the sentences in the dataset are split into words and converted using the embedding  ... 
doi:10.1016/j.procs.2020.01.020 fatcat:ulgcrh65mzgsfnqjfv7epmsw2q

UMDuluth-CS8761 at SemEval-2018 Task 9: Hypernym Discovery using Hearst Patterns, Co-occurrence frequencies and Word Embeddings

Arshia Zernab Hassan, Manikya Swathi Vallabhajosyula, Ted Pedersen
2018 Proceedings of The 12th International Workshop on Semantic Evaluation  
Hypernym Discovery is the task of identifying potential hypernyms for a given term. A hypernym is a more generalized word that is super-ordinate to more specific words.  ...  Our system Babbage participated in Subtask 1A for English and placed 6th of 19 systems when identifying concept hypernyms, and 12th of 18 systems for entity hypernyms.  ...  (d) Embedding Dimension Size : 300. The number of dimensions for the embedding matrix.  ... 
doi:10.18653/v1/s18-1149 dblp:conf/semeval/HassanVP18 fatcat:jp6pudo6cng3ddd46udnpyaqee
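The Hearst-pattern component mentioned in this entry can be sketched with a single regex. Only the classic "X such as Y" pattern is shown, and the single-word regex is a deliberate simplification; the actual system combines several patterns with co-occurrence frequencies and word embeddings.

```python
import re

# Classic Hearst pattern: "<hypernym> such as <hyponym>"
PATTERN = re.compile(r"(\w+)\s+such as\s+(\w+)", re.IGNORECASE)

def hearst_pairs(text):
    """Return (hypernym, hyponym) candidate pairs matched by the 'such as' pattern."""
    return [(m.group(1).lower(), m.group(2).lower()) for m in PATTERN.finditer(text)]

pairs = hearst_pairs("We studied animals such as dogs and fruits such as apples.")
# → [("animals", "dogs"), ("fruits", "apples")]
```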

E-fashion Product Discovery via Deep Text Parsing

Uma Sawant, Vijay Gabale, Anand Subramanian
2017 Proceedings of the 26th International Conference on World Wide Web Companion - WWW '17 Companion  
Transforming unstructured text into structured form is important for fashion e-commerce platforms that ingest tens of thousands of fashion products every day.  ...  little attention has been paid to discovering potentially multiple products present in the listing along with their respective relevant attributes, and leveraging the entire title and description text for  ...  To capture capitalization, phrasing, function words of the input text (wherever present), we carefully form the representation layer using: (a) a word embedding e_w from a pre-trained embedding table, and  ... 
doi:10.1145/3041021.3054263 dblp:conf/www/SawantGS17 fatcat:vnf64prl65h2jmx53oncxcxtyy

A Transfer Learning Pipeline for Educational Resource Discovery with Application in Leading Paragraph Generation [article]

Irene Li, Thomas George, Alexander Fabbri, Tammy Liao, Benjamin Chen, Rina Kawamura, Richard Zhou, Vanessa Yan, Swapnil Hingmire, Dragomir Radev
2022 arXiv   pre-print
In this paper, we propose the educational resource discovery (ERD) pipeline that automates web resource discovery for novel domains.  ...  This is the first study that considers various web resources for survey generation, to the best of our knowledge.  ...  To conduct QD-MLM pretraining, we apply two external educational corpora for pre-training to ensure the data quality: TutorialBank (TB) and arXiv.  ... 
arXiv:2201.02312v1 fatcat:ilb6z4xvbfhurhdgxbdtupntvu

UMDuluth-CS8761 at SemEval-2018 Task 9: Hypernym Discovery using Hearst Patterns, Co-occurrence frequencies and Word Embeddings [article]

Arshia Z. Hassan and Manikya S. Vallabhajosyula and Ted Pedersen
2018 arXiv   pre-print
Hypernym Discovery is the task of identifying potential hypernyms for a given term. A hypernym is a more generalized word that is super-ordinate to more specific words.  ...  Our system Babbage participated in Subtask 1A for English and placed 6th of 19 systems when identifying concept hypernyms, and 12th of 18 systems for entity hypernyms.  ...  (d) Embedding Dimension Size : 300. The number of dimensions for the embedding matrix.  ... 
arXiv:1805.10271v1 fatcat:65g576gphvdqtja56weprguuom

DeClarE: Debunking Fake News and False Claims using Evidence-Aware Deep Learning

Kashyap Popat, Subhabrata Mukherjee, Andrew Yates, Gerhard Weikum
2018 Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing  
Research on automated fact-checking has proposed methods based on supervised learning, but these approaches do not consider external evidence apart from labeled training instances.  ...  It also derives informative features for generating user-comprehensible explanations that make the neural network predictions transparent to the end-user.  ...  We represent terms using pre-trained GloVe Wikipedia 6B word embeddings (Pennington et al., 2014).  ... 
doi:10.18653/v1/d18-1003 dblp:conf/emnlp/PopatMYW18 fatcat:gj55k4ojnzejfpjc454rc7qq6a
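The GloVe release this entry refers to ships as plain text, one token per line followed by its float components. A minimal parser might look like the following; the sample vectors are invented so the sketch stays self-contained rather than downloading the real file.

```python
def parse_glove(lines):
    """Parse GloVe-format lines ('word f1 f2 ...') into a {word: [float, ...]} dict."""
    vectors = {}
    for line in lines:
        parts = line.rstrip().split(" ")
        vectors[parts[0]] = [float(x) for x in parts[1:]]
    return vectors

# Invented 3-dimensional sample; the real 6B release has 50-300 dimensions.
sample = ["news 0.1 -0.2 0.3", "fake 0.4 0.0 -0.1"]
emb = parse_glove(sample)

# Out-of-vocabulary terms are commonly mapped to a zero vector of the same size.
dim = len(next(iter(emb.values())))
unk = [0.0] * dim
```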

Open Intent Discovery through Unsupervised Semantic Clustering and Dependency Parsing [article]

Pengfei Liu, Youzhang Ning, King Keung Wu, Kun Li, Helen Meng
2021 arXiv   pre-print
We obtain the utterance representation from various pre-trained sentence embeddings and present a metric of balanced score to determine the optimal number of clusters in K-means clustering for balanced  ...  In the second stage, the objective is to generate an intent label automatically for each cluster.  ...  TABLE I: Pre-trained sentence representation models.  ... 
arXiv:2104.12114v2 fatcat:fsvp572yhrej5cfknslxrcbmqi
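The snippet above does not define its "balanced score", so as a hypothetical illustration only, one plausible balance measure is the normalized entropy of the cluster sizes: 1.0 for perfectly even clusters, approaching 0 as one cluster absorbs everything. The paper's actual metric may be defined differently.

```python
import math

def balance_score(cluster_sizes):
    """Normalized entropy of cluster sizes: ~1.0 = perfectly balanced clustering."""
    total = sum(cluster_sizes)
    probs = [s / total for s in cluster_sizes if s > 0]
    if len(probs) <= 1:
        return 0.0  # a single non-empty cluster carries no balance
    entropy = -sum(p * math.log(p) for p in probs)
    return entropy / math.log(len(probs))  # divide by max possible entropy

even = balance_score([10, 10, 10])   # ~1.0, perfectly balanced 3-way split
skewed = balance_score([28, 1, 1])   # much closer to 0, one dominant cluster
```

Sweeping k in K-means and preferring the k with the highest such score (possibly combined with an inertia term) is one way to operationalize "balanced" cluster-count selection.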

Semi-Supervised Class Discovery [article]

Jeremy Nixon, Jeremiah Liu, David Berthelot
2020 arXiv   pre-print
We apply a new heuristic, class learnability, for deciding whether a class is worthy of addition to the training dataset.  ...  We show that our class discovery system can be successfully applied to vision and language, and we demonstrate the value of semi-supervised learning in automatically discovering novel classes.  ...  Measures occur at every 10th step of training. Table 4: Comparison with popular unsupervised approaches for class discovery.  ... 
arXiv:2002.03480v2 fatcat:fg6gjs75drhmbdmk5bkhu2s5wu

Towards Open Intent Discovery for Conversational Text [article]

Nikhita Vedula, Nedim Lipka, Pranav Maneriker, Srinivasan Parthasarathy
2019 arXiv   pre-print
Existing research on intent discovery models it as a classification task with a predefined set of known categories.  ...  To generalize beyond these preexisting classes, we define a new task of open intent discovery. We investigate how intent can be generalized to those not seen during training.  ...  We also obtain word-level GloVe embeddings [39] for each token from a pre-trained model that has been trained on Common Crawl, a giant corpus of web-crawled data.  ... 
arXiv:1904.08524v1 fatcat:ncwtn6s5evgu7lyl4d6u2r2ktu
Showing results 1 — 15 out of 26,829 results