Inorganic Materials Synthesis Planning with Literature-Trained Neural Networks [article]

Edward Kim, Zach Jensen, Alexander van Grootel, Kevin Huang, Matthew Staib, Sheshera Mysore, Haw-Shiuan Chang, Emma Strubell, Andrew McCallum, Stefanie Jegelka, Elsa Olivetti
2019 arXiv   pre-print
Starting from natural language text, we apply word embeddings from language models, which are fed into a named entity recognition model, upon which a conditional variational autoencoder is trained to generate  ...  We demonstrate that the model learns representations of materials corresponding to synthesis-related properties, and that the model's behavior complements existing thermodynamic knowledge.  ...  been adapted for materials science text [21] along with a pre-trained FastText word embedding model for materials science [22].  ...
arXiv:1901.00032v2 fatcat:e32cq5ibzvaqjbflxwunxbmpq4

Opportunities and challenges in developing deep learning models using electronic health records data: a systematic review

Cao Xiao, Edward Choi, Jimeng Sun
2018 JAMIA Journal of the American Medical Informatics Association  
We surveyed and analyzed multiple aspects of the 98 articles we found and identified the following analytics tasks: disease detection/classification, sequential prediction of clinical events, concept embedding  ...  We also discussed some special challenges arising from modeling EHR data and reviewed a few popular approaches. Finally, we summarized how performance evaluations were conducted for each task.  ...  They can capture the latent representation of the input data by learning their generative probability.  ... 
doi:10.1093/jamia/ocy068 pmid:29893864 fatcat:ne7weiw7xvc2lp7hfgkzltdnri

Transfer Topic Labeling with Domain-Specific Knowledge Base: An Analysis of UK House of Commons Speeches 1935-2014 [article]

Alexander Herzog, Peter John, Slava Jankin Mikhaylov
2018 arXiv   pre-print
Domain-specific codebooks form the knowledge-base for automated topic labeling.  ...  Most topic models use unsupervised methods and hence require the additional step of attaching meaningful labels to estimated topics.  ...  from a codebook capture the estimated latent dimensions.  ... 
arXiv:1806.00793v2 fatcat:zrwbfdmt3rbqtgqs2t53fwhpo4

Predicting Emerging Themes in Rapidly Expanding COVID-19 Literature with Dynamic Word Embedding Networks and Machine Learning [article]

Ridam Pal, Harshita Chopra, Raghav Awasthi, Harsh Bandhey, Aditya Nagori, Amogh Gulati, Ponnurangam Kumaraguru, Tavpritesh Sethi
2021 medRxiv   pre-print
Abstracts from more than 95,000 peer-reviewed articles from the WHO curated COVID-19 database were used to construct word embedding models. Named entity recognition was used to refine the terms.  ...  Visualization of the underlying word-embedding models allowed interactive querying to choose novel keywords and extractive models summarized the research relevant to the keyword, allowing faster knowledge  ...  We also acknowledge support from the Center of Excellence in Healthcare and the Center of Excellence in Artificial Intelligence at IIIT-Delhi.  ... 
doi:10.1101/2021.01.14.21249855 fatcat:zpa4evfpbbbehpc7rumnwy266q

Data-driven materials research enabled by natural language processing and information extraction

Elsa A. Olivetti, Jacqueline M. Cole, Edward Kim, Olga Kononova, Gerbrand Ceder, Thomas Yong-Jin Han, Anna M. Hiszpanski
2020 Applied Physics Reviews  
This involves recognizing the multi-word phrases in the chemical literature through unsupervised methods and then representing the phrases in the vocabulary [73]. Typically, word embedding is performed  ...  model, which is trained on full text [64]. Other word embedding models that have been used in the materials science domain include FastText [65], Embeddings from Language Models (ELMo) [66], and BERT.  ...
doi:10.1063/5.0021106 fatcat:75aap3lkjvhprleptl3bbp6w64

Discovering Key Topics From Short, Real-World Medical Inquiries via Natural Language Processing

A. Ziletti, C. Berns, O. Treichel, T. Weber, J. Liang, S. Kammerath, M. Schwaerzler, J. Virayah, D. Ruau, X. Ma, A. Mattern
2021 Frontiers in Computer Science  
Here, we combine biomedical word embeddings, non-linear dimensionality reduction, and hierarchical clustering to automatically discover key topics in real-world medical inquiries from customers.  ...  Unsupervised learning from unstructured medical text is mainly limited to the development of topic models based on latent Dirichlet allocation (LDA) (Blei et al., 2003).  ...  In this work, we combine biomedical word embeddings and unsupervised learning to discover topics from real-world medical inquiries received by Bayer™.  ...
doi:10.3389/fcomp.2021.672867 fatcat:2krbnjdohnfizfvafg4gxdiequ

Interpretable Word Embeddings via Informative Priors

Miriam Hurtado Bodell, Martin Arvidsson, Måns Magnusson
2019 Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP)  
However, lack of interpretability and the unsupervised nature of word embeddings have limited their use within computational social science and digital humanities.  ...  Word embeddings have demonstrated strong performance on NLP tasks.  ...  Word embeddings, a family of unsupervised methods for representing words as dense vectors (Mikolov et al., 2013b; Pennington et al., 2014), are one such development.  ...
doi:10.18653/v1/d19-1661 dblp:conf/emnlp/BodellAM19 fatcat:m5i54rb2yvfbvhflba72cthtwa

Discourse Analytics [chapter]

Carolyn Penstein Rose
2017 Handbook of Learning Analytics  
By allowing the of latent word classes, it is possible then to keep the blended within individual documents. Each latent word class is represented as a distribution of words.  ...  Embedded those of generations that came before us.  ... 
doi:10.18608/hla17.009 fatcat:5x6cxu4zwncfhg4iukcqvps3qm

Unsupervised Neural Categorization for Scientific Publications [chapter]

Keqian Li, Hanwen Zha, Yu Su, Xifeng Yan
2018 Proceedings of the 2018 SIAM International Conference on Data Mining  
., significant phrases mined from scientific publications, into continuous vectors, which capture concept semantics.  ...  Based on the concept similarity graph built from the concept embedding, we further embed concepts into a hidden category space, where the category information of concepts becomes explicit.  ...  In the first stage, by leveraging word embedding techniques [9], we learn similarity-driven embedding of concepts, which captures the semantics as well as the similarity of concepts.  ...
doi:10.1137/1.9781611975321.5 dblp:conf/sdm/LiZSY18 fatcat:mmphj4avavavriz4hqmhhwhyfa

No Target Function Classifier - Fast Unsupervised Text Categorization using Semantic Spaces

Tobias Eljasik-Swoboda, Michael Kaufmann, Matthias Hemmje
2018 Proceedings of the 7th International Conference on Data Science, Technology and Applications  
freshly emerging knowledge.  ...  We based our method on word embedding semantics with three different implementation approaches; each evaluated using the reuters21578 benchmark (Lewis, 2004), the MAUI citeulike180 benchmark (Medelyan  ...  The first variation is based on TFIDF and omits the usage of information derived from word embeddings.  ... 
doi:10.5220/0006847000350046 dblp:conf/data/Eljasik-Swoboda18 fatcat:o6f43geol5dexhqzia6vk35cri

Improving Arabic Cognitive Distortion Classification in Twitter using BERTopic

Fatima Alhaj, Ali Al-Haj, Ahmad Sharieh, Riad Jabri
2022 International Journal of Advanced Computer Science and Applications  
These encouraging results suggest that using latent topic distribution, obtained from the BERTopic technique, can improve the classifier's ability to distinguish between different CD categories.  ...  It employs two types of document representations and performs averaging and concatenation to produce contextual topic embeddings.  ...  This type of embedding is very powerful for language understanding and can capture the semantic relations between words. 2) Reduce the dimensionality of the embedding vectors to create a lower-dimensional  ... 
doi:10.14569/ijacsa.2022.0130199 fatcat:nsajjds525ealknxacjghz3aja

SciNER: Extracting Named Entities from Scientific Literature [chapter]

Zhi Hong, Roselyne Tchoua, Kyle Chard, Ian Foster
2020 Lecture Notes in Computer Science  
Based on bidirectional LSTM networks, our model combines word embeddings, subword embeddings, and external knowledge (from DBpedia) to boost its accuracy.  ...  The automated extraction of claims from scientific papers via computer is difficult due to the ambiguity and variability inherent in natural language.  ...  Word embeddings have been shown to be effective at capturing latent information in scientific publications, including in materials science [14, 30].  ...
doi:10.1007/978-3-030-50417-5_23 fatcat:a2m452npzvggda7r5xyh76g2vi

Evaluating distributed word representations for capturing semantics of biomedical concepts

MUNEEB TH, Sunil Sahu, Ashish Anand
2015 Proceedings of BioNLP 15  
Recently there is a surge in interest in learning vector representations of words using huge corpus in unsupervised manner.  ...  Such word vector representations, also known as word embedding, have been shown to improve the performance of machine learning models in several NLP tasks.  ...  Materials and Methods Corpus Data and Preprocessing PubMed Central® (PMC) is a repository of biomedical and life sciences journal literature at the U.S.  ...
doi:10.18653/v1/w15-3820 dblp:conf/bionlp/ThSA15 fatcat:n5q34a3gujfmnlhk4nrg6py4cm

Combining word embeddings to extract chemical and drug entities in biomedical literature

Pilar López-Úbeda, Manuel Carlos Díaz-Galiano, L. Alfonso Ureña-López, M. Teresa Martín-Valdivia
2021 BMC Bioinformatics  
Conclusion On the one hand, the combination of word embeddings helps to improve the recognition of chemicals and drugs in the biomedical literature.  ...  For this purpose, we propose a combination of word embeddings in order to improve the results obtained in the PharmaCoNER challenge.  ...  Contextual word embeddings Contextualized word embeddings [53] capture latent syntactic-semantic information that goes beyond standard word embeddings.  ... 
doi:10.1186/s12859-021-04188-3 pmid:34920708 pmcid:PMC8684055 fatcat:iwdqlvquyfhwvbfpz4chqpetym

Interpretable Word Embeddings via Informative Priors [article]

Miriam Hurtado Bodell, Martin Arvidsson, Måns Magnusson
2019 arXiv   pre-print
However, lack of interpretability and the unsupervised nature of word embeddings have limited their use within computational social science and digital humanities.  ...  Word embeddings have demonstrated strong performance on NLP tasks.  ...  Experiments Our main empirical concern is how well the proposed priors can capture meaningful latent dimensions.  ... 
arXiv:1909.01459v1 fatcat:53ancapz4bdbrjf3d7qxf5ewzu