12,935 Hits in 4.6 sec

Marian: Fast Neural Machine Translation in C++

Marcin Junczys-Dowmunt, Roman Grundkiewicz, Tomasz Dwojak, Hieu Hoang, Kenneth Heafield, Tom Neckermann, Frank Seide, Ulrich Germann, Alham Fikri Aji, Nikolay Bogoychev, André F. T. Martins, Alexandra Birch
2018 Proceedings of ACL 2018, System Demonstrations  
We present Marian, an efficient and self-contained Neural Machine Translation framework with an integrated automatic differentiation engine based on dynamic computation graphs.  ...  Government is authorized to reproduce and distribute reprints for governmental purposes notwithstanding any copyright annotation therein.  ...  The language models follow the decoder architecture and can be used for transfer learning, weighted decode-time ensembling and re-ranking.  ... 
doi:10.18653/v1/p18-4020 dblp:conf/acl/Junczys-Dowmunt18 fatcat:vl5hb5oitrcb5nv25w54qaqnqm

Marian: Fast Neural Machine Translation in C++

Marcin Junczys-Dowmunt, Roman Grundkiewicz, Tomasz Dwojak, Hieu Hoang, Kenneth Heafield, Tom Neckermann, Frank Seide, Ulrich Germann, Alham Fikri Aji, Nikolay Bogoychev, Andre F. T. Martins, Alexandra Birch
2018 Zenodo  
We present Marian, an efficient and self-contained Neural Machine Translation framework with an integrated automatic differentiation engine based on dynamic computation graphs.  ...  Government is authorized to reproduce and distribute reprints for governmental purposes notwithstanding any copyright annotation therein.  ...  The language models follow the decoder architecture and can be used for transfer learning, weighted decode-time ensembling and re-ranking.  ... 
doi:10.5281/zenodo.2551642 fatcat:52weuuur5fb73nvcybgog6a7ei

Marian: Fast Neural Machine Translation in C++ [article]

Marcin Junczys-Dowmunt, Roman Grundkiewicz, Tomasz Dwojak, Hieu Hoang, Kenneth Heafield, Tom Neckermann, Frank Seide, Ulrich Germann, Alham Fikri Aji, Nikolay Bogoychev, André F. T. Martins, Alexandra Birch
2018 arXiv   pre-print
We present Marian, an efficient and self-contained Neural Machine Translation framework with an integrated automatic differentiation engine based on dynamic computation graphs.  ...  Government is authorized to reproduce and distribute reprints for governmental purposes notwithstanding any copyright annotation therein.  ...  The language models follow the decoder architecture and can be used for transfer learning, weighted decode-time ensembling and re-ranking.  ... 
arXiv:1804.00344v3 fatcat:ieankv2jh5asbpmnzb5lgkwioq
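The three Marian records above mention an automatic differentiation engine built on dynamic computation graphs. As a rough gloss of that general technique, here is a minimal reverse-mode autodiff sketch; it is written in Python for brevity (Marian itself is C++), and every name in it is illustrative rather than Marian's actual API.

```python
# Minimal reverse-mode autodiff over a dynamically built graph.
# Illustrative sketch only; not Marian's engine.

class Node:
    def __init__(self, value, parents=()):
        self.value = value      # scalar value computed at this node
        self.parents = parents  # (parent_node, local_gradient) pairs
        self.grad = 0.0         # accumulated d(output)/d(this node)

def add(a, b):
    return Node(a.value + b.value, parents=((a, 1.0), (b, 1.0)))

def mul(a, b):
    return Node(a.value * b.value, parents=((a, b.value), (b, a.value)))

def backward(output):
    # Topologically order the graph so each node's gradient is fully
    # accumulated before being pushed to its parents.
    order, seen = [], set()
    def visit(node):
        if id(node) not in seen:
            seen.add(id(node))
            for parent, _ in node.parents:
                visit(parent)
            order.append(node)
    visit(output)
    output.grad = 1.0
    for node in reversed(order):
        for parent, local_grad in node.parents:
            parent.grad += node.grad * local_grad

# The graph is created on the fly as operations execute ("dynamic"):
x, w = Node(2.0), Node(3.0)
z = add(mul(x, w), w)   # z = x*w + w
backward(z)
print(w.grad)           # dz/dw = x + 1 = 3.0
```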

AdaBERT: Task-Adaptive BERT Compression with Differentiable Neural Architecture Search [article]

Daoyuan Chen, Yaliang Li, Minghui Qiu, Zhen Wang, Bofang Li, Bolin Ding, Hongbo Deng, Jun Huang, Wei Lin, Jingren Zhou
2021 arXiv   pre-print
Motivated by the necessity and benefits of task-oriented BERT compression, we propose a novel compression method, AdaBERT, that leverages differentiable Neural Architecture Search to automatically compress  ...  Large pre-trained language models such as BERT have shown their effectiveness in various natural language processing tasks.  ...  That is, the search target is a cell, and the network architecture is built by stacking the searched cell over K_max predefined layers, where the cell-structure parameter α_c is shared across all layers.  ... 
arXiv:2001.04246v2 fatcat:hma3lfdrqnf77axtib26qxafpq
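The snippet above describes a DARTS-style search in which one cell-structure parameter is shared by every stacked layer. A minimal sketch of that shared-α idea follows; the candidate operations and dimensions are hypothetical simplifications, not AdaBERT's actual search space.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Candidate operations for one edge of the searched cell (toy set).
CANDIDATE_OPS = [
    lambda dim: nn.Identity(),
    lambda dim: nn.Linear(dim, dim),
    lambda dim: nn.Sequential(nn.Linear(dim, dim), nn.ReLU()),
]

class MixedOp(nn.Module):
    """Blends candidate ops with softmax weights over a shared alpha."""
    def __init__(self, dim, alpha):
        super().__init__()
        self.ops = nn.ModuleList(op(dim) for op in CANDIDATE_OPS)
        self.alpha = alpha  # the SAME tensor in every layer

    def forward(self, x):
        weights = F.softmax(self.alpha, dim=-1)
        return sum(w * op(x) for w, op in zip(weights, self.ops))

class StackedCells(nn.Module):
    """One cell-structure parameter alpha shared by all K_max layers:
    optimizing alpha picks a single cell that every layer reuses."""
    def __init__(self, dim=64, k_max=4):
        super().__init__()
        self.alpha = nn.Parameter(torch.zeros(len(CANDIDATE_OPS)))
        self.layers = nn.ModuleList(
            MixedOp(dim, self.alpha) for _ in range(k_max)
        )

    def forward(self, x):
        for layer in self.layers:
            x = layer(x)
        return x

model = StackedCells()
out = model(torch.randn(8, 64))   # alpha receives gradients from all layers
```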

Strategy of the Negative Sampling for Training Retrieval-Based Dialogue Systems [article]

Aigul Nugmanova, Andrei Smirnov, Galina Lavrentyeva, Irina Chernykh
2018 arXiv   pre-print
The article describes a new approach for improving the quality of automated dialogue systems for customer support service.  ...  The results obtained for the implemented systems and reported in this paper confirm a significant quality improvement for automated dialogue systems when using negative responses from transformed  ...  Table 2 presents the CR and UR metrics (as in Section 5.2) for three models: the Dual Encoder with a GRU cell (DE GRU), embeddings from the Dual Encoder with a GRU cell (DE emb GRU), and embeddings from the Dual  ... 
arXiv:1811.09785v1 fatcat:fbuiiiebyvcxhmedlfwxqlnuui
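The entry above pairs a GRU-based Dual Encoder with a negative-sampling strategy. The sketch below shows one common variant, in-batch negatives, where every other response in the batch acts as a negative for a given context; it is a generic illustration, not necessarily the paper's exact sampling scheme.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class DualEncoder(nn.Module):
    """Two GRU towers score (context, response) pairs by dot product."""
    def __init__(self, vocab_size, dim=128):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, dim)
        self.context_gru = nn.GRU(dim, dim, batch_first=True)
        self.response_gru = nn.GRU(dim, dim, batch_first=True)

    def encode(self, gru, tokens):
        _, h = gru(self.embed(tokens))   # final hidden state
        return h.squeeze(0)              # (batch, dim)

    def forward(self, context_tokens, response_tokens):
        c = self.encode(self.context_gru, context_tokens)
        r = self.encode(self.response_gru, response_tokens)
        # Score every context against every response in the batch;
        # off-diagonal entries serve as sampled negatives.
        scores = c @ r.t()               # (batch, batch)
        targets = torch.arange(scores.size(0))  # diagonal = positives
        return F.cross_entropy(scores, targets)

model = DualEncoder(vocab_size=1000)
ctx = torch.randint(0, 1000, (8, 20))
rsp = torch.randint(0, 1000, (8, 12))
loss = model(ctx, rsp)
```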

Natural Language Interface for Databases Using a Dual-Encoder Model

Ionel-Alexandru Hosu, Radu Cristian Alexandru Iacob, Florin Brad, Stefan Ruseti, Traian Rebedea
2018 International Conference on Computational Linguistics  
We propose a sketch-based two-step neural model for generating structured queries (SQL) based on a user's request in natural language.  ...  Then, a second network, designed as a dual-encoder SEQ2SEQ model using both the text query and the previously obtained sketch, is employed to generate the final SQL query.  ...  We employed grid search to establish the number of LSTM cells in the hidden layers and the size of the word embeddings (size 500).  ... 
dblp:conf/coling/HosuIBRR18 fatcat:mwnlguzrgfa73au3svokrwibna
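The snippet mentions tuning the LSTM width and embedding size by grid search. A minimal grid-search loop of that shape is sketched below; `train_and_score` is a hypothetical stand-in for training the seq2seq model and returning a dev-set score, and the candidate values are made up.

```python
from itertools import product

def train_and_score(hidden_size, embed_size):
    # Placeholder objective; a real run would train the model and
    # return its dev-set accuracy for this configuration.
    return -((hidden_size - 500) ** 2 + (embed_size - 500) ** 2)

grid = {"hidden_size": [250, 500, 1000], "embed_size": [300, 500]}

# Evaluate every combination and keep the best-scoring configuration.
best = max(
    (dict(zip(grid, values)) for values in product(*grid.values())),
    key=lambda cfg: train_and_score(**cfg),
)
print(best)   # -> {'hidden_size': 500, 'embed_size': 500}
```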

GPNAS: A Neural Network Architecture Search Framework Based on Graphical Predictor [article]

Dige Ai, Hong Zhang
2021 arXiv   pre-print
... child model.  ...  In practice, the problems encountered in Neural Architecture Search (NAS) training are rarely simple; they are often a combination of difficulties (wrong compensation estimation, the curse of dimensionality,  ...  CONCLUSION: In this paper, we design a better framework for NAS, using a dual-accelerator architecture-search submodel.  ... 
arXiv:2103.11820v6 fatcat:fuvvkfzuq5fnhndlijd43ulamq

Music Generation with Temporal Structure Augmentation [article]

Shakeel Raja
2020 arXiv   pre-print
An RNN architecture with LSTM cells is trained on the Nottingham folk music dataset in a supervised sequence learning setup, following a Music Language Modelling approach, and then applied to generation  ...  Our experiments show an improved prediction performance for both types of annotation.  ...  The following ranges are used in the grid search: Number of Hidden  ...  Baseline Model (BL): We train and test the dual SoftMax LSTM architecture without any feature augmentation as the baseline model.  ... 
arXiv:2004.10246v1 fatcat:syckzc3ntjdkbjr75o3lpju4qe
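The entry above follows a Music Language Modelling setup: an LSTM trained to predict the next symbolic music token. A minimal next-token model of that kind is sketched below; vocabulary size and dimensions are arbitrary, and this is not the paper's exact dual-SoftMax architecture.

```python
import torch
import torch.nn as nn

class MusicLM(nn.Module):
    """LSTM language model over symbolic music tokens."""
    def __init__(self, vocab_size=128, dim=256):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, dim)
        self.lstm = nn.LSTM(dim, dim, batch_first=True)
        self.head = nn.Linear(dim, vocab_size)

    def forward(self, tokens):
        h, _ = self.lstm(self.embed(tokens))
        return self.head(h)   # next-token logits at every step

model = MusicLM()
tokens = torch.randint(0, 128, (4, 32))   # batch of event sequences
logits = model(tokens[:, :-1])            # predict token t+1 from prefix
loss = nn.functional.cross_entropy(
    logits.reshape(-1, 128), tokens[:, 1:].reshape(-1)
)
```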

MaCAR: Urban Traffic Light Control via Active Multi-agent Communication and Action Rectification

Zhengxu Yu, Shuxian Liang, Long Wei, Zhongming Jin, Jianqiang Huang, Deng Cai, Xiaofei He, Xian-Sheng Hua
2020 Proceedings of the Twenty-Ninth International Joint Conference on Artificial Intelligence  
That is, the search target is a cell, and the network architecture is built by stacking the searched cell over K_max predefined layers, where the cell-structure parameter α_c is shared across all layers.  ...  Search Space A: Most neural architecture search methods focus on a cell-based micro search space [Zoph and Le, 2017; Pham et al., 2018; Liu et al., 2019a].  ... 
doi:10.24963/ijcai.2020/341 dblp:conf/ijcai/ChenLQWLDDHLZ20 fatcat:6rgljj7ab5bzxnca7n7bgqec4a

AI Research Associate for Early-Stage Scientific Discovery [article]

Morad Behandish, John Maxwell III, Johan de Kleer
2022 arXiv   pre-print
We present an AI research associate for early-stage scientific discovery based on (a) a novel minimally-biased ontology for physics-based modeling that is context-aware, interpretable, and generalizable  ...  across classical and relativistic physics; (b) automatic search for viable and parsimonious hypotheses, represented at a high-level (via domain-agnostic constructs) with built-in invariants, e.g., postulated  ...  For example, an inner-oriented curve (1−cell, σ 1 ) sitting in primary space, along which temperature variations are measured, is dual to an outer-oriented surface (2−cell, σ 2 ) sitting in secondary space  ... 
arXiv:2202.03199v1 fatcat:la53pmqfifhlhowcqfpbe3nbqe
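The snippet's curve-to-surface pairing is the standard dimension count for a primal-dual cell complex. The following is my gloss of that relation, not a formula quoted from the paper:

```latex
% In an n-dimensional primal--dual cell complex, each primal k-cell
% \sigma^k is paired with a dual (n-k)-cell \star\sigma^k.
\[
  \dim\!\left(\star \sigma^{k}\right) = n - k,
  \qquad n = 3,\; k = 1 \;\Rightarrow\; \dim\!\left(\star \sigma^{1}\right) = 2,
\]
% so a primal 1-cell (a curve, along which temperature variations are
% measured) is dual to a 2-cell (a surface), as in the snippet above.
```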

Understanding Image and Text Simultaneously: a Dual Vision-Language Machine Comprehension Task [article]

Nan Ding and Sebastian Goodman and Fei Sha and Radu Soricut
2016 arXiv   pre-print
We introduce a new multi-modal task for computer systems, posed as a combined vision-language comprehension challenge: identifying the most suitable text describing a scene, given several similar options  ...  The paper makes several contributions: an effective and extensible mechanism for generating decoys from (human-created) image captions; an instance of applying this mechanism, yielding a large-scale machine  ...  The output of each unit cell of a Vec2seq model (both on the encoding side and the decoding side) can be fed into an FFNN architecture for binary classification.  ... 
arXiv:1612.07833v1 fatcat:qv4ogreomjdn3cubdoim5u73im

Stencil computation optimization and auto-tuning on state-of-the-art multicore architectures

K. Datta, M. Murphy, V. Volkov, S. Williams, J. Carter, L. Oliker, D. Patterson, J. Shalf, K. Yelick
2008 2008 SC - International Conference for High Performance Computing, Networking, Storage and Analysis  
We develop a number of effective optimization strategies, and build an auto-tuning environment that searches over our optimizations and their parameters to minimize runtime, while maximizing performance  ...  Finally, we present several key insights into the architectural tradeoffs of emerging multicore designs and their implications on scientific algorithm development.  ...  Acknowledgments We would like to express our gratitude to IBM for access to their newest Cell blades, as well as Sun and NVIDIA for their machine donations.  ... 
doi:10.1109/sc.2008.5222004 dblp:conf/sc/DattaMVWCOPSY08 fatcat:anm64gcbzrfojaunkrc4wf2wy4
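The stencil entry above describes an auto-tuning environment that searches over optimizations and their parameters to minimize runtime. The toy tuner below illustrates only that search loop, timing one cache-blocking parameter of a 2D Jacobi-style stencil; real tuners (including the paper's) explore far larger spaces of optimizations.

```python
import time

N = 256
grid = [[float(i + j) for j in range(N)] for i in range(N)]

def blocked_stencil(block):
    """5-point stencil swept in (block x block) tiles."""
    out = [[0.0] * N for _ in range(N)]
    for ii in range(1, N - 1, block):
        for jj in range(1, N - 1, block):
            for i in range(ii, min(ii + block, N - 1)):
                for j in range(jj, min(jj + block, N - 1)):
                    out[i][j] = 0.25 * (grid[i-1][j] + grid[i+1][j]
                                        + grid[i][j-1] + grid[i][j+1])
    return out

# Exhaustively time each candidate block size and keep the fastest.
best = None
for block in (8, 16, 32, 64):
    t0 = time.perf_counter()
    blocked_stencil(block)
    elapsed = time.perf_counter() - t0
    if best is None or elapsed < best[1]:
        best = (block, elapsed)
print("fastest block size:", best[0])
```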

Exploring Neural Architecture Search Space via Deep Deterministic Sampling

Keith G. Mills, Mohammad Salameh, Di Niu, Fred X. Han, Seyed Saeed Changiz Rezaei, Hengshuai Yao, Wei Lu, Shuo Lian, Shangling Jui
2021 IEEE Access  
Experimental results for CIFAR-10 and CIFAR-100 on the DARTS search space show that DDAS can depict, in a single search, the accuracy-FLOPs (or model size) Pareto frontier, which outperforms random sampling  ...  Recent developments in Neural Architecture Search (NAS) resort to training the supernet of a predefined search space with weight sharing to speed up architecture evaluation.  ...  Points correspond to architectures present on the search Pareto frontier for a given search scheme. All models were trained using 20 cells, including 18 normal cells and 2 reduction cells.  ... 
doi:10.1109/access.2021.3101975 fatcat:5h46jvt33bcp5fmqklfrzerzmy
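As a gloss on what "depicting the accuracy-FLOPs Pareto frontier" means above, the sketch below filters a set of (accuracy, FLOPs) points down to the non-dominated ones; the sample values are invented for illustration.

```python
def pareto_frontier(points):
    """Keep (accuracy, flops) points not dominated by any other point:
    a point is dominated if another has higher accuracy at equal or
    lower FLOPs."""
    frontier = []
    for acc, flops in sorted(points, key=lambda p: p[1]):  # by FLOPs
        if not frontier or acc > frontier[-1][0]:
            frontier.append((acc, flops))
    return frontier

samples = [(93.1, 450), (92.0, 600), (92.5, 400), (94.0, 520), (94.2, 800)]
print(pareto_frontier(samples))
# -> [(92.5, 400), (93.1, 450), (94.0, 520), (94.2, 800)]
# (92.0, 600) is dropped: (94.0, 520) dominates it.
```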

Dual Learning for Semi-Supervised Natural Language Understanding [article]

Su Zhu, Ruisheng Cao, Kai Yu
2020 arXiv   pre-print
In this work, we introduce a dual task of NLU, semantic-to-sentence generation (SSG), and propose a new framework for semi-supervised NLU with the corresponding dual model.  ...  The framework is composed of dual pseudo-labeling and a dual learning method, which enables an NLU model to make full use of data (labeled and unlabeled) through a closed loop of the primal and dual tasks  ...  Fig. 3: The proposed architecture for the dual task of NLU, which comprises an encoder and a decoder.  ... 
arXiv:2004.12299v1 fatcat:ymineuzvhfdfroz7vgasqjzypm
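The closed loop described above has the primal model (NLU: sentence to semantic frame) and the dual model (SSG: frame to sentence) pseudo-labeling unlabeled data for each other. The sketch below makes that loop concrete with trivial dictionary-backed stand-ins for the two models; it is a structural illustration, not the paper's training procedure.

```python
class TableModel:
    """Memorizes input->output pairs; stands in for a neural model."""
    def __init__(self):
        self.table = {}
    def train(self, pairs):
        self.table.update(pairs)
    def predict(self, x):
        return self.table.get(x, "<unk>")

def dual_pseudo_labeling_round(nlu, ssg, unlabeled_sentences, unlabeled_frames):
    # Primal direction: NLU pseudo-labels raw sentences; the resulting
    # pairs become extra training data for BOTH directions.
    pairs = [(s, nlu.predict(s)) for s in unlabeled_sentences]
    nlu.train(pairs)
    ssg.train([(f, s) for s, f in pairs])
    # Dual direction: SSG realizes unlabeled semantic frames as
    # sentences, closing the loop.
    pairs = [(ssg.predict(f), f) for f in unlabeled_frames]
    nlu.train(pairs)
    ssg.train([(f, s) for s, f in pairs])

nlu, ssg = TableModel(), TableModel()
nlu.train([("book a flight", "intent=book_flight")])
ssg.train([("intent=book_flight", "book a flight")])
dual_pseudo_labeling_round(nlu, ssg, ["book a flight"], ["intent=book_flight"])
```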

NAS-Unet: Neural Architecture Search for Medical Image Segmentation

Yu Weng, Tianbao Zhou, Yuejie Li, Xiaoyu Qiu
2019 IEEE Access  
The architectures of DownSC and UpSC are updated simultaneously by a differential architecture strategy during the search stage.  ...  In this paper, we design three types of primitive operation sets on the search space to automatically find two cell architectures, DownSC and UpSC, for semantic image segmentation, especially medical image segmentation  ...  Recently, most works focus on searching CNN architectures for image classification, and few on RNNs for language tasks.  ... 
doi:10.1109/access.2019.2908991 fatcat:cw5knncj3fecxhbyeatplyj4ry
Showing results 1 — 15 out of 12,935 results