248 Hits in 10.0 sec

Large-Scale Visual Search with Binary Distributed Graph at Alibaba

Kang Zhao, Pan Pan, Yun Zheng, Yanhao Zhang, Changxu Wang, Yingya Zhang, Yinghui Xu, Rong Jin
2019 Proceedings of the 28th ACM International Conference on Information and Knowledge Management - CIKM '19  
For a deployed visual search system with several billions of online images in total, building a billion-scale offline graph in hours is essential, which is almost unachievable by most existing methods.  ...  In this paper, we propose a novel algorithm called Binary Distributed Graph to solve this problem.  ...  At Alibaba, a typical scenario of ANNS is visual search. It has been studied for many years, and gives birth to a successful intelligence E-commercial application named "Pailitao".  ... 
doi:10.1145/3357384.3357834 dblp:conf/cikm/ZhaoPZZWZXJ19 fatcat:yvqezafoz5appoec2a3eelwuqu

Big Brother: A Drop-In Website Interaction Logging Service

Harrisen Scells, Jimmy, Guido Zuccon
2021 Proceedings of the 44th International ACM SIGIR Conference on Research and Development in Information Retrieval  
We further demonstrate the ability for Big Brother to scale to very large user studies through benchmarking experiments.  ...  We have made the source code and releases for Big Brother available for download at https://github.com/hscells/bigbro.  ...  However, their events contain fewer components than ours, and it is unlikely that this older technology would scale for large user studies of today.  ... 
doi:10.1145/3404835.3462781 fatcat:k4mqsyqgjffttnmt6my7okyvu4

On the Effectiveness of Sampled Softmax Loss for Item Recommendation [article]

Jiancan Wu, Xiang Wang, Xingyu Gao, Jiawei Chen, Hongcheng Fu, Tianyu Qiu, Xiangnan He
2022 arXiv   pre-print
Learning objectives of recommender models remain largely unexplored.  ...  and "What are the conceptual advantages of sampled softmax loss, as compared with the prevalent losses?", to the best of our knowledge.  ...  Contrastive Learning for Debiased Candidate Generation in Large-Scale Recommender Systems.  ...  2016. Session-based Recommendations with Recurrent Neural Networks. In  ... 
arXiv:2201.02327v1 fatcat:xxcxorondvbulge3q6pgfbgqee

Cloud-native database systems at Alibaba

Feifei Li
2019 Proceedings of the VLDB Endowment  
At Alibaba, we have explored a suite of technologies to design cloud-native database systems.  ...  We will report key technologies and lessons learned to highlight the technical challenges and opportunities for cloud-native database systems at Alibaba.  ...  relational database, auto scale-out, cross-datacenter availability, high concurrency; GraphDB: graph database; MongoDB: document database; HBase + X-Pack: distributed wide-column database, multi-model  ... 
doi:10.14778/3352063.3352141 fatcat:wlwyvm7pzne7nf4wokn2jud7qm

Recent Advance in Content-based Image Retrieval: A Literature Survey [article]

Wengang Zhou, Houqiang Li, Qi Tian
2017 arXiv   pre-print
With the ignorance of visual content as a ranking clue, methods with text search techniques for visual retrieval may suffer inconsistency between the text words and visual content.  ...  The explosive increase and ubiquitous accessibility of visual data on the Web have led to the prosperity of research activity in image search or retrieval.  ...  Image search aims to retrieve relevant visual documents to a textual or visual query efficiently from a large-scale visual corpus.  ... 
arXiv:1706.06064v2 fatcat:m52xwsw5pzfzdbxo5o6dye2gde

CatGCN: Graph Convolutional Networks with Categorical Node Features [article]

Weijian Chen, Fuli Feng, Qifan Wang, Xiangnan He, Chonggang Song, Guohui Ling, Yongdong Zhang
2021 arXiv   pre-print
We then refine the enhanced initial node representations with the neighborhood aggregation-based graph convolution.  ...  Recent studies on Graph Convolutional Networks (GCNs) reveal that the initial node representations (i.e., the node representations before the first-time graph convolution) largely affect the final model  ...  Large-scale graph.  ... 
arXiv:2009.05303v3 fatcat:3q3fsxue3bhpnfstu7cwmtm4vm

Exploration-Exploitation Motivated Variational Auto-Encoder for Recommender Systems [article]

Yizi Zhang, Meimei Liu
2021 arXiv   pre-print
A hierarchical latent space model is utilized to learn the personalized item embedding for a given user, along with the population distribution of all user subgraphs.  ...  We retain users and items with at least ten interactions to ensure data quality. • Alibaba: This is a large user behavior dataset from Alibaba e-commerce platform.  ...  We optimize all models with the Adam optimizer, where the batch size is fixed at 512. We apply a grid search for hyper-parameters: the learning rate is tuned to 0.001 and the dropout ratio is 0.1.  ... 
arXiv:2006.03573v4 fatcat:4jeixnidqna7be27e67qzwri6i

Visual Search at Pinterest [article]

Yushi Jing and David Liu and Dmitry Kislyuk and Andrew Zhai and Jiajing Xu and Jeff Donahue and Sarah Tavel
2017 arXiv   pre-print
a cost-effective, large-scale visual search system with widely available tools.  ...  We also demonstrate, through a comprehensive set of live experiments at Pinterest, that content recommendation powered by visual search improve user engagement.  ...  or an academic lab to build a large-scale visual search system using a combination of non-proprietary tools.  ... 
arXiv:1505.07647v3 fatcat:or56oerg2zdqdekxc6bzwqtuna

Visual Search at Pinterest

Yushi Jing, David Liu, Dmitry Kislyuk, Andrew Zhai, Jiajing Xu, Jeff Donahue, Sarah Tavel
2015 Proceedings of the 21th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining - KDD '15  
a cost-effective, large-scale visual search system.  ...  We also demonstrate, through a comprehensive set of live experiments at Pinterest, that content recommendation powered by visual search improves user engagement.  ...  or an academic lab to build a large-scale visual search system using a combination of non-proprietary tools.  ... 
doi:10.1145/2783258.2788621 dblp:conf/kdd/JingLKZXDT15 fatcat:7vpuuu4uf5dl5nm3ilh2gi4ynu

Neuron-level Structured Pruning using Polarization Regularizer

Tao Zhuang, Zhixuan Zhang, Yuheng Huang, Xiaoyi Zeng, Kai Shuang, Xiang Li
2020 Neural Information Processing Systems  
A more reasonable pruning method is to only suppress unimportant neurons (with 0 scaling factors), and simultaneously keep important neurons intact (with larger scaling factor).  ...  The reasoning is that neurons with smaller scaling factors have weaker influence on network output. A scaling factor close to 0 actually suppresses a neuron.  ...  Acknowledgments and Disclosure of Funding This work is supported by Alibaba Group. The authors would like to thank the Search and Recommendation Division of Alibaba Group for support of this work.  ... 
dblp:conf/nips/ZhuangZHZSL20 fatcat:jefbbwc6frfhnepcaj4wmjm2ze

Billion-Scale Pretraining with Vision Transformers for Multi-Task Visual Representations [article]

Josh Beal, Hao-Yu Wu, Dong Huk Park, Andrew Zhai, Dmitry Kislyuk
2021 arXiv   pre-print
Large-scale pretraining of visual representations has led to state-of-the-art performance on a range of benchmark computer vision tasks, yet the benefits of these techniques at extreme scale in complex  ...  to replace the traditional convolutional backbone, with insights into both system and performance improvements, especially at 1B+ image scale.  ...  Related Work: Visual Search Systems. Visual search has been widely adopted in social and ecommerce applications, including Facebook [30], Pinterest [41], eBay [37], Google, Microsoft [15], Alibaba  ... 
arXiv:2108.05887v1 fatcat:gm5lzf4pkrg3zez7unuq7epp3a

A literature review on NoSQL database for big data processing

Md. Razu Ahmed, Mst. Arifa Khatun, Md. Asraf Ali, Kenneth Sundaraj
2018 International Journal of Engineering & Technology  
We specifically searched for two keywords ("NoSQL" and "Big Data") to find the articles.  ...  was to literature review on the NoSQL Database for Big Data processing including the structural issues and the real-time data mining techniques to extract the estimated valuable information. Methods: We searched  ...  But, considering the Big Data technology, we are able to store and use these kind of large data sets with the help of distributed systems, where parts of the data is stored in different locations and brought  ... 
doi:10.14419/ijet.v7i2.12113 fatcat:avhgmkeqxvan3ea6noywventhq

A Survey of Techniques for Constructing Chinese Knowledge Graphs and Their Applications

Tianxing Wu, Guilin Qi, Cheng Li, Meng Wang
2018 Sustainability  
At the same time, the accumulated experience of China in developing knowledge graphs is also a good reference to develop non-English knowledge graphs.  ...  In recent years, knowledge graph has been widely applied in different kinds of applications, such as semantic search, question answering, knowledge management and so on.  ...  Lytras who gave a lot of suggestions on how knowledge graphs impact the implementation of OBOR. Conflicts of Interest: The authors declare no conflict of interest.  ... 
doi:10.3390/su10093245 fatcat:wrqgfkwfanfejnffn6nyr4nqbq

Finding Unknown Malice in 10 Seconds: Mass Vetting for New Threats at the Google-Play Scale

Kai Chen, Peng Wang, Yeonjoon Lee, XiaoFeng Wang, Nan Zhang, Heqing Huang, Wei Zou, Peng Liu
2015 USENIX Security Symposium  
Based upon this observation, we developed a new technique, called MassVet, for vetting apps at a massive scale, without knowing what malware looks like and how it behaves.  ...  Our study shows that the technique can vet an app within 10 seconds at a low false detection rate.  ...  To detect unknown malware at a large scale, we come up with a design illustrated in Figure 1.  ... 
dblp:conf/uss/0012WLWZHZ015 fatcat:r4b7pavquzdk5dfnalew52q7xi

ICASSP 2020 Table of Contents

2020 ICASSP 2020 - 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)  
Wildes, York University, Taiwan; Chia-Wen Lin, National Tsing Hua University, Taiwan SS-L3.2: DEFENDING GRAPH CONVOLUTIONAL NETWORKS AGAINST  ...  Group U.S., United States; Zhengzhi Ma, University of Southern California, United States; Liang Sun, Alibaba Group U.S., United States IDSP-L2: INDUSTRY SESSION ON LARGE-SCALE DISTRIBUTED LEARNING STRATEGIES  ...  GRAPH NETWORK Da Chen, Xiang Wu, Alibaba Group, China; Jianfeng Dong, Zhejiang Gongshang University, China; Yuan He, Hui Xue, Feng Mao, Alibaba  ... 
doi:10.1109/icassp40776.2020.9054406 fatcat:6h7hh2hxhne4pbmphharu2et2m