OGB-LSC: A Large-Scale Challenge for Machine Learning on Graphs [article]

Weihua Hu, Matthias Fey, Hongyu Ren, Maho Nakata, Yuxiao Dong, Jure Leskovec
2021 arXiv   pre-print
The OGB-LSC datasets, baseline code, and all the information about the KDD Cup are available at .  ...  We summarize the common techniques used by the winning solutions and highlight the current best practices in large-scale graph ML.  ...  OAC-1835598 (CINES), OAC-1934578 (HDR), CCF-1918940 (Expeditions) (Paszke et al., 2017), PYTORCH GEOMETRIC (Fey and Lenssen, 2019), DGL (Wang et al., 2019), and DGL-KE (Zheng et al., 2020).  ... 
arXiv:2103.09430v3 fatcat:3xew2eoggfaohjzenvb7fo4ofy

The Application of Machine Learning Techniques for Predicting Results in Team Sport: A Review [article]

Rory Bunker, Nagoya Institute of Technology, Japan
2019 arXiv   pre-print
In this paper, we provide a review of studies that have used ML for predicting results in team sport, covering studies from 1996 to 2019.  ...  Our study considers accuracies that have been achieved across different sports and explores the notion that outcomes of some team sports could be inherently more difficult to predict than others.  ...  This property can offer utility to practitioners beyond just the ability to make predictions, but also in providing insight to coaches, management and athletes.  ... 
arXiv:1912.11762v1 fatcat:eupyo7xcefaytmtgt5cqvb4zja

Crowdsourcing Machine Intelligence Solutions to Accelerate Biomedical Science: Lessons learned from a machine intelligence ideation contest to improve the prediction of 3D domain swapping [article]

Yash Shah, Deepak Sharma, Rakesh Sharma, Sourav Singh, Hrishikesh Thakur, William John, Shamsudheen Marakkar, Prashanth Suravajhala, Vijayaraghava Seshadri Sundararajan, Jayaraman Valadi, Shameer Khader, Ramanathan Sowdhamini
2020 bioRxiv   pre-print
Machine intelligence competitions offer a vast pool of seasoned data scientists and machine intelligence experts to develop solutions through competition portals.  ...  Further, the biomedical diaspora could also seek help from the expert communities using a crowdsourcing website that hosts machine intelligence competitions.  ...  , falls prediction in 2019) 9-competitions KDD Cup Data Mining and Knowledge Discovery competition organized by ACM Special Interest Group on Knowledge Discovery and Data Mining  ... 
doi:10.1101/2020.07.12.199398 fatcat:eg5zvu76o5f5fhonvkdul4bhpu

The Application of Machine Learning Techniques for Predicting Match Results in Team Sport: A Review

Rory Bunker, Teo Susnjak
2022 The Journal of Artificial Intelligence Research  
In this paper, we review a selection of studies from 1996 to 2019 that used machine learning for predicting match results in team sport.  ...  Although there remains a lack of benchmark datasets (apart from in soccer), and the differences between sports, datasets and features make between-study comparisons difficult, as we discuss, it is possible  ...  This property can offer utility to professionals beyond just the ability to make predictions, but also in providing insight to coaches, management, and athletes.  ... 
doi:10.1613/jair.1.13509 fatcat:be56yjbvmzbh7cflwbxyzq7yrq

Cybersecurity Threats and Their Mitigation Approaches Using Machine Learning—A Review

Mostofa Ahsan, Kendall E. Nygard, Rahul Gomes, Md Minhaz Chowdhury, Nafiz Rifat, Jayden F Connolly
2022 Journal of Cybersecurity and Privacy  
The detection of hidden trends and insights from network data and building of a corresponding data-driven machine learning model to prevent these attacks is vital to design intelligent security systems  ...  In this survey, the focus is on the machine learning techniques that have been implemented on cybersecurity data to make these systems secure.  ...  In recent work [171], finding and using leakage has been discussed as one of the crucial elements for winning data mining competitions, and the authors showed it to be one of the critical elements for  ... 
doi:10.3390/jcp2030027 fatcat:3m3rxixzjjcwbhzk2od72xatta

The AI Index 2021 Annual Report [article]

Daniel Zhang, Saurabh Mishra, Erik Brynjolfsson, John Etchemendy, Deep Ganguli, Barbara Grosz, Terah Lyons, James Manyika, Juan Carlos Niebles, Michael Sellitto, Yoav Shoham, Jack Clark (+1 others)
2021 arXiv   pre-print
The report aims to be the most credible and authoritative source for data and insights about AI in the world.  ...  This year we significantly expanded the amount of data available in the report, worked with a broader set of external organizations to calibrate our data, and deepened our connections with the Stanford  ...  The terms searched for were based on the issues exposed and identified in papers below, and also on the topics called for discussion in the First AAAI/ACM Conference on AI, Ethics, and Society.  ... 
arXiv:2103.06312v1 fatcat:52qwvzv7jndxzaagyiro6koyza

Optimizing predictive performance of criminal recidivism models using registration data with binary and survival outcomes

Nikolaj Tollenaar, Peter G. M. van der Heijden, Gregor Stiglic
2019 PLoS ONE  
Results on the reconviction data from two sources suggest that both statistical and machine learning should be tried out for obtaining an optimal model.  ...  Additionally, we explore the predictive potential of classical statistical and machine learning methods for censored time-to-event data.  ...  The promise of improved model accuracy and substantial monetary rewards in business and competitions (e.g. the Kaggle competitions, the KDD cup and the Netflix competition) have made this a very active  ... 
doi:10.1371/journal.pone.0213245 pmid:30849094 pmcid:PMC6407787 fatcat:3nco2y7yg5evnbzrbzkj6kpp4a

Algorithmic Fairness Datasets: the Story so Far [article]

Alessandro Fabris, Stefano Messina, Gianmaria Silvello, Gian Antonio Susto
Finally, we analyze these resources from the perspective of five important data curation topics: anonymization, consent, inclusivity, labeling of sensitive attributes, and transparency.  ...  Unfortunately, the algorithmic fairness community, as a whole, suffers from a collective data documentation debt caused by a lack of information on specific resources (opacity) and scatteredness of available  ...  Acknowledgements The authors would like to thank the following researchers and dataset creators for the useful feedback on the data briefs: Alain Barrat, Luc  ... 
doi:10.48550/arxiv.2202.01711 fatcat:mav36x3w5namjhurzpevtsmsju

Data-Driven Analytics for Decision Making in Game Sports

Philipp German Seidenschwarz
How Much Money do the Champions League 2019-20 Winners Get?  ...  Marcel Jacomet from the Bern University of Applied Sciences and to PD Dr.  ... 
doi:10.5451/unibas-ep85422 fatcat:3jf7evy7ivhcba5evgazufnsrm

Mapping (Dis-)Information Flow about the MH17 Plane Crash

Mareike Hartmann, Yevgeniy Golovchenko, Isabelle Augenstein
2019 Proceedings of the Second Workshop on Natural Language Processing for Internet Freedom: Censorship, Disinformation, and Propaganda   unpublished
Preface Welcome to the second edition of the Workshop on Natural Language Processing for Internet Freedom (NLP4IF 2019). This year, we focused on censorship, disinformation, and propaganda.  ...  We are also thrilled to be able to bring an invited speaker, Elissa Redmiles from Princeton University and Microsoft Research, with a talk on measuring human perception to defend democracy, exploring a  ...  Acknowledgements We would like to thank Dr Leandro Minku from the University of Birmingham for his insights into and help with the statistical analysis presented in this paper.  ... 
doi:10.18653/v1/d19-5006 fatcat:77l3dndrkvfmlhjt6qvnrassgi

Forecasting monthly airline passenger numbers with small datasets using feature engineering and a modified principal component analysis

Sara Al-Ruzeiqi
Different types of datasets were created to extract new features from the core data.  ...  trained network model is built from samples of selected features from the dataset in order to ensure diversity of the samples and to improve training.  ...  Feature engineering is used to decrease the dimensions in text classification. Feature engineering has been found to be important in the Kaggle and KDD Cup data science rivalries.  ... 
doi:10.26174/thesis.lboro.12249779.v1 fatcat:2okw5ezn5vayfldw3h5bbzxly4

Advances in session-based and session-aware recommendation

Malte Ludewig, Technische Universität Dortmund
This analysis is based on log data from a fashion retailer and insights were, furthermore, operationalized into novel session-aware recommendation approaches.  ...  ., online shops or media streaming applications, and extensive evidence exists that such systems increase both the user experience as well as the revenue of the providers.  ...  "Learning to Rank Hotels for Search and Recommendation from Session-Based Interaction Logs and Meta Data". In: Proceedings of the ACM Recommender Systems Challenge 2019.  ... 
doi:10.17877/de290r-21713 fatcat:kyxn2g2tqzbtpfthb77s247sdi

OASIcs, Volume 64, ICLP'18, Complete Volume [article]

Alessandro Dal Palu', Paul Tarau, Neda Saeedloei, Paul Fodor
Most of the answer set solvers Acknowledgements We are grateful to Cesare Tinelli for valuable discussions on the subject of the paper and for the insights on the cvc4 system.  ...  Acknowledgements We thank Huiping Cao for the availability of the BigDat cluster (KDD lab).  ...  The metabolism dataset consists of an adaptation of the dataset originally from the 2001 KDD Cup Challenge.  ... 
doi:10.4230/oasics.iclp.2018 fatcat:exbunzcx7rgrhg32iguyfxca7u