INT: An Inequality Benchmark for Evaluating Generalization in Theorem Proving [article]

Yuhuai Wu, Albert Qiaochu Jiang, Jimmy Ba, Roger Grosse
2021 arXiv pre-print
INT is based on a procedure for generating theorems and proofs; this procedure's knobs allow us to measure 6 different types of generalization, each reflecting a distinct challenge characteristic to automated ... In addition, unlike prior benchmarks for learning-assisted theorem proving, INT provides a lightweight and user-friendly theorem proving environment with fast simulations, conducive to performing learning-based ... ACKNOWLEDGEMENTS We thank Jay McClelland, Han Huang and Yuanhao Wang for helpful comments and discussions. We also thank anonymous reviewers for valuable and constructive feedback. ...
arXiv:2007.02924v2 fatcat:5dpl46b7jvhkroeteh3iij4igm