Table Pre-training: A Survey on Model Architectures, Pre-training Objectives, and Downstream Tasks [article]

Haoyu Dong, Zhoujun Cheng, Xinyi He, Mengyu Zhou, Anda Zhou, Fan Zhou, Ao Liu, Shi Han, Dongmei Zhang
2022 (2022-04-29), arXiv, pre-print
Since a vast number of tables can be easily collected from web pages, spreadsheets, PDFs, and various other document types, a flurry of table pre-training frameworks have been proposed following the success  ...  To fully use the supervision signals in unlabeled tables, a variety of pre-training objectives have been designed and evaluated, for example, denoising cell values, predicting numerical relationships,  ...  ., ForTaP extracted existing formulas from a large web-crawled spreadsheet corpus and extracted numerical reference and calculation relationships from them for pre-training.  ... 
arXiv:2201.09745v4 (https://arxiv.org/abs/2201.09745v4); fatcat:fckxlk6przhsthnyhozehw3dz4 (https://fatcat.wiki/release/fckxlk6przhsthnyhozehw3dz4)
Full text (Web Archive PDF): https://web.archive.org/web/20220504060959/https://arxiv.org/pdf/2201.09745v4.pdf

A Machine Learning Approach for Layout Inference in Spreadsheets

Elvis Koci, Maik Thiele, Oscar Romero, Wolfgang Lehner
2016, Proceedings of the 8th International Joint Conference on Knowledge Discovery, Knowledge Engineering and Knowledge Management (SCITEPRESS - Science and Technology Publications)
The biggest obstacle is the lack of awareness about the structure of the data in spreadsheets, which otherwise could provide the means to automatically understand and extract knowledge from these files  ...  In spite of this success, there does not exist a comprehensive approach to automatically extract and reuse the richness of data maintained in this format.  ...  ACKNOWLEDGEMENTS This research has been funded by the European Commission through the Erasmus Mundus Joint Doctorate "Information Technologies for Business Intelligence - Doctoral College" (IT4BI-DC).  ... 
doi:10.5220/0006052200770088 (https://doi.org/10.5220/0006052200770088); dblp:conf/ic3k/KociTRL16 (https://dblp.org/rec/conf/ic3k/KociTRL16.html); fatcat:axcwpwu7ijb2pgywq4x6ihbq2i (https://fatcat.wiki/release/axcwpwu7ijb2pgywq4x6ihbq2i)
Full text (Web Archive PDF): https://web.archive.org/web/20190225212011/http://pdfs.semanticscholar.org/6864/26d2d639156d6f4757e9be630f9c0b6b2fab.pdf

Table Pretraining: A Survey on Model Architectures, Pretraining Objectives, and Downstream Tasks [article]

Haoyu Dong, Zhoujun Cheng, Xinyi He, Mengyu Zhou, Anda Zhou, Fan Zhou, Ao Liu, Shi Han, Dongmei Zhang
2022 (2022-01-01)
Since a vast number of tables can be easily collected from web pages, spreadsheets, PDFs, and various other document types, a flurry of table pretraining frameworks have been proposed following the success  ...  To fully use the supervision signals in unlabeled tables, a variety of pretraining objectives have been designed and evaluated, for example, denoising cell values, predicting numerical relationships, and  ...  Universal Framework for General Document Tables Most works only focus on a specific type of table, e.g., web tables.  ... 
doi:10.48550/arxiv.2201.09745 (https://doi.org/10.48550/arxiv.2201.09745); fatcat:xx3r3o5qgjc3nizxhbt7x5ytf4 (https://fatcat.wiki/release/xx3r3o5qgjc3nizxhbt7x5ytf4)
Full text (Web Archive PDF): https://web.archive.org/web/20220126054431/https://arxiv.org/pdf/2201.09745.pdf