
On the Reliability of Test Collections for Evaluating Systems of Different Types [article]

Emine Yilmaz, Nick Craswell, Bhaskar Mitra, Daniel Campos
2020 arXiv   pre-print
This paper uses simulated pooling to test the fairness and reusability of test collections, showing that pooling based on traditional systems only can lead to biased evaluation of deep learning systems  ...  are mainly based on lexical similarity, they may return different types of relevant document that were not identified in the original pooling.  ...  We then analysed the reusability of test collections generated via pooling top-k results of systems of a particular type for evaluating systems that are of a different type, particularly focusing on traditional  ... 
arXiv:2004.13486v1 fatcat:v532636jbbf2dliuv3ngaxfnum

Information retrieval evaluation using test collections

Falk Scholer, Diane Kelly, Ben Carterette
2016 Information retrieval (Boston)  
Acknowledgments We thank all the authors who submitted papers and allowed us to review their work for this special issue. We also thank all those who reviewed papers for this special issue.  ...  The model is evaluated using two test collections, which contain different types of search queries (e.g., informational and navigational) and different types of assessors.  ...  However, this is accepted as one of the limitations of test collection-based evaluation.  ... 
doi:10.1007/s10791-016-9281-7 fatcat:fqka6za7lbe3hdeiyp7xtiddh4

Measurement Reliability and Reactivity Using Repeated Measurements of Resting Energy Expenditure with a Face Mask, Mouthpiece, and Ventilated Canopy

Terry R. Isbell, Robert C. Klesges, Andrew W. Meyers, Lisa M. Klesges
1991 JPEN - Journal of Parenteral and Enteral Nutrition  
Unfortunately, investigations that have evaluated the effects of different testing procedures on the measurement reliability of resting energy expenditure have yielded inconclusive results.  ...  Moreover, there are conflicting reports on the importance of the type of data collection systems used, with some investigators concluding that the less obtrusive canopy system is superior to the  ... 
doi:10.1177/0148607191015002165 pmid:2051556 fatcat:lahj7bimirb3vk4w4mhz2tv3hq

Software Reliability Assurance Using a Framework in Weapon System Development: A Case Study

Dalju Lee, Jongmoon Baik, Ju-Hwan Shin
2009 2009 Eighth IEEE/ACIS International Conference on Computer and Information Science  
The framework provides a guideline for software reliability evaluation to software organizations and pursues the improvement of software engineering process which supports activities and indicators for  ...  We also present the empirical study of the application of the proposed framework and the analysis results on the effectiveness of the proposed framework.  ...  Figure 9 represents the fault density on different testing phases.  ... 
doi:10.1109/icis.2009.168 dblp:conf/ACISicis/LeeBS09 fatcat:knvfdabrojgytndydjrzhz5a2q

Measuring Component-Based Systems Using a Systematic Approach and Environment

Jerry Gao, Yumei Wu, Lee Chang, Sigurd Meldal
2006 2006 Second IEEE International Symposium on Service-Oriented System Engineering (SOSE'06)  
It reports the development effort on constructing a distributed performance evaluation environment for software components based on a set of well-defined performance evaluation metrics and techniques.  ...  Hence, performance testing and evaluation of software components becomes a critical task for component-based software.  ...  Component Reliability Although many different approaches and metrics have been proposed to measure system reliability, the common way is to evaluate the system reliability based on its reliability of service  ... 
doi:10.1109/sose.2006.19 dblp:conf/sose/GaoWCM06 fatcat:ettiikr67ngwbhe7yavydi2yza

A New Life Expectancy Assessment Method for Complex Systems With Multi-Characteristics: Case Study on Power-Shift Steering Transmission Control System

Xiao-Jian Yi, Yue-Feng Chen, Hui-Na Mu, Jian Shi, Peng Hou
2019 IEEE Access  
All in all, this life expectancy assessment method not only improves the theory of GO method, so that GO method is applied to evaluate the life expectancy of complex systems with multi-characteristics  ...  This paper proposes a new life expectancy assessment method for high-value complex systems with multicharacteristics based on goal-oriented (GO) method.  ...  ACKNOWLEDGMENT The authors are grateful to the chief editor, editor and reviewers for the suggestions, which improve the draft of this paper.  ... 
doi:10.1109/access.2019.2893216 fatcat:4eqstsshf5aunfm7fbh3lvw7eu

Assessors Agreement: A Case Study Across Assessor Type, Payment Levels, Query Variations and Relevance Dimensions [chapter]

Joao Palotti, Guido Zuccon, Johannes Bernhardt, Allan Hanbury, Lorraine Goeuriot
2016 Lecture Notes in Computer Science  
on system evaluation in the presence of disagreements across assessments obtained in the different settings.  ...  Yet, there is only limited understanding of how assessment disagreement influences the reliability of the evaluation in terms of systems rankings.  ...  Acknowledgements This work has received funding from the European Union's Horizon 2020 research and innovation programme under grant agreement No 644753 (KConnect), and from the Austrian Science Fund (  ... 
doi:10.1007/978-3-319-44564-9_4 fatcat:juq5kfk3nbft7lfrw3wxsvkaoa

Reliability of function points measurement: a field experiment

Chris F. Kemerer
1993 Communications of the ACM  
The results showed that the FP counts from pairs of raters using the standard method differed on average by +/-10.78%, and that the correlation across the two methods tested was as high as .95 for the  ...  reliability across different methods of counting has remained untested.  ...  Collection of FP counts for one medium-sized system was estimated to require 4 work-hours on the part of each rater 5 .  ... 
doi:10.1145/151220.151230 fatcat:y6akdj2d5jdalh7uei7bhne6na


A. Regattieri, A. Casto, F. Piana, M. Faccio, E. Ferrari
2018 DEStech Transactions on Engineering and Technology Research  
The reliability prediction (i.e. life prediction) of components and subsystems is a crucial issue for the efficiency of production systems.  ...  Researchers ran different experimental tests in the laboratory of the Department of Industrial Engineering, University of Bologna, to estimate model parameters.  ...  Reliability evaluation is usually based on data collected from the field during the assets' daily operation [11] [12] [13].  ... 
doi:10.12783/dtetr/icpr2017/17699 fatcat:i3zsyoxcxfbzhaklilqijformm

Development of the SciRAP Approach for Evaluating the Reliability and Relevance of in vitro Toxicity Data

Nicolas Roth, Johanna Zilliacus, Anna Beronius
2021 Frontiers in Toxicology  
The SciRAP in vitro tool (version 2.0) was revised based on the outcome of the expert test round (study evaluation and online survey) and consists of 24 criteria for evaluating "reporting quality" (reliability  ...  We present the work to develop and refine the SciRAP tool for evaluation of reliability and relevance of in vitro studies for incorporation on the SciRAP web-based platform (  ...  for the SciRAP tool development.  ... 
doi:10.3389/ftox.2021.746430 pmid:35295161 pmcid:PMC8915875 fatcat:xzazul3ahna7dj6qyvjaega7iy

Qualitative evaluation of automatic assignment of keywords to images

Chih-Fong Tsai, Ken McGarry, John Tait
2006 Information Processing & Management  
Only one of these methods is reported for most systems on which user-centred evaluations are conducted. We believe that both methods need to be considered for full evaluation.  ...  We also provide an example evaluation of our system based on this methodology.  ...  Acknowledgement The authors would like to thank Chris Stokoe, James Malone, Sheila Garfield, Mark Elshaw, and Jean Davison for participating in the system evaluation  ... 
doi:10.1016/j.ipm.2004.11.001 fatcat:rzvs6g637nfhpcb2vdhp6o2bbq

Qualitative and quantitative reliability assessment

K. Kanoun, M. Kaaniche, J.-P. Laprie
1997 IEEE Software  
We also thank Sylvain Metge for his contribution to the development of Sorel. Our work was partially supported by the European ESPRIT Long Term Research Project 20072: Deva (Design for Validation).  ...  ACKNOWLEDGMENT We thank the company that provided the data analyzed in this article and, especially, the engineers who participated in that analysis.  ...  The unit of time used for grouped data is a function of the system usage type and the number of failures occurring during the period analyzed, and may differ for different phases.  ... 
doi:10.1109/52.582977 fatcat:doxxecvtgzbopodl7stvhu3puq

Dynamic safety measurement-control technology for intelligent connected vehicles based on digital twin system

Xingbin Chen, Peng Zhang, Xinhe Min, Nini Li, Wei Cao, Shunren Xiao, Guanting Du
2021 Vibroengineering PROCEDIA  
The study is conducive to improving the test efficiency and index evaluation integrity of the intelligent networked system, reducing the test costs, proving the behaviors of Game interaction and stress  ...  for autonomous vehicle in different connected levels of the mixed-flow traffic environments, such as multi-agent perception, multi-source information transmission, vehicle control, vehicle-to-vehicle  ...  When a certain Game strategy is selected, a collection trend of the Game is formed, where, for each trend, the Game effect obtained by each different type vehicle is:  ... 
doi:10.21595/vp.2021.21990 fatcat:mf2zvazbxzcrlfbtqzma6wmswq


Rodrigo Antunes de Vasconcelos, Débora Bevilaqua-Grossi, Antonio Carlos Shimano, Cleber Jansen Paccola, Tânia Fátima Salvini, Christiane Lanatovits Prado, Wilson A. Mello Junior
2009 Revista Brasileira de Ortopedia (English Edition)  
All individuals performed isometric tests in the MID, muscular strength deficits collected were subsequently compared to the tests performed on the Biodex System 3 operating in the isometric and isokinetic  ...  Objectives: The aim of this study was to evaluate the reliability and validity of a modified isometric dynamometer (MID) in performance deficits of the knee extensor and flexor muscles in normal individuals  ...  This requires evaluating the validity and reliability criteria between deficits in isometric and isokinetic muscle performance collected using different types of dynamometers.  ... 
doi:10.1016/s2255-4971(15)30071-9 pmid:27004175 pmcid:PMC4783672 fatcat:z5rvrv7f7jgjrhhpoos3rcjgj4

First bridge with aspects of the "Smart Bridge" released for traffic

Sarah Dabringhaus
2018 Zenodo  
The current German maintenance management system for bridges is mainly based on visual inspection and aims at the repair of identified damages.  ...  In the project cluster "Smart Bridge" an adaptive system for information and holistic evaluation in real time is developed.  ...  Different measuring systems are used for the collection of traffic data in order to optimize the measuring systems by comparing reference traffic data.  ... 
doi:10.5281/zenodo.1441129 fatcat:v25db7e5uffzdjsat5fmtjwwhm