On the Reliability of Test Collections for Evaluating Systems of Different Types
[article]
2020
arXiv
pre-print
This paper uses simulated pooling to test the fairness and reusability of test collections, showing that pooling based on traditional systems only can lead to biased evaluation of deep learning systems ...
are mainly based on lexical similarity, they may return different types of relevant document that were not identified in the original pooling. ...
We then analysed the reusability of test collections generated via pooling top-k results of systems of a particular type for evaluating systems that are of a different type, particularly focusing on traditional ...
arXiv:2004.13486v1
fatcat:v532636jbbf2dliuv3ngaxfnum
Information retrieval evaluation using test collections
2016
Information retrieval (Boston)
Acknowledgments We thank all the authors who submitted papers and allowed us to review their work for this special issue. We also thank all those who reviewed papers for this special issue. ...
The model is evaluated using two test collections, which contain different types of search queries (e.g., informational and navigational) and different types of assessors. ...
However, this is accepted as one of the limitations of test collection-based evaluation. ...
doi:10.1007/s10791-016-9281-7
fatcat:fqka6za7lbe3hdeiyp7xtiddh4
Measurement Reliability and Reactivity Using Repeated Measurements of Resting Energy Expenditure with a Face Mask, Mouthpiece, and Ventilated Canopy
1991
JPEN - Journal of Parenteral and Enteral Nutrition
Unfortunately, investigations that have evaluated the effects of different testing procedures on the measurement reliability of resting energy expenditure have yielded inconclusive results. ...
Moreover, there are conflicting reports on the importance of the type of data collection systems used, with some investigators concluding that the less obtrusive canopy system is superior to the ...
doi:10.1177/0148607191015002165
pmid:2051556
fatcat:lahj7bimirb3vk4w4mhz2tv3hq
Software Reliability Assurance Using a Framework in Weapon System Development: A Case Study
2009
2009 Eighth IEEE/ACIS International Conference on Computer and Information Science
The framework provides a guideline for software reliability evaluation to software organizations and pursues the improvement of software engineering process which supports activities and indicators for ...
We also present the empirical study of the application of the proposed framework and the analysis results on the effectiveness of the proposed framework. ...
Figure 9 represents the fault density on different testing phases. ...
doi:10.1109/icis.2009.168
dblp:conf/ACISicis/LeeBS09
fatcat:knvfdabrojgytndydjrzhz5a2q
Measuring Component-Based Systems Using a Systematic Approach and Environment
2006
2006 Second IEEE International Symposium on Service-Oriented System Engineering (SOSE'06)
It reports the development effort on constructing a distributed performance evaluation environment for software components based on a set of well-defined performance evaluation metrics and techniques. ...
Hence, performance testing and evaluation of software components becomes a critical task for component-based software. ...
Component Reliability Although many different approaches and metrics have been proposed to measure system reliability, the common way is to evaluate the system reliability based on its reliability of service ...
doi:10.1109/sose.2006.19
dblp:conf/sose/GaoWCM06
fatcat:ettiikr67ngwbhe7yavydi2yza
A New Life Expectancy Assessment Method for Complex Systems With Multi-Characteristics: Case Study on Power-Shift Steering Transmission Control System
2019
IEEE Access
All in all, this life expectancy assessment method not only improves the theory of GO method, so that GO method is applied to evaluate the life expectancy of complex systems with multi-characteristics ...
This paper proposes a new life expectancy assessment method for high-value complex systems with multicharacteristics based on goal-oriented (GO) method. ...
ACKNOWLEDGMENT The authors are grateful to the chief editor, editor and reviewers for the suggestions, which improve the draft of this paper. ...
doi:10.1109/access.2019.2893216
fatcat:4eqstsshf5aunfm7fbh3lvw7eu
Assessors Agreement: A Case Study Across Assessor Type, Payment Levels, Query Variations and Relevance Dimensions
[chapter]
2016
Lecture Notes in Computer Science
on system evaluation in the presence of disagreements across assessments obtained in the different settings. ...
Yet, there is only limited understanding of how assessment disagreement influences the reliability of the evaluation in terms of systems rankings. ...
Acknowledgements This work has received funding from the European Union's Horizon 2020 research and innovation programme under grant agreement No 644753 (KConnect), and from the Austrian Science Fund ( ...
doi:10.1007/978-3-319-44564-9_4
fatcat:juq5kfk3nbft7lfrw3wxsvkaoa
Reliability of function points measurement: a field experiment
1993
Communications of the ACM
The results showed that the FP counts from pairs of raters using the standard method differed on average by +/-10.78%, and that the correlation across the two methods tested was as high as .95 for the ...
reliability across different methods of counting has remained untested. ...
Collection of FP counts for one medium-sized system was estimated to require 4 work-hours on the part of each rater. ...
doi:10.1145/151220.151230
fatcat:y6akdj2d5jdalh7uei7bhne6na
RELIABILITY PREDICTION OF A MECHANICAL COMPONENT THROUGH ACCELERATED LIFE TESTING
2018
DEStech Transactions on Engineering and Technology Research
The reliability prediction (i.e. life prediction) of components and subsystems is a crucial issue for the efficiency of production systems. ...
Researchers performed different experimental tests in the laboratory of the Department of Industrial Engineering, University of Bologna, to estimate model parameters. ...
Reliability evaluation is usually based on data collected from the field during the assets' daily operation [11] [12] [13]. ...
doi:10.12783/dtetr/icpr2017/17699
fatcat:i3zsyoxcxfbzhaklilqijformm
Development of the SciRAP Approach for Evaluating the Reliability and Relevance of in vitro Toxicity Data
2021
Frontiers in Toxicology
The SciRAP in vitro tool (version 2.0) was revised based on the outcome of the expert test round (study evaluation and online survey) and consists of 24 criteria for evaluating "reporting quality" (reliability ...
We present the work to develop and refine the SciRAP tool for evaluation of reliability and relevance of in vitro studies for incorporation on the SciRAP web-based platform (www.scirap.org). ...
for the SciRAP tool development. ...
doi:10.3389/ftox.2021.746430
pmid:35295161
pmcid:PMC8915875
fatcat:xzazul3ahna7dj6qyvjaega7iy
Qualitative evaluation of automatic assignment of keywords to images
2006
Information Processing & Management
Only one of these methods is reported for most systems on which user-centred evaluations are conducted. We believe that both methods need to be considered for full evaluation. ...
We also provide an example evaluation of our system based on this methodology. ...
CBIR and/or relevance feedback Barnard
Acknowledgement The authors would like to thank Chris Stokoe, James Malone, Sheila Garfield, Mark Elshaw, and Jean Davison for participating in the system evaluation ...
doi:10.1016/j.ipm.2004.11.001
fatcat:rzvs6g637nfhpcb2vdhp6o2bbq
Qualitative and quantitative reliability assessment
1997
IEEE Software
We also thank Sylvain Metge for his contribution to the development of Sorel. Our work was partially supported by the European ESPRIT Long Term Research Project 20072: Deva (Design for Validation). ...
ACKNOWLEDGMENT We thank the company that provided the data analyzed in this article and, especially, the engineers who participated in that analysis. ...
The unit of time used for grouped data is a function of the system usage type and the number of failures occurring during the period analyzed, and may differ for different phases. ...
doi:10.1109/52.582977
fatcat:doxxecvtgzbopodl7stvhu3puq
Dynamic safety measurement-control technology for intelligent connected vehicles based on digital twin system
2021
Vibroengineering PROCEDIA
The study is conducive to improving the test efficiency and index evaluation integrity of the intelligent networked system, reducing the test costs, proving the behaviors of Game interaction and stress ...
for autonomous vehicle in different connected levels of the mixed-flow traffic environments, such as multi-agent perception, multi-source information transmission, vehicle control, vehicle-to-vehicle ...
When a certain Game strategy is selected, a collection trend of the Game is formed; for each trend, the Game effect obtained by each different type of vehicle is given by a piecewise expression [equation not recoverable from the extracted text] ...
doi:10.21595/vp.2021.21990
fatcat:mf2zvazbxzcrlfbtqzma6wmswq
RELIABILITY AND VALIDITY OF A MODIFIED ISOMETRIC DYNAMOMETER IN THE ASSESSMENT OF MUSCULAR PERFORMANCE IN INDIVIDUALS WITH ANTERIOR CRUCIATE LIGAMENT RECONSTRUCTION
2009
Revista Brasileira de Ortopedia (English Edition)
All individuals performed isometric tests in the MID; the muscular strength deficits collected were subsequently compared to the tests performed on the Biodex System 3 operating in the isometric and isokinetic ...
Objectives: The aim of this study was to evaluate the reliability and validity of a modified isometric dynamometer (MID) in performance deficits of the knee extensor and flexor muscles in normal individuals ...
This requires evaluating the validity and reliability criteria between deficits in isometric and isokinetic muscle performance collected using different types of dynamometers. ...
doi:10.1016/s2255-4971(15)30071-9
pmid:27004175
pmcid:PMC4783672
fatcat:z5rvrv7f7jgjrhhpoos3rcjgj4
First bridge with aspects of the "Smart Bridge" released for traffic
2018
Zenodo
The current German maintenance management system for bridges is mainly based on visual inspection and aims at the repair of identified damages. ...
In the project cluster "Smart Bridge" an adaptive system for information and holistic evaluation in real time is developed. ...
Different measuring systems are used for the collection of traffic data in order to optimize the measuring systems by comparing reference traffic data. ...
doi:10.5281/zenodo.1441129
fatcat:v25db7e5uffzdjsat5fmtjwwhm