Estimating Location Using Wi-Fi

Qiang Yang, Sinno Jialin Pan, Vincent Wenchen Zheng
2008 IEEE Intelligent Systems  
Recent advances in pervasive computing and mobile technology have enabled accurate location and activity tracking of users wearing wireless devices indoors, where GPS isn't available. A practical way to do this is by leveraging the Wi-Fi signals that a mobile client receives from various access points. For example, many indoor location estimation techniques use received radio signal strength (RSS) values and radio signal propagation models to track users. Machine learning-based methods have proven among the most accurate. However, Wi-Fi data is noisy owing to the indoor environment's multipath and shadow fading effects. The data distribution changes constantly as people move and as temperature and humidity change [1-3]. Moreover, it can be expensive to collect and label RSS training data in a large building because it requires a human to walk with a mobile device, collecting RSS values and recording ground-truth locations [4, 5].

Despite intense research in indoor location estimation and activity recognition, the field lacks benchmark data that researchers and practitioners can use to compare their solutions. The 2007 Data Mining Contest (www.ist.unomaha.edu/icdm2007/contest), sponsored by the IEEE International Conference on Data Mining, provided the first realistic public benchmark data for indoor location estimation using RSS that a client device received from Wi-Fi access points. We collected the data sets in a 145.5 m × 37.5 m academic building at the Hong Kong University of Science and Technology. We divided the location into a grid of 247 units, each about 1.5 m × 1.5 m. We focused on discrete classification as well as regression versions of the tasks (we've posted these and the benchmark data set at www.cs.ust.hk/~qyang/ICDMDMC07). This year's contest focused on two tasks: indoor location estimation and transferring knowledge (learned from training data) for indoor location estimation.

Task 1

In this semisupervised-learning problem, we asked participants to predict a client's location on the basis of RSS values received from Wi-Fi access points. We provided a set of (RSS values, location label) pairs as training data, with discrete location labels that correspond to different grid units. To make the problem more interesting, we also provided some unlabeled data (with only the RSS values) and some partially labeled user traces. In this task, the training data contained 3,196 RSS vectors across the nontrace and trace data, of which only 787 were labeled.
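To make the task setup concrete, the following is a minimal sketch of a nearest-neighbor baseline that maps an RSS vector to a grid label using labeled examples only. It is not any contest entry's method, and it ignores the unlabeled data and user traces that make the task semisupervised. The scan representation (a dict from access point name to dBm), the `MISSING_RSS` floor value, the function names, and the toy readings are all assumptions made for illustration.

```python
import math
from collections import Counter

# Assumed floor value (dBm) substituted for access points not heard in a scan.
MISSING_RSS = -100.0


def rss_distance(a, b):
    """Euclidean distance between two scans given as {ap_name: dBm} dicts."""
    aps = set(a) | set(b)
    return math.sqrt(
        sum((a.get(ap, MISSING_RSS) - b.get(ap, MISSING_RSS)) ** 2 for ap in aps)
    )


def knn_predict(train, query, k=3):
    """Return the majority grid label among the k labeled scans nearest to query.

    train is a list of (scan, grid_label) pairs. Ties in the vote go to the
    label that appears first among the nearest neighbors.
    """
    nearest = sorted(train, key=lambda pair: rss_distance(pair[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Toy labeled scans (hypothetical readings, not contest data).
train_scans = [
    ({"ap1": -40, "ap2": -70}, 1),
    ({"ap1": -42, "ap2": -68}, 1),
    ({"ap1": -75, "ap2": -45}, 2),
    ({"ap1": -78, "ap2": -43}, 2),
    ({"ap2": -50, "ap3": -60}, 3),
]

print(knn_predict(train_scans, {"ap1": -41, "ap2": -69}))
```

Filling unheard access points with a floor value keeps every scan comparable even when the sets of visible access points differ; the specific value of -100 dBm is a placeholder and would need tuning against real data.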
We obtained the test data by collecting RSS values as we walked around the building; the test set comprised 43 user traces and a total of 2,180 RSS vectors. We asked participants to predict the location label for each RSS vector in the test data and to submit their predictions for all test data separately for each task. We evaluated the submissions on the test data set and ranked the final results in descending order of their precision values for each task:

precision = (number of correct predictions) / (total number of test data)

Worldwide, 115 teams registered for our contest. In the end, 21 teams submitted 32 results: 15 for task 1 and 17 for task 2. Among the solutions, participants most frequently used k-nearest neighbor methods, decision trees [6], and semisupervised or transductive learning models [7]. A team from IBM Research, Tokyo, won task 1, obtaining 0.8226 precision. The two runners-up were teams from the University of Tokyo and from Tsinghua University. A graduate student from HeBei University won task 2. The runners-up were teams from the Chinese Academy of Sciences and from IBM Research, Tokyo.

Table 1 evaluates the contest's submissions. Max, min, median, average, and std-dev represent the highest, lowest, median, average, and standard deviation of the precision values among the submissions.

System science and data mining make localization through Wi-Fi and sensors feasible. This data mining contest brought many innovative solutions to this challenging and important problem. At the same time, it raised new research issues for the future, including transfer learning and semisupervised learning.
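The precision measure used for ranking (correct predictions divided by the total number of test vectors) and the summary statistics named for Table 1 can be sketched as follows. The function names and the sample values are hypothetical. The source does not say whether the standard deviation is the population or the sample form, so using the population form here is an assumption.

```python
import statistics


def precision(predicted, actual):
    """Fraction of test vectors whose predicted grid label equals the true label."""
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")
    if not actual:
        raise ValueError("no test data")
    correct = sum(p == a for p, a in zip(predicted, actual))
    return correct / len(actual)


def summarize(scores):
    """Max, min, median, average, and standard deviation of precision scores."""
    return {
        "max": max(scores),
        "min": min(scores),
        "median": statistics.median(scores),
        "average": statistics.mean(scores),
        # Population standard deviation; the choice of form is an assumption.
        "std-dev": statistics.pstdev(scores),
    }


# Hypothetical labels and scores for illustration only (not contest results).
example_precision = precision([1, 2, 3, 4], [1, 2, 0, 4])  # 3 of 4 correct
example_summary = summarize([0.25, 0.5, 0.75])
print(example_precision, example_summary)
```

If the sample form is wanted instead, `statistics.stdev` can replace `statistics.pstdev`.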
doi:10.1109/mis.2008.4 fatcat:y3l2rbex5favhjqqmv7r3djozy