The RobotSlang Benchmark: Dialog-guided Robot Localization and Navigation [article]

Shurjo Banerjee, Jesse Thomason, Jason J. Corso
2020 arXiv pre-print
To study such cooperative communication, we introduce Robot Simultaneous Localization and Mapping with Natural Language (RobotSlang), a benchmark of 169 natural language dialogs between a human Driver  ...  We introduce a Localization from Dialog History (LDH) and a Navigation from Dialog History (NDH) task where a learned agent is given dialog and visual observations from the robot platform as input and  ...  Acknowledgments The authors are supported in part by ARO grant (W911NF-16-1-0121) and by the US National Science Foundation National Robotics Initiative under Grants 1522904.  ... 
arXiv:2010.12639v1 fatcat:k53mcuuvmrghhivcmcyx63qiea

Vision-and-Language Navigation: A Survey of Tasks, Methods, and Future Directions [article]

Jing Gu, Eliana Stefani, Qi Wu, Jesse Thomason, Xin Eric Wang
2022 arXiv pre-print
Vision-and-Language Navigation (VLN) is a fundamental and interdisciplinary research topic towards this goal, and receives increasing attention from natural language processing, computer vision, robotics  ...  Through structured analysis of current progress and challenges, we highlight the limitations of current VLN and opportunities for future work.  ...  Navigation graphs assume: (1) perfect localization - in the real world it is a noisy estimate; (2) oracle navigation - real robots cannot "teleport" to a new node; (3) known topology - in reality an agent may  ... 
arXiv:2203.12667v2 fatcat:lmtnaeejvfcrpb4qoaew2jcvsu

Deep Learning for Embodied Vision Navigation: A Survey [article]

Fengda Zhu, Yi Zhu, Vincent CS Lee, Xiaodan Liang, Xiaojun Chang
2021 arXiv pre-print
We summarize the benchmarks and metrics, review different methods, analyze the challenges, and highlight the state-of-the-art methods.  ...  This problem has attracted rising attention in recent years due to its wide application in autonomous driving, vacuum cleaners, and rescue robots.  ...  [175] propose the RobotSlang benchmark, a dataset which is gathered by pairing a human "driver" controlling a physical robot and asking questions of a human "commander". We compare the difference of Embodied  ... 
arXiv:2108.04097v4 fatcat:46p2p3zlivabbn7dvowkyccufe

Vision-Language Navigation: A Survey and Taxonomy [article]

Wansen Wu, Tao Chang, Xinmeng Li
2022 arXiv pre-print
Depending on whether the navigation instructions are given once or multiple times, this paper divides the tasks into two categories, i.e., single-turn and multi-turn tasks.  ...  We identify progress made on the tasks and look into the limitations of existing VLN models and task settings.  ...  ACKNOWLEDGMENT The work described in this paper was sponsored in part by the National Natural Science Foundation of China under Grant No. 62103420 and 62103428, the Natural Science Fund of Hunan Province  ... 
arXiv:2108.11544v3 fatcat:qo5g237si5cwtewxiaeqtjwqpy