SPARKlab

Khronos: 4D Spatio-temporal Perception for Autonomous Robots

Researchers from MIT SPARKlab have created novel perception algorithms that, for the first time, allow robots to build a 4D spatio-temporal representation of their environment. The representation covers the detection and detailed reconstruction of the scene, including moving and changing objects, and captures how the scene evolves through time. This 4D understanding can be built in real time during autonomous robot operation.

Khronos won the Outstanding Systems Paper Award at the 2024 Robotics: Science and Systems conference in Delft, the Netherlands.

Authors: Lukas Schmid (lead author), Marcus Abate, Yun Chang, and Luca Carlone (co-authors)
Citation: “Khronos: A Unified Approach for Spatio-Temporal Metric-Semantic SLAM in Dynamic Environments,” in Robotics: Science and Systems, Delft, The Netherlands, July 2024.
Paper: https://arxiv.org/abs/2402.13817
Code: https://github.com/MIT-SPARK/Khronos

Abstract:
Perceiving and understanding highly dynamic and changing environments is a crucial capability for robot autonomy. While large strides have been made towards developing dynamic SLAM approaches that estimate the robot pose accurately, less emphasis has been placed on the construction of dense spatio-temporal representations of the robot environment. A detailed understanding of the scene and its evolution through time is crucial for long-term robot autonomy and essential to tasks that require long-term reasoning, such as operating effectively in environments that are shared with humans and other agents and are thus subject to short- and long-term dynamics.

To address this challenge, this work defines the Spatio-temporal Metric-semantic SLAM (SMS) problem, and presents a framework to factorize and solve it efficiently. We show that the proposed factorization suggests a natural organization of a spatio-temporal perception system, where a fast process tracks short-term dynamics in an active temporal window, while a slower process reasons over long-term changes in the environment using a factor graph formulation. 

We provide an efficient implementation of the proposed spatio-temporal perception approach, which we call Khronos, and show that it unifies existing interpretations of short-term and long-term dynamics and is able to construct a dense spatio-temporal map in real time. We provide simulated and real results, showing that the spatio-temporal maps built by Khronos are an accurate reflection of a 3D scene over time and that Khronos outperforms baselines across multiple metrics. We further validate our approach on two heterogeneous robots in challenging, large-scale real-world environments.
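
Illustration (not from the paper):
The two-rate organization described in the abstract, a fast process working on an active temporal window and a slower process reasoning over long-term changes, can be pictured with a toy sketch. The code below is not the Khronos implementation or its API. The names `Observation`, `ActiveWindow`, and `LongTermMap` are hypothetical, and the slow process here only records when each object was first and last seen, as a simple stand-in for the factor graph formulation used in the paper.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class Observation:
    """Hypothetical detection of one object at one time."""
    timestamp: float  # seconds
    object_id: str    # illustrative object label
    position: tuple   # (x, y, z) in metres


@dataclass
class ActiveWindow:
    """Fast process: keeps only observations newer than `horizon` seconds."""
    horizon: float
    buffer: deque = field(default_factory=deque)

    def update(self, obs):
        """Add an observation and return the ones that just left the window."""
        self.buffer.append(obs)
        expired = []
        while self.buffer and obs.timestamp - self.buffer[0].timestamp > self.horizon:
            expired.append(self.buffer.popleft())
        return expired


@dataclass
class LongTermMap:
    """Slow process (toy stand-in): first and last time each object was seen."""
    tracks: dict = field(default_factory=dict)

    def integrate(self, expired):
        """Fold observations that left the active window into the long-term map."""
        for obs in expired:
            first, _ = self.tracks.get(obs.object_id, (obs.timestamp, obs.timestamp))
            self.tracks[obs.object_id] = (first, obs.timestamp)

    def present_at(self, t):
        """Return the objects whose recorded lifetime covers time `t`."""
        return sorted(k for k, (a, b) in self.tracks.items() if a <= t <= b)


# Made-up observations, purely for illustration.
window = ActiveWindow(horizon=2.0)
world = LongTermMap()
for obs in [
    Observation(0.0, "chair", (1.0, 0.0, 0.0)),
    Observation(1.0, "chair", (1.0, 0.0, 0.0)),
    Observation(5.0, "box", (2.0, 1.0, 0.0)),
]:
    world.integrate(window.update(obs))

print(world.present_at(0.5))
```

In this sketch the long-term map can be queried at any past time, which is the sense in which the representation is "4D". A real system would estimate robot poses and object states jointly rather than trusting raw detections.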