Gigapixel videography, which exceeds the resolution of a single camera and of human visual perception, aims to capture large-scale dynamic scenes at extremely high resolution. Its high resolution and wide field of view (FoV) bring new challenges and opportunities for a wide range of computer vision tasks. Among them, multi-object tracking is a typical task: tracking objects, especially humans, across a video. Researchers have developed many state-of-the-art algorithms for this problem, e.g., FairMOT, ReMOTS, and DeepMOT.
However, accurate and efficient multi-object tracking in large-scale scenes is still not well studied. Although gigapixel videography can capture both the wide-FoV scene and high-resolution local details, how to process such high-resolution data efficiently remains an open question that demands prompt solutions. Thus, in this challenge, we present our unprecedented dataset “PANDA”, a gigaPixel-level humAN-centric viDeo dAtaset (published in CVPR 2020), aimed at multi-object tracking research in large-scale gigapixel videography.