OW-TAL: Learning Unknown Human Activities for Open-World Temporal Action Localization

Pattern Recognition (PR), 2022.09.04.

Yaru Zhang, Xiao-Yu Zhang*, Haichao Shi.


Current temporal action localization methods work well under a closed-world assumption, in which all action categories to be localized are known a priori. However, this assumption does not hold in open-world scenarios, where novel categories that never appeared in the training stage are encountered without explicit supervision. Compared with the closed-world setting, localizing actions in the open world poses two significant challenges: 1) identifying unknown actions among diverse known ones and localizing their temporal boundaries; 2) avoiding forgetting of previously learned actions when incrementally updating the model with knowledge of the identified unknown actions. To address these challenges, we develop a two-branch framework with Unknown and Known action modeling Networks, termed UK-Net, for the problem of Open-World Temporal Action Localization (OW-TAL). The potential patterns underlying unknown and known actions, as well as their dynamic transformation, are modeled in a unified pipeline. Specifically, a self-attention-based position-sensitive module is designed to produce actionness scores for unknown actions in a class-agnostic way. In addition, an iterative optimization strategy is developed so that knowledge derived from known categories can be shared with the unknown ones, and a self-paced learning strategy is proposed to guide class-incremental learning while mitigating catastrophic forgetting. Benefiting from these components, UK-Net yields superior performance on three challenging datasets, i.e., THUMOS14, ActivityNet1.2, and MUSES. Experimental results also show that our method is competitive with traditional closed-world counterparts.
Keywords: Temporal action localization, Open-world learning, Self-paced learning
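
As a rough illustration of the self-paced learning idea mentioned in the abstract, the sketch below uses the classic hard-weighting formulation: a sample takes part in training only while its loss is below a threshold, and the threshold grows so that harder samples are admitted later. This is not the UK-Net training procedure. The function names, the threshold values, and the fixed loss list are assumptions made for illustration only; in real training the losses would be recomputed after each round.

```python
import numpy as np


def self_paced_weights(losses, lam):
    """Hard self-paced weights: 1.0 where loss < lam, otherwise 0.0."""
    losses = np.asarray(losses, dtype=float)
    return (losses < lam).astype(float)


def self_paced_schedule(losses, lam0=0.5, growth=2.0, rounds=3):
    """Return the indices of the selected samples for each round.

    lam0, growth and rounds are hypothetical hyperparameters; the
    threshold is multiplied by `growth` after every round.
    """
    selected = []
    lam = lam0
    for _ in range(rounds):
        weights = self_paced_weights(losses, lam)
        selected.append(np.flatnonzero(weights).tolist())
        lam *= growth  # admit harder samples in the next round
    return selected


if __name__ == "__main__":
    # Toy per-sample losses (made-up values, held fixed for simplicity).
    toy_losses = [0.2, 0.9, 1.5, 0.4]
    for round_idx, idx in enumerate(self_paced_schedule(toy_losses)):
        print(f"round {round_idx}: samples {idx}")
```

With the toy losses above, the easy samples are selected first and the remaining ones enter as the threshold grows, which is the easy-to-hard curriculum that self-paced learning is generally meant to provide.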