Temporal teacher with masked transformers for semi-supervised action proposal generation
Year of publication
2024
Authors
Pehlivan, Selen; Laaksonen, Jorma
Abstract
<p>By conditioning on unit-level predictions, anchor-free models for action proposal generation have displayed impressive capabilities, such as a lightweight architecture. However, task performance depends significantly on the quality of the training data, and the most effective models have relied on human-annotated data. Semi-supervised learning, i.e., jointly training deep neural networks on a labeled dataset and an unlabeled dataset, has made significant progress recently. Existing works have either focused primarily on classification tasks, which may require less annotation effort, or considered anchor-based detection models. Inspired by recent advances in semi-supervised methods for anchor-free object detectors, we propose a teacher-student framework for a two-stage action detection pipeline, named Temporal Teacher with Masked Transformers (TTMT), to generate high-quality action proposals based on an anchor-free transformer model. Leveraging consistency learning as a self-training technique, the framework jointly trains an anchor-free student model and a gradually progressing teacher counterpart in a mutually beneficial manner. As the core model, we design a Transformer-based anchor-free model to improve effectiveness for temporal evaluation. We integrate bi-directional masks and devise encoder-only Masked Transformers for sequences. Jointly trained on boundary locations and various local snippet-based features, our model uses the proposed scoring function to generate proposal candidates. Experiments on the THUMOS14 and ActivityNet-1.3 benchmarks demonstrate the effectiveness of our model for the temporal proposal generation task.</p>
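The abstract describes a "gradually progressing teacher" trained alongside a student but does not say how the teacher is updated. A common choice in teacher-student consistency learning is an exponential moving average (EMA) of the student's weights. The sketch below illustrates that generic rule only: the function name ema_update, the dictionary-of-lists parameter layout, and the decay values are illustrative assumptions, not details taken from the paper.

```python
from typing import Dict, List

# Hypothetical parameter container: parameter name -> flat list of weights.
Params = Dict[str, List[float]]


def ema_update(teacher: Params, student: Params, decay: float = 0.999) -> Params:
    """Return new teacher weights: decay * teacher + (1 - decay) * student.

    This is the generic EMA rule used by many mean-teacher style methods;
    whether TTMT uses exactly this rule is an assumption.
    """
    if not 0.0 <= decay <= 1.0:
        raise ValueError("decay must be in [0, 1]")
    return {
        name: [
            decay * t + (1.0 - decay) * s
            for t, s in zip(teacher[name], student[name])
        ]
        for name in teacher
    }


# Toy usage: the teacher moves a small step toward the student.
teacher = {"w": [0.0, 1.0]}
student = {"w": [1.0, 1.0]}
teacher = ema_update(teacher, student, decay=0.9)
```

With a decay close to 1 the teacher changes slowly, which is what makes its predictions on unlabeled data a comparatively stable target for the student.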
Organizations and authors
VTT Technical Research Centre of Finland Ltd
Pehlivan Selen
Publication type
Publication format
Article
Parent publication type
Journal
Article type
Original article
Audience
Scientific
Peer-reviewed
Peer-Reviewed
MINEDU's publication type classification code
A1 Journal article (refereed), original research
Publication channel information
Journal
Publisher
Volume
35
Issue
3
Article number
36
Pages
1-15
ISSN
Publication forum
Publication forum level
2
Open access
Open access in the publisher’s service
Yes
Open access of publication channel
Partially open publication channel
Self-archived
Yes
Other information
Fields of science
Computer and information sciences
Internationality of the publisher
International
Language
English
International co-publication
No
Co-publication with a company
No
DOI
10.1007/s00138-024-01521-7
The publication is included in the Ministry of Education and Culture’s Publication data collection
Yes