Paper Title

TimeGate: Conditional Gating of Segments in Long-range Activities

Authors

Noureldien Hussein, Mihir Jain, Babak Ehteshami Bejnordi

Abstract

When recognizing a long-range activity, exploring the entire video is exhaustive and computationally expensive, as the video can span up to a few minutes. Thus, it is of great importance to sample only the salient parts of the video. We propose TimeGate, along with a novel conditional gating module, for sampling the most representative segments from a long-range activity. TimeGate has two novelties that address the shortcomings of previous sampling methods, such as SCSampler. First, it enables a differentiable sampling of segments. Thus, TimeGate can be fitted with modern CNNs and trained end-to-end as a single, unified model. Second, the sampling is conditioned on both the segments and their context. Consequently, TimeGate is better suited for long-range activities, where the importance of a segment heavily depends on the video context. TimeGate reduces the computation of existing CNNs on three benchmarks for long-range activities: Charades, Breakfast, and MultiThumos. In particular, TimeGate reduces the computation of I3D by 50% while maintaining the classification accuracy.
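
The idea of gating segments conditioned on both the segment and its video context can be illustrated with a small sketch. This is not the authors' architecture: the function names, the use of the mean segment feature as "context", the linear scoring, and the 0.5 threshold are all assumptions made for illustration. The soft gate (a sigmoid output) is the differentiable quantity one would train through; the hard selection is what would decide which segments an expensive CNN such as I3D actually processes.

```python
import math


def sigmoid(z):
    """Logistic function mapping a real score to (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def dot(a, b):
    """Dot product of two equal-length lists of floats."""
    return sum(x * y for x, y in zip(a, b))


def gate_segments(segments, w_seg, w_ctx, bias=0.0, threshold=0.5):
    """Hypothetical context-conditioned gate over video segments.

    segments: list of per-segment feature vectors (lists of floats),
              e.g. produced by some lightweight feature extractor.
    w_seg, w_ctx: illustrative weight vectors for the segment and the context.
    Returns (soft_gates, selected_indices).
    """
    n = len(segments)
    dim = len(segments[0])
    # Assumption: the video context is the mean feature over all segments.
    context = [sum(seg[d] for seg in segments) / n for d in range(dim)]
    ctx_score = dot(w_ctx, context)
    # Soft gate: smooth in the weights, so it could be trained end-to-end.
    soft = [sigmoid(dot(w_seg, seg) + ctx_score + bias) for seg in segments]
    # Hard gate: only these segments would be passed to the heavy CNN.
    selected = [i for i, g in enumerate(soft) if g >= threshold]
    return soft, selected
```

With four toy segments `[[2.0, 0.0], [0.0, 1.0], [-1.0, 0.0], [3.0, 1.0]]`, `w_seg=[1.0, 0.0]`, and `w_ctx=[-1.0, 0.0]`, the context lowers every score by the mean of the first feature, so only the segments that stand out relative to the rest of the video pass the gate. Skipping the other half of the segments is the kind of saving the abstract refers to, though the real method's gating and training details are not specified here.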
