Summary of papers I have read (updated as I go)

Metric learning

SoftTriple Loss: Deep Metric Learning Without Triplet Sampling
- Proposes a method that handles classification and metric learning with a single loss. Proves that smoothing the triplet loss yields the cross-entropy loss (apparently the two are essentially the same).
Visual Explanation for Deep Metric Learning
- Visualization of metric learning models.
Embedding Expansion: Augmentation in Embedding Space for Deep Metric Learning
- Applies a few simple rule-based optimizations to mining; accuracy improved regardless of which metric learning loss was used.
Moving in the Right Direction: A Regularization for Deep Metric Learning
- Compares regularization methods for deep metric learning; also discusses the dangers of triplet loss.
Deep Metric Learning via Adaptive Learnable Assessment
- Replaces the mining rules with learned ones and adopts an episode-based training scheme.
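
The note on SoftTriple Loss above says that a smoothed triplet loss becomes cross entropy. A minimal numeric sketch of that relation, assuming one center per class and plain similarity scores s_j between an embedding and each class center: the hardest "triplet" violation is max_j (s_j - s_y), and replacing the max with a log-sum-exp gives exactly the cross-entropy value. The function names and the example scores are hypothetical, and this is not the multi-center SoftTriple loss itself.

```python
import math

def hardest_center_gap(scores, label):
    """max_j (s_j - s_y): zero when the true class already has the top score."""
    return max(s - scores[label] for s in scores)

def smoothed_gap(scores, label, lam=1.0):
    """(1/lam) * log sum_j exp(lam * (s_j - s_y)): a log-sum-exp smoothing of the max."""
    gaps = [lam * (s - scores[label]) for s in scores]
    m = max(gaps)  # subtract the max for numerical stability
    return (m + math.log(sum(math.exp(g - m) for g in gaps))) / lam

def cross_entropy(scores, label):
    """-log softmax(scores)[label], computed stably."""
    m = max(scores)
    log_z = m + math.log(sum(math.exp(s - m) for s in scores))
    return log_z - scores[label]

# Hypothetical similarity scores between one embedding and three class centers.
scores = [2.0, 0.5, -1.0]
label = 1

print(hardest_center_gap(scores, label))    # hard max over centers
print(smoothed_gap(scores, label, lam=1.0)) # smoothed version
print(cross_entropy(scores, label))         # same value as the smoothed version
```

With `lam=1.0` the smoothed value and the cross entropy coincide; as `lam` grows, the smoothed value approaches the hard max.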

Spatiotemporal Contrastive Video Representation Learning
- Applies SimCLR to video classification. I want to adopt this.
Predicting Video with VQVAE
- 65% on Kinetics-600; a teacher-forcing-like approach can be used.
Is Space-Time Attention All You Need for Video Understanding?
- A Transformer-based video classifier; many new ideas.
VideoMix: Rethinking Data Augmentation for Video Classification
- Proposes VideoMix, a new data augmentation (DA) for video action recognition. The T-VideoMix variant looks like something I could adopt.
TSM: Temporal Shift Module for Efficient Video Understanding
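- Tackles the problem of 3D CNNs being too heavy by inserting a module called TSM into a 2D CNN as a substitute. TSM has zero parameters, so the complexity apparently stays that of the 2D CNN.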

Improved Conditional VRNNs for Video Prediction
- Predicts unseen video frames with a variational recurrent autoencoder. A typical RAE and very simple; if you want generation, this is the one to use.
Video Prediction via Example Guidance
- Not finished reading yet. The first multimodal model for video future prediction.
Predictive Learning: Using Future Representation Learning Variantial Autoencoder for Human Action Prediction
- Two streams: RGB and optical flow.

Revisiting ResNets: Improved Training and Scaling Strategies
- Training and scaling methods for ResNets.
An annotation-free whole-slide training approach to pathological classification of lung cancer types using deep learning
- Uses a unified memory (UM) mechanism and several GPU memory optimization techniques to avoid downscaling the images.
Prototypical Contrastive Learning of Unsupervised Representations
- EM-algorithm-based clustering; the distance function is changed progressively so that clusters converge less easily, which suppresses overfitting.