Paper Notes - Autoencoders Are Vision Learners

  • DALL-E: Zero-Shot Text-to-Image Generation
  • BEiT: BERT Pre-Training of Image Transformers
  • Discrete Representations Strengthen Vision Transformer Robustness
  • iBOT: Image BERT Pre-Training with Online Tokenizer
  • Masked Autoencoders Are Scalable Vision Learners
  • VIMPAC: Video Pre-Training via Masked Token Prediction and Contrastive Learning
  • SignBERT: Pre-Training of Hand-Model-Aware Representation for Sign Language Recognition
Author: Xie Pan

Published: 2021-11-29

Updated: 2021-11-29
