TransMix: Attend to Mix for Vision Transformers

Transformer-based architectures are widely used in computer vision. However, transformer-based networks are hard to optimize and can easily overfit when the training data is insufficient. A common remedy is to apply data augmentation and regularization techniques.

Image credit: Wikitude via Flickr, CC BY-SA 2.0

A recent paper on arXiv.org argues that this approach has ...

Pathdreamer: A World Model for Indoor Navigation

World models represent an agent’s knowledge about its surroundings. Using such a model, the agent can predict the future by ‘imagining’ the consequences of proposed actions. Nonetheless, world models that generate high-dimensional visual observations have been restricted to relatively simple environments.

An example of a robotic platform that could be used for indoor navigation. Image credit: Neurotechnology

A recent paper ...

