Vision-and-Language Navigation (VLN) is an interdisciplinary field at the intersection of computer vision, natural language processing and robotics. It involves the design of autonomous agents ...