Abstract: Extracting polygonal building footprints from off-nadir imagery is crucial for diverse applications. Current deep-learning-based extraction approaches predominantly rely on semantic ...
Abstract: This paper presents DSConvNet, a novel architecture based on depthwise separable convolutional blocks for efficient multi-class image classification. Despite progress in compact ...
Automatically describing an image in natural language is an emerging challenge in both computer vision and natural language processing. In this paper, we present Long Short-Term ...
This repository contains the code for the paper "MSSPlace: Multi-Sensor Place Recognition with Visual and Text Semantics". High-level overview of the proposed multimodal MSSPlace method. The MSSPlace ...
Stable Diffusion is a latent text-to-image diffusion model. For better efficiency and speed on GPUs, we highly recommend installing the xformers library. Tested on A100 with CUDA 11.4. Installation ...