VO: Visual Odometry

Visual Odometry (VO) is a technique used to estimate the movement of a vehicle or robot in an unknown environment, using only information from one or more cameras. This technique is based on the principle that, by capturing sequences of images of a scene, it is possible to track the points […]

SLAM: Simultaneous Localization and Mapping

SLAM (Simultaneous Localization and Mapping) is a process used in robotics to build and update a map of an unknown environment while simultaneously tracking the robot’s position in that same environment. The process involves collecting and analyzing sensory data, such as camera images, LiDAR sensor readings, and IMU (Inertial Measurement Unit) data, […]

3DR: 3D Reconstruction

3D reconstruction (3DR) is a technical process that involves creating three-dimensional models from 2D data, such as images, videos, or point clouds. This process can be divided into several steps: data acquisition, processing, modeling, and rendering. Data acquisition is usually done through sensors, such as cameras, […]

PS: Panoptic Segmentation

Panoptic Segmentation (PS) is an advanced computer vision technique that combines the concepts of semantic and instance segmentation. In semantic segmentation, the goal is to classify each pixel in the image into a specific category (e.g., sky, road, person, car). In instance segmentation, the task is to identify and delimit individual objects within the image.
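As a minimal sketch of how the two outputs can be combined, the snippet below pairs a per-pixel class map with a per-pixel instance map. The function name `panoptic_merge`, the class ids, and the convention that instance id 0 means "no individual object" are all assumptions made for illustration, not part of any particular library.

```python
def panoptic_merge(semantic, instance):
    """Pair each pixel's class id with its instance id.

    semantic[r][c] is a class id; instance[r][c] is an instance id,
    where 0 is assumed to mean "no individual object" (e.g. sky, road).
    Returns a map of (class_id, instance_id) tuples of the same shape.
    """
    return [
        [(cls, inst) for cls, inst in zip(sem_row, inst_row)]
        for sem_row, inst_row in zip(semantic, instance)
    ]


# Hypothetical class ids: 0 = sky, 1 = road, 2 = car
semantic = [
    [0, 0, 0, 0],
    [2, 2, 1, 2],
    [1, 1, 1, 1],
]
# Two separate cars on the middle row, labelled 1 and 2
instance = [
    [0, 0, 0, 0],
    [1, 1, 0, 2],
    [0, 0, 0, 0],
]

panoptic = panoptic_merge(semantic, instance)
```

Here both cars share class id 2 but keep distinct instance ids, while sky and road pixels carry only a class.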

IS: Instance Segmentation

Instance Segmentation (IS) is a computer vision technique that focuses on identifying and differentiating individual object instances in an image. Unlike semantic segmentation, which classifies pixels into categories without distinguishing specific objects, IS provides an instance-level segmentation, where each object within a category is segmented […]
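To make the difference from semantic segmentation concrete, the toy sketch below takes a binary mask for a single category and splits it into separate instances using 4-connected component labelling. This is a stand-in chosen for illustration only: practical IS systems rely on learned models, and the helper name `label_instances` is hypothetical.

```python
from collections import deque


def label_instances(mask):
    """Give each 4-connected blob of foreground pixels its own id (1, 2, ...)."""
    rows, cols = len(mask), len(mask[0])
    labels = [[0] * cols for _ in range(rows)]
    next_id = 0
    for r in range(rows):
        for c in range(cols):
            if mask[r][c] and labels[r][c] == 0:
                # Unvisited foreground pixel: start a new instance here
                next_id += 1
                labels[r][c] = next_id
                queue = deque([(r, c)])
                while queue:
                    y, x = queue.popleft()
                    for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                        inside = 0 <= ny < rows and 0 <= nx < cols
                        if inside and mask[ny][nx] and labels[ny][nx] == 0:
                            labels[ny][nx] = next_id
                            queue.append((ny, nx))
    return labels, next_id


# Semantic-style mask: 1 = pixel belongs to the category, 0 = background
mask = [
    [1, 1, 0, 0],
    [1, 0, 0, 1],
    [0, 0, 1, 1],
]
labels, count = label_instances(mask)
```

The semantic mask only says which pixels belong to the category; the label map additionally says which object each of those pixels belongs to.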

SS: Semantic Segmentation

Semantic Segmentation (SS) is a computer vision technique that aims to assign a class label to each pixel in an image. Unlike object detection, which identifies and delimits objects in an image with bounding boxes, semantic segmentation provides a classification per pixel, generating a mask that defines […]
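Per-pixel classification can be illustrated with a small sketch: given one score per class at every pixel, the mask is obtained by taking the highest-scoring class at each location. The scores below are made up, and `scores_to_mask` is a hypothetical helper; in practice the scores would come from a trained model.

```python
def scores_to_mask(scores):
    """Turn per-pixel class scores into a label mask.

    scores[r][c] is a list with one score per class; the label for that
    pixel is the index of the highest score (argmax).
    """
    return [
        [max(range(len(pixel)), key=pixel.__getitem__) for pixel in row]
        for row in scores
    ]


# Made-up scores for a 2x2 image and three classes (0 = sky, 1 = road, 2 = car)
scores = [
    [[0.9, 0.1, 0.0], [0.7, 0.2, 0.1]],
    [[0.1, 0.8, 0.1], [0.1, 0.3, 0.6]],
]
mask = scores_to_mask(scores)
```

Every pixel receives exactly one class label, which is what distinguishes the resulting mask from a set of bounding boxes.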

OD: Object Detection

Object Detection (OD) is a computer vision technique that aims to identify and locate objects within an image or video. The process consists of detecting entities, locating them in a two-dimensional space, and classifying them into different categories. OD is a complex field that combines machine learning algorithms, […]

VQA: Visual Question Answering

Visual Question Answering (VQA) is a field of artificial intelligence that combines natural language processing (NLP) and computer vision techniques to answer questions about images. In technical terms, the VQA system takes as input an image and a natural language question about that image, and produces an answer, which can be […]

MRC: Machine Reading Comprehension

Machine Reading Comprehension (MRC) is a subfield of Artificial Intelligence (AI) that focuses on developing systems capable of reading and understanding texts in natural language. These systems not only identify words and sentences, but also interpret the context, semantics, and relationships between different parts of the text. MRC uses algorithms […]

MTL: Multi-Task Learning

Multi-Task Learning (MTL) is an approach in machine learning and deep learning where a model is trained to perform multiple tasks at the same time, rather than being trained separately for each task. In MTL, related tasks share a common representation, which allows the model to generalize better […]
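The idea of a shared representation can be sketched in a few lines: one encoder produces features that several task-specific heads reuse, and the per-task losses are combined into a single training objective. Everything in this sketch is an assumption made for illustration, including the function names, the tiny linear layers, the squared-error losses, and the weighted-sum combination, which is only one common way to merge task losses.

```python
def shared_encoder(x, w_shared):
    """Shared layer: a linear map followed by ReLU, used by every task."""
    return [max(0.0, sum(w * xi for w, xi in zip(row, x))) for row in w_shared]


def task_head(features, w_head):
    """Task-specific linear head producing a single scalar prediction."""
    return sum(w * f for w, f in zip(w_head, features))


def multitask_loss(x, targets, w_shared, heads, task_weights):
    """Weighted sum of per-task squared errors computed on shared features."""
    features = shared_encoder(x, w_shared)  # computed once, reused by all heads
    losses = [(task_head(features, w) - t) ** 2 for w, t in zip(heads, targets)]
    total = sum(tw * loss for tw, loss in zip(task_weights, losses))
    return total, losses


# Hypothetical two-task setup with hand-picked weights
x = [1.0, 2.0]
w_shared = [[1.0, 0.0], [0.0, 1.0]]   # identity encoder, for readability
heads = [[1.0, 1.0], [1.0, -1.0]]     # one weight vector per task
targets = [2.0, -1.0]
total, losses = multitask_loss(x, targets, w_shared, heads, [0.5, 0.5])
```

Because both heads read the same features, a gradient step on the combined loss would update the shared encoder using signal from both tasks at once.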