Academic Project Proposals


Greyed Entries: already assigned projects.
  Quantum Machine Learning for Computer Vision
Quantum Machine Learning (QML) refers to the application of quantum algorithms to machine learning problems. Recently available tools (IBM Qiskit, Google Cirq, PennyLane) provide excellent starting points for research on QML algorithms. This project tackles classical CV problems (detection, classification, ...) with QML, as an introduction to the topic and to its performance compared with classical approaches.
Academic Supervisor: Fernando Vilarino
Supervisor e-mail: fernando@cvc.uab.es
Institution: UAB
Confidential: No
Date: 2023-01-31 11:33:54
  Bio-inspired networks for AI and ML
A long-standing aim for AI and ML is to mimic the behavior of the human brain. Hence, there is increasing interest in implementing biologically plausible mechanisms from neuroscience to improve state-of-the-art neural networks (NN). In this project, spiking NN will be used to classify small datasets (e.g. MNIST) using biologically plausible learning mechanisms such as spike-timing-dependent plasticity.
Extended abstract: Download PDF
Academic Supervisor: Xavier Otazu
Supervisor e-mail: xotazu@cvc.uab.cat
Institution: UAB
co-Supervisor: Olivier Penacchio
co-Supervisor e-mail: olivier.penacchio@uab.cat
Confidential: No
Date: 2023-02-02 11:38:23
  Biologically inspired algorithms for energy efficient machine learning
The astonishing improvements of ML since the emergence of deep learning (DL) rest on enormous computational resources, with unsustainable environmental cost. As the brain is an extremely efficient learning machine, using bio-inspired algorithms is a promising avenue for tackling these issues. This project explores this idea by comparing the energy use of bio-inspired and standard ML algorithms.
Extended abstract: Download PDF
Academic Supervisor: Olivier Penacchio
Supervisor e-mail: olivier.penacchio@uab.cat
Institution: UAB
co-Supervisor: Xavier Otazu
co-Supervisor e-mail: xotazu@cvc.uab.cat
Confidential: No
Date: 2023-02-02 11:46:13
  Automatic analysis of sport events in video sequences
Develop a robust deep learning architecture to track different players in a sports event, in particular, basketball. The main goal of this project is to improve the handling of occlusion and disappearance situations using global information (such as the number on the back or the face information). There is a possibility to finance the TFM.
Academic Supervisor: Javier Ruiz Hidalgo
Supervisor e-mail: j.ruiz@upc.edu
Institution: UPC
co-Supervisor: Josep Ramon Morros
co-Supervisor e-mail: ramon.morros@upc.edu
Confidential: No
Date: 2023-02-02 18:31:59
  Transcription and Decryption of handwritten ciphered document images
Contrary to text documents, there are few methods for recognizing manuscripts with uncommon alphabets, like cipher documents (secret messages in diplomatic letters, secret societies...). This work will focus on designing a transcription and decryption model based on Deep Learning architectures, and validating its applicability on real cipher images. More info: https://de-crypt.org/
Academic Supervisor: Alicia Fornes
Supervisor e-mail: afornes@cvc.uab.es
Institution: UAB
co-Supervisor: Mohamed Ali Souibgui
co-Supervisor e-mail: msouibgui@cvc.uab.cat
Confidential: No
Date: 2023-02-08 16:29:40
  Golf Swing analysis
Currently, the tools used to analyze the quality of the golf swing are based on very expensive multi-camera systems, available only at a few very large golf clubs. In this project, we want to study the feasibility of using Deep Learning models to perform this analysis on videos captured with a single low-cost camera, enabling its use as a tool for coordination and motor learning at schools.
Academic Supervisor: Xavier Baro
Supervisor e-mail: xbaro@uoc.edu
Institution: UOC
Confidential: No
Date: 2023-02-08 16:44:55
  Autonomous Driving powered by Deep Learning
CVC performs pioneering research worldwide on different topics at the intersection of deep learning (DL), simulation, and autonomous driving (AD). Known works are the CARLA simulator (carla.org) and AD by imitation learning (real-world AD example at www.youtube.com/watch?v=pzmQ-TmaGi0). This proposal offers a TFM on DL for AD, where the specific topic will be decided with the selected student.
Academic Supervisor: Antonio M. Lopez
Supervisor e-mail: antonio@cvc.uab.es
Institution: UAB
Assigned Student Name: Abel García Romera
Student e-mail: Abel.GarciaR@autonoma.cat
pre-Assigned Student Name: Abel Garcia
Confidential: No
Date: 2023-02-11 14:01:18
  Exploring continual learning capabilities of large language models
Large language models (LLMs) have shown incredible performance in language-related tasks. Nevertheless, the potential of LLMs can be utilized for other endeavors as well. For instance, it has been shown that they can store and simulate other neural networks inside their hidden layers. In this project you will explore the capabilities of frozen LLMs (such as GPT-3) for continual learning of images.
Academic Supervisor: Alex Gomez-Villa
Supervisor e-mail: agomezvi@cvc.uab.cat
Institution: UAB
co-Supervisor: Joost Van De Weijer
co-Supervisor e-mail: joost@cvc.uab.es
Confidential: No
Date: 2023-02-13 18:13:40
  Weakly Supervised Learning Segmentation applied to Wind Turbines Images: Loss exploration
The goal of the project is to develop an image segmentation algorithm for wind turbine blade imagery obtained during drone inspections. The project will pursue improvements over current networks by designing customized loss functions that include our prior knowledge of blade images.
Academic Supervisor: Antonio Agudo
Supervisor e-mail: Raül Perez
Institution: UPF
Confidential: No
Date: 2023-02-20 22:14:43
  Large scale crinoideus counting from ROV videos
The exhaustive monitoring of marine species is crucial to estimate the effects of protection measures and policies. This TFM proposes to reduce manual annotation needs and to specifically provide DL tools to estimate the populations of crinoideus from large DBs recorded using ROV in the deep sea. We will explore recent CNN architectures and attention mechanisms to provide reliable counting.
Extended abstract: Download PDF
Academic Supervisor: David Masip Rodo
Supervisor e-mail: dmasipr@uoc.edu
Institution: UOC
Confidential: No
Date: 2023-02-24 11:57:23
  Low-level image features’ contribution to aesthetic valuation.
Humans perceive and rate real world objects and images with ease. However, this task is very challenging for computers, mostly because of the large semantic content of current training datasets. We propose training a model on semantically-deprived images to understand the contribution to aesthetics of low-level features such as colourfulness, symmetry, image technical quality, etc.
Academic Supervisor: C. Alejandro Parraga
Supervisor e-mail: Alejandro.Parraga@cvc.uab.cat
Institution: UAB
Assigned Student Name: Marcos Muñoz González
Student e-mail: Marcos.MunozG@autonoma.cat
pre-Assigned Student Name: Marcos Munoz Gonzalez
Confidential: No
Date: 2023-03-09 15:14:03
  Neural 3D detail-aware shape from RGB images under general lighting
We aim to propose a new neural method to capture 3D objects along with illumination properties from pictures. To this end, we explore a universal approach that works in an uncalibrated, unified and unsupervised manner, is fully interpretable, and operates under general lighting without assuming any prior knowledge of the shape geometry to constrain the solution.
Academic Supervisor: Antonio Agudo
Supervisor e-mail: aagudo@iri.upc.edu
Institution: UPF
Confidential: No
Date: 2023-03-15 09:53:07
  3D fruit detection in LiDAR point clouds using graph or 3D neural networks
The detection of fruits is of great interest to predict the harvest resources in advance. The main goal of this project will be to explore and design new deep learning architectures based on Graph or 3D Neural Networks to be able to detect fruits in the 3D representation. There is a possibility to partially fund this work (see attached PDF).
Extended abstract: Download PDF
Academic Supervisor: Javier Ruiz Hidalgo
Supervisor e-mail: j.ruiz@upc.edu
Institution: UPC
co-Supervisor: Jordi Gene Mola
co-Supervisor e-mail: jordi.genemola@udl.cat
Assigned Student Name: Berkay Arpaci
Student e-mail: Berkay.Arpaci@autonoma.cat
Confidential: No
Date: 2023-03-15 13:48:06
  A study of automatic emotion recognition in children aged 3 to 5 years.
In this project we will explore Computer Vision techniques for children's facial analysis. The objective of this Master Thesis proposal is to assess whether the combinations of Action Units that previous literature has found to be associated with certain emotional expressions are correlated with three- to five-year-old children’s expression of those emotions perceived by the observer.
Academic Supervisor: Agata Lapedriza
Supervisor e-mail: alapedriza@uoc.edu
Institution: UOC
co-Supervisor: Lucrezia Crescenzi
co-Supervisor e-mail: lcrescenzi@uoc.edu
Confidential: No
Date: 2023-03-15 22:26:07
  Apples and Oranges: Topology Alignment for OCR-Free Topic Modeling in the Visual Domain
In the context of document understanding, many advances have been made in terms of information retrieval. So far, many of the topic model approaches in document analysis rely on extracting text via OCR in order to perform the computation in the textual domain. In this work, we propose a mechanism to perform this retrieval directly in the visual domain.
Academic Supervisor: Josep Llados
Supervisor e-mail: josep@cvc.uab.cat
Institution: UAB
co-Supervisor: Oriol Ramos
co-Supervisor e-mail: oriolrt@cvc.uab.cat
Assigned Student Name: Adrià Molina Rodríguez
Student e-mail: Adria.Molina@uab.cat
pre-Assigned Student Name: Adria Molina Rodriguez
Confidential: No
Date: 2023-03-16 14:58:59
  Smoke Evolution Measurement: Estimation Of 3D Shape And Volume Of Fire Plumes From Multiple Views
This project aims to measure wildfire plume dimensions and geometry in 3D space and time by means of computer vision techniques. In comparison with fuel and fire monitoring, smoke remote sensing is significantly less developed, mainly due to the highly dynamic nature of smoke and its very variable optical properties. A significant impact on plume model testing is expected for fire management operations.
Extended abstract: Download PDF
Academic Supervisor: Josep R. Casas
Supervisor e-mail: josep.ramon.casas@upc.edu
Institution: UPC
co-Supervisor: Montse Pardas
co-Supervisor e-mail: montse.pardas@upc.edu
Assigned Student Name: Júlia Ariadna Blanco Arnaus
Student e-mail: JuliaAriadna.Blanco@autonoma.cat
pre-Assigned Student Name: Julia Ariadna Blanco i Arnaus
Confidential: No
Date: 2023-04-12 10:25:40
  Exploring pretraining tasks for multimodal methods in DocVQA
This project consists of exploring different pretraining tasks on existing models for DocVQA. It is organized into two milestones. First, the student will select an existing model that has been pretrained with textual tasks and implement visual pretraining tasks. Then, he/she will extend the analysis of the results with additional pretraining tasks on both modalities.
Extended abstract: Download PDF
Academic Supervisor: Dimosthenis Karatzas
Supervisor e-mail: dimos@cvc.uab.cat
Institution: UAB
co-Supervisor: Ruben P. Tito
co-Supervisor e-mail: rperez@cvc.uab.cat
Confidential: No
Date: 2023-04-13 20:31:20
  Improving Flood Detection on SAR images using State-of-the-Art Computer Vision Algorithms
This project consists of applying state-of-the-art computer vision algorithms for flood detection, comparing their performance, and proposing novel ideas to improve them. The project is based on the NASA Interagency Implementation and Advanced Concepts Team's flood event detection contest, which involves using supervised learning to identify flood pixels in Synthetic Aperture Radar (SAR) images.
Extended abstract: Download PDF
Academic Supervisor: Lluis Gomez
Supervisor e-mail: lgomez@cvc.uab.cat
Institution: UAB
co-Supervisor: Ali Furkan Biten
co-Supervisor e-mail: abiten@cvc.uab.es
Confidential: No
Date: 2023-04-17 14:22:56
  Automated Animal Detection and Classification in Camera Traps using Computer Vision
Automate the categorization of species in camera trap data using computer vision. This project will provide a comparative study for researchers to investigate state-of-the-art models' ability to generalize to unseen locations, lighting conditions, and occlusions.
Extended abstract: Download PDF
Academic Supervisor: Lluis Gomez
Supervisor e-mail: lgomez@cvc.uab.cat
Institution: UAB
co-Supervisor: Ali Furkan Biten
co-Supervisor e-mail: abiten@cvc.uab.es
Confidential: No
Date: 2023-04-17 14:38:03
  Automated Building Damage Assessment using Satellite Imagery
The project aims to automate the process of assessing building damage after a natural disaster using state-of-the-art computer vision algorithms. The xView2 Challenge dataset, consisting of high-resolution satellite imagery, will be used to develop and compare models for building damage assessment.
Extended abstract: Download PDF
Academic Supervisor: Lluis Gomez
Supervisor e-mail: lgomez@cvc.uab.cat
Institution: UAB
co-Supervisor: Ali Furkan Biten
co-Supervisor e-mail: abiten@cvc.uab.es
Confidential: No
Date: 2023-04-17 15:26:14
  Exploring Rejection Strategies for Zero-Shot Image Classification
This project aims to explore state-of-the-art rejection strategies in the zero-shot setting for popular models such as CLIP, which have shown impressive performance in zero-shot image classification. By examining the effectiveness of various rejection strategies, we hope to improve the robustness and accuracy of these models.
Extended abstract: Download PDF
Academic Supervisor: Lluis Gomez
Supervisor e-mail: lgomez@cvc.uab.cat
Institution: UAB
Assigned Student Name: Hicham El Muhandiz Aarab
Student e-mail: Hicham.ElMuhandiz@autonoma.cat
Confidential: No
Date: 2023-04-17 15:44:57
  Large Language Models for Document Visual Question Answering
Document visual question answering is an important tool to perform high-level reasoning and interpret document images. Nowadays, large language models are becoming popular in question answering tasks. In this project, we aim to incorporate large language models in a machine learning model that answers user questions and queries about a document image in a multi-modal fashion.
Extended abstract: Download PDF
Academic Supervisor: Dimosthenis Karatzas
Supervisor e-mail: dimos@cvc.uab.es
Institution: UAB
co-Supervisor: Mohamed Ali Souibgui
co-Supervisor e-mail: msouibgui@cvc.uab.es
Assigned Student Name: Anna Oliveras Tous
Student e-mail: Anna.OliverasT@autonoma.cat
Confidential: No
Date: 2023-04-17 18:23:42
  Prior-based implicit reconstruction of human avatars
In this project we aim to explore the use of parametric priors to boost the performance of implicit representations in the task of building human avatars.
Academic Supervisor: Francesc Moreno-Noguer
Supervisor e-mail: fmoreno@iri.upc.edu
Institution: UPC
Assigned Student Name: Alvaro Francesc Budria Fernández
Student e-mail: AlvaroFrancesc.Budria@autonoma.cat
pre-Assigned Student Name: Alvaro Francesc Budria
Confidential: No
Date: 2023-04-24 10:35:40
  Image retrieval with text modifiers
In this project, we will study the task of image retrieval, where the input query is specified in the form of an image plus some text that describes desired modifications to the input image.
Academic Supervisor: Lluis Gomez
Supervisor e-mail: lgomez@cvc.uab.cat
Institution: UAB
Assigned Student Name: Razvan-Florin Apatean
Student e-mail: RazvanFlorin.Apatean@autonoma.cat
Confidential: No
Date: 2023-05-16 13:31:49
  Adaptive Control and Task Generalization in Delta Robot Manipulation
This research develops an AI-driven Delta robot system for manipulation tasks. It includes a physical robot, a simulator, control algorithms, and an AI method utilizing an LLM. The LLM decomposes user requests into atomic tasks executed with traditional control. Closed-loop control is achieved through a camera. The method generalizes to new tasks and objects, enhancing the robot's capabilities.
Academic Supervisor: David Vazquez Bermudez
Supervisor e-mail: david.vazquez@servicenow.com
Institution: UAB
co-Supervisor: Michal Drozdzal
Assigned Student Name: Jia Qiang Ye Zhu
Student e-mail: JiaQiang.Ye@autonoma.cat
pre-Assigned Student Name: Jiaqiang Ye Zhu
Confidential: No
Date: 2023-05-17 18:08:45
  Self-supervised learning of multimodal representations in food recipes
This project aims at developing a self-supervised deep learning approach to learn video representations of food recipes jointly from either video frames and textual descriptions or from video frames and their corresponding audio signal. The learned representation will be evaluated on the downstream tasks of action prediction and action localization on different datasets related to cooking.
Academic Supervisor: Gloria Haro
Supervisor e-mail: gloria.haro@upf.edu
Institution: UPF
co-Supervisor: Coloma Ballester
co-Supervisor e-mail: Mariella Dimiccoli
pre-Assigned Student Name: Igor Ugarte
Confidential: No
Date: 2023-05-19 19:58:08
  Visual vs Textual Features: Towards Instance Document Layout Segmentation.
Document layout segmentation (DLS) is the task of identifying the different layout elements such as text, images, tables, and graphs, in a document image. One of the critical decisions in this task is the choice of features used to represent the layout elements. In this thesis, we compare the performance of visual with textual features and multi-modal (visual + textual) features in DLS.
Extended abstract: Download PDF
Academic Supervisor: Josep Llados
Supervisor e-mail: josep@cvc.uab.cat
Institution: UAB
pre-Assigned Student Name: Ayan Banerjee
Confidential: No
Date: 2023-05-26 12:01:59
  Diffusion Models for Replay in Continual Learning
Continual learning is a significant challenge in machine learning, involving learning from a continuous stream of data while retaining past knowledge. Replay-based methods have shown promise in mitigating catastrophic forgetting but suffer from computational costs and limited sample diversity. This project aims to overcome these limitations by leveraging diffusion models for continual learning.
Academic Supervisor: Ernest Valveny
Supervisor e-mail: ernest.valveny@cvc.uab.cat
Institution: UAB
co-Supervisor: Pau Rodriguez
co-Supervisor e-mail: pau.rodri1@gmail.com
Assigned Student Name: Sergi Masip Cabeza
Student e-mail: Sergi.Masip@autonoma.cat
pre-Assigned Student Name: Sergi Masip
Confidential: No
Date: 2023-05-30 16:22:43

Past year proposals.
  4D Neural Models from uncalibrated videos
The estimation of 4D shape reconstruction normally relies on sophisticated pre-defined models or exhaustive systems of capture. Unfortunately, the generality of these solutions does not scale to a wide variety of challenging objects in nature. In this project, we will present an uncalibrated, differentiable and self-supervised algorithm to recover high detailed 4D reconstructions from still video.
Academic Supervisor: Antonio Agudo
Supervisor e-mail: aagudo@iri.upc.edu
Institution: UPF
Assigned Student Name: Sergio Montoya de Paco
Student e-mail: Sergio.MontoyaDePaco@autonoma.cat
Confidential: No
Date: 2022-06-07 15:52:01
  Text conditional image generation
In this project, we are interested in generating images that match a given text. In particular, we will work with the fashionGEN dataset, which contains fashion outfits with their text descriptions.
Academic Supervisor: David Vazquez Bermudez
Supervisor e-mail: david.vazquez@servicenow.com
Assigned Student Name: Sergi García Sarroca
Student e-mail: Sergi.GarciaSa@autonoma.cat
pre-Assigned Student Name: Sergi Garcia Sarroca
Confidential: No
Date: 2022-05-31 13:41:19
  2-phase crossmodal search combining Dual Encoders and Visual-Language Models
Academic Supervisor: Lluis Gomez
Supervisor e-mail: lgomez@cvc.uab.es
Institution: UAB
co-Supervisor: Dimosthenis Karatzas
co-Supervisor e-mail: dimos@cvc.uab.es
Assigned Student Name: Joan Fontanals Martinez
Student e-mail: Joan.Fontanals@autonoma.cat
pre-Assigned Student Name: Joan Fontanals
Confidential: No
Date: 2022-05-24 22:42:39
  Methods to minimize human interaction in labeling images for flora and fauna visual identification
The research is part of the UPC participation in the XPrize Rainforest challenge. The goal is to identify the flora and fauna in a given rainforest area in a limited amount of time. Successful image classification systems are data-hungry. Our challenge is the lack of labeled images. We will explore methods to solve our classification problem with the minimum human interaction in the labeling task.
Academic Supervisor: Ferran Marques
Supervisor e-mail: ferran.marques@upc.edu
Institution: UPC
co-Supervisor: Antonio Torralba
co-Supervisor e-mail: torralba@csail.mit.edu
Assigned Student Name: Laia Albors Zumel
Student e-mail: Laia.Albors@autonoma.cat
pre-Assigned Student Name: Laia Albors Zumel
Confidential: No
Date: 2022-05-23 10:57:08
  Multimodal Data Representations for the Analysis of Social Media
Deep Learning for multimodal data representation is a fundamental technique to integrate data of different types into a common space. With this project we intend to explore multimodal representations for the automatic detection of hate speech and fake news in social media posts.
Extended abstract: Download PDF
Academic Supervisor: Ernest Valveny
Supervisor e-mail: Ernest.Valveny@uab.cat
Institution: UAB
co-Supervisor: Dimosthenis Karatzas
co-Supervisor e-mail: dimos@cvc.uab.es
Confidential: No
Date: 2022-05-17 14:36:12
  Synthesis of Virtual Avatar Animations from Sign Language Videos
Avatar synthesis is one of the most important and challenging tasks when it comes to sign language synthesis. In this project I will provide a system to convert sign animations from sign language videos. Additionally, a novel automatic system to generate a dataset will be proposed. Finally, an evaluation of different approaches to generate realistic sign animations will also be presented.
Extended abstract: Download PDF
Academic Supervisor: Coloma Ballester
Supervisor e-mail: coloma.ballester@upf.edu
Institution: UPF
co-Supervisor: Gloria Haro
co-Supervisor e-mail: gloria.haro@upf.edu
Assigned Student Name: Víctor Ubieto Nogales
Student e-mail: Victor.Ubieto@autonoma.cat
Confidential: No
Date: 2022-05-15 21:30:58
  Thermal event forecasting from video analysis in Wendelstein 7-X
Wendelstein 7-X is a Stellarator fusion prototype. UPC collaborates with IPP-MPI for a new operation phase to start in 2022. Data is available for research to develop image processing and deep learning tools for detection, tracking and classification of thermal events on Plasma Facing Components, and for the estimation of their evolution.
Extended abstract: Download PDF
Academic Supervisor: Josep R. Casas
Supervisor e-mail: josep.ramon.casas@upc.edu
Institution: UPC
co-Supervisor: Philippe Salembier, Aleix Puig-Sitjes
co-Supervisor e-mail: philippe.salembier@upc.edu
Confidential: No
Date: 2022-05-10 19:56:36
  Self-supervised learning of multimodal representations in food recipes
This project aims at developing a self-supervised deep learning approach to learn video representations of food recipes jointly from either video frames and textual descriptions or from video frames and their corresponding audio signal. The learned representation will be evaluated on the downstream tasks of action prediction and action localization on different datasets related to cooking.
Academic Supervisor: Coloma Ballester
Supervisor e-mail: coloma.ballester@upf.edu
Institution: UPF
co-Supervisor: Mariella Dimiccoli; Gloria Haro
co-Supervisor e-mail: mdimiccoli@iri.upc.edu; gloria.haro@upf.edu
Assigned Student Name: Igor Ugarte Molinet
Student e-mail: Igor.Ugarte@autonoma.cat
Confidential: No
Date: 2022-05-06 12:03:11
  Classifying and processing the information from the electricity bills of companies in Spain
In this project, we intend to extract information from electricity bills (such as euros or kilowatts spent) from some Spanish companies. For this, we need a machine learning algorithm that performs optical character recognition and classification, so that it is able to identify and classify the information contained in the bills. It should also deal with the clients’ private information.
Extended abstract: Download PDF
Academic Supervisor: Oriol Ramos
Supervisor e-mail: oriolrt@cvc.uab.cat
co-Supervisor: Dr. Melanie Revilla, Patricia Iglesias
co-Supervisor e-mail: melanie.revilla@upf.edu; patricia.iglesias@upf.edu
Assigned Student Name: Yu Pang
Student e-mail: Yu.Pang@autonoma.cat
Confidential: No
Date: 2022-04-04 09:41:50
  Audio-visual speech and singing voice separation 
Source separation is the automatic estimation of the individual isolated sources that make up the audio mixture. The goal of this project is to separate a human voice in a mixture by using both the audio and video modalities. Leveraging visual and motion information from the target person's face is particularly useful when there are different voices present in the mixture.
Extended abstract: Download PDF
Academic Supervisor: Gloria Haro
Supervisor e-mail: gloria.haro@upf.edu
Institution: UPF
Assigned Student Name: Eudald Ballescà Casas
Student e-mail: Eudald.Ballesca@autonoma.cat
Confidential: No
Date: 2022-03-25 16:17:35
  Multiple and Diverse Image Colorization
Image colorization is a problem with multiple possible solutions. The aim of this project is to explore which is the best approach (e.g., transformers, capsule networks, ...) to tackle this one-to-many problem, yielding plausible colorization results that are both spatially and semantically coherent.
Extended abstract: Download PDF
Academic Supervisor: Coloma Ballester
Supervisor e-mail: coloma.ballester@upf.edu
Institution: UPF
co-Supervisor: Lara Raad Cisa, Patricia Vitoria
co-Supervisor e-mail: lara.raadcisa@esiee.fr; patricia.vitoria@upf.edu
Confidential: No
Date: 2022-03-18 14:39:44
  Assisting people in successfully sorting their waste by using images
We aim to help people successfully sort their waste by using their smartphone camera. To do that, we need a mobile-friendly machine learning algorithm that performs image classification of the items to be recycled, and provides, as an outcome, the type of waste based on the main categories used in Spain: paper/cardboard, glass, plastic, green (organic waste), or garbage.
Extended abstract: Download PDF
Academic Supervisor: Dr. Melanie Revilla
Supervisor e-mail: melanie.revilla@upf.edu
Institution: UPF
co-Supervisor: Carlos Ochoa, Patricia Iglesias
co-Supervisor e-mail: carlos.ochoa@upf.edu, patricia.iglesias@upf.edu
Confidential: No
Date: 2022-03-14 12:31:25
  Knowledge-base for TextCaps and CTC
In this project, we will explore Cross-modal Retrieval, where the task is to provide a matching caption to a given image or vice-versa. The main idea is to create an external knowledge base or graph that captures relations between objects, scenes and scene-text. Later, the learned representation can be employed to assess which modalities can be employed to yield an improved retrieval performance.
Extended abstract: Download PDF
Academic Supervisor: Andres Mafla; Ali Furkan Biten
Supervisor e-mail: amafla@cvc.uab.es; abiten@cvc.uab.es
Institution: UAB
co-Supervisor: Dimosthenis Karatzas; Lluis Gomez
co-Supervisor e-mail: dimos@cvc.uab.es; lgomez@cvc.uab.es
Confidential: No
Date: 2022-03-10 17:28:41
  Image retrieval with text modifiers
In this project, we will study the task of image retrieval, where the input query is specified in the form of an image plus some text that describes desired modifications to the input image.
Extended abstract: Download PDF
Academic Supervisor: Ali Furkan Biten; Andres Mafla
Supervisor e-mail: abiten@cvc.uab.es; amafla@cvc.uab.es
Institution: UAB
co-Supervisor: Dimosthenis Karatzas; Lluis Gomez
co-Supervisor e-mail: dimos@cvc.uab.es; lgomez@cvc.uab.es
Confidential: No
Date: 2022-03-10 17:27:10
  Hierarchical Model for Cross Modal Retrieval
In this project, we will try to explore cross modal retrieval (CMR) where the task is to retrieve an image given its captions or retrieve a caption given an image. The main idea of the project is to create CMR models that can perform retrieval across different datasets by creating hierarchical embeddings. We will try to answer the hypothesis that learning concepts hierarchically will result in bet
Extended abstract: Download PDF
Academic Supervisor: Ali Furkan Biten; Dimosthenis Karatzas
Supervisor e-mail: abiten@cvc.uab.es; dimos@cvc.uab.es
Institution: UAB
co-Supervisor: Andres Mafla; Lluis Gomez
co-Supervisor e-mail: amafla@cvc.uab.es; lgomez@cvc.uab.es
Confidential: No
Date: 2022-03-10 17:25:19
  Understanding the dynamics of human interactions
Understanding social interactions in detail means understanding a large collection of social signals and social dynamics. In this project we will work on developing explainable deep learning models to approach fine-grained classification tasks related to the understanding of the dynamics of dyadic interactions.
Extended abstract: Download PDF
Academic Supervisor: Agata Lapedriza
Supervisor e-mail: alapedriza@uoc.edu
Institution: UOC
Assigned Student Name: José Manuel López Camuñas
Student e-mail: JoseManuel.LopezCam@autonoma.cat
pre-Assigned Student Name: Jose Manuel Lopez Camuñas
Confidential: No
Date: 2022-02-28 20:35:31
  Image Sentiment Analysis using commonsense knowledge
Image Sentiment Analysis is the problem of recognizing the emotions that images evoke in humans. In this project we will explore the use of external commonsense knowledge to improve the accuracy and generalization capabilities of deep learning models for image sentiment analysis. For that we will create bimodal neural networks that incorporate semantic reasoning.
Extended abstract: Download PDF
Academic Supervisor: Agata Lapedriza
Supervisor e-mail: alapedriza@uoc.edu
Institution: UOC
Assigned Student Name: Guang Jun Du
Student e-mail: GuangJun.Du@autonoma.cat
pre-Assigned Student Name: Guang Jun Du
Confidential: No
Date: 2022-02-28 20:29:31
  Automatic detection of fiber-cement roofs in aerial images
Some types of fiber-cement roofs can contain highly toxic materials. Although these types of roofs are currently banned, there are still many roofs that contain these toxic materials. Unfortunately the location of these types of roofs is often unknown. In this project we will work on the development of computer vision systems for the automatic detection of fiber-cement roofs in aerial images.
Extended abstract: Download PDF
Academic Supervisor: Agata Lapedriza
Supervisor e-mail: alapedriza@uoc.edu
Institution: UOC
co-Supervisor: Javier Borge
Assigned Student Name: Kevin Martín Fernández
Student e-mail: Kevin.MartinF@autonoma.cat
pre-Assigned Student Name: Kevin Martin Fernandez
Confidential: No
Date: 2022-02-28 20:25:36
  Automated count of fish species from submarine videos.
The research is a joint collaboration with the CSIC Marine Science Institute. The MSC recorded thousands of hours of video from the Mediterranean Sea, and needs to automatically count the number of times that specific strategic fish species appear in the footage, to estimate the population before and after ecological interventions. Drop me an email for more information (dmasipr@uoc.edu).
Academic Supervisor: David Masip
Supervisor e-mail: dmasipr@uoc.edu
Institution: UOC
Assigned Student Name: David Serrano Lozano
Student e-mail: David.SerranoL@autonoma.cat
Confidential: No
Date: 2022-02-24 13:07:07
  On generating plausible RAW data
Data augmentation is a drawback for training neural networks. Currently, it is performed using conversions that are not realistic, hindering the results. In this project, we will improve on this problem by generating plausible augmentations. Given a dataset, we will generate a plausible version of it in RAW format, from where we will be able to generate different realistic augmentations.
Extended abstract: Download PDF
Academic Supervisor: Javier Vázquez Corral
Supervisor e-mail: javier.vazquez@cvc.uab.cat
Institution: UAB
Confidential: No
Date: 2022-02-22 16:15:04
  Color manipulation for photographic enhancement
Creating models that, given an unprocessed image, output an image mimicking the result of a professional photographer is currently a hot topic. However, current models are cumbersome and differ from the few modifications that photographers are allowed to perform. We will tackle this problem by developing a model that can be expressed as a cascade of standard photographic image enhancements.
Extended abstract: Download PDF
Academic Supervisor: Javier Vázquez Corral
Supervisor e-mail: javier.vazquez@cvc.uab.cat
Institution: UAB
Assigned Student Name: Marcos V Conde Osorio
Student e-mail: MarcosV.Conde@autonoma.cat
Confidential: No
Date: 2022-02-22 15:59:29
  Can biological solutions help computers perceive symmetry?
Symmetry is an important visual cue for a wide range of biological organisms regardless of size and cognitive ability. The perception of symmetry is important for object processing because it facilitates target recognition and identification. Although easy for humans, it is very challenging for computers: it has been proposed as a robust "captcha" by Funk & Liu (CVPR 2016).
Extended abstract: Download PDF
Academic Supervisor: C. Alejandro Parraga
Supervisor e-mail: Alejandro.Parraga@cvc.uab.es
Institution: UAB
Confidential: No
Date: 2022-02-18 11:32:03
  Biologically-inspired tone mapping
The processing of high dynamic range images into lower dynamic range representations is something that the human visual system easily does all the time. However, this task (i.e. Tone Mapping or TM) is very difficult for machines. Our aim is to apply well-known primate visual mechanisms to tone-mapping algorithms to improve their performance.
Extended abstract: Download PDF
Academic Supervisor: C. Alejandro Parraga
Supervisor e-mail: Alejandro.Parraga@cvc.uab.es
Institution: UAB
Confidential: No
Date: 2022-02-18 11:27:28
  White balance in the presence of mixed colour illuminations
White balance (WB) algorithms compensate for the colours of illuminants. For example, tungsten lights introduce a yellowish cast. Unfortunately, many scenes exhibit a combination of illuminants (e.g., artificially lit indoor scenes plus light from a window). In these cases, WB is a challenging task. Our aim is to extend the capabilities of existing WB algorithms to mixed coloured illuminants.
Extended abstract: Download PDF
Academic Supervisor: C. Alejandro Parraga
Supervisor e-mail: Alejandro.Parraga@cvc.uab.es
Institution: UAB
Confidential: No
Date: 2022-02-18 11:22:43
  Sketch2Code
Designing structured apps such as websites is a labor-intensive process for front-end developers and designers. Recent advances in CV and NLP with Transformers have made it possible to generate HTML from simple sketches [1]. In this project, we propose developing new sketch2code techniques for more structured formats such as forms. [1] https://sketch2code.azurewebsites.net/
Academic Supervisor: Pau Rodríguez
Supervisor e-mail: pau.rodriguez@servicenow.com
Institution: UAB
co-Supervisor: David Vazquez
Assigned Student Name: Juan Antonio Rodríguez García
Student e-mail: JuanAntonio.RodriguezG@autonoma.cat
Confidential: No
Date: 2022-02-14 19:33:00
  Handwritten Music Recognition
Despite the rise of deep learning, the recognition of old handwritten scores is far from solved, because labeled training data is barely available and handwriting styles are highly variable. Thus, this work will focus on proposing deep learning methodologies for historical handwritten scores, taking into account the particularities of graphical music notation.
Academic Supervisor: Alicia Fornes
Supervisor e-mail: afornes@cvc.uab.es
Institution: UAB
Assigned Student Name: Pau Torras Coloma
Student e-mail: Pau.Torras@autonoma.cat
pre-Assigned Student Name: Pau Torras
Confidential: No
Date: 2022-02-10 13:23:59
  Fruit tracking using RGB-D data
The goal of this project is to capture a video sequence of fruit trees using a camera that travels along the row of trees. Most of the fruits that are occluded in one frame will be visible in another. To avoid counting a fruit multiple times, each fruit must be given a unique ID. This can be achieved using object tracking. RGB-D data will be used to improve the performance.
Extended abstract: Download PDF
Academic Supervisor: Josep Ramon Morros
Supervisor e-mail: ramon.morros@upc.edu
Institution: UPC
co-Supervisor: Jordi Gené Mola
co-Supervisor e-mail: jordi.genemola@udl.cat
Assigned Student Name: Francesc Net Barnés
Student e-mail: Francesc.Net@autonoma.cat
Confidential: No
Date: 2022-02-08 11:39:31
  ERP decoding to classify self-made and externally generated errors
In the present project, we want to explore the application of event-related potentials (ERPs) recorded in response to self-made errors and externally generated errors, and to investigate the accuracy of decoders. We will use different ERP descriptors and classifiers, on a single-trial basis, to decode internal and external causes of errors.
Extended abstract: Download PDF
Academic Supervisor: Xim Cerdá Company
Supervisor e-mail: xcerda@cvc.uab.cat
Institution: UAB
co-Supervisor: Alba Gómez Andrés
co-Supervisor e-mail: agomezandres@gmail.com
Confidential: No
Date: 2022-02-04 08:54:30
  3D fruit detection and size estimation using graph neural networks
Detecting fruits and measuring their size is of great interest for estimating the crop and predicting harvest resources. Nowadays, different sensors are able to register fruit trees into a 3D map of the environment. The main goal of this thesis will be to explore and design new deep learning architectures based on Graph Neural Networks to detect fruits and estimate their size.
Extended abstract: Download PDF
Academic Supervisor: Javier Ruiz Hidalgo
Supervisor e-mail: j.ruiz@upc.edu
Institution: UPC
co-Supervisor: Jordi Gené Mola
co-Supervisor e-mail: jordi.genemola@udl.cat
Assigned Student Name: Ignacio Galve Ceamanos
Student e-mail: Ignacio.Galve@autonoma.cat
Confidential: No
Date: 2022-02-02 16:03:32