One of the main goals of the European Green Deal is to reduce net greenhouse gas emissions to zero by 2050. Large industrial installations are therefore a fundamental area of focus. At the same time, the European Union wants to preserve its global economic competitiveness. It thus becomes crucial to map the environmental capabilities of EU countries and check whether they match the need for clean technology implied by the European Green Deal. One way to achieve this is to retrieve geo-localized documents describing R&D activities, in particular patents, across the EU and the world. Under a contract with the Joint Research Centre (JRC) of the European Union, we carried out a project, Patents4IPPC, that consisted of building an Information Retrieval (IR) engine capable of fetching relevant patents based on specific subsections of official documents called BREFs (Best Available Techniques reference documents). Following recent advances in the field of Natural Language Processing (NLP), we built a retrieval system based on the Transformer architecture. We were then able to resolve some language mismatches and speed up computation using FAISS indexing.
Cristiano De Nobili
Cristiano De Nobili is a Theoretical Physicist (Ph.D. @SISSA) now working mainly on AI applied to projects with environmental impact. He was previously an NLP & Deep Learning Scientist on Samsung's Bixby project. He lectures on Machine Learning in several master's programs in the private and public sectors. TEDx speaker and science communication enthusiast (twitter @denocris, IG @denocris).
Francesco Cariaggi is a Software Engineer and AI enthusiast with a Master's Degree in Computer Science from the University of Pisa, Italy. He currently works at Pi School, where he helps businesses grow thanks to AI (twitter @fcariaggi).