Research Teams and Lines

Group Seminars

Year 2024

Seminar of Gianluca Manzan: 09/04/2024

Title: Efficiency limits of Restricted Boltzmann Machines in a Teacher-Student Framework

Abstract:
Unsupervised machine learning with Boltzmann machines is the inverse problem of finding a suitable Gibbs measure to approximate an unknown probability distribution from a training set consisting of a large number of samples. The minimum size of the training set necessary for a good estimate depends on the properties of both the data and the machine. We investigate this problem in a controlled environment where a Teacher Restricted Boltzmann Machine (T-RBM) is used to generate the dataset and a Student machine (S-RBM) is trained on it. We consider different classes of unit priors and weight regularizers, and we analyze both the informed and the mismatched case, which differ in the amount of information the Student receives about the Teacher model. We describe the results in terms of phase transitions in the Student posterior distribution, interpreted as a statistical-mechanics system. In the analysis we give special attention to the Hopfield model scenario, where the problem is expressed in terms of phase diagrams describing the zoology of the possible working regimes of the entire environment. In this case it is possible to observe the difference between the memorization and the learning approach: when the data become large and noisy, learning overtakes memorization.

Link to article: Applied Mathematics and Computation
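
As a concrete illustration of the setup described in the abstract, here is a minimal teacher-student sketch (numpy only): a teacher RBM with random weights generates binary samples by block Gibbs sampling, and a student RBM of the same shape is trained on them with one-step contrastive divergence. All sizes, rates and the CD-1 training choice are illustrative assumptions, not the authors' actual protocol.

```python
import numpy as np

rng = np.random.default_rng(0)
N_VIS, N_HID, N_SAMPLES = 100, 10, 5000    # illustrative sizes

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gibbs_step(v, W):
    """One block Gibbs sweep: sample hidden given visible, then visible given hidden."""
    h = (rng.random((v.shape[0], W.shape[1])) < sigmoid(v @ W)).astype(float)
    return (rng.random((v.shape[0], W.shape[0])) < sigmoid(h @ W.T)).astype(float)

# Teacher: fixed random weights; the dataset comes from a long Gibbs chain.
W_teacher = rng.normal(0.0, 1.0 / np.sqrt(N_VIS), (N_VIS, N_HID))
data = (rng.random((N_SAMPLES, N_VIS)) < 0.5).astype(float)
for _ in range(200):                        # burn-in of the teacher chain
    data = gibbs_step(data, W_teacher)

# Student: same architecture, trained by one-step contrastive divergence (CD-1).
W_student = rng.normal(0.0, 0.01, (N_VIS, N_HID))
for epoch in range(50):
    v_neg = gibbs_step(data, W_student)     # negative phase from one Gibbs step
    grad = (data.T @ sigmoid(data @ W_student)
            - v_neg.T @ sigmoid(v_neg @ W_student)) / N_SAMPLES
    W_student += 0.05 * grad

# Crude proxy for teacher-student alignment, insensitive to hidden-unit permutations.
Wt = W_teacher / np.linalg.norm(W_teacher, axis=0)
Ws = W_student / np.linalg.norm(W_student, axis=0)
print("best cosine overlap per student unit:", np.abs(Ws.T @ Wt).max(axis=1).round(2))
```

The final cosine overlaps are only a rough diagnostic of how much of the Teacher the Student has recovered; the talk's phase-transition analysis is the principled version of this question.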

Seminar of Misaki Ozawa: 23/01/2024

Title: Renormalization Group Approach for Machine Learning Hamiltonian

Abstract:
Reconstructing, or generating, the Hamiltonian associated with a high-dimensional probability distribution starting from data is a central problem in machine learning and the data sciences. We will present a method, the Wavelet Conditional Renormalization Group (WCRG), that combines ideas from physics (renormalization-group theory) and computer science (wavelets, Monte Carlo sampling, etc.). The WCRG allows one to reconstruct, in a very efficient way, classes of Hamiltonians and the associated high-dimensional distributions hierarchically, from large to small length scales. We will present the method and then show its applications to data from statistical physics and cosmology.

Link to article: PRX 2023
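
To make the "hierarchically from large to small length scales" idea concrete, the toy sketch below performs the multiscale splitting on which the method rests, using a 1D Haar transform as an illustrative stand-in for the paper's wavelets: at each scale the field is separated into a coarse-grained part and detail coefficients, and the WCRG's job (not implemented here) is to learn the conditional law of the details given the coarser field.

```python
import numpy as np

def haar_step(x):
    """One renormalization step: coarse averages and detail differences."""
    coarse = (x[0::2] + x[1::2]) / np.sqrt(2.0)
    detail = (x[0::2] - x[1::2]) / np.sqrt(2.0)
    return coarse, detail

rng = np.random.default_rng(1)
field = rng.normal(size=256)          # toy 1D "field" configuration

levels = []
x = field
while x.size > 1:
    x, d = haar_step(x)
    levels.append(d)                  # details at successively larger scales

# In WCRG one would fit, at each scale, a conditional model p(details | coarse)
# and generate by sampling details from large to small scale; here we only
# verify that the decomposition is exactly invertible.
recon = x
for d in reversed(levels):
    up = np.empty(2 * recon.size)
    up[0::2] = (recon + d) / np.sqrt(2.0)
    up[1::2] = (recon - d) / np.sqrt(2.0)
    recon = up
print("perfect reconstruction:", np.allclose(recon, field))
```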

Year 2023

Seminar of Claudio Chilin: 20/11/2023

Title: The Hopfield model towards modern generalisations 

Abstract:
The Hopfield model is one of the few examples of neural computation systems that admit an analytical solution via the tools of statistical physics. The original formulation uses the inefficient Hebb rule, which can store a number P of uncorrelated patterns only up to P = 0.138N, where N is the number of neurons composing the network. In light of the results of modern machine learning, some improvements to the usual treatment can be suggested: the use of correlated data and of improved learning protocols (the classic Hebbian unlearning algorithm and a modern non-destructive version, the Daydreaming algorithm). The combination of these elements suggests the presence of previously unobserved behaviours of this model.

Link to article: OpenReview
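
A minimal sketch of the classical baseline the talk starts from, with illustrative parameters: Hebbian storage of P random patterns in a network of N neurons, and asynchronous zero-temperature retrieval from a corrupted cue. Pushing the load P/N past roughly 0.138 should visibly degrade the final overlap.

```python
import numpy as np

rng = np.random.default_rng(2)
N, P = 500, 50                        # load P/N = 0.1, below the 0.138 limit
xi = rng.choice([-1, 1], size=(P, N))

J = (xi.T @ xi) / N                   # Hebb rule
np.fill_diagonal(J, 0.0)              # no self-couplings

def retrieve(s, J, sweeps=20):
    """Asynchronous zero-temperature dynamics."""
    s = s.copy()
    for _ in range(sweeps):
        for i in rng.permutation(len(s)):
            s[i] = 1 if J[i] @ s >= 0 else -1
    return s

cue = xi[0] * np.where(rng.random(N) < 0.2, -1, 1)   # flip 20% of the spins
fixed = retrieve(cue, J)
print("overlap with the stored pattern:", (fixed @ xi[0]) / N)
```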

Seminar of Carlo Lucibello: 16/10/2023

Title: The Exponential Capacity of Dense Associative Memories

Abstract:
Recent generalizations of the Hopfield model of associative memory are able to store a number P of random patterns that grows exponentially with the number N of neurons, P = exp(αN). Besides the huge storage capacity, another interesting feature of these networks is their connection to the attention mechanism that is part of the Transformer architectures widely applied in deep learning. In this work, we consider a generic family of pattern ensembles and, thanks to the statistical-mechanics analysis of an auxiliary Random Energy Model, we are able to provide exact asymptotic thresholds for the retrieval of a typical pattern, α1, and lower bounds on the maximum load α at which all patterns can be retrieved, αc. Additionally, we characterize the size of the basins of attraction. We discuss in detail the cases of Gaussian and spherical patterns, and show that they display rich and qualitatively different phase diagrams.
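
As a hedged illustration of the attention connection mentioned above (not the paper's ensembles or proofs), the sketch below runs retrieval in a dense associative memory whose update x <- Xi^T softmax(beta * Xi x) is exactly one attention step, with the stored patterns acting as keys and values. All sizes and the inverse temperature beta are arbitrary choices.

```python
import numpy as np

rng = np.random.default_rng(3)
N, P, beta = 50, 2000, 40.0                   # far more patterns than neurons
Xi = rng.choice([-1.0, 1.0], size=(P, N)) / np.sqrt(N)   # unit-norm patterns as rows

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def update(x, Xi, beta):
    """One step of the dense-memory dynamics = one attention step."""
    return Xi.T @ softmax(beta * (Xi @ x))

target = Xi[0]
x = target.copy()
x[rng.random(N) < 0.1] *= -1.0                # corrupt the cue: flip ~10% of signs
for _ in range(5):
    x = update(x, Xi, beta)
cos = x @ target / (np.linalg.norm(x) * np.linalg.norm(target))
print("cosine overlap with the stored pattern:", round(cos, 3))
```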
 


Seminar of Matteo Negri: 05/07/2023

Title: Storage, Learning and Daydreaming in the Random-Features Hopfield Model

Abstract:
The Hopfield model is a paradigmatic model of neural networks that has been analyzed for many decades in the statistical physics, neuroscience, and machine learning communities. Inspired by the manifold hypothesis in machine learning, we propose and investigate a generalization of the standard setting that we name the "Random-Features Hopfield Model": here, the binary patterns are (non-linear) superpositions of binary feature vectors. Besides the usual retrieval phase, where the patterns can be dynamically recovered from some initial corruption, we uncover a new phase where the features hidden in the data can be recovered instead. We call this phenomenon the "learning phase transition", as the features are not explicitly given to the model but rather are inferred from the patterns in an unsupervised fashion. Starting from this, we explore the effect of algorithms that increase the storage capacity of the model, identifying behaviours that are surprisingly close to those of Restricted Boltzmann Machines. This could be a promising theoretical framework to understand the generalization capabilities of more complex neural networks.
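
To fix ideas, here is a hedged toy reading of "patterns as non-linear superpositions of binary features", with made-up sizes: patterns are signs of random combinations of a small feature dictionary and are stored with the plain Hebb rule; one can then probe whether patterns or features behave as attractors.

```python
import numpy as np

rng = np.random.default_rng(4)
N, D, P = 400, 20, 200                      # neurons, hidden features, patterns
f = rng.choice([-1, 1], size=(D, N))        # binary feature dictionary
c = rng.normal(size=(P, D))                 # random combination coefficients
xi = np.sign(c @ f)                         # patterns = signs of superpositions

J = (xi.T @ xi) / N                         # Hebbian storage of the patterns
np.fill_diagonal(J, 0.0)

def fixed_point(s, J, sweeps=30):
    s = s.copy()
    for _ in range(sweeps):
        s = np.sign(J @ s + 1e-12)          # synchronous dynamics, for brevity
    return s

m_pat = (fixed_point(xi[0], J) @ xi[0]) / N   # is a pattern still an attractor?
m_feat = (fixed_point(f[0], J) @ f[0]) / N    # or has a feature become one?
print(f"pattern overlap {m_pat:.2f}, feature overlap {m_feat:.2f}")
```

Scanning the load P/N at fixed D/N is the kind of experiment that separates the retrieval phase from the learning phase described in the abstract.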
 
 

Seminar of Nicolas Béreux: 24/05/2023

Title: Learning a restricted Boltzmann machine using biased Monte Carlo sampling

Abstract:
Restricted Boltzmann Machines (RBMs) are generative models able to learn complex distributions from a dataset. Their simple structure makes them particularly useful for interpretability and pattern-extraction applications. RBMs, like other energy-based generative models, struggle to describe highly structured data, mainly because their training relies on costly Markov Chain Monte Carlo (MCMC) processes and the cost of sampling multimodal distributions is prohibitive. In particular, we observe that RBMs perform dramatically poorly on artificial low-dimensional clustered datasets.
In our work, we investigate a biased sampling method named Tethered Monte Carlo (TMC) to overcome this limitation. This method makes it possible to properly sample such low-dimensional datasets in a significantly shorter time, leading to a more accurate likelihood gradient during training and allowing the RBM to learn such datasets accurately. The method can also be used to retrieve the distribution learned by the RBM after training, making it possible to assess the quality of the training.
However, this method breaks the intra-layer independence of the RBM, which prevents the parallelisation of the MCMC updates and limits the size of the models we can use.
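
Tethered Monte Carlo itself is more elaborate, but the following toy (entirely my own construction, not the authors' RBM code) illustrates the underlying mechanism on a double well standing in for a clustered dataset: the order parameter is pinned to a grid of tether values by a stiff spring, each constrained chain mixes easily, and the free-energy profile across both modes is recovered by integrating the mean tethered force.

```python
import numpy as np

rng = np.random.default_rng(5)
V = lambda x: (x**2 - 1.0) ** 2 / 0.05        # double well: plain MCMC mixes poorly
k = 200.0                                      # tether stiffness

def mean_tethered_force(m_hat, n_steps=5000, step=0.1):
    """Metropolis sampling of exp(-V(x) - k/2 (x - m_hat)^2); returns <k (x - m_hat)>."""
    E = lambda y: V(y) + 0.5 * k * (y - m_hat) ** 2
    x, forces = m_hat, []
    for _ in range(n_steps):
        y = x + step * rng.normal()
        if rng.random() < np.exp(min(0.0, E(x) - E(y))):
            x = y
        forces.append(k * (x - m_hat))
    return np.mean(forces[n_steps // 10:])     # discard burn-in

m_grid = np.linspace(-1.5, 1.5, 31)
f = np.array([mean_tethered_force(m) for m in m_grid])
# dF/dm_hat = -<k (x - m_hat)>, so integrate -f along the grid (trapezoid rule).
F = np.concatenate([[0.0], np.cumsum(-(f[1:] + f[:-1]) / 2 * np.diff(m_grid))])
print("free-energy profile (shifted):", np.round(F - F.min(), 1))
```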
 
 

Seminar of Miguel Ruiz García: 24/02/2023

Title: Loss function landscapes: how their structure can determine the fate of constraint satisfaction problems and machine learning

Abstract:
Optimizing a loss function in high-dimensional space lies at the heart of machine learning and other constraint satisfaction problems. The structure of these landscapes and the optimization method used to find solutions can determine the final outcome in both scenarios. In this talk, we will explore some similarities and differences between constraint satisfaction problems and supervised classification tasks, and show how analogies between the two fields can be exploited to propose new ideas. Time permitting, we will discuss work in progress bridging these topics to edge-of-stability optimization.
 
Link to article: ICML'21
 

Seminar of Federico Ricci-Tersenghi: 31/01/2023

Title: The spin glass physics behind hard inference problems

Abstract:
I will start by introducing some basic inference problems, such as finding the hidden partition or a proper coloring in a random graph. Then I will discuss the connection between statistical physics and Bayesian inference, showing the key role of phase transitions in inference problems: indeed, the behaviour of Bayes-optimal inference algorithms (like Belief Propagation) is tightly connected to these phase transitions. Moving away from the Bayes-optimal setting, much less is known for these problems, and spin-glass physics naturally arises. I will present a phenomenological theory able to predict the limits and the performance of solving algorithms based on Monte Carlo methods. The theory is based on our knowledge of mean-field spin glasses and witnesses the central role that spin-glass physics (awarded the Nobel Prize in 2021) plays in modern machine learning.
 

Year 2022

Seminar of Cyril Furtlehner: 14/12/2022

Title: Free Dynamics of Feature Learning Processes

Abstract:
Regression models usually aim to recover a noisy signal as a combination of regressors, also called features in machine learning, which are themselves the result of a learning process. The alignment of the prior feature covariance matrix with the signal is known to play a key role in the generalization properties of the model, i.e. its ability to make predictions on data unseen during training. We present a statistical-physics picture of this learning process. First, we revisit ridge regression to obtain compact asymptotic expressions for train and test errors by means of free probability, making manifest the conditions under which efficient generalization occurs. Then we derive an autonomous dynamical system, in terms of elementary degrees of freedom of the problem, determining the evolution of the relative alignment between the population matrix and the signal. Various dynamical mechanisms are unveiled, allowing one to interpret the dynamics of simulated learning processes and to reproduce trajectories of single experimental runs with high precision.
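
A small numerical companion to the ridge-regression part of the abstract (a toy experiment with made-up parameters, not the paper's asymptotic formulas): with a power-law feature spectrum, the test error depends strongly on whether the signal is aligned with the leading or the trailing eigendirections of the population covariance.

```python
import numpy as np

rng = np.random.default_rng(6)
d, n, lam, noise = 200, 300, 1e-2, 0.1
evals = 1.0 / np.arange(1, d + 1)                    # power-law population spectrum
U = np.linalg.qr(rng.normal(size=(d, d)))[0]         # random eigenbasis
Sigma = (U * evals) @ U.T
Sigma_half = (U * np.sqrt(evals)) @ U.T

def ridge_test_error(w_star):
    X = rng.normal(size=(n, d)) @ Sigma_half         # correlated training features
    y = X @ w_star + noise * rng.normal(size=n)
    w_hat = np.linalg.solve(X.T @ X + lam * n * np.eye(d), X.T @ y)
    # Expected squared error on a fresh point, plus the irreducible noise.
    return (w_hat - w_star) @ Sigma @ (w_hat - w_star) + noise**2

print("signal on top eigendirection    :", round(ridge_test_error(U[:, 0]), 4))
print("signal on weakest eigendirection:", round(ridge_test_error(U[:, -1]), 4))
```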
 
 

Seminar of Ilaria Paga: 22/11/2022

Title: Memory and rejuvenation in spin glasses: numerical simulations meet experiments.

Abstract:
Memory and rejuvenation effects in the magnetic response of off-equilibrium spin glasses have been widely regarded as the doorway into the experimental exploration of ultrametricity and temperature chaos (maybe the most exotic features in glassy free-energy landscapes). Unfortunately, despite more than twenty years of theoretical efforts following the experimental discovery of memory and rejuvenation, these effects have thus far been impossible to simulate reliably. Yet, three recent developments convinced us to accept this challenge: first, the custom-built Janus II supercomputer makes it possible to carry out "numerical experiments" in which the very same quantities that can be measured in single crystals of CuMn are computed from the simulation, allowing for parallel analysis of the simulation/experiment data. Second, Janus II simulations have taught us how numerical and experimental length scales should be compared. Third, we have recently understood how temperature chaos materializes in aging dynamics. All three aspects have proved crucial for reliably reproducing rejuvenation and memory effects on the computer. Our analysis shows that (at least) three different length scales play a key role in aging dynamics, while essentially all theoretical analyses of the aging dynamics emphasize the presence and the crucial role of a single glassy correlation length.
 

Seminar of Lorenzo Rosset: 20/10/2022

Title: Exploiting the learning dynamics of the Restricted Boltzmann Machine to construct relational trees of data

Abstract:
Unveiling the relational structure of a dataset is an interesting task with many practical applications. For instance, in biology, a relevant problem is reconstructing the evolutionary path of proteins by producing trees of homologous sequences.

In this talk, I will present a new and general method for building relational trees of data by leveraging the learning dynamics of the Restricted Boltzmann Machine (RBM). In the spirit of explainable Machine Learning, this method has its roots in the Mean Field approach developed in the context of disordered systems. The proposed algorithm can also be used to categorize the data while relying on minimal knowledge of the dataset. This approach yielded encouraging results when applied to three different real-world datasets, and it still offers several possible directions for improvement.

Seminar of Elisabeth Agoritsas: 10/10/2022

Title: Towards a unifying picture of driven disordered systems

Abstract:

Disorder is ubiquitous in physical systems, and can radically alter their physical properties compared to their 'pure' counterparts. For instance, amorphous materials such as emulsions, foams, metallic glasses or biological tissues are all structurally disordered, and this has key implications for their rheological, mechanical or transport properties. Nevertheless, theoretical descriptions of such 'driven' amorphous materials remain challenging, despite decades of extensive analytical and computational studies. The difficulties pertain to the interplay of competing sources of stochasticity, and to the resulting out-of-equilibrium nature of these systems. A standard model for amorphous materials, which allows one to focus on the key role of their structural (positional) disorder, is provided by dense many-body systems of pairwise interacting particles. Here I will introduce an exact Dynamical Mean-Field Theory (DMFT) for these many-body systems, derived in the limit of infinite spatial dimension. In this framework, the many-body Langevin dynamics of the whole problem can be exactly reduced to a single scalar effective stochastic process, and dynamical observables such as pressure or shear stress can be computed for arbitrary driving protocols. Using this DMFT, we were in particular able to establish a direct equivalence between a global forcing (external shear) and a random local forcing (reminiscent of active matter), upon a simple rescaling of the control parameter (the accumulated strain). In this framework, global shear is thus simply a special case of a much broader family of local forcings that can be explored by tuning their spatial correlations. Our predictions were moreover found to be in remarkably good agreement with two-dimensional numerical simulations. These results hint at a unifying framework for establishing rigorous analogies, at the mean-field level, between different families of driven disordered systems, such as sheared granular materials and active matter, or machine-learning algorithms.

 

Link to articles: Granular system, Out-of-Equilibrium dynamical system

Seminar of Antonio Lasanta: 21/09/2022

Title: Understanding the function of paralogous protein sequences

Abstract:

In this talk I will present some recent results, both theoretical and experimental, on the relaxation of systems subjected to one or two sudden quenches. In particular, I will show that during the transient evolution of these systems, before the equilibrium or stationary state is reached, and under certain specific conditions, some surprising and counterintuitive phenomena appear.

Seminar of Edoardo Sarti: 27/05/2022

Title: Understanding the function of paralogous protein sequences

Abstract:

One of the main ways organisms evolve new functional proteins is via a duplication event in their genome. When two copies of a gene are present, either the organism benefits from a larger concentration of the produced protein, or the sequence of one of the two copies accumulates mutations and diverges over evolution, often developing new functions. Annotating the function of paralogous sequences has always been very challenging, both in small-scale, expert-guided assays and in large-scale bioinformatics studies, where paralogs are the most important source of functional annotation errors.
ProfileView is a novel computational method designed to functionally classify sets of homologous sequences. It constructs a library of probabilistic models accurately representing the functional variability of protein families, and extracts biologically interpretable information from the classification process. We have tested it on the 11 proteins composing the Calvin-Benson cycle, obtaining fully consistent results for 8 of them and partially consistent results for 2 others. The knowledge gained about paralog function annotation in the CBC is now being employed to match same-function paralog sequences, producing joint MSAs for protein-protein interaction studies.

 
 

Seminar of David Yllanes: 06/05/2022

Title: Geometric control of thermalised elastic sheets: crumpling and buckling

Abstract:
Studies of buckling and instabilities of thin plates date back more than two centuries. However, stability predictions, such as for the critical buckling load, can be dramatically altered for nano-membranes (e.g. graphene) when thermal fluctuations become important. Two-dimensional materials such as graphene or MoS2 currently enable the experimental study of the mechanical properties of thermalised elastic sheets and provide a testing ground for many longstanding theoretical and numerical predictions. Particularly striking is the possibility of engineering elastic parameters such as the bending rigidity and Young's modulus over broad ranges simply by varying the overall size or temperature of atomically thin cantilevers and springs. Recent work by Blees et al., in addition to demonstrating a 4000-fold enhancement of the bending rigidity relative to its T = 0 value, has shown the potential of graphene as the raw ingredient of microscopic mechanical metamaterials. Employing the principles of kirigami (the art of cutting paper), one can construct robust microstructures, thus providing a route towards the design of mechanical metamaterials.
 
In this talk, using molecular dynamics simulations and analytical computations, I will explore the effect of geometry on the mechanical properties of thermalised elastic sheets, focusing on the crumpling and buckling transitions. Thermalised elastic membranes without distant self-avoidance undergo a crumpling transition when the microscopic bending stiffness is comparable to kT.  We propose a mechanism to tune the onset of the crumpling transition by altering the geometry and topology of the sheet through a dense array of holes. The critical exponents for the perforated membrane are compatible with those of the standard crumpling transition. In addition, we study thin ribbons under longitudinal compression. We find that the buckling behavior can be described by a mean-field theory with renormalised elastic constants when the ribbon length is shorter than the persistence length.
 
 

Seminar of Giovanni Catania: 22/04/2022

Title: Approximate inference on discrete graphical models: message-passing, loop-corrected methods and applications

Abstract:
Probabilistic graphical models provide a unified framework to analyze systems of many interacting degrees of freedom, combining elements of graph theory and probability theory. The computation of observables (or, equivalently, marginal distributions) of discrete or semi-discrete probabilistic graphical models is a fundamental step in most Bayesian inference methods and high-dimensional estimation problems, such as the evaluation of equilibrium observables in statistical mechanics models.
In this talk, I will present a recently introduced family of computational schemes to estimate the marginals of discrete undirected graphical models, called Density Consistency (DC). This method inherits properties of two existing and related algorithms, Expectation Propagation (EP) and Belief Propagation (BP, also known in statistical physics as the Bethe approximation): it shares with EP the ability to treat correlations of variables arising from loops of any length and, like BP, it provides the exact marginal densities when the underlying graph is acyclic. The novelty introduced by DC lies in a simple way to include self-consistently the effect of correlations in the cavity graph.
Results on finite-size Ising-like models show significant improvements in the estimation of equilibrium observables with respect to other state-of-the-art techniques. Moreover, I will discuss how to exploit DC as an advanced mean-field theory for the Ising ferromagnet, in which case it provides a remarkably good estimate of the critical point in finite dimension (greater than or equal to 3). Finally, a simple application to the inverse Ising problem is discussed, where DC allows one to derive closed-form expressions for the inferred couplings and fields in terms of the first and second empirical moments, outperforming other mean-field methods in the case of non-zero magnetization at low temperatures, as well as in the presence of random external fields.
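
To ground the message-passing vocabulary, here is a minimal belief-propagation implementation for a small pairwise Ising model (toy graph, couplings and fields of my own choosing); on an acyclic graph like this one it returns the exact magnetizations, which is the property DC shares with BP.

```python
import numpy as np

edges = [(0, 1), (1, 2), (1, 3)]                 # a small tree
J = {e: 0.6 for e in edges}                      # pairwise couplings
h = np.array([0.2, 0.0, -0.1, 0.3])              # external fields
n = len(h)

adj = {i: [] for i in range(n)}
for a, b in edges:
    adj[a].append(b)
    adj[b].append(a)

def coupling(i, j):
    return J[(i, j)] if (i, j) in J else J[(j, i)]

# Cavity messages u[(i, j)]: effective field that spin i passes to spin j.
u = {(i, j): 0.0 for a, b in edges for i, j in ((a, b), (b, a))}
for _ in range(50):                              # iterate to a fixed point
    for i, j in u:
        cavity = h[i] + sum(u[(k, i)] for k in adj[i] if k != j)
        u[(i, j)] = np.arctanh(np.tanh(coupling(i, j)) * np.tanh(cavity))

m = [np.tanh(h[i] + sum(u[(k, i)] for k in adj[i])) for i in range(n)]
print("BP magnetizations (exact on a tree):", np.round(m, 4))
```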
 
 
 
Seminar of Alex Pozas-Kerstjens: 11/02/2022
 
Title: Boltzmann machines without spin glasses
 
Abstract: Boltzmann machines are generative machine learning models with great appeal within the physics community, since they can be studied under the lens of statistical mechanics. In fact, many methods for training and evaluating Boltzmann machines are directly based on tools routinely used in the statistical physics community, and many insights come from there. In this talk I will describe some contributions in this direction. First, I will argue that initializing Boltzmann machines in a way equivalent to the Sherrington-Kirkpatrick spin glass model, despite being routine in the machine learning community, constitutes a bottleneck in training. However, this bottleneck is avoidable. I will present a regularization technique that allows one to control the number of minima in the energy landscape of the Boltzmann machine, thus avoiding the spin-glass phase at every stage of training. Furthermore, this regularization suggests a new way of computing the gradient updates with significant computational savings.
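
The talk's precise regularizer is its own; as a hedged stand-in, the sketch below penalizes the largest singular values of the weight matrix, since a broad spectrum of strong random couplings is what pushes the model toward a Sherrington-Kirkpatrick-like glassy landscape. The threshold and penalty form are my assumptions, not the speaker's rule.

```python
import numpy as np

rng = np.random.default_rng(7)
N_VIS, N_HID, s_max, lam = 100, 50, 1.0, 0.1

W = rng.normal(0.0, 1.0 / np.sqrt(N_VIS), (N_VIS, N_HID))   # SK-like random init

def spectral_penalty_grad(W, s_max):
    """Gradient of sum_i max(s_i - s_max, 0)^2 over the singular values s_i of W."""
    U, s, Vt = np.linalg.svd(W, full_matrices=False)
    excess = np.maximum(s - s_max, 0.0)
    return (U * (2.0 * excess)) @ Vt             # uses d s_i / dW = u_i v_i^T

# Inside a training loop this term would be added to the usual likelihood gradient.
for _ in range(100):
    W -= lam * spectral_penalty_grad(W, s_max)
print("largest singular value after regularization:",
      round(np.linalg.svd(W, compute_uv=False)[0], 3))
```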
 
 
 

Year 2021

Seminar of Tony Bonnaire: 27/10/2021

Title: The principal graph of the cosmic web: learning patterns in point cloud datasets

Abstract:
The spatial distribution of galaxies in the Universe is not uniform but rather stands on a gigantic structure commonly called the "cosmic web". In this pattern, dense nodes are linked together by elongated bridges of matter named filaments, which play a key role in the formation and evolution of galaxies but also carry information about the cosmological model.

In this presentation, we build several algorithms aiming at learning patterns in point-cloud datasets such as the galaxy distribution. We particularly focus on two kinds of patterns. First, clustered-type patterns, in which the data points are separated in the input space into multiple groups. We will show that the unsupervised clustering procedure performed with a Gaussian Mixture Model can be formulated as a statistical-physics optimization problem. This formulation enables the unsupervised extraction of much key information about the dataset itself, such as the number of clusters, their size and how they are embedded in space, which is particularly interesting for high-dimensional input spaces where visualization is not possible.

Second, we study spatially continuous datasets assumed to lie on an underlying 1D structure that we aim to learn. To this end, we resort to a regularization of the Gaussian Mixture Model in which a spatial graph is used as a prior to approximate the underlying 1D structure. The overall graph is efficiently learnt by means of the Expectation-Maximisation algorithm with guaranteed convergence, and comes together with the learning of the local width of the structure.
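
As a minimal illustration of the building block being regularized, here is a plain EM fit of an isotropic Gaussian Mixture Model to a toy 2D point cloud (data and settings invented for the example); the graph prior of the talk would act on the component means, pulling them onto the underlying 1D structure.

```python
import numpy as np

rng = np.random.default_rng(8)
X = np.vstack([rng.normal([-2.0, 0.0], 0.5, (200, 2)),    # two toy clusters
               rng.normal([2.0, 1.0], 0.8, (200, 2))])
n, d = X.shape
K = 2

mu = X[rng.choice(n, K, replace=False)]        # initialize means on random points
var = np.ones(K)                               # isotropic component variances
pi = np.full(K, 1.0 / K)                       # mixing weights

for _ in range(50):
    # E step: responsibilities r[i, k] proportional to pi_k N(x_i | mu_k, var_k I).
    d2 = ((X[:, None, :] - mu[None]) ** 2).sum(-1)
    log_r = np.log(pi) - 0.5 * d2 / var - 0.5 * d * np.log(var)
    r = np.exp(log_r - log_r.max(axis=1, keepdims=True))
    r /= r.sum(axis=1, keepdims=True)
    # M step: re-estimate weights, means and variances.
    Nk = r.sum(axis=0)
    pi = Nk / n
    mu = (r.T @ X) / Nk[:, None]
    d2 = ((X[:, None, :] - mu[None]) ** 2).sum(-1)
    var = (r * d2).sum(axis=0) / (d * Nk)
    # (The graph-regularized variant discussed in the talk adds a prior
    # coupling the mu_k along a spatial graph at this step.)

print("estimated means:\n", np.round(mu, 2))
```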