On January 26, 2017, we are organizing the third day on machine learning in astrophysics at DAp, CEA Saclay.

**Program:**

All talks take place at DAp, Salle Galilée (Building 713).

10:00 - 10:45h. **Marc Duranton** (CEA Saclay)

10:45 - 11:15h. **Rémi Flamary** (Université Nice-Sophia Antipolis)

11:15 - 11:45h. **Christoph Ernst René Schäfer** (EPFL)

12:00 - 13:30h. Lunch

13:30 - 14:00h. **Emille Ishida** (Laboratoire de Physique de Clermont)

14:00 - 14:30h. **Silvia Villa** (Politecnico di Milano)

14:30 - 15:00h. **Arthur Pajot** (LIP6)

15:00 - 15:30h. **Morgan Schmitz** (CEA Saclay - CosmoStat)

15:30 - 16:00h. Coffee break

16:00 - 17:00h. Round table

### Deep Learning for Physical Processes: Incorporating Prior Scientific Knowledge

**Arthur Pajot** **(LIP6)**

We consider the use of Deep Learning methods for modeling complex phenomena such as those occurring in natural physical processes. With the large amount of data gathered on these phenomena, the data-intensive paradigm could begin to challenge more traditional approaches elaborated over the years in fields like mathematics or physics. However, despite considerable successes in a variety of application domains, the machine learning field is not yet ready to handle the level of complexity required by such problems. Using an example application, namely Sea Surface Temperature prediction, we show how general background knowledge gained from physics could be used as a guideline for designing efficient Deep Learning models.
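As a toy illustration of this general idea (not the speaker's actual model), a network's training loss can combine a standard data-fit term with a penalty derived from a simple physical prior, here a crude advection model. All names, the integer-shift warp, and the weighting are hypothetical simplifications:

```python
import numpy as np

def physics_informed_loss(pred, target, motion, prev_frame, lam=0.1):
    """Toy loss: data-fit term plus a physics-based prior.

    The prior penalizes deviation from a crude advection model: the
    predicted frame should resemble the previous frame shifted by an
    integer motion vector (a hypothetical simplification, for
    illustration only).
    """
    data_term = np.mean((pred - target) ** 2)
    dy, dx = motion
    # integer-pixel advection of the previous frame by the motion vector
    advected = np.roll(np.roll(prev_frame, dy, axis=0), dx, axis=1)
    physics_term = np.mean((pred - advected) ** 2)
    return data_term + lam * physics_term
```

In a real model the warp would be differentiable (e.g. bilinear sampling along an estimated motion field) so the physics term can be backpropagated through.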

### Wasserstein Dictionary Learning

**Morgan Schmitz** **(CEA Saclay - CosmoStat)**

Optimal Transport theory enables the definition of a distance on the set of measures over any given space. This Wasserstein distance naturally accounts for geometric warping between measures (including, but not limited to, images). We introduce a new Optimal Transport-based representation learning method, in close analogy with the usual Dictionary Learning problem. The latter typically relies on a matrix dot-product between the learned dictionary and the codes making up the new representation, so the relationship between atoms and data is ultimately linear.

We instead use automatic differentiation to derive gradients of the Wasserstein barycenter operator, and we learn a set of atoms and barycentric weights from the data in an unsupervised fashion. Since our data is reconstructed as Wasserstein barycenters of our learned atoms, we can make full use of the attractive properties of the Optimal Transport geometry. In particular, our representation allows for non-linear relationships between atoms and data.
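The Wasserstein barycenter operator at the heart of this approach can be sketched with entropic regularization. The following minimal NumPy example is a generic iterative Bregman projection scheme (in the style of Benamou et al.), not the speaker's implementation; the function name and parameters are illustrative:

```python
import numpy as np

def sinkhorn_barycenter(atoms, weights, cost, eps=1.0, n_iter=100):
    """Entropic Wasserstein barycenter of histograms, computed by
    iterative Bregman projections (a standard sketch).

    atoms:   (n, m) array, each column a probability histogram (an atom)
    weights: (m,) barycentric weights, summing to 1
    cost:    (n, n) ground cost between histogram bins
    """
    K = np.exp(-cost / eps)        # Gibbs kernel
    v = np.ones_like(atoms)        # one scaling vector per atom
    for _ in range(n_iter):
        u = atoms / (K @ v)                        # match each atom's marginal
        b = np.prod((K.T @ u) ** weights, axis=1)  # weighted geometric mean
        v = b[:, None] / (K.T @ u)                 # match barycenter marginal
    return b
```

In the setting described above, one would differentiate through these fixed-point iterations with automatic differentiation and update the atoms and barycentric weights by gradient descent, so that each data point is reconstructed as a barycenter of the learned atoms.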
