In this Tutorial, we aim to directly recreate some of our "aha"moments when exploring the impact of heat diffusion on the spatial resolution limit of photothermal imaging. Our objective is also to communicate how this physical limit can nevertheless be overcome and include some concrete technological applications. Describing diffusion as a random walk, one insight is that such a stochastic process involves not only a Gaussian spread of the mean values in space, with the variance proportional to the diffusion time, but also temporal and spatial fluctuations around these mean values. All these fluctuations strongly influence the image reconstruction immediately after the short heating pulse. The Gaussian spread of the mean values in space increases the entropy, while the fluctuations lead to a loss of information that blurs the reconstruction of the initial temperature distribution and can be described mathematically by a spatial convolution with a Gaussian thermal point-spread-function. The information loss turns out to be equal to the mean entropy increase and limits the spatial resolution proportional to the depth of imaged subsurface structures. This principal resolution limit can only be overcome by including additional information such as sparsity or positivity. Prior information can be also included by using a deep neural network with a finite degrees of freedom and trained on a specific class of image examples for image reconstruction.