diff --git a/.gitignore b/.gitignore
index 7ae1fc9..b5a79c7 100644
--- a/.gitignore
+++ b/.gitignore
@@ -2,6 +2,8 @@
 notes
 main.pdf
 texmf
+.vscode
+feedback
 ## Core latex/pdflatex auxiliary files:
 *.aux
diff --git a/.vscode/ltex.dictionary.en-US.txt b/.vscode/ltex.dictionary.en-US.txt
deleted file mode 100644
index f747af1..0000000
--- a/.vscode/ltex.dictionary.en-US.txt
+++ /dev/null
@@ -1,47 +0,0 @@
-SimCLR
-SwAV
-PMC-HHG
-AUPRG
-ImageNet
-AUPR
-CCMIL
-VarMIL
-kurtosis
-pilocytic
-astrocytoma
-mpp
-Pilocytic
-astrocytomas
-astrocytes
-mpp
-Lanczos
-EntropyMasker
-IoU
-Máxima
-ependymoma
-SCLICOM
-informedness
-imgclsmob
-DeepMIL
-TinyClinicalBERT
-Huggingface
-Optuna
-Torchmetrics
-pyprg
-Dataverse
-pilocytic
-Craniopharyngioma
-Ganglioglioma
-Ependymoma
-Dysembryoplastic
-neuroepithelial
-Subependymal
-SimCLR
-iso-accuracy
-iso-accuracies
-SCLICOM
-intratumor
-parenchyma
-OOV
-ELMo
-Informedness
diff --git a/ANN/hhg.tex b/ANN/hhg.tex
new file mode 100644
index 0000000..3d2b4ca
--- /dev/null
+++ b/ANN/hhg.tex
@@ -0,0 +1,16 @@
+\section{Higher harmonic generation microscopy}
+Higher harmonic generation is a nonlinear scattering process resulting from femtosecond pulsed lasers interacting with tissue.
+Photons from the incident laser beam combine into one photon via a virtual state, conserving energy.
+In this study, two higher harmonic generation variants are used: second (SHG) and third harmonic generation (THG).
+THG occurs at structural interfaces, making it useful to image \eg cells and their nuclei or axons.
+SHG is generated by non-centrosymmetric structures, such as collagen.
+Sometimes, photons combine into a photon with a slightly lower energy than the combined incident photons.
+This process is autofluorescence.
+\Cref{fig:hhg-jablonski} shows a Jablonski diagram for THG, SHG and 2PEF.
+
+\begin{figure}[hb]
+    \centering
+    \includegraphics[width=\linewidth]{ANN/images/hhg-jablonski.png}
+    \caption[HHG Jablonski diagram]{Jablonski diagram for third (THG) and second harmonic generation (SHG), and two-photon excitation fluorescence (2PEF).}
+    \label{fig:hhg-jablonski}
+\end{figure}
diff --git a/ANN/images/hhg-jablonski.png b/ANN/images/hhg-jablonski.png
new file mode 100644
index 0000000..05594b6
Binary files /dev/null and b/ANN/images/hhg-jablonski.png differ
diff --git a/ANN/images/hhg-jablonski.pptx b/ANN/images/hhg-jablonski.pptx
new file mode 100644
index 0000000..87d9f28
Binary files /dev/null and b/ANN/images/hhg-jablonski.pptx differ
diff --git a/ANN/theory.tex b/ANN/theory.tex
index 3e67704..8be0f8c 100644
--- a/ANN/theory.tex
+++ b/ANN/theory.tex
@@ -20,10 +20,10 @@
 \end{figure}
 
 Inspired by this biological process, artificial neural networks have been developed.
-Later, \textcite{Fukushima1980} mimicked this neural network for two dimensional information, using convolution operations.
+Later, \textcite{Fukushima1980} mimicked this neural network for two-dimensional information, using convolution operations.
 The approach of \citeauthor{Fukushima1980} was inefficient.
 It could not learn to identify recurring features.
-To enable learning, \textcite{Rumelhart1986} developed backpropagation: an algorithm to learn importances at feature level.
+To enable learning, \textcite{Rumelhart1986} developed backpropagation: an algorithm to optimize a model to learn general features.
 \textcite{LeCun1990} was one of the first to use backpropagation in a visual setting.
 They combined convolutions and backpropagation into a convolutional neural network to recognize handwritten digits.
@@ -33,8 +33,34 @@ \section[CNN building blocks]{The building blocks of convolutional neural networks}
-\subsection{Artificial neural network}
-Neural network and backpropagation.
+\subsection{Artificial neural networks}
+An artificial neural network (ANN) is a computational model inspired by the structure and functioning of the human brain.
+It consists of connected artificial neurons, also known as nodes or units, organized into layers.
+Each neuron takes inputs, performs a mathematical operation on them, and produces an output.
+
+ANNs typically have an input layer, one or more hidden layers, and an output layer.
+Information flows through the network from the input layer, through the hidden layers, and finally to the output layer.
+The hidden layers contain neurons that transform the input data into a more useful representation.
+The transformations are dictated by weight matrices $W$.
+
+Neural networks are designed to learn from data through a process called training.
+During training, the network adjusts the strengths of connections between neurons, known as weights, based on the patterns and relationships in the input data.
+This process allows the network to recognize and generalize from examples, making it capable of solving complex problems and making predictions.
+
+Optimizing ANNs often relies on backpropagation (from backward propagation of errors)~\cite{Rumelhart1986}.
+Mathematically, an ANN $g$ with $L$ layers and activation function $f$ can be described as
+\begin{equation}
+    \hat{y} = g(x) = f^L \left\{W^L f^{L-1} \left[W^{L-1} \cdots f^1 \left(W^1 x\right) \cdots \right] \right\},
+\end{equation}
+where $\hat{y}$ is the output, given input $x$.
+To quantify the error of the model, the loss can be calculated with an appropriate error function $E(y, \hat{y})$, where $y$ is the target corresponding to input $x$.
+Calculating $\partial E / \partial w_{ij}$ allows updating individual weights of the network with \eg
+\begin{equation}
+    \Delta w_{ij} = -\eta \frac{\partial E}{\partial w_{ij}},
+\end{equation}
+where $\eta$ is the learning rate.
+When the gradients are estimated on randomly drawn batches of data, this optimization algorithm is called stochastic gradient descent (SGD).
+Backpropagation and SGD form the basis of neural network optimization, but other optimization algorithms are available, such as Adam~\cite{Kingma2014AdamAM}.
 
 \subsection{Convolutional layers}
 To distinguish a neural network from a convolutional neural network (CNN), at least one layer must be a convolution.
@@ -56,13 +82,10 @@ \subsection{Convolutional layers}
 The kernel with size $k=3$ moves across the input of size $i=5$ with stride $s = 2$ in both directions.
 
 Convolutions have the useful property that they are equivariant to translations.
-A function $f$ is equivariant to function $g$ if
-\begin{equation}
-    f \circ g = g \circ f.
-\end{equation}
-The equivariance of convolutions and translations implies that learned weights and biases belonging to a convolution can be reused for identifying similar features anywhere in inputs.
-Moreover, any operator that is not equivariant with convolutions may be used as a way to augment data, as the kernel perceives it is being different.
-Examples of such operators are scaling, rotating, and flipping.\marginnote{Move this to its own theory section about data augmentation?}
+The equivariance of convolutions to translations implies that learned weights and biases belonging to a convolution can be reused for identifying similar features anywhere in inputs.
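+This equivariance can be checked numerically.
+The following sketch is illustrative (it assumes NumPy and SciPy are available and uses a circular boundary so the identity holds exactly at the borders):
+\begin{verbatim}
+import numpy as np
+from scipy.signal import convolve2d
+
+rng = np.random.default_rng(0)
+image = rng.random((5, 5))
+kernel = rng.random((3, 3))
+
+def conv(x):
+    # Circular ("wrap") convolution keeps the check exact at the borders.
+    return convolve2d(x, kernel, mode="same", boundary="wrap")
+
+def shift(x):
+    # Translate the image one pixel down (circularly).
+    return np.roll(x, 1, axis=0)
+
+# Convolving a shifted image equals shifting the convolved image.
+assert np.allclose(conv(shift(image)), shift(conv(image)))
+\end{verbatim}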
+Any operator that is not equivariant to convolutions may be used as a way to augment data, as the kernel perceives the transformed input as different.
+Examples of such operators are scaling, rotating, and flipping.
+Applying these operators to input data trains the model to be invariant to them, meaning transformed versions of structures can still be identified.
 
 \begin{figure*}%[p]
     \centering
@@ -108,7 +131,7 @@ \subsection{Pooling}\label{sec:pooling}
 In deep learning, pooling is therefore used to quantify the presence of a pattern, as opposed to finding its position.
 
 \subsubsection{Max pooling}\label{subsec:maxpool}
-The most common form of pooling is max pooling (ref).
+The most common form of pooling is max pooling.
 The kernel finds the maximum value in sub-regions and maps these maximum values per sub-region to a new image.
 The output
 \begin{equation}
@@ -125,36 +148,55 @@ \subsubsection{Average pooling}\label{subsec:avgpool}
 
 \subsection{Activation functions}\label{sec:activations}
-\begin{enumerate}
-    \item heaviside
-    \item logistic curves
-    \item vanishing gradient problem -> relu
-\end{enumerate}
+\subsubsection{Saturating activation functions}
 
-\subsubsection{Vanishing gradient problem}
-Over the years, neural networks have become deeper, \ie more layers are being added.
+Neural networks require differentiable activation functions for backpropagation to update model weights.
+Sigmoids,
+\begin{equation}
+    \sigma(x) = \frac{e^x}{e^x + 1} = 1 - \sigma(-x),
+\end{equation}
+are such differentiable functions and are often used.
 
-Sigmoids (and other saturating curves like hyperbolical tangent) have.
+Over the years, neural networks have become deeper, \ie more layers are being added, to make function approximators generalize better.
+Sigmoids and other saturating curves like the hyperbolic tangent saturate on both sides, producing small gradients there.
+For deep networks, gradients resulting from the product of many small activation derivatives can become vanishingly small, preventing the model from learning.
+This phenomenon is called the vanishing gradient problem.
 
 \subsubsection{Rectified linear unit}\label{subsec:relu}
-To overcome the vanishing gradient problem, non-saturating activation functions can be used.
+To overcome the vanishing gradient problem, less saturating activation functions can be used.
 One such function is the rectified linear unit (ReLU).
 It is defined as
 \begin{equation}
     f(x) = x^+ = \max(0, x),
 \end{equation}
 such that only the positive arguments keep their value.
+ReLU only saturates on the left side.
 
+\subsubsection{Last layer activations}
+\paragraph{Classification}
+In classification tasks, the last layer usually contains multiple neurons, where each neuron stands for a specific class.
+To estimate probabilities summing to one, the activation function of the last layer can be replaced by the softmax function,
+\begin{align}
+    \mathrm{softmax}_i = \frac{e^{z_i}}{\sum_{j=1}^C e^{z_j}},
+\end{align}
+where $\vec{z} = \mathbf{W}\vec{h}+\vec{b}$, $C$ is the number of classes and $i$ the class of interest.
+
+\paragraph{Regression}
+For regression models, typically no activation function is used in the last layer, as the output should be unrestricted.
 
 % --------------------------------------------------
 % Loss functions
 % --------------------------------------------------
 \subsection{Loss functions}
-Backpropagation needs a loss to learn in which direction to update weights.
-There is a variety of loss functions available (ref review).
+Neural networks are often updated with a gradient-based optimizer such as SGD.
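+As a minimal numeric sketch of the update rule $\Delta w = -\eta\,\partial E/\partial w$ (illustrative Python; a scalar model $\hat{y} = wx$ with a squared-error loss is assumed):
+\begin{verbatim}
+# Fit y = 2x with a single weight using the gradient descent update rule.
+w, eta = 0.0, 0.1
+data = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)]  # (input, target) pairs
+
+for epoch in range(50):
+    for x, y in data:
+        y_hat = w * x
+        grad = 2 * (y_hat - y) * x  # dE/dw for E = (y_hat - y)^2
+        w -= eta * grad             # Delta w = -eta * dE/dw
+
+print(round(w, 3))  # converges to ~2.0
+\end{verbatim}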
+SGD computes gradients of the loss function with respect to the weights.
+Loss functions should be chosen depending on the target.
+Targets often fall into two categories: regression and classification.
+
+\subsubsection{Regression}
 
-\subsubsection{Mean absolute error}
+\paragraph{Mean absolute error}
 One of the most straightforward techniques of calculating the loss is the mean absolute error (MAE).
 It measures the average absolute difference between every prediction and target, like
 \begin{equation}
@@ -162,17 +204,17 @@ \subsubsection{Mean absolute error}
 \end{equation}
 where $n$ is the number of targets per sample, $y$ the prediction and $y'$ the target.
 
-The MAE loss is forgiving, i.\ e.\ outliers are weighted as much as predictions close to the target.
-In training a neural network, focusing on outliers is assumed to be beneficial, as those are the cases that the model has difficulty with (ref).
+The MAE loss is forgiving, \ie, outliers are weighted as much as predictions close to the target.
+In training a neural network, focusing on outliers can be beneficial, as those are the cases that the model has difficulty with.
 
-\subsubsection{Mean square error}
+\paragraph{Mean square error}
 To overcome the forgiving nature of the MAE loss, the mean square error (MSE) can be used.
 It measures the average squared difference between every prediction and target, like
 \begin{equation}
     \mathrm{MSE} = \frac{1}{n} \sum_{i=1}^n (y_i - y'_i)^2.
 \end{equation}
 
-\subsubsection{Focal MSE}
+\paragraph{Focal MSE}
 To put even more focus on hard targets, they can be given more importance than easy targets through the focal MSE loss (FL)~\cite{Lu2022}.
 To give less importance to the easier targets, FL follows
 \begin{equation}
@@ -180,12 +222,100 @@ \subsubsection{Focal MSE}
 \end{equation}
 where increasing $\gamma$ increases the number of targets regarded as easy and $\beta$ regulates the speed with which the first part of the curve increases.
 
+\subsubsection{Classification}
+For classification, the last layer gives an estimate of class probabilities.
+Targets are often binary: probability of zero and one for the negative and positive class, respectively.
+
+\paragraph{Cross entropy}
+Cross entropy can be used to calculate the loss between probabilities and their targets.
+It is defined as
+\begin{align}\label{eq:ce}
+    CE(x, y) = \frac{1}{N}\sum_{n=1}^N \left[-\sum_{c=1}^C y_{n,c} \log \frac{e^{x_{n,c}}}{\sum_{i=1}^C e^{x_{n,i}}}\right],
+\end{align}
+where $x$ is the (unnormalized) model output, $y$ the target, $C$ the number of classes, and $N$ the number of samples in the batch.
+
+\paragraph{Binary cross entropy}
+In the specific case where there are only two classes, \cref{eq:ce} can be reduced to
+\begin{align}
+    BCE(x, y) = -\frac{1}{N}\sum_{n=1}^N [y_n \cdot \log x_n + (1 - y_n) \cdot \log (1-x_n)],
+\end{align}
+where $x_n$ is now the predicted probability of the positive class.
+
 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
 % TRAINING A NEURAL NETWORK
 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
 \section{Training a neural network}
 
+\subsection{Training}\label{Training}
+At the start of training, a neural network has its weights and biases initialized.
+At this point, the model is most probably not capable of mapping input to output in a robust manner.
+To achieve this, the model parameters are repeatedly updated with backpropagation, shaping the model in the direction that minimizes the loss between target and model output.
+The neural network is presented with the input data in batches.
+Every batch, the model is updated with the backpropagation algorithm.
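+In code, this batch-wise loop typically looks as follows (an illustrative PyTorch-style sketch; \texttt{model}, \texttt{loader}, \texttt{loss\_fn} and \texttt{optimizer} are assumed to be defined elsewhere):
+\begin{verbatim}
+for epoch in range(num_epochs):           # several passes over the data
+    for inputs, targets in loader:        # the data arrives in batches
+        optimizer.zero_grad()             # clear gradients of the previous batch
+        outputs = model(inputs)           # forward pass
+        loss = loss_fn(outputs, targets)  # e.g. MSE or cross entropy
+        loss.backward()                   # backpropagation
+        optimizer.step()                  # e.g. an SGD update
+\end{verbatim}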
+One cycle of using all the batches is called an epoch.
+A training consists of multiple epochs.
+
+For every batch $\mathcal{B}$, a loss $\mathcal{L}_\mathcal{B}$ can be defined that is used by backpropagation to penalize the model's errors.
+Taking the average of all batch losses gives the epoch loss,
+\begin{equation}
+    \mathcal{L}_\mathrm{epoch} = \frac{1}{N_\mathcal{B}}\sum_{i=1}^{N_\mathcal{B}}\mathcal{L}_{\mathcal{B}_i},
+\end{equation}
+where $N_\mathcal{B}$ is the number of batches.
+Tracking $\mathcal{L}_\mathrm{epoch}$ shows how quickly the model is learning.
+
+To see if the model generalizes, it is standard practice to have a hold-out set that the model does not learn from, but that is only used to calculate the validation loss.
+Ideally, this validation loss follows a similar trajectory as the training loss.
+If the validation loss diverges upwards from the training loss, the model is overfitting.
+It fails to generalize to unseen but similar data.
+To remedy this, there are multiple possible solutions.
+Solutions include dropout, batch normalization, or less complex models (\ie shallower or narrower networks).
+
+\subsubsection{Dropout}\label{sec:dropout}
+Overfitting can be reduced by applying methods of regularization.
+One regularization method is dropout.
+It prevents neurons from co-adapting, which would otherwise reduce the chance that the model performs well on external validation sets~\cite{Srivastava2014}.
+With dropout, individual neurons are kept active with probability $p$, effectively dropping neurons randomly.
+
+\subsubsection{Batch normalization}\label{sec:bn}
+Batch normalization (BN)~\cite{Ioffe2015} is a technique to shift and scale batches akin to standardization.
+It can be implemented as a layer in any neural network.
+Per minibatch and per dimension, the mean and standard deviation of the input are calculated.
+Then, the input is standardized with
+\begin{equation}
+    \hat{x}_i = \frac{x_i - \mu_\mathcal{B}}{\sqrt{\sigma_\mathcal{B}^2 + \epsilon}},
+\end{equation}
+where $\mu_\mathcal{B}$ and $\sigma_\mathcal{B}$ are the mean and unbiased standard deviation of the batch, and $\epsilon$ is a small number for numerical stability when the variance is small.
+The standardized input is then mapped through
+\begin{equation}
+    y_i = \gamma \hat{x}_i + \beta,
+\end{equation}
+where $\gamma$ and $\beta$ are learnable parameters.
+
+When batch normalization is applied after a convolutional layer, the bias term of the convolution becomes redundant and can be set to zero to avoid unnecessary operations.
+
+BN has been shown to have a regularizing effect~\cite{Bjorck2018}, although combining it with dropout is disputed.
+More often than not, using both BN and dropout leads to worse results on the test set.
+
+\subsubsection{Model ensembling}\label{subsec:model_ensembling}
+A benefit of having a cyclic learning rate is generating multiple models across cycles.
+The effectiveness of ensembling models from multiple cycles, Snapshot Ensembling, was first described by \textcite{Huang2017}.
+During model training, a model can be saved (or checkpointed) at the best performing epoch for every cycle.
+The checkpointed models can be ensembled by choosing the last $m$ out of $n$ models and averaging the output, as
+\begin{equation}
+    \mathrm{output} = \frac{1}{m} \sum_{i=0}^{m-1} \mathrm{model}_{n-i}(\mathrm{input}).
+\end{equation}
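+In code, this ensemble average could look as follows (an illustrative sketch; \texttt{checkpoints} is assumed to hold the $n$ saved models in training order):
+\begin{verbatim}
+def ensemble_predict(checkpoints, x, m):
+    # Average the outputs of the last m checkpointed models.
+    outputs = [model(x) for model in checkpoints[-m:]]
+    return sum(outputs) / m
+\end{verbatim}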
+% \begin{figure}
+%     \centering
+%     \includegraphics[width=0.48\linewidth]{ANN/images/ensembling_huang_left.png}
+%     \includegraphics[width=0.48\linewidth]{ANN/images/ensembling_huang_right.png}
+%     \caption[Snapshot ensembling]{
+%         Left: Illustration of model optimization. The model converges to a local minimum.
+%         Right: Illustration of Snapshot Ensembling.
+%         The learning rate is cyclic and annealing, allowing the model to converge to and escape from local minima.
+%         Snapshots are taken at every minimum, which can be used for ensembling at inference.
+%         Reproduced from \fullcite{Huang2017} (Ref.~\cite{Huang2017}).
+%     }
+% \end{figure}
 
 % --------------------------------------------------
 % Hyperparameter optimization
 % --------------------------------------------------
@@ -225,9 +355,8 @@ \subsubsection{Grid search and random search}
 \end{figure}
 
 \subsubsection{Tree Parzen estimator}
-Still, random search requires trials in regions that are unpromising.
-This is inefficient.
-A tree-structured Parzen estimator (TPE)~\cite{Bergstra2011} approach aims to model the probability of a hyperparameter, given a loss value.
+Still, random search requires trials in regions that are unpromising, which is inefficient.
+A tree-structured Parzen estimator (TPE)~\cite{Bergstra2011} approach aims to model the probability of a hyperparameter\footnote{Or a set of hyperparameters in the case of multivariate TPE~\cite{Falkner2018}.}, given a loss value.
 That probability consists of two distributions, describing the good and bad values:
 \begin{equation}
     p(c|L) =
@@ -236,7 +365,7 @@
     \begin{cases}
         p(c|L > L^*) = p(c|\mathrm{bad}), \\
         p(c|L \leq L^*) = p(c|\mathrm{good}),
     \end{cases}
 \end{equation}
-where $c$ is drawn from $\mathcal{C}$ and $L$ is the loss.
+where $c$ is drawn from configuration space $\mathcal{C}$ and $L$ is the loss.
 $L^*$ is a loss above which losses are considered bad.
 TPE chooses $L^*$ to be a quantile of the observed $L$ values, such that $p(\mathrm{good}) = \gamma$.
 A promising candidate has low probability under $p(c|\mathrm{bad})$ and high probability under $p(c|\mathrm{good})$.
@@ -247,7 +376,7 @@
 is high.
 Ref.~\cite{Bergstra2011} shows that this ratio is proportional to the expected improvement~\cite{Jones2001}.
 The configuration responsible for the maximum of $\mathrm{promisingness}(c)$ is used as the next trial.
-Results of that trial are now categorized as good or bad, and the iterative process continues.\marginnote{add multivariate note \cite{Falkner2018}}
+Results of that trial are now categorized as good or bad, and the iterative process continues.
 
 \subsubsection{Successive Halving and Hyperband}
 Although $\mathcal{C}$ can be sampled more efficiently with TPE, trials still use the full computational budget, even if it is apparent that the trial is unpromising early on.
@@ -276,74 +405,6 @@ \subsubsection{Parameter importances}
 \textcite{Hutter2014} describe the fANOVA algorithm to quantitatively assess the importance of every hyperparameter.
 Knowing the importance of a variable gives more insight into interactions and relative importance between hyperparameters.
 
-\subsection{Training}\label{Training}
-At the start of training, a neural network has its weights and biases initialized.
-Most probably, the model is not capable of mapping input to output in a robust manner.
-To achieve this, repeatedly using backpropagation to update the model parameters aims to shape the model in the direction of minimizing the loss between target and model output.
-The neural network is presented with the input data in batches. -Every batch, the model is updated with the backpropagation algorithm. -One cycle of using all the batches is called an epoch. -A training consists of multiple epochs. - -For every $\mathcal{B}$th batch, a loss $\mathcal{L}_\mathcal{B}$ can be defined that is used by backpropagation to penalize the model performance. -Taking the average of all batch losses is the epoch loss, -\begin{equation} - \mathcal{L}_\mathrm{epoch} = \frac{1}{N_\mathcal{B}}\sum_{i=0}^{N_\mathcal{B}}\mathcal{L}_\mathcal{B}, -\end{equation} -Tracking $\mathcal{L}_\mathrm{epoch}$ shows how quickly the model is learning. - -To see if the model generalizes, it is standard practice to have a hold-out set, that the model does not learn from, but only uses to calculate the validation loss. -Ideally, this validation loss follows a similar trajectory as the training loss. -If the validation loss diverges upwards from the training loss, the model is overfitting. -It fails to generalize to unseen, but similar data. -To remedy this, there are multiple possible solutions. -Solutions include dropout, batch normalization, or more complex models (\ie deeper or wider networks). - -\subsubsection{Dropout}\label{sec:dropout} -Overfitting can be reduced by applying methods of regularization. -One regularization method is dropout. -It prevents neurons from co-adapting, which would otherwise reduce the chance of the model to perform well on external validation sets \cite{Srivastava2014}. -With dropout, individual neurons are activated with probability $p$, effectively dropping neurons randomly. - -\subsubsection{Batch normalization}\label{sec:bn} -Batch normalization (BN)~\cite{Ioffe2015} is a technique to shift and scale batches akin to standardization. -It can be implemented as a layer in any neural network. -Per minibatch and per dimension, the mean and standard deviation of the input are calculated. -Then, the input is standardized with -\begin{equation} - \hat{x}_i = \frac{x_i - \mu_\mathcal{B}}{\sqrt{\sigma_\mathcal{B}^2 + \epsilon}}, -\end{equation} -where $\mu_\mathcal{B}$ and $\sigma_\mathcal{B}$ are the mean and unbiased standard deviation of the batch, and $\epsilon$ is a small number for numerical stability when the variance is small. -The standardized input is then mapped through -\begin{equation} - y_i = \gamma \hat{x}_i + \beta, -\end{equation} -where $\gamma$ and $\beta$ are learnable parameters learned in a sub-network. - -BN has been shown to have a regularizing effect (ref), although combining it with dropout is disputed. -More often than not, using both BN and dropout leads to worse results on the test set. - -\subsubsection{Model ensembling}\label{subsec:model_ensembling} -A benefit of having a cyclic learning rate is generating multiple models across cycles. -The effectiveness of ensembling models from multiple cycles, Snapshot Ensembling, is first described by \textcite{Huang2017}. -During model training, a model can be checkpointed at the best performing epoch for every cycle, see \. -The checkpointed models can be ensembled by choosing the last $m$\marginnote{how many models chosen and how many models checkpointed then?!} out of $n$ models and averaging the output, as -\begin{equation} - \mathrm{output} = \frac{1}{M} \sum_{i=0}^{m-1} \mathrm{model}_{n-i}(\mathrm{input}). 
-\end{equation}
-\begin{figure}
-    \centering
-    \includegraphics[width=0.48\linewidth]{ANN/images/ensembling_huang_left.png}
-    \includegraphics[width=0.48\linewidth]{ANN/images/ensembling_huang_right.png}
-    \caption[Snapshot ensembling]{
-        Left: Illustration of model optimization The model converges to a local minimum.
-        Right: Illustration of Snapshot Ensembling.
-        The learning rate is cyclic and annealing, allowing to converge to and escaping from local minima.
-        Snapshots are taken at every minimum which can be used for ensembling for inference.
-        Reproduced from \fullcite{Huang2017} (Ref.~\cite{Huang2017}).
-    }
-\end{figure}
-
 
 % --------------------------------------------------
 % Image quality
@@ -402,7 +463,7 @@ \subsubsection{Kurtosis}
 
 \subsection{Explainable AI}
 A significant number of users of a trained AI generally view the model as a black box that simply maps input to output.
-How this box is constructed and why it results in a particular outcome is often overlooked.\marginnote{ref!}
+How this box is constructed and why it results in a particular outcome is often overlooked.
 Meanwhile, techniques to give insight into the black box (explainable AI, XAI) have been developed.
 These techniques fall into roughly two categories: gradient- and perturbation-based methods.
 Gradient-based methods rely on gradients calculated during the backward pass and use these to find which parts of the input contribute to the output most.
diff --git a/frontbackmatter/appendix-sclicom.tex b/frontbackmatter/appendix-sclicom.tex
new file mode 100644
index 0000000..682e1af
--- /dev/null
+++ b/frontbackmatter/appendix-sclicom.tex
@@ -0,0 +1,17 @@
+\section{Flow of images to splits}\label{app:folds-splits-viz}
+The splits are created as described in \cref{subsubsec:slicom-folds}.
+The process is visualized in \cref{fig:folds-splits-viz}.
+
+\begin{figure*}
+    \centering
+    \includegraphics[height=0.55\paperheight]{pediatric-brain-tumours/images/folds-splits-viz.pdf}
+    \caption[Flow of images to splits]{
+        The flow of images to splits.
+        The top row shows available cases.
+        Every block is one case.
+        The middle row shows available images and is linked to the cases.
+        The bottom row shows training, validation and test splits.
+        Colors show flow within one fold.
+    }
+    \label{fig:folds-splits-viz}
+\end{figure*}
diff --git a/frontbackmatter/appendix-skin.tex b/frontbackmatter/appendix-skin.tex
index 41deac2..ddfe9ef 100644
--- a/frontbackmatter/appendix-skin.tex
+++ b/frontbackmatter/appendix-skin.tex
@@ -1,9 +1,136 @@
+\section{Fits to stress-strain curves}
+PCA and logistic curve fits to the skin stress-strain curves are shown in \cref{fig:pca_fits,fig:logistic_fits}, respectively.
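+For reference, such a logistic fit can be reproduced along the following lines (an illustrative sketch, not the exact fitting script; \texttt{strain} and \texttt{stress} stand for one measured curve):
+\begin{verbatim}
+import numpy as np
+from scipy.optimize import curve_fit
+
+def logistic(x, a, k, xc):
+    # a: upper asymptote, k: growth rate, xc: midpoint strain
+    return a / (1.0 + np.exp(-k * (x - xc)))
+
+# strain, stress: 1D arrays holding one measured stress-strain curve
+popt, _ = curve_fit(logistic, strain, stress,
+                    p0=(stress.max(), 1.0, strain.mean()), maxfev=10000)
+residuals = stress - logistic(strain, *popt)
+r_squared = 1 - np.sum(residuals**2) / np.sum((stress - stress.mean())**2)
+\end{verbatim}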
+
+\begin{figure*}
+    \centering
+    \includegraphics[width=0.24\linewidth]{skinstression/images/pca-fits/sample_6.pdf}
+    \includegraphics[width=0.24\linewidth]{skinstression/images/pca-fits/sample_7.pdf}
+    \includegraphics[width=0.24\linewidth]{skinstression/images/pca-fits/sample_8.pdf}
+    \includegraphics[width=0.24\linewidth]{skinstression/images/pca-fits/sample_9.pdf} \\
+    \includegraphics[width=0.24\linewidth]{skinstression/images/pca-fits/sample_10.pdf}
+    \includegraphics[width=0.24\linewidth]{skinstression/images/pca-fits/sample_11.pdf}
+    \includegraphics[width=0.24\linewidth]{skinstression/images/pca-fits/sample_12.pdf}
+    \includegraphics[width=0.24\linewidth]{skinstression/images/pca-fits/sample_13.pdf} \\
+    \includegraphics[width=0.24\linewidth]{skinstression/images/pca-fits/sample_14.pdf}
+    \includegraphics[width=0.24\linewidth]{skinstression/images/pca-fits/sample_15.pdf}
+    \includegraphics[width=0.24\linewidth]{skinstression/images/pca-fits/sample_16.pdf}
+    \includegraphics[width=0.24\linewidth]{skinstression/images/pca-fits/sample_17.pdf} \\
+    \includegraphics[width=0.24\linewidth]{skinstression/images/pca-fits/sample_18.pdf}
+    \includegraphics[width=0.24\linewidth]{skinstression/images/pca-fits/sample_19.pdf}
+    \includegraphics[width=0.24\linewidth]{skinstression/images/pca-fits/sample_20.pdf}
+    \includegraphics[width=0.24\linewidth]{skinstression/images/pca-fits/sample_21.pdf} \\
+    \includegraphics[width=0.24\linewidth]{skinstression/images/pca-fits/sample_22.pdf}
+    \includegraphics[width=0.24\linewidth]{skinstression/images/pca-fits/sample_23.pdf}
+    \includegraphics[width=0.24\linewidth]{skinstression/images/pca-fits/sample_24.pdf}
+    \includegraphics[width=0.24\linewidth]{skinstression/images/pca-fits/sample_25.pdf} \\
+    \includegraphics[width=0.24\linewidth]{skinstression/images/pca-fits/sample_26.pdf}
+    \includegraphics[width=0.24\linewidth]{skinstression/images/pca-fits/sample_27.pdf}
+    \includegraphics[width=0.24\linewidth]{skinstression/images/pca-fits/sample_28.pdf}
+    \includegraphics[width=0.24\linewidth]{skinstression/images/pca-fits/sample_29.pdf}
+    \raggedleft Continued on next page.
+\end{figure*}
+
+\begin{figure*}
+    \centering
+    \includegraphics[width=0.24\linewidth]{skinstression/images/pca-fits/sample_30.pdf}
+    \includegraphics[width=0.24\linewidth]{skinstression/images/pca-fits/sample_31.pdf}
+    \includegraphics[width=0.24\linewidth]{skinstression/images/pca-fits/sample_32.pdf}
+    \includegraphics[width=0.24\linewidth]{skinstression/images/pca-fits/sample_33.pdf} \\
+    \includegraphics[width=0.24\linewidth]{skinstression/images/pca-fits/sample_34.pdf}
+    \includegraphics[width=0.24\linewidth]{skinstression/images/pca-fits/sample_45.pdf}
+    \includegraphics[width=0.24\linewidth]{skinstression/images/pca-fits/sample_46.pdf}
+    \includegraphics[width=0.24\linewidth]{skinstression/images/pca-fits/sample_47.pdf} \\
+    \includegraphics[width=0.24\linewidth]{skinstression/images/pca-fits/sample_48.pdf}
+    \includegraphics[width=0.24\linewidth]{skinstression/images/pca-fits/sample_49.pdf}
+    \includegraphics[width=0.24\linewidth]{skinstression/images/pca-fits/sample_50.pdf}
+    \includegraphics[width=0.24\linewidth]{skinstression/images/pca-fits/sample_51.pdf} \\
+    \includegraphics[width=0.24\linewidth]{skinstression/images/pca-fits/sample_52.pdf}
+    \includegraphics[width=0.24\linewidth]{skinstression/images/pca-fits/sample_53.pdf}
+    \includegraphics[width=0.24\linewidth]{skinstression/images/pca-fits/sample_54.pdf}
+    \includegraphics[width=0.24\linewidth]{skinstression/images/pca-fits/sample_55.pdf} \\
+    \includegraphics[width=0.24\linewidth]{skinstression/images/pca-fits/sample_56.pdf}
+    \includegraphics[width=0.24\linewidth]{skinstression/images/pca-fits/sample_57.pdf}
+    \includegraphics[width=0.24\linewidth]{skinstression/images/pca-fits/sample_58.pdf}
+    \includegraphics[width=0.24\linewidth]{skinstression/images/pca-fits/sample_59.pdf} \\
+    \includegraphics[width=0.24\linewidth]{skinstression/images/pca-fits/sample_60.pdf}
+    \includegraphics[width=0.24\linewidth]{skinstression/images/pca-fits/sample_61.pdf}
+    \caption[PCA fits]{
+        PCA fits for every truncated and interpolated strain-stress curve.
+        The interpolated measurements (blue) are approximated by the PCA curve (red); the $R^2$ of each fit is shown.
+        PCA is done on all available thigh data.
+        Note that the vertical axes are not equal.
+    }
+    \label{fig:pca_fits}
+\end{figure*}
+
+\begin{figure*}
+    \centering
+    \includegraphics[width=0.24\linewidth]{skinstression/images/logistic-fits/sample_6.pdf}
+    \includegraphics[width=0.24\linewidth]{skinstression/images/logistic-fits/sample_7.pdf}
+    \includegraphics[width=0.24\linewidth]{skinstression/images/logistic-fits/sample_8.pdf}
+    \includegraphics[width=0.24\linewidth]{skinstression/images/logistic-fits/sample_9.pdf} \\
+    \includegraphics[width=0.24\linewidth]{skinstression/images/logistic-fits/sample_10.pdf}
+    \includegraphics[width=0.24\linewidth]{skinstression/images/logistic-fits/sample_11.pdf}
+    \includegraphics[width=0.24\linewidth]{skinstression/images/logistic-fits/sample_12.pdf}
+    \includegraphics[width=0.24\linewidth]{skinstression/images/logistic-fits/sample_13.pdf} \\
+    \includegraphics[width=0.24\linewidth]{skinstression/images/logistic-fits/sample_14.pdf}
+    \includegraphics[width=0.24\linewidth]{skinstression/images/logistic-fits/sample_15.pdf}
+    \includegraphics[width=0.24\linewidth]{skinstression/images/logistic-fits/sample_16.pdf}
+    \includegraphics[width=0.24\linewidth]{skinstression/images/logistic-fits/sample_17.pdf} \\
+    \includegraphics[width=0.24\linewidth]{skinstression/images/logistic-fits/sample_18.pdf}
+    \includegraphics[width=0.24\linewidth]{skinstression/images/logistic-fits/sample_19.pdf}
+    \includegraphics[width=0.24\linewidth]{skinstression/images/logistic-fits/sample_20.pdf}
+    \includegraphics[width=0.24\linewidth]{skinstression/images/logistic-fits/sample_21.pdf} \\
+    \includegraphics[width=0.24\linewidth]{skinstression/images/logistic-fits/sample_22.pdf}
+    \includegraphics[width=0.24\linewidth]{skinstression/images/logistic-fits/sample_23.pdf}
+    \includegraphics[width=0.24\linewidth]{skinstression/images/logistic-fits/sample_24.pdf}
+    \includegraphics[width=0.24\linewidth]{skinstression/images/logistic-fits/sample_25.pdf} \\
+    \includegraphics[width=0.24\linewidth]{skinstression/images/logistic-fits/sample_26.pdf}
+    \includegraphics[width=0.24\linewidth]{skinstression/images/logistic-fits/sample_27.pdf}
+    \includegraphics[width=0.24\linewidth]{skinstression/images/logistic-fits/sample_28.pdf}
+    \includegraphics[width=0.24\linewidth]{skinstression/images/logistic-fits/sample_29.pdf}
+    \raggedleft Continued on next page.
+\end{figure*}
+
+\begin{figure*}
+    \centering
+    \includegraphics[width=0.24\linewidth]{skinstression/images/logistic-fits/sample_30.pdf}
+    \includegraphics[width=0.24\linewidth]{skinstression/images/logistic-fits/sample_31.pdf}
+    \includegraphics[width=0.24\linewidth]{skinstression/images/logistic-fits/sample_32.pdf}
+    \includegraphics[width=0.24\linewidth]{skinstression/images/logistic-fits/sample_33.pdf} \\
+    \includegraphics[width=0.24\linewidth]{skinstression/images/logistic-fits/sample_34.pdf}
+    \includegraphics[width=0.24\linewidth]{skinstression/images/logistic-fits/sample_45.pdf}
+    \includegraphics[width=0.24\linewidth]{skinstression/images/logistic-fits/sample_46.pdf}
+    \includegraphics[width=0.24\linewidth]{skinstression/images/logistic-fits/sample_47.pdf} \\
+    \includegraphics[width=0.24\linewidth]{skinstression/images/logistic-fits/sample_48.pdf}
+    \includegraphics[width=0.24\linewidth]{skinstression/images/logistic-fits/sample_49.pdf}
+    \includegraphics[width=0.24\linewidth]{skinstression/images/logistic-fits/sample_50.pdf}
+    \includegraphics[width=0.24\linewidth]{skinstression/images/logistic-fits/sample_51.pdf} \\
+    \includegraphics[width=0.24\linewidth]{skinstression/images/logistic-fits/sample_52.pdf}
+    \includegraphics[width=0.24\linewidth]{skinstression/images/logistic-fits/sample_53.pdf}
+    \includegraphics[width=0.24\linewidth]{skinstression/images/logistic-fits/sample_54.pdf}
+    \includegraphics[width=0.24\linewidth]{skinstression/images/logistic-fits/sample_55.pdf} \\
+    \includegraphics[width=0.24\linewidth]{skinstression/images/logistic-fits/sample_56.pdf}
+    \includegraphics[width=0.24\linewidth]{skinstression/images/logistic-fits/sample_57.pdf}
+    \includegraphics[width=0.24\linewidth]{skinstression/images/logistic-fits/sample_58.pdf}
+    \includegraphics[width=0.24\linewidth]{skinstression/images/logistic-fits/sample_59.pdf} \\
+    \includegraphics[width=0.24\linewidth]{skinstression/images/logistic-fits/sample_60.pdf}
+    \includegraphics[width=0.24\linewidth]{skinstression/images/logistic-fits/sample_61.pdf}
+    \caption[Logistic fits]{
+        Logistic fits (red) and their $R^2$ for every strain-stress curve (black).
+        Note that the vertical axes are not equal.
+    }
+    \label{fig:logistic_fits}
+\end{figure*}
+
 \section{Configuration spaces}\label{app:skin_conf_search_spaces}
-The configuration search space for \textsc{Skinstression} is summarized in \cref{tab:conf_skin}.
+The configuration search space for Skinstression is summarized in \cref{tab:conf_skin}.
 
 \begin{table}
     \centering
-    \caption[\textsc{Skinstression} configuration search space]{\textsc{Skinstression} configuration search space.}
+    \caption[Skinstression configuration search space]{Skinstression configuration search space.}
     \label{tab:conf_skin}
     \begin{tabular}{cccccc}
        \toprule
@@ -31,10 +158,10 @@ \section{Software diagrams}\label{app:skin_c4}
 \begin{figure*}
     \centering
     \includesvg[pretex=\small, width=\linewidth]{images/skinstression_system_context_diagram.svg}
-    \caption[\textsc{Skinstression} system context diagram]{
-        System context diagram of \textsc{Skinstression}.
+    \caption[Skinstression system context diagram]{
+        System context diagram of Skinstression.
         An experimentalist images collagen in skin tissue using an SHG microscope.
-        The microscope output serves as input to \textsc{Skinstression} which trains a convolutional neural network to find the strain-stress curve of the imaged tissue.
+        The microscope output serves as input to Skinstression, which trains a convolutional neural network to find the strain-stress curve of the imaged tissue.
         The trained model can serve as a substitute for the SHG microscope, or provide new insights into why tissue has particular stretch properties.
     }
 \end{figure*}
@@ -43,9 +170,9 @@ \section{Software diagrams}\label{app:skin_c4}
 \begin{figure*}
     \centering
     \includesvg[pretex=\tiny, angle=90, width=0.9\textheight,height=\linewidth,keepaspectratio]{images/skinstression_container_diagram.svg}
-    \caption[\textsc{Skinstression} container diagram]{
-        Container diagram of \textsc{Skinstression}.
-        The bounding box shows internal communications of \textsc{Skinstression}.
+    \caption[Skinstression container diagram]{
+        Container diagram of Skinstression.
+        The bounding box shows internal communications of Skinstression.
         Images generated with the SHG microscope are stored and can be read by PyimageQualityRanking (PyIQ).
         PyIQ sorts the images by quality, such that they can be read in order by notebooks and the main application.
         The main application reads locally stored configurations using Hydra.
diff --git a/general_discussion_and_conclusion.tex b/general_discussion_and_conclusion.tex
new file mode 100644
index 0000000..30404ce
--- /dev/null
+++ b/general_discussion_and_conclusion.tex
@@ -0,0 +1 @@
+TODO
\ No newline at end of file
diff --git a/general_introduction.tex b/general_introduction.tex
index 3729615..2b46ef9 100644
--- a/general_introduction.tex
+++ b/general_introduction.tex
@@ -1,9 +1,31 @@
 \section{Deep learning for higher harmonic microscopy}
-Insert some text relating the two projects to each other and introduce the overall challenges.
+Visualizing living tissue and cells is of vital importance in life sciences and health care.
+Standard, non-invasive techniques such as magnetic resonance imaging, ultrasound imaging, and computed tomography fail to image tissue at resolutions high enough to distinguish individual cells or connective tissue.
+These structures are of interest to pathologists and skin stretch researchers.
+Higher harmonic generation (HHG) microscopy can image cells and tissue at resolutions of \qty{0.2}{\micro\meter} per pixel (mpp) in seconds.
+These high-resolution images %can be large (sometimes \num{3e8} 24-bit pixels) and
+can contain complex structures and features.
+
+When determining stretch properties of skin tissue, interesting features are collagen and elastin fibers oriented in all directions.
+Mechanically stretching the tissue to obtain stress-strain curves is time-consuming and can damage the tissue.
+Tissue images may contain all the information needed to determine stretch properties such as Young's modulus or maximum stress.
+\Cref{ch:skinstression} studies the possibility of acquiring stress-strain curves from second harmonic generation (SHG) images alone.
+This may be a step toward determining skin properties \emph{in vivo} with an endoscope to aid plastic surgery.
+
+Other interesting features are disease patterns in the study of pathology.
+Current clinical practice includes analysis of histopathological data.
+However, producing these data takes a long time, mainly due to tissue processing.
+HHG imaging can image tissue in seconds, allowing for intraoperative feedback.
+Feedback can include \eg the amount of resected tumor tissue or the tumor type.
+This would still require intraoperative image analysis, for which time is scarce.
+\Cref{ch:sclicom} studies the possibility of classifying two pediatric brain tumors, medulloblastoma and pilocytic astrocytoma, from HHG images and of explaining which regions were important for the classifications.
+
+The experiments are preceded by an introduction, in \cref{ch:theory}, on HHG imaging and the deep learning concepts used by the two projects.
+\Cref{ch:general_discussion_and_conclusion} discusses overarching challenges and gives recommendations for advancing AI for HHG imaging.
 
 \section{Reporting of clinical artificial intelligence}
-The prediction models described here may aid health care providers in estimating the probability of risk that an outcome is present.
-The Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD) Initiative developed guidelines to report on such diagnostic models \cite{Collins2015, Moons2015}.
+The prediction models described in this work may eventually aid health care providers in acquiring clinically relevant parameters or estimating the risk that an outcome is present.
+The Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD) Initiative developed guidelines to report on such diagnostic models \cite{Collins2015, Moons2015, Heus2020}.
 Recent advances in artificial intelligence (AI) apply AI as black-box predictive models in health care, often not sufficiently well reported.
 Transparent reporting on these black-box models builds confidence in using and further developing the models.
 This is especially important in health care, where there is a need for automation while trust in AI is yet to be earned.
@@ -14,6 +36,3 @@ \section{Reporting of clinical artificial intelligence}
 It should then be explained to the clinician how the model came to its conclusion, along with its confidence.
 To account for these challenges, an extension of the TRIPOD statement, TRIPOD-AI, is currently being developed \cite{Collins2021,Collins2020}.
 Reports on the diagnostic models developed in this study aim to adhere to TRIPOD-AI as well as possible\footnote{The reader is invited to use the TRIPOD-AI accompanying PROBAST-AI \cite{Wolff2019a, Wolff2019b, Collins2021} checklist to assess the risk of bias of the predictive models.}.
-
-\section{Structure}
-Quick explanation of thesis structure.
\ No newline at end of file
diff --git a/library.bib b/library.bib
index 0095051..27bf6c4 100644
--- a/library.bib
+++ b/library.bib
@@ -591,6 +591,23 @@ @Article{Khattak2019
 @article{Orr2020, title={Pathology, diagnostics, and classification of Medulloblastoma}, volume={30}, DOI={10.1111/bpa.12837}, number={3}, journal={Brain Pathology}, author={Orr, Brent A.}, year={2020}, pages={664–678}}
 
 @article{Vasile2017, title={Human astrocytes: Structure and functions in the healthy brain}, volume={222}, DOI={10.1007/s00429-017-1383-5}, number={5}, journal={Brain Structure and Function}, author={Vasile, Flora and Dossi, Elena and Rouach, Nathalie}, year={2017}, pages={2017–2029}}
+
+@unpublished{Zhou2023,
+  title={Three-dimensional Characterization of Mechanical Properties and Microstructures of Human Dermal Skin},
+  author={Zhou, Mengyao and González, Patrick J.
and van Haasterecht, Ludo and Soylu, Alperen and Mihailovski, Maria and van Zuijlen, Paul and Groot, Marie Louise},
+}
+
+@article{Haasterecht2023, title={Visualizing dynamic three-dimensional changes of human reticular dermal collagen under mechanical strain}, volume={9}, DOI={10.1088/2057-1976/accc8e}, number={3}, journal={Biomedical Physics and Engineering Express}, author={van Haasterecht, L and Zhou, M and Ma, Y and Bartolini, L and Van Mourik, F and Van Zuijlen, P P and Groot, M L}, year={2023}, pages={035033}}
+
+@article{Millard2016, title={Medulloblastoma}, volume={31}, DOI={10.1177/0883073815600866}, number={12}, journal={Journal of Child Neurology}, author={Millard, Nathan E. and De Braganca, Kevin C.}, year={2016}, pages={1341–1353}}
+
+@mastersthesis{Spies2023,
+  type = {MSc thesis},
+  author = {Spies, Sylvia},
+  school = {University of Amsterdam and VU Amsterdam},
+  title = {Validation of higher harmonic generation microscopy for the diagnosis of various pediatric tumors},
+  year = {2023}
+}
+
 @article{Schirris2022,
   author = {Schirris, Yoni and Gavves, Efstratios and Nederlof, Iris and Horlings, Hugo Mark and Teuwen, Jonas},
   date = {2022},
@@ -770,6 +787,16 @@ @misc{abc
   groups = {THG-XAI, MIL: multi-scale attention}
 }
 
+@article{Taqi2018, title={A review of artifacts in histopathology}, volume={22}, DOI={10.4103/jomfp.jomfp_125_15}, number={2}, journal={Journal of Oral and Maxillofacial Pathology}, author={Taqi, Syed Ahmed and Sami, Syed Abdus and Sami, Lateef Begum and Zaki, Syed Ahmed}, year={2018}, pages={279}}
+
+@article{Fahmideh2021, title={Pediatric brain tumors: Descriptive epidemiology, risk factors, and future directions}, volume={30}, DOI={10.1158/1055-9965.epi-20-1443}, number={5}, journal={Cancer Epidemiology, Biomarkers and Prevention}, author={Adel Fahmideh, Maral and Scheurer, Michael E.}, year={2021}, pages={813–821}}
+
+@article{Parwani2019, title={Next Generation Diagnostic Pathology: Use of digital pathology and artificial intelligence tools to augment a pathological diagnosis}, volume={14}, DOI={10.1186/s13000-019-0921-2}, number={1}, journal={Diagnostic Pathology}, author={Parwani, Anil V.}, year={2019}}
+
+@article{George2019, title={Will I need to move to get my first job?: Geographic relocation and other trends in the pathology job market}, volume={144}, DOI={10.5858/arpa.2019-0150-cp}, number={4}, journal={Archives of Pathology and Laboratory Medicine}, author={George, Melissa R. and Johnson, Kristen A. and Gratzinger, Dita A. and Brissette, Mark D. and McCloskey, Cindy B. and Conran, Richard Michael and Dixon, Lisa R. and Roberts, Cory Anthony and Rojiani, Amyn M. and Shyu, Irene and et al.}, year={2019}, pages={427–434}}
+
+@article{Kocarnik2022, title={Cancer incidence, mortality, years of life lost, years lived with disability, and disability-adjusted life years for 29 cancer groups from 2010 to 2019}, volume={8}, DOI={10.1001/jamaoncol.2021.6987}, number={3}, journal={JAMA Oncology}, author={Kocarnik, Jonathan M. and Compton, Kelly and Dean, Frances E. and Fu, Weijia and Gaw, Brian L. and Harvey, James D.
and Henrikson, Hannah Jacqueline and Lu, Dan and Pennini, Alyssa and Xu, Rixing and et al.}, year={2022}, pages={420}}
+
 @inproceedings{Chen2020,
   author = {Chen, Ting and Kornblith, Simon and Norouzi, Mohammad and Hinton, Geoffrey},
   date = {2020},
@@ -975,6 +1002,14 @@ @article{Akiba2019
   bibsource = {dblp computer science bibliography, https://dblp.org}
 }
 
+@article{Kingma2014AdamAM,
+  title={Adam: A Method for Stochastic Optimization},
+  author={Diederik P. Kingma and Jimmy Ba},
+  journal={CoRR},
+  year={2014},
+  volume={abs/1412.6980}
+}
+
 @inproceedings{Paszke2019,
   author = {Paszke, Adam and Gross, Sam and Massa, Francisco and Lerer, Adam and Bradbury, James and Chanan, Gregory and Killeen, Trevor and Lin, Zeming and Gimelshein, Natalia and Antiga, Luca and Desmaison, Alban and Kopf, Andreas and Yang, Edward and DeVito, Zachary and Raison, Martin and Tejani, Alykhan and Chilamkurthy, Sasank and Steiner, Benoit and Fang, Lu and Bai, Junjie and Chintala, Soumith},
   booktitle = {Advances in Neural Information Processing Systems 32},
@@ -1007,8 +1042,21 @@ @article{Pedregosa2011
   year = {2011}
 }
 
+@article{Verhaegen2012, title={Adaptation of the dermal collagen structure of human skin and scar tissue in response to stretch: An experimental study}, volume={20}, DOI={10.1111/j.1524-475x.2012.00827.x}, number={5}, journal={Wound Repair and Regeneration}, author={Verhaegen, Pauline D. and Schouten, Hennie J. and Tigchelaar-Gutter, Wikky and van Marle, Jan and van Noorden, Cornelis J. and Middelkoop, Esther and van Zuijlen, Paul P.}, year={2012}, pages={658–666}}
+
+@article{Holzapfel2001,
+  title={Biomechanics of soft tissue},
+  author={Holzapfel, Gerhard A and others},
+  journal={The handbook of materials behavior models},
+  volume={3},
+  number={1},
+  pages={1049--1063},
+  year={2001},
+  publisher={Academic Press San Diego, CA, USA}
+}
+
 @article{Lipa2021,
-  journal = {Advances in Dermatology and Allergology/Postępy Dermatologii i Alergologii},
+  journal = {Advances in Dermatology and Allergology},
   issn = {1642-395X},
   volume = {38},
   number = {3},
@@ -1626,6 +1674,38 @@ @article{Li2016
   bibsource = {dblp computer science bibliography, https://dblp.org}
 }
 
+@article{Heus2020, title={Transparent reporting of multivariable prediction models in journal and conference abstracts: TRIPOD for abstracts}, volume={173}, DOI={10.7326/m20-0193}, number={1}, journal={Annals of Internal Medicine}, author={Heus, Pauline and Reitsma, Johannes B. and Collins, Gary S. and Damen, Johanna A.A.G. and Scholten, Rob J.P.M. and Altman, Douglas G. and Moons, Karel G.M. and Hooft, Lotty}, year={2020}, pages={42–47}}
+
+@article{Litjens2017,
+title = {A survey on deep learning in medical image analysis},
+journal = {Medical Image Analysis},
+volume = {42},
+pages = {60-88},
+year = {2017},
+issn = {1361-8415},
+doi = {https://doi.org/10.1016/j.media.2017.07.005},
+url = {https://www.sciencedirect.com/science/article/pii/S1361841517301135},
+author = {Geert Litjens and Thijs Kooi and Babak Ehteshami Bejnordi and Arnaud Arindra Adiyoso Setio and Francesco Ciompi and Mohsen Ghafoorian and Jeroen A.W.M. {van der Laak} and Bram {van Ginneken} and Clara I. Sánchez},
+keywords = {Deep learning, Convolutional neural networks, Medical imaging, Survey},
+abstract = {Deep learning algorithms, in particular convolutional networks, have rapidly become a methodology of choice for analyzing medical images.
This paper reviews the major deep learning concepts pertinent to medical image analysis and summarizes over 300 contributions to the field, most of which appeared in the last year. We survey the use of deep learning for image classification, object detection, segmentation, registration, and other tasks. Concise overviews are provided of studies per application area: neuro, retinal, pulmonary, digital pathology, breast, cardiac, abdominal, musculoskeletal. We end with a summary of the current state-of-the-art, a critical discussion of open challenges and directions for future research.} +} + +@article{Bjorck2018, + author = {Johan Bjorck and + Carla P. Gomes and + Bart Selman}, + title = {Understanding Batch Normalization}, + journal = {CoRR}, + volume = {abs/1806.02375}, + year = {2018}, + url = {http://arxiv.org/abs/1806.02375}, + eprinttype = {arXiv}, + eprint = {1806.02375}, + timestamp = {Mon, 13 Aug 2018 16:46:58 +0200}, + biburl = {https://dblp.org/rec/journals/corr/abs-1806-02375.bib}, + bibsource = {dblp computer science bibliography, https://dblp.org} +} + @article{Fukushima1980, title = {Neocognitron: A self-organizing neural network model for a mechanism of pattern recognition unaffected by shift in position}, volume = {36}, diff --git a/main.tex b/main.tex index f797ee1..de52cb0 100644 --- a/main.tex +++ b/main.tex @@ -33,6 +33,10 @@ % Load the package for hyperreferences \usepackage{kaorefs} +\hypersetup{ + linkcolor=black +} + % To be able to include Tikz files % \usepackage{standalone} @@ -171,7 +175,7 @@ % Define the style for the TOC, LOF, and LOT %\setstretch{1} % Uncomment to modify line spacing in the ToC -%\hypersetup{linkcolor=blue} % Uncomment to set the colour of links in the ToC +% \hypersetup{linkcolor=blue} % Uncomment to set the colour of links in the ToC \setlength{\textheight}{230\vscale} % Manually adjust the height of the ToC pages % Turn on compatibility mode for the etoc package @@ -211,38 +215,52 @@ \chapter{General introduction}\label{ch:general_introduction} % \pagelayout{margin} % Restore margins \pagelayout{wide} -\chapter{Theory of artificial neural networks} +\chapter{Theory of higher harmonic generation and artificial neural networks}\label{ch:theory} \pagelayout{margin} % Restore margins +\input{ANN/hhg.tex} \input{ANN/theory.tex} % \pagelayout{wide} % No margins -% % \addpart[Skinstression]{Strain-stress regression on \\ second harmonic generation images \\ using {\normalfont\textsc{Skinstression}}} +% % \addpart[Skinstression]{Strain-stress regression on \\ second harmonic generation images \\ using {\normalfontSkinstression}} % \addpart[Skinstression]{Developing and validating a strain-stress regression model on second harmonic generation images from old adult skin tissue} % \pagelayout{margin} % Restore margins \pagelayout{wide} -\chapter[Skinstression]{Developing and validating a strain-stress regression model on second harmonic generation images from old adult skin tissue} +\chapter[Skinstression]{Developing and validating a strain-stress regression model on second harmonic generation images from old adult skin tissue}\label{ch:skinstression} \pagelayout{margin} % Restore margins +\input{skinstression/sections/abstract.tex} +\clearpage \input{skinstression/sections/introduction.tex} \input{skinstression/sections/theory.tex} \input{skinstression/sections/methods.tex} \input{skinstression/sections/results.tex} \input{skinstression/sections/discussion.tex} +\input{skinstression/sections/conclusion.tex} 
\input{skinstression/sections/supplementary.tex} \pagelayout{wide} % No margins -\chapter[SCLICOM]{Developing and validating a clinical context aware multi-instance learning with self-supervised pre-training model on higher harmonic generation images from medulloblastoma and pilocytic astrocytoma in children} +\chapter[SCLICOM]{Developing and validating a clinical context aware multi-instance learning model with self-supervised pre-training on higher harmonic generation images of medulloblastoma and pilocytic astrocytoma in children}\label{ch:sclicom} \pagelayout{margin} % Restore margins + +\input{pediatric-brain-tumours/sections/abstract.tex} +\clearpage \input{pediatric-brain-tumours/sections/introduction.tex} \input{pediatric-brain-tumours/sections/theory/theory.tex} \input{pediatric-brain-tumours/sections/methods/methods.tex} \input{pediatric-brain-tumours/sections/results/results.tex} \input{pediatric-brain-tumours/sections/discussion/discussion.tex} +\input{pediatric-brain-tumours/sections/conclusion.tex} \input{pediatric-brain-tumours/sections/supplementary_materials.tex} +\pagelayout{wide} % No margins +\chapter[General discussion and conclusion]{General discussion and conclusion}\label{ch:general_discussion_and_conclusion} +\pagelayout{margin} % Restore margins + +\input{general_discussion_and_conclusion.tex} + % \defbibnote{bibnote}{Here are the references in citation order.\par\bigskip} % Prepend this text to the bibliography % \printbibliography[heading=bibintoc, title=Bibliography, prenote=bibnote] % Add the bibliography heading to the ToC, set the title of the bibliography and output the bibliography note \pagelayout{wide} @@ -261,6 +279,15 @@ \chapter{Skinstression} \input{frontbackmatter/appendix-skin.tex} +\addcontentsline{toc}{part}{Appendix} + +\pagelayout{wide} +\chapter{SCLICOM} +\pagelayout{margin} + +\input{frontbackmatter/appendix-sclicom.tex} + + %---------------------------------------------------------------------------------------- \backmatter % Denotes the end of the main document content diff --git a/pediatric-brain-tumours/images/PMC_HHG_32_Hersenen_I-05_5x7_200slow-summary.png b/pediatric-brain-tumours/images/PMC_HHG_32_Hersenen_I-05_5x7_200slow-summary.png new file mode 100644 index 0000000..c03bac6 Binary files /dev/null and b/pediatric-brain-tumours/images/PMC_HHG_32_Hersenen_I-05_5x7_200slow-summary.png differ diff --git a/pediatric-brain-tumours/images/PMC_HHG_32_Hersenen_I-05_5x7_200slow-tiles-low-a.png b/pediatric-brain-tumours/images/PMC_HHG_32_Hersenen_I-05_5x7_200slow-tiles-low-a.png new file mode 100644 index 0000000..9b933d0 Binary files /dev/null and b/pediatric-brain-tumours/images/PMC_HHG_32_Hersenen_I-05_5x7_200slow-tiles-low-a.png differ diff --git a/pediatric-brain-tumours/images/PMC_HHG_32_Hersenen_I-05_5x7_200slow-tiles.png b/pediatric-brain-tumours/images/PMC_HHG_32_Hersenen_I-05_5x7_200slow-tiles.png new file mode 100644 index 0000000..a7e1555 Binary files /dev/null and b/pediatric-brain-tumours/images/PMC_HHG_32_Hersenen_I-05_5x7_200slow-tiles.png differ diff --git a/pediatric-brain-tumours/images/PMC_HHG_36_Hersenen_I-05_8x8_200slow-attention.png b/pediatric-brain-tumours/images/PMC_HHG_36_Hersenen_I-05_8x8_200slow-attention.png new file mode 100644 index 0000000..f8a0efe Binary files /dev/null and b/pediatric-brain-tumours/images/PMC_HHG_36_Hersenen_I-05_8x8_200slow-attention.png differ diff --git a/pediatric-brain-tumours/images/PMC_HHG_36_Hersenen_I-05_8x8_200slow-summary.png 
b/pediatric-brain-tumours/images/PMC_HHG_36_Hersenen_I-05_8x8_200slow-summary.png
index 4781ffe..0349280 100644
Binary files a/pediatric-brain-tumours/images/PMC_HHG_36_Hersenen_I-05_8x8_200slow-summary.png and b/pediatric-brain-tumours/images/PMC_HHG_36_Hersenen_I-05_8x8_200slow-summary.png differ
diff --git a/pediatric-brain-tumours/images/PMC_HHG_36_Hersenen_I-05_8x8_200slow-tiles-low-a.png b/pediatric-brain-tumours/images/PMC_HHG_36_Hersenen_I-05_8x8_200slow-tiles-low-a.png
new file mode 100644
index 0000000..58c2358
Binary files /dev/null and b/pediatric-brain-tumours/images/PMC_HHG_36_Hersenen_I-05_8x8_200slow-tiles-low-a.png differ
diff --git a/pediatric-brain-tumours/images/PMC_HHG_36_Hersenen_I-05_8x8_200slow-tiles.png b/pediatric-brain-tumours/images/PMC_HHG_36_Hersenen_I-05_8x8_200slow-tiles.png
index e325f85..6ec1ab9 100644
Binary files a/pediatric-brain-tumours/images/PMC_HHG_36_Hersenen_I-05_8x8_200slow-tiles.png and b/pediatric-brain-tumours/images/PMC_HHG_36_Hersenen_I-05_8x8_200slow-tiles.png differ
diff --git a/pediatric-brain-tumours/images/folds-splits-viz.pdf b/pediatric-brain-tumours/images/folds-splits-viz.pdf
new file mode 100644
index 0000000..2f773ef
Binary files /dev/null and b/pediatric-brain-tumours/images/folds-splits-viz.pdf differ
diff --git a/pediatric-brain-tumours/images/folds-splits-viz.png b/pediatric-brain-tumours/images/folds-splits-viz.png
new file mode 100644
index 0000000..1165a49
Binary files /dev/null and b/pediatric-brain-tumours/images/folds-splits-viz.png differ
diff --git a/pediatric-brain-tumours/images/folds-splits-viz.svg b/pediatric-brain-tumours/images/folds-splits-viz.svg
new file mode 100644
index 0000000..738ef0c
--- /dev/null
+++ b/pediatric-brain-tumours/images/folds-splits-viz.svg
@@ -0,0 +1,10274 @@
+[10274 lines of SVG markup omitted: a visualization of how images and cases flow into the train, val, and test splits; recoverable text labels: "train", "test", "val", "image", "case"]
diff --git a/pediatric-brain-tumours/images/tsne-cc.pdf b/pediatric-brain-tumours/images/tsne-cc.pdf
new file mode 100644
index 0000000..91ff213
Binary files /dev/null and b/pediatric-brain-tumours/images/tsne-cc.pdf differ
diff --git a/pediatric-brain-tumours/sections/abstract.tex b/pediatric-brain-tumours/sections/abstract.tex
new file mode 100644
index 0000000..a26074f
--- /dev/null
+++ b/pediatric-brain-tumours/sections/abstract.tex
@@ -0,0 +1,24 @@
+\pdfbookmark[1]{Abstract}{pbt_abstract}
+\section*{Abstract}
+
+\paragraph{Background and objective}
+Higher harmonic generation (HHG) microscopy allows for intraoperative feedback.
+Interpreting the feedback is time-consuming.
+AI models might decrease the time needed for diagnosis.
+A clinical context aware multi-instance learning model with self-supervised pre-training (SCLICOM) is developed and validated to automate diagnosis on HHG images.
+
+\paragraph{Methods}
+A five-fold cross-validation study was conducted on HHG data from the Princess Máxima Center for pediatric oncology.
+Outcomes of interest were pilocytic astrocytoma (PA) and medulloblastoma (MB).
+Convolutional neural networks with self-supervised pre-training (DeepSMILE) and without it were validated.
+A model with clinical context embedding was developed and validated.
+The performance of the models was assessed by the area under the precision-recall-gain curve (AUPRG) and the mean average precision.\todo{THIS STILL HAS TO BE CHECKED}
+
+\paragraph{Results}
+HHG biopsy images of 25 children with PA (17) and MB (8) were used.
+The model achieved a mean average precision of \num{0.75} (\qty{95}{\percent} CI \numrange{0.5}{1}) and an AUPRG of \num{0.41} (\qty{95}{\percent} CI \numrange{-0.15}{0.97}).
+The ability to select the tiles with the highest attention weights proved useful in diagnosing medulloblastoma or pilocytic astrocytoma.
+
+\paragraph{Discussion}
+SCLICOM showed promising discrimination in predicting PA or MB, but needs further external validation.
+After additional validation, the updated model may be used to intraoperatively discriminate between pediatric patients with PA or MB or to pre-select interesting regions for diagnosis.
diff --git a/pediatric-brain-tumours/sections/conclusion.tex b/pediatric-brain-tumours/sections/conclusion.tex
new file mode 100644
index 0000000..c5d0e6b
--- /dev/null
+++ b/pediatric-brain-tumours/sections/conclusion.tex
@@ -0,0 +1,5 @@
+\section{Conclusion}
+The goal of this study was to develop a classifier that distinguishes pilocytic astrocytoma and medulloblastoma in higher harmonic generation images while also incorporating resection location as clinical context.
+An attention-based multi-instance learning classifier pretrained with SimCLR and using BERT for clinical context embedding achieved a mean average precision of \num{0.89\pm 0.14} and an AUPRG of \num{0.41\pm 0.56}.
+More data is needed to test whether the proposed model performs better than an ImageNet pretrained model or a model without clinical context embedding.
+Although the model could benefit from more data and may be fine-tuned on a broader range of tumors, it may be used intraoperatively to validate medulloblastoma or pilocytic astrocytoma diagnoses or to pre-select interesting regions for diagnosis.
diff --git a/pediatric-brain-tumours/sections/discussion/implications.tex b/pediatric-brain-tumours/sections/discussion/implications.tex
index 5a39092..a17bcf5 100644
--- a/pediatric-brain-tumours/sections/discussion/implications.tex
+++ b/pediatric-brain-tumours/sections/discussion/implications.tex
@@ -12,5 +12,5 @@
\subsubsection{Attention weighted images might be used for visual guidance}
\subsubsection{Prediction may be consulted as validation}
When working with time constraints, such as in an intraoperative setting, human mistakes may occur more frequently.
-Moreover, pathology using HHG microscopy is not well-established yet, so pathologists would need to be trained on HHG images (ref Sylvia), and therefore they make more errors in the beginning of using this modality.
+Moreover, pathology using HHG microscopy is not well-established yet, so pathologists would need to be trained on HHG images~\cite{Spies2023}, and they may therefore make more errors when first using this modality.
If the model is further improved until a desired performance is reached, the prediction may serve as a (non-binding) diagnosis validation.
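The abstract and conclusion above report the area under the precision-recall-gain curve (AUPRG). A minimal sketch of how such a value can be computed, assuming the `prg` module of the pyprg package (the import path and function names are assumptions about that package, not code from this repository):

```python
import numpy as np
from prg import prg  # pyprg; import path is an assumption

# Toy per-case labels (1 = positive class) and predicted scores;
# the study's real predictions would replace these.
labels = np.array([1, 1, 0, 1, 0, 1, 0, 0])
scores = np.array([0.92, 0.85, 0.71, 0.64, 0.40, 0.38, 0.22, 0.11])

prg_curve = prg.create_prg_curve(labels, scores)
auprg = prg.calc_auprg(prg_curve)  # area under the precision-recall-gain curve
print(f"AUPRG = {auprg:.2f}")
```

Unlike AUPR, AUPRG can be negative, which is consistent with the negative lower confidence bounds reported above.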
diff --git a/pediatric-brain-tumours/sections/introduction.tex b/pediatric-brain-tumours/sections/introduction.tex
index 080950d..bdf7857 100644
--- a/pediatric-brain-tumours/sections/introduction.tex
+++ b/pediatric-brain-tumours/sections/introduction.tex
@@ -1,11 +1,40 @@
\section{Introduction}
-Introduction...
+Cancer accounted for about \num{10000000} deaths worldwide in 2020.
+Of those, \num{246000} were due to brain tumors~\cite{Kocarnik2022}.
+In children, brain tumors are the leading cause of cancer mortality~\cite{Fahmideh2021}.
+Pediatric brain tumors (PBT) in the Netherlands are treated in the Princess Máxima Center for pediatric oncology (PMC).
+PMC treats many PBTs, but pilocytic astrocytoma (PA) and medulloblastoma (MB) are two of the most prevalent, with incidences of \num{0.91} and \num{0.40} per \num{100000} children, respectively~\cite{Fahmideh2021}.

-Link HE with HHG.
+Tumors need to be assessed by pathologists to determine tumor type and severity before making a treatment plan.
+Starting treatment on time gets increasingly difficult with an expected shortage of pathologists~\cite{George2019}.
+Pathologists also have access to more health data than ever before, resulting from multiple imaging techniques, such as magnetic resonance imaging and positron emission tomography, and from clinical records.
+All this clinical data needs to be processed to optimize patient care.
+These issues can be addressed by integrating machine learning effectively~\cite{Parwani2019}.

-Tell something about tumors
+One of the modalities that can benefit from machine learning is histopathology on whole-slide images (WSI).
+Biopsies are processed and usually stained with haematoxylin and eosin (HE).
+HE stained tissues are placed on glass slides and imaged with a microscope to get WSIs.
+The images are used by pathologists to make a diagnosis.
+A wide variety of disease patterns can be recognized by histopathology.
+Pathologists look at large tissue areas and frequently annotate regions of interest to make reasoned diagnoses.

-State goal (development and validation of the model), and purpose, and users.
+Histopathology images come with artifacts~\cite{Taqi2018}, acquiring them takes a long time, and acquisition can generally only be done postoperatively.
+Sometimes, intraoperative assessment is desired, as treatment may be tumor-type-specific.
+Utilizing higher harmonic generation (HHG) microscopy as a non-invasive and label-free imaging technique enables intraoperative resection feedback.

-State how to get there and other works that do something similar.
+Manual tumor diagnosis on large HHG images is time-consuming.
+Various techniques are available to automate tumor diagnosis in WSIs~\cite{Litjens2017}.
+\textcite{Blokker2022} have shown that deep learning models trained on THG data can intraoperatively distinguish glioma from epilepsy brain tissue.
+A convolutional neural network was trained end-to-end in a tile-supervised manner, providing the neurosurgeon with a tile-level diagnosis.
+Another promising branch of techniques relies on multi-instance learning (MIL), where WSIs are cut into tiles.
+This allows for extracting informative features while limiting the amount of data per training batch.
+
+In this work, SCLICOM (from Self-supervised CLInical COntext Multi-instance learning) is proposed.
+This study includes THG, SHG, and 2PEF data, as well as tumor location data, providing the AI model with more context to work with.
+The model is based on DeepSMILE~\cite{Schirris2022}, which was originally developed for HE images of breast and colorectal cancer.
+DeepSMILE is a two-stage model that consists of a feature extractor and an MIL classifier.
+The classifier is extended by including tumor location as clinical context, which is also available to pathologists.
+The purpose of the product is to intraoperatively classify pilocytic astrocytoma and medulloblastoma using HHG images and clinical context in the form of textual tumor locations.
+An attention system should give insight into which areas were relevant for classification.
+The product may be adapted and trained to account for other tumors or more diseases at once.
diff --git a/pediatric-brain-tumours/sections/methods/data-preparation.tex b/pediatric-brain-tumours/sections/methods/data-preparation.tex
index f776547..f04e0ed 100644
--- a/pediatric-brain-tumours/sections/methods/data-preparation.tex
+++ b/pediatric-brain-tumours/sections/methods/data-preparation.tex
@@ -2,11 +2,11 @@
\subsection{Data preparation}
The target classes are mapped to unique integers.
-From the raw HHG microscope data, images are exported\todo{Ref Sylvia thesis}.
+Images are exported from the raw HHG microscope data by \textcite{Spies2023}.
Overview images of \qty{1}{mpp} and close-up images of \qty{0.2}{mpp} of histologically interesting areas were made.
To be able to use both images, the overview images were scaled to the same resolution as close-up images using Lanczos interpolation.
-Given the size of most images (\qty{>100}{MB}) and to be able to use MIL, the images are subdivided into non-overlapping tiles of $224 \times 224$.
+Given the size of the images (\qty{200}{MB} and $7700\times7900$ 8-bit RGB pixels on average) and to be able to use MIL, the images are subdivided into non-overlapping tiles of $224 \times 224$ pixels.
Tiles overflowing the image are skipped.

Many overview images contain a large empty space without tissue.
diff --git a/pediatric-brain-tumours/sections/methods/participants.tex b/pediatric-brain-tumours/sections/methods/participants.tex
index 4ab7efa..9afa7a1 100644
--- a/pediatric-brain-tumours/sections/methods/participants.tex
+++ b/pediatric-brain-tumours/sections/methods/participants.tex
@@ -1,5 +1,5 @@
\subsection{Participants}
The data is acquired in the Princess Máxima Center for pediatric oncology, Utrecht.
-Images were made of brain and solid tumor tissues excised from children (0--16).
+Images were made of brain and solid tumor tissues excised from children (0--16 yr).
All patients were eligible for imaging regardless of previously received therapy.
diff --git a/pediatric-brain-tumours/sections/methods/sources-of-data.tex b/pediatric-brain-tumours/sections/methods/sources-of-data.tex
index 850b984..2a5aaf0 100644
--- a/pediatric-brain-tumours/sections/methods/sources-of-data.tex
+++ b/pediatric-brain-tumours/sections/methods/sources-of-data.tex
@@ -1,5 +1,5 @@
\subsection{Source of data}
Data is collected from 26 September 2022 until 13 April 2023 during routine care in the Princess Máxima Center for pediatric oncology.
-HHG images were acquired before HE-acquisition as described in [ref Sylvia].
+HHG images were acquired before HE-acquisition as described in~\cite{Spies2023}.
The dataset is split into train, validation, and test subsets.
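The data preparation above subdivides each exported image into non-overlapping 224 x 224 pixel tiles and skips tiles that would overflow the image border. A minimal NumPy sketch of that tiling step (the function name and array layout are illustrative assumptions):

```python
import numpy as np

def tile_image(img: np.ndarray, tile: int = 224) -> np.ndarray:
    """Split an (H, W, C) image into non-overlapping tile x tile patches,
    skipping patches that would overflow the image border."""
    h, w = img.shape[:2]
    tiles = [
        img[y:y + tile, x:x + tile]
        for y in range(0, h - tile + 1, tile)
        for x in range(0, w - tile + 1, tile)
    ]
    return np.stack(tiles)  # shape: (num_tiles, tile, tile, C)
```

For the average 7700 x 7900 pixel image reported above, this yields floor(7700/224) x floor(7900/224) = 34 x 35 = 1190 tiles per image.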
diff --git a/pediatric-brain-tumours/sections/methods/statistical-analysis-methods.tex b/pediatric-brain-tumours/sections/methods/statistical-analysis-methods.tex
index 20267b2..94579d3 100644
--- a/pediatric-brain-tumours/sections/methods/statistical-analysis-methods.tex
+++ b/pediatric-brain-tumours/sections/methods/statistical-analysis-methods.tex
@@ -34,18 +34,20 @@
\subsubsection{Classifier}
For DeepMIL, the attention block consists of a linear layer of size 256 with dropout and a $\tanh$ activation function and another linear layer with dropout, resulting in a scalar.
The classifier block consists of a linear layer with dropout and $\mathrm{softmax}$ activation function and has an output size of the number of classes to predict, \ie two.
For VarMIL, the last linear layer accepts an input of twice the size, to account for the variance vector.
-For CCMIL, the last linear layer incorporates tumor locations in a 768-dimensional space.
-The embedding is done with TinyClinicalBERT~\cite{Rohanian2023}, provided by Huggingface Transformers~\cite{Wolf2020}.
+For CCMIL, the last linear layer maps textual tumor locations (\eg ``fourth ventricle'') to a 312-dimensional space.
+Text is embedded with TinyClinicalBERT~\cite{Rohanian2023}, provided by Huggingface Transformers~\cite{Wolf2020}, using its \texttt{[CLS]}-token.
During training, BERT's parameters were frozen.
All other parameters were He initialized.
+To validate BERT, the location embeddings were projected to two dimensions with t-SNE.

-\subsubsection{Internal validation}
+\subsubsection{Internal validation}\label{subsubsec:slicom-folds}
The data was split into five training and test data folds, stratified by case.
All training splits were further split randomly, again stratified by case.
+The flow of images to splits is visualized in \cref{app:folds-splits-viz}.

\subsubsection{Pre-training}
The latter model is trained on one NVIDIA A30 GPU for two days with a bag size of 256, with gradients accumulated over two epochs to imitate larger bags.
-The Adam optimizer was used with a learning rate of $3\times 10{-4}$ and $\beta_{1,2} = \{0.9,\, 0.999\}$ without weight decay and without a learning rate scheduler.
+The Adam optimizer was used with a learning rate of $3\times 10^{-4}$ and $\beta_{1,2} = \{0.9,\, 0.999\}$ without weight decay and without a learning rate scheduler.
The pretrained models are internally assessed by visualizing the extracted features in two ways.
First, tiles corresponding to ten nearest neighbors in feature space are compared.
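The CCMIL embedding step above (TinyClinicalBERT via Huggingface Transformers, frozen parameters, [CLS]-token output) can be sketched as follows; the checkpoint identifier is an assumption and should be replaced with the actual TinyClinicalBERT checkpoint:

```python
import torch
from transformers import AutoModel, AutoTokenizer

CHECKPOINT = "nlpie/tiny-clinicalbert"  # assumed identifier

tokenizer = AutoTokenizer.from_pretrained(CHECKPOINT)
model = AutoModel.from_pretrained(CHECKPOINT).eval()
for p in model.parameters():
    p.requires_grad = False  # BERT stays frozen during training, as above

with torch.no_grad():
    inputs = tokenizer("fourth ventricle", return_tensors="pt")
    hidden = model(**inputs).last_hidden_state  # (1, seq_len, hidden_size)
    cls_embedding = hidden[:, 0]  # [CLS] token; 312-dimensional for a TinyBERT-sized model
```

The resulting vector is what the CCMIL classifier concatenates with the pooled tile features.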
diff --git a/pediatric-brain-tumours/sections/results/participants.tex b/pediatric-brain-tumours/sections/results/participants.tex index ad93a5c..7f87fb4 100644 --- a/pediatric-brain-tumours/sections/results/participants.tex +++ b/pediatric-brain-tumours/sections/results/participants.tex @@ -22,7 +22,7 @@ \subsection{Participants} Ganglioglioma & 3 \\ Ependymoma & 1 \\ Glioma & 1 \\ - Medulloblastoma & 1 \\ + Medullomyoblastoma & 1 \\ Diffuse midline glioma & 1 \\ Dysembryoplastic neuroepithelial tumor & 1 \\ Pituitary Neuroendocrine Tumors & 1 \\ diff --git a/pediatric-brain-tumours/sections/results/performance.tex b/pediatric-brain-tumours/sections/results/performance.tex index cc84f16..25a5a3c 100644 --- a/pediatric-brain-tumours/sections/results/performance.tex +++ b/pediatric-brain-tumours/sections/results/performance.tex @@ -116,9 +116,9 @@ \subsubsection{MIL} \cmidrule{2-4} \cmidrule{5-7} & AUC & AUPR & AUPRG & AUC & AUPR & AUPRG \\ \midrule - ImageNet + VarMIL & \num{0.59 \pm 0.33} & \num{0.80 \pm 0.16} & \num{0.21 \pm 0.60} & \num{0.86 \pm 0.15} & \num{0.96 \pm 0.04} & \num{0.57 \pm 0.43} \\ - SimCLR + VarMIL & \num{0.73 \pm 0.17} & \num{0.88 \pm 0.10} & \num{0.39 \pm 0.37} & \num{0.69 \pm 0.24} & \num{0.89 \pm 0.08} & \num{0.28 \pm 0.37} \\ - SimCLR + CCMIL & \num{0.75 \pm 0.25} & \num{0.89 \pm 0.14} & \num{0.41 \pm 0.56} & \num{0.71 \pm 0.28} & \num{0.90 \pm 0.10} & \num{0.34 \pm 0.54} \\ + ImageNet + VarMIL & \num{0.86 \pm 0.15} & \num{0.96 \pm 0.04} & \num{0.57 \pm 0.43} & \num{0.59 \pm 0.33} & \num{0.80 \pm 0.16} & \num{0.21 \pm 0.60} \\ + SimCLR + VarMIL & \num{0.69 \pm 0.24} & \num{0.89 \pm 0.08} & \num{0.28 \pm 0.37} & \num{0.73 \pm 0.17} & \num{0.88 \pm 0.10} & \num{0.39 \pm 0.37} \\ + SimCLR + CCMIL & \num{0.71 \pm 0.28} & \num{0.90 \pm 0.10} & \num{0.34 \pm 0.54} & \num{0.75 \pm 0.25} & \num{0.89 \pm 0.14} & \num{0.41 \pm 0.56} \\ \bottomrule \end{tabular*} \end{table*} @@ -147,27 +147,46 @@ \subsubsection{MIL} \end{figure*} \subsection{Explainability} +\subsubsection{Attention weighted images} Attention weighted images are shown in \cref{fig:a-weighted-images}. They are created from medulloblastoma and pilocytic astrocytoma data from the test set of fold 0 using the corresponding model. Only a small portion of the tiles is weighted with a substantial attention weight, resulting in rather dark images. \begin{figure*} \centering - \begin{tabular}[\linewidth]{c} + \begin{tabularx}{\linewidth}{c} \includegraphics[width=\linewidth]{pediatric-brain-tumours/images/PMC_HHG_36_Hersenen_I-05_8x8_200slow-summary.png} \\ \includegraphics[width=\linewidth]{pediatric-brain-tumours/images/PMC_HHG_36_Hersenen_I-05_8x8_200slow-tiles.png} \\ + \includegraphics[width=\linewidth]{pediatric-brain-tumours/images/PMC_HHG_36_Hersenen_I-05_8x8_200slow-tiles-low-a.png} \\ \midrule - \includegraphics[width=\linewidth]{pediatric-brain-tumours/images/PMC_HHG_32_Hersenen_I-05_5x7_200slow_II-summary.png} \\ - \includegraphics[width=\linewidth]{pediatric-brain-tumours/images/PMC_HHG_32_Hersenen_I-05_5x7_200slow_II-tiles.png} - \end{tabular} + \includegraphics[width=\linewidth]{pediatric-brain-tumours/images/PMC_HHG_32_Hersenen_I-05_5x7_200slow-summary.png} \\ + \includegraphics[width=\linewidth]{pediatric-brain-tumours/images/PMC_HHG_32_Hersenen_I-05_5x7_200slow-tiles.png} \\ + \includegraphics[width=\linewidth]{pediatric-brain-tumours/images/PMC_HHG_32_Hersenen_I-05_5x7_200slow-tiles-low-a.png} + \end{tabularx} \caption[Attention weighted images]{ Attention weighted images. 
Tiles are multiplied with min-max-normalized attention weights.
Top: medulloblastoma (prediction score = 0.76).
-Bottom: pilocytic astrocytoma (prediction score = 0.86).
-Odd rows show the original image (A), the attention weighted image (B) and the pathologist annotations (C).
-Even rows show the tiles with the highest attention weight.
-The corresponding normalized attention weight is shown.
+Bottom: pilocytic astrocytoma (prediction score = 0.81).
+First rows show the original image (A), the attention weighted image (B) and the local attentions (linear grayscale) compared with pathologist annotations (green) (C).
+Second and third rows show the tiles with the highest and lowest attention weights, respectively.
+The corresponding normalized attention weight is printed.
}
\label{fig:a-weighted-images}
-\end{figure*}
\ No newline at end of file
+\end{figure*}
+
+\subsubsection{Location embeddings}
+A t-SNE projection of the location embeddings is shown in \cref{fig:tsne-cc}.
+Texts seem grouped (\eg, ``ventricle'', ``posterior cranial fossa'', ``cereb-'', ``brainstem'', ``lobe'').
+The groupings are visualized by fitting a Gaussian mixture model with six components to the t-SNE projections.
+
+\begin{figure}
+  \centering
+  \includegraphics[width=\linewidth]{pediatric-brain-tumours/images/tsne-cc.pdf}
+  \caption[T-SNE projections of location embeddings]{
+    T-SNE projections (perplexity $= 8$, $10^4$ iterations) of location embeddings.
+    Texts are connected to points with arrows.
+    A contour plot of a six-component Gaussian mixture model shows groups of text embeddings.
+  }
+  \label{fig:tsne-cc}
+\end{figure}
diff --git a/pediatric-brain-tumours/sections/results/usability.tex b/pediatric-brain-tumours/sections/results/usability.tex
index eb27938..23c08b9 100644
--- a/pediatric-brain-tumours/sections/results/usability.tex
+++ b/pediatric-brain-tumours/sections/results/usability.tex
@@ -1,6 +1,6 @@
\subsection{Usability}
The prediction model can be used intraoperatively to predict tumor type and amount in a biopsy.
-The biopsy can be placed on the scanner as in [ref Sylvia] and optionally the location of the tumor can be given in natural language.
+The biopsy can be placed on the scanner as described in~\cite{Spies2023} and optionally the location of the tumor can be given in natural language.
The model outputs a prediction in seconds.

To integrate the model with the target system, the raw data needs to be converted to images of \qty{0.2}{mpp} for the model to accept it.
@@ -12,4 +12,4 @@ \subsection{Usability}
Data polluted by blood or by a malfunctioning imaging system is not detected by the model.
The user should proceed with caution if any such artifacts appear.
-The model is shown to be accurate for images with the blood artifact where there are also regions of high quality [fig].
+% The model is shown to be accurate for images with the blood artifact where there are also regions of high quality [fig].
diff --git a/pediatric-brain-tumours/sections/theory/MIL.tex b/pediatric-brain-tumours/sections/theory/MIL.tex
index 9b29b66..525ceff 100644
--- a/pediatric-brain-tumours/sections/theory/MIL.tex
+++ b/pediatric-brain-tumours/sections/theory/MIL.tex
@@ -1,5 +1,9 @@
\subsection{Multi-instance learning}
+Now that a backbone has been trained with SimCLR and tiles can be compressed into feature vectors, the features can be used for classification.
+To this end, a classifier must be chosen and trained.
+Multi-instance learning (MIL) is a suitable technique to deal with multiple features concerning the same outcome. + \subsubsection{Classical} Multi-instance learning (MIL) is a supervised learning method. Typically, every instance in a dataset is labelled individually. @@ -28,11 +32,15 @@ \subsubsection{Classical} \label{fig:classifier} \end{figure*} -\todo[noline]{Describe need for newer MIL methods} - \subsubsection{Attention-based MIL pooling} +For pathology studies, it is important to visualize which instances are important for classification. +With classical MIL, this is not possible. To overcome the restrictive nature of maximum pooling, \textcite{Ilse2018} propose DeepMIL and use an adaptive weighted average of instances. This weighted average includes learnable weights in an attention-based manner. +High weights should be assigned to instances that are likely to have a positive label. +The weights allow distinguishing interesting instances from uninteresting ones, see \cref{fig:explainer}. +The attention weights of specific instances (\eg, patches in an image) explain how the model comes to its diagnosis prediction which could be compared with the doctor's diagnosis. + Let $H=\{\vec{h}_1, \ldots, \vec{h}_K\}$ be a bag of $K$ embeddings. The weighted average of $H$ is \begin{equation} @@ -45,11 +53,7 @@ \subsubsection{Attention-based MIL pooling} where $\mathbf{w} \in \mathbb{R}^{L \times 1}$ and $\mathbf{V}\in \mathbb{R}^{L \times M}$ are the learnable parameters. The denominator ensures the weights sum to 1. $\vec{z}$ is further processed in an MLP for classification. - -The weights allow distinguishing interesting instances from uninteresting ones, see \cref{fig:explainer} -High weights should be assigned to instances that are likely to have a positive label. -This is particularly important for pathology studies. -The attention weights of specific instances (\eg, patches in an image) explain how the model comes to its diagnosis prediction that can be compared with the doctor's diagnosis. +The weights can also be multiplied by the corresponding input tiles to show which features were important for the prediction. \begin{figure*} \centering @@ -64,12 +68,14 @@ \subsubsection{Attention-based MIL pooling} \end{figure*} \subsubsection{Variance MIL pooling}\label{subsubsection:theory_varmil} -Learning weights just for a weighted average discards any information between instances, while that might be important for the bag prediction. -Variance MIL (VarMIL) by \textcite{Schirris2022} propose to add a learned attention-weighted variance, +DeepMIL has the disadvantage of discarding any inter-tile information. +In a clinical setting, this high-level information can model \eg the intratumor heterogeneity or tumor border shape. + +\textcite{Schirris2022} propose Variance MIL (VarMIL) which adds a learnable attention-weighted variance, \begin{equation} - \boldsymbol{\sigma} = \frac{K}{K - 1} \sum_{k=1}^{K}a_k\left(\vec{h}_k-\vec{z}\right)^2. + \boldsymbol{\sigma} = \frac{K}{K - 1} \sum_{k=1}^{K}a_k\left(\vec{h}_k-\vec{z}\right)^2, \end{equation} -In a clinical setting, this variance can model \eg the intratumor heterogeneity or tumor border shape. +to capture global features. The weighted average and variance are concatenated in a single vector, such that \begin{equation} \hat{\vec{z}} = @@ -78,10 +84,11 @@ \subsubsection{Variance MIL pooling}\label{subsubsection:theory_varmil} \boldsymbol{\sigma} \end{pmatrix}. 
\end{equation}
+As with DeepMIL, the weights can still be used to highlight tiles.

\subsubsection{Clinical Context MIL}
Clinical contexts such as locations of tumor resections are important for clinical decisions by pathologists.
-For example, medulloblastoma is mostly found in the fourth ventricle or cerebellar parenchyma.\todo{https://pubmed.ncbi.nlm.nih.gov/26336203/}
+For example, medulloblastoma is mostly found in the fourth ventricle or cerebellar parenchyma~\cite{Millard2016}.
This information along with the attention weighted tiles (see \cref{subsubsection:theory_varmil}) may lead to better performance.
As pathologists also have access to this information, it is reasonable for an AI model to use the same available information.
@@ -100,12 +107,13 @@
where $\boldsymbol{\mathcal{C}}$ is the clinical context text embedding.
$\tilde{\vec{z}}$ is used as input for the trainable classifier.

-The text embedding can be created in various ways~\cite{Khattak2019}.
+The text embedding can be created in various ways\footnote{See \textcite{Khattak2019} for a review.}.
The most direct way would be to create a list of possible sentences and convert the input text to a one-hot encoded vector, effectively selecting a specific sentence from the vocabulary.
The drawback of this method is that it cannot handle out-of-vocabulary (OOV) words, which requires an extensive vocabulary that might become obsolete.

One of the most promising methods for creating text embeddings for use in downstream tasks is transformer-based methods~\cite{Vaswani2017}.
Attention-based models such as BERT~\cite{Devlin2018} and ELMo~\cite{Peters2018} are able to distinguish important words and can distil the right meaning of homographs.
+With BERT's \texttt{[CLS]}-token, a sentence-level representation can be created, embedding a sentence as a point in a high-dimensional space.
These models can learn OOV words, which enables them to be used in new contexts.

The aggregator of CCMIL is visualized in \cref{fig:CCMIL_aggregator}.
diff --git a/pediatric-brain-tumours/sections/theory/feature-extraction.tex b/pediatric-brain-tumours/sections/theory/feature-extraction.tex
index 4079d8e..f0b0876 100644
--- a/pediatric-brain-tumours/sections/theory/feature-extraction.tex
+++ b/pediatric-brain-tumours/sections/theory/feature-extraction.tex
@@ -30,7 +30,7 @@
\begin{align}
s_{i,j} = \frac{\vec{z}_i \cdot \vec{z}_j }{\|\vec{z}_i\|\|\vec{z}_j\|},
\end{align}
-$N$ the number of samples in the batch, $\tau$ the temperature to scale the similarity with, and $\mathbb{1}$ the indicator function which maps all elements that satisfy it subscript to 1.
+$N$ the number of samples in the batch, $\tau$ the temperature to scale the similarity with, and $\mathbb{1}_{[k\neq i]}=1$ if $k \neq i$ and 0 otherwise.

SimCLR is visualized in \cref{fig:simclr}.
diff --git a/pediatric-brain-tumours/sections/theory/mb-pa-hhg.tex b/pediatric-brain-tumours/sections/theory/mb-pa-hhg.tex
new file mode 100644
index 0000000..7490bea
--- /dev/null
+++ b/pediatric-brain-tumours/sections/theory/mb-pa-hhg.tex
@@ -0,0 +1,2 @@
+\subsection{Medulloblastoma and pilocytic astrocytoma}
+TODO. What do they look like in HHG images?
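The attention pooling and attention-weighted variance described in the MIL theory above can be condensed into one module. A minimal PyTorch sketch of the DeepMIL/VarMIL pooling equations (layer sizes are illustrative assumptions, not the study's exact configuration):

```python
import torch
import torch.nn as nn

class VarMILPooling(nn.Module):
    """Attention-weighted mean (DeepMIL) plus attention-weighted variance (VarMIL)."""

    def __init__(self, in_dim: int = 512, attn_dim: int = 256):
        super().__init__()
        self.V = nn.Linear(in_dim, attn_dim, bias=False)
        self.w = nn.Linear(attn_dim, 1, bias=False)

    def forward(self, h: torch.Tensor) -> torch.Tensor:
        # h: (K, M) bag of K instance embeddings
        a = torch.softmax(self.w(torch.tanh(self.V(h))), dim=0)  # (K, 1), sums to 1
        z = torch.sum(a * h, dim=0)                              # weighted average, (M,)
        k = h.shape[0]
        var = k / (k - 1) * torch.sum(a * (h - z) ** 2, dim=0)   # weighted variance, (M,)
        return torch.cat([z, var])  # (2M,): input to the classifier head, as in VarMIL
```

The attention weights `a` can additionally be min-max normalized and multiplied onto the input tiles to produce attention weighted images like those shown in the results above.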
diff --git a/pediatric-brain-tumours/sections/theory/theory.tex b/pediatric-brain-tumours/sections/theory/theory.tex index bb8b99f..fc27790 100644 --- a/pediatric-brain-tumours/sections/theory/theory.tex +++ b/pediatric-brain-tumours/sections/theory/theory.tex @@ -1,5 +1,6 @@ \section{Theory} -\input{pediatric-brain-tumours/sections/theory/masking.tex} +\input{pediatric-brain-tumours/sections/theory/mb-pa-hhg.tex} \input{pediatric-brain-tumours/sections/theory/feature-extraction.tex} \input{pediatric-brain-tumours/sections/theory/MIL.tex} \input{pediatric-brain-tumours/sections/theory/performance.tex} +\input{pediatric-brain-tumours/sections/theory/masking.tex} diff --git a/skinstression/images/lds/a.pdf b/skinstression/images/lds/a.pdf index cb96ae9..d9f4a8c 100644 Binary files a/skinstression/images/lds/a.pdf and b/skinstression/images/lds/a.pdf differ diff --git a/skinstression/images/lds/k.pdf b/skinstression/images/lds/k.pdf index 501a34d..e1f2fb0 100644 Binary files a/skinstression/images/lds/k.pdf and b/skinstression/images/lds/k.pdf differ diff --git a/skinstression/images/lds/xc.pdf b/skinstression/images/lds/xc.pdf index 22cf8c1..735e47b 100644 Binary files a/skinstression/images/lds/xc.pdf and b/skinstression/images/lds/xc.pdf differ diff --git a/skinstression/images/logistic-fits/r2_hist.pdf b/skinstression/images/logistic-fits/r2_hist.pdf index 12b5df0..52b9df6 100644 Binary files a/skinstression/images/logistic-fits/r2_hist.pdf and b/skinstression/images/logistic-fits/r2_hist.pdf differ diff --git a/skinstression/images/logistic-fits/sample_10.pdf b/skinstression/images/logistic-fits/sample_10.pdf index d72ad26..f188aab 100644 Binary files a/skinstression/images/logistic-fits/sample_10.pdf and b/skinstression/images/logistic-fits/sample_10.pdf differ diff --git a/skinstression/images/logistic-fits/sample_11.pdf b/skinstression/images/logistic-fits/sample_11.pdf index 77e264e..f7db56a 100644 Binary files a/skinstression/images/logistic-fits/sample_11.pdf and b/skinstression/images/logistic-fits/sample_11.pdf differ diff --git a/skinstression/images/logistic-fits/sample_12.pdf b/skinstression/images/logistic-fits/sample_12.pdf index d606d58..be3b49d 100644 Binary files a/skinstression/images/logistic-fits/sample_12.pdf and b/skinstression/images/logistic-fits/sample_12.pdf differ diff --git a/skinstression/images/logistic-fits/sample_13.pdf b/skinstression/images/logistic-fits/sample_13.pdf index c5dab93..3740f62 100644 Binary files a/skinstression/images/logistic-fits/sample_13.pdf and b/skinstression/images/logistic-fits/sample_13.pdf differ diff --git a/skinstression/images/logistic-fits/sample_14.pdf b/skinstression/images/logistic-fits/sample_14.pdf index f626653..a53858a 100644 Binary files a/skinstression/images/logistic-fits/sample_14.pdf and b/skinstression/images/logistic-fits/sample_14.pdf differ diff --git a/skinstression/images/logistic-fits/sample_15.pdf b/skinstression/images/logistic-fits/sample_15.pdf index c0ded93..42074f7 100644 Binary files a/skinstression/images/logistic-fits/sample_15.pdf and b/skinstression/images/logistic-fits/sample_15.pdf differ diff --git a/skinstression/images/logistic-fits/sample_16.pdf b/skinstression/images/logistic-fits/sample_16.pdf index 8d05b99..c012196 100644 Binary files a/skinstression/images/logistic-fits/sample_16.pdf and b/skinstression/images/logistic-fits/sample_16.pdf differ diff --git a/skinstression/images/logistic-fits/sample_17.pdf b/skinstression/images/logistic-fits/sample_17.pdf index b4dd896..782b537 100644 
Binary files a/skinstression/images/logistic-fits/sample_17.pdf and b/skinstression/images/logistic-fits/sample_17.pdf differ diff --git a/skinstression/images/logistic-fits/sample_18.pdf b/skinstression/images/logistic-fits/sample_18.pdf index a160f2d..8e8c971 100644 Binary files a/skinstression/images/logistic-fits/sample_18.pdf and b/skinstression/images/logistic-fits/sample_18.pdf differ diff --git a/skinstression/images/logistic-fits/sample_19.pdf b/skinstression/images/logistic-fits/sample_19.pdf index 2197c41..b656d22 100644 Binary files a/skinstression/images/logistic-fits/sample_19.pdf and b/skinstression/images/logistic-fits/sample_19.pdf differ diff --git a/skinstression/images/logistic-fits/sample_20.pdf b/skinstression/images/logistic-fits/sample_20.pdf index 424e986..7cde9a4 100644 Binary files a/skinstression/images/logistic-fits/sample_20.pdf and b/skinstression/images/logistic-fits/sample_20.pdf differ diff --git a/skinstression/images/logistic-fits/sample_21.pdf b/skinstression/images/logistic-fits/sample_21.pdf index ace57e9..6238ebd 100644 Binary files a/skinstression/images/logistic-fits/sample_21.pdf and b/skinstression/images/logistic-fits/sample_21.pdf differ diff --git a/skinstression/images/logistic-fits/sample_22.pdf b/skinstression/images/logistic-fits/sample_22.pdf index 8c72653..05b3f8a 100644 Binary files a/skinstression/images/logistic-fits/sample_22.pdf and b/skinstression/images/logistic-fits/sample_22.pdf differ diff --git a/skinstression/images/logistic-fits/sample_23.pdf b/skinstression/images/logistic-fits/sample_23.pdf index 8c2fdb0..517a07a 100644 Binary files a/skinstression/images/logistic-fits/sample_23.pdf and b/skinstression/images/logistic-fits/sample_23.pdf differ diff --git a/skinstression/images/logistic-fits/sample_24.pdf b/skinstression/images/logistic-fits/sample_24.pdf index 8eab527..d3b1bd2 100644 Binary files a/skinstression/images/logistic-fits/sample_24.pdf and b/skinstression/images/logistic-fits/sample_24.pdf differ diff --git a/skinstression/images/logistic-fits/sample_25.pdf b/skinstression/images/logistic-fits/sample_25.pdf index 07a27b5..b2cccc4 100644 Binary files a/skinstression/images/logistic-fits/sample_25.pdf and b/skinstression/images/logistic-fits/sample_25.pdf differ diff --git a/skinstression/images/logistic-fits/sample_26.pdf b/skinstression/images/logistic-fits/sample_26.pdf index fb5f4c1..74a7832 100644 Binary files a/skinstression/images/logistic-fits/sample_26.pdf and b/skinstression/images/logistic-fits/sample_26.pdf differ diff --git a/skinstression/images/logistic-fits/sample_27.pdf b/skinstression/images/logistic-fits/sample_27.pdf index 7832200..3617428 100644 Binary files a/skinstression/images/logistic-fits/sample_27.pdf and b/skinstression/images/logistic-fits/sample_27.pdf differ diff --git a/skinstression/images/logistic-fits/sample_28.pdf b/skinstression/images/logistic-fits/sample_28.pdf index 5eb9f88..54e9771 100644 Binary files a/skinstression/images/logistic-fits/sample_28.pdf and b/skinstression/images/logistic-fits/sample_28.pdf differ diff --git a/skinstression/images/logistic-fits/sample_29.pdf b/skinstression/images/logistic-fits/sample_29.pdf index 84719d9..728e172 100644 Binary files a/skinstression/images/logistic-fits/sample_29.pdf and b/skinstression/images/logistic-fits/sample_29.pdf differ diff --git a/skinstression/images/logistic-fits/sample_30.pdf b/skinstression/images/logistic-fits/sample_30.pdf index 79e9355..fc9b4c0 100644 Binary files 
a/skinstression/images/logistic-fits/sample_30.pdf and b/skinstression/images/logistic-fits/sample_30.pdf differ diff --git a/skinstression/images/logistic-fits/sample_31.pdf b/skinstression/images/logistic-fits/sample_31.pdf index e913e81..34e6898 100644 Binary files a/skinstression/images/logistic-fits/sample_31.pdf and b/skinstression/images/logistic-fits/sample_31.pdf differ diff --git a/skinstression/images/logistic-fits/sample_32.pdf b/skinstression/images/logistic-fits/sample_32.pdf index 301753b..efdecfe 100644 Binary files a/skinstression/images/logistic-fits/sample_32.pdf and b/skinstression/images/logistic-fits/sample_32.pdf differ diff --git a/skinstression/images/logistic-fits/sample_33.pdf b/skinstression/images/logistic-fits/sample_33.pdf index 17cfb54..8e22d91 100644 Binary files a/skinstression/images/logistic-fits/sample_33.pdf and b/skinstression/images/logistic-fits/sample_33.pdf differ diff --git a/skinstression/images/logistic-fits/sample_34.pdf b/skinstression/images/logistic-fits/sample_34.pdf index b523815..c67865b 100644 Binary files a/skinstression/images/logistic-fits/sample_34.pdf and b/skinstression/images/logistic-fits/sample_34.pdf differ diff --git a/skinstression/images/logistic-fits/sample_35.pdf b/skinstression/images/logistic-fits/sample_35.pdf index 0cd2691..3c1ab82 100644 Binary files a/skinstression/images/logistic-fits/sample_35.pdf and b/skinstression/images/logistic-fits/sample_35.pdf differ diff --git a/skinstression/images/logistic-fits/sample_36.pdf b/skinstression/images/logistic-fits/sample_36.pdf index 044fdea..fca9392 100644 Binary files a/skinstression/images/logistic-fits/sample_36.pdf and b/skinstression/images/logistic-fits/sample_36.pdf differ diff --git a/skinstression/images/logistic-fits/sample_37.pdf b/skinstression/images/logistic-fits/sample_37.pdf index d57be6e..e9b6274 100644 Binary files a/skinstression/images/logistic-fits/sample_37.pdf and b/skinstression/images/logistic-fits/sample_37.pdf differ diff --git a/skinstression/images/logistic-fits/sample_38.pdf b/skinstression/images/logistic-fits/sample_38.pdf index f9f1d8e..1bfb0ec 100644 Binary files a/skinstression/images/logistic-fits/sample_38.pdf and b/skinstression/images/logistic-fits/sample_38.pdf differ diff --git a/skinstression/images/logistic-fits/sample_39.pdf b/skinstression/images/logistic-fits/sample_39.pdf index 38712d6..6c8e9b6 100644 Binary files a/skinstression/images/logistic-fits/sample_39.pdf and b/skinstression/images/logistic-fits/sample_39.pdf differ diff --git a/skinstression/images/logistic-fits/sample_40.pdf b/skinstression/images/logistic-fits/sample_40.pdf index 63f7237..20a0c60 100644 Binary files a/skinstression/images/logistic-fits/sample_40.pdf and b/skinstression/images/logistic-fits/sample_40.pdf differ diff --git a/skinstression/images/logistic-fits/sample_41.pdf b/skinstression/images/logistic-fits/sample_41.pdf index 072e7d6..1c6cde3 100644 Binary files a/skinstression/images/logistic-fits/sample_41.pdf and b/skinstression/images/logistic-fits/sample_41.pdf differ diff --git a/skinstression/images/logistic-fits/sample_42.pdf b/skinstression/images/logistic-fits/sample_42.pdf index cd03d2f..eddc100 100644 Binary files a/skinstression/images/logistic-fits/sample_42.pdf and b/skinstression/images/logistic-fits/sample_42.pdf differ diff --git a/skinstression/images/logistic-fits/sample_43.pdf b/skinstression/images/logistic-fits/sample_43.pdf index d4c2048..1018e03 100644 Binary files a/skinstression/images/logistic-fits/sample_43.pdf and 
b/skinstression/images/logistic-fits/sample_43.pdf differ diff --git a/skinstression/images/logistic-fits/sample_44.pdf b/skinstression/images/logistic-fits/sample_44.pdf index 2609152..00bffc2 100644 Binary files a/skinstression/images/logistic-fits/sample_44.pdf and b/skinstression/images/logistic-fits/sample_44.pdf differ diff --git a/skinstression/images/logistic-fits/sample_45.pdf b/skinstression/images/logistic-fits/sample_45.pdf index fd77bad..3cd220d 100644 Binary files a/skinstression/images/logistic-fits/sample_45.pdf and b/skinstression/images/logistic-fits/sample_45.pdf differ diff --git a/skinstression/images/logistic-fits/sample_46.pdf b/skinstression/images/logistic-fits/sample_46.pdf index f024e04..05b0f9c 100644 Binary files a/skinstression/images/logistic-fits/sample_46.pdf and b/skinstression/images/logistic-fits/sample_46.pdf differ diff --git a/skinstression/images/logistic-fits/sample_47.pdf b/skinstression/images/logistic-fits/sample_47.pdf index 817fc85..81448b3 100644 Binary files a/skinstression/images/logistic-fits/sample_47.pdf and b/skinstression/images/logistic-fits/sample_47.pdf differ diff --git a/skinstression/images/logistic-fits/sample_48.pdf b/skinstression/images/logistic-fits/sample_48.pdf index 688ae60..b1712f5 100644 Binary files a/skinstression/images/logistic-fits/sample_48.pdf and b/skinstression/images/logistic-fits/sample_48.pdf differ diff --git a/skinstression/images/logistic-fits/sample_49.pdf b/skinstression/images/logistic-fits/sample_49.pdf index e5bedd6..8885ec1 100644 Binary files a/skinstression/images/logistic-fits/sample_49.pdf and b/skinstression/images/logistic-fits/sample_49.pdf differ diff --git a/skinstression/images/logistic-fits/sample_50.pdf b/skinstression/images/logistic-fits/sample_50.pdf index 21cbda9..e3259e8 100644 Binary files a/skinstression/images/logistic-fits/sample_50.pdf and b/skinstression/images/logistic-fits/sample_50.pdf differ diff --git a/skinstression/images/logistic-fits/sample_51.pdf b/skinstression/images/logistic-fits/sample_51.pdf index 2680409..036b0d8 100644 Binary files a/skinstression/images/logistic-fits/sample_51.pdf and b/skinstression/images/logistic-fits/sample_51.pdf differ diff --git a/skinstression/images/logistic-fits/sample_52.pdf b/skinstression/images/logistic-fits/sample_52.pdf index 0e86dc6..f304cf2 100644 Binary files a/skinstression/images/logistic-fits/sample_52.pdf and b/skinstression/images/logistic-fits/sample_52.pdf differ diff --git a/skinstression/images/logistic-fits/sample_53.pdf b/skinstression/images/logistic-fits/sample_53.pdf index b13c683..759531c 100644 Binary files a/skinstression/images/logistic-fits/sample_53.pdf and b/skinstression/images/logistic-fits/sample_53.pdf differ diff --git a/skinstression/images/logistic-fits/sample_54.pdf b/skinstression/images/logistic-fits/sample_54.pdf index b43cabb..4793a17 100644 Binary files a/skinstression/images/logistic-fits/sample_54.pdf and b/skinstression/images/logistic-fits/sample_54.pdf differ diff --git a/skinstression/images/logistic-fits/sample_55.pdf b/skinstression/images/logistic-fits/sample_55.pdf index 7840e78..da8c6c6 100644 Binary files a/skinstression/images/logistic-fits/sample_55.pdf and b/skinstression/images/logistic-fits/sample_55.pdf differ diff --git a/skinstression/images/logistic-fits/sample_56.pdf b/skinstression/images/logistic-fits/sample_56.pdf index db3d103..da47dd6 100644 Binary files a/skinstression/images/logistic-fits/sample_56.pdf and b/skinstression/images/logistic-fits/sample_56.pdf differ 
diff --git a/skinstression/images/logistic-fits/sample_57.pdf b/skinstression/images/logistic-fits/sample_57.pdf index 27376f0..6221b88 100644 Binary files a/skinstression/images/logistic-fits/sample_57.pdf and b/skinstression/images/logistic-fits/sample_57.pdf differ diff --git a/skinstression/images/logistic-fits/sample_58.pdf b/skinstression/images/logistic-fits/sample_58.pdf index d29ad48..b0ae94e 100644 Binary files a/skinstression/images/logistic-fits/sample_58.pdf and b/skinstression/images/logistic-fits/sample_58.pdf differ diff --git a/skinstression/images/logistic-fits/sample_59.pdf b/skinstression/images/logistic-fits/sample_59.pdf index 15a1e88..8752051 100644 Binary files a/skinstression/images/logistic-fits/sample_59.pdf and b/skinstression/images/logistic-fits/sample_59.pdf differ diff --git a/skinstression/images/logistic-fits/sample_6.pdf b/skinstression/images/logistic-fits/sample_6.pdf index 437f32c..dac06c9 100644 Binary files a/skinstression/images/logistic-fits/sample_6.pdf and b/skinstression/images/logistic-fits/sample_6.pdf differ diff --git a/skinstression/images/logistic-fits/sample_60.pdf b/skinstression/images/logistic-fits/sample_60.pdf index d019630..616d97b 100644 Binary files a/skinstression/images/logistic-fits/sample_60.pdf and b/skinstression/images/logistic-fits/sample_60.pdf differ diff --git a/skinstression/images/logistic-fits/sample_61.pdf b/skinstression/images/logistic-fits/sample_61.pdf index 561c16b..495eec5 100644 Binary files a/skinstression/images/logistic-fits/sample_61.pdf and b/skinstression/images/logistic-fits/sample_61.pdf differ diff --git a/skinstression/images/logistic-fits/sample_7.pdf b/skinstression/images/logistic-fits/sample_7.pdf index 9e09963..8de937a 100644 Binary files a/skinstression/images/logistic-fits/sample_7.pdf and b/skinstression/images/logistic-fits/sample_7.pdf differ diff --git a/skinstression/images/logistic-fits/sample_8.pdf b/skinstression/images/logistic-fits/sample_8.pdf index a534e2e..8bd12c0 100644 Binary files a/skinstression/images/logistic-fits/sample_8.pdf and b/skinstression/images/logistic-fits/sample_8.pdf differ diff --git a/skinstression/images/logistic-fits/sample_9.pdf b/skinstression/images/logistic-fits/sample_9.pdf index bd09490..c83d4ff 100644 Binary files a/skinstression/images/logistic-fits/sample_9.pdf and b/skinstression/images/logistic-fits/sample_9.pdf differ diff --git a/skinstression/images/measuring-setup.png b/skinstression/images/measuring-setup.png new file mode 100644 index 0000000..047c735 Binary files /dev/null and b/skinstression/images/measuring-setup.png differ diff --git a/skinstression/images/pca-fits/PC_age.pdf b/skinstression/images/pca-fits/PC_age.pdf new file mode 100644 index 0000000..61d0e61 Binary files /dev/null and b/skinstression/images/pca-fits/PC_age.pdf differ diff --git a/skinstression/images/pca-fits/PC_person.pdf b/skinstression/images/pca-fits/PC_person.pdf new file mode 100644 index 0000000..7feb7d3 Binary files /dev/null and b/skinstression/images/pca-fits/PC_person.pdf differ diff --git a/skinstression/images/pca-fits/PC_sex.pdf b/skinstression/images/pca-fits/PC_sex.pdf new file mode 100644 index 0000000..ad9f8ef Binary files /dev/null and b/skinstression/images/pca-fits/PC_sex.pdf differ diff --git a/skinstression/images/pca-fits/r2_hist.pdf b/skinstression/images/pca-fits/r2_hist.pdf index 5ebfa20..00389e7 100644 Binary files a/skinstression/images/pca-fits/r2_hist.pdf and b/skinstression/images/pca-fits/r2_hist.pdf differ diff --git 
a/skinstression/images/pca-fits/sample_10.pdf b/skinstression/images/pca-fits/sample_10.pdf index c4ecaf3..1b75dbf 100644 Binary files a/skinstression/images/pca-fits/sample_10.pdf and b/skinstression/images/pca-fits/sample_10.pdf differ diff --git a/skinstression/images/pca-fits/sample_11.pdf b/skinstression/images/pca-fits/sample_11.pdf index dcf5350..c221a2d 100644 Binary files a/skinstression/images/pca-fits/sample_11.pdf and b/skinstression/images/pca-fits/sample_11.pdf differ diff --git a/skinstression/images/pca-fits/sample_12.pdf b/skinstression/images/pca-fits/sample_12.pdf index 17b030e..7cf886e 100644 Binary files a/skinstression/images/pca-fits/sample_12.pdf and b/skinstression/images/pca-fits/sample_12.pdf differ diff --git a/skinstression/images/pca-fits/sample_13.pdf b/skinstression/images/pca-fits/sample_13.pdf index 9a55f24..12f8012 100644 Binary files a/skinstression/images/pca-fits/sample_13.pdf and b/skinstression/images/pca-fits/sample_13.pdf differ diff --git a/skinstression/images/pca-fits/sample_14.pdf b/skinstression/images/pca-fits/sample_14.pdf index 3ec93e1..d24d6d2 100644 Binary files a/skinstression/images/pca-fits/sample_14.pdf and b/skinstression/images/pca-fits/sample_14.pdf differ diff --git a/skinstression/images/pca-fits/sample_15.pdf b/skinstression/images/pca-fits/sample_15.pdf index 97c8c6f..2f876ba 100644 Binary files a/skinstression/images/pca-fits/sample_15.pdf and b/skinstression/images/pca-fits/sample_15.pdf differ diff --git a/skinstression/images/pca-fits/sample_16.pdf b/skinstression/images/pca-fits/sample_16.pdf index 58959da..b60a2f6 100644 Binary files a/skinstression/images/pca-fits/sample_16.pdf and b/skinstression/images/pca-fits/sample_16.pdf differ diff --git a/skinstression/images/pca-fits/sample_17.pdf b/skinstression/images/pca-fits/sample_17.pdf index 93379a1..e868c69 100644 Binary files a/skinstression/images/pca-fits/sample_17.pdf and b/skinstression/images/pca-fits/sample_17.pdf differ diff --git a/skinstression/images/pca-fits/sample_18.pdf b/skinstression/images/pca-fits/sample_18.pdf index 0d68b05..1e59c0e 100644 Binary files a/skinstression/images/pca-fits/sample_18.pdf and b/skinstression/images/pca-fits/sample_18.pdf differ diff --git a/skinstression/images/pca-fits/sample_19.pdf b/skinstression/images/pca-fits/sample_19.pdf index 5b553ef..7cf7eeb 100644 Binary files a/skinstression/images/pca-fits/sample_19.pdf and b/skinstression/images/pca-fits/sample_19.pdf differ diff --git a/skinstression/images/pca-fits/sample_20.pdf b/skinstression/images/pca-fits/sample_20.pdf index 089acdd..7530dbd 100644 Binary files a/skinstression/images/pca-fits/sample_20.pdf and b/skinstression/images/pca-fits/sample_20.pdf differ diff --git a/skinstression/images/pca-fits/sample_21.pdf b/skinstression/images/pca-fits/sample_21.pdf index 6302b78..83d1bd5 100644 Binary files a/skinstression/images/pca-fits/sample_21.pdf and b/skinstression/images/pca-fits/sample_21.pdf differ diff --git a/skinstression/images/pca-fits/sample_22.pdf b/skinstression/images/pca-fits/sample_22.pdf index 48efd7e..13b8ad6 100644 Binary files a/skinstression/images/pca-fits/sample_22.pdf and b/skinstression/images/pca-fits/sample_22.pdf differ diff --git a/skinstression/images/pca-fits/sample_23.pdf b/skinstression/images/pca-fits/sample_23.pdf index 864b310..b2c0575 100644 Binary files a/skinstression/images/pca-fits/sample_23.pdf and b/skinstression/images/pca-fits/sample_23.pdf differ diff --git a/skinstression/images/pca-fits/sample_24.pdf 
b/skinstression/images/pca-fits/sample_24.pdf index 494527a..518f72c 100644 Binary files a/skinstression/images/pca-fits/sample_24.pdf and b/skinstression/images/pca-fits/sample_24.pdf differ diff --git a/skinstression/images/pca-fits/sample_25.pdf b/skinstression/images/pca-fits/sample_25.pdf index a0c1795..f4bf69b 100644 Binary files a/skinstression/images/pca-fits/sample_25.pdf and b/skinstression/images/pca-fits/sample_25.pdf differ diff --git a/skinstression/images/pca-fits/sample_26.pdf b/skinstression/images/pca-fits/sample_26.pdf index b73a225..2991d69 100644 Binary files a/skinstression/images/pca-fits/sample_26.pdf and b/skinstression/images/pca-fits/sample_26.pdf differ diff --git a/skinstression/images/pca-fits/sample_27.pdf b/skinstression/images/pca-fits/sample_27.pdf index 2b6eecb..43809fa 100644 Binary files a/skinstression/images/pca-fits/sample_27.pdf and b/skinstression/images/pca-fits/sample_27.pdf differ diff --git a/skinstression/images/pca-fits/sample_28.pdf b/skinstression/images/pca-fits/sample_28.pdf index 691480d..c2d3895 100644 Binary files a/skinstression/images/pca-fits/sample_28.pdf and b/skinstression/images/pca-fits/sample_28.pdf differ diff --git a/skinstression/images/pca-fits/sample_29.pdf b/skinstression/images/pca-fits/sample_29.pdf index 25e7b10..d5774be 100644 Binary files a/skinstression/images/pca-fits/sample_29.pdf and b/skinstression/images/pca-fits/sample_29.pdf differ diff --git a/skinstression/images/pca-fits/sample_30.pdf b/skinstression/images/pca-fits/sample_30.pdf index ef599d7..4dc57d1 100644 Binary files a/skinstression/images/pca-fits/sample_30.pdf and b/skinstression/images/pca-fits/sample_30.pdf differ diff --git a/skinstression/images/pca-fits/sample_31.pdf b/skinstression/images/pca-fits/sample_31.pdf index 47b2bf8..7d2ffa2 100644 Binary files a/skinstression/images/pca-fits/sample_31.pdf and b/skinstression/images/pca-fits/sample_31.pdf differ diff --git a/skinstression/images/pca-fits/sample_32.pdf b/skinstression/images/pca-fits/sample_32.pdf index 30e92a0..7eec558 100644 Binary files a/skinstression/images/pca-fits/sample_32.pdf and b/skinstression/images/pca-fits/sample_32.pdf differ diff --git a/skinstression/images/pca-fits/sample_33.pdf b/skinstression/images/pca-fits/sample_33.pdf index d45838e..2894195 100644 Binary files a/skinstression/images/pca-fits/sample_33.pdf and b/skinstression/images/pca-fits/sample_33.pdf differ diff --git a/skinstression/images/pca-fits/sample_34.pdf b/skinstression/images/pca-fits/sample_34.pdf index d42dd18..5985e8a 100644 Binary files a/skinstression/images/pca-fits/sample_34.pdf and b/skinstression/images/pca-fits/sample_34.pdf differ diff --git a/skinstression/images/pca-fits/sample_45.pdf b/skinstression/images/pca-fits/sample_45.pdf index 22ba029..6897410 100644 Binary files a/skinstression/images/pca-fits/sample_45.pdf and b/skinstression/images/pca-fits/sample_45.pdf differ diff --git a/skinstression/images/pca-fits/sample_46.pdf b/skinstression/images/pca-fits/sample_46.pdf index 132edf2..a6d106a 100644 Binary files a/skinstression/images/pca-fits/sample_46.pdf and b/skinstression/images/pca-fits/sample_46.pdf differ diff --git a/skinstression/images/pca-fits/sample_47.pdf b/skinstression/images/pca-fits/sample_47.pdf index 78fdad4..8cbba99 100644 Binary files a/skinstression/images/pca-fits/sample_47.pdf and b/skinstression/images/pca-fits/sample_47.pdf differ diff --git a/skinstression/images/pca-fits/sample_48.pdf b/skinstression/images/pca-fits/sample_48.pdf index 798da2d..cdd649a 
100644 Binary files a/skinstression/images/pca-fits/sample_48.pdf and b/skinstression/images/pca-fits/sample_48.pdf differ diff --git a/skinstression/images/pca-fits/sample_49.pdf b/skinstression/images/pca-fits/sample_49.pdf index 16fb791..7b9e4e3 100644 Binary files a/skinstression/images/pca-fits/sample_49.pdf and b/skinstression/images/pca-fits/sample_49.pdf differ diff --git a/skinstression/images/pca-fits/sample_50.pdf b/skinstression/images/pca-fits/sample_50.pdf index 2c508c4..c1a925f 100644 Binary files a/skinstression/images/pca-fits/sample_50.pdf and b/skinstression/images/pca-fits/sample_50.pdf differ diff --git a/skinstression/images/pca-fits/sample_51.pdf b/skinstression/images/pca-fits/sample_51.pdf index 95f4e69..4b6864c 100644 Binary files a/skinstression/images/pca-fits/sample_51.pdf and b/skinstression/images/pca-fits/sample_51.pdf differ diff --git a/skinstression/images/pca-fits/sample_52.pdf b/skinstression/images/pca-fits/sample_52.pdf index 5fd7854..22947b4 100644 Binary files a/skinstression/images/pca-fits/sample_52.pdf and b/skinstression/images/pca-fits/sample_52.pdf differ diff --git a/skinstression/images/pca-fits/sample_53.pdf b/skinstression/images/pca-fits/sample_53.pdf index 0c6a3b5..b815d2a 100644 Binary files a/skinstression/images/pca-fits/sample_53.pdf and b/skinstression/images/pca-fits/sample_53.pdf differ diff --git a/skinstression/images/pca-fits/sample_54.pdf b/skinstression/images/pca-fits/sample_54.pdf index a91b693..5e04dfa 100644 Binary files a/skinstression/images/pca-fits/sample_54.pdf and b/skinstression/images/pca-fits/sample_54.pdf differ diff --git a/skinstression/images/pca-fits/sample_55.pdf b/skinstression/images/pca-fits/sample_55.pdf index 414280e..3b0bde3 100644 Binary files a/skinstression/images/pca-fits/sample_55.pdf and b/skinstression/images/pca-fits/sample_55.pdf differ diff --git a/skinstression/images/pca-fits/sample_56.pdf b/skinstression/images/pca-fits/sample_56.pdf index 13cfea2..19eebb0 100644 Binary files a/skinstression/images/pca-fits/sample_56.pdf and b/skinstression/images/pca-fits/sample_56.pdf differ diff --git a/skinstression/images/pca-fits/sample_57.pdf b/skinstression/images/pca-fits/sample_57.pdf index c240df7..e73f4e5 100644 Binary files a/skinstression/images/pca-fits/sample_57.pdf and b/skinstression/images/pca-fits/sample_57.pdf differ diff --git a/skinstression/images/pca-fits/sample_58.pdf b/skinstression/images/pca-fits/sample_58.pdf index 363e232..7221e41 100644 Binary files a/skinstression/images/pca-fits/sample_58.pdf and b/skinstression/images/pca-fits/sample_58.pdf differ diff --git a/skinstression/images/pca-fits/sample_59.pdf b/skinstression/images/pca-fits/sample_59.pdf index adbfc8e..289469f 100644 Binary files a/skinstression/images/pca-fits/sample_59.pdf and b/skinstression/images/pca-fits/sample_59.pdf differ diff --git a/skinstression/images/pca-fits/sample_6.pdf b/skinstression/images/pca-fits/sample_6.pdf index 914d7b6..cd78980 100644 Binary files a/skinstression/images/pca-fits/sample_6.pdf and b/skinstression/images/pca-fits/sample_6.pdf differ diff --git a/skinstression/images/pca-fits/sample_60.pdf b/skinstression/images/pca-fits/sample_60.pdf index 769f22d..7c4ffad 100644 Binary files a/skinstression/images/pca-fits/sample_60.pdf and b/skinstression/images/pca-fits/sample_60.pdf differ diff --git a/skinstression/images/pca-fits/sample_61.pdf b/skinstression/images/pca-fits/sample_61.pdf index e16408c..2e77541 100644 Binary files a/skinstression/images/pca-fits/sample_61.pdf and 
b/skinstression/images/pca-fits/sample_61.pdf differ diff --git a/skinstression/images/pca-fits/sample_7.pdf b/skinstression/images/pca-fits/sample_7.pdf index 9a1d1f5..cc8dbb9 100644 Binary files a/skinstression/images/pca-fits/sample_7.pdf and b/skinstression/images/pca-fits/sample_7.pdf differ diff --git a/skinstression/images/pca-fits/sample_8.pdf b/skinstression/images/pca-fits/sample_8.pdf index 90c8f82..5a7efde 100644 Binary files a/skinstression/images/pca-fits/sample_8.pdf and b/skinstression/images/pca-fits/sample_8.pdf differ diff --git a/skinstression/images/pca-fits/sample_9.pdf b/skinstression/images/pca-fits/sample_9.pdf index b0a791b..77bd395 100644 Binary files a/skinstression/images/pca-fits/sample_9.pdf and b/skinstression/images/pca-fits/sample_9.pdf differ diff --git a/skinstression/images/pca-fits/screeplot.pdf b/skinstression/images/pca-fits/screeplot.pdf new file mode 100644 index 0000000..f5d5e7b Binary files /dev/null and b/skinstression/images/pca-fits/screeplot.pdf differ diff --git a/skinstression/images/stress-strain-curve.pdf b/skinstression/images/stress-strain-curve.pdf new file mode 100644 index 0000000..fe678f8 Binary files /dev/null and b/skinstression/images/stress-strain-curve.pdf differ diff --git a/skinstression/images/stress-strain-curve.png b/skinstression/images/stress-strain-curve.png new file mode 100644 index 0000000..57c9245 Binary files /dev/null and b/skinstression/images/stress-strain-curve.png differ diff --git a/skinstression/images/stress-strain-curve.svg b/skinstression/images/stress-strain-curve.svg new file mode 100644 index 0000000..78df6e9 --- /dev/null +++ b/skinstression/images/stress-strain-curve.svg @@ -0,0 +1,208 @@ +[SVG markup lost in extraction; recoverable text labels: panel A, stress (a.u.) versus strain (a.u.) with regions I/II, III, and IV and axis ticks 0 and 1; panel B, log dstress/dstrain (a.u.) versus log strain (a.u.)] diff --git a/skinstression/sections/abstract.tex b/skinstression/sections/abstract.tex new file mode 100644 index 0000000..f8d7ca0 --- /dev/null +++ b/skinstression/sections/abstract.tex @@ -0,0 +1,25 @@ +\pdfbookmark[1]{Abstract}{skin_abstract} +\section*{Abstract} + +\paragraph{Background and objective} +Second harmonic generation (SHG) microscopy allows imaging biological structures, such as collagen fibers, at micrometer scale. +Mechanical human skin stretch experiments were done to relate SHG images to stretch properties. +Stretch properties might be extracted by AI models to substitute mechanical measurements. +A skin stretch regression model (Skinstression) is developed and validated to compute the stress-strain curves corresponding to SHG images of human skin tissue. + +\paragraph{Methods} +A holdout study was conducted on SHG data of human skin tissue. +Outcomes of interest were the maximum stress, strain offset, and maximum Young's modulus, which together construct a stress-strain logistic curve. +A convolutional neural network was developed and validated. +The performance of the models was assessed by the coefficient of determination $R^2$, and occlusion was used to probe explanations for the predictions. +The model was attacked by artificially adding and removing collagen. + +\paragraph{Results} +SHG images of skin tissue from 18 older adults (5 men, 4 women, 6 unknown; ages \qtyrange{61}{94}{yr}) were used. +The model achieved a mean $R^2$ of \num{-0.36} (\qty{95}{\percent} CI \numrange{-1.66}{1.06}) on the test set. +Occlusion did not give insight into stretch property predictions.
+Adversarial attacks seem to shift predictions in a manner consistent with the added or removed collagen. + +\paragraph{Discussion} +Skinstression needs further training and external validation. +After additional training and validation, the updated model may be used to replace mechanical skin stretch measurements and ultimately analyze live microendoscope images to be used by plastic surgeons. diff --git a/skinstression/sections/conclusion.tex b/skinstression/sections/conclusion.tex new file mode 100644 index 0000000..8437a86 --- /dev/null +++ b/skinstression/sections/conclusion.tex @@ -0,0 +1,8 @@ +\section{Conclusion} +The goal of this study was to develop and validate a model that can construct a stress-strain curve corresponding to SHG images of older adult human skin tissue. +The convolutional neural network achieved a mean $R^2$ of \num{-0.36} (\qty{95}{\percent} CI \numrange{-1.66}{1.06}) on the test set. +Occlusion attribution maps hardly explain the predictions. +Artificially adding and removing collagen seems to increase and decrease the predicted maximum tissue stress, respectively, as expected. +The model may benefit from more training data and must be validated on a test set that better resembles the training set distribution. +Future studies should use nested k-fold cross-validation for selecting models and measuring performance. +An updated version of the model might replace mechanical skin stretch measurements. diff --git a/skinstression/sections/discussion.tex b/skinstression/sections/discussion.tex index f6955a6..f3134d1 100644 --- a/skinstression/sections/discussion.tex +++ b/skinstression/sections/discussion.tex @@ -11,8 +11,8 @@ \subsection{Logistic curve fits stress-strain curves better than an exponential This may lead to generalizability problems. For these reasons, the logistic curve parameters are used as predictors. -\subsection{Shannon entropy is a better measure to exclude noise than kurtosis}\label{sec:disc_ent_vs_kur} -Measures to be used to exclude noise are kurtosis (\cref{fig:skin_kurtosis_depth}) and Shannon entropy (\cref{fig:skin_entropy_depth}). +\subsection{Shannon entropy is a better measure to exclude noise in collagen images than kurtosis}\label{sec:disc_ent_vs_kur} +Kurtosis and Shannon entropy were used to exclude noise in SHG images of collagen. For rather homogeneous images like stack 8, kurtosis is able to characterize the fogginess, while entropy finds images of equal quality. However, for well-structured tissue like stacks 11, 12, 13, and 4, entropy recognizes faint images and qualifies them as bad. This is particularly useful for images that did not include any useful information (like stack 37). @@ -29,6 +29,7 @@ \subsection{The model does not generalize well} Although the train results were promising, the test results of \cref{fig:skinstression-test} show a lack of generalizability. Only two images from different stacks of the test set yield a stress-strain curve with $R^2=0.99$. This shows that the training set does not include enough features similar to the features in the test set. +As most of the images in the test set belong to one person, it is possible that this person has exceptional skin tissue compared to the samples in the training set. More images need to be included in the training set for the model to generalize well to held-out test cases.
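The person-grouped, shuffled split recommended in the discussion could be sketched with scikit-learn as follows. This is a minimal illustration, not the study's implementation: the names image_paths and person_ids are hypothetical placeholders, and GroupShuffleSplit is one possible tool for the job.

import numpy as np
from sklearn.model_selection import GroupShuffleSplit

# Hypothetical placeholders; the real data would be SHG slice paths and
# the person each sample was excised from.
image_paths = np.array([f"slice_{i:03d}.tif" for i in range(100)])
person_ids = np.array([i // 10 for i in range(100)])

# Split off a 20% test set first, grouped by person so that no person
# appears in more than one subset.
outer = GroupShuffleSplit(n_splits=1, test_size=0.20, random_state=42)
dev_idx, test_idx = next(outer.split(image_paths, groups=person_ids))

# Then split the remaining 80% into train (64%) and validation (16%),
# again on shuffled, person-grouped indices.
inner = GroupShuffleSplit(n_splits=1, test_size=0.20, random_state=42)
train_sub, val_sub = next(inner.split(image_paths[dev_idx], groups=person_ids[dev_idx]))
train_idx, val_idx = dev_idx[train_sub], dev_idx[val_sub]

Because both splits shuffle before grouping, no index range is structurally reserved for the test set, avoiding the single-person test set described above.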
\subsection{Artificially increasing collagen density increases maximum stress prediction and stiffness and vice versa} @@ -79,7 +80,7 @@ \subsubsection{Split dataset before image and target transformations} \begin{equation} \texttt{actual test} = \{1,\, 2,\, 3,\, \ldots,\, N_\mathrm{images}\} - (\texttt{train} \cup \texttt{validation}). \end{equation} -This version of \texttt{actual test} has one major drawback, which is that the \qty{20}{\percent} highest indices are reserved for the test set, effectively excluding them from the shuffle. +This version of \texttt{actual test} has one major drawback, which is that the \qty{20}{\percent} highest indices are reserved for the test set, effectively excluding them from the shuffle, leaving a test set with most samples from one person. A future study should perform a train/validation split on shuffled indices after splitting off the test set. \subsubsection{Excluding noise and denoising on stack level} @@ -88,7 +89,7 @@ \subsubsection{Excluding noise and denoising on stack level} These noisy images may still harm training and could be excluded, too. Possibly, excluding noisy stacks can for example be done by calculating the Shannon entropy of all truncated stacks and including stacks with the highest entropy. -In addition to noise exclusion, denoising stacks with three-dimensional N2V or individual slices with N2V2\footnote{At the time of writing, N2V2 is not yet compatible with three-dimensional images.} could increase model performance. +In addition to noise exclusion, denoising stacks with three-dimensional Noise2Void~\cite{Krull2019} or individual slices with Noise2Void2~\cite{Hock2022}\footnote{At the time of writing, Noise2Void2 is not yet compatible with three-dimensional images.} could increase model performance. This is because noise can occlude patterns that describe stretch information. \subsubsection{Using three-dimensional images} @@ -101,13 +102,13 @@ \subsubsection{Using full-size images} Higher resolution images contain more detailed information on the collagen structure, and are therefore expected to increase the performance. The presented model has to be redesigned to accept images larger than $258\times258$. Moreover, raw data is stored with a larger dynamic range (16-bit instead of 8-bit). -Being able to see small differences in neighbouring pixel intensities increases the available information \eg in darker regions, where collagen is sparse, but still significantly present. +Being able to see small differences in neighboring pixel intensities increases the available information \eg in darker regions, where collagen is sparse, but still significantly present. \subsubsection{Weighting samples by goodness of target fit} -The neural network learns from targets that are a result of logistic curves fitted to a series of datapoints. +The neural network learns from targets that are a result of logistic curves fitted to a series of points. The goodness of fit differs between curves. Fits that do not describe the data well should not negatively impact the model training.
-One way to achieve this is by reweighting the loss function as +One way to achieve this is by re-weighting the loss function as \begin{equation} \mathcal{L}_{R^2\,\mathrm{weighted}} = \begin{cases} diff --git a/skinstression/sections/introduction.tex b/skinstression/sections/introduction.tex index 3ebab40..a2da1a7 100644 --- a/skinstression/sections/introduction.tex +++ b/skinstression/sections/introduction.tex @@ -6,16 +6,16 @@ \section{Introduction} A recent study (Mengyao) aims to show the connection between collagen density and stretch. To measure the strain-stress response of skin tissue, mechanical measurements have to be performed. -Second harmonic generation (SHG) imaging allows to image collagen and two-photon excitation microscopy (2PEF) elastin. +Second harmonic generation (SHG) imaging allows imaging of collagen, and two-photon excitation fluorescence (2PEF) microscopy allows imaging of elastin. Setups have been built to collect SHG and 2PEF signals simultaneously, allowing for rich skin tissue imaging. Collagen fibers are clearly visible. % Previous studies have shown that collagen and not elastin dictates the stretching ability of skin tissue. -SHG images of the collagen networks in conjuction with the strain-stress curves suggest that the images already contain stretch information. +SHG images of the collagen networks in conjunction with the strain-stress curves suggest that the images already contain stretch information. Retrieving complex features from labelled images can be done using supervised deep learning. Supervised deep learning is considered a black-box technique that aims to learn a mapping from input to output. -With the SHG images and corresponding stress-strain measurements at hand, \textsc{Skinstression} is developed with the ultimate goal to replace mechanical measurements on skin tissue to quantify skin stretch. +With the SHG images and corresponding stress-strain measurements at hand, Skinstression (from skin stretch regression) is developed with the ultimate goal of replacing mechanical measurements to quantify the stretch of skin tissue. Possibly, the model can be used to non-invasively investigate physical parameters to aid plastic surgery. -For example, the prior flexibility of skin tissue determines the amount of manual stretching needed to close a gap after excision.\marginnote{Ref for the manual stretching of tissue part.} +For example, the prior flexibility of skin tissue determines the amount of manual continuous or cyclic stretching needed to close a gap after excision~\cite{Verhaegen2012}. Efforts to develop such a model have already been made by Soylu~\cite{Soylu2022}. However, those methods do not consider complete separation of training and test sets in both inference and label creation, possibly leading to biased results. @@ -23,12 +23,12 @@ \section{Introduction} The original model does not incorporate physical properties of the strain-stress curves but relies on principal component analysis. Frequently, machine learning models and neural networks in particular lack the ability to explain how the model comes to its conclusions. -Meanwhile, explainable artificial intelligence (XAI) techniques exist to shed light on the inner workings of algorithms, but are used sparingly (ref). +Meanwhile, explainable artificial intelligence (XAI) techniques exist to shed light on the inner workings of algorithms, but are used sparingly. In the context of skin tissue, XAI could give insight into where the strength and elasticity come from.
-E.\ g.\ a human expert might recognize straight collagen fibrils as a stiff network (ref), but it is interesting to see if an AI explains stiffness in the same manner. +A human expert might recognize straight collagen fibrils as a stiff network~\cite{Holzapfel2001}, but it is interesting to see if an AI explains stiffness in the same manner. The objective of this study is to extend Soylu's model. -To this end, an application, \textsc{Skinstression}, will be developed and validated with separate training and validation data. +To this end, an application, Skinstression, will be developed and validated with separate training and validation data. A new, physics-informed neural network will be implemented for explainability. The data will not only consist of single slices from depth scans, but of subsets considering multiple slices per depth scan. Moreover, XAI procedures will be adopted to better explain the black-box output. diff --git a/skinstression/sections/methods.tex b/skinstression/sections/methods.tex index cf8f4a0..74046cf 100644 --- a/skinstression/sections/methods.tex +++ b/skinstression/sections/methods.tex @@ -1,24 +1,35 @@ \section{Methods} -\begin{enumerate} - \item TODO: SQUASH SECTION BELOW INTO FEWER SECTIONS AND MAKE IT FLOW - \item TODO: DETAIL -\end{enumerate} -\subsection{Sources of data} +\subsection{Data} -Data is obtained in previous studies by A.\ Soylu, M.\ Zhou, and L.\ van Haasterecht at the Medical Imaging center of the VUmc, Amsterdam.\marginnote{Is this true? Not in Beverwijk for Ludo?} -Human thigh and abdomen skin tissue were excised. -Pieces of these tissues were imaged with multiphoton microscopy and their stress-strain response curves were measured mechanically.\marginnote{Ref to the imaging and stretching papers?} +\subsubsection{Sources of data} + +Human skin tissue was excised from cadavers and healthy subjects for previous studies at the Red Cross Hospital in Beverwijk, the Netherlands~\cite{Haasterecht2023, Zhou2023}. +Pieces of these tissues were imaged with multiphoton microscopy and their stress-strain response curves were measured mechanically, see \cref{fig:lab-setup}~\cite{Haasterecht2023, Zhou2023}. Data was acquired in batches from April 2021 until July 2022. Development and testing data come from the same source. -Cadavers were eligible for thigh skin excision and abdomen skin is cut during plastic surgery.\marginnote{is this true?} -It is unknown if individuals received treatment relevant for this study. +\begin{figure*} + \centering + \includegraphics[width=\linewidth]{skinstression/images/measuring-setup.png} + \caption[Skin stretch setup]{ + Skin stretch setup. + The schematic of the experimental setup (A) shows a femtosecond pulse laser with a central wavelength of \qty{1050}{nm} and a pulse duration of less than \qty{80}{fs}; AOM: acousto-optic modulator; SL: scan lens; TL: tube lens, focus tunable; DMP1: dichroic mirror reflecting backscattered signals from fundamental photons; DMP2: dichroic mirror splitting the 2PEF and SHG channels; M: mirror; F: bandpass filter, F520/35; L: focusing lens; PMT: photomultiplier tube detectors. + The PMT signals are shown as image stacks (C), where red represents collagen fibers and blue represents elastin fibers. + A photograph of the skin stretched to \qty{150}{\percent} is shown (B). + Adapted with permission from \fullcite{Zhou2023} (Ref.~\cite{Zhou2023}). + } + \label{fig:lab-setup} +\end{figure*} +It is unknown if individuals received treatment relevant for this study.
-\subsection{Data preparation}\label{sec:skin_data_prep} +The sample size was arrived at by taking into account all previously included subjects and excluding abdomen data and scar tissue. +This amounts to a total of 1649 SHG images from 63 samples to train on. +% For a detailed summary of the number of samples, see fig. (graph with nodes and edges explaining number of images/curves). +Due to the limited number of participants, individuals with unknown gender or age were included. -\subsubsection{Preprocessing} +\subsubsection{Data preprocessing}\label{sec:skin_data_prep} Depth stack images of all skin tissues, with a size of $1000\times 1000$ and a planar resolution of \qty{1}{\micro\meter}, were kindly provided by M.\ Zhou. All stacks were separated into slices. @@ -46,11 +57,11 @@ \subsubsection{Preprocessing} \end{equation} with $N$ the total number of pixels, $k$ an individual image and $i,j$ the pixel in the horizontal and vertical dimension, respectively. -The images are downsampled to $258\times258$ to fit into the neural network. +The images are resized to $258\times258$ to fit into the neural network. \subsubsection{Image selection}\label{subsec:image-selection} SHG microscopy images from skin tissue suffer from optical phenomena. -The most evident problem is that imaging deeper into the tissue, photons are detected with less spatial accuracy thanks to scattering. +The most evident problem is that when imaging deeper into the tissue, photons are detected with less spatial accuracy because of scattering. The deeper photons travel into tissue, the more possible paths photons can take to return to the detector. Moreover, the chance of photons getting absorbed by the tissue increases with penetration depth. Therefore, fewer photons are reflected from deeper tissue. @@ -61,160 +72,49 @@ \subsubsection{Image selection}\label{subsec:image-selection} Candidates for this measure are Shannon entropy, kurtosis, and skew for reasons explained in \cref{subsec:imq}. These quality measures are calculated per image using PyImageQualityRanking \cite{Koho2016}, such that the quality measure can be validated by manual observation. -\subsubsection{Image denoising} -Another optical disadvantage of multiphoton microscopy is the occurrence of noise.\marginnote{Look up sources of noise.} +% \subsubsection{Image denoising} +% Another optical disadvantage of multiphoton microscopy is the occurrence of noise.\marginnote{Look up sources of noise.} -Unfortunately, obtaining clearer images is hard. -One way to reduce noise in the image is to use more photons. -Either by averaging more images or increasing the amount of photons per image. -Increasing the amount of photons penetrating the tissue increase the risk of damaging the tissue. +% Unfortunately, obtaining clearer images is hard. +% One way to reduce noise in the image is to use more photons. +% Either by averaging more images or increasing the amount of photons per image. +% Increasing the amount of photons penetrating the tissue increase the risk of damaging the tissue. -Another way to deal with noisy images is to process the images. -Promising denoising neural networks have been developed. -The difficulty with this is that clean target data for supervised training is often not available in a biomedical setting. -To counteract this, Noise2Noise (N2N) \cite{Lehtinen2018} was developed. -Noise2Void N2V \cite{Krull2019}, a successor of N2N, does not rely on pairs of noisy images.
-Instead it only needs one image and corrupts it to use as target and learns a mapping between the noisy image and the newly created noisy image. -This is useful if only one biomedical image is available.\marginnote{If denoising is used, move parts of this section to a theory section and refer to it here.} +% Another way to deal with noisy images is to process the images. +% Promising denoising neural networks have been developed. +% The difficulty with this is that clean target data for supervised training is often not available in a biomedical setting. +% To counteract this, Noise2Noise (N2N) \cite{Lehtinen2018} was developed. +% Noise2Void N2V \cite{Krull2019}, a successor of N2N, does not rely on pairs of noisy images. +% Instead it only needs one image and corrupts it to use as target and learns a mapping between the noisy image and the newly created noisy image. +% This is useful if only one biomedical image is available.\marginnote{If denoising is used, move parts of this section to a theory section and refer to it here.} -The original N2V model produces a checkerboard pattern. -Noise2Void2 (N2V2) \cite{Hock2022} is an extension to N2V and reduces this artifact.\marginnote{Actually didn't do denoising, but if done, describe how here :)} +% The original N2V model produces a checkerboard pattern. +% Noise2Void2 (N2V2) \cite{Hock2022} is an extension to N2V and reduces this artifact.\marginnote{Actually didn't do denoising, but if done, describe how here :)} \subsubsection{Data augmentation} To make the model more robust, data augmentation is applied. -Before the downsampling, the preprocessed images are cropped randomly from $1000\times1000$ to $700\times700$ preserving the aspect ratio. +Before resizing, the preprocessed images are cropped randomly from $1000\times1000$ to $700\times700$ preserving the aspect ratio. The global brightness is adjusted randomly with $\pm \qty{30}{\percent}$. The images are then randomly mirrored horizontally and vertically with a probability of \qty{50}{\percent}. All data augmentations were performed with Torchvision \cite{torchvision2016}. -\subsection{Outcome} -The strain-stress response curves of individual skin tissue pieces were the outcome of interest. -The prediction is assessed by comparing it with measured strain-stress curves where $R^2$ with a \qty{95}{\percent} confidence interval is calculated. -The measurement is done mechanically by an experimentalist. -The mechanical measurement itself is blind to clinical information. - -\subsection{Predictors}\label{sec:skin_predictors} - -% -------------------------------------------------- -% SEARCHING FOR A SIMPLE SKIN STRAIN-STRESS MODEL -% -------------------------------------------------- - -\subsubsection{Searching for a simple skin strain-stress model} - -Supervised learning requires targets for the model to train on. -Ideally, individual targets allow for physical interpretation and can together describe all the available data. - -\paragraph{Empirical strain-stress regions} -Although skin tissue has a complex nature, measurements to quantify skin stretch show similar features. -Measurements always show three domains: the toe, heel and leg domain (see fig.). -The toe region is at the very start of the curve. -This region is seems relatively flat as the fiber network consists of mostly unstretched fibers. -Therefore, the fibers cannot exert force as a reaction to external stretching force. -However, in the heel region where skin tissue is stretched more, fibers can exert more force. 
-When enough force is exerted on the tissue, fibers stretch maximally and fibers react with maximum force in the leg region. -This region is observed to be roughly linear. -Overstretching the tissue then breaks the fiber network, decreasing the possibility to exert force. - -\paragraph{Exponential} -Strain-stress curves can be also be visualized by showing the log derivative of stress with respect to strain against the log of strain (fig). -Typically, this figure has three regions. -The first region indicates a linear relationship between small forces and small strain. -Then, the derivative increases until it reaches a purely exponential part. -If skin stretching follows this kind of behaviour, a simple mathematical model can be derived. -Inspecting the figure, the linear part shows the ordinary differential equation -\begin{equation} - \frac{\mathrm{d}\sigma}{\mathrm{d}\gamma} \propto \sigma, -\end{equation} -where $\gamma = \chi - 1$. -Solving for $\sigma$, we get -\begin{equation} - \sigma \propto e^{\lambda\gamma}, -\end{equation} -where $\lambda$ is some factor dictating the speed with which the exponential increases. -At no extension, i.\ e.\ $\gamma=0$, it can be assumed that there is no stress. -Therefore, -\begin{equation} - \sigma \propto e^{\lambda\gamma} - 1. -\end{equation} -At small extensions, i.\ e.\ $\lambda\gamma \ll 1$, $e^{\lambda\gamma} \approx (1 + \lambda\gamma + \ldots)$ using a Taylor expansion. -So -\begin{equation} - \sigma_{\lambda\gamma\ll 1} \propto 1 + \lambda\gamma + \ldots - 1 \approx G_0 \gamma, -\end{equation} -where $G_0$ is some linear coefficient at small extensions. -In this work, $\gamma = \chi - 1$, where $\chi$ is the stretch. -The full expression then becomes -\begin{equation}\label{eq:exp} - \sigma = \frac{G_0}{\lambda}\left(e^{\lambda(\chi - 1)}-1\right). -\end{equation} - -This model assumes that data follows the previously described curve where there is a small rise at small extensions and an indefinitely exponentially increasing stress for larger extensions. - -The exponential model is fit to some stress-strain curves using OriginPro~\cite{OriginPro}. - -\paragraph{Principal component analysis} -In an earlier study \cite{Soylu2022}, principal component analysis (PCA) is used to reduce the dimensionality of the strain-stress data. -In summary, after PCA, every measurement $Y$ can be approximated by -\begin{equation}\label{eq:pca} - Y \approx Y_\mathrm{PCA} = \mathbf{A} \mathbf{V} + \bar{Y}, -\end{equation} -where $\mathbf{A}$ and $\mathbf{V}$ are matrices containing respectively the eigenvalues and -vectors of the the measurement data. -$\bar{Y}$ is the measurement mean. -Choosing the first $p$ principal components allows for dimensionality reduction. - -Using PCA to create eigenvalues to weight the eigenvectors has some caveats. -First, the training and test sets must be treated separately. -The test set has to be projected on the space spanned by the first $p$ eigenvectors of the training set. -This may induce problems as the test set could contain information that does not come close to -Second, PCA depends on interpolation, \ie every strain-stress curve must be formed by either a set of strain or stress values. -This reduces the domain of the data. 
- -\paragraph{Logistic curve} -The empirical observations where the force response of skin tissue changes states, suggests a logistic curve, which can be written as -\begin{equation}\label{eq:logistic_curve} - \sigma = \frac{\sigma_\mathrm{max}}{1+e^{-E_\mathrm{max} (\gamma - \gamma_c)}}, -\end{equation} -where $\sigma$ and $\gamma$ are the stress and engineering strain, $\sigma_\mathrm{max}$ is the maximum stress, $E_\mathrm{max}$ is the maximum Young's modulus and $\gamma_c$ the strain offset. -This equation assumes that there is a maximum force that the tissue can exert, in contrary to the theoretical approach above (ref). - -Using the logistic curve as an alternative to PCA has two major advantages. -Every curve can be treated separately and measurements can contain data across arbitrary domains and with arbitrary intervals as no interpolation is necessary. - -Strain-stress curves for all individuals were kindly provided by M.\ Zhou. -The curves only include datapoints where the skin extension is larger than zero and the force positive. - -To every strain-stress curve, eq.~\ref{eq:logistic_curve} is fitted with Scipy \cite{2020SciPy-NMeth}. -The optimal parameters were be used for training the model. - -The AI was trained on a with LDS smoothed target variable distribution. -The targets were weighted with the inverse square root, to limit the impact of LDS. - -It is important to perform a study on bias for possible explanations of varying AI performance. -The samples were taken from individuals with varying age and gender. -Moreover, from some individuals, more mechanical measurements were taken. -Therefore, the age, gender, and number of samples were summarized. -\marginnote{Ref to density/thickness/orientation analysis of Mengyao?} - -\subsection{Sample size} -The sample size is arrived at taking into account all previously included subjects (ref ludo, ref mengyao, ref alperen) and excluding abdomen data and scar tissue. -This amounts to a total of 1649 SHG images to train on. -% For a detailed summary of the number of samples, see fig. (graph with nodes and edges explaining number of images/curves). - -\subsection{Missing data} -Due to the limited amount of participants, individuals with unknown gender or age were included. -\subsection{Statistical analysis methods} +\subsection{Model} +The outcomes of interest of the model are strain-stress response curves corresponding to SHG images of individual skin tissue pieces. +The model is named Skinstression (from skin stretch regression). +% The measurements were done mechanically by experimentalists. +% The measurement device itself is blind to clinical information. This section shows the methods to obtain a trained model from stress-strain curves and images, use it for inference, and interrogate it with XAI techniques. -The statistical methods are summarized in \cref{fig:skin_stat_methods}. +\Cref{fig:skin_stat_methods} shows the model development flow. \begin{figure*}[p] \centering % \includegraphics[height=\dimexpr\textheight-55.89pt\relax]{mermaid/skin_analysis.pdf} \includegraphics{mermaid/skin_analysis.pdf} - \caption[Flowchart of statistical analysis methods for \textsc{Skinstression}]{ - Flowchart of statistical analysis methods for \textsc{Skinstression}. + \caption[Skinstression development flow]{ + Skinstression development flow. Exponentials, principal component analysis (PCA) reconstructions, and logistic curves were fit to the stress-strain curves. The best-fitted model was used to create training targets.
The targets and images were used as input for further training. @@ -236,11 +136,11 @@ \subsubsection{Predictor pre-selection} In particular, the exponential regime of the fit should describe the leg part of the curve. \paragraph{Principal component analysis} -PCA requires information on at least one axis to align between every curve. +PCA requires curves to align on at least one axis. The first step to achieve this is excluding all stretch values above the stretch of the maximum of the shortest curve. \textcite{Soylu2022} did linear interpolation on the curves and restricted both stretch and stress to the minimum peak value. PCA on two variables requires only one shared set of points. -Moreover, results of \citeauthor{Soylu2022} show knicks in the PCA reconstructions near the end of the curves, which could originate from a limited amount of datapoints or linear interpolation. +Moreover, results of \citeauthor{Soylu2022} show kinks in the PCA reconstructions near the end of the curves, which could originate from a limited amount of data or linear interpolation. Therefore, in this study, a non-uniform, univariate, interpolating spline was fitted to all points using Scipy \cite{2020SciPy-NMeth} and the stress was calculated from the spline at the stretch values of the curve with the lowest maximum stretch. After PCA on the complete dataset using Scikit-learn \cite{Pedregosa2011}, the explained variance per component was calculated and used as a method to find an appropriate number of principal components. From these principal components, the curves were reconstructed using \cref{eq:pca}. @@ -249,7 +149,7 @@ \subsubsection{Predictor pre-selection} Only if PCA on the full dataset works reasonably well is it possible to use PCA on a subset and use it to reconstruct another subset. This would be useful if PCA was used to construct predictors, as using PCA results of the full dataset introduces information leakage from the test sets to the training set, because the components describe data from both subsets. -This is unlike Ref.~\cite{Soylu2022} where information leakage was not considered.\marginnote{Where to put PCA bias study?} +This is unlike Ref.~\cite{Soylu2022} where information leakage was not considered. \subsubsection{Convolutional neural network} The basis of the model originates from Liang \emph{et al.} \cite{Liang2017} and is adapted by Soylu \cite{Soylu2022}. @@ -258,7 +158,7 @@ \subsubsection{Convolutional neural network} The output is then normalized per batch using BN (\cref{sec:bn}). The normalized batch is passed through a ReLU (\cref{subsec:relu}) layer. After activation, three $2\times2$ maxpool (\cref{subsec:maxpool}) layers are applied. -The next second block is like the first block, but with a $5\times5$ convolution kernel en just one maxpool layer. +The second block is like the first block, but with a $5\times5$ convolution kernel and just one maxpool layer. The third block is like the second block, but with a $3\times3$ convolution kernel. The fourth block is like the second and third block, but with a $6\times6$ kernel and without a maxpool layer. @@ -271,7 +171,6 @@ \subsubsection{Convolutional neural network} \begin{figure*} \includegraphics{skinstression/images/skinstression.pdf} - % \input{PlotNeuralNet/examples/Skinstression/skinstression-new.tex} \caption[Network architecture]{ The convolutional neural network consists of five blocks. The first four blocks contain convolution, maxpooling, and batch normalization layers.
@@ -281,7 +180,7 @@ \subsubsection{Convolutional neural network} \label{fig:model} \end{figure*} -The dropout layers in \cite{Soylu2022} are replaced by BN layers, as the input is unnormalized and studies report better performance with BN. +The dropout layers in \cite{Soylu2022} are replaced by BN layers, as the input is not normalized and studies report better performance with BN. Biases of all layers preceding BN layers have been set to zero. The neural network weights are initialized according to the method described by \textcite{He2015a}, using a uniform transform. @@ -289,7 +188,7 @@ \subsubsection{Hyperparameter optimization} First, benchmark search 1 was done using Successive Halving with 100 trials. See \cref{app:skin_conf_search_spaces} for a summary of configuration search space $\mathcal{C}$. -To allow trials to warmup, a minimum of 100 epochs were allowed. +To allow trials to warm up, a minimum of 100 epochs was allowed. To limit the trial duration, a maximum of 3000 epochs was allowed. The number of trials was reduced with a reduction factor of $\eta=4$. Trial parameters were sampled using the non-multivariate TPE algorithm. @@ -298,8 +197,8 @@ \subsubsection{Hyperparameter optimization} Every trial used the complete dataset after data preparation (\cref{sec:skin_data_prep}). The search uses a few data augmentations that are assumed to not alter the physical context of the image. -That is, force is exerted on the tissue in the horizontal direction in the image. -Therefore, flipping the image either vertically or horizontally is assumed to not change the stretch behaviour. +That is, force is exerted on the tissue unilaterally, which corresponds to the horizontal direction in the image. +Therefore, flipping the image either vertically or horizontally is assumed to not change the stretch behavior. Both flipping operations occur with a probability of $0.5$. Moreover, the images' intensity is randomly scaled by a factor drawn uniformly from \numrange{0.7}{1.3}. @@ -325,7 +224,7 @@ \subsubsection{Hyperparameter optimization} Algorithms provided by Optuna \cite{Akiba2019} were used to choose trial configurations and keep track of trials. \begin{table} - \caption[\textsc{Skinstression} hyperparameter search studies]{ + \caption[Skinstression hyperparameter search studies]{ Summary of settings for hyperparameter searches performed. Hyperparameters are grouped by operation type (image preprocessing, image augmenting, target weighting) and in applied order. Every search is done with the search space described in \cref{tab:conf_skin}. @@ -379,18 +278,19 @@ \subsubsection{Training} The learning rate was warmed up linearly for the first 20 epochs. To see the influence of image quality, the model is trained using an ordered set of images. The images are ordered with maximum entropy first and the model is trained on all images and the top 10, 20 and 30 images of every stack. -Moreover, to see the effect of using the original zoom level with the greatest detail at hand, the best 20 images of every stack were centercropped to $500\times500$ and further randomly cropped randomly to $258\times258$. +Moreover, to see the effect of using the original zoom level with the greatest detail at hand, the best 20 images of every stack were center-cropped to $500\times500$ and further cropped randomly to $258\times258$. The lowest validation focal loss is used to compare model performance. The model with the lowest validation loss is used for testing.
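The augmentations described above (random crops, 50% flips, and a brightness factor between 0.7 and 1.3) could be composed with Torchvision along the following lines. This is a minimal sketch under stated assumptions: the composition order and the use of ColorJitter for the brightness factor are assumptions, not necessarily the study's actual pipeline.

import torchvision.transforms as T

# A possible composition of the described augmentations (assumed order).
augment = T.Compose([
    T.RandomCrop(700),              # random crop from 1000x1000 to 700x700
    T.Resize(258),                  # resize to the network input size
    T.RandomHorizontalFlip(p=0.5),  # stretch is unilateral, so flips are safe
    T.RandomVerticalFlip(p=0.5),
    T.ColorJitter(brightness=0.3),  # brightness factor drawn from [0.7, 1.3]
])

# Variant probing the original zoom level: center-crop before the random crop.
augment_zoom = T.Compose([
    T.CenterCrop(500),
    T.RandomCrop(258),
    T.RandomHorizontalFlip(p=0.5),
    T.RandomVerticalFlip(p=0.5),
])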
Pytorch \cite{Paszke2019} was used to perform automatic differentiation on NVIDIA GeForce RTX 2070 Super GPUs on the BAZIS high performance computing cluster. -\subsubsection{Dataset}\label{subsec:skin_dataset} +\subsubsection{Internal validation}\label{subsec:skin_dataset} The thigh dataset is randomly distributed into a training (\qty{64}{\percent}), validation (\qty{16}{\percent}), and test (\qty{20}{\percent}) set. The distribution is stratified by person, meaning samples corresponding to the same person cannot appear in two subsets simultaneously. The AI learns from the training set. Every iteration, it is validated against the validation set. During inference, the AI is applied to the held-out test set as internal validation. +The predictions are assessed by comparing them with the measured strain-stress curves, for which $R^2$ is calculated with a \qty{95}{\percent} confidence interval. % \subsubsection{Model selection} % MISSCHIEN MODEL ENSEMBLING EN VALIDATION SET GEBRUIKEN VOOR VINDEN VAN BESTE AANTAL MODELS? diff --git a/skinstression/sections/results.tex b/skinstression/sections/results.tex index 3157cba..7cf860c 100644 --- a/skinstression/sections/results.tex +++ b/skinstression/sections/results.tex @@ -1,6 +1,6 @@ \section{Results} -\begin{figure} +\begin{figure*} \centering \includegraphics[width=\linewidth]{skinstression/images/truncated-and-interpolated-curves.pdf} \caption[Truncated and spline interpolated curves]{ @@ -8,7 +8,7 @@ \section{Results} using non-uniform, interpolating splines on the stretch values of the curve with the lowest maximum stretch. } \label{fig:trunc_interp_curves} -\end{figure} +\end{figure*} \subsection{Participants}\label{subsec:results_participants} @@ -17,8 +17,7 @@ \subsection{Participants}\label{subsec:results_participants} All thigh data is included, which is different from the original study, where only the 48 latest samples were used. These considerations result in data including 15 individuals (5 men, 4 women, 3 unknown). Ages range from 61 to 94. -Skin tissue is cut from the thigh and cut in multiple pieces of roughly the same shape.\marginpar{protocol?} -From every skin tissue piece, strain-stress curves are measured. +The measured strain-stress curves of every skin tissue piece are shown in \cref{fig:trunc_interp_curves}. The number of measured strain-stress curves ranges from 1 to 13. The source of data is summarized in \cref{fig:source_of_data}. @@ -53,138 +52,43 @@ \subsubsection{Exponential} \subsubsection{PCA} The PCA fit for every truncated and interpolated strain-stress curve is depicted in \cref{fig:pca_fits}. +An example is shown in \cref{fig:pca-fit-and-r2-histogram}. For every fit, $R^2$ is calculated with respect to the interpolated and truncated data. On average, $\overline{R^2} \approx \num{0.9927 \pm 0.0022}$. Due to the nature of PCA, the exponential part of the curves that rise later is not included in the fit.
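The PCA reconstruction of \cref{eq:pca} and its per-curve $R^2$ could be computed with scikit-learn roughly as follows. This is a sketch under stated assumptions: curves is a hypothetical matrix of spline-interpolated stress values, and the choice of three components is illustrative (the actual number would follow from the scree plot).

import numpy as np
from sklearn.decomposition import PCA

# Placeholder for the spline-interpolated stress values, one row per curve,
# all sampled at the same stretch values.
rng = np.random.default_rng(0)
curves = rng.random((60, 100))

pca = PCA(n_components=3)                       # p components (assumed)
scores = pca.fit_transform(curves)              # A: per-curve weights
reconstruction = pca.inverse_transform(scores)  # Y_PCA = A V + mean(Y)

# Coefficient of determination of each reconstruction.
ss_res = ((curves - reconstruction) ** 2).sum(axis=1)
ss_tot = ((curves - curves.mean(axis=1, keepdims=True)) ** 2).sum(axis=1)
r2 = 1 - ss_res / ss_tot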
-\begin{figure*} - \ContinuedFloat - \centering - \includegraphics[width=0.24\linewidth]{skinstression/images/pca-fits/sample_6.pdf} - \includegraphics[width=0.24\linewidth]{skinstression/images/pca-fits/sample_7.pdf} - \includegraphics[width=0.24\linewidth]{skinstression/images/pca-fits/sample_8.pdf} - \includegraphics[width=0.24\linewidth]{skinstression/images/pca-fits/sample_9.pdf} \\ - \includegraphics[width=0.24\linewidth]{skinstression/images/pca-fits/sample_10.pdf} - \includegraphics[width=0.24\linewidth]{skinstression/images/pca-fits/sample_11.pdf} - \includegraphics[width=0.24\linewidth]{skinstression/images/pca-fits/sample_12.pdf} - \includegraphics[width=0.24\linewidth]{skinstression/images/pca-fits/sample_13.pdf} \\ - \includegraphics[width=0.24\linewidth]{skinstression/images/pca-fits/sample_14.pdf} - \includegraphics[width=0.24\linewidth]{skinstression/images/pca-fits/sample_15.pdf} - \includegraphics[width=0.24\linewidth]{skinstression/images/pca-fits/sample_16.pdf} - \includegraphics[width=0.24\linewidth]{skinstression/images/pca-fits/sample_17.pdf} \\ - \includegraphics[width=0.24\linewidth]{skinstression/images/pca-fits/sample_18.pdf} - \includegraphics[width=0.24\linewidth]{skinstression/images/pca-fits/sample_19.pdf} - \includegraphics[width=0.24\linewidth]{skinstression/images/pca-fits/sample_20.pdf} - \includegraphics[width=0.24\linewidth]{skinstression/images/pca-fits/sample_21.pdf} \\ - \includegraphics[width=0.24\linewidth]{skinstression/images/pca-fits/sample_22.pdf} - \includegraphics[width=0.24\linewidth]{skinstression/images/pca-fits/sample_23.pdf} - \includegraphics[width=0.24\linewidth]{skinstression/images/pca-fits/sample_24.pdf} - \includegraphics[width=0.24\linewidth]{skinstression/images/pca-fits/sample_25.pdf} \\ - \includegraphics[width=0.24\linewidth]{skinstression/images/pca-fits/sample_26.pdf} - \includegraphics[width=0.24\linewidth]{skinstression/images/pca-fits/sample_27.pdf} - \includegraphics[width=0.24\linewidth]{skinstression/images/pca-fits/sample_28.pdf} - \includegraphics[width=0.24\linewidth]{skinstression/images/pca-fits/sample_29.pdf} - \raggedleft Continued on next page. 
-\end{figure*} - -\begin{figure*} +\begin{marginfigure} \centering - \includegraphics[width=0.24\linewidth]{skinstression/images/pca-fits/sample_30.pdf} - \includegraphics[width=0.24\linewidth]{skinstression/images/pca-fits/sample_31.pdf} - \includegraphics[width=0.24\linewidth]{skinstression/images/pca-fits/sample_32.pdf} - \includegraphics[width=0.24\linewidth]{skinstression/images/pca-fits/sample_33.pdf} \\ - \includegraphics[width=0.24\linewidth]{skinstression/images/pca-fits/sample_34.pdf} - \includegraphics[width=0.24\linewidth]{skinstression/images/pca-fits/sample_45.pdf} - \includegraphics[width=0.24\linewidth]{skinstression/images/pca-fits/sample_46.pdf} - \includegraphics[width=0.24\linewidth]{skinstression/images/pca-fits/sample_47.pdf} \\ - \includegraphics[width=0.24\linewidth]{skinstression/images/pca-fits/sample_48.pdf} - \includegraphics[width=0.24\linewidth]{skinstression/images/pca-fits/sample_49.pdf} - \includegraphics[width=0.24\linewidth]{skinstression/images/pca-fits/sample_50.pdf} - \includegraphics[width=0.24\linewidth]{skinstression/images/pca-fits/sample_51.pdf} \\ - \includegraphics[width=0.24\linewidth]{skinstression/images/pca-fits/sample_52.pdf} - \includegraphics[width=0.24\linewidth]{skinstression/images/pca-fits/sample_53.pdf} - \includegraphics[width=0.24\linewidth]{skinstression/images/pca-fits/sample_54.pdf} - \includegraphics[width=0.24\linewidth]{skinstression/images/pca-fits/sample_55.pdf} \\ - \includegraphics[width=0.24\linewidth]{skinstression/images/pca-fits/sample_56.pdf} - \includegraphics[width=0.24\linewidth]{skinstression/images/pca-fits/sample_57.pdf} - \includegraphics[width=0.24\linewidth]{skinstression/images/pca-fits/sample_58.pdf} - \includegraphics[width=0.24\linewidth]{skinstression/images/pca-fits/sample_59.pdf} \\ - \includegraphics[width=0.24\linewidth]{skinstression/images/pca-fits/sample_60.pdf} - \includegraphics[width=0.24\linewidth]{skinstression/images/pca-fits/sample_61.pdf} - \caption[PCA fits]{ - PCA fits for every truncated and interpolated strain-stress curve. - The interpolated measurements (blue) are estimated by the PCA curve (red) along with their $R^2$. - PCA is done on all available thigh data. - Note that the vertical axes are not equal. + \includegraphics[width=\linewidth]{skinstression/images/pca-fits/sample_6.pdf} \\ + \includegraphics[width=\linewidth]{skinstression/images/pca-fits/r2_hist.pdf} + \caption[PCA fits to stress-strain curves]{ + Top panel: PCA fit (red) to one stress-strain curve (black). + $R^2 = 0.993$. + Bottom panel: distribution of $R^2$ over all stress-strain curves. + Mean $R^2 = 0.9927$ (red) with a standard deviation of \num{0.0022}. } - \label{fig:pca_fits} -\end{figure*} + \label{fig:pca-fit-and-r2-histogram} +\end{marginfigure} + \subsubsection{Logistic curve} The logistic curve fit for every strain-stress curve is shown in \cref{fig:logistic_fits}. +An example is shown in \cref{fig:logistic-fit-and-r2-histogram}. For every fit, $R^2$ is calculated. On average, $\overline{R^2} \approx \num{0.9979 \pm 0.0039}$.
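The logistic targets behind these fits could be obtained per curve with Scipy roughly as sketched below; strain and stress are hypothetical placeholder measurements and the initial guess p0 is an assumption, not the study's documented choice.

import numpy as np
from scipy.optimize import curve_fit

def logistic(gamma, sigma_max, e_max, gamma_c):
    """Logistic stress-strain curve of eq. (logistic_curve)."""
    return sigma_max / (1 + np.exp(-e_max * (gamma - gamma_c)))

# Placeholder measurement; the real data are the mechanically measured curves.
strain = np.linspace(0.0, 0.5, 50)
stress = logistic(strain, 10.0, 20.0, 0.3)

# Fit the three parameters; the optima serve as training targets.
p0 = (stress.max(), 10.0, 0.25)  # assumed initial guess
(sigma_max, e_max, gamma_c), _ = curve_fit(logistic, strain, stress, p0=p0)

Because each curve is fitted independently, no shared stretch grid or interpolation is needed, in contrast with the PCA approach.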
-\begin{figure*} - \ContinuedFloat - \centering - \includegraphics[width=0.24\linewidth]{skinstression/images/logistic-fits/sample_6} - \includegraphics[width=0.24\linewidth]{skinstression/images/logistic-fits/sample_7.pdf} - \includegraphics[width=0.24\linewidth]{skinstression/images/logistic-fits/sample_8.pdf} - \includegraphics[width=0.24\linewidth]{skinstression/images/logistic-fits/sample_9.pdf} \\ - \includegraphics[width=0.24\linewidth]{skinstression/images/logistic-fits/sample_10.pdf} - \includegraphics[width=0.24\linewidth]{skinstression/images/logistic-fits/sample_11.pdf} - \includegraphics[width=0.24\linewidth]{skinstression/images/logistic-fits/sample_12.pdf} - \includegraphics[width=0.24\linewidth]{skinstression/images/logistic-fits/sample_13.pdf} \\ - \includegraphics[width=0.24\linewidth]{skinstression/images/logistic-fits/sample_14.pdf} - \includegraphics[width=0.24\linewidth]{skinstression/images/logistic-fits/sample_15.pdf} - \includegraphics[width=0.24\linewidth]{skinstression/images/logistic-fits/sample_16.pdf} - \includegraphics[width=0.24\linewidth]{skinstression/images/logistic-fits/sample_17.pdf} \\ - \includegraphics[width=0.24\linewidth]{skinstression/images/logistic-fits/sample_18.pdf} - \includegraphics[width=0.24\linewidth]{skinstression/images/logistic-fits/sample_19.pdf} - \includegraphics[width=0.24\linewidth]{skinstression/images/logistic-fits/sample_20.pdf} - \includegraphics[width=0.24\linewidth]{skinstression/images/logistic-fits/sample_21.pdf} \\ - \includegraphics[width=0.24\linewidth]{skinstression/images/logistic-fits/sample_22.pdf} - \includegraphics[width=0.24\linewidth]{skinstression/images/logistic-fits/sample_23.pdf} - \includegraphics[width=0.24\linewidth]{skinstression/images/logistic-fits/sample_24.pdf} - \includegraphics[width=0.24\linewidth]{skinstression/images/logistic-fits/sample_25.pdf} \\ - \includegraphics[width=0.24\linewidth]{skinstression/images/logistic-fits/sample_26.pdf} - \includegraphics[width=0.24\linewidth]{skinstression/images/logistic-fits/sample_27.pdf} - \includegraphics[width=0.24\linewidth]{skinstression/images/logistic-fits/sample_28.pdf} - \includegraphics[width=0.24\linewidth]{skinstression/images/logistic-fits/sample_29.pdf} - \raggedleft Continued on next page. 
-\end{figure*} - -\begin{figure*} +\begin{marginfigure} \centering - \includegraphics[width=0.24\linewidth]{skinstression/images/logistic-fits/sample_30.pdf} - \includegraphics[width=0.24\linewidth]{skinstression/images/logistic-fits/sample_31.pdf} - \includegraphics[width=0.24\linewidth]{skinstression/images/logistic-fits/sample_32.pdf} - \includegraphics[width=0.24\linewidth]{skinstression/images/logistic-fits/sample_33.pdf} \\ - \includegraphics[width=0.24\linewidth]{skinstression/images/logistic-fits/sample_34.pdf} - \includegraphics[width=0.24\linewidth]{skinstression/images/logistic-fits/sample_45.pdf} - \includegraphics[width=0.24\linewidth]{skinstression/images/logistic-fits/sample_46.pdf} - \includegraphics[width=0.24\linewidth]{skinstression/images/logistic-fits/sample_47.pdf} \\ - \includegraphics[width=0.24\linewidth]{skinstression/images/logistic-fits/sample_48.pdf} - \includegraphics[width=0.24\linewidth]{skinstression/images/logistic-fits/sample_49.pdf} - \includegraphics[width=0.24\linewidth]{skinstression/images/logistic-fits/sample_50.pdf} - \includegraphics[width=0.24\linewidth]{skinstression/images/logistic-fits/sample_51.pdf} \\ - \includegraphics[width=0.24\linewidth]{skinstression/images/logistic-fits/sample_52.pdf} - \includegraphics[width=0.24\linewidth]{skinstression/images/logistic-fits/sample_53.pdf} - \includegraphics[width=0.24\linewidth]{skinstression/images/logistic-fits/sample_54.pdf} - \includegraphics[width=0.24\linewidth]{skinstression/images/logistic-fits/sample_55.pdf} \\ - \includegraphics[width=0.24\linewidth]{skinstression/images/logistic-fits/sample_56.pdf} - \includegraphics[width=0.24\linewidth]{skinstression/images/logistic-fits/sample_57.pdf} - \includegraphics[width=0.24\linewidth]{skinstression/images/logistic-fits/sample_58.pdf} - \includegraphics[width=0.24\linewidth]{skinstression/images/logistic-fits/sample_59.pdf} \\ - \includegraphics[width=0.24\linewidth]{skinstression/images/logistic-fits/sample_60.pdf} - \includegraphics[width=0.24\linewidth]{skinstression/images/logistic-fits/sample_61.pdf} - \caption[Logistic fits]{ - Logistic fits (red) and their $R^2$ for every strain-stress curve (black). - Note that the vertical axes are not equal. + \includegraphics[width=\linewidth]{skinstression/images/logistic-fits/sample_6.pdf} \\ + \includegraphics[width=\linewidth]{skinstression/images/logistic-fits/r2_hist.pdf} + \caption[Logistic curve fits to stress-strain curves]{ + Top panel: Logistic curve fit (red) to one stress-strain curve (black). + $R^2 = 0.9983$. + Bottom panel: distribution of $R^2$ over all stress-strain curves. + Mean $R^2 = 0.9980$ (red) with a standard deviation of \num{0.0036}. } - \label{fig:logistic_fits} -\end{figure*} + \label{fig:logistic-fit-and-r2-histogram} +\end{marginfigure} Because the logistic curve describes the stress-strain data more accurately than PCA or the exponential, it will be used from now on. @@ -197,20 +101,20 @@ \subsubsection{Label density smoothing} where $s$ is the bin width. The smoothed distribution is shown in \cref{fig:skin_lds}. -\begin{figure*} +\begin{figure} \centering - \includegraphics{skinstression/images/lds/a.pdf} \\ - \includegraphics{skinstression/images/lds/k.pdf} \\ - \includegraphics{skinstression/images/lds/xc.pdf} - \caption[Original target and LDS distrutions]{ - Empirical target distribution (red) and LDS distribution (blue) for $\sigma_\mathrm{max}$, $E_\mathrm{max}$, and $\gamma_c$.
+ \includegraphics[width=\linewidth]{skinstression/images/lds/a.pdf} \\ + \includegraphics[width=\linewidth]{skinstression/images/lds/k.pdf} \\ + \includegraphics[width=\linewidth]{skinstression/images/lds/xc.pdf} + \caption[Label density smoothed target distributions]{ + Empirical target distribution (red) and label density smoothed (LDS) target distribution (blue) for $\sigma_\mathrm{max}$, $E_\mathrm{max}$, and $\chi_c$. LDS aims to fill the gaps between the actual measurements. LDS also extrapolates target distributions. For all targets, the kernel size was 30 units, and the standard deviation 3 units. - The bin width were \num{0.01}, \num{0.1}, and \num{0.001} units for $\sigma_\mathrm{max}$, $E_\mathrm{max}$, and $\gamma_c$, respectively. + The bin widths were \num{0.01}, \num{0.1}, and \num{0.001} units for $\sigma_\mathrm{max}$, $E_\mathrm{max}$, and $\chi_c$, respectively. } \label{fig:skin_lds} -\end{figure*} +\end{figure} %%%%%%%%%%%%%%%%%%%%%%%%%%%% % IMAGE QUALITY %%%%%%%%%%%%%%%%%%%%%%%%%%%% @@ -374,8 +278,8 @@ \subsection{Training} \begin{margintable} \centering - \caption[\textsc{Skinstression} configuration]{ - \textsc{Skinstression} configuration used during training. + \caption[Skinstression configuration]{ + Skinstression configuration used during training. Parameters are ordered by their importance, calculated with fANOVA. LR, WD, and BS are learning rate, weight decay and batch size, respectively. } diff --git a/skinstression/sections/supplementary.tex b/skinstression/sections/supplementary.tex index 143a463..cbfcbad 100644 --- a/skinstression/sections/supplementary.tex +++ b/skinstression/sections/supplementary.tex @@ -1,6 +1,6 @@ \section{Supplementary materials} \subsection{Code} -The implementation of \textsc{Skinstression} can be found at \href{https://github.io/siemdejong/shg-strain-stress}{\faIcon{github} siemdejong/shg-strain-stress}. +The implementation of Skinstression can be found at \href{https://github.io/siemdejong/shg-strain-stress}{\faIcon{github} siemdejong/shg-strain-stress}. The context and container diagram are depicted in \cref{app:skin_c4}. \subsection{Data} diff --git a/skinstression/sections/theory.tex b/skinstression/sections/theory.tex index 093ae8d..cdd2fba 100644 --- a/skinstression/sections/theory.tex +++ b/skinstression/sections/theory.tex @@ -1,6 +1,125 @@ \section{Theory} -In addition to the theory described in \cref{pt:theory}, this section describes theory only applicable to the \textsc{Skinstression} project. +% In addition to the theory described in \cref{ch:theory}, this section describes theory only applicable to the Skinstression project. +\subsection{Collagen} +TODO. What does collagen look like in SHG imaging? + +% -------------------------------------------------- +% SEARCHING FOR A SIMPLE SKIN STRAIN-STRESS MODEL +% -------------------------------------------------- + +\subsection{Searching for a simple skin strain-stress model} + +Supervised learning requires targets for the model to train on. +Ideally, individual targets allow for physical interpretation and can together describe all the available data. + +\subsubsection{Empirical strain-stress regions} +Although skin tissue has a complex nature, measurements to quantify skin stretch show similar features. +Measurements show four domains: the toe, heel, leg, and break domains (\cref{fig:stress-strain-curve-skin-tissue}A). +The toe region is at the very start of the curve. +This region is relatively flat as the fiber network consists of mostly straight fibers.
 
%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% IMAGE QUALITY
@@ -374,8 +278,8 @@ \subsection{Training}
 \begin{margintable}
     \centering
-    \caption[\textsc{Skinstression} configuration]{
-        \textsc{Skinstression} configuration used during training.
+    \caption[Skinstression configuration]{
+        Skinstression configuration used during training.
         Parameters are ordered by their importance, calculated with fANOVA.
         LR, WD, and BS are learning rate, weight decay, and batch size, respectively.
     }
diff --git a/skinstression/sections/supplementary.tex b/skinstression/sections/supplementary.tex
index 143a463..cbfcbad 100644
--- a/skinstression/sections/supplementary.tex
+++ b/skinstression/sections/supplementary.tex
@@ -1,6 +1,6 @@
 \section{Supplementary materials}
 \subsection{Code}
-The implementation of \textsc{Skinstression} can be found at \href{https://github.io/siemdejong/shg-strain-stress}{\faIcon{github} siemdejong/shg-strain-stress}.
+The implementation of Skinstression can be found at \href{https://github.io/siemdejong/shg-strain-stress}{\faIcon{github} siemdejong/shg-strain-stress}.
 The context and container diagrams are depicted in \cref{app:skin_c4}.
 
 \subsection{Data}
diff --git a/skinstression/sections/theory.tex b/skinstression/sections/theory.tex
index 093ae8d..cdd2fba 100644
--- a/skinstression/sections/theory.tex
+++ b/skinstression/sections/theory.tex
@@ -1,6 +1,125 @@
 \section{Theory}
-In addition to the theory described in \cref{pt:theory}, this section describes theory only applicable to the \textsc{Skinstression} project.
+% In addition to the theory described in \cref{ch:theory}, this section describes theory only applicable to the Skinstression project.
+\subsection{Collagen}
+TODO. What does collagen look like in SHG imaging?
+
+% --------------------------------------------------
+% SEARCHING FOR A SIMPLE SKIN STRAIN-STRESS MODEL
+% --------------------------------------------------
+
+\subsection{Searching for a simple skin strain-stress model}
+
+Supervised learning requires targets for the model to train on.
+Ideally, individual targets allow for physical interpretation and can together describe all the available data.
+
+\subsubsection{Empirical strain-stress regions}
+Although skin tissue has a complex structure, measurements that quantify skin stretch show similar features.
+Measurements show four domains: the toe, heel, leg, and break domains (\cref{fig:stress-strain-curve-skin-tissue}A).
+The toe region is at the very start of the curve.
+This region is relatively flat, as the fibers in the network are still crimped rather than taut.
+Therefore, the fibers cannot yet exert force in reaction to an external stretching force.
+In the heel region, where the tissue is stretched further, the fibers straighten and can exert more force.
+When enough force is exerted on the tissue, the fibers are stretched maximally and react with maximum force in the leg region.
+This region is observed to be roughly linear.
+Overstretching the tissue then breaks the fiber network, reducing its ability to exert force.
+
+\begin{figure}
+    \centering
+    \includegraphics[width=\linewidth]{skinstression/images/stress-strain-curve.pdf}
+    \caption[Stress-strain curve of human skin tissue]{
+        Stress-strain curve of human skin tissue.
+        The curve is divided into four regions.
+        I: the toe region.
+        The fibers here are still mostly crimped.
+        II: the heel region.
+        Fibers straighten and begin to stretch.
+        III: the leg region.
+        Fibers stretch maximally and react with maximum force.
+        IV: the break region.
+        Fibers break and lose the ability to apply force.
+    }
+    \label{fig:stress-strain-curve-skin-tissue}
+\end{figure}
+
+\subsubsection{Exponential}
+Strain-stress curves can also be visualized by plotting the logarithm of the derivative of stress with respect to strain against the logarithm of the stress, as in \cref{fig:stress-strain-curve-skin-tissue}B.
+Typically, this figure has three regions.
+The first region indicates a linear relationship between small forces and small strains.
+Then, the derivative increases until it reaches a purely exponential part~\cite{Holzapfel2001}.
+If skin stretching follows this kind of behavior, a simple mathematical model can be derived.
+Inspecting the figure, the linear part corresponds to the ordinary differential equation
+\begin{equation}
+    \frac{\mathrm{d}\sigma}{\mathrm{d}\gamma} \propto \sigma,
+\end{equation}
+where $\gamma = \chi - 1$ is the engineering strain and $\chi$ the stretch.
+Solving for $\sigma$, we get
+\begin{equation}
+    \sigma \propto e^{\lambda\gamma},
+\end{equation}
+where $\lambda$ is a factor dictating the rate at which the exponential increases.
+At no extension, $\gamma=0$, it can be assumed that there is no stress.
+Therefore,
+\begin{equation}
+    \sigma \propto e^{\lambda\gamma} - 1.
+\end{equation}
+At small extensions, $\lambda\gamma \ll 1$, a Taylor expansion gives $e^{\lambda\gamma} \approx 1 + \lambda\gamma + \ldots$.
+So
+\begin{equation}
+    \sigma_{\lambda\gamma\ll 1} \propto \left(1 + \lambda\gamma + \ldots\right) - 1 \approx G_0 \gamma,
+\end{equation}
+where $G_0$ is a linear coefficient valid at small extensions.
+Substituting $\gamma = \chi - 1$, the full expression becomes
+\begin{equation}\label{eq:exp}
+    \sigma = \frac{G_0}{\lambda}\left(e^{\lambda(\chi - 1)}-1\right).
+\end{equation}
+
+This model assumes that the data follow the curve described above: a small rise at small extensions and a stress that keeps increasing exponentially at larger extensions.
+
+The exponential model is fit to some stress-strain curves using OriginPro~\cite{OriginPro}.
+
+\subsubsection{Principal component analysis}
+In an earlier study~\cite{Soylu2022}, principal component analysis (PCA) is used to reduce the dimensionality of the strain-stress data.
+In summary, after PCA, every measurement $Y$ can be approximated by
+\begin{equation}\label{eq:pca}
+    Y \approx Y_\mathrm{PCA} = \mathbf{A} \mathbf{V} + \bar{Y},
+\end{equation}
+where $\mathbf{V}$ contains the eigenvectors (principal components) of the measurement data and $\mathbf{A}$ the corresponding coefficients (scores) of each measurement.
+$\bar{Y}$ is the measurement mean.
+Choosing the first $p$ principal components allows for dimensionality reduction.
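+As an illustration, \cref{eq:pca} can be sketched in a few lines of NumPy; the data and shapes below are placeholders, and the eigenvectors are obtained via a singular value decomposition of the centered measurements.
+\begin{verbatim}
+import numpy as np
+
+# Placeholder data: 20 curves resampled onto a common 100-point grid.
+Y = np.random.default_rng(0).random((20, 100))
+Y_mean = Y.mean(axis=0)
+
+# Rows of V are the eigenvectors (principal components) of the
+# covariance of the measurements.
+_, _, V = np.linalg.svd(Y - Y_mean, full_matrices=False)
+
+p = 2                       # keep the first p principal components
+A = (Y - Y_mean) @ V[:p].T  # scores: one coefficient row per curve
+Y_pca = A @ V[:p] + Y_mean  # reconstruction as in the equation above
+\end{verbatim}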
+
+Using the PCA-derived scores to weight the eigenvectors has some caveats.
+First, the training and test sets must be treated separately.
+The test set has to be projected on the space spanned by the first $p$ eigenvectors of the training set.
+This may induce problems, as the test set could contain information that is poorly represented by the training set's eigenvectors.
+Second, PCA depends on interpolation, \ie every strain-stress curve must be resampled onto a common set of strain (or stress) values.
+This restricts the data to the domain shared by all curves.
+
+\subsubsection{Logistic curve}
+The empirical observation that the force response of skin tissue transitions between states suggests a logistic curve, which can be written as
+\begin{equation}\label{eq:logistic_curve}
+    \sigma = \frac{\sigma_\mathrm{max}}{1+e^{-E_\mathrm{max} (\gamma - \gamma_c)}},
+\end{equation}
+where $\sigma$ and $\gamma$ are the stress and engineering strain, $\sigma_\mathrm{max}$ is the maximum stress, $E_\mathrm{max}$ is the maximum Young's modulus, and $\gamma_c$ the strain offset.
+This equation assumes that there is a maximum force the tissue can exert, contrary to the exponential model above (\cref{eq:exp}).
+
+Using the logistic curve as an alternative to PCA has two major advantages:
+every curve can be treated separately, and, since no interpolation is necessary, measurements can span arbitrary domains with arbitrary intervals.
+
+Strain-stress curves for all individuals were kindly provided by M.\ Zhou.
+The curves only include points where the skin extension is larger than zero and the force is positive.
+
+To every strain-stress curve, \cref{eq:logistic_curve} is fitted with SciPy~\cite{2020SciPy-NMeth}.\todo{Move methods to methods}
+The optimal parameters were used as targets for training the model.
+
+The AI was trained on target distributions smoothed with LDS.
+The targets were weighted with the inverse square root of the smoothed label density to limit the impact of LDS.
+
+It is important to study bias, as it may offer explanations for varying AI performance.
+The samples were taken from individuals of varying age and gender.
+Moreover, more mechanical measurements were taken from some individuals than from others.
+Therefore, the age, gender, and number of samples per individual were summarized.

% --------------------------------------------------
% Label density smoothing

@@ -24,7 +143,7 @@ \subsection{Label density smoothing}
 \subsection{Goodness of fit}
 
-\subsubsection{Coefficient of determination}\label{subsec:coef_det}
+% \subsubsection{Coefficient of determination}\label{subsec:coef_det}
 One possible way to quantify the goodness of a fit is to calculate the coefficient of determination.
 The coefficient of determination of a dataset $y$ and its prediction $f$ is calculated with
 \begin{equation}