1) PACKAGES

The main packages used for the analysis are:

library(MALDIquant)
library(MALDIquantForeign)
library(MALDIrppa)
library(cluster)
library(scales)
library(ggplot2)
library(ggrepel)
library(dendextend)
library(mixOmics)
library(lsa)
library(knitr)
library(kableExtra)

2) TASKS AND PARAMETERS

The first column (TASK NAME) lists the main tasks, and the second column (CHOICE) shows whether the user selected each task. The third and fourth columns (METHOD I, METHOD II) give the methods used for the task, and the fifth to seventh columns (PARAMETER I, II, III) hold the numeric parameters. The tasks, methods and parameters set by the user are:

Task           Choice  Method I       Method II  Parameter I  Parameter II  Parameter III
quality_pre    yes
trimming       yes                               900          2500
quality_post   yes
stabilization  yes     sqrt
smoothing      yes     SavitzkyGolay             10
baseline       yes     SNIP                      25           75
normalization  yes     TIC
average        yes     mean
alignment      yes     MAD            lowess     20           2.0           0.002
peak           yes     strict                    0.2
pca            yes
clustering     yes     average        gap        3
heatmap        yes                               1
reporting      yes     yes
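
As an illustration only, the preprocessing rows of the table (stabilization, smoothing, baseline, normalization) could map onto MALDIquant calls roughly as shown here. This is a sketch on a synthetic spectrum, not the code actually run for this report. Reading 10 as `halfWindowSize` and 25 as `iterations` is an assumption, and the second baseline parameter (75) is left uninterpreted.

```r
library(MALDIquant)

# Synthetic spectrum (made-up values): decaying baseline, one peak, noise.
set.seed(7)
mz  <- seq(900, 2500, length.out = 2000)
int <- 50 * exp(-mz / 1000) +
       100 * exp(-((mz - 1500)^2) / (2 * 2^2)) +
       runif(2000, 0, 5)
s <- createMassSpectrum(mass = mz, intensity = int)

# stabilization: sqrt
s <- transformIntensity(s, method = "sqrt")
# smoothing: SavitzkyGolay (assumption: Parameter I = halfWindowSize)
s <- smoothIntensity(s, method = "SavitzkyGolay", halfWindowSize = 10)
# baseline: SNIP (assumption: Parameter I = iterations)
s <- removeBaseline(s, method = "SNIP", iterations = 25)
# normalization: TIC
s <- calibrateIntensity(s, method = "TIC")

s
```

Each call returns a new `MassSpectrum` object, so the steps can be reordered or skipped according to the CHOICE column.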

3) MASS SPECTRA ACQUISITION

The user uploaded 40 MALDI-TOF mass spectra:

##  [1] "13S13569_A_SP.txt" "13S13569_B_SP.txt" "13S13569_C_SP.txt"
##  [4] "13S13569_D_SP.txt" "13S14986_A_SP.txt" "13S14986_B_SP.txt"
##  [7] "13S14986_C_SP.txt" "13S14986_D_SP.txt" "13S15816_A_SP.txt"
## [10] "13S15816_B_SP.txt" "13S15816_C_SP.txt" "13S15816_D_SP.txt"
## [13] "13S15947_A_SP.txt" "13S15947_B_SP.txt" "13S15947_C_SP.txt"
## [16] "13S15947_D_SP.txt" "C103_A_SP.txt"     "C103_B_SP.txt"    
## [19] "C103_C_SP.txt"     "C103_D_SP.txt"     "C128_A_SP.txt"    
## [22] "C128_B_SP.txt"     "C128_C_SP.txt"     "C128_D_SP.txt"    
## [25] "C131_A_SP.txt"     "C131_B_SP.txt"     "C131_C_SP.txt"    
## [28] "C131_D_SP.txt"     "C132_A_SP.txt"     "C132_B_SP.txt"    
## [31] "C132_C_SP.txt"     "C132_D_SP.txt"     "C133_A_SP.txt"    
## [34] "C133_B_SP.txt"     "C133_C_SP.txt"     "C133_D_SP.txt"    
## [37] "C135_A_SP.txt"     "C135_B_SP.txt"     "C135_C_SP.txt"    
## [40] "C135_D_SP.txt"

All the mass spectra have been converted to the S4 class ‘MassSpectrum’. For each mass spectrum, the object summary reports the number of m/z values, the range of m/z values, the range of intensity values and the memory usage.

The summary of the first mass spectrum is reported to illustrate the ‘MassSpectrum’ class:

## S4 class type            : MassSpectrum     
## Number of m/z values     : 41376            
## Range of m/z values      : 100.09 - 3985.992
## Range of intensity values: 0e+00 - 2.988e+03
## Memory usage             : 649.766 KiB
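
For readers unfamiliar with the class, a `MassSpectrum` object can be built by hand with MALDIquant's `createMassSpectrum()`. The values and the file name below are made up for illustration; they are not taken from the uploaded spectra.

```r
library(MALDIquant)

# Made-up m/z and intensity vectors standing in for one imported file.
set.seed(1)
mz  <- seq(100, 4000, length.out = 1000)
int <- abs(rnorm(1000, mean = 50, sd = 10))

s <- createMassSpectrum(mass = mz, intensity = int,
                        metaData = list(file = "example_SP.txt"))

s                    # prints a summary like the one shown above
length(s)            # number of m/z values
range(mass(s))       # range of m/z values
range(intensity(s))  # range of intensity values
```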

4) QUALITY CONTROL

Quality control on the uploaded mass spectra can be performed before and after the trimming task, to check whether trimming changes the number of mass spectra flagged as outliers in the dataset.

4.1) QUALITY CONTROL PRE-TRIMMING

The number of m/z values is shown per mass spectrum. In addition, the report checks for empty mass spectra and for irregular spacing of m/z values within each mass spectrum (above a certain threshold). We suggest removing empty mass spectra from the analysis; irregularities in spacing do not affect the subsequent analysis.

## 
## 37895 41145 41376 41935 42246 42694 43887 45135 46132 46237 46352 46544 
##     1     1     1     1     1     1     1     1     1     1     1     1 
## 46617 48881 49038 49263 49503 49745 49858 50397 50485 50735 50881 51259 
##     1     1     1     1     1     1     1     1     1     1     1     4 
## 51458 51529 51578 52918 53484 53580 53674 53785 54902 56302 56668 56829 
##     1     1     1     1     1     1     1     1     1     1     1     1 
## 60424 
##     1

Any empty mass spectrum? FALSE

Are all the mass spectra regular? FALSE
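
The three checks above can be sketched with MALDIquant's `isEmpty()` and `isRegular()`. This is a minimal example on synthetic spectra, assuming `isRegular()` is called with its default threshold; the helper `make_spectrum()` is hypothetical and the data are made up.

```r
library(MALDIquant)

# Hypothetical helper: one synthetic spectrum with n data points.
set.seed(42)
make_spectrum <- function(n) {
  createMassSpectrum(mass = seq(100, 4000, length.out = n),
                     intensity = runif(n))
}
spectra <- lapply(c(500, 500, 520), make_spectrum)

# How many spectra have each number of m/z values.
n_points <- table(sapply(spectra, length))

# TRUE if at least one spectrum has no (or only zero) intensities.
any_empty <- any(sapply(spectra, isEmpty))

# TRUE only if every spectrum has regularly spaced m/z values.
all_regular <- all(sapply(spectra, isRegular))

n_points
any_empty
all_regular
```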

The atypicality score A, calculated using Rousseeuw’s Q robust scale estimator, provides a range of acceptance for the mass spectra (between the dotted red lines). Only the outliers are labelled with the codename of the mass spectrum (Fig. 1).
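
A screening of this kind is available in MALDIrppa as `screenSpectra()`. The sketch below runs it with default settings on the example data shipped with the package; it is assumed, not confirmed by this report, that the defaults (Q estimator and the package's default outlier rule and threshold) match the settings used here.

```r
library(MALDIquant)
library(MALDIrppa)

# Example spectra bundled with MALDIrppa (not the spectra of this report).
data(spectra)

# Compute atypicality scores and flag spectra outside the acceptance range.
sc <- screenSpectra(spectra)

summary(sc)  # table of scores with pass/failure labels
plot(sc)     # scores with the acceptance limits

# Spectra that passed the screening.
kept <- sc$fspectra
length(kept)
```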

4.2) QUALITY CONTROL POST-TRIMMING

The trimming task is considered done even if the user does not provide a specific trimming range (in that case the mass spectra are considered in their entirety). Nevertheless, the quality control after trimming is still subject to the user’s choice. In this case the mass spectra are trimmed.
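
Trimming to the range given in the parameter table (900 to 2500) can be sketched with MALDIquant's `trim()`. The spectra below are synthetic and only illustrate the call.

```r
library(MALDIquant)

# Two synthetic spectra covering a wider m/z range than the trimming window.
set.seed(3)
spectra <- lapply(1:2, function(i) {
  createMassSpectrum(mass = seq(100, 4000, length.out = 1000),
                     intensity = runif(1000))
})

# Keep only data points with m/z between 900 and 2500.
trimmed <- trim(spectra, range = c(900, 2500))

# m/z range across all trimmed spectra.
mz_range <- range(unlist(lapply(trimmed, mass)))
mz_range
```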

The atypicality score A is calculated again after the trimming task. We suggest comparing the possible outliers in the two cases (before and after trimming) to decide whether the corresponding mass spectra should be removed from the study.