Significance Analysis of INTeractome
 

Significance Analysis of INTeractome (SAINT) is a series of software tools that assign confidence scores to protein-protein interactions from quantitative proteomics data in affinity purification-mass spectrometry (AP-MS) experiments. We provide the version used for spectral count data in the yeast kinase interactome work (ref 1), which does not incorporate control purifications, as well as a generalized implementation for spectral count data with and without control purifications.

 

SAINT versions

(1) We currently maintain and distribute 3 versions of SAINT: SAINT v2 (refs 2, 3, 4), SAINTexpress (ref 5), and SAINTq (ref 10).

(2) All programs in SAINT v2 rely on time-consuming sampling-based inference (MCMC), and SAINTexpress was developed to address this issue. However, SAINTexpress offers fewer options for adjusting the scoring system, and it can only analyze data sets that include control purifications.

(3) SAINTq was recently added to the package to enable scoring of AP-SWATH data, where intensities are reported at the transition level. The new software allows the user to directly use the reproducibility information in transitions/peptides to score each protein as an interaction partner. SAINTq can also be used to score protein- or peptide-level intensity data. SAINTq with protein intensity data corresponds to exactly the same scoring system as the protein intensity-based scoring in SAINTexpress (intensity module).

 

Which version should I use?
(1) Use SAINTexpress for rapid and robust scoring of datasets when (i) you have spectral count data or protein-level intensity data and (ii) negative control runs represent contaminants well. 

(2) Use SAINT 2.0 when flexibility is needed in the scoring: its “options” let you tailor the scoring to the dataset (e.g. normalizing abundance across all purifications). SAINT 2.0 is also available within the Contaminant Repository for Affinity Purification (CRAPome.org; ref 6). Both SAINT 2.0 and SAINTexpress are integrated in the ProHits LIMS for interaction proteomics (refs 7-8).

(3) Use SAINTq when you have peptide- or transition-level intensity data (e.g. from data-independent acquisition). You can also use this version for protein-level intensity data, which gives the same analysis as the intensity version of SAINTexpress. SAINTq requires a single input file that contains all quantitative data as well as the bait and prey information. Previous versions of SAINT required these three pieces of information to be provided in separate files.
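
The guidance above can be summarized as a small decision helper. This is only an illustrative sketch of the rules stated on this page, not part of the SAINT distribution: the function name, the data-level labels, and the parameter names are all assumptions made for the example. It also does not capture the judgment of whether your negative control runs represent contaminants well, which remains a prerequisite for SAINTexpress.

```python
# Hypothetical labels for the kind of quantitative data available.
SPECTRAL = "spectral_count"
PROTEIN = "protein_intensity"
PEPTIDE = "peptide_intensity"
TRANSITION = "transition_intensity"


def recommend_saint(data_level, has_controls=True, needs_options=False):
    """Suggest a SAINT version following the guidance on this page.

    data_level    -- one of the labels defined above (assumed names)
    has_controls  -- whether negative control purifications are available
    needs_options -- whether the flexible scoring options of SAINT v2 are needed
    """
    # Peptide- or transition-level intensities are handled by SAINTq.
    if data_level in (PEPTIDE, TRANSITION):
        return "SAINTq"
    if data_level not in (SPECTRAL, PROTEIN):
        raise ValueError(f"unknown data level: {data_level!r}")
    # SAINTexpress only analyzes data sets with control purifications.
    if not has_controls:
        if data_level == SPECTRAL:
            return "SAINT v2"
        raise ValueError(
            "this page gives no recommendation for protein-level "
            "intensity data without control purifications"
        )
    # With controls: SAINT v2 for flexible options, otherwise SAINTexpress.
    # (For protein-level intensities, SAINTq gives the same scoring.)
    return "SAINT v2" if needs_options else "SAINTexpress"
```

For example, recommend_saint("spectral_count") returns "SAINTexpress", while recommend_saint("spectral_count", needs_options=True) returns "SAINT v2".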

 

Download
The software and vignette can be downloaded from
http://sourceforge.net/projects/saint-apms/files/

 

Compilation with GNU Scientific Library (GSL)
Compilation requires GSL, which is freely available online. Run the bash script ‘compile’ to install the program. The directory containing the executable should also be added to the PATH variable. Data preparation steps are described in the vignette.
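
One way to confirm that the directory containing the executable is on PATH is a quick lookup from Python. This is a minimal sketch: the helper name find_on_path is made up for this example, and the executable name "SAINTexpress-spc" is an assumption used for illustration, so substitute the name of whatever file the compile script produced on your system.

```python
import os
import shutil


def find_on_path(executable, extra_dirs=()):
    """Return the full path of `executable`, or None if it cannot be found.

    Any directories in `extra_dirs` are searched first, followed by the
    directories listed in the PATH environment variable.
    """
    search_path = os.pathsep.join([*extra_dirs, os.environ.get("PATH", "")])
    return shutil.which(executable, path=search_path)


if __name__ == "__main__":
    # Hypothetical executable name; replace with the file built on your system.
    name = "SAINTexpress-spc"
    location = find_on_path(name)
    if location is None:
        print(f"{name} not found; add its directory to PATH")
    else:
        print(f"{name} found at {location}")
```

If the lookup returns None, the directory has not been added to PATH for the current shell session.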

 

References
[1] Breitkreutz, A., Choi, H., Sharon, J., Boucher, L., Neduva, V., Larsen, B.G., Lin, Z.-Y., Breitkreutz, B.-J., Stark, C., Liu, G., Ahn, J., Dewar-Darch, D., Tang, X., Almeida, V., Qin, Z.S., Pawson, T., Gingras, A.-C., Nesvizhskii, A., Tyers, M. (2010) A global protein kinase and phosphatase network. Science, 328:1043-6.
* This describes the use of the unsupervised SAINT model, which does not require negative controls for scoring. The unsupervised model should only be used for large scale projects that profile > xxx baits that share very few interactions. This model was built into SAINT v1 and is distributed as the "SAINTspc-noctrl-matrix" module in SAINT v2.

[2] Choi, H., Larsen, B., Lin, Z.-Y., Breitkreutz, A., Mellacheruvu, D., Fermin, D., Qin, Z.S., Tyers, M., Gingras, A.-C. and Nesvizhskii, A.I. (2011) SAINT: probabilistic scoring of affinity purification - mass spectrometry data. Nature Methods, 8:70-3.
* This is the key reference for the SAINT series of algorithms (SAINT v2). It introduced the semi-supervised SAINT model, which compares the spectral count distribution of a prey across the negative control runs with the counts for the same prey in the purification of the bait.

[3] Choi, H., Liu, G., Tyers, M., Gingras, A.-C. and Nesvizhskii, A.I. (2012) Analyzing protein-protein interactions from affinity purification-mass spectrometry data with SAINT. Curr Protoc Bioinformatics, Chapter 8:Unit8.15.
* This is a detailed protocol for the use of SAINT, which defines options (minFold, lowMode and Norm) that can be tailored to the dataset to be analyzed.

[4] Choi, H., Glatter, T., Gstaiger, M. and Nesvizhskii, A.I. (2012) SAINT-MS1: protein-protein interaction scoring using label-free intensity data in affinity purification-mass spectrometry experiments. J Proteome Res, 11:2619-24.
* This describes the implementation of intensity data-based scoring in SAINT (in contrast to count data).

[5] Teo, G., Liu, G., Zhang, J.P., Nesvizhskii, A.I., Gingras, A.-C., and Choi, H. (2013) SAINTexpress: improvements and additional features in Significance Analysis of INTeractome for AP-MS data. J. Proteomics, 100:37-43.
* This manuscript describes a computationally efficient version of the SAINT tool in which several improvements were made to more sensitively score interactions for prey proteins captured in different amounts across bait purifications. It also presents an optional re-scoring based on externally acquired information.

[6] Mellacheruvu, D., Wright, Z., Couzens, A.L., Lambert, J.-P., St-Denis, N., Li, T., Miteva, Y.V., Hauri, S., Sardiu, M.E., Low, T.Y., Halim, V.A., Bagshaw, R., Hubner, N.C., al-Hakim, A., Bouchard, A., Faubert, D., Fermin, D., Dunham, W.H., Goudreault, M., Lin, Z.-Y., Gonzalez Badillo, B., Pawson, T., Durocher, D., Coulombe, B., Aebersold, R., Superti-Furga, G., Colinge, J., Heck, A.J.R., Choi, H., Gstaiger, M., Mohammed, S., Cristea, I.M., Bennett, K.L., Washburn, M.P., Raught, B., Ewing, R.M., Gingras, A.-C., and Nesvizhskii, A.I. (2013) The CRAPome: a Contaminant Repository for Affinity Purification Mass Spectrometry Data. Nature Methods, 10(8):730-6.
* SAINT 2.0 is used in this manuscript to help users score their interactions by making use of negative control purifications generated by the proteomics community.

[7] Liu, G., Zhang, J.P., Larsen, B., Stark, C., Breitkreutz, A., Lin, Z.-Y., Breitkreutz, B.-J., Ding, Y., Colwill, K., Pasculescu, A., Pawson, T., Wrana, J., Nesvizhskii, A.I., Raught, B., Tyers, M., and Gingras, A.-C. (2010) ProHits: an integrated software platform for mass spectrometry-based interaction proteomics. Nat Biotech, 28:1015-7.
* Primary description of the ProHits LIMS which is integrated with SAINT.

[8] Liu, G., Zhang, J., Choi, H., Lambert, J.-P., Srikumar, T., Larsen, B., Nesvizhskii, A.I., Raught, B., Tyers, M., and Gingras, A.-C. (2012) Using ProHits to store, annotate and analyze affinity purification - mass spectrometry (AP-MS) data. Curr Protoc Bioinformatics, Chapter 8:Unit8.16.
* This manuscript provides detailed protocols for the use of ProHits for managing interaction proteomics data; the incorporation of SAINT 2.0 in ProHits is described in this chapter. Please also note that the current version of ProHits enables the user to run either SAINT 2.0 (with options) or SAINTexpress.

[9] Knight, J.D., Liu, G., Zhang, J., Pasculescu, A., Choi, H., and Gingras, A.-C. (2014) A web-tool for visualizing quantitative protein-protein interaction data. Proteomics, 15(8):1432-6. Download link: http://prohitstools.mshri.on.ca.
* This manuscript describes a new visualization tool for quantitative interactome data. The web tool provides convenient visualization modules for the raw and SAINTexpress adjusted quantitative data and other statistical summaries such as fold changes and confidence scores.

[10] Teo, G.C., Koh, H.W.L., Fermin, D., Lambert, J.-P., Knight, J.D., Gingras, A.-C., Choi, H. (2016) SAINTq: scoring protein-protein interactions in affinity purification - mass spectrometry experiments with fragment or peptide intensity data. In revision.
* This manuscript describes the latest intensity-based version of SAINT, which uses protein-, peptide-, or transition-level intensity data with simplified input formatting.