Is Preview a search engine like Byonic, Mascot, or SEQUEST?
Not exactly. Preview samples the data, so it generally makes fewer identifications than a full search engine. On the other hand, it tests many more modifications and search options than any full search engine.
How should I use Preview?
Run Preview before you run any other searches, so that you will know what type of full search will be most effective.
Should I use Preview to recalibrate my m/z measurements?
Yes! If Preview makes sufficiently many identifications, say at least 20 precursors and at least 50 fragments, then a full search of Preview’s recalibrated spectrum file will generally give better results than a search of the original spectrum file, unless the original calibration is already extremely good. If you have enough identifications to avoid over-fitting (say 100 or more precursors), you can even run Preview’s recalibrated spectrum file through Preview again for even more precise recalibration.
How does Preview recalibrate m/z measurements?
It maps measured m/z values to recalibrated m/z values using quadratic curves (the red curves shown in the plots). We have found that calibration does not drift much over the course of an LC-ESI run, so the same quadratic curve works for all spectra. Calibration can change from plate to plate with MALDI, however, so it is quite possible to see a lot of scatter in the m/z errors of a data set comprising many MALDI plates.
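The idea behind a quadratic recalibration can be sketched in a few lines. This is a minimal illustration, not Preview’s actual implementation: we fit a quadratic to the m/z errors of confident identifications, then subtract the fitted error from every measured value. The synthetic data here are invented for the example.

```python
import numpy as np

def fit_recalibration(measured_mz, true_mz):
    """Fit a quadratic curve to the m/z errors and return a correction
    function (illustrative sketch only)."""
    measured_mz = np.asarray(measured_mz)
    errors = measured_mz - np.asarray(true_mz)
    # error ≈ a*mz^2 + b*mz + c, fitted by least squares
    coeffs = np.polyfit(measured_mz, errors, deg=2)
    return lambda mz: mz - np.polyval(coeffs, mz)

# Synthetic example: a small quadratic miscalibration over 300-1500 m/z
true_mz = np.linspace(300.0, 1500.0, 50)
measured = true_mz + 1e-8 * true_mz**2 + 2e-5 * true_mz + 0.001
recalibrate = fit_recalibration(measured, true_mz)
corrected = recalibrate(measured)
```

Because one curve is fitted from all identifications, the same correction applies to every spectrum in the run, which is why this works well for LC-ESI but can struggle across multiple MALDI plates.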
If Preview reports median precursor error of 2 ppm, should I set the precursor tolerance in the full search to 2 ppm?
No! The median error is the typical error for an abundant ion; the maximum error is generally at least 3 to 5 times larger. Also check the number of “off-by-one” errors reported by Preview: even on high-accuracy instruments, many precursor masses may reflect the mass of the first isotope peak rather than the monoisotopic mass.
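An off-by-one precursor sits roughly one neutron mass (about 1.00335 Da, the C-13 to C-12 difference) above the monoisotopic mass. Here is a hedged sketch of the check, with a hypothetical helper name and a made-up tolerance; it is not Preview’s code.

```python
NEUTRON = 1.00335  # approximate C13 - C12 mass difference in Da

def ppm_error(measured, theoretical):
    """Relative mass error in parts per million."""
    return (measured - theoretical) / theoretical * 1e6

def match_with_off_by_one(measured, theoretical, tol_ppm=10.0):
    """Classify a precursor match, allowing an off-by-one isotope error
    (illustrative sketch, hypothetical function name and tolerance)."""
    if abs(ppm_error(measured, theoretical)) <= tol_ppm:
        return "monoisotopic"
    # First isotope peak picked instead of the monoisotopic peak?
    if abs(ppm_error(measured - NEUTRON, theoretical)) <= tol_ppm:
        return "off-by-one"
    return None
```

A precursor measured at 1001.50335 against a theoretical mass of 1000.5 would fail a narrow ppm tolerance outright, but matches once the neutron mass is subtracted.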
In the full search, should I enable all the modifications that Preview reports as “Common variable modifications”?
Not necessarily. Some full search engines do not support all the modifications supported by Preview. Some modifications are biologically uninteresting (for example, sodiation) and should only be enabled if they would contribute a significant number of additional identifications.
How does Preview compute False Discovery Rate (FDR)?
Preview uses the target/decoy approach to FDR estimation, estimating the number of true identifications as the number of target identifications minus the number of decoy identifications. There is no need to add decoy proteins to the protein database, because Preview does this automatically. Preview does not report FDR, but it uses FDR internally to decide which identifications to accept.
How reliable are Preview’s statistics?
Preview’s statistics are especially good for “normal” shotgun proteomics, meaning digested multi-protein samples. Preview loses some reliability on very highly modified samples, in which many peptides carry more than one variable modification.
How can I use Preview to improve my sample processing?
Preview reports on the amount of nonspecific digestion, m/z measurement errors, and sample preparation artifacts such as over- and under-alkylation, carbamylation, oxidation, sodiation, and deamidation. This type of information can provide valuable feedback.
How should I read Preview’s peptide and protein identifications?
This list (accessible from the Detail page) gives the highest-scoring identification for each spectrum, so long as the score is high enough to be statistically significant. We don’t usually do much with this list of identifications: remember that Preview samples the data, and does NOT perform a full search.
How should I read Preview’s wildcard search results?
Preview’s wildcard is just what it sounds like: any mass shift on any one residue. Wildcard identifications are often approximate, with misplaced modifications, two modifications combined into one wildcard, two known modifications in a combination not considered by Preview’s other searches, and so forth. On the other hand, these identifications, especially if they have scores over 60, are rarely completely wrong. A wildcard search will find polymorphisms, unanticipated modifications, and mystery mass shifts in almost any sample.
Why do the Summary and Details statistics sometimes disagree?
The Summary page reports the overall gain to be achieved by enabling the modification, for example, 8.5% more identifications by allowing oxidized methionine for the BTK sample data. In contrast, the Details page reports the rate of modification, for example, 32.9% of peptides containing methionine contain at least one oxidized methionine. In other words, the Summary reports the “bottom line”: how many more identifications can be obtained by enabling the modification, while the Details page reports direct comparisons on specific, limited searches.
For example, to assess the rate of oxidized methionine, Preview searches the spectra only against methionine-containing peptides, and reports the results of the search on the Details page. Then after all searches have been done, Preview compiles the summary statistics by counting up all the identifications for all spectra.
Denominators in the percentages may also vary from search to search due to “second-order” effects such as multiply modified peptides and corrections for hits to decoys.
Preview’s statistics can lose accuracy on extreme data sets, those in which a large percentage (say 30% or more) of the peptides carry more than one type of modification: for example, a data set that is both highly over-alkylated and highly oxidized.
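The two percentages above answer different questions because they use different denominators. A small sketch with hypothetical counts (chosen to reproduce the 8.5% and 32.9% figures from the text, not taken from real data) makes the distinction concrete:

```python
# Hypothetical counts, invented to mirror the figures quoted in the text
total_identifications = 10000   # identifications without the modification
extra_identifications = 850     # new identifications gained by allowing ox-Met

met_peptides = 3000             # identified peptides containing methionine
met_peptides_oxidized = 987     # of those, at least one oxidized methionine

# Summary page: gain over ALL identifications
summary_gain = extra_identifications / total_identifications   # 8.5%

# Details page: modification rate among Met-containing peptides only
detail_rate = met_peptides_oxidized / met_peptides             # 32.9%
```

A modification can have a high rate on the Details page yet a modest bottom-line gain on the Summary page, for instance when few peptides contain the residue in question.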