Abstract

Poster Presentations

Day 4, June 25 (Wed.)

Room P (Maesato East, Foyer, Ocean Wing)

4P-PM-40
Dependence of Decoy Database on Peptide Identification Numbers in Bottom-Up Proteomics

(¹Kyoto Univ., ²NIBN)
○Yuichiro Fujita¹, Yasushi Ishihama¹,²

In bottom-up proteomics, peptide identifications are usually controlled at a fixed false discovery rate (FDR) estimated with the target-decoy method. However, the number of peptide-spectrum matches (PSMs) accepted at a fixed FDR may depend on the contents of the decoy database (Decoy DB). This study investigates how different decoy generation methods change the number of peptide identifications. Using a UniProt FASTA file containing over 20,000 human proteins, we generated a Target DB and multiple Decoy DBs with AlphaPeptDeep. Two types of decoys were examined: (1) Protein-level, in which whole protein sequences were reversed or randomly shuffled before digestion and fragmentation, and (2) Peptide-level, in which peptide sequences were reversed or shuffled before fragmentation. The dataset JPST001624 (HeLa DDA data from jPOST) was analyzed, with the FDR calculated from cosine similarity scores obtained by reverse search. Preliminary results show little difference in the number of PSMs between reversed and shuffled sequences, but a clear difference between Protein-level and Peptide-level decoys. This suggests that peptide diversity in decoy generation affects identification outcomes. We will further investigate criteria for selecting an optimal Decoy DB, the impact on protein identification, and robustness when alternative scores such as UniScore are used. Through this study, we aim to provide new insights toward establishing a more appropriate target-decoy approach.
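
The distinction between the two decoy types, and the way a PSM count at a fixed FDR is obtained, can be illustrated with a minimal Python sketch. This is not the workflow used in the study: it does not call AlphaPeptDeep, the digestion rule (cleave after K or R unless followed by P, no missed cleavages) is a simplifying assumption, and the function names `digest`, `transform`, `protein_level_decoys`, `peptide_level_decoys`, and `count_targets_at_fdr` are hypothetical. The FDR here is the common decoy/target ratio above a score threshold, which may differ from the estimator used by the authors.

```python
import random


def digest(protein):
    """Simplified tryptic digestion (assumption): cleave after K/R unless next is P."""
    peptides, start = [], 0
    for i, aa in enumerate(protein):
        next_aa = protein[i + 1] if i + 1 < len(protein) else ""
        if aa in "KR" and next_aa != "P":
            peptides.append(protein[start:i + 1])
            start = i + 1
    if start < len(protein):
        peptides.append(protein[start:])
    return peptides


def transform(seq, mode, rng):
    """Reverse or randomly shuffle a sequence."""
    if mode == "reverse":
        return seq[::-1]
    if mode == "shuffle":
        residues = list(seq)
        rng.shuffle(residues)
        return "".join(residues)
    raise ValueError(f"unknown mode: {mode}")


def protein_level_decoys(protein, mode, rng):
    """Transform the whole protein first, then digest it."""
    return digest(transform(protein, mode, rng))


def peptide_level_decoys(protein, mode, rng):
    """Digest the target protein first, then transform each peptide."""
    return [transform(pep, mode, rng) for pep in digest(protein)]


def count_targets_at_fdr(psms, max_fdr=0.01):
    """Count target PSMs accepted at a given FDR.

    psms: iterable of (score, is_decoy) pairs, higher score = better match.
    FDR at a threshold is estimated as (#decoys / #targets) above it.
    """
    targets = decoys = accepted = 0
    for _score, is_decoy in sorted(psms, key=lambda p: p[0], reverse=True):
        if is_decoy:
            decoys += 1
        else:
            targets += 1
        if targets and decoys / targets <= max_fdr:
            accepted = targets
    return accepted


if __name__ == "__main__":
    rng = random.Random(0)
    protein = "MKTAYIAKQRPLSEGK"  # toy sequence, not from UniProt
    print("targets      :", digest(protein))
    print("protein-level:", protein_level_decoys(protein, "reverse", rng))
    print("peptide-level:", peptide_level_decoys(protein, "reverse", rng))
```

In this toy case the peptide-level decoys keep the length and amino acid composition of each target peptide, whereas the protein-level decoys are cut at different positions and so form a different peptide population. That is one way the two decoy types can differ in peptide diversity.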