The neXt-MP50 challenge

Annual CHPP report

neXt-MP50 Reports

neXt-CP50 Reports (a Dark Protein Initiative)

By analogy to the term “dark proteins”, coined to represent structurally uncharacterized regions, C-HPP investigators have recently adopted the term “dark proteome” to refer collectively to those proteins for which we have insufficient information on protein expression, structure, function, or all of these. They include, for example, MPs (PE2−4), PE5, uPE1 proteins, and any potential proteins translated from smORFs or lncRNAs. Considering uPE1 proteins alone, nearly 2,000 proteins have no functional information (Paik et al., 2018, JPR, DOI:10.1021/acs.jproteome.8b00383).

On March 1, 2018, HUPO C-HPP announced the launch of neXt-CP50, where CP stands for “characterization of proteins”. This pilot project aims to characterize the function of 50 uPE1 proteins over ~3 years (2018-2021). The challenge is meant to test the feasibility of functionally characterizing large numbers of dark proteins (about 2,000 at present); the 15 participating teams are focusing on specific tractable targets that can be investigated.
Of the international teams in the C-HPP consortium, 15 from 12 countries joined this project: Chr 2 (Switzerland), Chr 3 (Japan), Chr 4 (Taiwan), Chr 9, 11, 13 (Korea), Chr 10, Chr 14 (France), Chr 15 (Brazil), Chr 16 (Spain), Chr 17 (USA), Chr 18 (Russia), Chr 19 (Mexico), Chr 20 (China), and Chr Y (Iran).

1.1. Chromosome 1

1.2. Chromosome 2


  • Complete reannotation of the 33 missing proteins from chr 2 with associated literature - expected validation of ~20 proteins in September 2017
  • Targeted studies (SRM, IHC) on the 40 probable sperm proteins listed in Duek et al. 2016 - expected validation of ~10 chromosome 2 proteins in April 2017 (collaboration with Chromosome 14 team)
  • Data mining on the remaining chr 2 MPs to identify the best samples in which to detect them - prioritized list expected May 2017

1.3. Chromosome 3

1.4. Chromosome 4

1.5. Chromosome 5


  • Participation in the IVTT cluster, which targets the 50 missing proteins on chromosomes 5, 10, 15, 16, and 19 that have the highest probability of identification in the COV318, PANC1, DFCI024, D341MED, ST486, and KLE cell lines, as determined from mRNA array expression.
  • Close collaboration with the chromosome 19 team in the Moonshot project, performing deep profiling of 4000 melanoma samples and sharing bioinformatics expertise. In this project, missing proteins can be obtained from deep profiling of a large number of primary and metastatic melanoma samples.
  • The European TRANSCAN-2 project on ovarian cancer will provide primary and metastatic protein profiles from a large number of ovarian cancer samples, in which missing proteins are expected to be identified.
  • Local project on proteogenomic profiling of human lung tissue in the context of COPD. Human lung tissue is highly complex in terms of cell types and can therefore be a source of multiple missing proteins.

1.6. Chromosome 6

1.7. Chromosome 7

1.8. Chromosome 8

1.9. Chromosome 9

1.10. Chromosome 10

1.11. Chromosome 11

1.12. Chromosome 12

1.13. Chromosome 13

Top-Down Approaches

  • Steps
    1. 20~30 missing proteins from liver (normal-like, tumor, and/or cell lines) in all Chr. mapped
    2. 20~30 missing proteins from pancreas (normal-like, tumor, and/or cell lines) in all Chr. mapped
    3. 10~20 missing proteins from placenta (normal, pre-eclampsia) in all Chr. mapped
    • Total missing proteins: 50 in all Chr. mapped
  • Milestone dates:
    • Step 1. Jul. 2017
    • Step 2. Feb. 2018
    • Step 3. Aug. 2018

Bottom-Up Approaches

  • Steps
    1. Select target missing proteins according to criteria (available/target sample, transcript expression level, number of proteotypic peptides, etc.).
    2. Generate MS spectra of synthetic peptides for the target proteins.
    3. Discover and validate missing proteins.
    • Total missing proteins: 50 in all Chr. mapped
  • Milestone dates:
    • Step 1. Feb. 2017
    • Step 2. Feb. 2018
    • Step 3. Aug. 2018
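
The target-selection step of the bottom-up approach can be illustrated with a minimal sketch. Everything here is hypothetical: the `Candidate` fields, the accession labels, the example values, and the thresholds (`min_transcript`, `min_peptides`) are assumptions chosen only to show how the listed criteria might be combined, not the team's actual pipeline.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    accession: str             # placeholder label, not a real database entry
    sample_available: bool     # is a suitable target sample on hand?
    transcript_level: float    # transcript expression in that sample (arbitrary units)
    proteotypic_peptides: int  # predicted number of proteotypic peptides


def select_targets(candidates, min_transcript=1.0, min_peptides=2, limit=50):
    """Keep candidates that meet every criterion, ranked by transcript level."""
    eligible = [
        c for c in candidates
        if c.sample_available
        and c.transcript_level >= min_transcript
        and c.proteotypic_peptides >= min_peptides
    ]
    # Highest expression first, then cap the list at the requested size.
    eligible.sort(key=lambda c: c.transcript_level, reverse=True)
    return eligible[:limit]


# Made-up example values, for illustration only.
candidates = [
    Candidate("MP_A", True, 12.5, 3),
    Candidate("MP_B", True, 0.2, 4),    # fails the transcript threshold
    Candidate("MP_C", False, 30.0, 5),  # no sample available
    Candidate("MP_D", True, 4.0, 2),
]
selected = select_targets(candidates)
print([c.accession for c in selected])
```

Ranking by transcript level is only one possible choice; a real selection could weight the criteria differently.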

1.14. Chromosome 14

1.15. Chromosome 15

1.16. Chromosome 16

Step by step milestone plan to find, identify and validate MPs:

• Prioritization of the 50 chr16 MPs to be targeted. Annotation with Missing Proteinpedia, HPA, Human MRM Atlas, and information already generated for Chr16. February 2017.
• List of chr16 MPs that would require non-standard procedures (non-trypsin digestion sites, lack of unique peptides, etc.). March 2017.
• Preparation of SRM methods and identification of the biological sources with the highest probability of finding the selected chr16 MPs. June 2017.
• SRM analysis, 10 MPs/lab, starting in June-July 2017.
• IVTT initiative (5 chromosomes participating). 5-10 proteins/year. (2 MPs already identified by SRM and 2 additional ones are pending validation.)
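
One way to flag MPs that may need non-standard procedures is an in-silico tryptic digest. The sketch below uses the common simplified trypsin rule (cleave after K or R unless the next residue is P). The 7-30 residue length window, the two-peptide threshold, and the function names are assumptions made for illustration. The sequences are toy strings, not real proteins, and checking peptide uniqueness against the whole proteome is deliberately left out.

```python
def tryptic_peptides(sequence, min_len=7, max_len=30):
    """Digest with the simplified trypsin rule and keep peptides in a length window."""
    peptides, start = [], 0
    for i, residue in enumerate(sequence):
        next_residue = sequence[i + 1] if i + 1 < len(sequence) else ""
        # Cleave C-terminal to K or R, except when followed by P.
        if residue in "KR" and next_residue != "P":
            peptides.append(sequence[start:i + 1])
            start = i + 1
    if start < len(sequence):
        peptides.append(sequence[start:])  # C-terminal remainder
    return [p for p in peptides if min_len <= len(p) <= max_len]


def needs_nonstandard_procedure(sequence, min_peptides=2):
    """Flag a protein that yields too few usable tryptic peptides (assumed threshold)."""
    return len(tryptic_peptides(sequence)) < min_peptides


# Toy sequences for illustration only.
toy_ok = "M" + "A" * 6 + "KP" + "G" * 6 + "R" + "L" * 7 + "K" + "AA"
toy_hard = "MKRKRKAAK"  # dense K/R sites give only very short peptides

print(tryptic_peptides(toy_ok))
print(needs_nonstandard_procedure(toy_ok), needs_nonstandard_procedure(toy_hard))
```

Proteins flagged this way would be candidates for alternative proteases or other non-standard workflows, subject to the team's own criteria.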

1.17. Chromosome 17

1.18. Chromosome 18

1.19. Chromosome 19

1.20. Chromosome 20

1.21. Chromosome 21

1.22. Chromosome 22

1.23. Chromosome X

1.24. Chromosome Y

1.25. Mitochondrial Chromosome