  • Analysis
A resource for cell line authentication, annotation and quality control

Abstract

Cell line misidentification, contamination and poor annotation affect scientific reproducibility. Here we outline simple measures to detect or avoid cross-contamination, present a framework for cell line annotation linked to short tandem repeat and single nucleotide polymorphism profiles, and provide a catalogue of synonymous cell lines. This resource will enable our community to eradicate the use of misidentified lines and generate credible cell-based data.

Figure 1: Inconsistencies in cell line nomenclature.
Figure 2: Analysis of STR and SNP fingerprinting of cell lines.
Figure 3

Accession codes

BioProject: PRJNA271020

Acknowledgements

We thank S. Ghosh for bioinformatics support; E. Hall and Y. Reid (ATCC) for their intellectual input and expertise in genetic testing; M. Kline for supplying STR profiles; and J. Settleman and D. Stokoe for discussions.

Author information

Contributions

This collection of authenticated cell line data will be made available through NCBI’s BioProject and BioSample databases, accessible through accession number PRJNA271020, for continued community development and refinement. R.M.N. conceived and supervised the study; M.Y., S.K.S., M.M.Y.L.-C. and G.L. were responsible for cell line banking, experimentation and data collection; S.A., M.B., J.Y., C.K., R.B. and J.S.K. performed data curation and wrote the code for SNP and STR analyses; R.M.N., M.Y., S.K.S., M.M.Y.L.-C., M.B. and F.P. performed manual curation of cell line nomenclature and associated data. All authors discussed the results and commented on the manuscript.

Corresponding author

Correspondence to Richard M. Neve.

Ethics declarations

Competing interests

The majority of authors are employees of Genentech Inc. and/or hold stock in Roche.

Extended data figures and tables

Extended Data Figure 1 Comparison of STR and SNP genotyping assays.

a, Comparison of STR and SNP frequency distributions of pairwise identity alignment scores for 836 lines. Identity scores are computed using the Tanabe algorithm for both 8-locus STR and 48-locus SNP genotype results (compare with Fig. 2a). Total number of comparisons was 349,030 (348,953 non-synonymous and 77 synonymous pairs of cell lines). For plotting purposes, a random subset of 25,000 non-synonymous pairs is displayed. As a consequence of using fewer STR loci, the non-synonymous STR standard deviation increased from 0.083 to 0.113, and more truly synonymous pairs now fall below the mean-plus-4-s.d. cutoff. b, Univariate distribution of SNP Tanabe identity scores for the data shown in Fig. 2. Results for 2,862 replicate pairs are shown as black dots. (Synonymous pairs are included in the density computation, but are so rare compared to non-synonymous pairs that they make no visible change in the plotted curve.) The vertical scale is such that the total area under the curve is 1 unit. Reference lines were computed using non-synonymous pairs only. c, As for b, but showing 16-locus STR identity scores. True replicate pairs are shown in black; pairwise identity scores for a set of seven HeLa-derived lines (which are closely related genetically, but do not constitute true replicates) are shown in red. A mean + 4 s.d. reference line, corresponding to a P value of 3.2 × 10⁻⁵, is shown for both graphs. Note that the reference line is better separated from true replicate results for STR data than for SNP data.
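The quantities in this legend can be checked with a short sketch. It assumes the commonly used form of the Tanabe score (twice the number of shared alleles divided by the total number of alleles in both profiles), restricts the comparison to loci typed in both profiles, and treats the identity scores as approximately normal when converting the mean-plus-4-s.d. cutoff into a one-sided P value. The function names and the toy allele values are hypothetical and are not taken from the paper's code or data.

```python
import math


def tanabe_score(profile_a, profile_b):
    """Tanabe identity: 2 * shared alleles / (alleles in A + alleles in B).

    Profiles map locus name -> iterable of allele labels. Only loci present
    in both profiles are compared (an assumption of this sketch).
    """
    shared = 0
    total = 0
    for locus in profile_a.keys() & profile_b.keys():
        alleles_a = set(profile_a[locus])
        alleles_b = set(profile_b[locus])
        shared += len(alleles_a & alleles_b)
        total += len(alleles_a) + len(alleles_b)
    if total == 0:
        raise ValueError("profiles have no loci in common")
    return 2 * shared / total


def n_pairs(n_lines):
    """Number of unordered pairwise comparisons among n_lines cell lines."""
    return math.comb(n_lines, 2)


def upper_tail_p(z):
    """One-sided normal tail probability P(Z > z)."""
    return 0.5 * math.erfc(z / math.sqrt(2))


# Made-up STR profiles for illustration only.
line_a = {"D3S1358": ("15", "17"), "TH01": ("6", "9.3"), "D21S11": ("29",)}
line_b = {"D3S1358": ("15", "17"), "TH01": ("6", "7"), "D21S11": ("29", "30")}

print(tanabe_score(line_a, line_b))  # 8/11, about 0.727
print(n_pairs(836))                  # 349030, matching the legend
print(upper_tail_p(4.0))             # about 3.2e-05, matching the legend
```

Under these assumptions, a pair of profiles would be flagged as a candidate match when its score exceeds the mean of the non-synonymous distribution plus four standard deviations.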

Extended Data Figure 2 Impact of changing the confidence threshold on detecting cell line contamination by SNP profiling.

a, SNP detection using the Fluidigm system was performed on DNA extracted from differing ratios of AU565:Panc 08.13 cells. The raw data were analysed using confidence thresholds of 65 (Th65), 85 (Th85), 90 (Th90) and 95 (Th95). Examples of data are shown for Th65 and Th95. For each SNP, XX, XY and YY allele calls are represented by green, blue and red, respectively, and no calls are in grey. b, Table showing percent identity when SNP calls were compared with the database of SNPs. As the confidence threshold increased, a lower level of contamination could be detected, as evidenced by decreased correlation values. Ratios depict the relative abundance of AU565:Panc 08.13 cells (for example, 98:2 = 98% AU565 mixed with 2% Panc 08.13). Data are representative of at least two independent experiments.
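The effect of the confidence threshold can be illustrated with a minimal sketch. The legend does not state how no calls enter the percent identity, so the code below counts a no call as a non-match; this convention, the function names, the SNP identifiers and the confidence values are all assumptions made for illustration and do not reproduce the Fluidigm software or the authors' analysis.

```python
def apply_threshold(calls, threshold):
    """Turn calls whose confidence is below the threshold into no calls (None).

    calls maps SNP id -> (genotype, confidence on a 0-100 scale).
    """
    return {
        snp: (genotype if confidence >= threshold else None)
        for snp, (genotype, confidence) in calls.items()
    }


def percent_identity(sample_calls, reference):
    """Percent of reference SNPs whose genotype is reproduced in the sample.

    A no call (None) or a missing SNP counts as a non-match (assumption).
    """
    if not reference:
        raise ValueError("reference profile is empty")
    matches = sum(
        1 for snp, ref_genotype in reference.items()
        if sample_calls.get(snp) == ref_genotype
    )
    return 100.0 * matches / len(reference)


# Hypothetical raw calls from a mixed sample and the pure reference profile.
raw_calls = {
    "snp_a": ("XX", 99.0),
    "snp_b": ("XY", 80.0),
    "snp_c": ("YY", 97.0),
    "snp_d": ("XY", 70.0),
}
reference = {"snp_a": "XX", "snp_b": "XX", "snp_c": "YY", "snp_d": "XY"}

for threshold in (65, 85, 90, 95):
    called = apply_threshold(raw_calls, threshold)
    print(threshold, percent_identity(called, reference))
```

With this toy input the identity falls as the threshold rises, because low-confidence calls from the mixed sample are converted to no calls and no longer match the reference.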

Extended Data Figure 3 Electropherograms and table of results for STR profiling of DNA extracted from differing ratios of AU565:Panc 08.13 cells.

STRs were determined (see Methods) for DNA extracted from differing ratios of AU565:Panc 08.13 cells. a, Example electropherograms for five (D3S1358, THO1, D21S11, D18S51 and Penta E) of the 16 STR markers are shown. Ratios depict the relative abundance of AU565:Panc 08.13 cells (for example, 98:2 = 98% AU565 mixed with 2% Panc 08.13). Data are representative of at least two independent experiments. b, Table showing STR calls for all STR loci and the top matches when compared to the database of STR calls (Supplementary Table 3).

Extended Data Figure 4 Detection of cross-species contamination.

a, Images of early (p4) and later (p8) passage CoCM-1 cells in culture showing a subpopulation of small, round, loosely attached cells overwhelming the culture over time. b, c, PCR-based detection of human (left panel) and mouse (right panel) cytochrome c oxidase I (COX1) in cell lines (b) and in titrated mixtures of human (MOLT4) and mouse (STV2) cell lines (c) to determine the limit of detection. 18S, PCR loading control. d, Flow cytometric analysis of mouse and human CD29 staining in the contaminated CoCM-1 cell line. Data are representative of at least two independent experiments.

Supplementary information

Supplementary Information

This file contains the legends for Supplementary Tables 1-14. (PDF 93 kb)

Supplementary Tables

The 2 files in this zipped file contain Supplementary Tables 1-14 (see Supplementary Information file for details). (ZIP 30178 kb)

About this article

Cite this article

Yu, M., Selvaraj, S., Liang-Chu, M. et al. A resource for cell line authentication, annotation and quality control. Nature 520, 307–311 (2015). https://doi.org/10.1038/nature14397

  • DOI: https://doi.org/10.1038/nature14397
