A Two-dimensional Proteomic Profile of Tetrahymena thermophila Whole Cell Lysate

Abstract

Tetrahymena thermophila is a unicellular eukaryotic model organism used for a variety of biochemical, molecular and biological studies. According to its macronucleus genome sequence, it is expected to contain more than 27,000 protein-coding genes, although only a small proportion of them have information published specifically about them. Here, we present a reference map for whole cell lysate of T. thermophila obtained using two-dimensional gel electrophoresis (2-DE) combined with mass spectrometry. Although (2-DE) is one of the most efficient techniques for resolving complex protein mixtures and revealing the relative high-abundance proteins, it has not yet been applied generally to ciliates. In order to obtain qualitative protein samples for analysis, an appropriate homogenization method is required. Optimization of the homogenization method led to the analysis of nearly 4500 protein spots, the final identification of 375 different proteins using Mascot software and an additional 258 gene products using a newly developed web service, called Peptide Finder, resulting in a total of 631 different gene products that are considered to constitute the proteomic profile of the whole cell lysate of T. thermophila.

MALDI-TOF
proteome
proteomics
Tetrahymena thermophila
TRI reagent
2-D electrophoresis
Peptide Finder

Tetrahymena thermophila is a unicellular eukaryotic organism that belongs to the division of protists. It is a non-pathogenic, aerobian that possesses characteristic structures, such as cilia. Due to its structural and functional complexity, as well as the fact that its genome preserves many primary biochemical procedures of eukaryotes (1), T. thermophila is considered to be a valuable model organism for genetic and biological studies (2, 3). Its use has led to the discovery of important biomolecules such as catalytic RNA, telomere, telomerase and the cell motor dynein, as well as elucidation of the role of histone acetylation in gene expression (4). Lately, T. thermophila has been widely used for enzyme purification and in bioreactors (5-8).

Like many ciliates, T. thermophila possesses two types of nuclei, distinct in their function (9). The micronucleus (MIC) contains five pairs of mostly transcriptionally silent chromosomes, while the macronucleus (MAC) determines the phenotype of the cell. The genome of the latter has only recently been annotated and found to consist of 225 transcriptionally active chromosomes with a total size of 104 Mb; 27,000 protein-encoding sequences have been identified (10). The mitochondrial proteome of T. thermophila was recently published (11), while from previous studies, data about its ciliome (12), its phagosomic proteins (13), basal body proteins (14) and four novel β/γ crystalline domain proteins from a subcellular fraction, enriched in granules (15) were made available. Out of the 27,000 predicted protein-encoding sequences, approximately 15,000 have strong BLAST matches to known or predicted genes from other organisms. However, only 53 proteins are annotated in the protein database of UniProtKB/Swiss-Prot and 17,715 in UniProtKB/TrEMBL. In the present work, we applied proteomic technologies coupled with mass spectrometry and bioinformatics approaches in order to characterize the proteome of T. thermophila and construct its reference map.

Materials and Methods

Cell cultures. T. thermophila was cultured at 25°C under aerobic conditions in a medium consisting of 2% (w/v) proteose-peptone, 0.5% (w/v) sucrose, 0.2% (w/v) yeast extract and 1% (v/v) Fe²⁺ -9 mM EDTA (pH 5.5) (16). At the end of the logarithmic phase (~72 h), cells were harvested, centrifuged and the cell pellet was washed using 0.9% NaCl.

Figure 1.

Peptide map of total protein extract from T. thermophila. The first dimensional separation was conducted in a 3-10 immobilized pI gradient strip. The identified proteins are annotated by their accession numbers as listed in Table I.

Homogenization. Protein extraction was achieved using TRI-Reagent (17), an organic solution of phenol and guanidine thiocyanate that facilitates protein extraction with the simultaneous removal of DNA and RNA content. DNA and RNA are endogenous charged particles that enclose proteins and inhibit them from being separated during two-dimensional gel electrophoresis (2-DE). The use of TRI Reagent not only facilitates cell homogenization due to its organic compounds, but enriches the sample's protein content as it removes inhibiting molecules, giving clear, highly reproducible gel images and identifications.

Cells from cell cultures were homogenized in TRI Reagent as recommended by the manufacturer (Ambion/Applied Biosystems, Austin, TX, USA). A total of 10×10⁶ cells were homogenized in 1 ml TRI reagent and incubated for 5 min at room temperature. The solution was centrifuged at 12000×g for 5 min to remove undiluted material and 200 μl chloroform were added to the supernatant. After incubation for 15 min and centrifugation at 12000×g for 10 min, the RNA phase was removed and 300 μl ethanol were added to the residual phenol phase and interface to sediment the DNA. After 2-3 min, the solution was then centrifuged at 2000×g for 5 min. Three volumes of acetone were added to the supernatant, incubated for 10 min at room temperature and centrifuged at 12000×g for 10 min.

The protein pellet was washed with 1 ml of protein Wash 1 buffer (300 mM guanidine hydrochloride in 95% ethanol, 2.5% glycerol (v/v)) and incubated for 10 min at room temperature, centrifuged at 8000×g for 5 min, washed twice more with 1 ml of Wash 1 buffer and then washed with Protein Wash 2 (ethanol containing 2.5% glycerol (v/v)) for 10 min.

Two-dimensional electrophoresis (2-DE). Protein pellets were pooled and resuspended in urea lysis buffer (20 mM Tris, 7 M urea, 2 M thiourea, 4% 3-[(3-cholamidopropyl)dimethylammonio]-1-propanesulfonate (CHAPS), 10 mM 1,4-dithioerythritol, 1 mM EDTA) and a mixture of protease inhibitors [1 mM PMSF and 1 tablet complete™ (Roche Diagnostics, Basel, Switzerland), per 50 ml of suspension buffer] and phosphatase inhibitors (0.2 mM Na₂VO₃ and 1 mM NaF). The protein content in the supernatant was determined by applying the Bradford method (18) using BIO-RAD protein assay (BIO-RAD Laboratories, Hercules, CA, USA). 2-D Gel electrophoresis was performed as previously reported (19). Samples of 1.0 mg total protein were applied on immobilized 3-10 pI, 4-7 pI and 6.3-8.3 pI non-linear gradient strips (17 cm) at their basic and acidic ends. Focusing for the former two strips started at 250 V for 30 min and the voltage was gradually increased to 5000 V at 3 V/min and remained constant for an additional 16 h. For pI 6.3-8.3, strip rehydration lasted only 6 hours and focusing started at 250 V for 30 min and the voltage was gradually increased to 8000 V at 3 V/min and remained constant for an additional 15 h.

The second-dimensional separation was performed in 12% SDS-polyacrylamide gels (180×200×1.5 mm), running at 40 mA per gel in a PROTEAN apparatus (BIO-RAD). After fixation with 50% methanol containing 10% acetic acid for 2 h, the gels were stained overnight with colloidal Coomassie blue (Novex, San Diego, CA, USA), washed twice with water and scanned in a densitometer (GS-800 Calibrated Densitometer; BIO-RAD).

Peptide mass fingerprint and post source decay. Peptide analysis and protein identification were performed as previously described (20). Spots were detected using Melanie 4.02 software (GeneBio, Geneva Swiss) on the Coomassie blue-stained gel and the spots were then excised by Proteineer SPII (Bruker Daltonics, Bremen, Germany), destained with 30% acetonitrile in 50 mM ammonium bicarbonate and dried in a speed vacuum concentrator (MaxiDry Plus; Heto, Allered, Denmark). Each dried gel piece was rehydrated with 5 μl of 1 mM ammonium bicarbonate, containing 50 ng trypsin (Roche Diagnostics) and left for 16 h at room temperature. A total of 10 μl of 50% acetonitrile, containing 0.3% trifluoroacetic acid, were added to each gel piece and incubated for 20 min with constant shaking. A peptide mixture (1.5 μl) was simultaneously applied with 1 μl of matrix solution, consisting of 0.025% α-cyano-4-hydroxycinnamic acid, CHCA (Sigma-Aldrich, St. Louis, MO, USA) and the internal standard peptides [Des-Arg]-bradykinin (904.4681 Da; Sigma) and adrenocorticotropic hormone fragment 18-39 (2465.1989 Da; Sigma) in 65% ethanol, 35% acetonitrile and 0.03% trifluoroacetic acid.

Samples were analyzed for peptide mass fingerprint (PMF) with matrix assisted laser desorption ionization – Mass Spectrometry (MALDI-MS) in a time-of-flight mass spectrometer (Ultraflex II MALDI-TOF-TOF MS/MS; Bruker Daltonics, Bremen, Gernamy). Peptide matching and protein searches were performed automatically, as described by Berndt et al. (20). Each spectrum was interpreted by the Mascot Software (Matrix Sciences Ltd., London, UK) and Peptide Finder. For peptide identification, the monoisotopic masses were used and a mass tolerance of 0.0025% was allowed. Unmatched peptides or peptides with up to one miscleavage site were not considered. The peptide masses were compared with the theoretical peptide masses of all available proteins from all species using Swiss-Prot and TrEMBL databases. The probability score identified by the software was used as the criterion of the identification (http:/www.matrixscience.com). Samples not identified by PMF (probability significance of p<0.05) were automatically selected for post-source decay (PSD) MS-MS analysis with MALDI-MS-MS in the same spectrometer. The peptide masses chosen for PSD-MS-MS analysis had a signal intensity of >600 counts and were excluded from the trypsin autodigest, matrix and keratin peaks. The resulting PSD spectra were also interpreted by the Mascot Software and Mascot probability-based scores of p<0.02 were considered significant.

Figure 2.

Functional classification of identified proteins from T. thermophila. Proteins from Table I and II were classified into functional groups according to the UniProtKB/TrEMBL ontologies.

The identified proteins were automatically annotated on the gel image by ProteinScape software (Bruker Daltonics). Obtained spectra did not result in reliable significance in protein identification through the Mascot software, re-evaluated with a newly developed web service called Peptide Finder (21). The search was carried out through a database of completely digested T. thermophila proteins (http://bioserver-1.bioacademy.gr/Bioserver/PeptideFinder/). For the identification, complete peak lists with measured molecular weights obtained by the MALDI-MS analysis were uploaded to the software and mapping to the proteins of the database requested. Additionally, various filters are available in order to obtain a more refined list of peptides and proteins with the requested molecular weights. Peptide Finder Score is an index used to comparatively rank the candidate proteins based on the statistics of the molecular weight distributions. It suggests that a good identification should be based on the number of molecular mass matches, the frequency they present inside the Swiss-Prot database for the species (i.e. their randomness index) and the molecular mass (indicating the size) of the suggested protein.

Results

The objective of the present study was to characterize the relevant proteins of high and medium abundance of T. thermophila homogenate. For this reason, we employed 2-D gel electrophoresis technique. The homogenates were analysed in three different pH ranges (3-10, 4-7, 6.3-8.3) and 4,457 spots were totally detected by the Melanie 4.02 software. These spots were excised from the gels and analyzed for protein identification following in-gel digestion with trypsin. Each spot was analyzed for PMF with MALDI-MS in a time-of-flight mass spectrometer and proteins were identified automatically by the peptide mass matching. Proteins not identified by PMF were subsequently selected for PSD-MS-MS and analyzed with MALDI-MS-MS. Using an internal peptide standard to correct the measured peptide masses, we were able to use very narrow windows of mass tolerance (0.0025%) and hence, increase the confidence of identification, as well as the total identification rate up to 85%. This procedure resulted in the identification of 375 different gene products (Figure 1). The abbreviated and full names of the proteins, the theoretical MW as well as data from the MS analysis, such as the probability that the identification is a random event (Score), and the coverage of the protein by the identified peptides are listed in Table I. The protein identification data are available at the PRIDE database (http://www.ebi.ac.uk/pride/, experiment #3700). It should be noted that many of the identified proteins in these gels were represented by more than one spot with the same apparent molecular mass, but with different pI values, possibly due to post-translational modifications or isoenzyme variations.

View this table:

Table I.

Proteins from T. thermophila were extracted and separated by 2-D gel electrophoresis as described in the Materials and Methods. Proteins were identified by MALDI-MS and MS/SM, following in-gel digestion with trypsin. The proteins identified by Mascot Software are designated by their Swiss-Prot accession numbers and their full names. The theoretical MW as well as the protein amino acid sequence coverage by the matching peptides are given as an indication of the confidence of the identification. Mascot Score is −10 log(P), where P is the probability that the observed match is a random event.

View this table:

Table II.

Proteins from T. thermophila were extracted and separated by 2-D gel electrophoresis as described in the Materials and Methods. Proteins were identified by MALDI-MS and MS/SM, following in-gel digestion with trypsin. The proteins identified by the PeptideFinder software are designated by their SWISS-PROT accession numbers and their full names. The theoretical MW as well as the protein amino acid sequence coverage by the matching peptides are given as an indication of the confidence of the identification. PeptideFinder score is an index used to comparatively rank the candidate proteins based on the statistics of the molecular weight distribution.

Mass spectra that did not result in protein identification by the above procedure were processed for identification by the Peptide Finder software (21). By that procedure, we were able to identify 258 additional gene products (Table II), resulting in a total of 631 proteins that are considered to make up the proteome of T. thermophila.

Discussion

T. thermophila is one of the most studied model organisms in biology. Although numerous studies concerning specific proteins or subsets of proteins after fractionation exist, to date nothing has been published regarding its whole proteome. This is mainly due to the fact that T. thermophila has a very large number of protein-encoding sequences to be analysed all at once, with a single technique. Each technique has its limitations and requires compatibility of the homogenization and identification methods. In this study, we were able to identify a significant number of proteins, but only after proper optimization of the homogenization method, namely, by the removal of nucleic acids with the use of Tri-Reagent for the protein isolation instead of the conventional method (urea-thiourea) (data not shown).

Out of the 631 identified proteins, 8 have already been annotated in the Swiss Prot database, indicating that the remaining 623 proteins are as yet considered to be hypothetical. This is because Swiss Prot is a protein database that contains only proteins actually identified and studied, while TrEMBL is generated by computer translation of the genetic information from the EMBL Nucleotide Sequence Database. As expected, the latter procedure is subject to errors, so proteins predicted by the TrEMBL database have a risk of being hypothetical. The identification of these proteins and thus, the confirmation of a significant number of proteins found in the TrEMBL database all at once, was a major achievement.

A total of 92% of the identified proteins are correlated to the genome of T. thermophila, while the remaining 8% show high homology with proteins of other protozoa such as Stylonychia, Gastrostyla, Frontonia, Strombidinopsis, Toxoplasma, Plasmodium, Paramecium, Cryptosporidium, Theileria, Favella, and Chilodonella.

A brief comparison between our work and other studies, revealed only few similarities. Out of the 631 identified proteins, only 11 are common to those published for its ciliome (12) and 34 proteins are found in both our results and those for the mitochondrial proteome (11). As to be expected, this is a result of the different preparation and identification techniques used and mainly due to the fact that we chose to analyze the entire homogenate of the organism.

Nearly half of these identified proteins have unknown functions, while of the rest 47.5% are enzymes (lyases, ligases, isomerases, proteases, hydrolases, etc.) participating in various metabolic reactions, 22.4% are proteins that bind other molecules and ions for their action, such as ATP-, GTP-RNA and calcium-binding proteins (Figure 2). A total of 13.8% correspond to structural proteins, such as actin, tropomyosin, centrin, dynein and Granule lattice proteins, 4.4% were regulatory proteins, such as 14-3-3, AhpC/TSA and N-ethylmaleimide-sensitive factor and 3.8% are ribosomal proteins involved in protein synthesis. Another 3.8% are proteins chaperones, such as heat-shock proteins 90, 82 and DNAK, and 3.2% are translational proteins, such as translation elongation factor ef-1. Finally, 1.2% of the identified proteins are involved in signal transduction pathways, as the Mov34/MPN/PAD-1 and Ras protein families.

The identified proteins are considered to be the most abundant (22), but this number may easily seem small for the entire homogenate of such a complicated organism. However, since no further steps have been taken in proteomics for the analysis of total cell extract of T. thermophila and since protein databases are continuously altered, as they are based on the translation of the mRNA without taking into account the alterations that may occur before the final product of protein is reached, their addition to the list of proteins found to be expressed from T. thermophila is a notable contribution to the study of this organism.

The obtained results indicated that even though most of the identified proteins are essential for several functions of this protozoon and participate in various metabolic reactions, there is still a significant percentage corresponding to proteins of unknown function. The future aim is to elucidate their role or even suggest a possible connection with the existing functions. This indicates that Tetrahymena is an organism to be studied for many years and should still be a reliable model organism for biochemical and biological studies.

Received October 14, 2009.
Revision received April 14, 2010.
Accepted April 23, 2010.

References

1. Orias E
: Mapping the germ-line and somatic genomes of a ciliated protozoan, Tetrahymena thermophila. Genome Res 8(2): 91-99, 1998.
1. Turkewitz AP
: Out with a bang! Tetrahymena as a model system to study secretory granule biogenesis. Traffic 5: 63-68, 2004.
1. Wheatley DN,
2. Rasmussen L,
3. Tiedtke A
: Tetrahymena: a model for growth, cell cycle and nutritional studies, with biotechnological potential. Bioessays 16(5): 367-372, 1994.
1. Brownell JE,
2. Zhou J,
3. Ranalli T,
4. Kobayashi R,
5. Edmondson DG,
6. Roth SY,
7. Allis CD
: Tetrahymena histone acetyltransferase A: A homolog to yeast Gcn5p linking histone acetylation to gene activation. Cell 84: 843-851, 1996.
1. Noseda DG,
2. Gentili HG,
3. Nani ML,
4. Nusblat A,
5. Tiedtke A,
6. Florin-Christensen J,
7. Nudel CB
: A bioreactor model system specifically designed for Tetrahymena growth and cholesterol removal from milk. Appl Microbiol Biotechnol 75: 515-520, 2007.
1. Pauli W,
2. Berger S
: Toxicological comparisons of Tetrahymena species, end points and growth media: supplementary investigations to the pilot ring test. Chemosphere 35: 1043-1052, 1997.
1. Valcarce G,
2. Munoz L,
3. Nusblat A,
4. Nudel C,
5. Florin-Christensen J
: The improvement of milk by cultivation with ciliates. J Dairy Sci 84: 2136-2143s, 2001.
1. Weide T,
2. Herrmann L,
3. Bockau U,
4. Niebur N,
5. Aldag I,
6. Laroy W,
7. Contreras R,
8. Tiedtke A,
9. Hartmann MW
: Secretion of functional human enzymes by Tetrahymena thermophila. BMC Biotechnol 6: 19, 2006.
1. Hill DL
: The Biochemistry and Physiology of Tetrahymena. Academic Press., New York and London, 1972.
1. Eisen JA,
2. Coyne RS,
3. Wu M,
4. Wu D,
5. Thiagarajan M,
6. Wortman JR,
7. Badger JH,
8. Ren Q,
9. Amedeo P,
10. Jones KM,
11. Tallon LJ,
12. Delcher AL,
13. Salzberg SL,
14. Silva JC,
15. Haas BJ,
16. Majoros WH,
17. Farzad M,
18. Carlton JM,
19. Smith RK Jr,
20. Garg J,
21. Pearlman RE,
22. Karrer KM,
23. Sun L,
24. Manning G,
25. Elde NC,
26. Turkewitz AP,
27. Asai DJ,
28. Wilkes DE,
29. Wang Y,
30. Cai H,
31. Collins K,
32. Stewart BA,
33. Lee SR,
34. Wilamowska K,
35. Weinberg Z,
36. Ruzzo WL,
37. Wloga D,
38. Gaertig J,
39. Frankel J,
40. Tsao CC,
41. Gorovsky MA,
42. Keeling PJ,
43. Waller RF,
44. Patron NJ,
45. Cherry JM,
46. Stover NA,
47. Krieger CJ,
48. del Toro C,
49. Ryder HF,
50. Williamson SC,
51. Barbeau RA,
52. Hamilton EP,
53. Orias E
: Macronuclear genome sequence of the ciliate Tetrahymena thermophila, a model eukaryote. PLoS Biol 4: e286, 2006.
1. Smith DG,
2. Gawryluk RM,
3. Spencer DF,
4. Pearlman RE,
5. Siu KW,
6. Gray MW
: Exploring the mitochondrial proteome of the ciliate protozoon Tetrahymena thermophila: direct analysis by tandem mass spectrometry. J Mol Biol 374: 837-863, 2007.
1. Smith JC,
2. Northey JG,
3. Garg J,
4. Pearlman RE,
5. Siu KW
: Robust method for proteome analysis by MS/MS using an entire translated genome: demonstration on the ciliome of Tetrahymena thermophila. J Proteome Res 4: 909-919, 2005.
1. Jacobs ME,
2. DeSouza LV,
3. Samaranayake H,
4. Pearlman RE,
5. Siu KW,
6. Klobutcher L A
: The Tetrahymena thermophila phagosome proteome. Eukaryotic cell 5: 1990-2000, 2006.
1. Kilburn CL,
2. Pearson CG,
3. Romijn EP,
4. Meehl JB,
5. Giddings TH Jr,
6. Culver BP,
7. Yates JR 3rd,
8. Winey M
: New Tetrahymena basal body protein components identify basal body domain structure. J Cell Biol 178(6): 905-912, 2007.
1. Bowman GR,
2. Smith DG,
3. Michael Siu KW,
4. Pearlman RE,
5. Turkewitz AP
: Genomic and proteomic evidence for a second family of dense core granule cargo proteins in Tetrahymena thermophila. J Eukaryot Microbiol 52: 291-297, 2005.
1. Karava V,
2. Zafiriou PM,
3. Fasia L,
4. Anagnostopoulos D,
5. Boutou E,
6. Vorgias CE,
7. Maccarrone M,
8. Siafaka-Kapadai A
: Anandamide metabolism by Tetrahymena pyriformis in vitro. Characterization and identification of a 66 kDa fatty acid amidohydrolase. Biochimie 87: 967-974, 2005.
1. Lee FW,
2. Lo SC
: The use of Trizol reagent (phenol/guanidine isothiocyanate) for producing high quality two-dimensional gel electrophoretograms (2-DE) of dinoflagellates. J Microbiol Methods 73: 26-32, 2008.
1. Bradford MM
: A rapid and sensitive method for the quantitation of microgram quantities of protein utilizing the principle of protein-dye binding. Anal Biochem 72: 248-254, 1976.
1. Fountoulakis M,
2. Tsangaris G,
3. Oh JE,
4. Maris A,
5. Lubec G
: Protein profile of the HeLa cell line. J Chromatogr A 1038: 247-265, 2004.
1. Berndt P,
2. Hobohm U,
3. Langen H
: Reliable automatic protein identification from matrix-assisted laser desorption/ionization mass spectrometric peptide fingerprints. Electrophoresis 20: 3521-3526, 1999.
1. Alexandridou A,
2. Tsangaris GT,
3. Vougas K,
4. Nikita K,
5. Spyrou G
: Peptide Finder: mapping measured molecular masses to peptides and proteins. Bioinformatics 24: 2267-2269, 2008.
1. Maillet I,
2. Bernt P,
3. Malo C,
4. Rodriguez S,
5. Brunisholz RA,
6. Pragai Z,
7. Arnold S,
8. Langen H,
9. Wyss M
: From the genome sequence to the proteome and back: evaluation of E. coli genome annotation with a 2-D gel-based proteomics approach. Proteomics 7: 1097-1106, 2007.

Main menu

User menu

Search

A Two-dimensional Proteomic Profile of Tetrahymena thermophila Whole Cell Lysate

Abstract

Materials and Methods

Results

Discussion

References