Tertiary structure prediction of bromelain from Ananas comosus using comparative modelling method

Article history: Received 21 February 2017; received in revised form 2 October 2017; accepted 8 October 2017

Bromelain is a general name for a family of sulfhydryl proteolytic enzymes found in pineapples (Ananas comosus). This study focuses on predicting the three-dimensional (3D) structure of stem bromelain by comparative modelling. The amino acid sequence of bromelain was obtained from the NCBI database and used as a query to search for proteins with known 3D structures related to the target sequence. A suitable template was chosen based on >30% sequence similarity and the lowest e-value. Based on these criteria, 1YAL was selected as the best template, with 55% sequence similarity and an e-value of 9 × 10^-52.


Introduction
Bromelain can be derived from the stem and fruit of pineapples. Stem bromelain (EC 3.4.22.32), ananain (EC 3.4.22.31) and comosain were isolated from the stem juice of pineapples by precipitation and centrifugation (Gautam et al., 2010). Bromelain is classified as a protease because it belongs to the group of enzymes that hydrolyse the peptide bonds of proteins. In the presence of oxidizing agents, the sulfhydryl group of cysteine is oxidized, leading to the formation of a disulphide bond. This oxidation alters properties important for substrate binding and structure stabilization, and thus changes the active site, which results in the loss of catalytic activity.
Bromelain also draws considerable attention in industrial applications such as meat tenderization, baking, anti-browning agents, protein hydrolysates, alcohol production, textiles and cosmetics because of its unique properties and high proteolytic activity (Arshad et al., 2014). It has also been widely used in therapeutic applications such as platelet aggregation, fibrinolysis, anti-inflammatory activity, and modulation of cell adhesion and antibodies (Pavan et al., 2012). Given its importance in the industrial and pharmacological areas, it is essential to maintain and improve the conformation and catalytic activity of bromelain in order to preserve its industrial and therapeutic value. Even though researchers have developed many methods to improve the extraction, purification, optimum conditions and specific activity of bromelain for different substrates, the mechanism of this enzyme is still in question because its three-dimensional structure is unknown. Therefore, this study was carried out to predict the three-dimensional (3D) structure of bromelain from pineapple stem by comparative modelling.

Bromelain sequence retrieval
The amino acid sequence of stem bromelain was retrieved from the NCBI protein database [GenBank accession number: ADY68475] (Pruitt et al., 2002). This target sequence has 291 amino acid residues.

Template selection, 3D model development, binding site prediction, and model assessments
The protein Basic Local Alignment Search Tool (BLASTp) (Altschul et al., 1997) was used to search for possible templates, and the best template was selected based on the highest sequence identity. Once a suitable template was identified, pairwise sequence alignment was carried out between the target and template sequences using LALIGN (Huang and Miller, 1991). Next, the 3D model was generated using the comparative modelling software MODELLER (Šali and Blundell, 1993). The model was then evaluated using the Ramachandran plot (Morris et al., 1992), Verify-3D (Luthy et al., 1992), and ERRAT (Colovos and Yeates, 1993). The best model was selected and subjected to analysis of the active sites and binding pockets using COFACTOR (Roy and Zhang, 2012). All graphical representations were prepared using UCSF Chimera (Pettersen et al., 2004) and VMD (Humphrey et al., 1996).
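The template selection rule can be sketched in a few lines of Python. This is a minimal illustration, not the procedure actually scripted in this study: it assumes the BLASTp hits have already been parsed into records, and it applies the >30% cut-off and lowest e-value criteria stated in the abstract. The `Hit` class, the `select_template` function, and every hit other than 1YAL are hypothetical placeholders.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Hit:
    """One parsed BLASTp hit (hypothetical record layout)."""
    pdb_id: str
    identity_pct: float
    e_value: float


def select_template(hits: List[Hit], min_identity: float = 30.0) -> Optional[Hit]:
    """Return the hit above the identity cut-off with the lowest e-value.

    Ties on e-value are broken by higher identity (an assumption).
    """
    eligible = [h for h in hits if h.identity_pct > min_identity]
    if not eligible:
        return None
    return min(eligible, key=lambda h: (h.e_value, -h.identity_pct))


hits = [
    Hit("1YAL", 55.0, 9e-52),  # values reported in this study
    Hit("HYP1", 48.0, 2e-40),  # hypothetical hit, for illustration only
    Hit("HYP2", 25.0, 1e-10),  # hypothetical hit below the 30% cut-off
]

best = select_template(hits)
print(best.pdb_id)
```

With these illustrative inputs, the function returns the 1YAL record, because it passes the cut-off and has the smallest e-value.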

Model development
Modelling the bromelain sequence required a template with a known, experimentally determined crystal structure. The template was identified by running BLASTp against the Protein Data Bank (PDB) (Berman et al., 2000) and Swiss-Prot (Apweiler et al., 2004) databases. From the BLASTp results, chymopapain (PDB ID: 1YAL) (Maes et al., 1996) was chosen as the best template because it had the highest sequence identity, 55%. Fig. 1 shows the pairwise sequence alignment between 1YAL and the target bromelain sequence. As the figure shows, most of the residues from Ala136 to Asp291 were conserved between the two sequences. The target residues Met1-Ser78, Leu81-Ala97, Tyr99-Ile103, Arg105-Asp119, Gln127-Gly135, Cys149 and Asn201 had no corresponding template residues. These missing template residues may lead to improper model development.
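The extent of the unaligned region can be estimated directly from the residue ranges listed above. The short sketch below assumes that each range is inclusive and numbered along the 291-residue target sequence; the variable and function names are illustrative only.

```python
# Target residue ranges reported as having no template residue (inclusive).
TARGET_LENGTH = 291
UNALIGNED_RANGES = [
    (1, 78),     # Met1-Ser78
    (81, 97),    # Leu81-Ala97
    (99, 103),   # Tyr99-Ile103
    (105, 119),  # Arg105-Asp119
    (127, 135),  # Gln127-Gly135
    (149, 149),  # Cys149
    (201, 201),  # Asn201
]


def count_residues(ranges):
    """Count the residues covered by a list of inclusive (start, end) ranges."""
    return sum(end - start + 1 for start, end in ranges)


unaligned = count_residues(UNALIGNED_RANGES)
coverage_pct = 100.0 * (TARGET_LENGTH - unaligned) / TARGET_LENGTH

print(f"{unaligned} of {TARGET_LENGTH} target residues lack a template residue")
print(f"template coverage: {coverage_pct:.1f}%")
```

Under these assumptions, 126 of the 291 target residues (about 43%) have no template residue, which is consistent with the concern that the missing region may affect model quality.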
From the single-template modelling, 100 models were generated using MODELLER. Model 72 (Model72) was chosen as the best model and used for further evaluation. Model72 has seven helices and two strands, as shown in Fig. 2. Helices make up the major part of the model. The helices are concentrated in the middle of the model, while the two strands are located at the C-terminal end. The N-terminal region (Met1 to Ala135) was not properly modelled because it has no template residues.
Model72 was selected because it had the lowest Discrete Optimized Protein Energy (DOPE) score (Shen and Sali, 2006) and the lowest MODELLER objective function (molpdf) value, which were -20019.08 and 1915.35, respectively. The DOPE energy profile is plotted in Fig. 3. The model had more stable energy at residues Val136 to Gly291 because this region had template residues.
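Ranking the candidate models by score can be sketched as follows. This is an illustration under stated assumptions rather than the script used in this study: the scores are assumed to have been collected from the MODELLER output beforehand, DOPE is used as the primary key with molpdf as the tie-breaker (the ordering of the two criteria is an assumption), and the entries other than Model72 are hypothetical.

```python
from typing import List, NamedTuple


class ModelScore(NamedTuple):
    """Scores for one candidate model (hypothetical record layout)."""
    name: str
    dope: float
    molpdf: float


def best_model(scores: List[ModelScore]) -> ModelScore:
    """Pick the model with the lowest DOPE score, then the lowest molpdf."""
    return min(scores, key=lambda s: (s.dope, s.molpdf))


scores = [
    ModelScore("Model72", -20019.08, 1915.35),  # values reported in this study
    ModelScore("ModelA", -19850.00, 2010.00),   # hypothetical, for illustration
    ModelScore("ModelB", -19400.00, 2150.00),   # hypothetical, for illustration
]

chosen = best_model(scores)
print(chosen.name, chosen.dope, chosen.molpdf)
```

Lower values are better for both scores, so a single `min` over the pair is enough when the two criteria agree, as they do for Model72.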

Model assessments
Model72 was superimposed on the 1YAL template, as shown in Fig. 4. Residues Met1-Val119 form a loop in Model72. This loop is independent of the template because it has no template residues, as shown in Fig. 1. Residues Ile120-Asp291 of Model72 (purple) superimposed well on the template structure (blue), and 53% of the residues in this superimposed region were conserved.
Model72 was validated using SAVES (the Structure Analysis and Verification Server), which runs three validation programs: Ramachandran plot, ERRAT, and Verify-3D. The Ramachandran plot analysis (Fig. 5) showed seven residues in the disallowed region: Gly45, Gly76, Gly134, Gly184, Asn225, Ser242, and Gly287. In the ERRAT and Verify-3D analyses, Model72 scored 41.28% and 53.61%, respectively.

Binding site prediction of Model72
Model72 was then used for binding site identification with the COFACTOR program. This program predicts binding sites using two comparative methods: binding-specific substructure comparison and sequence similarity comparison. The best binding pocket was pocket 1, which had a confidence score of 0.70 and a sequence alignment of 51%. The confidence score ranges from 0 to 1, with a higher score indicating a more reliable binding pocket; the best pocket was selected as the one with the highest confidence score and a sequence alignment of more than 30%. The binding site consists of Gln142-Trp149, Asp187-Gly189, Ala255 and Thr279-Leu281, as shown in Fig. 6.
The three catalytic residues (Gln142, Cys148, and His281) were also found to be located in the binding pocket.
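The overlap between the predicted pocket and the catalytic residues can be checked with a small sketch. It assumes that the pocket ranges are inclusive and compares residue numbers only, not residue names; the variable and function names are illustrative.

```python
# Residue ranges reported for binding pocket 1 (inclusive).
POCKET_RANGES = [
    (142, 149),  # Gln142-Trp149
    (187, 189),  # Asp187-Gly189
    (255, 255),  # Ala255
    (279, 281),  # Thr279-Leu281
]
# Positions of the three catalytic residues reported in the text.
CATALYTIC_POSITIONS = [142, 148, 281]


def expand(ranges):
    """Expand inclusive (start, end) ranges into a set of residue numbers."""
    positions = set()
    for start, end in ranges:
        positions.update(range(start, end + 1))
    return positions


pocket = expand(POCKET_RANGES)
outside = [p for p in CATALYTIC_POSITIONS if p not in pocket]

print(f"pocket size: {len(pocket)} residues")
print(f"catalytic positions outside the pocket: {outside}")
```

By residue number, all three catalytic positions fall within the 15 residues of the predicted pocket.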

Conclusion
In this study, the 3D structure of bromelain was predicted using single-template modelling. 1YAL was chosen as the template structure because it had the highest sequence identity. The assessments of Model72 showed that this model needs further refinement in the loop region consisting of Met1-Val119. Subsequent studies on the refinement of this model are highly recommended in order to better understand the functional properties of bromelain.