CLC number: Q811.4
On-line Access: 2024-08-27
Received: 2023-10-17
Revision Accepted: 2024-05-08
Crosschecked: 2011-11-16
Xiao-li Xie, Li-fei Zheng, Ying Yu, Li-ping Liang, Man-cai Guo, John Song, Zhi-fa Yuan. Protein sequence analysis based on hydropathy profile of amino acids[J]. Journal of Zhejiang University Science B, 2012, 13(2): 152-158.
@article{Xie2012,
title="Protein sequence analysis based on hydropathy profile of amino acids",
author="Xiao-li Xie and Li-fei Zheng and Ying Yu and Li-ping Liang and Man-cai Guo and John Song and Zhi-fa Yuan",
journal="Journal of Zhejiang University Science B",
volume="13",
number="2",
pages="152--158",
year="2012",
issn="1673-1581",
publisher="Zhejiang University Press \& Springer",
doi="10.1631/jzus.B1100052"
}
Abstract: Biological sequence comparison is a fundamental task in computational biology. Based on the hydropathy profile of amino acids, a protein sequence is rewritten as a string over a three-letter alphabet. Three curves are defined on this reduced sequence to describe the protein sequence. A new method for analyzing the similarity/dissimilarity of protein sequences is then proposed, based on the conditional probability of the protein sequence. Finally, the ND6 (NADH dehydrogenase subunit 6) protein sequences of eight species are taken as an example to illustrate the new approach. The results demonstrate that the method is convenient and efficient.
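The abstract does not spell out the three-letter reduction or the exact similarity measure, so the following is only a minimal sketch under stated assumptions. It assumes a commonly used three-class hydropathy grouping (internal I = F, I, L, M, V; external E = D, E, H, K, N, Q, R; ambivalent A = A, C, G, P, S, T, W, Y), estimates first-order conditional probabilities P(b | a) from adjacent letters, and compares two sequences by the Euclidean distance between their 3x3 probability tables. The grouping, the function names, and the choice of Euclidean distance are illustrative assumptions and may differ from what the paper actually uses.

```python
from collections import Counter
from itertools import product
import math

# Assumed three-class hydropathy grouping (not stated in the abstract):
# I = internal (hydrophobic), E = external (hydrophilic), A = ambivalent.
GROUPS = {
    "I": "FILMV",
    "E": "DEHKNQR",
    "A": "ACGPSTWY",
}
AA_TO_CLASS = {aa: cls for cls, members in GROUPS.items() for aa in members}
ALPHABET = "IEA"


def reduce_sequence(protein: str) -> str:
    """Rewrite a protein sequence over the three-letter alphabet I/E/A.

    Raises KeyError for symbols outside the 20 standard amino acids.
    """
    return "".join(AA_TO_CLASS[aa] for aa in protein.upper())


def conditional_probabilities(reduced: str) -> dict:
    """Estimate P(next letter = b | current letter = a) from adjacent pairs."""
    pair_counts = Counter(zip(reduced, reduced[1:]))
    first_counts = Counter(reduced[:-1])
    return {
        (a, b): (pair_counts[(a, b)] / first_counts[a]) if first_counts[a] else 0.0
        for a, b in product(ALPHABET, repeat=2)
    }


def distance(seq1: str, seq2: str) -> float:
    """Euclidean distance between two 3x3 conditional-probability tables.

    This distance is a stand-in chosen for illustration only.
    """
    p = conditional_probabilities(reduce_sequence(seq1))
    q = conditional_probabilities(reduce_sequence(seq2))
    return math.sqrt(sum((p[k] - q[k]) ** 2 for k in p))
```

Under these assumptions, `reduce_sequence("MKV")` gives `"IEI"`, and a matrix of pairwise `distance` values over a set of sequences could then be passed to any standard clustering or tree-building routine.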