CLC number: TP391
On-line Access: 2024-08-27
Received: 2023-10-17
Revision Accepted: 2024-05-08
OU Shi-yan, KHOO Christopher S.G., GOH Dion H.. Constructing a taxonomy to support multi-document summarization of dissertation abstracts[J]. Journal of Zhejiang University Science A, 2005, 6(11): 1258-1267.
@article{title="Constructing a taxonomy to support multi-document summarization of dissertation abstracts",
author="OU Shi-yan and KHOO Christopher S.G. and GOH Dion H.",
journal="Journal of Zhejiang University Science A",
volume="6",
number="11",
pages="1258-1267",
year="2005",
publisher="Zhejiang University Press & Springer",
doi="10.1631/jzus.2005.A1258"
}
%0 Journal Article
%T Constructing a taxonomy to support multi-document summarization of dissertation abstracts
%A OU Shi-yan
%A KHOO Christopher S.G.
%A GOH Dion H.
%J Journal of Zhejiang University SCIENCE A
%V 6
%N 11
%P 1258-1267
%@ 1673-565X
%D 2005
%I Zhejiang University Press & Springer
%DOI 10.1631/jzus.2005.A1258
TY - JOUR
T1 - Constructing a taxonomy to support multi-document summarization of dissertation abstracts
A1 - OU Shi-yan
A1 - KHOO Christopher S.G.
A1 - GOH Dion H.
JO - Journal of Zhejiang University Science A
VL - 6
IS - 11
SP - 1258
EP - 1267
SN - 1673-565X
Y1 - 2005
PB - Zhejiang University Press & Springer
DO - 10.1631/jzus.2005.A1258
ER -
Abstract: This paper reports part of a study to develop a method for automatic multi-document summarization, currently focused on dissertation abstracts in sociology. The method uses macro-level and micro-level discourse structure to identify important information to extract from dissertation abstracts, and then uses a variable-based framework to integrate and organize the extracted information across abstracts. The framework is hierarchical and centers on the research concepts, and the relationships between them, found in sociology dissertation abstracts. A taxonomy is constructed to support the summarization process in two ways: (1) helping to identify important concepts and relations expressed in the text, and (2) providing a structure for linking similar concepts in different abstracts. The paper describes the variable-based framework and the summarization process, and then reports how the taxonomy was constructed. An example shows how the taxonomy is used to identify important concepts and to integrate concepts extracted from different abstracts.
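The two roles of the taxonomy named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the concept terms, the `PARENT` mapping, and the function names are all hypothetical, and simple substring matching stands in for the discourse-based extraction the paper describes.

```python
from collections import defaultdict

# Hypothetical toy taxonomy: each concept maps to its broader (parent) concept.
# The terms are invented for illustration and are not taken from the paper.
PARENT = {
    "academic achievement": "achievement",
    "school performance": "achievement",
    "achievement": "outcome",
    "job satisfaction": "satisfaction",
    "satisfaction": "outcome",
}


def ancestors(concept):
    """Return the chain of broader concepts, nearest first."""
    chain = []
    while concept in PARENT:
        concept = PARENT[concept]
        chain.append(concept)
    return chain


def identify_concepts(text):
    """Role (1): find taxonomy terms mentioned in a text, keeping the most specific."""
    lowered = text.lower()
    terms = set(PARENT) | set(PARENT.values())
    matched = [t for t in terms if t in lowered]
    # Drop a term when it is only a substring of a longer matched term.
    return sorted(t for t in matched if not any(t != o and t in o for o in matched))


def link_concepts(extracted):
    """Role (2): group concepts from different abstracts under their broader concept."""
    groups = defaultdict(set)
    for abstract_id, concepts in extracted.items():
        for concept in concepts:
            groups[PARENT.get(concept, concept)].add((abstract_id, concept))
    return {parent: sorted(members) for parent, members in groups.items()}


if __name__ == "__main__":
    extracted = {
        "A1": identify_concepts("Parenting style and academic achievement"),
        "A2": identify_concepts("Peer effects on school performance"),
    }
    print(link_concepts(extracted))
```

Here two abstracts that use different terms end up under the same broader concept, which is the kind of cross-document linking a hierarchical taxonomy makes possible.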