E-HowNet
ConceptNet
ConceptNet is an ontological system for representing and processing lexical knowledge and common sense knowledge. It comprises two major components: a knowledge representation model and the knowledge itself. The knowledge representation model is a frame-based entity-relation model called Extended-HowNet (E-HowNet), which provides a universal representational framework for lexical knowledge with meaning composition and decomposition capabilities. The knowledge encoded in ConceptNet includes lexical concepts and common sense knowledge.
E-HowNet
In 1988, Dong ZhenDong proposed several important principles for building a knowledge base. He suggested that all concepts should be defined by a closed set of primitives and features, which differs from organizing concepts into synonym sets, as WordNet does. This approach provides richer information for natural language processing and is more flexible for generating new concepts. Dong used over 2,000 primitives and over 200 event roles and features to describe general concepts and map the relations among them. The following is an example.
In 2003, the Sinica CKIP group and Professor Dong began a cooperative project to build a HowNet for traditional Chinese, called HowNet-Big5. We adopted the HowNet-based meaning representation mechanism to define the word meanings of over 90,000 lexical entries in the CKIP Chinese Lexical Knowledge Base. However, we had to change the HowNet definition structure to give the model semantic composition and decomposition capabilities. We introduced multi-level meaning representation and the description of complex relations. Instead of primitives alone, basic concepts are used as the elements for defining complex concepts. As a result, HowNet-Big5 evolved into a new knowledge representation model, called Extended-HowNet. For more details, please read the article “多層次概念定義與複雜關係表達—繁體字知網的新增架構” [Multi-level concept definition and complex relation representation: the extended architecture of traditional Chinese HowNet] (Chen et al. 2005).
The advantages of Extended-HowNet are as follows:
- Extended-HowNet represents concepts more accurately because the definition vocabulary is not restricted to a closed set of primitives; any well-defined concept can be used to define a new concept.
- It accords with human cognition models. Since complex concepts are defined by basic concepts, the definitions are easier to understand.
- It supports a canonical representation: if a lexical concept is defined differently by different annotators, the higher-level definitions can still be converted into the same or similar primitives.
- The conversion of primitives into WordNet synsets makes the representation language-independent.
- The definitions in Extended-HowNet are designed for easy semantic composition, decomposition, and translation into natural language expressions.
- Complex definitions make it possible to define kinship relations, temporal and spatial concepts, comparative notions, etc.
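As a rough illustration of how a higher-level definition can be reduced to primitives, the following Python sketch substitutes basic concepts with their own definitions until nothing changes. The lexicon, the `expand` function, and the regular expression are assumptions made for this example and are not part of any E-HowNet tool; the 汽油 entry is inferred from the petrol bomb (汽油彈) entry listed under "Representation of Lexical Concepts".

```python
import re

# Hypothetical mini-lexicon: basic concept -> its own definition.
# The 汽油 (petrol) entry is inferred from the 汽油彈 example on this page.
BASIC_CONCEPTS = {
    "汽油": "{material|材料:attribute={StateLiquid|液態},"
            "telic={burn|焚燒:material={~},purpose={VehicleGo|駛}}}",
}

# Matches a braced word with no "|" inside, i.e. a basic concept such as {汽油}.
# Primitives such as {weapon|武器} and the self-reference {~} are not matched.
_BASIC = re.compile(r"\{([^{}:=,|~]+)\}")


def expand(definition, lexicon, max_rounds=10):
    """Replace basic concepts by their definitions until a fixed point."""
    for _ in range(max_rounds):
        expanded = _BASIC.sub(
            lambda m: lexicon.get(m.group(1), m.group(0)), definition
        )
        if expanded == definition:
            return expanded
        definition = expanded
    raise ValueError("no fixed point reached; possible circular definition")


inter_definition = "{weapon|武器:material={汽油}}"
print(expand(inter_definition, BASIC_CONCEPTS))
```

Two annotators who pick different basic concepts for the same word would, under this scheme, end up with comparable primitive-level strings, provided the lexicon defines both choices consistently.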
So far we have studied and defined specific expressions for comparative constructions, interrogative words, and modality in Chinese. In addition, we are working on dedicated ontological relations. We hope that the hierarchical inheritance property of the ontology will help computers understand the relations between concepts. Detailed illustrations of each semantic role and relation are given in our “E-HowNet Technical Report” (CKIP Group 2009).
Representation of Lexical Concepts
- E-HowNet Ontology Online. (See below)
- Every primitive of HowNet has been linked to WordNet synsets.
- The taxonomy structures for semantic roles have been constructed. Please see the attached files: The semantic roles for Object, The semantic roles for Event, function.
- A total of 55,912 words (60,482 concepts) in the CKIP Chinese Lexical Knowledge Base have been defined in the Extended-HowNet format. For example:
  - Concept Number: 022872
  - Chinese Word: 汽油彈
  - Chinese Pronunciation: ㄑㄧˋ ㄧㄡˊ ㄉㄢˋ
  - Chinese Pinyin: qi4 you2 dan4
  - CKIP Category: Nab
  - English Meaning: petrol_bomb
  - Inter-Definition: {weapon|武器:material={汽油}}
  - E-HowNet Definition: {weapon|武器:material={material|材料:attribute={StateLiquid|液態},telic={burn|焚燒:material={~},purpose={VehicleGo|駛}}}}
- More than 80,000 words in the CKIP electronic dictionary, which originally contained only syntactic information, now have English translations and Extended-HowNet definitions.
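The bracketed definitions in the 汽油彈 example have a regular nested shape: a head concept, optionally followed by a colon and a comma-separated list of `role={...}` features. The Python sketch below parses such a string into a small tree. The grammar is inferred only from the examples on this page, and the `Concept` class and `parse_definition` function are hypothetical names, so treat it as an approximation rather than the official E-HowNet syntax.

```python
from dataclasses import dataclass, field


@dataclass
class Concept:
    head: str                                     # e.g. "weapon|武器", "汽油", "~"
    features: list = field(default_factory=list)  # (role or None, Concept) pairs


def parse_definition(text):
    """Parse one brace-delimited expression into a Concept tree."""
    node, pos = _parse_concept(text, 0)
    if pos != len(text):
        raise ValueError(f"unexpected trailing text at position {pos}")
    return node


def _parse_concept(text, pos):
    if text[pos:pos + 1] != "{":
        raise ValueError(f"expected '{{' at position {pos}")
    pos += 1
    start = pos
    # The head runs up to the feature separator ':' or the closing brace.
    while pos < len(text) and text[pos] not in ":}":
        pos += 1
    node = Concept(head=text[start:pos])
    if text[pos:pos + 1] == ":":
        pos += 1
        while True:
            role = None
            if text[pos:pos + 1] != "{":          # named feature: role={...}
                eq = text.index("=", pos)
                role = text[pos:eq]
                pos = eq + 1
            child, pos = _parse_concept(text, pos)
            node.features.append((role, child))
            if text[pos:pos + 1] != ",":
                break
            pos += 1
    if text[pos:pos + 1] != "}":
        raise ValueError(f"expected '}}' at position {pos}")
    return node, pos + 1


petrol_bomb = parse_definition(
    "{weapon|武器:material={material|材料:attribute={StateLiquid|液態},"
    "telic={burn|焚燒:material={~},purpose={VehicleGo|駛}}}}"
)
print(petrol_bomb.head, [role for role, _ in petrol_bomb.features])
```

A feature without a role name, as in `{InsectWorm|蟲:{fly|飛:agent={~}}}`, is stored with `None` as its role.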
Common Sense Knowledge Extraction
Common sense knowledge is extracted and learned from automatically parsed texts. Extracting word associations is the first step in extracting and representing knowledge in ConceptNet. For example, the sentence "We all like butterflies." can be segmented and parsed as follows:
Then three kinds of word-associations are extracted:
| experiencer | 我們 Nhaa | Head[S] | 喜歡 VK1 |
| quantify | 都 Dab | Head[S] | 喜歡 VK1 |
| Head[S] | 喜歡 VK1 | goal | 蝴蝶 Nab |
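The pairing of each dependent with the clause head can be sketched in Python as follows. The parser output format is not shown on this page, so the flat list of `(role, word, part_of_speech)` tuples and the `extract_associations` function are assumptions made purely for illustration.

```python
def extract_associations(clause, label="S"):
    """Pair every non-head constituent with the head of the clause.

    `clause` is a hypothetical flat parse: a list of
    (role, word, part_of_speech) tuples in sentence order,
    exactly one of which has the role "Head".
    """
    head_idx = next(i for i, (role, _, _) in enumerate(clause) if role == "Head")
    _, head_word, head_pos = clause[head_idx]
    head_cell = (f"Head[{label}]", f"{head_word} {head_pos}")

    rows = []
    for i, (role, word, pos) in enumerate(clause):
        if i == head_idx:
            continue
        dep_cell = (role, f"{word} {pos}")
        # Keep sentence order: dependents before the head come first.
        rows.append(dep_cell + head_cell if i < head_idx else head_cell + dep_cell)
    return rows


clause = [
    ("experiencer", "我們", "Nhaa"),
    ("quantify", "都", "Dab"),
    ("Head", "喜歡", "VK1"),
    ("goal", "蝴蝶", "Nab"),
]
for row in extract_associations(clause):
    print(" | ".join(row))
```

For this clause the function yields the same three rows as the table above.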
The extracted word associations can be generalized into concept associations. Words are expanded into their Extended-HowNet definitions, and concept relations can then be established from the original word associations. For example, (喜歡, goal-蝴蝶) (experiencer-我們, 喜歡) is extended to:
({FondOf|喜歡}, goal-{InsectWorm|蟲:{fly|飛:agent={~}}})
(experiencer-{human|人:PersonPro={1stPerson|我},quantity={mass|眾}},{FondOf|喜歡})
The following concept relations are then derived:
experiencer- (we, human, you, John, mother...) like goal- (butterfly, insect, bee, mosquito...),...
Knowledge of lexical concepts and common sense knowledge can then be built from these concept relations.
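The step from word associations to concept associations can be sketched in Python. The three definitions are copied from the expansions above; the tuple layout `(head_word, role, dependent_word)` and the function names are assumptions for this example, and reducing a definition to its head concept is only one simple way to generalize.

```python
import re

# Word -> E-HowNet definition, copied from the expansions shown above.
LEXICON = {
    "喜歡": "{FondOf|喜歡}",
    "蝴蝶": "{InsectWorm|蟲:{fly|飛:agent={~}}}",
    "我們": "{human|人:PersonPro={1stPerson|我},quantity={mass|眾}}",
}


def head_concept(definition):
    """Return the head of a definition, e.g. 'InsectWorm|蟲'."""
    match = re.match(r"\{([^{}:]+)", definition)
    if match is None:
        raise ValueError(f"not an E-HowNet expression: {definition!r}")
    return match.group(1)


def to_concept_association(head_word, role, dependent_word, lexicon=LEXICON):
    """Replace both words of an association by their full definitions."""
    return (lexicon[head_word], role, lexicon[dependent_word])


def generalize(head_word, role, dependent_word, lexicon=LEXICON):
    """Reduce both sides to their head concepts for a coarser relation."""
    head_def, role, dep_def = to_concept_association(
        head_word, role, dependent_word, lexicon
    )
    return (head_concept(head_def), role, head_concept(dep_def))


print(generalize("喜歡", "goal", "蝴蝶"))
print(generalize("喜歡", "experiencer", "我們"))
```

Associations that reduce to the same head concepts can then be pooled, which is how other insects could come to share the goal slot of "like" with butterflies.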
Online Demos
E-HowNet
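E-HowNet inherits the semantic definition mechanism of HowNet and links the more than 90,000 entries of the CKIP Chinese Lexical Knowledge Base to HowNet. Its aim is to build a lexical knowledge base that expresses the relations between concepts, and between concepts and their attributes, and thereby to form a concept network of basic knowledge.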
Demo, Resources
Publications
- Shu-Ling Huang, Yu-Ming Hsieh, Su-Chu Lin, Keh-Jiann Chen. “Resolving the Representational Problems of Polarity and Interaction Between Process and State Verbs”. IJCLCLP, Vol. 19, No. 2, pp. 33–52, Jun 2014.
- Shu-Ling Huang, Keh-Jiann Chen. “Semantic Analysis and Contextual Harmony of Durations”. Journal of Chinese Linguistics, Vol. 41, No. 1, pp. 118–144, Jan 2013.
- Shu-Ling Huang, Yu-Ming Hsieh, Su-Chu Lin, Keh-Jiann Chen. “Lexical Representation and Classification of Eventive Verbs — Polarity and Interaction between Process and State”. SIGHAN, Oct 2013.
- You-shan Chung, Keh-Jiann Chen. “Transitivity of a Chinese Verb-Result Compound and Affected Argument of the Result Verb”. Vol. 17, No. 2, pp. 1–20, Jun 2012.
- You-shan Chung, Keh-Jiann Chen. “Transitivity of a Chinese Verb-result Compound and Affected Argument of the Result Verb”. ROCLING, Sep 2011.
- Wei-Te Chen, Su-Chu Lin, Shu-Ling Huang, You-Shan Chung, Keh-Jiann Chen. “E-HowNet and Automatic Construction of a Lexical Ontology”. COLING, Aug 2010.
- You-shan Chung, Keh-Jiann Chen. “Analysis of Chinese Morphemes and Its Application to Sense and Part-Of-Speech Prediction for Chinese Compounds”. ICCPOL, Jul 2010.
- Ming-Hong Bai, Jia-Ming You, Keh-Jiann Chen, Jason S. Chang. “Acquiring Translation Equivalences of Multiword Expressions by Normalized Correlation Frequencies”. EMNLP, Aug 2009.
- Chia-Hung Tai, Jia-Zen Fan, Shu-Ling Huang, Keh-Jiann Chen. “Automatic Sense Derivation for Determinative-Measure Compounds under the Framework of E-HowNet”. IJCLCLP, Vol. 14, No. 1, pp. 19–44, Mar 2009.
- Shu-Ling Huang, Keh-Jiann Chen. “A Semantic Analysis of Time Intervals — Core Senses and Relational Senses of a Time Interval”. CLSW, Jul 2009.
- Ming-Hong Bai, Keh-Jiann Chen, Jason S. Chang. “Improving Word Alignment by Adjusting Chinese Word Segmentation”. IJCNLP, Jan 2008.
- Shu-Ling Huang, Keh-Jiann Chen. “Knowledge Representation and Sense Disambiguation for Interrogatives in E-HowNet”. IJCLCLP, Vol. 13, No. 3, pp. 255–278, Dec 2008.
- Chia-hung Tai, Shu-Ling Huang, Keh-Jiann Chen. “A Semantic Composition Method for Deriving Sense Representations of Determinative-Measure Compounds in E-HowNet”. ROCLING, Sep 2008.
- Shu-Ling Huang, You-Shan Chung, Keh-Jiann Chen. “E-HowNet: the Expansion of HowNet”. National HowNet Workshop, May 2008.
- Shu-Ling Huang, Yueh-Yin Shih, Keh-Jiann Chen. “Knowledge Representation for Comparative Constructions in Extended-HowNet”. Language and Linguistics, Vol. 9, No. 2, pp. 395–413, Apr 2008.
- You-Shan Chung, Shu-Ling Huang, Keh-Jiann Chen. “Modality and Modal Sense Representation in E-HowNet”. PACLIC, Nov 2007.
- Shu-Ling Huang, You-Shan Chung, Yueh-Yin Shih, Keh-Jiann Chen. “Knowledge Representation for Interrogatives in E-HowNet”. ROCLING, Sep 2007.
- Yueh-Yin Shih, Shu-Ling Huang, Keh-Jiann Chen. “Semantic Representation and Composition for Unknown Compounds in E-HowNet”. PACLIC, Nov 2006.
- Shu-Ling Huang, Yueh-Yin Shih, Keh-Jiann Chen. “The Knowledge Representation for Comparison Words in Extended-HowNet”. CLSW, May 2006.
- Yi-Jun Chen, Shu-Ling Huang, Yueh-Yin Shih, Keh-Jiann Chen. “多層次概念定義與複雜關係表達—繁體字知網的新增架構”. 漢語詞彙語義研究的現狀與發展趨勢國際學術研討會, Nov 2005.
- Yueh-Yin Shih, Shu-Ling Huang, Yi-Jun Chen, Keh-Jiann Chen. “Semantic Representation and Composition for Spatial Concepts in Extended-HowNet”. IEEE NLPKE, Oct 2005.
- Keh-Jiann Chen, Shu-Ling Huang, Yueh-Yin Shih, Yi-Jun Chen. “Extended-HowNet: A Representational Framework for Concepts”. IJCNLP, Oct 2005.
- Yi-Jun Chen, Shu-Ling Huang, Yueh-Yin Shih, Keh-Jiann Chen. “繁體字知網架構下之功能詞表達初探”. CLSW, Apr 2005.
- Jia-Ming You, Yu-Ming Hsieh. “Automatic Semantic Role Assignment for a Tree Structure”. SIGHAN, Jul 2004.
- Keh-Jiann Chen, Jia-Ming You. “A Study on Word Similarity Using Context Vector Models”. IJCLCLP, Vol. 7, No. 2, pp. 37–58, Aug 2002.
Researchers and Developers
施悅音、陳怡君、游佳明、鍾友珊、劉立群、陳維德、林素朱、黃淑齡、白明弘、謝佑明、李婕瑜、楊慕