E-HowNet

ConceptNet

ConceptNet is an ontological system for representing and processing lexical knowledge and common sense knowledge. It comprises two major components: a knowledge representation model and the knowledge encoded in it. The knowledge representation model is a frame-based entity-relation model called Extended-HowNet (E-HowNet), which provides a universal representational framework for lexical knowledge with meaning composition and decomposition capabilities. The knowledge encoded in ConceptNet includes lexical concepts and common sense knowledge.

E-HowNet

In 1988, Dong ZhenDong proposed several important principles for building a knowledge base. He suggested that all concepts be defined by a closed set of primitives and features, in contrast to organizing concepts into synonym sets, as WordNet does. This approach provides richer information for natural language processing and is more flexible for generating new concepts. Dong used over 2,000 primitives and over 200 event roles and features to describe general concepts and map the relations among them.

In 2003, the Sinica CKIP group and Professor Dong began a cooperative project to build a HowNet for traditional Chinese, called HowNet-Big5. We adopted the HowNet-based meaning representation mechanism to define the word meanings of over 90,000 lexical entries in the CKIP Chinese Lexical Knowledge Base. However, we had to change the HowNet definition structure so that the model could support semantic composition and decomposition. We added multi-level meaning representation and the description of complex relations. Instead of restricting definitions to primitives, basic concepts are used as the elements for defining complex concepts. As a result, HowNet-Big5 evolved into a new knowledge representation model called Extended-HowNet. For more details, please read the article “多層次概念定義與複雜關係表達—繁體字知網的新增架構” (roughly, "Multi-level concept definition and complex relation representation: the extended architecture of Traditional Chinese HowNet") (Chen et al. 2005).

The advantages of Extended-HowNet are as follows:

  1. Extended-HowNet represents concepts more accurately because the definition vocabulary is not restricted to a closed set of primitives; any well-defined concept can be used to define a new concept.
  2. It accords with models of human cognition. Since complex concepts are defined by basic concepts, the definitions are easier to understand.
  3. It supports a canonical representation: if different annotators define a lexical concept differently, their higher-level definitions can ultimately be converted into the same or similar primitives.
  4. The conversion of primitives into WordNet synsets makes the representation language-independent.
  5. The definitions in Extended-HowNet are designed for easy semantic composition, decomposition, and translation into natural language expressions.
  6. Complex definitions make it possible to define kinship relations, temporal and spatial concepts, comparative notions, etc.

So far, we have studied and defined specific expressions for comparative constructions, interrogative words, and modality in Chinese. In addition, we are working on dedicated ontological relations. We hope that the hierarchical inheritance property of the ontology will help computers understand the relations between concepts. Each semantic role and relation is described in detail in our “E-HowNet Technical Report” (CKIP Group (詞庫小組) 2009).

Representation of Lexical Concepts

  1. E-HowNet Ontology Online. (See below)
  2. Every primitive of HowNet has been linked to WordNet synsets.
  3. The taxonomy structures for semantic roles have been constructed. Please see the attached files: the semantic roles for Object, the semantic roles for Event, and function.
  4. A total of 55,912 words (covering 60,482 concepts) in the CKIP Chinese Lexical Knowledge Base have been defined in the Extended-HowNet format. For example:
    Concept Number: 022872
    Chinese Word: 汽油彈
    Chinese Pronunciation: ㄑㄧˋ ㄧㄡˊ ㄉㄢˋ
    Chinese Pinyin: qi4 you2 dan4
    CKIP Category: Nab
    English Meaning: petrol_bomb
    Inter-Definition: {weapon|武器:material={汽油}}
    E-HowNet Definition: {weapon|武器:material={material|材料:attribute={StateLiquid|液態},telic={burn|焚燒:material={~},purpose={VehicleGo|駛}}}}
  5. More than 80,000 words in the CKIP electronic dictionary, which originally contained only syntactic information, now have English translations and Extended-HowNet definitions.
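The relation between the Inter-Definition and the full E-HowNet Definition in the 汽油彈 (petrol bomb) entry can be sketched as a substitution: a bare concept reference such as {汽油} is replaced by that concept's own definition until only lower-level concepts remain. The following is a minimal sketch under stated assumptions, not the CKIP implementation. The names `LEXICON`, `BARE_CONCEPT`, and `expand` are hypothetical, and the definition of 汽油 is inferred by comparing the two definitions in the entry.

```python
import re

# Hypothetical mini-lexicon: word -> definition. The 汽油 (petrol) entry is an
# assumption, inferred by comparing the two definitions of 汽油彈 in the entry.
LEXICON = {
    "汽油": "{material|材料:attribute={StateLiquid|液態},"
            "telic={burn|焚燒:material={~},purpose={VehicleGo|駛}}}",
}

# A bare concept reference looks like {汽油}: braces around a single word with
# no "|", ":", "=", "," or nested braces. The placeholder "{~}" is excluded.
BARE_CONCEPT = re.compile(r"\{([^{}|:=,~]+)\}")


def expand(definition, lexicon, max_depth=10):
    """Repeatedly replace bare concept references with their definitions."""
    for _ in range(max_depth):
        expanded = BARE_CONCEPT.sub(
            # Words missing from the lexicon are left untouched.
            lambda m: lexicon.get(m.group(1), m.group(0)),
            definition,
        )
        if expanded == definition:  # nothing left to expand
            return expanded
        definition = expanded
    return definition


inter_definition = "{weapon|武器:material={汽油}}"
full_definition = expand(inter_definition, LEXICON)
print(full_definition)
```

With this one-entry lexicon, expanding the Inter-Definition yields the same string as the E-HowNet Definition listed in the entry. The `max_depth` bound is only a guard against circular definitions in this sketch.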

Common Sense Knowledge Extraction

Common sense knowledge is extracted and learned from automatically parsed texts. Extracting word associations is the first step in extracting and representing knowledge in ConceptNet. For example, the sentence 我們都喜歡蝴蝶 ("We all like butterflies.") can be segmented and parsed as follows:

[Figure: parse of the example sentence]

Three word associations are then extracted:

experiencer 我們 Nhaa Head[S] 喜歡 VK1
quantify 都 Dab Head[S] 喜歡 VK1
Head[S] 喜歡 VK1 goal 蝴蝶 Nab
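To make the structure of these associations concrete, the sketch below reads each line as two (role, word, part-of-speech) triples and identifies the head by its `Head[...]` label. The six-field layout is an assumption read off the three lines above, and `Node`, `Association`, and `parse_association` are hypothetical names, not part of any CKIP tool.

```python
from typing import NamedTuple


class Node(NamedTuple):
    role: str  # semantic role, or Head[...] for the head of the phrase
    word: str
    pos: str   # CKIP part-of-speech tag


class Association(NamedTuple):
    head: Node
    dependent: Node


def parse_association(line: str) -> Association:
    """Split one association line into its head node and dependent node."""
    tokens = line.split()
    if len(tokens) != 6:
        raise ValueError(f"expected 6 fields, got {len(tokens)}: {line!r}")
    first, second = Node(*tokens[:3]), Node(*tokens[3:])
    # The head may appear on either side, so detect it by its role label.
    if first.role.startswith("Head"):
        return Association(head=first, dependent=second)
    if second.role.startswith("Head"):
        return Association(head=second, dependent=first)
    raise ValueError(f"no Head[...] node in: {line!r}")


lines = [
    "experiencer 我們 Nhaa Head[S] 喜歡 VK1",
    "quantify 都 Dab Head[S] 喜歡 VK1",
    "Head[S] 喜歡 VK1 goal 蝴蝶 Nab",
]
associations = [parse_association(line) for line in lines]
for a in associations:
    print(f"{a.head.word} --{a.dependent.role}--> {a.dependent.word}")
```

Storing the head and the dependent separately, regardless of their order in the line, keeps later steps independent of word order in the sentence.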

The extracted word associations can be generalized into concept associations. Each word is expanded into its Extended-HowNet definition, and concept relations are then established from the original word associations. For example, (喜歡, goal-蝴蝶) and (experiencer-我們, 喜歡) are expanded to:

({FondOf|喜歡}, goal-{InsectWorm|蟲:{fly|飛:agent={~}}})
(experiencer-{human|人:PersonPro={1stPerson|我},quantity={mass|眾}},{FondOf|喜歡})

The following concept relations are then derived:

experiencer- (we, human, you, John, mother...) like goal- (butterfly, insect, bee, mosquito...),...
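The generalization step can be sketched as a lookup followed by taking the head concept of each definition. This is a minimal illustration under stated assumptions rather than the actual ConceptNet procedure: `DEFINITIONS` holds only the three definitions quoted above, and `head_concept`, `to_concept_association`, and `generalize` are hypothetical names.

```python
import re

# Definitions copied from the expanded associations above. A real system would
# look them up in the E-HowNet lexicon instead of a hard-coded table.
DEFINITIONS = {
    "喜歡": "{FondOf|喜歡}",
    "蝴蝶": "{InsectWorm|蟲:{fly|飛:agent={~}}}",
    "我們": "{human|人:PersonPro={1stPerson|我},quantity={mass|眾}}",
}


def head_concept(definition):
    """Return the head concept: the text after the first "{" up to ":" or "}"."""
    match = re.match(r"\{([^:{}]+)", definition)
    if match is None:
        raise ValueError(f"not an E-HowNet style definition: {definition!r}")
    return match.group(1)


def to_concept_association(head_word, role, dependent_word):
    """Replace both words of a word association with their definitions."""
    return (DEFINITIONS[head_word], role, DEFINITIONS[dependent_word])


def generalize(concept_association):
    """Keep only the head concepts, dropping the distinguishing features."""
    head_def, role, dependent_def = concept_association
    return (head_concept(head_def), role, head_concept(dependent_def))


word_associations = [("喜歡", "goal", "蝴蝶"), ("喜歡", "experiencer", "我們")]
for triple in word_associations:
    print(generalize(to_concept_association(*triple)))
```

Under this simplification, any word whose definition is headed by the same concept would fall under the same concept relation, which is one way to read how the relation extends from a single word to a whole class of fillers.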

The knowledge of lexical concepts and common sense knowledge can then be constructed.

Online Demos

Visualization of E-HowNet

Visualizes the definition expressions of E-HowNet words and presents them in an easy-to-understand way.

E-HowNet

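E-HowNet inherits the semantic definition mechanism of HowNet and links the more than 90,000 entries of the CKIP Chinese Lexical Knowledge Base to HowNet. Its goal is to build a lexical knowledge base that expresses the relations among concepts and among the attributes of concepts, forming a conceptual network of basic knowledge.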

BE-HowNet

An ontology built on the architecture and entries of the E-HowNet system, extended with entries from the Chinese Wikipedia.

E-HowNet with Kangxi Dictionary

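An ontology built on the architecture of the E-HowNet system, using the more than 48,000 characters collected in the Kangxi Dictionary (《康熙字典》) as entries. It helps users understand how Classical Chinese vocabulary is used.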


Resources

Publications

Researchers and Developers

施悅音、陳怡君、游佳明、鍾友珊、劉立群、陳維德、林素朱、黃淑齡、白明弘、謝佑明、李婕瑜、楊慕