The VLC Net Dictionary has been developed by combining original entries with the well-known WordNet lexical database, which was developed at Princeton University and can be accessed and freely downloaded from its web site. By incorporating this database into our own we have at least been able to create a lexicon with enough basic English entries to be a practical and useful reference tool. However, most of these entries do not yet have Chinese translations, and many of the definitions suffer from the same deficiency as the examples given in most other dictionaries (see Problems in Lexicography): they are not intended for second language learners.
Integrating the WordNet lexical database
WordNet was developed by the Cognitive Science Laboratory at Princeton University under the direction of Professor George A. Miller, and is described at their website as:
"an on-line lexical reference system whose design is inspired by current psycholinguistic theories of human lexical memory. English nouns, verbs, adjectives and adverbs are organized into synonym sets, each representing one underlying lexical concept. Different relations link the synonym sets."
As well as being available for use online at the WordNet website, WordNet can be downloaded and is free for anyone to take and use as they wish. The download includes a desktop version of the program, and the database itself is contained in four large text files, one each for verbs, nouns, adjectives and adverbs, which can be adapted and customized to suit individual users' requirements.
3.1 The structure and design of WordNet
In order to understand both the advantages and disadvantages of using WordNet as a "starter dictionary", we need to describe the structure of WordNet and its concept of synset hierarchies. The smallest unit is the word/sense pair. Word/sense pairs are linked through WordNet's basic relation, synonymy, which is expressed by grouping word/sense pairs into synonym sets (synsets). Each synset represents a concept, which is explained through a brief definition. Thus two words are synonyms if they have a "designates" relationship to the same concept. Synsets and the concepts they represent are the basic building blocks for hierarchies and other conceptual structures in WordNet. This is exemplified in the following extracts from the verbs data file:
01705817 41 v 04 reject 0 turn_down 0 turn_away 0 refuse 0 001 ! 01705579 v 0101 02 + 09 00 + 10 00 | refuse entrance or membership; "They turned away hundreds of fans"; "Black people were often rejected by country clubs"
01706473 41 v 03 banish 0 relegate 1 bar 0 002 @ 01704959 v 0000 ~ 01744242 v 0000 02 + 09 00 + 20 00 | expel, as if by official decree; "he was banished from his own country"
These extracts show how the concept of synsets is represented systematically in the WordNet database, and how the different components in the entries are identified by distinct markers. This makes it easy for a parser to identify and separate the components. Each entry begins with a unique 8 digit number, followed by a two digit file number (41) and its grammatical category (v); the synset list then follows. The pipe " | " symbol is the separator which indicates the start of the concept or definition entry, and a semi-colon marks its end. Examples then follow the definition, enclosed in quotation marks. This system is employed throughout the WordNet data files, so it is straightforward to write a program that parses and separates these elements and copies the data into a database format that supports Structured Query Language (SQL). This format was chosen as the basis for the VLC Net Dictionary because of the great flexibility of SQL searches, which are ideal for searching a dictionary database. The basic algorithm for implementing this is as follows:
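As a minimal sketch of such a parser, assuming only the field layout visible in the two extracts above, the following Python function splits one line of a data file into its main components. The function name `parse_data_line` and the dictionary keys are illustrative choices rather than the names used in the VLC program, and the pointer fields that precede the pipe symbol are simply ignored here.

```python
def parse_data_line(line):
    """Split one WordNet data-file line into its main components."""
    # Everything after the first " | " is the gloss: definition plus examples.
    head, _, gloss = line.partition(" | ")
    fields = head.split()

    synset_id = fields[0]            # identifying number, e.g. "01705817"
    category = fields[2]             # grammatical category, e.g. "v"
    word_count = int(fields[3], 16)  # the word count is written in hexadecimal

    # Words alternate with a short number: word, number, word, number, ...
    words = [fields[4 + 2 * i].replace("_", " ") for i in range(word_count)]

    # Gloss parts are separated by semicolons; quoted parts are examples.
    # Assumption: no semicolon occurs inside a quoted example.
    parts = [p.strip() for p in gloss.split(";") if p.strip()]
    definition = "; ".join(p for p in parts if not p.startswith('"'))
    examples = [p.strip('"') for p in parts if p.startswith('"')]

    return {
        "id": synset_id,
        "category": category,
        "words": words,
        "definition": definition,
        "examples": examples,
    }


line = ('01706473 41 v 03 banish 0 relegate 1 bar 0 002 @ 01704959 v 0000 '
        '~ 01744242 v 0000 02 + 09 00 + 20 00 | expel, as if by official '
        'decree; "he was banished from his own country"')
entry = parse_data_line(line)
print(entry["words"])       # ['banish', 'relegate', 'bar']
print(entry["definition"])  # expel, as if by official decree
```

Each dictionary returned this way corresponds to one synset, so a loader would typically create one database record per word in `words`, all sharing the same definition and examples.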
The result of applying this parsing algorithm is a database with a comprehensive core of lexical entries, most of which are accompanied by meaningful definitions and examples. It also contains a number of lexicographical anomalies arising from the limitations of the WordNet classification of synsets, which we shall consider in the following section.
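To illustrate why an SQL format suits dictionary searches, here is a small sketch using Python's built-in `sqlite3` module with an in-memory database. The table name `lexicon`, its column names and the helper functions `search` and `untranslated` are assumptions made for this example, not the actual VLC schema; the two rows are taken from the WordNet extracts quoted earlier.

```python
import sqlite3

# In-memory database, used here only for illustration.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE lexicon ("
    " id INTEGER PRIMARY KEY,"
    " english TEXT NOT NULL,"
    " chinese TEXT,"        # NULL until a translation has been added
    " category TEXT,"
    " definition TEXT,"
    " examples TEXT,"
    " synonyms TEXT)"
)

rows = [
    ("reject", None, "v", "refuse entrance or membership",
     "They turned away hundreds of fans", "turn down, turn away, refuse"),
    ("banish", None, "v", "expel, as if by official decree",
     "he was banished from his own country", "relegate, bar"),
]
conn.executemany(
    "INSERT INTO lexicon"
    " (english, chinese, category, definition, examples, synonyms)"
    " VALUES (?, ?, ?, ?, ?, ?)",
    rows,
)


def search(term):
    """Return (english, definition) pairs whose headword or synonyms contain term."""
    pattern = f"%{term}%"
    cur = conn.execute(
        "SELECT english, definition FROM lexicon"
        " WHERE english LIKE ? OR synonyms LIKE ?"
        " ORDER BY english",
        (pattern, pattern),
    )
    return cur.fetchall()


def untranslated():
    """Return headwords that still lack a Chinese equivalent."""
    cur = conn.execute(
        "SELECT english FROM lexicon WHERE chinese IS NULL ORDER BY english"
    )
    return [row[0] for row in cur]


print(search("refuse"))   # [('reject', 'refuse entrance or membership')]
print(untranslated())     # ['banish', 'reject']
```

The same table can answer both a learner's lookup (by headword or synonym) and an editor's housekeeping query (which entries still need a Chinese translation), which is the kind of flexibility that a flat text file does not offer.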
Editing the entries
From the example in Figure 1 below we can see the structure of the lexicon database, which as well as having fields for the English and Chinese equivalents also has fields for definition, comments and synonyms. The input method uses an HTML form to display the data entries, which can be edited and updated from any PC with internet access. The following example illustrates the process of creating lexical entries on the Edit Record form:
Figure 1: the Edit Record form

Thus, for example, if the word "alcohol" is the lexical item we want to add to a new record, "alcohol" is entered in the box marked English, and one of the Chinese equivalents for alcohol is entered in the box marked Chinese. The Chinese characters can be converted from the traditional version to the simplified version and vice versa to suit the needs of the Mainland and Hong Kong, as well as other Chinese speaking regions. The grammatical category of the entry word is entered in the Comments section together with any other relevant grammatical information such as transitivity (in the case of verbs) and plurality (in the case of nouns), as well as its collocation with other lexical items where necessary. The Examples section is designed to illustrate the definition and comments of the entry word and to show students how the word is used in actual texts. The examples are taken from the corpus using the concordance search tool (cf. Figure 2 for one such example), and copied and pasted into the database. Synonyms are entered as appropriate, and the Sound field is used to hold a link to the corresponding audio file. Using the WWW to edit and add records directly means that editors can work from anywhere, whether in New York, Beijing or Hong Kong, and update the lexicon simultaneously. At the same time it is easy to open any number of on-line dictionaries to compare various interpretations. The following are examples of the definitions of "alcoholism" taken from some of the on-line dictionaries:
(1) WWWebster Dictionary (Merriam-Webster, online)
alcoholism n
1 : continued excessive or compulsive use of alcoholic drinks
2 : poisoning by alcohol; especially : a complex chronic psychological and nutritional disorder associated with excessive and usually compulsive drinking
(2) The Wordsmyth English Dictionary-Thesaurus (online)
alcoholism
DEF: a pathological condition resulting from habitual overuse of alcoholic beverages,
(3) Random House Webster's (Random House, online)
al-co-hol-ism (al'kuh h?liz uhm, -ho-) n.
Since the VLC bilingual on-line dictionary is designed primarily for Chinese students of English, it is important to make the definitions as simple as possible and to avoid definitions whose complicated vocabulary itself requires further explanation, such as: "a chronic disorder characterized by dependence on alcohol, repeated excessive use of alcoholic beverages" (Random House Webster's). By contrast, one of the VLC lexicon definitions for alcoholism is simply: "being addicted to alcoholic drinks".
Lexicographical considerations in creating an on-line lexicon for students from a Chinese background: the problem of "alcoholism"
An important consideration to bear in mind is that different cultures have culture-specific items in their languages, and for such items lexicographers often have to resort to lengthy explanations to make the concepts clear to dictionary users. "Fish 'n' chips" and "football hooliganism", to name but two, are phrases that identify culturally specific behaviour and presuppose background awareness. Many problems in translation and explanation seem to arise where just this type of consideration has not been applied, and lexical items have been treated as universal when they are in fact culture-specific. The example of "alcoholism" cited above exemplifies this problem.
Almost all the available English-Chinese dictionaries have translated this word as:-
which, when translated back into English, has the meaning of alcoholic poisoning, a serious medical condition which may result in death. Alcoholism, on the other hand, refers to a social and physiological problem which is more common in northern European countries.
Perhaps this habit of prolonged and excessive drinking is less common in Chinese culture, so that alcoholism is not a social problem as we know it in the West, which would explain the lack of suitable vocabulary for it in Chinese. Alternatively, dictionary makers may simply have followed previous examples that were erroneous in the first place. The following examples, taken from two of the most commonly used English-Chinese dictionaries, illustrate this problem:
(1) The English-Chinese Dictionary (unabridged) (Shanghai Yiwen, 1996)
(2) Longman Dictionary of Contemporary English (English-Chinese) (Longman Asia, 1997)
Figure 2: a problem of translation
But is alcoholism equivalent to alcoholic poisoning?
The problem becomes even more confusing when we find that many of the well-known English monolingual dictionaries also treat alcoholism as a medical condition, and some even regard it as a disease (cf. Oxford Advanced Learner's Dictionary). The bilingual dictionary makers may well have been influenced by the definitions provided by monolingual dictionary writers, such as the ones below.
(1) Longman Dictionary of Contemporary English third edition (Longman, 1985)
(2) The Collins Cobuild Students Dictionary Online (online, 1998)
(3) Oxford Advanced Learners Dictionary of Current English (OUP, 1989)
Perhaps the most extraordinary of all these entries is the one provided by the online version of the popular Cobuild Dictionary, which defines alcoholism as "a kind of poisoning". Since our aim is to help Chinese learners improve their English, it would obviously be misleading if our students started believing that alcoholism is some kind of English medical problem!
So after all this, in writing our own entries for the VLC lexicon, we have arrived at the following translations and definitions for alcoholism:
Figure 3: the VLC lexicon entries for "alcoholism"

And finally, we have added a separate entry for alcoholic poisoning with the appropriate Chinese equivalent which is included in the lexicon as illustrated below.
Figure 4: a new entry for "alcoholic poisoning"

We have discussed this problem in some detail not because it is an isolated problem, but because on the contrary it is typical of the problems faced in lexicography. When one considers that there are thousands of other entries which in varying degrees present similar difficulties, what we have come to refer to as "the problem of alcoholism in lexicography" exemplifies some of the difficulties faced in implementing the task of developing a comprehensive and trustworthy online bilingual lexicon.
Chris Greaves & Han Yang