On a mailing list, someone commented that learning Chinese, especially the written language, was a lot harder than learning Hebrew.

It isn't.

Not a lot harder. Only a little harder.

Ivrit has the usual complement of letters and numbers that one would expect. There are probably one to two dozen other characters that really need to be learned in order to write. Plus cursive forms.
All in all, about fifty or sixty symbols.

In Chinese, there are 214 basic characters or building blocks. These are the characters that cannot be broken up into simpler characters.
All of them have meanings, by the way - they're not just sounds or scratches.


All characters consist of one or more building blocks (basics, also called radicals).
The simplest basics (the signifiers) are usually the ones by which you look a word up in the dictionary.
Dictionaries are arranged from simplest basic to most complex by stroke order and stroke count.

The stroke order starts at upper left and finishes at lower right, with horizontals before verticals, plus a few more minor rules that make sense once you start writing.

[That can be seen here:
A good example is this page: which shows all the characters in that database for the ren radical (the 'basic' that means human: 人 or 亻) from simplest character (人) to most complex (儾 nàng: slow, dull; irresolute; 人 plus 22 strokes). Please go ahead and explore the Chinese-English Dictionary at your leisure: ]

There are about five hundred characters which can be analysed as pictures. These include the 214 basics. All other characters are combinations, with one element (the signifier) indicating the category of meaning (金 metal, 木 wood, 艹 plant, 豸 beast, 言 speech, etc.) and the remaining part of the character almost always being a common phonetic element.

[For instance, all species of tree have the signifier tree (木) as part. The remainder of the character will usually be a phonetic element, yielding a combination that can be analysed as the tree with the name that sounds like the phonetic element. The metals, and many things made of metal, commonly have the eight-stroke character for gold (金), the paradigm of metals, as signifier, also with a phonetic element suggesting the pronunciation. Note that the character for gold (金) is a diagram of a mine, with a pulley at the top, an upper tunnel, and a lower tunnel, in which there may be found ingots or ores.]
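[For those who think in code, the signifier-plus-phonetic construction can be sketched roughly as below. The five example characters, the toneless readings, and all the names (Compound, COMPOUNDS, by_signifier) are illustrations chosen for this sketch, not taken from any particular dictionary, and the list is nowhere near exhaustive.

```python
from typing import NamedTuple


class Compound(NamedTuple):
    """One signifier-plus-phonetic character (hypothetical record layout)."""
    char: str              # the combination character
    signifier: str         # element giving the category of meaning
    phonetic: str          # element hinting at the sound
    reading: str           # modern Mandarin reading, tones omitted
    phonetic_reading: str  # reading of the phonetic as an independent word
    gloss: str


# A handful of well-known examples, hand-picked for illustration only.
COMPOUNDS = [
    Compound("松", "木", "公", "song", "gong", "pine"),
    Compound("柏", "木", "白", "bai", "bai", "cypress"),
    Compound("梅", "木", "每", "mei", "mei", "plum"),
    Compound("銅", "金", "同", "tong", "tong", "copper"),
    Compound("鋼", "金", "岡", "gang", "gang", "steel"),
]


def by_signifier(signifier: str) -> list[Compound]:
    """Return every listed compound filed under the given signifier."""
    return [c for c in COMPOUNDS if c.signifier == signifier]


if __name__ == "__main__":
    for c in by_signifier("木"):
        print(f"{c.char} = {c.signifier} + {c.phonetic} "
              f"({c.reading}, phonetic read as {c.phonetic_reading}): {c.gloss}")
```

Note that pine (song) only rhymes with its phonetic (gong), while copper (tong) matches its phonetic exactly. Both cases are common.]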


Phonetic elements are usually a word in their own right. Some are simple constructions (in other words, the basics), others are more complex constructions of two or more characters (again, going back to the basic building blocks). Phonetic elements occur on the right-hand side, or on the bottom, of most combination characters.
There are approximately one hundred phonetics which occur so often that they become instantly recognizable. Another five hundred or so are quite common, and about 1200 others (more or less) are used because a homophonous phonetic was already utilized for another word constructed with the same signifier.

Some phonetic elements have been extremely stable over the past two thousand years - what they sound like today as independent characters is reasonably close to how they sound in the various characters in which they are used phonetically, even if the pronunciations of modern Chinese are not the same as during Zhou and Han.

Others have deviated considerably. What may be pronounced as 'wo' independently can become 'wu', 'wa', 'go', or 'e'.

There are also phonetics which have pronunciations that seem to make no sense unless one figures out where everything went wrong. For instance, a character pronounced as 'yi' originally was borrowed as an abbreviation of a word pronounced 'dai', and subsequently that pronunciation was used phonetically for some characters just like the original pronunciation was used. Yi and dai are now both valid phonetic uses of that character, along with 'chi', 'de', and 'gung' - based on different borrowings and linguistic changes.

Fortunately, the really buggered-up phonetic elements are rare, and characters containing them infrequent.


For most Chinese people, there is seldom an exact overlap between the spoken vocabulary and the written vocabulary.
One could be a fluent speaker with less than fluent literacy, or one might know what a character means without being able to pronounce it. Knowing what a word sounds like while being ignorant of the meaning is somewhat less common.
For almost all literate Chinese, there is a large number of characters that they know well, plus a large number of characters that they recognize when they see them but might not remember exactly how to write, as well as a number of characters of which they know only the sound or only the meaning.
Furthermore, there may be many characters which they have forgotten, or never even knew.
In some cases the meaning of a word can be deduced from the context.
Nevertheless, the dictionary is the constant companion of the reader, and rare is the literate person who has not destroyed at least one dictionary by years of use.

Not all words in the spoken language have a character assigned to them. There are slang terms, dialect words, and colloquialisms that have not entered the dictionary, as well as the usual curses and unprintabilities. Insofar as they are written, the characters will be constructed along standard lines, or may simply be other characters borrowed for the purpose - context will make clear that the word is not used as it should be.

Many characters are almost never used in speech. This is because they have been replaced by other words in the last several centuries, or have only limited applicability (names for types of Zhou bronze vessels being a good example), or because they sound so much like other words in the modern pronunciation that they would be confusing. For instance, there are well over five dozen characters pronounced 'shi'. Even with different tones, mistakes are possible.
There are some characters which represent concepts that in speech are given with two-syllable combinations - the single syllable character may still be used in writing, but its use in speech would not be understood.


To read the newspaper, about three thousand characters are sufficient. Almost all words are either single syllables written by one of these characters, or combination words using two or more.

To read technical literature, one might need an additional few hundred or so words, depending on the field.

For the poetry of the T'ang (唐) and Sung (宋) dynasty periods, about fifteen hundred more characters would be needed, because the language has changed a bit since then.

For the classics from the Zhou (周) and Han (漢) era, perhaps another thousand words in addition to the vocabulary necessary for the poems.

If one has mastered about four to five thousand characters, one should have little trouble reading Chinese for enjoyment or scholarly purposes.
With a minimum of around a thousand, one can easily figure out menus, product and store names, street signs, and simple texts.

With fewer than five hundred characters, one is merely a pretentious white person capable of boring other white people with the mysterioso beauty and meaningfulness of it all - while irritating Chinese people nearly beyond measure.

Even with a reasonably full vocabulary (4 to 5 thousand characters) one will not be Chinese unless one started out that way - one will still be a foreigner looking through the window, albeit a completely literate one.
This is not bad at all, and often it is far better than being a Chinese person, as one gets full credit for the effort expended and the result achieved. Much more so than if you looked Asian.

Most of the characters you will ever need are used so often, and in so many ways, that it is not hard to remember them. You will see them so frequently that, once you have learned what they mean and how they are pronounced, that knowledge will become instinctive.
Many characters can be learned by their similarity to others within a meaning category - tree types, metals, etcetera.
Some characters are so simple, being no more than half a dozen strokes, that you cannot avoid learning them.

A concerted effort to memorize even as few as a score of characters a day would yield a vocabulary more than sufficient to read the newspaper within the year.
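The arithmetic behind that claim is easy to check. What follows is only a back-of-the-envelope sketch: it assumes every character learned is retained (which is optimistic), and the target labels and function names are made up for the purpose. The counts themselves are the rough figures given above.

```python
import math

PER_DAY = 20  # "a score of characters a day"

# Rough targets from the estimates above; the labels are hypothetical.
TARGETS = {
    "menus and street signs": 1000,
    "newspaper": 3000,
    "enjoyment or scholarly reading": 5000,
}


def days_needed(target: int, per_day: int = PER_DAY) -> int:
    """Days of study to reach `target` characters, assuming nothing is forgotten."""
    return math.ceil(target / per_day)


def learned_in(days: int, per_day: int = PER_DAY) -> int:
    """Characters covered after `days` of study at a steady pace."""
    return days * per_day


if __name__ == "__main__":
    for label, target in TARGETS.items():
        print(f"{label}: {target} characters, about {days_needed(target)} days")
    print(f"one year at {PER_DAY} a day: {learned_in(365)} characters")
```

At a score a day, the newspaper threshold of three thousand is reached in about five months, and a full year covers well over twice that number. Forgetting will eat into the margin, but there is a lot of margin.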



linguisticaly amphibious said...

Obviously, nothing to it. However, learning some languages can be fraught with danger; viz, Hungarian.

e-kvetcher said...

Hungarian is not so bad, given a good phrase book...

The back of the hill said...

For some reason, smokers don't do well in Hungarian. I've never figured out why.
It is sad. Very sad.
Perhaps Tzipporah can shed some light on this problem.

mOOm said...

Well, I'm fluent in Hebrew but haven't progressed beyond a hundred characters in Chinese, so I guess I'm a pretentious white person :) (My wife is Chinese, by the way, and people have only been impressed that I've learnt any Chinese.) The difficulties of the two languages are different, so they are hard to compare. Hebrew vowelization is about as difficult as English spelling. There are lots of verb forms but they're regular. So I'd say Hebrew was about as hard as English. Chinese I think is objectively harder. First there are the tones, which are very hard to hear for people who have grown up speaking non-tonal languages. Then the number of homophones is high - the "shi" problem. And then there are the characters. Of course in all languages you have to learn thousands of words, but even Hebrew and Arabic jog your memory with a rough idea of pronunciation that might remind you of what the word is. There are pronunciation and meaning clues in Chinese characters but they are pretty weak (probably more so with the simplified characters).


John Moxford said...

