65th IFLA Council and General Conference

Bangkok, Thailand,
August 20 - August 28, 1999

Romanization of multiscript/multilingual materials: experiences of Malaysia

Nafisah Ahmad
National Library of Malaysia


This paper focuses primarily on the cataloguing and treatment of multiscript materials, especially in the National Library of Malaysia. Special emphasis is given to the cataloguing and treatment of materials in Jawi script and to the problems and issues involved.



Malaysia is a federation of 14 states, 12 of which are in Peninsular Malaysia and 2 in the Borneo states of Sabah and Sarawak. The national language in Malaysia is Malay, which is also the medium of instruction in primary, secondary and tertiary education. Because Malaysia is a former British colony, English is widely spoken and remains an important second language taught in schools. Arabic, Chinese and Tamil are also taught alongside Malay and English in religious and vernacular schools.

Romanization of Multiscript Materials

In the National Library of Malaysia, the policy of romanizing multiscript materials started with the computerization of its collection and the launch of MALMARC (Malaysian Machine Readable Catalogue) in 1978. MALMARC was a shared cataloguing system among university libraries and the National Library of Malaysia. In 1988, the National Library of Malaysia started its own computerization with the acquisition of VTLS (Virginia Tech Library System) software, an integrated library system which runs on HP3000 series 950 minicomputers (in 1996 the system was upgraded and changed to HP3000 series 300). Unfortunately the system does not support non-roman scripts. All data therefore had to be romanized before they could be input, which also enables the bibliographical entries for these materials to be interfiled with those written in roman script.

Cataloguing Practice for Materials in Jawi Script

Jawi script originated from the Arabic script, with particular adaptations and additions. It was introduced into the Malay World, especially the Malay Peninsula, soon after the arrival of Islam. Opinions differ on the exact date, which is probably as early as 440 H (1104 A.D.). However it is believed that the Arabic script was adapted into Jawi script after the 7th century Hijrah (13th century A.D.) (Russel Jones, 1983:125). More concrete evidence of the earliest Jawi writing is found on the Batu Bersurat Terengganu, dated 1303 A.D. For many centuries, up to the early part of this century, it was the dominant script in the Malay world, widely used for writing both in the court and outside it. The arrival of western influence, however, ended this dominance with the introduction of roman script, and today roman script is far more widespread than Jawi script, which is confined mostly to Islamic religious publications. Recently, however, various bodies have made efforts to introduce and encourage the use of Jawi script in our literary scene. This has become increasingly important because the use of Jawi script could enhance the study of Islam, especially the Quran. Most of these materials are finding their way into larger and less specialized collections.

The use of computers in libraries has streamlined many aspects of cataloguing and accessing materials. It has allowed bibliographical data to be arranged more efficiently and has facilitated the sharing of bibliographical data among libraries. However this transformation has posed certain problems in documenting materials in Jawi script, owing to the unavailability of an appropriate system to support the script. For a bibliographic record to be input into the system, the option taken is therefore to romanize the script, whereby the non-roman characters are systematically represented by roman characters. In doing so the librarian has to overcome the problem of how to represent the characters of another script in a meaningful and efficient way. As reflected in Rule 1.0E1 of AACR2 (Anglo-American Cataloguing Rules), the optimum choice is to transcribe information from the item itself in the language and script (wherever practicable) in which it appears there, in areas including:

  • Title and statement of responsibility
  • Publication, distribution, etc.

This is to give the patron exactly what is printed on the title page or the primary source of information.

When the card catalog was used in the library, materials in non-roman scripts were managed much like those in roman script, with few adjustments. The information specified by Rule 1.0E could easily be transcribed either on a typewriter or by hand. Even then, no typewriter was manufactured specially for Jawi script; an Arabic typewriter was used as a substitute, but it lacked some of the Jawi characters. It must be noted that Jawi has additional characters, beyond the 29 original Arabic characters, to accommodate Malay words. These characters are:

ݢ (g)      ڠ (ng)      ڤ (p)      ڽ (nya)      and      چ (c)
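As a rough illustration of the typewriter gap described above, the following Python sketch lists the five additional Jawi letters and reports which of them occur in a string. The Unicode code points and the function name `jawi_only_letters` are assumptions added for illustration, not part of any system mentioned in this paper; in particular, the letter for g can be encoded in more than one way.

```python
# Roman values follow the list above; the code points are an assumption
# added for illustration and are not taken from the paper.
JAWI_EXTRA_LETTERS = {
    "\u0762": "g",    # ga (assumed encoding; U+06AC is also seen)
    "\u06A0": "ng",   # nga
    "\u06A4": "p",    # pa
    "\u06BD": "nya",  # nya
    "\u0686": "c",    # ca
}


def jawi_only_letters(text: str) -> list[str]:
    """Return roman labels of Jawi-specific letters, in order of first appearance."""
    found = []
    for ch in text:
        label = JAWI_EXTRA_LETTERS.get(ch)
        if label is not None and label not in found:
            found.append(label)
    return found


# Hypothetical sample: the letter ca followed by letters shared with Arabic.
sample = "\u0686\u064A\u0646\u062A\u0627"
print(jawi_only_letters(sample))  # ['c']
```

A string that returns an empty list could, under these assumptions, have been typed on an Arabic typewriter without substitution.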

Even then, producing cards in multiple scripts was more expensive and cumbersome than producing catalog cards entirely in roman characters, especially in the layout, because of the bi-directionality involved. For example, in adopting ISBD for the Malaysian National Bibliography, the Cataloguing Committee, after making a detailed study of the available optional elements, decided to accept or reject them wherever applicable. Hence the ISBD (M) statement "the elements selected are presented in the prescribed ISBD (M) order and with the prescribed ISBD (M) punctuation" could not be applied in citing bibliographic records for Jawi (Arabic) publications, since the script is written from right to left. Because of this unique characteristic of Jawi script, variations in the punctuation symbols occurred: the comma and semicolon are the mirror images of those used in roman script, so the comma and semicolon in all Jawi entries appear as (،) and (؛) respectively. Another variation concerned the point, space, dash, space (. - ) which precedes each area in the record: these symbols were reversed to space, dash, space, point ( - .). Other punctuation, including the diagonal slash (/), remained unchanged since it was used in the same way in Jawi script. The same cataloguing practice is also applied to serial publications (ISBD (S)) and to non-book materials (ISBD (NBM)).
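The punctuation changes described above can be sketched as a small lookup. This is a minimal Python illustration, not the National Library's actual procedure; the function name `jawi_isbd_mark` and the use of the Unicode Arabic comma (U+060C) and Arabic semicolon (U+061B) for the mirrored marks are assumptions.

```python
ARABIC_COMMA = "\u060C"      # assumed stand-in for the mirrored comma
ARABIC_SEMICOLON = "\u061B"  # assumed stand-in for the mirrored semicolon

# Marks that change in Jawi entries, as described in the text.
JAWI_ISBD_MARKS = {
    ",": ARABIC_COMMA,
    ";": ARABIC_SEMICOLON,
    ". - ": " - .",  # point, space, dash, space -> space, dash, space, point
}


def jawi_isbd_mark(mark: str) -> str:
    """Return the form of an ISBD punctuation mark used in a Jawi entry."""
    # Marks not listed (for example the diagonal slash) stay unchanged.
    return JAWI_ISBD_MARKS.get(mark, mark)


for mark in [",", ";", ". - ", "/"]:
    print(repr(mark), "->", repr(jawi_isbd_mark(mark)))
```

Keeping the exceptions in one table makes it easy to see that only three marks differ from the roman-script practice, while everything else passes through untouched.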

Romanization of Jawi Script

Technically, romanization of Jawi script is not as complicated as romanization of other non-roman scripts such as Chinese, Arabic or Tamil. This is because the Malay language is officially written in roman script, so an official roman spelling already exists for every Malay word, except in certain situations where clarification may be needed.

Romanization/transliteration of Jawi (as well as Arabic) started after 1939 with the publication of Daftar Ejaan Melayu Zaaba. Before this, Jawi orthography was far from standardized. At present, efforts to standardise, modernise and improve the system are very active. Among the well accepted guides are:

  • Guidelines for Romanization of Jawi (Perpustakaan Negara Malaysia, 1995)
  • Pedoman Ejaan Jawi Yang Disempurnakan (Dewan Bahasa dan Pustaka, 1987)
  • Kaedah Transliterasi Huruf Arab ke Huruf Latin (Dewan Bahasa dan Pustaka, 1984)

The marked difference from other non-roman scripts is that Jawi/Arabic script has no capital letters, and the shape of a letter changes depending on its position within a word: a letter may have up to four forms (initial, medial, final and isolated), and all letters may connect on at least one side with an adjacent letter.
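The positional forms can be seen in the Unicode character database, which encodes the four shapes of many Arabic letters as separate presentation forms. The following Python sketch uses the standard `unicodedata` module with the letter beh as an example; the choice of letter and the helper name `positional_forms` are illustrative assumptions.

```python
import unicodedata

POSITIONS = ["ISOLATED", "FINAL", "INITIAL", "MEDIAL"]


def positional_forms(letter_name: str) -> dict[str, str]:
    """Look up the four presentation forms of an Arabic letter by Unicode name."""
    return {
        pos: unicodedata.lookup(f"{letter_name} {pos} FORM")
        for pos in POSITIONS
    }


forms = positional_forms("ARABIC LETTER BEH")
base = unicodedata.lookup("ARABIC LETTER BEH")

for pos, ch in forms.items():
    # Each form is a distinct code point that folds back to the base letter.
    print(pos, hex(ord(ch)), unicodedata.normalize("NFKC", ch) == base)

# The script has no case distinction: upper() leaves the letter unchanged.
print(base.upper() == base)
```

In ordinary text only the base letter is stored and the rendering software selects the shape, which is why a romanization table needs one entry per letter rather than one per form.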

The same principle of romanization is also practised by other libraries in Malaysia where the cataloguing of Jawi materials is concerned. Some libraries, however, have adopted a different policy for materials in Arabic.

Problems and Issues

Since all library systems now in use can accommodate only roman script, a standard romanization system for multiscript materials, including Jawi, Chinese and Tamil, should be followed by all libraries in the country. In this connection, the cataloguers involved should have wide knowledge of the language, including a thorough understanding of its spelling and grammatical rules. The library should also develop subject specialists who can assist in cataloguing, especially in determining subject headings. This is important because subject headings assist in tracing materials efficiently and effectively.

Romanization/transliteration of non-roman scripts also creates difficulty for the patron. A patron searching for an item in a non-roman script must have sufficiently good knowledge of the item that she or he is looking for. The library should produce a user guide to the romanization/transliteration of each script to help patrons in their search for materials.

Change and progress will always occur in every language. To overcome the problems that might arise from these advances, a library should develop expertise in romanization/transliteration techniques, and cataloguers themselves should be able to keep abreast of this progress.


Romanization of multiscript materials will remain in practice for quite some time, especially where romanization is the only option in an automated system. With developments in computer technology and the use of the Unicode Standard in global software, it is hoped that the problems faced in cataloguing materials in non-roman scripts can be resolved in the near future.


