   64th IFLA General Conference
   August 16 - August 21, 1998


GOO: Dutch national system for subject indexing

G.J.A. Riesthuis
University of Amsterdam
Department for Book- and Information Science
Amsterdam, Netherlands


R. Storm
Royal Library
The Hague, Netherlands


The NCC (Dutch Central Catalogue) is an integrated catalogue of about 400 Dutch libraries, including the National Library, the University Libraries and the main Public Libraries. The NCC has several search possibilities: authors, titles, title words, etc. For subject searching, the Dutch "Basis-Classification" (BC) and a national word system, the "Gemeenschappelijke Trefwoordenthesaurus" (GTT), are available. In this paper the BC and the GTT are discussed. The BC is a newly developed broad classification (circa 2200 classes). Detailed subject searching is possible using the 45,000 Dutch-language terms of the GTT. The BC and the GTT can also be used in combination.


The Dutch Union Catalogue (NCC)

The NCC (Dutch Union Catalogue) is an integrated catalogue of about 400 Dutch libraries, including the Royal Library in The Hague (the National Library of The Netherlands), the University Libraries and the main Public Libraries. In the NCC the search possibilities are: title words, titles (for journals only), authors, corporate authors, an author/title key (first four letters of name of author, followed by first four letters of first word of title), subject headings and the notations of the Dutch Basic Classification (BC).
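The author/title key described above can be illustrated with a minimal sketch. The function name, the lower-casing and the stripping of non-letters are assumptions made for the example; the paper only specifies "first four letters of name of author, followed by first four letters of first word of title", and the actual NCC normalisation rules (for instance the treatment of leading articles) may differ.

```python
def author_title_key(author: str, title: str) -> str:
    """Illustrative author/title key: first four letters of the author's
    name followed by the first four letters of the first title word.

    Assumptions (not taken from the NCC documentation): the key is
    lower-cased and anything that is not a letter is ignored.
    """
    def letters(text: str) -> str:
        # Keep letters only, so "GOO:" becomes "goo".
        return "".join(ch for ch in text.lower() if ch.isalpha())

    words = title.split()
    first_word = words[0] if words else ""
    return letters(author)[:4] + letters(first_word)[:4]


if __name__ == "__main__":
    print(author_title_key("Riesthuis", "Subject indexing in the Netherlands"))
```

Names or title words shorter than four letters simply contribute fewer characters to the key in this sketch.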

The Dutch Basic Classification (BC)

The Dutch Basic Classification (available in Dutch, English and German on the Internet: http://www.kon-bib.nl/home-fe.html) is a broad classification with some 2200 classes. The Basic Classification has 48 main classes, grouped in four clusters: general, humanities, sciences and technology, social sciences. All main classes have a two-digit notation. Further subdivision is indicated by a second group of two digits, so each main class can have 100 subdivisions. Most main classes, however, have far fewer subdivisions, with only three or four hierarchical levels. The second edition was published in 1992; the third edition is in press.
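The notation structure just described (a two-digit main class, then a second group of two digits) can be made concrete with a short sketch. The dot separator is taken from the notations quoted in this paper, such as 15.75 and 06.04; the names `BCNotation` and `parse_bc` are hypothetical and exist only for the illustration.

```python
import re
from typing import NamedTuple


class BCNotation(NamedTuple):
    main_class: str   # two digits identifying the main class, e.g. "15"
    subdivision: str  # two digits within the main class, "00" to "99"


# Two ASCII digits, a dot, two ASCII digits.
_BC_PATTERN = re.compile(r"([0-9]{2})\.([0-9]{2})")


def parse_bc(notation: str) -> BCNotation:
    """Split a notation such as '15.75' into main class and subdivision."""
    match = _BC_PATTERN.fullmatch(notation.strip())
    if match is None:
        raise ValueError(f"not a two-digit.two-digit notation: {notation!r}")
    return BCNotation(match.group(1), match.group(2))


if __name__ == "__main__":
    print(parse_bc("15.75"))
```

Because the second group has exactly two digits, at most 100 subdivisions (00 to 99) fit under one main class, which matches the limit stated above.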

The BC is used to get an overview of the available literature. Most notations return too many titles to go through all of them at once, but the BC is a good way to see the latest titles in a given scientific discipline or subdiscipline. A search on 11 March 1998 with 15.75 History of Asia (the lowest level available) gave 7465 hits, of which the first 9 titles were published in 1998 and the following 323 in 1997. Not all notations return as many titles as 15.75: 06.04 History of archive science returns only 7 titles, 06.72 Subject cataloguing 198 titles and 79.21 Care for the homeless 41. For most notations several hundred titles will be found.

The Basic Classification, though, was never meant as a device for direct searching. It was built in the late eighties with two projects in the Dutch scientific library world in mind: co-ordination of collection management and joint subject indexing. In this joint system a subject heading or thesaurus system takes care of specific indexing, and the BC gives the disciplinary context of the whole of the word entries. Today it is used for these two purposes, but in some libraries it is used as a shelving system as well, for which it was not intended and is not well suited, because it is not detailed enough.

In accordance with the function of the BC, most titles receive only one BC notation. This should be the notation for the global subject of the document. It is important to class the document in the right scientific discipline. A publication about the use of remote sensing as an exploration device in the earth sciences should not be classified with class 74.41 Air photos, photogrammetrics, remote sensing, the only place where remote sensing is mentioned, but with 38.03 Methods and techniques of the earth sciences. Because the BC is co-ordinated with a word system, the classification has only very few geographical and chronological subdivisions and will have even fewer in the future. These elements are left to the word system. In the Dutch national online catalogue it is possible to use the Basic Classification to narrow a search. I can search for titles with the BC notation 15.75 and the subject descriptor "Thailand", or the title word "Thailand". In this way I can separate historical literature on Thailand from geographical literature.
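The combined search described above can be sketched on a toy in-memory catalogue. This is only an illustration of the logic (BC notation AND (descriptor OR title word)), not the query syntax of the Dutch national online catalogue. The `Entry` type, the function `search_with_bc` and all sample titles are invented for the example.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Entry:
    title: str
    bc: str                            # single BC notation, e.g. "15.75"
    descriptors: Tuple[str, ...] = ()  # GTT-style subject descriptors


def search_with_bc(entries: List[Entry], bc: str, term: str) -> List[Entry]:
    """Return entries with the given BC notation whose descriptors or
    title words contain the term (case-insensitive)."""
    wanted = term.casefold()
    hits = []
    for entry in entries:
        if entry.bc != bc:
            continue  # the BC notation narrows the search to one discipline
        in_descriptors = any(d.casefold() == wanted for d in entry.descriptors)
        title_words = [w.strip(".,:;()").casefold() for w in entry.title.split()]
        if in_descriptors or wanted in title_words:
            hits.append(entry)
    return hits


# Invented sample entries, for illustration only.
CATALOGUE = [
    Entry("A history of Thailand", "15.75", ("Thailand",)),
    Entry("Siam in the nineteenth century", "15.75", ("Thailand",)),
    Entry("Thailand from the air", "74.41", ("Thailand",)),
    Entry("A history of Japan", "15.75", ("Japan",)),
    Entry("Thailand: a short history", "15.75"),
]

if __name__ == "__main__":
    for hit in search_with_bc(CATALOGUE, "15.75", "Thailand"):
        print(hit.title)
```

In this sketch the entry classed under 74.41 is excluded even though it carries the descriptor "Thailand", which is the narrowing effect the BC is meant to provide.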

GOO Subject Headings (GTT)

The national word system is named "Gemeenschappelijke Trefwoordenthesaurus (GTT)" (Joint Subject Headings Thesaurus). In this thesaurus the following types of descriptors can be found:

  1. "Normal" descriptors, i.e. terms for all concepts that do not belong to any of the categories mentioned hereafter (Rain, Sugar, Furniture)

  2. Geographical descriptors (Germany, Amsterdam (city), Orinoco)

  3. Names of corporate bodies (British Railways, United Nations)

  4. Titles or names of works of art and culture (Othello (Shakespeare), Guernica (Picasso))

  5. Descriptors for bibliographical forms (Dissertations (form), Bibliographies (form))

  6. Descriptors for genres of fiction (Poetry (texts), Short stories (texts))

  7. Names of "unica", i.e. singular events, drugs and other things that have a name (Titanic (ship))

Names of persons are not part of the GOO thesaurus. Within the system there is a special thesaurus for personal names, which is also used for formal description. The GTT is in Dutch. The descriptors can be one-word terms like "Software" or multi-word terms like "Economisch beleid" (Economic policy). Dutch, like German, can also combine several words into one compound word, e.g. "Welzijnsbeleid" (welfare policy).

Rules for the use of the BC and the GTT

A set of rules for the use of the Basic Classification and the GTT is available. A very important rule is that the two are independent of each other. There is no fixed set of descriptors for any class of the Basic Classification, and assigning descriptors is completely independent of classing in the Basic Classification. The Basic Classification's indexing is broad and global, the GTT's as specific as possible. In principle the descriptors deal with the topic of the document, not with the (scientific) discipline it belongs to. If a book has the descriptor "Sociology", the book is about sociology. If it were a sociological study about society, the descriptor would be "Societies". It is also important that neither the point of view nor the target audience is mentioned in descriptors. "Statistics for librarians" gets the descriptor "Statistics" but not "Librarians" or "Library science".

Consistency and ambiguity

In principle only one subject, the "global" subject of a book, is indexed. Only in exceptional cases is more than one subject allowed. Today the GTT contains about 45,000 descriptors, most of them names, titles and the like, belonging to one of the last six categories mentioned above. Each month new descriptors, mostly names and the like, are added. As far as possible, duplicate terms for the same concept are avoided, as are different names for the same event or different titles for the same painting. Of course, with the number of descriptors already in the thesaurus, it is impossible to avoid all duplication. Special attention has to be given to subjects like "Libraries" and "Education" compared with "Library education". In Dutch these descriptors would be "Bibliotheken" and "Opleidingen" versus "Bibliotheekopleidingen". It can be quite difficult to recognise this kind of duplication. There are special rules for when to use a compound word and when to use the separate terms, based on the ISO standards for thesauri.

With many people working with the same classification and thesaurus, consistency is a major problem. About 150 people index for the system on a regular basis. For a branch library specialising in sociology, all mathematics is statistics, but in the mathematical library a book on "social problems of students" can be indexed as "Social problems". What is specific for one indexer can be very global for another. Consistency is a real problem in the Dutch National Catalogue. It is an illusion that all books on a given subject are always indexed in the same way. The result depends on the library that indexed the book and on the person who did it. It also depends on the subject itself: the more clear-cut the subject, the better. Subjects with a well-known name are less prone to inconsistency than vague subjects. A book about a city will probably get the name of that city as descriptor, but who can correctly predict which descriptor a book about the joys of falling in love will get?

In practice it is unavoidable that some books about a complex subject will get a combination of descriptors and others one compound descriptor, for example "Kinderen" and "Mishandeling" ["Children" and "Mistreatment"] versus "Kindermishandeling" ["Mistreatment of children"].

Some terms are quite ambiguous in their meaning. "Catalogues" has different meanings, and is not the same as "Bibliographies". Still, somebody searching with the descriptor "Catalogues" will also find some bibliographies that are called catalogues by their compilers. The descriptor "Catalogues", however, is reserved for books about catalogues and cataloguing. Catalogues themselves should get the descriptor "Catalogues (form)". The user will not find library catalogues either, because "Library catalogues" is another descriptor.

These problems can be avoided in many cases by first searching for books with the same subject and seeing how they were indexed in the past, but who will always do that? The problem is more serious for GTT indexing than for Basic Classification indexing. The classification is much broader, and it contains many directions as to where to class given disciplines and subdisciplines. The coming third edition will contain even more of these directions than the second.

GOO subject headings

It may be useful to explain how subject indexing with GOO subject headings is put into practice. A subject heading in the GTT is called a record; like the names of persons as an author or as a subject, it is something that differs significantly from a title. The different elements of the record all have their own tags. GTT-records have the following tags:

Several of these elements can be repeated. The number of synonyms can be especially impressive. The subject indexer connects the subject heading directly with the bibliographic description of the book. For the different types of subject headings several tags in the bibliographic description of books are available:

These tags can be repeated as well, although in principle not more than one BC-code per title is given. In most cases more than one subject heading is used. Now, how does GOO really work?

For a start, the subject indexers at the participating libraries always work directly in the Dutch national online catalogue. The indexer has one or more books that are to be indexed on the desk. One by one the corresponding entries in the Dutch national online catalogue are checked. When an entry has already been indexed by another library and is connected with the right BC-code and the correct subject headings, our lucky indexer needs to do nothing at all. The book can continue on its way through the library without any further delay. When the book already has some GOO-tags that are nevertheless considered insufficient, the entry can be adjusted. When a publication is not yet indexed at all, the right BC-code and the right subject headings have to be given to the entry. For this purpose some simple ready-to-use macros are available on the Pica IBW-computers (IBW: Intelligent Bibliographic Workstation). Then the necessary headings and codes are added to the correct tags. Once this has been done correctly, the book is indexed for all the libraries that own the book or will come to own it. Since subject indexing with GOO started in this way in 1991, more than one million books have been indexed by the participating libraries.
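The three cases an indexer meets when checking an entry can be summarised in a small decision sketch. It models only the decision described above, not the Pica IBW macros or the real catalogue format. The names `CatalogueEntry`, `Action` and `decide` are hypothetical, and the judgement of whether existing indexing is sufficient is passed in as a function because in practice it rests with the indexer.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Callable, List, Optional


class Action(Enum):
    NOTHING = "already indexed correctly: nothing to do"
    ADJUST = "existing GOO tags insufficient: adjust the entry"
    INDEX = "not yet indexed: add BC code and subject headings"


@dataclass
class CatalogueEntry:
    title: str
    bc_code: Optional[str] = None                      # e.g. "15.75"
    headings: List[str] = field(default_factory=list)  # GTT subject headings


def decide(entry: CatalogueEntry,
           is_sufficient: Callable[[CatalogueEntry], bool]) -> Action:
    """Choose what the indexer should do with one catalogue entry."""
    if entry.bc_code is None and not entry.headings:
        return Action.INDEX    # no GOO indexing present at all
    if is_sufficient(entry):
        return Action.NOTHING  # another library has already done the work
    return Action.ADJUST       # indexed, but judged insufficient


if __name__ == "__main__":
    entry = CatalogueEntry("A history of Thailand", "15.75", ["Thailand"])
    print(decide(entry, lambda e: True).value)
```

Whatever the outcome, the result is stored in the shared entry, so in this model as in GOO the work is done once for all libraries holding the book.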

Of course it can happen that a subject heading seems to be needed that is not yet available in the GOO Thesaurus of Subject Headings (GTT). Then the indexer must be extra alert. It is not at all difficult to add a new heading to the thesaurus, but it is very undesirable to add unnecessary new records. A new heading must really be a new heading, and not a synonym of a record that already exists, or a compound signifying a subject that hitherto was signified by several separate headings together. But when one is definitely sure, after different searches in the GTT, that the addition of a new heading is really necessary, one has the opportunity to add a new record directly to the thesaurus that very instant. It is possible, and even obligatory, to use the new record right away in the right catalogue entry: new records that are not in use are deleted from the thesaurus after a relatively short period. The direct, daily addition of new records to the database keeps the system up to date and emphasizes the dynamic character of GOO. It is also a significant difficulty, because it makes it impossible to produce regular complete prints of the thesaurus. Such prints are demanded from time to time, maybe not for everyday use in the libraries, but to give an impression of the contents of the GOO thesaurus. As new headings are added to the GTT every day, a printed edition would go out of date too quickly. NB In the last seven years some 15,000 new headings have been added to the GTT, which means some three hundred new headings every month. Formal control and testing of a new subject heading are always done afterwards. If for any reason the new heading is not correct, it is adjusted and then gets its definitive status. It is also possible, of course, that on second thoughts the new heading is not needed at all. Then it is removed from the GTT, and the catalogue entry has to be adapted, too.

The use of GOO in everyday practice, described here only very globally, has its complexities. These can be of a technical kind, but they can also be related to the specific subject of books or headings. For instance: how far can, and how far should, one go in adapting entries that were indexed by relatively unknown colleagues at other libraries? Must the principle of indexing as completely and as exactly as possible really be taken seriously, and can new and ever more specific terms be added to the thesaurus without end? Should one always use titles that are already indexed as an example, even if the indexed titles differ from what is laid down in the new GOO rules? What should one do when two publications that are fairly alike have been indexed in different ways? What has to be done with different subject headings that have more or less the same meaning, but that have gathered different sorts of publications? It is obvious that these kinds of questions are not easy to answer. GOO is relatively young and still developing towards its definitive form. Also, some small changes in the thesaurus may mean that work has to be done in some thousands of catalogue entries - and who has the time and the money to do that?

Still more experience has to be gained with GOO. This means: the national use of a general instrument for subject indexing, in different libraries, at different locations, but in the same way everywhere. When making a formal description of a book, there is only a very small chance of a difference of opinion about the number of pages in the book. But subject classification is another matter: a book about the 'essence of being' runs a large risk of being indexed quite differently by different people in different libraries.

GOO and CvC

GOO cannot be seen apart from the so-called co-ordination of collection management (CvC). Basically, CvC's aim is to attune the collection management of the main scientific and scholarly libraries to each other, so that the money available for the purchase of books can be better divided and unnecessary duplicate purchases are avoided. Before serious decisions in this matter can be made, it must of course be known what the main topics in the various library collections are, and how collection management is organized. In these matters the subject codes from the BC can be used.


The group of libraries that started to use GOO in 1991 consisted of the University libraries of Groningen, Leiden, Nijmegen and Utrecht, and the Royal Library in The Hague. Soon this group grew with some new participants: the University libraries of Rotterdam, Amsterdam (UvA) and Maastricht. Meanwhile, some Provincial and College libraries have joined as well. In 1998 there are some fifteen participating libraries, including the main scientific libraries - except the pure science and technology libraries (Delft, Wageningen, Twente, KNAW and Eindhoven).

Structure of the organization

The responsibility for GOO is divided between the University libraries and the Royal Library on the one hand, and the library automation firm Pica on the other. The national coordination of GOO has been done by the Royal Library since 1993. The Royal Library also provides two GOO thesaurus supervisors for the actual daily work on the GTT. The national coordinator is assisted by (and helpless without) a small coordination committee. This committee decides on many matters, great and small. The libraries for their part all have their own representative in the GOO users council, primarily a forum for the exchange of information and also the place where the library representatives and the coordination committee have regular contact with each other. The subject indexers, finally, have regular meetings with their colleagues, where the most basic and concrete agreements are made, at the level of the specific subjects or disciplines.

Recent developments