
OUTLINE PROPOSALS - FIRST DRAFTS

Recognizing and Representing Multilingual Texts

Ali-Dinar, Bass, Betteridge, Bickett, Block, Bowen, Caruso, Clinton, Cooperman, Coventry, Grendler, Jones, Knight, Parker, Palaima, Poe, Straley

The inability to render precisely the full variety of human verbal expression hinders international scholarship and collaboration. Therefore we propose a set of measures to create a more powerful and flexible computing tool for representing and recognizing writing systems from around the world at different time periods. This involves three specific areas:

  1. Creating an open-ended, uniform standard for character sets, one which extends the present capabilities of Unicode. For example, Unicode does not allow for the representation of accent marks in Cyrillic script.
  2. Providing access to the context in which text appears, which is of value to researchers. A digital format must record data such as fonts, illustrations, performance, relative placement of print or script on a page, color, and material, i.e., the type of information recorded in a description of a manuscript. For example, the standard should tell the researcher that a particular page of the 1564 printed text of Cicero's De Oratore with commentary by Paulus Manutius has a short text of Cicero dealing with history in Roman typeface, followed by Manutius's commentary in italic typeface.
  3. Creating a methodology for digitization which can identify and search the full range of physical contexts in which written expression appears. Such artifacts might include books, manuscripts, inscriptions in wood, stone and other media, textiles, coins, ceramics, painting and sculpture. To accomplish this goal we require a vastly improved representation capability which will also capture writing in three-dimensional contexts. An example might be a vase with an inscription around its circumference. This methodology will allow the creation of multi-media digital archives that can be queried from the perspective of content, text as object, or the two as an integral unit.
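
As a rough illustration of the second and third points, the De Oratore page described above could be recorded so that it can be queried both as content and as a physical object. This is only a minimal sketch: the class names (TextRegion, Page), their fields, and the two query methods are hypothetical and are not part of any existing standard or of this proposal.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class TextRegion:
    """One block of writing on a page, with its physical context (hypothetical)."""
    author: str
    role: str        # e.g. "primary text" or "commentary"
    typeface: str    # e.g. "roman", "italic", "gothic"
    order: int       # reading order on the page (1 = first)
    text: str = ""   # transcription; left empty in this sketch


@dataclass
class Page:
    """A page treated both as content and as a physical object (hypothetical)."""
    work: str
    year: int
    regions: List[TextRegion] = field(default_factory=list)

    def by_typeface(self, typeface: str) -> List[TextRegion]:
        """Query the page as an object: which regions use this typeface?"""
        return [r for r in self.regions if r.typeface == typeface]

    def layout(self) -> List[str]:
        """Describe the regions of the page in reading order."""
        ordered = sorted(self.regions, key=lambda r: r.order)
        return [f"{r.author} ({r.role}, {r.typeface})" for r in ordered]


# The page from the example above: Cicero in Roman type, then Manutius in italic.
page = Page(
    work="De Oratore",
    year=1564,
    regions=[
        TextRegion("Paulus Manutius", "commentary", "italic", order=2),
        TextRegion("Cicero", "primary text", "roman", order=1),
    ],
)

print(page.layout())
print([r.author for r in page.by_typeface("italic")])
```

A fuller format would also need fields for color, material, illustrations, and three-dimensional placement, which this sketch leaves out.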

For a pilot project, we will use legal texts from around the world which are found in few U.S. library collections. Examples identified today include:

From the Renaissance period, the collected works of Bartolus of Sassoferrato (1313-1357), founder of modern jurisprudence, are printed in 10 large folio volumes from the 16th century. The text is in Latin in Gothic typeface, with variegated spacing on the page, and illustrations throughout.
From the Middle East, the civil law of the Ottoman Empire, in Ottoman Turkish (which uses modified Arabic script), was the basis for the civil law of Turkey and many modern Arab states. The basic text was published in multiple volumes beginning in the late 1860s.
From Russia, the Complete Collection of the Laws of the Russian Empire (Polnoe sobranie zakonov Rossiiskoi imperii) comprises the laws, decrees, statutes and documents produced by the Russian Government between 1650 and 1830. It is printed in an older form of Cyrillic script.
From Africa, the Sundiata Epic in Mandinka and Bambara is based on oral tradition and is the founding myth of the Malian Empire and the Mande diaspora. Recorded performances from this work will test the multi-media aspects of this project.
We will also identify texts from other cultures and language groups, such as Armenian, Chinese, Thai, Greek, and Babylonian.

The involvement of international colleagues is essential at each step of the process.

Draft 9/23/00


Search and Filter Technology for the Humanities

Finding relevant data in the vast and rapidly expanding World Wide Web requires more sophisticated tools for searching a multilingual and multimedia environment. Part of the strategy for addressing this challenge lies in indexing online resources within the humanities. At the same time, we propose to develop a smart filtering system that mediates between an individual's research profile and the objects within peer-reviewed web sites. In the first instance, this system might employ a technology that automates tailored processes to keep individuals abreast of recent developments and that recommends additional resources which may fall outside the literal reading of an individual search query. Such a system would contribute to the development of coherent strategies for searching multimedia documents and integrating the resulting data.
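
One way to picture the mediation between a research profile and indexed objects is the small sketch below. It is an assumption-laden illustration, not a design for the proposed system: the Resource class, the filter_resources function, the term-overlap rule, and all sample data are invented for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Resource:
    """An indexed web object (hypothetical structure)."""
    title: str
    terms: frozenset      # subject terms assigned by an index
    peer_reviewed: bool


def filter_resources(query, profile, resources):
    """Split peer-reviewed resources into literal hits and recommendations.

    hits: resources whose terms contain every word of the query.
    recommended: resources that miss the literal query but share
    at least one term with the researcher's profile.
    """
    query_terms = set(query.lower().split())
    hits, recommended = [], []
    for res in resources:
        if not res.peer_reviewed:
            continue  # only objects from peer-reviewed sites are considered
        if query_terms <= res.terms:
            hits.append(res.title)
        elif profile & res.terms:
            recommended.append(res.title)
    return hits, recommended


# Invented sample data, for illustration only.
resources = [
    Resource("Ottoman civil law texts", frozenset({"ottoman", "law", "turkish"}), True),
    Resource("Renaissance legal commentaries", frozenset({"renaissance", "law", "latin"}), True),
    Resource("Unreviewed page", frozenset({"ottoman", "law"}), False),
    Resource("Thai poetry", frozenset({"thai", "poetry"}), True),
]
profile = {"ottoman", "law", "renaissance"}

hits, recommended = filter_resources("ottoman law", profile, resources)
print(hits)
print(recommended)
```

Here the second list holds items that fall outside the literal query but still match the profile. A real system would need multilingual term matching and ranking, which simple set overlap does not provide.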

We propose that the following organizations cooperate in the coordination, organization, and sustainability of this project:

African Studies Association
American Association for the Advancement of Slavic Studies
American Studies Association
Association for Asian Studies
Association for Jewish Studies
Latin American Studies Association
Middle East Studies Association
Renaissance Society of America

The organizations will focus on using and enabling further development of existing indexes, e.g., Iter and LANIC, as the test beds for the smart filtering system. Humanists within the organizations will be well equipped to identify high-quality content and to supply relevant terminology. Technical support will be provided by information technology specialists, computer scientists and programmers. These technical and scientific consultants will be drawn from the partner organizations and their index projects.


An International Center for Computing and the Humanities: a resource, research and residency center

Many institutions have centers to aid their scholars with computer-related projects in the humanities; others do not. Nor is there a central clearinghouse for humanities computing. Therefore, we propose an open center for computing in the humanities.

The center will support peer-reviewed projects involving computers and the humanities with money and advice. The center will be funded by multiple foundations, agencies and corporate partners. It will have a diverse resource and support staff, and a physical location in a university, library or learned society. The center will bring together a heterogeneous range of institutions, humanities professionals, and interdisciplinary information design and communication specialists. The center will also have a significant virtual component, offering a broad range of open support resources, ranging from collaborative project consultation to serving as a clearinghouse and support system for shareable applications. Perhaps NINCH would set it up.

An individual scholar, a group of scholars, or a learned society could come to this center with a request for funding for a specific proposal in which a technology application addresses a humanities project. The center would provide advice and, after peer review, funds to do the project.

Proposals could involve research, teaching and/or outreach. The projects and techniques developed would be available for international dissemination.