PROPOSALS - FIRST DRAFTS

Recognizing and Representing
Multilingual Texts
Ali-Dinar, Bass,
Betteridge, Bickett, Block, Bowen, Caruso,
Clinton, Cooperman, Coventry, Grendler, Jones,
Knight, Parker, Palaima, Poe, Straley
The inability to render
precisely the full variety of human verbal
expression hinders international scholarship and
collaboration. Therefore we propose a set of
measures to create a more powerful and flexible
computing tool for representing and recognizing
writing systems from around the world at
different time periods. This involves three
specific areas:
- Creating an open-ended,
uniform standard for character sets, one
which extends the present capabilities of
Unicode. For example, Unicode does not
allow for the representation of accent
marks in Cyrillic script.
- The context in which
text appears is of value to researchers.
A digital format must therefore provide
access to data such as fonts,
illustrations, performance, relative
placement of print or script on a page,
color, and material, i.e., the type of
information recorded in a description of
a manuscript. For example, the standard
should tell the researcher that a
particular page of the 1564 printed text
of Cicero's De Oratore with commentary by
Paulus Manutius has a short text of
Cicero dealing with history in Roman
typeface followed by Manutius's
commentary in italic typeface.
- Creating a methodology
for digitization which can identify and
search the full range of physical
contexts in which written expression
appears. Such artifacts might include
books, manuscripts, inscriptions in wood,
stone and other media, textiles, coins,
ceramics, painting and sculpture. To
accomplish this goal we require a vastly
improved representation capability which
will also capture writing in
three-dimensional contexts. An example
might be a vase with an inscription
around its circumference. This
methodology will allow the creation of
multi-media digital archives that can be
queried from the perspective of content,
text as object, or the two as an integral
unit.
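As a rough illustration of the second point above, a page description
might record each block of text together with its typeface and its
place in the reading order. The sketch below only assumes what such a
record could look like. The class names (TextRegion, PageDescription)
and their fields are hypothetical and do not belong to any existing
standard.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class TextRegion:
    """One block of text on a page, with its physical context (hypothetical fields)."""
    author: str
    role: str        # e.g. "primary text" or "commentary"
    typeface: str    # e.g. "roman" or "italic"
    order: int       # reading order on the page, starting at 1


@dataclass
class PageDescription:
    """A page of a printed book or manuscript (hypothetical structure)."""
    work: str
    year: int
    regions: List[TextRegion] = field(default_factory=list)

    def regions_in_typeface(self, typeface: str) -> List[TextRegion]:
        """Return the regions set in the given typeface, in reading order."""
        matches = [r for r in self.regions if r.typeface == typeface]
        return sorted(matches, key=lambda r: r.order)


# The Cicero example from the text, encoded as one page record.
page = PageDescription(
    work="Cicero, De Oratore, with commentary by Paulus Manutius",
    year=1564,
    regions=[
        TextRegion("Cicero", "primary text", "roman", 1),
        TextRegion("Paulus Manutius", "commentary", "italic", 2),
    ],
)

# Query the page by physical context rather than by content.
italic_authors = [r.author for r in page.regions_in_typeface("italic")]
print(italic_authors)
```

A record of this kind would let a researcher query by content, by
physical presentation, or by both together.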
For a pilot project, we
will use legal texts from around the world which
are found in few U.S. library collections.
Examples identified today include:
From the Renaissance period,
the collection of Bartolus of
Sassoferrato (1313-1359), founder of
modern jurisprudence, is printed in 10
large folio volumes from the 16th
century. It is in Latin in Gothic
typeface, with variegated spacing on the
page, and illustrations throughout.
From the Middle East, the
civil law of the Ottoman Empire, in
Ottoman Turkish (which uses modified
Arabic script), was the basis for the
civil law of Turkey and many modern Arab
states. The basic text was published in
multiple volumes beginning in the late
1860s.
From Russia, the Complete
Collection of the Laws of the Russian
Empire (Polnoe sobranie zakonov
Rossiiskoi imperii) gathers the
laws, decrees, statutes and
documents produced by the Russian
government between 1650 and 1830. It is
printed in an older form of Cyrillic
script.
From Africa, the Sundiata
Epic, in Mandinka and Bambara, is based on
oral tradition and is the founding myth
of the Malian Empire and the Mande
diaspora. Recorded performances from this
work will test the multi-media aspects of
this project.
We will also
identify texts from other cultures and
language groups, such as Armenian,
Chinese, Thai, Greek, and Babylonian.
The involvement of
international colleagues is essential at each
step of the process.
Draft 9/23/00
Search and Filter Technology for
the Humanities
Finding relevant data in
the vast and rapidly expanding World Wide Web
necessitates more sophisticated tools to search a
multilingual and multimedia environment. Part of
the strategy for addressing this challenge lies
in indexing online resources within the
humanities. At the same time, we propose to
develop a smart filtering system that
mediates between an individual's research profile
and the objects within peer-reviewed web sites.
In the first instance, this system might employ a
technology that automates tailored processes to
keep individuals abreast of recent developments
and recommends additional resources which may
fall outside the literal reading of an individual
search query. Such a system would contribute to
the development of coherent strategies for
searching multimedia documents and integrating
the resulting data.
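One very simple way to picture such a filter is to score each indexed
resource by how many terms it shares with a scholar's research
profile, while skipping sites that are not peer reviewed. The sketch
below is only an assumption for illustration: the keyword-overlap
scoring, the Resource class, the recommend function and the sample
titles are all hypothetical, and a real system would need far richer
multilingual and multimedia matching.

```python
from dataclasses import dataclass
from typing import List, Set, Tuple


@dataclass
class Resource:
    """An indexed online resource (hypothetical structure)."""
    title: str
    keywords: Set[str]
    peer_reviewed: bool


def recommend(profile: Set[str], resources: List[Resource],
              min_overlap: int = 1) -> List[Tuple[int, str]]:
    """Rank peer-reviewed resources by keyword overlap with a research profile."""
    ranked = []
    for res in resources:
        if not res.peer_reviewed:
            continue  # only peer-reviewed sites are considered
        overlap = len(profile & res.keywords)
        if overlap >= min_overlap:
            ranked.append((overlap, res.title))
    # Highest overlap first; ties are broken alphabetically by title.
    ranked.sort(key=lambda pair: (-pair[0], pair[1]))
    return ranked


# Invented sample data, for illustration only.
profile = {"ottoman", "law", "renaissance"}
resources = [
    Resource("Ottoman civil code index", {"ottoman", "law", "turkish"}, True),
    Resource("Renaissance legal commentaries", {"renaissance", "law", "latin"}, True),
    Resource("Unreviewed site on Ottoman law", {"ottoman", "law"}, False),
    Resource("Mande oral epics", {"mande", "epic", "performance"}, True),
]

results = recommend(profile, resources)
print(results)
```

Because the match is made against the whole profile rather than a
single query, a resource can be recommended even when it falls
outside the literal wording of any one search.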
We propose that the
following organizations cooperate in the
coordination, organization, and sustainability of
this project:
- African Studies
Association
- American Association
for the Advancement of Slavic Studies
- American Studies
Association
- Asian Studies
Association
- Association for Jewish
Studies
- Latin American Studies
Association
- Middle East Studies
Association
- Renaissance Society of
America
The organizations will
focus on using and enabling further development
of existing indexes, e.g., Iter, LANIC, as the
test beds for the smart filtering system.
Humanists within the organizations will be well
equipped to identify high-quality content and to
supply relevant terminology. Technical support
will be provided by information technology
specialists, computer scientists and programmers.
These technical and scientific consultants will
be drawn from the partner organizations and their
index projects.
An International Center for
Computing and the Humanities: a resource,
research and residency center
Many institutions have
centers to aid their scholars with
computer-related projects in the humanities.
Others do not. Nor is there a central
clearinghouse for humanities computing. Therefore, we
propose an open center for computing in the
humanities.
The center will support
peer-reviewed projects involving computers and
the humanities with money and advice. The
center will be funded by multiple foundations,
agencies and corporate partners. It will have a
diverse resource and support staff, and a
physical location in a university, library or
learned society. The center will bring together a
heterogeneous range of institutions, humanities
professionals, and interdisciplinary information
design and communication specialists. The center
will also have a significant virtual component,
offering open support resources that extend
from collaborative project consultation
to serving as a clearinghouse and support system
for shareable applications. Perhaps NINCH could
set it up.
An individual scholar or
group of scholars or learned society could come
to this center with a request for funding for a
specific proposal in which a technology
application addresses a humanities project. The
center would provide advice and, after
peer review, funds to do the project.
Proposals could involve
research, teaching and/or outreach. The projects
and techniques developed would be available for
international dissemination.