Nyingarn will be an online platform for digitised early sources in Australia’s Indigenous languages. The three-year project will begin in mid-2021.
There are over 800 Australian Indigenous languages, and most are no longer spoken every day. For most of them very little has been recorded and, where records exist, they are often only on paper in a single library. In part, this reflects the destruction of Aboriginal people and societies and the prevailing disregard for Indigenous cultures and languages at the time. There are rare examples of early settlers seeking to understand and record Indigenous languages. This project aims to build a system, Nyingarn (the Nyungar word for echidna), to discover, convert, present, ingest (that is, accession into Nyingarn), and search as many of these written sources in Australia’s Indigenous languages as possible, in a new online digital platform that provides the text and images of the original documents. Our experience with presenting the Bates Online project is that Indigenous people want to access and re-use early sources in their languages.
It is the responsibility of academics who work with Indigenous people to make research materials available. These materials can be wordlists of an Indigenous language, sometimes a few words, sometimes a few hundred words. Nyingarn aims to make as many of these manuscript sources as possible available, searchable, and re-usable as textual documents. It will be a platform with a workflow that allows the continual addition of new manuscripts. We use cutting-edge methods for training Optical Character Recognition (OCR) with Natural Language Processing (NLP) techniques to automate as much of the conversion to text as possible.
Nyingarn will provide specifications for users to add files after current funding is exhausted, ensuring ongoing use of the infrastructure. This, together with a commitment to open formats for the data, will make the content accessible and available for computational treatment in addition to allowing it to be downloaded for re-use in language teaching programs. An item will only be posted to the site after approval from a relevant language authority.
Research outcomes: As in many fields, the creation of primary data is essential to good research. Nyingarn will facilitate the study of, among many other topics: language change over time; uses and range of biological taxa; distribution of songs over time; variation in languages and the diversity of languages recorded close to first contact; and relationships between Indigenous people and first settlers. A range of unexpected topics will also arise as previously inaccessible material becomes publicly available and searchable.
Often, early records for Australian languages provide important information on social life, cultural activities, and Indigenous knowledge. Where languages are no longer spoken daily, these records can support efforts by Indigenous people to reconnect with heritage, especially in language revival projects, which are becoming more common. From a research perspective, each source adds data for understanding the local language, its history, and its relationship to other languages.
Bruce Pascoe, the noted author and intellectual, discussing the journals of William Thomas, a priceless collection of information about Victorian Aborigines from the late 1830s, notes that they are one of the most important primary sources in Australian history. “So who published his journals? A university, a government department, his church, a private researcher? No, the Victorian Aboriginal Corporation for Languages published the papers in 2014.” He goes on to note that George Augustus Robinson’s diaries, a similarly important set of primary manuscripts, were only published in the 1980s. “While settler reminiscences, football club centenaries and books on outback toilets found plenty of researchers and publishers, two of the most important texts on Aboriginal culture waited over two hundred years.” (Pascoe 2014: i-ii)
We can characterise this as Pascoe’s challenge, and we will take it up by building an online platform, Nyingarn, to increase the accessibility of primary sources for Australian languages. That accessibility is vital for the growing interest in language and cultural revival.
Nyingarn is funded by an Australian Research Council Linkage Infrastructure, Equipment, and Facilities grant (LE200100006).