Linguists predict unknown words using language comparison – ScienceDaily

For a long time, historical linguists have used the comparative method to reconstruct earlier states of languages that are not attested in written sources. The method consists of the detailed comparison of words in related descendant languages and allows linguists to infer, in considerable detail, the ancient pronunciation of words that were never recorded in any form. It has long been known that the method could also be used to infer how an undocumented word in a given language would sound, provided that at least some information about that language and about related languages is available, but so far this had never been explicitly tested.

Two researchers from SOAS University of London and the Max Planck Institute for the Science of Human History recently published an article in Diachronica, an international journal of historical linguistics. In the article, they describe the results of an experiment in which they applied the traditional comparative method to explicitly predict the pronunciation of words in eight Western Kho-Bwa language varieties spoken in India. Belonging to the Trans-Himalayan family (also known as Sino-Tibetan or Tibeto-Burman), these varieties have not yet been described in detail, and many words have not yet been documented in fieldwork. The researchers began their experiment with an existing set of etymological data from Western Kho-Bwa varieties that was collected during fieldwork in the Indian state of Arunachal Pradesh between 2012 and 2017. Within the dataset, the authors observed multiple gaps in which word forms for certain concepts were missing.

“When doing fieldwork, it’s inevitable that you will miss certain words. It’s a bit annoying when you notice this afterwards, but in this case, we realized that this was the perfect opportunity to test how well the language reconstruction methods work,” says Tim Bodt, the study’s first author.

The researchers set up a computer-assisted workflow to predict the missing word forms. The comparative method is traditionally applied by hand, but newer computational solutions have helped the researchers increase the efficiency and reliability of the process; all results were then verified and refined manually. To increase the transparency and validity of the experiment, they then registered their predictions online.

“Registration is extremely important in many fields of science because it ensures that researchers adhere to good scientific practice, but as far as we know this has never been done in historical linguistics,” says Johann-Mattis List, who carried out the computational analyses for the study.

“By registering our predictions online, we made sure that we could no longer change them in light of the results we got in our subsequent verification process,” adds Bodt.

With predictions in hand, Bodt then traveled to India to verify the predicted words with native speakers of Western Kho-Bwa languages. After asking participating local language consultants to provide their words for the concepts studied, the authors compared these attested words with their earlier predictions. Measured as the proportion of correctly predicted sounds per word form, the predictions were 76% accurate, which is remarkable considering the limited amount of information that was used to predict the word forms. In addition, the researchers were able to identify several reasons why some predictions did not match the forms actually attested in the languages.
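To make the idea of a "proportion of correctly predicted sounds per word form" concrete, here is a minimal Python sketch. It is not the authors' scoring procedure: it assumes each word is already split into sounds and aligned to the same length (with "-" marking a gap), and the word forms are invented for illustration. The function names `sound_accuracy` and `mean_accuracy` are hypothetical.

```python
def sound_accuracy(predicted, attested):
    """Share of aligned positions where the predicted sound equals the attested one."""
    if len(predicted) != len(attested):
        raise ValueError("forms must be aligned to the same length")
    if not predicted:
        return 0.0
    matches = sum(p == a for p, a in zip(predicted, attested))
    return matches / len(predicted)


def mean_accuracy(pairs):
    """Average the per-word scores over all (predicted, attested) pairs."""
    scores = [sound_accuracy(p, a) for p, a in pairs]
    return sum(scores) / len(scores)


# Invented toy forms, NOT real Western Kho-Bwa data.
pairs = [
    (["k", "a", "n"], ["k", "a", "n"]),  # all three sounds correct
    (["m", "i", "t"], ["m", "e", "t"]),  # vowel mispredicted
    (["s", "u", "-"], ["s", "u", "k"]),  # final consonant not predicted
]

print(f"{mean_accuracy(pairs):.0%}")
```

Scoring sound by sound rather than word by word gives partial credit to a prediction that is wrong in only one segment, which is why a per-sound figure can be informative even when few whole words match exactly.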

“The more we know about a language family in general, the better we can predict unknown word forms. This is all possible because languages change their sound systems in a surprisingly regular way,” List says. “Despite the fact that so little was known about the Western Kho-Bwa languages and their linguistic history, we were able to show through our experiment that regular sound changes lead to predictable word forms. In turn, our experiment greatly improved our understanding of the Western Kho-Bwa languages and their linguistic history.”
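The link between regular sound change and predictable word forms can be illustrated with a small Python sketch. It is a deliberately simplified stand-in for the comparative method, not the workflow used in the study: it assumes only two varieties, words already aligned sound by sound, and correspondences that ignore phonetic context. The data and the names `learn_correspondences` and `predict` are invented.

```python
from collections import Counter, defaultdict


def learn_correspondences(known_pairs):
    """For each sound in variety A, pick the sound it most often matches in variety B."""
    counts = defaultdict(Counter)
    for source, target in known_pairs:
        # Assumes both forms are pre-aligned and of equal length.
        for s, t in zip(source, target):
            counts[s][t] += 1
    return {s: c.most_common(1)[0][0] for s, c in counts.items()}


def predict(source_form, correspondences):
    """Predict the form in variety B; '?' marks sounds with no known correspondence."""
    return [correspondences.get(s, "?") for s in source_form]


# Invented cognate pairs (variety A, variety B): A's "p" regularly matches B's "b".
known_pairs = [
    (["p", "a", "t"], ["b", "a", "t"]),
    (["p", "i", "n"], ["b", "i", "n"]),
    (["t", "a", "p"], ["t", "a", "b"]),
]

rules = learn_correspondences(known_pairs)
# The word is documented in A but missing in B, so its B form is predicted.
prediction = predict(["n", "i", "p"], rules)
print(prediction)
```

Because the toy correspondence between "p" and "b" holds in every known pair, it can be applied to a word that was never recorded in variety B. The more cognate sets are available, the more correspondences can be learned, which matches List's point that knowing more about a language family improves predictions.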

In addition to providing a concrete example of the power of historical linguistic methodology and the value of their experiment for linguistic studies, the authors identify some additional benefits of word prediction in linguistic research.

“Word prediction increases the transparency and efficiency of our research and fieldwork. This is crucial given the rapid loss of languages and the limited funding for descriptive language work. Moreover, it also has an educational aspect, as it encourages speakers to think about their own linguistic heritage,” says Bodt.

The researchers hope the results of their groundbreaking experiment will encourage other linguists, both descriptive and historical, to follow suit and make more explicit and conscious use of the regularity of sound change and of the prediction of word forms.
