IDRE Early Career Research Day winners seek to preserve genocide testimonies’ validity through quantitative data - Institute for Digital Research and Education

By kristenventura@ucla.edu

May 29, 2020

IDRE’s Early Career Research Day recognized Anna Bonazzi and Lizhou Fan’s research, “Language Use and Narrative Structures in Genocide Interviews via Digital Humanities,” as one of the top four posters presented. More than 80 researchers participated in the poster session, which featured 40 high-quality research posters, on November 20, 2019.

Bonazzi and Fan’s research aims to better understand genocide experiences by analyzing the language use and narrative structure of survivor testimonies with a digital humanities approach.

They hope to preserve the validity of genocide victims’ experiences by adding quantitative data to the victims’ recorded stories. Bonazzi and Fan also intend to analyze and question the conventions of testimony as a genre influenced by Holocaust experiences.

“Most of these survivors are elderly or dead, and they cannot advocate for themselves anymore,” said Bonazzi, a Ph.D. student in UCLA’s Department of Germanic Languages.

The first collection of testimonies they analyzed came from Holocaust survivors who were interviewed in 1946 by Professor David Boder from the Illinois Institute of Technology. The researchers also studied the Shoah Foundation’s video recordings of survivors of various massacres including the Armenian genocide, the Nanjing massacre, the Rwandan genocide, and the Holocaust.

Bonazzi and Fan’s findings point to broader trends across the survivors’ speech.

Prior “testimony genre” research used human annotators to identify keywords in each individual interview. This approach is not only time-consuming, but it also limits the possible findings.

“There is a lot more in these interviews that remains invisible if you limit yourself to writing on the side ‘this is a keyword’,” Bonazzi said.

Using computational indexing in Python and R, they uncovered patterns in language choice, code-switching, and main topics.

In Boder’s early interviews, the subjects focused more on the mistreatment and deaths they experienced. Those interviewed many decades after the genocides focused their conversations on their daily lives, religion, and philosophy, indicating that survivors’ testimonies and what they consider important change over time.

Boder’s survivors also switched languages much more frequently. These interviewees were anxious to ensure the interviewer listened to them and believed them, since the public had very little knowledge of the concentration camps at the time. If one of their stories seemed confusing or unfathomable to the interviewer, the survivor would repeat it in a different language or switch to a language the interviewer was more comfortable speaking.
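
To make the idea of measuring language switches concrete, here is a minimal Python sketch. It is not the researchers’ pipeline, which the article does not describe. It assumes a hypothetical transcript format in which each segment is already tagged with a language code, and the function name `count_language_switches` is invented for illustration.

```python
from itertools import groupby


def count_language_switches(segments):
    """Count how often the speaker changes language.

    `segments` is a list of (language_code, text) tuples in spoken order.
    This input format is an assumption made for illustration only.
    """
    languages = [lang for lang, _ in segments]
    # Collapse consecutive segments in the same language into one run.
    runs = [lang for lang, _ in groupby(languages)]
    # Each boundary between two runs is one switch.
    return max(len(runs) - 1, 0)


# Invented example: a speaker repeats a point in another language, then returns.
example = [
    ("de", "first statement"),
    ("de", "second statement"),
    ("en", "the same point repeated"),
    ("de", "back to the first language"),
]
print(count_language_switches(example))  # prints 2
```

Dividing such a count by the number of segments would give a rough per-interview switch rate that could be compared across collections.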

Bonazzi said she believes their findings can broaden the public’s understanding of these events and provide indicators of a genocide’s early stages.

“If you have a broad overview of how the events happened for everybody – the loss of liberties and the complications that started before the real genocide – you can recognize those dynamics in the present,” Bonazzi said.

Fan said he hopes that their computational model can accompany human annotators’ work in analyzing other transcripts and interviews in the future.

“There is a lot of work that hasn’t been human annotated,” said Fan, a research associate for UCLA Digital Humanities. “There are too many corpora, so maybe before human annotators have the chance, we can directly use our computation methods.”

Fan said he envisions an interactive back-and-forth between human annotators and the computational indexing system. The annotators would study a small portion of the corpus to set a direction for the computational methods, and the larger trends found by the computations would then give annotators better hints about what to look for in the interviews.
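
One way to picture that loop is the following Python sketch. It is an illustration under stated assumptions, not the team’s computational model: annotators supply a few seed keywords, and a simple co-occurrence count over the larger corpus suggests further terms for them to review. The names `tokenize` and `suggest_terms` and the sample sentences are hypothetical.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def suggest_terms(corpus, seed_keywords, top_n=5):
    """Suggest terms that appear in documents containing any seed keyword.

    `corpus` is a list of transcript strings; `seed_keywords` stands in for
    the keywords human annotators marked in a small sample.
    """
    seeds = {keyword.lower() for keyword in seed_keywords}
    co_counts = Counter()
    for document in corpus:
        tokens = tokenize(document)
        if seeds.intersection(tokens):
            # Count each co-occurring term once per document.
            co_counts.update(t for t in set(tokens) if t not in seeds)
    return [term for term, _ in co_counts.most_common(top_n)]


# Invented sample sentences, not taken from any real testimony.
sample_corpus = [
    "the camp was cold",
    "we prayed in the camp",
    "daily life went on",
]
print(suggest_terms(sample_corpus, ["camp"], top_n=1))
```

A real system would at least filter out common function words and handle multiple languages; the sketch only shows the direction of the exchange, from annotator seeds to corpus-wide hints.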

Bonazzi and Fan started working on this project in June 2019, under Professor Todd Presner, Ph.D., in the UCLA Program in Digital Humanities.