
Creating a Comprehensive Lexical Resource for English Using Bayesian Deep Learning and Missing Data Methodology

February 25, 2022 @ 11:30 am - 12:30 pm

Speaker: Bryor Snefjella, Ph.D.
IDRE Scholar,
Psychology Department,
University of California Los Angeles

Video Recording: https://youtu.be/RTpW1-FtBGs

Abstract: Inquiry in the language sciences makes extensive use of open-source data sets, for example, data sets of hand-annotations of words for properties such as their connotation and familiarity. Other common types of open-source resources include behavioural or neuroimaging recordings of responses to linguistic stimuli in controlled experiments, and measurements taken from massive repositories of digitized natural language use. A challenge in the language sciences is the extensive missing data in extant open-source data sets. Most data sets contain information on orders of magnitude fewer words than an average speaker knows, and the words they do contain are non-randomly sampled and non-overlapping. A commonly proposed remedy for this missing data is to replace hand-annotation with machine learning. This is the approach taken by the English Lexicon Imputation Project, the first comprehensive resource of word-level annotations created in cognitive science. In this talk I present the resource, the Bayesian deep neural network used to create it, and how missing data methodology was key to overcoming the limitations of prior literature on computational linguistic resource generation. The talk should be of interest to computational social scientists, language scientists, and those interested in deep learning and missing data methods.

About the speaker: Bryor Snefjella is a postdoctoral researcher in the Psychology Department, Cognitive Area, mentored by Idan Blank, Keith Holyoak, and Hongjing Lu. Before moving to UCLA, Bryor received a PhD in Cognitive Science of Language from McMaster University in Canada. His research on language use patterns in social media has received international media attention. Check him out on his personal website, Twitter, LinkedIn, and ResearchGate.

Details

Date: February 25, 2022
Time: 11:30 am - 12:30 pm
Event Categories: Conferences and Seminars, Meetings
Website: https://ucla.zoom.us/meeting/register/tJ0td-ygrjsoGdEuEw571uCY1hyW2e5c_6c8

Organizer

Institute for Digital Research and Education
