Democratizing Machine Learning for Interdisciplinary Scholars: Report on Organizing the NLP+CSS Online Tutorial Series
Expressive Interviewing Agents to Support Health-Related Behavior Change: Randomized Controlled Study of COVID-19 Behaviors
SCITUNE: Aligning Large Language Models with Scientific Multimodal Instructions
Lexical Measurement of Teaching Qualities
Understanding the Role of Questions in Mental Health Support-Seeking Forums
How well do you know your audience? Socially-aware question generation
Surfacing Racial Stereotypes through Identity Portrayal
FIBER: Fill-in-the-Blanks as a Challenging Video Understanding Evaluation Framework
Room to Grow: Understanding Personal Characteristics Behind Self Improvement Using Social Media
Tuiteamos o pongamos un tuit? Investigating the social constraints of loanword integration in Spanish social media
GT Thesis Submission
I recently submitted my PhD thesis. Please hold your applause.
Characterizing Collective Attention via Descriptor Context in Public Discussions of Crisis Events
Queer NLP - stand and be counted
TL;DR: I want marginalized people and communities to be recognized by those who design AI systems, and for those people’s needs to be clearly communicated and understood.
Rock, Rap, or Reggaeton?: Assessing Mexican Immigrants’ Cultural Assimilation Using Facebook Data
Making “fetch” happen: The influence of social and linguistic context on nonstandard word growth and decline
NAACL 2018
I recently returned from NAACL 2018, where I presented a short paper on language variation and political identity co-authored with Yuval Pinter and my advisor Jacob Eisenstein. Check out the full paper here and the slides here! The TL;DR is that local languages such as Catalan are likely associated with political identity, and that code-switching in political situations may have different constraints than typical frameworks such as audience design would predict.
Sí o no, ¿què penses? Catalonian independence and linguistic identity on social media
Replication studies
As part of a class in computational social science, my friend Yuval and I just wrapped up a short replication study relating to politics and language variation. We replicated the main findings of Shoemark et al. (2017) “Aye or naw,” with a twist: instead of focusing on variation between Scots and English, we looked at Spanish versus Catalan. You can check out preliminary results here.
#anorexia, #anarexia, #anarexyia: Characterizing Online Community Practices with Orthographic Variation
NWAV 2017
I’ve just returned from NWAV 2017, a conference centered around language variation and change. Although I’m technically focused on social computing, my research looks at language change in online communities, such as what makes certain lexical innovations survive longer than others. In the work I presented at NWAV (poster here), I found that lexical innovations on Reddit are more likely to succeed when they are disseminated more widely across social and linguistic contexts. Here is photographic evidence of my presentation, documented by Emily Sabo!
A Viz of Ice and Fire: Exploring Entertainment Video Using Color and Dialogue
Reading notes
This summer, I’ve been doing some reading on social science research methods. That topic is obviously very broad, but I wanted to get a better sense of “how to think like a social scientist” rather than “how to think like a sociolinguist.” I’ve been doing sociolinguistics for so long (6 years?!) that I’ve gotten stuck in a bubble about how to study social phenomena, and the books that I’ve read so far have done a nice job of providing the bird’s-eye view of quantitative social science methods that I’ve been missing. I’m hosting the notes here if you’re interested!
What does "context" mean? Part two
As I mentioned last time, my current work is concerned with how linguistic and social context influence the likelihood of a new word’s adoption. Last time I talked about semantic context as the popularity of a word’s “nearest neighbors” and how that might play a role in word adoption.
What does "context" mean? Part one
My current research is concerned with the relationship between the social and semantic context of lexical innovations and their likelihood of adoption in the online community Reddit. The innovation “fleek” succeeded within a restricted context, i.e. the phrase “on fleek”, but this may be a rarity: most innovations might instead succeed by being used in a wide variety of contexts. Unlike “fleek,” the intensifier “af” (“as fuck”) seems to occur in a wide range of post-adjective contexts (“cool af”, “dope af”, etc.). Related to work on adoption of innovations like this and this, it seems like there is a nontrivial relationship between the linguistic context of a new word and the likelihood of that word being adopted by a community. But how do we study that relationship quantitatively? It’s not easy to come up with a universal definition of “context” apart from the generic “company that you keep” definition, and even that leaves a lot of room for interpretation.
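As a rough illustration of one way to make “context” concrete (an assumption for the sake of example, not necessarily the measure used in this research), one could compute the entropy of the words that immediately precede a target word: a word that always follows “on” scores zero, while a word that follows many different adjectives scores higher. The function name `context_entropy` and the toy posts below are hypothetical.

```python
from collections import Counter
from math import log2


def context_entropy(posts, target):
    """Shannon entropy (in bits) of the words immediately preceding `target`.

    Hypothetical helper for illustration: tokenization is plain whitespace
    splitting, which is far cruder than what real social media text needs.
    """
    counts = Counter()
    for post in posts:
        tokens = post.lower().split()
        # Pair each token with the token before it.
        for prev, word in zip(tokens, tokens[1:]):
            if word == target:
                counts[prev] += 1
    total = sum(counts.values())
    if total == 0:
        return 0.0  # target never appears with a preceding word
    return -sum((c / total) * log2(c / total) for c in counts.values())


# Toy data (made up for this sketch, not drawn from Reddit).
toy_posts = [
    "eyebrows on fleek",
    "outfit on fleek today",
    "that movie was cool af",
    "this song is dope af",
]

fleek_entropy = context_entropy(toy_posts, "fleek")  # only ever follows "on"
af_entropy = context_entropy(toy_posts, "af")        # follows "cool" and "dope"
```

In this toy setup “fleek” has a single preceding context while “af” has two equally frequent ones, so “af” gets the higher score. A real analysis would need proper tokenization, wider context windows, and some control for word frequency.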
ICWSM 2017
I recently attended the International Conference on Web and Social Media (May 14-18) in Montreal to present work on semantic change that I conducted during my summer 2016 internship at the Pacific Northwest National Laboratory. Check out the full paper here and the poster here!