WiNLP @ EMNLP 2021


Widening NLP will be a hybrid workshop at EMNLP 2021. For in-person attendees, events will be held at the Barceló Bávaro Convention Centre in Punta Cana, the Dominican Republic. For online attendees calling in from around the world, events will be held on Underline and on Zoom.

The accepted submissions are listed at the end of this page.

Schedule (in local time)

8:00 Poster Session A
9:00 Opening remarks
9:10 Joint Keynote speaker with Queer In AI: Dr. Jasmijn Bastings (English with Spanish translation)
9:40 Q&A for Dr. Jasmijn Bastings
9:50 Keynote speaker 2: Prof. Manuel Montes (Spanish with English translation)
10:20 Q&A for Prof. Manuel Montes
10:30 Coffee break (in-person at the venue/online in your own home)
Note: There will be an open Zoom space for online attendees to socialize if they would like.
11:00 Tutorial: How to write and respond to a review
12:00 Lunch (in-person at the venue/online in your own home)
Note: There will be an open Zoom space for online attendees to socialize if they would like.
13:00 Panel: The Peer Review Process and Widening NLP
14:20 Coffee break (in-person at the venue/online in your own home)
14:45 Keynote speaker 3: Dra. Adriana Lorena Iñiguez Carrillo
15:15 Q&A for Dra. Adriana Lorena Iñiguez Carrillo
15:25 Closing remarks
15:35 Poster Session B
16:35 Social & coffee (in-person at the venue/online in gather.town)

Keynote Speakers:

Note: All keynotes will be translated into Spanish and English.

Nota: Habrá traducciones de todas las notas clave en español e inglés.

WiNLP/QueerInAI Keynote Speaker:

Dr. Jasmijn Bastings

Jasmijn Bastings is a researcher at Google Research Amsterdam. Her research spans natural language processing, machine learning, and explainable AI. Recently, she has been focusing on making NLP more fair, interpretable and robust. She has published work on interpretable neural predictions and saliency methods, and is a co-author of Joey NMT and the Language Interpretability Tool (LIT). Jasmijn holds a PhD from ILLC, University of Amsterdam, where she worked on linguistically-informed neural machine translation, interpretability and generalization.

Talk Title: The Very Hungry Caterpillar 🐛 and The Wish for a Wider and More Dynamic NLP

Abstract: Large, parameterized, end-to-end trained models are the workhorses of today’s NLP research and applications. What could possibly be wrong with them? In my talk I identify one shortcoming—their static nature in an ever-changing world—and connect it to yet another static phenomenon in NLP: the way we disseminate our research. I argue for a more dynamic NLP, both in terms of research and publishing, and that a more dynamic NLP is a wider NLP.

WiNLP Keynote Speaker:

Prof. Manuel Montes, Laboratorio de Tecnologías del Lenguaje, Instituto Nacional de Astrofísica, Óptica y Electrónica (INAOE), México | Language Technologies Lab, National Institute of Astrophysics, Optics and Electronics | https://ccc.inaoep.mx/~mmontesg/

El Dr. Manuel Montes es investigador en la Coordinación de Ciencias Computacionales del INAOE. Su trabajo de investigación se orienta principalmente a los temas de recuperación de información, minería de texto, y análisis de autoría, sobre los cuales ha publicado alrededor de 250 artículos en revistas y congresos internacionales, y le han valido el reconocimiento de Investigador Nacional nivel II del SNI, así como de miembro de la Academia Mexicana de Ciencias.

El Dr. Montes ha sido profesor invitado en la Universidad Politécnica de Valencia, en la Universidad de Génova, y en la Universidad de Alabama en Birmingham. También es miembro fundador de la Academia Mexicana de Computación, la Asociación Mexicana de Procesamiento de Lenguaje Natural, y la Red Temática en Tecnologías del Lenguaje del CONACyT. En el contexto de las últimas dos, ha sido organizador del Taller Nacional de Tecnologías del Lenguaje (de 2004 a 2016), del Taller Mexicano sobre Detección de Plagio y Análisis de Autoría (2016-2020), la Escuela de Otoño de Tecnologías del Lenguaje (2015 y 2016), así como de tareas de evaluación sobre perfilado de autores, análisis de lenguaje agresivo, y detección de noticias falsas en el MEX-A3T (2018-2020) y MeOffendEs (2021).

Manuel Montes is a Full Professor at the National Institute of Astrophysics, Optics and Electronics (INAOE) in Mexico. His research focuses on automatic text processing. He is the author of more than 250 journal and conference papers in the fields of information retrieval, text mining and authorship analysis, which have earned him recognition as a National Researcher Level II and membership in the Mexican Academy of Sciences (AMC).

Dr. Montes has been a visiting professor at the Polytechnic University of Valencia (Spain), the University of Genoa (Italy), and the University of Alabama at Birmingham (USA). He is also a founding member of the Mexican Academy of Computer Science (AMEXCOMP), the Mexican Association of Natural Language Processing (AMNLP), and the Language Technology Network of CONACYT. Within the latter two, he has organized the National Workshop on Language Technologies (2004 to 2016), the Mexican Workshop on Plagiarism Detection and Authorship Analysis (2016-2020), the Mexican Autumn School on Language Technologies (2015 and 2016), and the MEX-A3T Shared Task on author profiling, aggressiveness analysis and fake news detection in Mexican Spanish at IberLEF (2018-2021).

¿Ofensas o solo malas palabras? Retos del análisis de lenguaje ofensivo en redes sociales

Fenómenos como el bullying, la homofobia, el sexismo y el racismo han trascendido a las redes sociales, y son una preocupación importante de las empresas proveedoras, así como de los gobiernos, pues pueden tener (o ya tienen) un impacto social muy negativo. Esto ha motivado el desarrollo de gran cantidad de métodos para su identificación automática, que van desde simples que consideran lexicones hasta más sofisticados basados en técnicas de aprendizaje profundo. Un desafío común para todos estos métodos es lograr distinguir entre el uso ofensivo y el coloquial e incluso cotidiano de groserías y palabras vulgares. Esta charla se enfoca en este problema específico.

De manera general, en esta charla se describirá el concepto de lenguaje abusivo, los tipos de éste, y los problemas principales para su detección por medios automáticos. Se presentarán brevemente dos métodos que hemos desarrollado en el Laboratorio de Tecnologías del INAOE para su identificación, y finalmente se relatarán los esfuerzos realizados para construir recursos para su estudio y evaluación en español de México.

Offenses or just dirty words? Challenges of offensive language identification in social networks

Phenomena such as bullying, homophobia, sexism and racism have spread to social networks, and are an important concern for the companies that run these platforms as well as for governments, as they can have (or already have) a very negative social impact. This situation has motivated the development of many methods for their automatic identification, ranging from simple ones based on fixed lexicons to more sophisticated ones based on deep learning techniques. A common challenge for all of them is to distinguish the offensive use of profanity and vulgar words from their colloquial and even everyday use. This talk focuses on this specific problem.

In short, in this talk I will describe the concept of abusive language, its types, and the main problems in detecting it automatically. I will also present some methods that we have developed at INAOE’s Language Technologies Lab for its identification. Finally, I will describe efforts to build resources for its study and evaluation in Mexican Spanish.

WiNLP Keynote Speaker:

Dra. Adriana Lorena Iñiguez Carrillo, Universidad de Guadalajara

Adriana Iñiguez is a Professor at the University of Guadalajara in Mexico. She is a computer systems engineer and holds a master’s degree in computing and a PhD in Information Technology. She collaborates with the CUCEA Smart Cities Innovation Center of the University of Guadalajara. She has professional experience as a coordinator of university programs, carrying out quality and relevance analyses of study plans in the computer science area. Adriana’s primary research interests include Human-Computer Interaction, voice interaction, and artificial intelligence.

Talk Title: to be announced

Tutorial: How to Write and Respond to a Review

Speakers: WiNLP Chairs Antonios Anastasopoulos and Tirthankar Ghosal

Peer review is often regarded as the “gatekeeper of scientific knowledge and wisdom”. Almost every researcher has to go through this process to make their research publishable and visible to the scientific community. Responding to the reviewers is one of the authors’ crucial jobs in this process, but the task can be daunting. A constructive and informative rebuttal can change a reviewer’s perception or misconception of the paper and thereby affect the outcome of peer review. In this tutorial we will focus on how to write an acceptable review response. We will cover how to write a rebuttal constructively without sounding rude, how to build your arguments while taking the reviewer’s opinion into consideration, and finally how to chalk out action points once you receive the review.

Panel: The Peer Review Process and Widening NLP

This panel will bring together an international group of stakeholders who organize, engage in, and explore the peer review process in ACL. The speakers will discuss the many ways in which peer review may harm or help create a more diverse and equitable research space, from Anglo-centric publishing, to anonymous review, to ensuring that the author and reviewer pools represent a variety of life experiences. Looking to the future, the speakers will engage with each other and the WiNLP audience to recognize the high value of engaging with a more diverse author pool and strategize how to select for one.

Panelists:

Dr. Bahar Mehmani

Dr. Bahar Mehmani is Reviewer Experience Lead in the Global STM journals at Elsevier. She leads Elsevier’s peer review strategy and oversees projects related to researchers’ and academics’ pain points throughout the peer-review process. Bahar is a member of the NISO peer review taxonomy working group and the chair of the peer review committee and council member of the European Association of Science Editors (EASE). She received her PhD in Theoretical Physics from the University of Amsterdam in 2010. Before joining Elsevier, she was a postdoctoral researcher at the Max Planck Institute for the Science of Light.

Anna Rogers

Anna Rogers is a post-doctoral associate at the University of Copenhagen. Her main research areas are evaluation and analysis of deep learning models for NLP. She is also active in the sphere of NLP methodology, working on issues in peer review and organizing the workshop on Insights from Negative Results in NLP.

Cecilia Superchi

Cecilia Superchi is currently a Postdoctoral Research Associate in Clinical Epidemiology at the Université de Paris (France). She conducted her PhD within the Methods in Research on Research (MiRoR) project, an innovative doctoral training programme in the field of clinical research funded by Marie Skłodowska-Curie Actions. Her PhD focused on how to assess the quality of peer review reports in biomedical research. She obtained a BSc degree in Biological Sciences from the University of Parma (Italy) and an MSc degree in Epidemiology and Public Health from Wageningen University (Netherlands). Before starting her PhD, she worked at the Iberoamerican Cochrane Centre in Barcelona (Spain).

ORCID: https://orcid.org/0000-0002-5375-6018

WiNLP 2021 Accepted Papers

No. | Title | Author
7 | OkwuGbé: End-to-End Speech Recognition for Fon and Igbo | Bonaventure F. P. Dossou and Chris Chinenye Emezue
8 | TEET! Tunisian Dataset for Toxic Speech Detection | Slim Gharbi, Hatem Haddad, Mayssa Kchaou and Heger Arfaoui
9 | Ara-Women-Hate: The first Arabic Hate Speech corpus regarding Women | Imane Guellil, Ahsan Adeel, Faical Azouaou, Mohamed Boubred, Yousra Houichi and Akram Moumna
12 | Behavioral Testing of Knowledge Graph Embedding Models for Link Prediction | Wiem Ben Rim, Carolin Lawrence, Kiril Gashteovski, Mathias Niepert and Naoaki Okazaki
13 | Developing Language Technology and NLP tools for endangered languages: Torwali | Naeem Uddin Hadi
14 | How to Make Virtual Conferences Queer-Friendly: A Guide | Organizers of QueerInAI, A Pranav, MaryLena Bleile, Arjun Subramonian, Luca Soldaini, Danica J. Sutherland, Sabine Weber and Pan Xu
15 | Neutralizing Gender Bias in Neural Machine Translation by Introducing Linguistic Knowledge | Ksenia Kharitonova, Marta R. Costa-jussà, Carlos Escolano, Christine Basta and Jordi Armengol-Estapé
22 | Developing Keyboards for the Endangered Livonian Language | Mika Hämäläinen and Khalid Alnajjar
24 | Nuanced Queerphobic Bias in Popular Sentiment Analysis Tools: A Data Set and Evaluation | Anonymous
25 | Coral: An Approach for Conversational Agents in Mental Health Applications | Harsh Sakhrani, Saloni Parekh and Shubham Mahajan
26 | EM ALBERT: A Step Towards Equipping Manipuri for NLP | Rudali Huidrom and Yves Lepage
28 | Elementary-Level Math Word Problem Generation using Pre-Trained Transformers | Anonymous
29 | Towards the Early Detection of Child Predators in Chat Rooms: A BERT-based Approach | Sinchana Kumbale and Smriti Singh
30 | One-Shot Lexicon Learning for Low-Resource Machine Translation | Anjali Kantharuban and Jacob Andreas
32 | Sinhala-English Code-Mixed and Code-Switched Data Classification | Anonymous
33 | “I don’t know who she is”: Discourse and Knowledge Driven Coreference Resolution | Angela Ramirez, Cecilia Li, Phillip Lee, Eduardo Zamora, Jeshwanth Bheemanpally, Marilyn Walker and Adwait Ratnaparkhi
35 | Idiom Extraction Method with Fine-Tuning of Pre-trained Transformers for Named Entity Recognition | Nao Yamato
36 | Occupational Gender Stereotypes in Indic Languages | Neeraja Kirtane and Tanvi Anand
37 | #WhyDidTheyStay: An NLP-Driven Approach to Analyzing the Factors that Affect Domestic Violence Victims | Marthala Kavya and Smriti Singh
38 | Exploring Transfer Learning Pathways for Neural Machine Back Translation of Eskimo-Aleut, Chicham, and Classical Languages | Aaron Serianni and Daniel Whitenack
39 | An Interpretable Representation that Visually Grounds Dialog History | Mauricio Mazuecos, Franco M. Luque, Jorge Sánchez, Hernán Maina, Thomas Vadora and Luciana Benotti
40 | Automated Template Paraphrasing for Conversational Assistants | Liane Vogel and Lucie Flek
41 | Discovering Changes in Birthing Narratives During COVID-19 | Daphna Spira, Noreen Mayat, Caitlin Dreisbach and Adam Poliak
42 | Explorations in Transfer Learning for OCR Post-Correction | Lindia Tjuatja, Shruti Rijhwani and Graham Neubig
43 | A Prototype Free/Open-Source Morphological Analyser and Generator for Sakha | Sardana Ivanova, Francis Tyers and Jonathan N. Washington
44 | Towards Text Simplification for Sinhala Language | Anonymous
46 | Natural Language Processing as a Tool to Identify the Reddit Particularities of Cancer Survivors Around the Time of Diagnosis and Remission: A Pilot Study | Ioana R. Podină, Ana-Maria Bucur, Diana Todea, Liviu Fodor, Andreea Luca, Liviu P. Dinu and Rareș Boian
48 | Identifying Significant Citations via Mining Paper Full-Text | Muskaan Singh and Tirthankar Ghosal
49 | SciBERT-based Multitasking Deep Neural Architecture to Identify Contribution Statements from Scientific Articles | Komal Gupta and Tirthankar Ghosal
51 | Characterizing Test Anxiety on Social Media | Esha Julka, Olivia Kowalishin, Jalisha B. Jenifer and Adam Poliak
52 | Sample Selection Guided by Domain and Task for Cross-Domain Targeted Sentiment Analysis | Kasturi Bhattacharjee, Rashmi Gangadharaiah and Smaranda Muresan
53 | Bengali Parallel Universal Dependency Treebank | Pritha Majumdar
54 | The Development of Pre-Processing Tools and Pre-Trained Embedding Models for Amharic | Tadesse Destaw, Abinew Ayele and Seid Muhie Yimam
55 | Towards Syntax-Aware Dialogue Summarization using Multi-Task Learning | Seolhwa Lee, Kisu Yang, Chanjun Park, João Sedoc and Heuiseok Lim
56 | Detoxifying Language Models with Proximal Policy Optimization | Taaha Kazi
59 | Towards Personalized Descriptions of Scientific Concepts | Sonia Murthy, Daniel King, Tom Hope, Daniel Weld and Doug Downey
60 | Building Prosody Labeled Corpus in Hindi | Esha Banerjee, Atul Kr. Ojha and Girish Jha
61 | How Well Can an Agent Understand Different Accents? | Divya Tadimeti, Kallirroi Georgila and David Traum
62 | Monolingual Pre-Trained Language Models for Tigrinya | Fitsum Gaim, Wonsuk Yang and Jong C. Park
63 | ASQ: Automatically Generating Question-Answer Pairs using AMRs | Geetanjali Rakshit and Jeffrey Flanigan
64 | Adverse Drug Reaction Classification of Tweets with Fusion of Text and Drug Representations | Andrey Sakhovskiy and Elena Tutubalina
65 | Detecting Gender Bias Using Explainability | Gauri Gupta, Supriti Vijay and Krithika Ramesh
69 | Leveraging Ultradense Embeddings to Analyze Gender-Oriented Extremist Recruitment Targeting | Jatin Khilnani, Rasika Bhalerao and Tatenda Ndambakuwa