Widening NLP will be a hybrid workshop at EMNLP 2021. For in-person attendees, events will be held at the Barceló Bávaro Convention Centre in Punta Cana, the Dominican Republic. For online attendees calling in from around the world, events will be held on Underline and on Zoom.
View the accepted submissions here.
Schedule (in local time)
8:00 Poster Session A
9:00 Opening remarks
9:10 Joint Keynote speaker with Queer In AI: Dr. Jasmijn Bastings (English with Spanish translation)
9:40 Q&A for Dr. Jasmijn Bastings
9:50 Keynote speaker 2: Prof. Manuel Montes (Spanish with English translation)
10:20 Q&A for Prof. Manuel Montes
10:30 Coffee break (in-person at the venue/online in your own home)
note: there will be an open Zoom space for online attendees to socialize if they would like
11:00 Tutorial: How to write and respond to a review
12:00 Lunch (in-person at the venue/online in your own home)
note: there will be an open Zoom space for online attendees to socialize if they would like
13:00 Panel: The Peer Review Process and Widening NLP
14:20 Coffee break (in-person at the venue/online in your own home)
14:45 Keynote speaker 3: Dra. Adriana Lorena Iñiguez Carrillo
15:15 Q&A for Dra. Adriana Lorena Iñiguez Carrillo
15:25 Closing remarks
15:35 Poster Session B
16:35 Social & coffee (in-person at the venue/online in gather.town)
Keynote Speakers:
Note: There will be translations of all keynotes in Spanish and English
Nota: Habrá traducciones de todas las notas clave en español e inglés.
WiNLP/QueerInAI Keynote Speaker:
Dr. Jasmijn Bastings
Jasmijn Bastings is a researcher at Google Research Amsterdam. Her research spans natural language processing, machine learning, and explainable AI. Recently, she has been focusing on making NLP fairer, more interpretable, and more robust. She has published work on interpretable neural predictions and saliency methods, and is a co-author of Joey NMT and the Language Interpretability Tool (LIT). Jasmijn holds a PhD from ILLC, University of Amsterdam, where she worked on linguistically-informed neural machine translation, interpretability and generalization.
Talk Title: The Very Hungry Caterpillar and The Wish for a Wider and More Dynamic NLP
Abstract: Large, parameterized, end-to-end trained models are the workhorses of today’s NLP research and applications. What could possibly be wrong with them? In my talk I identify one shortcoming—their static nature in an ever-changing world—and connect it to yet another static phenomenon in NLP: the way we disseminate our research. I argue for a more dynamic NLP, both in terms of research and publishing, and that a more dynamic NLP is a wider NLP.
WiNLP Keynote Speaker:
Prof. Manuel Montes, Laboratorio de Tecnologías del Lenguaje, Instituto Nacional de Astrofísica, Óptica y Electrónica (INAOE), México | Language Technologies Lab, National Institute of Astrophysics, Optics and Electronics | https://ccc.inaoep.mx/~mmontesg/
El Dr. Manuel Montes es investigador en la Coordinación de Ciencias Computacionales del INAOE. Su trabajo de investigación se orienta principalmente a los temas de recuperación de información, minería de texto, y análisis de autoría, sobre los cuales ha publicado alrededor de 250 artículos en revistas y congresos internacionales, y le han valido el reconocimiento de Investigador Nacional nivel II del SNI, así como de miembro de la Academia Mexicana de Ciencias.
El Dr. Montes ha sido profesor invitado en la Universidad Politécnica de Valencia, en la Universidad de Génova, y en la Universidad de Alabama en Birmingham. También es miembro fundador de la Academia Mexicana de Computación, la Asociación Mexicana de Procesamiento de Lenguaje Natural, y la Red Temática en Tecnologías del Lenguaje del CONACyT. En el contexto de las últimas dos, ha sido organizador del Taller Nacional de Tecnologías del Lenguaje (de 2004 a 2016), del Taller Mexicano sobre Detección de Plagio y Análisis de Autoría (2016-2020), la Escuela de Otoño de Tecnologías del Lenguaje (2015 y 2016), así como de tareas de evaluación sobre perfilado de autores, análisis de lenguaje agresivo, y detección de noticias falsas en el MEX-A3T (2018-2020) y MeOffendEs (2021).
Manuel Montes is a Full Professor at the National Institute of Astrophysics, Optics and Electronics (INAOE) of Mexico. His research is on automatic text processing. He is the author of more than 250 journal and conference papers in the fields of information retrieval, text mining and authorship analysis, which have earned him recognition as a National Researcher Level II and membership in the Mexican Academy of Sciences (AMC).
Dr. Montes has been a visiting professor at the Polytechnic University of Valencia (Spain), the University of Genoa (Italy), and the University of Alabama at Birmingham (USA). He is also a founding member of the Mexican Academy of Computer Science (AMEXCOMP), the Mexican Association of Natural Language Processing (AMNLP), and the Language Technology Network of CONACYT. Within the latter two, he has organized the National Workshop on Language Technologies (2004 to 2016), the Mexican Workshop on Plagiarism Detection and Authorship Analysis (2016-2020), the Mexican Autumn School on Language Technologies (2015 and 2016), and the MEX-A3T Shared Task on author profiling, aggressiveness analysis and fake news detection in Mexican Spanish at IberLEF (2018-2021).
¿Ofensas o solo malas palabras? Retos del análisis de lenguaje ofensivo en redes sociales
Fenómenos como el bullying, la homofobia, el sexismo y el racismo han trascendido a las redes sociales, y son una preocupación importante de las empresas proveedoras, así como de los gobiernos, pues pueden tener (o ya tienen) un impacto social muy negativo. Esto ha motivado el desarrollo de gran cantidad de métodos para su identificación automática, que van desde simples que consideran lexicones hasta más sofisticados basados en técnicas de aprendizaje profundo. Un desafío común para todos estos métodos es lograr distinguir entre el uso ofensivo y el coloquial e incluso cotidiano de groserías y palabras vulgares. Esta charla se enfoca en este problema específico.
De manera general, en esta charla se describirá el concepto de lenguaje abusivo, los tipos de éste, y los problemas principales para su detección por medios automáticos. Se presentarán brevemente dos métodos que hemos desarrollado en el Laboratorio de Tecnologías del INAOE para su identificación, y finalmente se relatarán los esfuerzos realizados para construir recursos para su estudio y evaluación en español de México.
Talk Title: Offenses or just dirty words? Challenges of offensive language identification in social networks
Abstract: Phenomena such as bullying, homophobia, sexism and racism have spread to social networks, and are an important concern for platform providers as well as governments, as they can have (or already have) a very negative social impact. This situation has motivated the development of many methods for their automatic identification, ranging from simple ones based on fixed lexicons to more sophisticated ones based on deep learning techniques. A common challenge for all of them is to distinguish the offensive use of profanity and vulgar words from their colloquial, even everyday, use. This talk focuses on this specific problem.
In short, in this talk I will describe the concept of abusive language, its types, and the main problems in detecting it by automatic means. I will also briefly present two methods that we have developed at INAOE’s Language Technologies Lab for its identification. Finally, I will describe the efforts made to build resources for its study and evaluation in Mexican Spanish.
WiNLP Keynote Speaker:
Dra. Adriana Lorena Iñiguez Carrillo, Universidad de Guadalajara
Adriana Iñiguez is a Professor at the University of Guadalajara in Mexico. She is a computer systems engineer and holds a master’s degree in computing and a PhD in Information Technology. She collaborates with the CUCEA Smart Cities Innovation Center of the University of Guadalajara. She has professional experience as a coordinator of university programs, conducting quality and relevance analyses of study plans in the computer science area. Adriana’s primary research interests include human-computer interaction, voice interaction, and artificial intelligence.
Talk Title: to be announced
Tutorial: How to Write and Respond to a Review
Speakers: WiNLP Chairs Antonios Anastasopoulos and Tirthankar Ghosal
Peer review is considered the “gatekeeper of scientific knowledge and wisdom”. Almost every researcher has to go through this process to make their research publishable and visible to the scientific community. Responding to reviewers is one of the authors' crucial jobs in this process. However, the task can be daunting. Writing constructive and informative rebuttals can change the reviewers’ perceptions or misconceptions of the paper, and can thereby affect the peer review outcome. In this tutorial we will focus on how to write an acceptable review response. We will cover how to write a rebuttal constructively without sounding rude, how to establish your arguments while taking the reviewer’s opinion into consideration, and finally how to chalk out the action points once you receive the review.
Panel: The Peer Review Process and Widening NLP
This panel will bring together an international group of stakeholders who organize, engage in, and explore the peer review process in ACL. The speakers will discuss the many ways in which peer review may hinder or help the creation of a more diverse and equitable research space, from Anglo-centric publishing, to anonymous review, to ensuring the author and reviewer pool represents a variety of life experiences. Looking to the future, the speakers will engage with each other and the WiNLP audience to recognize the high value of engaging with a more diverse author pool and strategize how to select for one.
Panelists:
Dr. Bahar Mehmani
Dr. Bahar Mehmani is Reviewer Experience Lead in the Global STM journals at Elsevier. She leads Elsevier’s peer review strategy and oversees projects related to researchers’ and academics’ pain points throughout the peer-review process. Bahar is a member of the NISO peer review taxonomy working group, and is chair of the peer review committee and a council member of the European Association of Science Editors (EASE). She received her PhD in Theoretical Physics from the University of Amsterdam in 2010. Before joining Elsevier, she was a postdoctoral researcher at the Max Planck Institute for the Science of Light.
Anna Rogers
Anna Rogers is a post-doctoral associate at the University of Copenhagen. Her main research areas are evaluation and analysis of deep learning models for NLP. She is also active in the sphere of NLP methodology, working on issues in peer review and organizing the workshop on Insights from Negative Results in NLP.
Cecilia Superchi
Cecilia Superchi is currently a Postdoctoral Research Associate in Clinical Epidemiology at the Université de Paris (France). She conducted her PhD within the Methods in Research on Research (MiRoR) project, an innovative doctoral training programme in the field of clinical research funded by Marie Skłodowska-Curie Actions. Her PhD focused on how to assess the quality of peer review reports in biomedical research. She obtained a BSc degree in Biological Sciences at the University of Parma (Italy) and an MSc degree in Epidemiology and Public Health at Wageningen University (Netherlands). Before starting her PhD, she worked at the Iberoamerican Cochrane Centre in Barcelona (Spain).
ORCID: https://orcid.org/0000-0002-5375-6018
WiNLP 2021 Accepted Papers
No. | Title | Authors
7 | OkwuGbé: End-to-End Speech Recognition for Fon and Igbo | Bonaventure F. P. Dossou and Chris Chinenye Emezue
8 | TEET! Tunisian Dataset for Toxic Speech Detection | Slim Gharbi, Hatem Haddad, Mayssa Kchaou and Heger Arfaoui
9 | Ara-Women-Hate: The first Arabic Hate Speech corpus regarding Women | Imane Guellil, Ahsan Adeel, Faical Azouaou, Mohamed Boubred, Yousra Houichi and Akram Moumna
12 | Behavioral Testing of Knowledge Graph Embedding Models for Link Prediction | Wiem Ben Rim, Carolin Lawrence, Kiril Gashteovski, Mathias Niepert and Naoaki Okazaki
13 | Developing Language Technology and NLP tools for endangered languages: Torwali | Naeem Uddin Hadi
14 | How to Make Virtual Conferences Queer-Friendly: A Guide | Organizers of QueerInAI, A Pranav, MaryLena Bleile, Arjun Subramonian, Luca Soldaini, Danica J. Sutherland, Sabine Weber and Pan Xu
15 | Neutralizing Gender Bias in Neural Machine Translation by Introducing Linguistic Knowledge | Ksenia Kharitonova, Marta R. Costa-jussà, Carlos Escolano, Christine Basta and Jordi Armengol-Estapé
22 | Developing Keyboards for the Endangered Livonian Language | Mika Hämäläinen and Khalid Alnajjar
24 | Nuanced Queerphobic Bias in Popular Sentiment Analysis Tools: A Data Set and Evaluation | Anonymous
25 | Coral: An Approach for Conversational Agents in Mental Health Applications | Harsh Sakhrani, Saloni Parekh and Shubham Mahajan
26 | EM ALBERT: A Step Towards Equipping Manipuri for NLP | Rudali Huidrom and Yves Lepage
28 | Elementary-Level Math Word Problem Generation using Pre-Trained Transformers | Anonymous
29 | Towards the Early Detection of Child Predators in Chat Rooms: A BERT-based Approach | Sinchana Kumbale and Smriti Singh
30 | One-Shot Lexicon Learning for Low-Resource Machine Translation | Anjali Kantharuban and Jacob Andreas
32 | Sinhala-English Code-Mixed and Code-Switched Data Classification | Anonymous
33 | “I don’t know who she is”: Discourse and Knowledge Driven Coreference Resolution | Angela Ramirez, Cecilia Li, Phillip Lee, Eduardo Zamora, Jeshwanth Bheemanpally, Marilyn Walker and Adwait Ratnaparkhi
35 | Idiom Extraction Method with Fine-Tuning of Pre-trained Transformers for Named Entity Recognition | Nao Yamato
36 | Occupational Gender Stereotypes in Indic Languages | Neeraja Kirtane and Tanvi Anand
37 | #WhyDidTheyStay: An NLP-Driven Approach to Analyzing the Factors that Affect Domestic Violence Victims | Marthala Kavya and Smriti Singh
38 | Exploring Transfer Learning Pathways for Neural Machine Back Translation of Eskimo-Aleut, Chicham, and Classical Languages | Aaron Serianni and Daniel Whitenack
39 | An Interpretable Representation that Visually Grounds Dialog History | Mauricio Mazuecos, Franco M. Luque, Jorge Sánchez, Hernán Maina, Thomas Vadora and Luciana Benotti
40 | Automated Template Paraphrasing for Conversational Assistants | Liane Vogel and Lucie Flek
41 | Discovering Changes in Birthing Narratives During COVID-19 | Daphna Spira, Noreen Mayat, Caitlin Dreisbach and Adam Poliak
42 | Explorations in Transfer Learning for OCR Post-Correction | Lindia Tjuatja, Shruti Rijhwani and Graham Neubig
43 | A Prototype Free/Open-Source Morphological Analyser and Generator for Sakha | Sardana Ivanova, Francis Tyers and Jonathan N. Washington
44 | Towards Text Simplification for Sinhala Language | Anonymous
46 | Natural Language Processing as a Tool to Identify the Reddit Particularities of Cancer Survivors Around the Time of Diagnosis and Remission: A Pilot Study | Ioana R. Podină, Ana-Maria Bucur, Diana Todea, Liviu Fodor, Andreea Luca, Liviu P. Dinu and Rareș Boian
48 | Identifying Significant Citations via Mining Paper Full-Text | Muskaan Singh and Tirthankar Ghosal
49 | SciBERT-based Multitasking Deep Neural Architecture to Identify Contribution Statements from Scientific Articles | Komal Gupta and Tirthankar Ghosal
51 | Characterizing Test Anxiety on Social Media | Esha Julka, Olivia Kowalishin, Jalisha B. Jenifer and Adam Poliak
52 | Sample Selection Guided by Domain and Task for Cross-Domain Targeted Sentiment Analysis | Kasturi Bhattacharjee, Rashmi Gangadharaiah and Smaranda Muresan
53 | Bengali Parallel Universal Dependency Treebank | Pritha Majumdar
54 | The Development of Pre-Processing Tools and Pre-Trained Embedding Models for Amharic | Tadesse Destaw, Abinew Ayele and Seid Muhie Yimam
55 | Towards Syntax-Aware Dialogue Summarization using Multi-Task Learning | Seolhwa Lee, Kisu Yang, Chanjun Park, João Sedoc and Heuiseok Lim
56 | Detoxifying Language Models with Proximal Policy Optimization | Taaha Kazi
59 | Towards Personalized Descriptions of Scientific Concepts | Sonia Murthy, Daniel King, Tom Hope, Daniel Weld and Doug Downey
60 | Building Prosody Labeled Corpus in Hindi | Esha Banerjee, Atul Kr. Ojha and Girish Jha
61 | How Well Can an Agent Understand Different Accents? | Divya Tadimeti, Kallirroi Georgila and David Traum
62 | Monolingual Pre-Trained Language Models for Tigrinya | Fitsum Gaim, Wonsuk Yang and Jong C. Park
63 | ASQ: Automatically Generating Question-Answer Pairs using AMRs | Geetanjali Rakshit and Jeffrey Flanigan
64 | Adverse Drug Reaction Classification of Tweets with Fusion of Text and Drug Representations | Andrey Sakhovskiy and Elena Tutubalina
65 | Detecting Gender Bias Using Explainability | Gauri Gupta, Supriti Vijay and Krithika Ramesh
69 | Leveraging Ultradense Embeddings to Analyze Gender-Oriented Extremist Recruitment Targeting | Jatin Khilnani, Rasika Bhalerao and Tatenda Ndambakuwa |