It was an amazing week in Mexico City at NAACL 2024! I am happy I met so many brilliant researchers and had so many meaningful conversations. I am grateful to be a part of such a community. It’s a shame I was unable to join all the talks and posters I wanted to attend (sadly, I cannot be in two places at the same time; I need a time-turner ha-ha). Also, special thanks to the Amsterdam folks for keeping me company 😊 In this post I want to share my experience and paper highlights.
Before you start, you are more than welcome to read our paper so you get more context. Share your ideas and questions via e dot tokarchuk at uva dot nl
I presented my work during the first oral session (machine translation track). I was SUPER nervous. When I started, I realised they hadn’t used my most recent slides (even though I had uploaded them the night before, as the organizers asked). I recovered quickly because, honestly, I had only changed some visuals; the content was exactly the same. My hands and legs were shaking. It was okay in the end, though I think I could have done better. My friends told me I looked super calm and nailed it though ❤️ I got a few questions and they were interesting; I even had a chance to make a teaser for my next paper.
And then I was done and could enjoy the conference to the fullest!
Before we start, some interesting statistics and photos from the conference social event:
Aaand back to research stuff!
Here are my paper and talk highlights. Note that my highlights are aligned with my specific research interests. It doesn’t mean they are better than any other papers/talks; I just found them interesting/relevant for my work! Some simply had amazing presentations, and I would love to remember some tips for the future.
NAACL keynote: a very nice view on studying indigenous languages from the perspective of a person outside the indigenous community. “Nothing about us without us”. It gave me a lot of food for thought.
Interesting analysis of how the results of analysis papers are used in NLP work, plus a call to action.
Talk on how LLMs can be used for clinical data. Main question: can we use one foundation model for every country and every language?
For some reason I hadn’t heard about DPO until recently. An interesting approach to overcome the plateau in NMT performance when adding fine-tuning data to LLMs.
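As a note to my future self (my own minimal sketch of the general DPO formulation, not code from the paper): DPO trains on preference pairs, pushing up the log-probability margin of the preferred response over the dispreferred one relative to a frozen reference model. The names `logp_w`/`logp_l` and the toy numbers below are mine.

```python
import math

def dpo_loss(logp_w, logp_l, ref_logp_w, ref_logp_l, beta=0.1):
    """DPO loss for one preference pair:
    -log sigmoid(beta * ((logp_w - ref_logp_w) - (logp_l - ref_logp_l))),
    where y_w is the preferred and y_l the dispreferred response."""
    margin = beta * ((logp_w - ref_logp_w) - (logp_l - ref_logp_l))
    return -math.log(1.0 / (1.0 + math.exp(-margin)))

# Sanity check: if the policy equals the reference, the margin is 0
# and the loss is log(2).
loss = dpo_loss(-1.0, -2.0, ref_logp_w=-1.0, ref_logp_l=-2.0)
```

The `beta` hyperparameter controls how far the policy is allowed to drift from the reference model.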
A combination of DPO and MBR in training. An interesting approach to improving translation quality.
Analysis work on how well MBR decoding approximates the true distribution.
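For context, MBR decoding picks the hypothesis with the highest expected utility against model samples used as pseudo-references. A minimal sketch (mine, not from the paper; the token-overlap `utility` below is just a stand-in for a real metric like BLEU or COMET):

```python
def mbr_decode(candidates, samples, utility):
    """Return the candidate maximizing expected utility,
    estimated via Monte Carlo over samples from the model."""
    def expected_utility(h):
        return sum(utility(h, r) for r in samples) / len(samples)
    return max(candidates, key=expected_utility)

def overlap(h, r):
    """Toy utility: Jaccard similarity of token sets."""
    h_tok, r_tok = set(h.split()), set(r.split())
    return len(h_tok & r_tok) / max(len(h_tok | r_tok), 1)

hyps = ["the cat sat", "a cat sat down", "dogs run"]
best = mbr_decode(hyps, samples=hyps, utility=overlap)
```

In practice the candidate and sample sets are drawn from the NMT model itself, and the quality of the approximation depends on how faithfully those samples represent the model's distribution, which is what the analysis is about.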
How to achieve stable human evaluation results by using different assignment procedures.
Using a robustness term, based on the distance between the original and adversarial examples, in the training objective.
Application of RL to the Levenshtein Transformer. An investigation of episodic and sentence-level rewards.
Amazing talk. Even though I am not that much into visual grounding, it was interesting to see how it affects the learning of word meanings. It got the best paper award (and I believe it was well deserved).
A study on how effective Euclidean space can be for knowledge graph embeddings (in comparison with hyperbolic space).
An interesting approach that not only detects machine-generated text but also identifies the model that generated it.
Explores various text-based retrieval metrics (string matching) for retrieval-augmented NMT.
Exploring the performance of various multilingual LLMs for multilingual NMT.
For dessert, some photos from Mexico City. I had a chance to visit the city for the first time, and it was amazing!