Spring 2021

Date	Event	Speaker	Abstract/Details
01/27/2021	Planning, Extending Multilingual BERT to Low-Resource Languages	Zihan Wang	
02/10/2021	SCIL UMR practice talk	Martha Palmer	
02/24/2021	Practice talk	Sarah Moeller	
03/03/2021	Garfinkel and NLP - a discussion of challenges for Natural Language Understanding	Clayton Lewis	
03/10/2021	Reducing Confusion in Active Learning	Antonios Anastasopoulos	Active learning (AL) uses a data selection algorithm to select useful training samples to minimize annotation cost. This is now an essential tool for building low-resource syntactic analyzers such as part-of-speech (POS) taggers. Existing AL heuristics are generally designed on the principle of selecting uncertain yet representative training instances, where annotating these instances may reduce a large number of errors. However, in an empirical study across six typologically diverse languages (German, Swedish, Galician, North Sami, Persian, and Ukrainian), we found the surprising result that even in an oracle scenario where we know the true uncertainty of predictions, these current heuristics are far from optimal. Based on this analysis, we pose the problem of AL as selecting instances which maximally reduce the confusion between particular pairs of output tags. Extensive experimentation on the aforementioned languages shows that our proposed AL strategy outperforms other AL strategies by a significant margin. We also present auxiliary results demonstrating the importance of proper calibration of models, which we ensure through cross-view training, and analysis demonstrating how our proposed strategy selects examples that more closely follow the oracle data distribution.
03/17/2021	ACL paper discussion	Led by Jon Cai & Sarah Moeller	(Conflict with DARPA KAIROS PI Meeting, no Martha, Susan, Piyush, Akanksha or Ghazaleh)
03/31/2021	Capstone Projects		
04/14/2021	Toward Broad and Deep Language Processing for Intelligent Systems	Marjorie McShane	The early vision of AI included the goal of endowing intelligent systems with human-like language processing capabilities. This proved harder than expected, leading the vast majority of natural language processing practitioners to pursue less ambitious, shorter-term goals. Whereas the utility of human-like language processing is unquestionable, its feasibility is quite justifiably questioned. In this talk, I will not only argue that some approximation of human-like language processing is possible, I will present a program of R&D that is working on making it a reality. This vision, as well as progress to date, is described in the book Linguistics for the Age of AI (MIT Press, 2021), whose digital version is open access through the MIT Press website.
04/21/2021	Multimodal SRL	Abhidip Bhattacharyya	Scene understanding is a critical goal of computer vision, and object recognition is an important element of it. Attention-based encoder-decoder architectures have improved the performance of many vision-language models. However, the semantics of images has largely been overlooked in designing these systems. As a result, vision-language systems recommend one fixed semantic interpretation for a particular image, so an image will always produce or retrieve a fixed description. This contrasts with the variety of expressions humans can generate when describing the same scene. To bridge this description gap, we use semantic role labels (SRL) as our semantic cues for both images and text. Semantic roles enable a richer representation of an image and the corresponding text in the shared space. With the help of SRL, we are able to achieve better performance in cross-modal retrieval. SRL also enables generating diverse descriptions for a given image.
04/28/2021	Proposal: Event Coreference in Text & Graphs	Rehan Ahmed	
05/05/2021	Proposal	Skatje Myers	