Spring 2021

Date | Event | Speaker | Abstract/Details

01/27/2021 | Planning, Extending Multilingual BERT to Low-Resource Languages | Zihan Wang
02/10/2021 | SCIL UMR practice talk | Martha Palmer
02/24/2021 | Practice talk | Sarah Moeller
03/03/2021 | Garfinkel and NLP - a discussion of challenges for Natural Language Understanding | Clayton Lewis
03/10/2021 | Reducing Confusion in Active Learning | Antonios Anastasopoulos | Active learning (AL) uses a data selection algorithm to select useful training samples to minimize annotation cost. This is now an essential tool for building low-resource syntactic analyzers such as part-of-speech (POS) taggers. Existing AL heuristics are generally designed on the principle of selecting uncertain yet representative training instances, where annotating these instances may reduce a large number of errors. However, in an empirical study across six typologically diverse languages (German, Swedish, Galician, North Sami, Persian, and Ukrainian), we found the surprising result that even in an oracle scenario where we know the true uncertainty of predictions, these current heuristics are far from optimal. Based on this analysis, we pose the problem of AL as selecting instances which maximally reduce the confusion between particular pairs of output tags. Extensive experimentation on the aforementioned languages shows that our proposed AL strategy outperforms other AL strategies by a significant margin. We also present auxiliary results demonstrating the importance of proper calibration of models, which we ensure through cross-view training, and analysis demonstrating how our proposed strategy selects examples that more closely follow the oracle data distribution.
03/17/2021 | ACL paper discussion | Led by Jon Cai & Sarah Moeller | (Conflict with DARPA KAIROS PI Meeting, no Martha, Susan, Piyush, Akanksha or Ghazaleh)
03/31/2021 | Capstone Projects
04/14/2021 | Toward Broad and Deep Language Processing for Intelligent Systems | Marjorie McShane | The early vision of AI included the goal of endowing intelligent systems with human-like language processing capabilities. This proved harder than expected, leading the vast majority of natural language processing practitioners to pursue less ambitious, shorter-term goals. Whereas the utility of human-like language processing is unquestionable, its feasibility is quite justifiably questioned. In this talk, I will not only argue that some approximation of human-like language processing is possible but also present a program of R&D that is working on making it a reality. This vision, as well as progress to date, is described in the book Linguistics for the Age of AI (MIT Press, 2021), whose digital version is open access through the MIT Press website.
04/21/2021 | Multimodal SRL | Abhidip Bhattacharyya | Scene understanding is a critical goal of computer vision, and object recognition is an important element of it. Attention-based encoder-decoder architectures have improved the performance of many vision-language models. However, the semantics of the images has largely been overlooked in designing these systems. As a result, vision-language systems recommend one fixed semantic interpretation for a particular image. Hence an image will always produce or retrieve a fixed description. This contrasts with the variety of expressions humans can generate when describing the same scene. To bridge this description gap, we use semantic role labels (SRL) as our semantic cues for both images and text. Semantic roles enable a richer representation of an image and the corresponding text in the shared space. With the help of SRL, we are able to achieve better performance in cross-modal retrieval. SRL also enables generating diverse descriptions for a given image.
04/28/2021 | Proposal: Event Coreference in Text & Graphs | Rehan Ahmed
05/05/2021 | Proposal | Skatje Myers