Prithvijit Chattopadhyay

I am a first-year CS PhD student at Georgia Tech, advised by Prof. Judy Hoffman. I also collaborate closely with Prof. Devi Parikh and Prof. Dhruv Batra. I recently earned my Master's in Computer Science (with a focus on Machine Learning) from Georgia Tech, advised by Prof. Devi Parikh. Prior to joining Georgia Tech, I worked as a Research Assistant in the Computer Vision Machine Learning and Perception Lab (CVMLP) at Virginia Tech, advised by Prof. Devi Parikh and Prof. Dhruv Batra. I earned my Bachelor's in Electrical Engineering in 2016 from Delhi Technological University, India.

Over the past few years, I have had the fortune to intern and conduct research at the Deep Learning Group, Microsoft Research Redmond (Summer 2018), mentored by Hamid Palangi; the Robotics Research Lab, IIIT Hyderabad (Winter 2014), mentored by Dr. K. Madhava Krishna; and the Indian Association for the Cultivation of Science (IACS), Kolkata (Summer 2014), mentored by Dr. Soumitra Sengupta -- working on a diverse set of topics, ranging from vision & language to robotics to theoretical physics.

I occasionally play the tabla, a percussion instrument. I am very passionate about movies: I love breaking them down shot by shot and analyzing them. Every single frame is important.

Email  /  CV  /  Google Scholar  /  LinkedIn  /  Github  /  Twitter  /  Instagram

Research

The problems that I work on generally lie at the intersection of Computer Vision and Natural Language Processing. More specifically, I am interested in developing intelligent systems (including applications of Machine Learning and Reinforcement Learning) that

  • can perceive and reason based on multimodal sensory information
  • are interpretable so that predictions made by such systems can be explained
  • are transferable so that they can be adapted across different domains with ease and limited supervision

Representative papers are listed under Papers.

Achievements
  • Recognized as one of the highest-scoring reviewers for NeurIPS 2019!
  • Recognized as an outstanding reviewer for ICLR 2019!
  • Recognized as one of the top 20 percent highest-scoring reviewers for NeurIPS 2018!
  • Awarded the College of Computing's MS Research Award at Georgia Tech!
  • Our team won VT Hacks 2017, a Major League Hacking event!
  • Our undergraduate team, DTU-AUV, qualified for the semi-finals at AUVSI Robosub 2014!
  • Awarded Merit Scholarships from 2012-2014 for undergraduate academic performance!
  • Selected for KVPY and INSPIRE Fellowships, 2012 for undergraduate studies in basic sciences!
  • Placed among the top 1 percent students in the country in INPhO 2012!
  • Selected for rigorous mathematical training camps conducted by mathematicians from Bhabha Atomic Research Center (BARC) and Indian Institute of Science (IISc) in 2012!
  • Selected for CSIR Programme on Youth Leadership in Science, 2010!
Papers (* denotes joint first authors)
Improving Generative Visual Dialog by Answering Diverse Questions
Vishvak Murahari, Prithvijit Chattopadhyay, Dhruv Batra, Devi Parikh, Abhishek Das
EMNLP, 2019 (Poster); Visual Question Answering and Dialog Workshop, CVPR 2019 (Poster)
arxiv / code

While generative visual dialog models trained with self-talk-based RL perform better on the associated downstream task, they suffer from repeated interactions -- resulting in saturating improvements as the number of rounds increases. To counter this, we devise a simple auxiliary objective that incentivizes Q-Bot to ask diverse questions, thus reducing repetitions and in turn enabling A-Bot to explore a larger state space during RL, i.e., be exposed to more visual concepts to talk about and varied questions to answer.

DS-VIC: Unsupervised Discovery of Decision States for Transfer in RL
Nirbhay Modhe, Prithvijit Chattopadhyay, Mohit Sharma, Abhishek Das, Devi Parikh, Dhruv Batra, Ramakrishna Vedantam
arXiv Preprint, 2019; Task-Agnostic RL (TARL) Workshop, ICLR 2019 (Poster)
arxiv / TARL'19 Preliminary Version

We learn to identify decision states, namely the parsimonious set of states where decisions meaningfully affect the future states an agent can reach in an environment. We utilize the VIC framework, which maximizes an agent's 'empowerment', i.e., its ability to reliably reach a diverse set of states, and formulate a sandwich bound on the empowerment objective that allows identification of decision states. Unlike previous work, our decision states are discovered without extrinsic rewards -- simply by interacting with the world. Our results show that our decision states (1) are often interpretable, and (2) lead to better exploration on downstream goal-driven tasks in partially observable environments.

EvalAI: Towards Better Evaluation Systems for AI Agents
Deshraj Yadav, Rishabh Jain, Harsh Agrawal, Prithvijit Chattopadhyay, Taranjeet Singh, Akash Jain, Shiv Baran Singh, Stefan Lee, Dhruv Batra
arXiv Preprint, 2019
arxiv / code

We introduce EvalAI, an open-source platform for evaluating and comparing machine learning (ML) and artificial intelligence (AI) algorithms at scale. EvalAI provides a scalable solution to the research community's critical need to evaluate ML models and agents acting in an environment against annotations or with a human in the loop. This helps researchers, students, and data scientists create, collaborate on, and participate in AI challenges organized around the globe.

Choose Your Neuron: Incorporating Domain-Knowledge through Neuron-Importance
Ramprasaath R. Selvaraju*, Prithvijit Chattopadhyay*, Mohamed Elhoseiny, Tilak Sharma, Dhruv Batra, Devi Parikh, Stefan Lee
ECCV, 2018 (Poster); Continual Learning Workshop, NeurIPS 2018 (Poster); Visually Grounded Interaction and Language (ViGIL) Workshop, NeurIPS 2018 (Poster)
arxiv / blogpost / code

We introduce a simple, efficient zero-shot learning approach -- NIWT -- based on the observation that individual neurons in CNNs implicitly learn a dictionary of semantically meaningful concepts (ranging from simple textures and shapes to whole or partial objects). NIWT learns to map domain knowledge about "unseen" classes onto this dictionary of learned concepts and optimizes for network parameters that can effectively combine them -- essentially learning classifiers by discovering and composing learned semantic concepts in deep networks.

Do explanation modalities make VQA Models more predictable to a human?
Arjun Chandrasekaran*, Viraj Prabhu*, Deshraj Yadav*, Prithvijit Chattopadhyay*, Devi Parikh
EMNLP, 2018 (Poster)
arxiv

A rich line of research attempts to make deep neural networks more transparent by generating human-interpretable 'explanations' of their decision process, especially for interactive tasks like Visual Question Answering (VQA). In this work, we analyze whether existing explanations indeed make a VQA model -- its responses as well as its failures -- more predictable to a human.

Evaluating Visual Conversational Agents via Cooperative Human-AI Games
Prithvijit Chattopadhyay*, Deshraj Yadav*, Viraj Prabhu, Arjun Chandrasekaran, Abhishek Das, Stefan Lee, Dhruv Batra, Devi Parikh
HCOMP, 2017 (Oral)
arxiv / code

We design a cooperative game -- GuessWhich -- to measure human-AI team performance in the specific context of the AI being a visual conversational agent. GuessWhich involves live interaction between the human and the AI, and is designed to gauge the extent to which progress on isolated metrics for AI (and AI-AI teams) transfers to human-AI collaborative scenarios.

It Takes Two to Tango: Towards Theory of AI's Mind
Arjun Chandrasekaran*, Deshraj Yadav*, Prithvijit Chattopadhyay*, Viraj Prabhu*, Devi Parikh
ChaLearn Looking at People Workshop, CVPR, 2017 (Oral)
arxiv / code

To effectively leverage the progress in Artificial Intelligence (AI) to make our lives more productive, it is important for humans and AI to work well together in a team. In this work, we argue that for human-AI teams to be effective, in addition to making AI more accurate and human-like, humans must also develop a theory of AI's mind (ToAIM) - get to know its strengths, weaknesses, beliefs, and quirks.

Counting Everyday Objects in Everyday Scenes
Prithvijit Chattopadhyay*, Ramakrishna Vedantam*, Ramprasaath R. Selvaraju, Dhruv Batra, Devi Parikh
CVPR, 2017 (Spotlight)
arxiv / code

We study the numerosity of object classes in natural, everyday images and build dedicated counting models designed to handle the large variance in counts, appearances, and scales of objects found in natural scenes. We propose a contextual counting approach inspired by the phenomenon of subitizing -- the human ability to make quick assessments of small counts from a perceptual signal.


(Design and CSS courtesy: Jon Barron and Amlaan Bhoi)