CV
[Curriculum Vitae]
Education
- Ph.D. in Computer Science, University at Buffalo, The State University of New York, June 2018
- M.S. in Computer Science, University at Buffalo, The State University of New York, June 2012
- Topic: A Cold Start Recommendation System Using Item Correlation and User Similarity [Report]
- Advisor: Rohini Srihari
- GPA: 4.0 out of 4.0 [Transcript] | Department rank: 1 out of 555
- B.Tech. in Computer Science, National Institute of Technology, Rourkela, May 2005
- Specialization: Discrete Mathematics and Algorithms
- Advisor: Bibhudatta Sahoo
- Cumulative Score: 77% (First class with honors) [Transcript] | Joint Entrance Exam rank: 22 out of 400,000
Honors
- Completed NLP / NLU and RL courses as part of AI certification from Stanford University (Sunnyvale, CA - 2022)
- Was invited to and attended the prestigious 2022 CIFAR DLRL School and OxML 2022 (Sunnyvale, CA - 2022)
- Completed NLP certification from NVIDIA DLI and Full Stack DL certification (Sunnyvale, CA - 2021)
- Reviewer for ICLR (2021 - present), ACL (2021 - present) and NeurIPS (2021 - present) (Sunnyvale, CA - 2021)
- Was invited to and attended the prestigious Theory of Reinforcement Learning program (Sunnyvale, CA - 2020)
- Reviewer for ICML (2020 - present), ACL (2020 - present) and EMNLP 2021 (Sunnyvale, CA - 2020)
- Was invited to and attended the prestigious Foundations of Deep Learning program (Berkeley, CA - 2019)
- Won an NSF Junior Researcher Award to attend the CBMS Conference on Sparse Recovery (Las Cruces, NM - 2017)
- Became an NVIDIA GPU Educator (Santa Clara, CA - 2016)
- Won an NSF Student Travel Award to attend IEEE Big Data 2015 (Santa Clara, CA - 2015)
- Won a rare Research Assistantship covering my second year as a Master's student (Buffalo, NY - 2013)
- Won the Star Performer of the Month award at Cognizant (Kolkata, India - 2008)
- Scored in the 99th percentile in the Zonal, Discipline & National categories of the National IT Aptitude Test (Rourkela, India - 2004)
- Subsequently won a Bhavishya Jyoti Scholarship for the above (Rourkela, India - 2004)
Skills
Python | PyTorch | TensorFlow | Keras | C/C++ | Apache MapReduce | Scala | CUDA | Hive
Experience
- ML/LLM Engineer, Meta (Jan 2024 - Present)
- Ranking & Recommendation @ Meta AI
- Senior AI Scientist/Engineer, LinkedIn (Jul 2021 - Sep 2023)
- Conditional label generation using LLMs
- Built prompt generation pipelines for large-scale LLM inference to assist conditional label generation via in-context learning.
- Worked towards instruction fine-tuning and pre-training of in-house LLMs.
- Special Interest Group (SIG)
- Built a novel unsupervised GNN framework that learns holistic member embeddings by incorporating edge-based features into the graph convolution. Used as seed embeddings, these both accelerated model training and improved model performance for clients.
- Developed a novel strategy for using offline RL methods to build task-oriented dialogue agents. [Slides]
- Standardization/Oribi/Groups
- POC for Education, Degree and Field of Study (FoS) sub-domains in Standardization team.
- Tech Lead for SIG/Oribi teams (10+ engineers), working with product managers to convert business/product requirements into practical, scalable technical solutions and applying ML/DL, GNN and NLP techniques to solve related problems.
- Led firefighting efforts to quickly resolve P0 issues affecting 725K+ and 183K members, resulting in a $5M+ revenue gain.
- Improved average coverage of the education taxonomy from 74% to 77.2% (+3.2 percentage points, roughly a 4% relative gain).
- Built relevance-based models which significantly improved group post contributions (+19.23%) and consumption (+22.18%).
- Scientist I, Amobee (Mar 2020 - Jul 2021)
- Developed a novel bidding strategy based on Win Price (WP) estimation
- Developed and productionized a novel bidding strategy using nonlinear ML-based approaches for estimating WP.
- Built a Factorization Machine (FM/FFM) based ML pipeline for use in production
- Led efforts to build an FM/FFM-based ML pipeline using a novel sparse matrix formulation that can handle high modality features.
- Incorporated user embeddings into existing ML/DL models to improve performance
- Trained BERT/GAN-based generative models to construct user embeddings for use by our existing models.
- Research Scientist, Criteo AI Lab (Jul 2018 - Dec 2019)
- Improved Click-through and Sales prediction
- Enhanced our existing production Click-through and Sales prediction pipeline using nonlinear ML techniques. Improved stability of our new models significantly, from +50% to +5%. A/B tests using the new models resulted in a +3-6% uplift in long-term RexT on all platforms.
- Theoretical aspects of Deep Learning (DL) (worked with Noureddine El Karoui)
- Worked towards understanding kernel and manifold specific aspects of theoretical deep learning.
- Resolving the posterior-collapse issue in Seq2Seq learning
- Developed a quantization based approach towards resolving the posterior-collapse issue. [Paper]
- Research Assistant, The Research Foundation for SUNY (Jan 2018 - May 2018)
- Parallelized Hierarchical Clustering (worked with Haimonti Dutta)
- Worked towards developing a novel parallel hierarchical clustering algorithm using activization strategies.
- Kernel Manifold Learning (worked with Varun Chandola)
- Developed novel Manifold Learning techniques motivated by Gaussian Processes. [Paper]
- Research Scientist Intern, Criteo Research (May 2017 - Dec 2017)
- Cross-domain Query-Product (QP) modeling (worked with Suju Rajan)
- Developed a robust QP model across retailer domains via Domain Adaptation and Optimal Transport based approaches. [Poster]
- Research Assistant, The Research Foundation for SUNY (Jan 2017 - May 2017)
- Representation learning via DL/NLSDR (worked with Varun Chandola / Nils Napp / Jaroslaw Zola)
- Interpreting complex nonlinear processes using DL/NLSDR methods. [Paper][Demo 1][Demo 2]
- Incorporating complex constraints for sparse Logistic Regression (worked with Varun Chandola)
- Worked towards solving the sparse Logistic Regression problem with hierarchical tree-based constraints.
- Teaching Assistant, University at Buffalo, The State University of New York (Sep 2016 - Dec 2016)
- Teaching Assistant for CSE 574 Machine Learning.
- Machine Learning Algorithm Design Intern, BD Biosciences (Jun 2016 - Aug 2016)
- Fast Clustering of Flow Cytometry (FC) data
- Scaled up BD’s clustering framework for high-dimensional FC data by up to ~16x. [Poster][Slides]
- Teaching Assistant, University at Buffalo, The State University of New York (Jan 2016 - May 2016)
- Teaching Assistant for CSE 574 Machine Learning.
- Research Assistant, University at Buffalo, The State University of New York (Jun 2013 - Dec 2015)
- Nonlinear Spectral Dimensionality Reduction (worked with Varun Chandola / Jaroslaw Zola / Nils Napp)
- Developed scalable NLSDR methods in a streaming setting. [Paper]
- Social Network Modeling (worked with Varun Chandola)
- Developed the xKPGM model for social network modeling. [Paper]
- Variance Reduction techniques in Distributed Optimization (worked with Haimonti Dutta / Varun Chandola)
- Worked towards developing novel variance reduction techniques for the ERM problem.
- Understanding Rumor Propagation in Social Networks (worked with Shambhu Upadhyaya / Varun Chandola)
- Worked towards modeling rumor propagation in social networks.
- Volcanic Flow Prediction (worked with Abani Patra / Varun Chandola / Paul Bauman)
- Developed a novel Gaussian Process-based model for prediction of volcanic flow using GPUs.
- Teaching Assistant, University at Buffalo, The State University of New York (Sep 2012 - May 2013)
- Teaching Assistant for CSE 510 Introduction to Robotic Algorithms.
- Research Assistant, The Research Foundation for SUNY (Jun 2011 - Aug 2012)
- Localization via Entropy Reduction (worked with Robert Platt)
- Developed a novel active localization technique via sequential reduction of entropy using OpenRAVE/ROS. [Report][Demo]
- AIRS (worked with Rakesh Nagi)
- Name2Face (worked with Bina Ramamurthy)
- Built Name2Face, a cloud application consuming Microsoft cloud services.
- Research Assistant, University at Buffalo, The State University of New York (Nov 2010 - May 2011)
- Persistent URLs (PURLs) (worked with Alan Ruttenberg)
- Enhanced the PURLs implementation to include more intelligence.
- Protege 4.1 plugin development (worked with Alan Ruttenberg)
- Built plugins for Protege 4.1 for processing ontologies.
- Associate, Cognizant (Jun 2005 - Jul 2010)
- Developed ExProc, a tool for processing Excel documents.
- Built SuperAgent 4.0, a reservation tool that interacts with the Novasol and Cuendet servers.
- Developed, along with my team, the Universal Agent Tool, which aimed to merge operations across various CRS.
- Contributed to the white paper "Using Venn Diagrams to capture Business Requirements".
Additional
Publications
Mahapatra, Suchismit, and Varun Chandola. "Learning manifolds from non-stationary streams." Journal of Big Data 11, no. 1 (2024): 42.
Mahapatra, Suchismit, Vladimir Blagojevic, Pablo Bertorello, and Prasanna Kumar. "New Methods & Metrics for LFQA tasks." arXiv preprint arXiv:2112.13432 (2021).
Doan, Khoa D., Saurav Manchanda, Suchismit Mahapatra, and Chandan K. Reddy. "Interpretable graph similarity computation via differentiable optimal alignment of node embeddings." In Proceedings of the 44th International ACM SIGIR Conference on Research and Development in Information Retrieval, pp. 665-674. 2021.
Zhao, Yang, Ping Yu, Suchismit Mahapatra, Qinliang Su, and Changyou Chen. "Discretized Bottleneck: Posterior-Collapse-Free Sequence-to-Sequence Learning." (2020).
Mahapatra, Suchismit, and Varun Chandola. "S-Isomap++: Multi manifold learning from streaming data." In 2017 IEEE International Conference on Big Data (Big Data), pp. 716-725. IEEE, 2017.
Schoeneman, Frank, Suchismit Mahapatra, Varun Chandola, Nils Napp, and Jaroslaw Zola. "Error metrics for learning reliable manifolds from streaming data." In Proceedings of the 2017 SIAM International Conference on Data Mining, pp. 750-758. Society for Industrial and Applied Mathematics, 2017.
Mahapatra, Suchismit, and Varun Chandola. "Modeling graphs using a mixture of Kronecker models." In 2015 IEEE International Conference on Big Data (Big Data), pp. 727-736. IEEE, 2015.
Talks
December 12, 2017
Talk at 2017 IEEE International Conference on Big Data, Boston, Massachusetts
April 28, 2017
Talk at 2017 SIAM International Conference on Data Mining, Houston, Texas
November 04, 2016
Talk at UB Department of Computer Science and Engineering, Buffalo, New York
August 15, 2016
Talk at BD Biosciences, San Jose, California
October 30, 2015
Talk at 2015 IEEE International Conference on Big Data, Santa Clara, California
May 04, 2012
Talk at iRobot Corporation, Bedford, Massachusetts
Teaching
October 10, 2019
Class at ML Boot Camp, Criteo Research, Palo Alto Research Center
October 01, 2019
Class at ML Boot Camp, Criteo Research, Palo Alto Research Center
April 11, 2019
Class at ML Boot Camp, Criteo Research, Palo Alto Research Center
April 02, 2019
Class at ML Boot Camp, Criteo Research, Palo Alto Research Center
October 03, 2018
Class at ML Boot Camp, Criteo Research, Palo Alto Research Center