I'm a researcher at OpenAI, where I work on large language models. Previously, I was a Principal Research Scientist at Salesforce Research in Palo Alto, where I worked on both research and applications of Deep Learning and Natural Language Processing techniques. I received my PhD from Northwestern University in 2017 under the supervision of Prof. Jorge Nocedal and Prof. Andreas Waechter. My PhD focused on efficiently finding solutions to Mathematical Optimization problems that are nonsmooth or stochastic, a class that includes several problems in Machine Learning and Deep Learning.
Over the years, I have been fortunate to work on a broad set of topics, such as:
Multitask Learning (decaNLP)
Machine Translation (Weighted Transformer)
Paraphrasing (Unsupervised Paraphrasing)
Subword Tokenization (Char2Subword)
Model Security (Multilingual Stealing)
Protein Generation with Transformer Language Models (ProGen)
Generalization in Deep Learning (Large Batch Training and Sharp Minima, PAC-Bayes Approach, Path Sampling)
Mathematical Optimization (Hybrid SGD-Adam Optimizer, L1-norm Minimization, Nonsmooth L-BFGS, Nonmonotone SGD, adaQN)
Medical AI (Breast Cancer Detection)
My Google Scholar profile contains a complete list of my publications.
I am also interested in machine learning systems and tooling, and have authored an engineering blog post on efficiently serving BERT-like models.
Email: keskar.nitish@gmail.com