I am a final-year PhD candidate at the University of New South Wales, Sydney, Australia, affiliated with the ARC Training Centre in Data Analytics for Resources and Environments (DARE).
I am supervised by Dr. Rohitash Chandra, Dr. Sahani Pathiraja, and Professor Lucy Marshall. My research focuses on integrating process-based hydrological models with machine learning to advance environmental process modeling. I have developed hybrid and Bayesian deep learning approaches for rainfall–runoff modeling, flood forecasting, cyclone prediction, and groundwater flow emulation.
Alongside my PhD, I worked as a Climate Science Support Officer at the Bureau of Meteorology, implementing multivariate bias correction for climate projections and modernizing legacy workflows. Previously, I held roles as a Data Scientist and Machine Learning Engineer, designing and deploying scalable ML systems across various domains.
FEB 2023 - APR 2025
Implemented a multivariate bias correction method (MRNBC) for the Australian Climate Service. Developed a Python wrapper for FORTRAN-based bias correction and implemented a parallel compute framework using Dask on NCI Gadi for large-scale climate datasets.
MAR 2022 - AUG 2022
Improved customer retention by identifying key drivers of repeat behavior. Built and deployed regression tree models for predicting and optimizing logistics costs in online retail. Interpreted model predictions using Explainable AI methods (SHAP and LIME).
NOV 2019 - NOV 2021
Worked extensively with machine learning and data mining tools on Big Data technologies. Developed Deep Learning models for detecting data quality issues, including referential integrity failures and outliers, using TensorFlow, Elasticsearch, and Hadoop.
JAN 2019 - NOV 2019
End-to-end development and deployment of Deep Learning pipelines in Apache MXNet using Kubernetes and Docker. Specialized in computer vision problems including object detection, tracking, and human pose estimation.
Deep reinforcement learning
Hierarchical RL approach for teaching maze-solving to humanoid robots using MuJoCo simulation.
Environmental Modelling & Software, 169, 105831 (2023)
A novel hybrid approach combining process-based hydrological models with deep learning for improved streamflow prediction.
Environmental Modelling & Software (2023): 105654
Implementation of variational Bayesian methods for uncertainty quantification in cyclone prediction using RNNs.
Applied Soft Computing (2022): 109528
Novel approach combining swarm optimization with tempered MCMC for training Bayesian neural networks.
Engineering Applications of Artificial Intelligence, 94, 103700 (2020)
Accelerated Bayesian inference using surrogate models to reduce computational overhead in neural network training.
ARC Training Centre in Data Analytics for Resources and Environments (DARE)
Full scholarship for PhD studies at UNSW Sydney
3-month funded internship at University of Sydney
Selected participant, Brisbane, Queensland
Atmos Techfest, BITS Pilani Hyderabad Campus - 2017
Shortlisted participant, LVPEI Hyderabad (MIT Media Lab) - 2016
I'm always open to discussing new opportunities, research collaborations, or interesting projects in data science and machine learning.
arpit.kapoor@unsw.edu.au
Sydney, Australia