Ege C. Kaya
PhD Candidate in Electrical and Computer Engineering, Purdue University
I work on reinforcement learning, optimization, and stochastic decision-making, with an emphasis on mathematically grounded algorithms for distributional RL, multi-objective and risk-sensitive decision-making, stochastic approximation, and robust discrete optimization.
Research
My current research develops theoretical foundations and algorithms for reinforcement learning and stochastic decision-making. I am especially interested in settings where the object of interest is richer than a scalar expected return, including distributional RL, multi-objective RL, risk-sensitive decision-making, and coupled-dynamics environments.
A second line of work studies stochastic approximation and optimization methods that arise in reinforcement learning, including categorical distributional temporal-difference learning, average-reward distributional RL, non-expansive fixed-point problems, and non-convex stochastic optimization under weak noise assumptions.
I am also studying and working on practical aspects of modern large language models, including decoder-only transformers, causal language modeling, supervised fine-tuning, post-training, evaluation, and the role of reinforcement learning and reward modeling in preference optimization.
Earlier in my PhD, I worked on robust and submodular optimization, with applications to sensor selection, multi-task subset selection, federated learning, and distributed online optimization.
Selected Papers and Preprints
- Joint MDPs and Reinforcement Learning in Coupled-Dynamics Environments. Uncertainty in Artificial Intelligence (UAI), 2026. Oral presentation (top 2.2% of 1,087 submissions). arXiv
- Stochastic Dominance Driven First-Order Policy Optimization for Multi-Objective Reinforcement Learning. Uncertainty in Artificial Intelligence, 2026.
- A Finite-Iteration Theory for Asynchronous Categorical Distributional Temporal-Difference Learning. Preprint, 2026. arXiv
- Quotient-Categorical Representations for Bellman-Compatible Average-Reward Distributional Reinforcement Learning. Preprint, 2026. arXiv
- Lower Bounds and Proximally Anchored SGD for Non-Convex Minimization Under Unbounded Variance. Preprint, 2026. arXiv
- Randomized Greedy Methods for Weak Submodular Sensor Selection with Robustness Considerations. Automatica, 2025.
- Localized Distributional Robustness in Submodular Multi-Task Subset Selection. IEEE Transactions on Signal Processing, 2024.
Earlier Work
- Equitable Client Selection in Federated Learning via Truncated Submodular Maximization. IEEE CDC, 2024.
- Relative Entropy Regularization for Robust Submodular Multi-Task Subset Selection. Allerton, 2023.
- Communication-Constrained Exchange of Zeroth-Order Information with Application to Collaborative Target Tracking. ICASSP, 2023.
- High Probability Guarantees for Submodular Maximization via Boosted Stochastic Greedy. Asilomar, 2023.
- Communication-Efficient Zeroth-Order Distributed Online Optimization: Algorithm, Theory, and Applications. IEEE Access, 2023.