
Sudip Bhujel, PhD Student
Computer Science
sudipbhujel[at]uky[dot]edu
Reinforcement Learning Problems
This project tackles several reinforcement learning problems using OpenAI Gym environments, including CartPole and LunarLander, among others.
Privacy-Preserving RLHF for LLM-Driven Game Agents
PPRLHF is a research project that fine-tunes Large Language Models (LLMs) using Reinforcement Learning from Human Feedback (RLHF) combined with differential privacy, so that training preserves privacy. The system generates preference pairs from TextWorld game trajectories, trains a reward model, and then uses Proximal Policy Optimization (PPO) with LoRA adapters to align the LLM's behavior with optimal gameplay. It demonstrates how RLHF and differential privacy can be integrated for secure and effective LLM alignment.
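To illustrate the reward-model step, here is a minimal sketch of a pairwise preference loss. It assumes the Bradley-Terry style objective commonly used in RLHF; the project's actual loss is not stated here. The function name `preference_loss` and the scalar reward inputs are hypothetical, chosen only for illustration.

```python
import math


def preference_loss(reward_chosen: float, reward_rejected: float) -> float:
    """Pairwise loss -log(sigmoid(r_chosen - r_rejected)).

    Assumption: a Bradley-Terry style objective; not taken from the project.
    """
    margin = reward_chosen - reward_rejected
    # softplus(-margin) equals -log(sigmoid(margin)); branch for numerical stability.
    if margin >= 0:
        return math.log1p(math.exp(-margin))
    return -margin + math.log1p(math.exp(margin))


# The loss is small when the preferred trajectory scores higher, large otherwise.
loss_agree = preference_loss(2.0, 0.0)
loss_disagree = preference_loss(0.0, 2.0)
```

Minimizing this loss over many preference pairs pushes the reward model to score the preferred trajectory above the rejected one.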
Regular Expression Compiler
A complete regular expression compiler that parses custom regex syntax, builds an Abstract Syntax Tree (AST), and generates standalone C code for pattern matching. It supports features such as character classes, quantifiers, alternation, conjunction, negation, and Unicode.
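The parse-then-build-an-AST stage can be sketched for a tiny subset of regex syntax (literals, concatenation, `|`, `*`, and parentheses). This is an illustrative sketch, not the project's code: the tuple-based AST, the function names `parse`, `match_ends`, and `full_match`, and the choice to interpret the AST in Python rather than emit C are all assumptions made to keep the example short.

```python
def parse(pattern: str):
    """Recursive-descent parser producing a tuple-based AST (hypothetical format)."""
    pos = 0

    def peek():
        return pattern[pos] if pos < len(pattern) else None

    def parse_alt():
        nonlocal pos
        branches = [parse_concat()]
        while peek() == "|":
            pos += 1
            branches.append(parse_concat())
        return branches[0] if len(branches) == 1 else ("alt", branches)

    def parse_concat():
        items = []
        while peek() is not None and peek() not in "|)":
            items.append(parse_repeat())
        return items[0] if len(items) == 1 else ("cat", items)

    def parse_repeat():
        nonlocal pos
        node = parse_atom()
        while peek() == "*":
            pos += 1
            node = ("star", node)
        return node

    def parse_atom():
        nonlocal pos
        ch = peek()
        if ch == "(":
            pos += 1
            node = parse_alt()
            if peek() != ")":
                raise ValueError("expected ')'")
            pos += 1
            return node
        if ch is None or ch == "*":
            raise ValueError("unexpected token at position %d" % pos)
        pos += 1
        return ("lit", ch)

    tree = parse_alt()
    if pos != len(pattern):
        raise ValueError("unexpected character at position %d" % pos)
    return tree


def match_ends(node, text, start):
    """Return every end index at which `node` can match, starting at `start`."""
    kind = node[0]
    if kind == "lit":
        return {start + 1} if text[start:start + 1] == node[1] else set()
    if kind == "cat":
        ends = {start}
        for child in node[1]:
            ends = {e for s in ends for e in match_ends(child, text, s)}
        return ends
    if kind == "alt":
        return set().union(*(match_ends(b, text, start) for b in node[1]))
    if kind == "star":
        ends, frontier = {start}, {start}
        while frontier:  # expand until no new end positions appear
            new = {e for s in frontier for e in match_ends(node[1], text, s)} - ends
            ends |= new
            frontier = new
        return ends
    raise ValueError("unknown node kind: %r" % kind)


def full_match(pattern: str, text: str) -> bool:
    return len(text) in match_ends(parse(pattern), text, 0)
```

For example, `full_match("a(b|c)*d", "abcbd")` walks the AST over the input and accepts it. A code generator would instead traverse the same tree and emit matching code for each node.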
Voting System with AI and Blockchain
This voting system uses face recognition to enable two-factor authentication and provides a voting interface backed by a blockchain. A smart contract stores voter details, candidate details, and the results, and carries out the voting action.