Recurrent Reinforcement Learning with Applications to Financial Trading

This project implements both recurrent and standard feedforward Proximal Policy Optimization (PPO) algorithms and compares their performance on two standard control environments. We develop a stock trading environment and train a recurrent PPO agent to maximize return on a basic stock trading task.
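For readers unfamiliar with PPO, the clipped surrogate objective at the core of the algorithm can be sketched in a few lines. This is a minimal illustration and not the project's code: the function names and the default `epsilon=0.2` are assumptions chosen for the example, and the recurrent and feedforward variants differ in the policy network, not in this objective.

```python
def clipped_surrogate(ratio, advantage, epsilon=0.2):
    """PPO clipped surrogate for one sample (a quantity to maximize).

    ratio     -- new-policy probability / old-policy probability of the action
    advantage -- estimated advantage of that action
    epsilon   -- clip range (0.2 is an assumed, commonly used value)
    """
    # Restrict the probability ratio to [1 - epsilon, 1 + epsilon].
    clipped_ratio = max(1.0 - epsilon, min(ratio, 1.0 + epsilon))
    # Take the more pessimistic of the unclipped and clipped terms.
    return min(ratio * advantage, clipped_ratio * advantage)


def mean_clipped_surrogate(ratios, advantages, epsilon=0.2):
    """Average the per-sample objective over a batch."""
    terms = [clipped_surrogate(r, a, epsilon) for r, a in zip(ratios, advantages)]
    return sum(terms) / len(terms)
```

The clip removes any incentive to move the policy far from the one that collected the data: once the ratio leaves the clip range in the direction that would improve the objective, the gain is capped.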

In collaboration with Anubhav Guha

Smoothed reward of the modified Half Cheetah experiment, averaged across 5 random starting seeds. The recurrent model outperforms the feedforward model.


Yearly returns of the recurrent policy, as compared to the yearly returns of each individual stock in the portfolio. For each valid day in the given trading range, the environment is started and run forward for 243 trading days.

An ODE to the Bicycle Model

In a high-speed autonomous race, it is critical that the AI system controlling the vehicle has an accurate model of vehicle dynamics to make acceleration and steering commands. This project explored learning-based techniques (gradient descent parameter estimation and neural networks) to improve the accuracy and speed of a commonly used dynamical model, the bicycle model.

In collaboration with Emily Wu, David Koplow, and Mark Olchanyi

Recipe Ingredient Recommender

Food is one of the universal languages. In this project, we explored an algorithm that draws on the conventional wisdom of existing dishes. While apples and cinnamon frequently appear together in dishes, strawberries and mayonnaise do not. Given a list of ingredients, this project uses topic modeling and neural networks to suggest additional ones that a chef may add to their dish.

In collaboration with James Peraire and Lucy Halperin


Photo Painterly Rendering

For the artistically challenged like me, painting a masterpiece seems out of reach... That's why this project explores computer-generated paintings. It implements an algorithm that renders photographs as images that appear hand-painted.

Air Instrument + Audio Synthesizer

As DJ YYY, I have experimented with music and mash-ups. In this project, I constructed a light-manipulated instrument using integrated circuits and analog light sensors. I also integrated it with software that performs digital signal processing to add synthesizer effects (echo, reverb, and clipping) to audio input.
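Two of the effects named above, echo and clipping, are simple enough to sketch on a list of audio samples. This is an illustrative sketch and not the project's code: the function names, the single-tap echo design, and the default parameter values are assumptions. Reverb is omitted because it usually combines many such delayed copies.

```python
def add_echo(samples, delay, decay=0.5):
    """Mix in one delayed, attenuated copy of the signal.

    samples -- list of floats (audio samples)
    delay   -- echo delay in samples
    decay   -- gain applied to the delayed copy (0.5 is an assumed value)
    """
    out = list(samples)
    for n in range(delay, len(samples)):
        # y[n] = x[n] + decay * x[n - delay]
        out[n] += decay * samples[n - delay]
    return out


def hard_clip(samples, threshold=1.0):
    """Limit every sample to the range [-threshold, threshold]."""
    return [max(-threshold, min(threshold, s)) for s in samples]


# Usage: a single impulse picks up an echo two samples later.
impulse = [1.0, 0.0, 0.0, 0.0]
echoed = add_echo(impulse, delay=2, decay=0.5)
clipped = hard_clip([-2.0, 0.3, 1.7], threshold=1.0)
```

Hard clipping flattens the waveform peaks, which adds harmonics and produces the distortion sound; the echo is a feed-forward delay line with one tap.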

MIT Confession Scraper

When MIT Confessions was not posting enough, someone created MIT Timely Confessions. As a hard-working MIT student, I simply did not have enough time in the day to check both accounts every day. This project is therefore a scraper that extracts content from both pages and sends unread confessions to designated recipients in a daily email.