Yi (Jerry) Li

I am a fourth-year undergraduate student at the University of California San Diego, pursuing a double major in Data Science at the Halicioglu Data Science Institute (HDSI) and Cognitive Science with a specialization in Machine Learning and Neural Computation at the CogSci Department, along with a minor in Computer Engineering at the Electrical and Computer Engineering Department.

My primary research interest is centered on the development and optimization of multimodal large language models (LLMs), and the application of advanced ML & AI techniques to the biomedical and healthcare sectors. Currently, I am working in Professor Zhuowen Tu's Machine Learning, Perception, and Cognition Lab (mlPC), mentored by Dr. Yifan Xu on a Visual LLM project. I am also working in the Qualcomm Institute Research Organization, contributing to a Data Analysis in Public Health project, advised by Dr. Ganz Chockalingam and Dr. Marie-Laure Charpignon. Prior to this, I was fortunate to work in Professor Terry Sejnowski's Computational Neurobiology Laboratory (CNL), mentored by Dr. Margot Wagner on an applied deep learning project.

I have served as a Teaching Assistant for several courses across the Computer Science and Engineering department and the Halicioglu Data Science Institute (HDSI), helped over 1,000 individual students understand course materials, and worked with 4 teaching faculty members to enhance course experiences for students. I was primarily mentored by Professor Marina Langlois for teaching.

Email  /  GitHub  /  HuggingFace  /  Google Scholar  /  LinkedIn

profile photo

Research

I'm interested in large language models, computer vision, machine learning, and the application of artificial intelligence.

project image

BLIVA: A Simple Multimodal LLM for Better Handling of Text-Rich Visual Questions


Wenbo Hu*, Yifan Xu*, Yi Li, Weiyue Li, Zeyuan Chen, Zhuowen Tu
AAAI 2024, 2023
website / arxiv / code

We introduce BLIVA, an augmented version of InstructBLIP with Visual Assistant. BLIVA incorporates the query embeddings from InstructBLIP and also directly projects encoded patch embeddings into the LLM, a technique inspired by LLaVA. This approach ensures that the model captures intricate details potentially missed during the query decoding process. Empirical evidence demonstrates that BLIVA significantly enhances performance on text-rich VQA benchmarks (up to 17.76% on the OCR-VQA benchmark) and on typical VQA benchmarks (up to 7.9% on the Visual Spatial Reasoning benchmark), compared to our baseline, InstructBLIP. BLIVA demonstrates significant capability in decoding real-world images, irrespective of text presence.




Teaching

Below is a comprehensive list of classes where I previously served as a teaching assistant. Instructors are listed in the order I collaborated with them. Where available, evaluations from both instructors and students are attached.

teaching image

Recommender Systems and Web Mining


Julian McAuley
UCSD CSE 158 FA23
website

This course focuses on modern techniques for data mining and predictive analytics. It emphasizes analyzing real-world datasets, creating functional systems, and applying contemporary machine learning research concepts in practical scenarios.

teaching image

Programming and Basic Data Structures for Data Science


Marina Langlois
UCSD DSC 20 WI22
website

This course offers insight into the foundational structures of the programs, algorithms, and languages essential to data science. Building on the computational concepts from DSC 10, it further introduces students to abstraction techniques. Taught in Python, the curriculum delves into topics like recursion, advanced functions, function composition, object-oriented design, interpreters, classes, and basic data structures such as arrays, lists, and linked lists.

teaching image

Principles of Data Science


Suraj Rampure
UCSD DSC 10 FA21
evaluation / website

The introductory course in data science familiarizes students with data exploration, statistical inference, and forecasting. Python is introduced as the go-to language for handling tabular data, generating visuals, and running simulations. By engaging in homework tasks and projects, students hone their analytical prowess using real-world datasets from diverse fields.

teaching image

Principles of Data Science


Justin Eldridge
UCSD DSC 10 S121
evaluation / website

The introductory course in data science familiarizes students with data exploration, statistical inference, and forecasting. Python is introduced as the go-to language for handling tabular data, generating visuals, and running simulations. By engaging in homework tasks and projects, students hone their analytical prowess using real-world datasets from diverse fields.




Industry

To be updated soon (projects in collaboration with industry partners).




Other Projects - Deep Learning

These include coursework, projects, and other research-related tasks not intended for publication.

project image

Automatic Image Annotation


Yi Li, Weiyue Li, Linghang Kong, Yibo Wei, Shuangmu Wu
Deep Learning, UCSD, 2022
paper / code

In this study, an algorithm was designed using PyTorch to caption images using various Recurrent Neural Network (RNN) models, including LSTM, Vanilla RNN, and a custom ‘Architecture 2’. These models were trained on a subset of the COCO Image Captioning Task dataset due to GPU limitations. Model performance was evaluated using metrics such as cross entropy loss, BLEU-1, and BLEU-4 scores. The optimal Vanilla RNN model yielded a BLEU-1 score of 68.3% and BLEU-4 score of 8.9% with stochastic sampling at 0.001 temperature. Similarly, ‘Architecture 2’ with a hidden size of 1024 achieved comparable BLEU scores. Key findings reveal the hidden size’s significant impact on performance, while embedding size showed lesser influence. LSTM and Architecture 2 models with advanced gating mechanisms outperformed Vanilla RNN. Introducing images at every time step was beneficial when evaluated using the BLEU score.

project image

Optimization and Evaluation of Multi-layer Neural Networks: Exploring Regularization, Learning Rates, and Topologies


Yi Li, Weiyue Li, Linghang Kong
Deep Learning, UCSD, 2022
paper / code

In this study, we implemented a multi-layer neural network that features forward and backward propagation, several regularization techniques, and momentum-based optimization. Our goal was to classify Japanese Hiragana handwritten characters from the KMNIST dataset using a softmax output layer. We employed one-fold cross-validation to assess the model’s performance and incorporated various regularization methods. Our most effective model utilized ReLU activation functions and achieved an accuracy of 0.8688. After making further adjustments to the architecture, including changes to the layer count and hidden units, we observed a test set accuracy of 0.8626.




Other Projects - Reinforcement Learning

These include coursework, projects, and other research-related tasks not intended for publication.

project image

Enhancing Gomoku Gameplay with Monte Carlo Tree Search: Implementation and Evaluation


Yi Li
Reinforcement Learning, UCSD, 2023
video / code

Monte Carlo Tree Search (MCTS) is implemented to enhance the artificial intelligence of Gomoku, a classic board game where players aim to form a line of five pieces of their color. The MCTS agent not only decides on an action but also provides a table showcasing winning rates for all potential actions. Two primary evaluation metrics are used: one measures the accuracy of the winning rate table and the other pits the AI against a random-play agent.

project image

Optimizing Blackjack Gameplay: An Exploration of Reinforcement Learning Strategies


Yi Li
Reinforcement Learning, UCSD, 2023
code

This study focuses on the implementation and comparison of reinforcement learning algorithms in the game of Blackjack. Utilizing a game engine, based on a simplified version of standard Blackjack rules, this investigation covers three reinforcement learning strategies: Monte Carlo (MC) policy evaluation, Temporal-Difference (TD) policy evaluation, and Q-Learning. The primary objective is to understand the effectiveness of these techniques in deriving an optimal playing strategy.
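As a rough illustration of the three strategies named above, the sketch below shows one textbook form of each update rule. It is not the project's code: the function names, the state and action labels, and the default `alpha` and `gamma` values are all assumptions made for this example.

```python
from collections import defaultdict

def mc_update(V, N, episode, gamma=1.0):
    """Every-visit Monte Carlo policy evaluation.

    `episode` is a list of (state, reward received after leaving that state).
    V holds value estimates and N holds visit counts (both hypothetical names).
    """
    G = 0.0
    for state, reward in reversed(episode):
        G = reward + gamma * G             # return from this state onward
        N[state] += 1
        V[state] += (G - V[state]) / N[state]  # running average of returns

def td0_update(V, state, reward, next_state, done, alpha=0.1, gamma=1.0):
    """TD(0) policy evaluation: bootstrap from the next state's estimate."""
    target = reward + (0.0 if done else gamma * V[next_state])
    V[state] += alpha * (target - V[state])

def q_learning_update(Q, state, action, reward, next_state, done,
                      actions, alpha=0.1, gamma=1.0):
    """Q-Learning: bootstrap from the best action in the next state."""
    best_next = 0.0 if done else max(Q[(next_state, a)] for a in actions)
    Q[(state, action)] += alpha * (reward + gamma * best_next - Q[(state, action)])
```

In a Blackjack setting the state might be a tuple such as (player sum, dealer card, usable ace) and the actions "hit" and "stick", but any hashable labels work with these functions when `V`, `N`, and `Q` are `defaultdict(float)` or `defaultdict(int)`.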




Other Projects - Search Algorithm

These include coursework, projects, and other research-related tasks not intended for publication.

project image

Optimizing 2048 Game Performance: An Exploration of Expectimax Search and Heuristic Evaluations


Yi Li
Search Algorithm, UCSD, 2023
video / code

In this study, we explore the development and optimization of a game AI for the 2048 puzzle game, employing the expectimax search algorithm. The proposed AI model treats the player as a “max player,” making decisions that maximize potential outcomes, while the computer acts as a “chance player”, placing a 2-tile in a random open spot. The AI’s decision-making relies on a depth-3 game tree, wherein the tree is structured with alternating player and computer moves, culminating in terminal nodes that evaluate the game state score. For further optimization, we suggest potential improvements to the search depth and the evaluation function.
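The search described above can be sketched as follows. This is a minimal illustration, not the project's implementation: it assumes the board is a tuple of tuples with 0 for empty cells, that the chance player places a 2-tile uniformly at random, and that leaves are scored by the points gained from merges along the path. All function names are hypothetical.

```python
def slide_left(row):
    """Slide and merge one row to the left; return (new_row, points_gained)."""
    tiles = [t for t in row if t]
    out, gained, i = [], 0, 0
    while i < len(tiles):
        if i + 1 < len(tiles) and tiles[i] == tiles[i + 1]:
            out.append(tiles[i] * 2)
            gained += tiles[i] * 2
            i += 2
        else:
            out.append(tiles[i])
            i += 1
    return out + [0] * (len(row) - len(out)), gained

def move(board, direction):
    """Apply 'L', 'R', 'U' or 'D'; return (new_board, points_gained)."""
    if direction in "UD":
        board = tuple(zip(*board))  # transpose so columns become rows
    rows, gained = [], 0
    for row in board:
        row = list(row)
        if direction in "RD":
            row.reverse()
        new, g = slide_left(row)
        if direction in "RD":
            new.reverse()
        rows.append(tuple(new))
        gained += g
    board = tuple(rows)
    if direction in "UD":
        board = tuple(zip(*board))  # transpose back
    return board, gained

def expectimax(board, score, depth, player_turn):
    """Expected score of `board` looking `depth` plies ahead."""
    if depth == 0:
        return score
    if player_turn:  # max node: pick the best legal move
        values = []
        for d in "LRUD":
            nb, g = move(board, d)
            if nb != board:
                values.append(expectimax(nb, score + g, depth - 1, False))
        return max(values) if values else score
    # chance node: average over every empty cell receiving a 2-tile
    empties = [(r, c) for r, row in enumerate(board)
               for c, t in enumerate(row) if t == 0]
    if not empties:
        return score
    total = 0.0
    for r, c in empties:
        nb = tuple(tuple(2 if (i, j) == (r, c) else t for j, t in enumerate(row))
                   for i, row in enumerate(board))
        total += expectimax(nb, score, depth - 1, True)
    return total / len(empties)

def best_move(board, depth=3):
    """Return the legal direction with the highest expectimax value, or None."""
    board = tuple(tuple(row) for row in board)
    best, best_val = None, float("-inf")
    for d in "LRUD":
        nb, g = move(board, d)
        if nb != board:
            v = expectimax(nb, g, depth - 1, False)
            if v > best_val:
                best, best_val = d, v
    return best
```

With `depth=3` the tree covers a player move, a tile placement, and a second player move before leaves are scored. A stronger evaluation function (for example, one that rewards empty cells or monotonic rows) would replace the plain `score` returned at the leaves.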

project image

Terrain-Aware Pathfinding in Grid Worlds: A Comparative Study of Search Algorithms and Thematic Test Cases


Yi Li
Search Algorithm, UCSD, 2023
code

In this study, we present a comprehensive approach to pathfinding in a grid world environment. Given a grid with specific terrains such as grass and puddles, our objective is to determine optimal paths from a starting point to a goal. We approach the problem by implementing and evaluating four distinct search algorithms: Depth First Search (DFS), Breadth First Search (BFS), Uniform Cost Search (UCS), and A* Search with Manhattan Distance as the heuristic. The program leverages the PyGame library for visualization and is set up in a Python virtual environment.
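To illustrate the last of the four algorithms, here is a minimal A* sketch with the Manhattan distance heuristic on a small text grid. The terrain symbols and their costs are assumptions chosen for this example and do not come from the project; the cost of a step is taken to be the cost of the cell being entered.

```python
import heapq

# Hypothetical terrain costs; '#' (or any other symbol) is impassable.
COST = {".": 1, "g": 3, "p": 5}  # plain ground, grass, puddle

def manhattan(a, b):
    return abs(a[0] - b[0]) + abs(a[1] - b[1])

def a_star(grid, start, goal):
    """Return (total_cost, path) for the cheapest 4-connected path, or None."""
    rows, cols = len(grid), len(grid[0])
    frontier = [(manhattan(start, goal), 0, start)]  # (f = g + h, g, cell)
    best = {start: 0}
    parent = {start: None}
    while frontier:
        _, g, cell = heapq.heappop(frontier)
        if cell == goal:
            path = []
            while cell is not None:
                path.append(cell)
                cell = parent[cell]
            return g, path[::-1]
        if g > best[cell]:
            continue  # stale queue entry; a cheaper route was already found
        r, c = cell
        for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            if 0 <= nr < rows and 0 <= nc < cols and grid[nr][nc] in COST:
                ng = g + COST[grid[nr][nc]]
                if ng < best.get((nr, nc), float("inf")):
                    best[(nr, nc)] = ng
                    parent[(nr, nc)] = cell
                    f = ng + manhattan((nr, nc), goal)
                    heapq.heappush(frontier, (f, ng, (nr, nc)))
    return None
```

Because every step costs at least 1, the Manhattan distance never overestimates the remaining cost, so the first time the goal is popped its path is optimal. Setting the heuristic to zero turns the same loop into Uniform Cost Search.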




Other Projects - General Algorithm

These include coursework, projects, and other research-related tasks not intended for publication.

project image

Backtracking and SAT Solving: Innovations in Sudoku Puzzle Solutions


Yi Li
General Algorithm, UCSD, 2023
code

The study revolves around the development and optimization of a constraint solver tailored for Sudoku puzzles, with backtracking search as its core mechanism. A heuristic that selects the unassigned variable with the smallest domain is proposed to refine the decision-making process. Different testing scenarios are introduced, from basic propagation-only tests to complex ones that require both propagation and search. In addition, the study explores SAT solving: Sudoku is encoded as propositional logic formulas in conjunctive normal form, and state-of-the-art SAT solvers, such as PicoSAT and CryptoMiniSat, are then used to derive solutions. This approach not only illuminates the intricacies of Sudoku solving but also highlights potential methodologies for addressing its challenges.
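The smallest-domain heuristic can be sketched in a few lines. The version below is an illustrative simplification rather than the project's solver: it recomputes each cell's candidates on the fly instead of maintaining propagated domains, and the names `candidates` and `solve` are hypothetical.

```python
def candidates(board, r, c):
    """Values 1-9 not yet used in the row, column, or 3x3 box of cell (r, c)."""
    used = set(board[r]) | {board[i][c] for i in range(9)}
    br, bc = 3 * (r // 3), 3 * (c // 3)
    used |= {board[i][j] for i in range(br, br + 3) for j in range(bc, bc + 3)}
    return [v for v in range(1, 10) if v not in used]

def solve(board):
    """Fill `board` (9x9 list of lists, 0 = empty) in place; return True on success."""
    empties = [(r, c) for r in range(9) for c in range(9) if board[r][c] == 0]
    if not empties:
        return True
    # Smallest-domain heuristic: branch on the cell with the fewest candidates.
    r, c = min(empties, key=lambda rc: len(candidates(board, *rc)))
    for v in candidates(board, r, c):
        board[r][c] = v
        if solve(board):
            return True
        board[r][c] = 0  # undo the assignment and try the next value
    return False
```

Choosing the most constrained cell first tends to expose dead ends early: a cell with no candidates is selected immediately, the loop body never runs, and the search backtracks without exploring further.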




Other Projects - Recommender System

These include coursework, projects, and other research-related tasks not intended for publication.

project image

Recipe Recommender System


Yi Li, Weiyue Li, Xiaoyue Wang, Ruoyu Hou
Recommender System, UCSD, 2022
paper / code

Personalization is becoming increasingly crucial for user experiences across various websites. In this paper, we first conduct exploratory data analysis on datasets sourced from food.com. We then implement several recommender system models to recommend recipes to users, predict ratings using sentiment analysis, and forecast recipe categories. We establish both a baseline and an enhanced model for each prediction task to facilitate comparisons. After optimizing the hyperparameters of our enhanced models, we attain a testing accuracy of 73.87% for the recipe recommendation model, a testing MSE of 0.96 for rating predictions, and a testing accuracy of 94.43% for category predictions. Furthermore, we propose future research directions to investigate the reasons for the observed disparities between the baseline and enhanced models.




Other Projects - Computer Graphics

These include coursework, projects, and other research-related tasks not intended for publication.

project image

Shadow Mapping


Yi Li, Xiaoyue Wang
Computer Graphics, UCSD, 2022
video / code

In the evolution of computer graphics technology, accurately rendering 3D scenes is a central challenge. Shadows are critical in imparting realism to these scenes; their absence can render visual outputs incomplete or unnatural. This project delves into the “Shadow Mapping” algorithm, originally introduced by Lance Williams. Using the built-in capabilities of the OpenGL library, we implemented this algorithm with a distant light as the primary light source. The methodology follows a two-step process: building the shadow map and then applying it. Our approach primarily centered around depth, leading to the creation of a “DepthShader” for storing shadow mapping information. Preliminary results showed significant success in shadow representation, notably with shadows of certain objects. However, shadows from some structures, such as table legs, showed slight misalignments. Though time constraints posed limitations, future refinements could explore strategies like stratified Poisson sampling of the shadow map and the perspective shadow map algorithm. This investigation highlights the importance of shadows in computer graphics and the fine balance between algorithmic precision and visual realism.


Design and source code from Jon Barron's website. Forked from the Jekyll variant by Leonid Keselman.