about me

I am a third-year Ph.D. student at the University of Southern California under the advisement of Dr. Jesse Thomason in the GLAMOR Lab. My work leverages linguistic theory to learn robust sign language models.

5 publications

I have published computer science and linguistics research in the following areas:

  • scholarly document understanding
  • sign language linguistics
  • affective computing

6 years teaching

I have taught Computer Science—from elementary to graduate—since 2016. Topics include:

  • website and app development
  • programming fundamentals
  • computer architecture

7 research talks

Outside of teaching, I enjoy sharing my research with peers in forums such as:

  • Association for Computational Linguistics
  • Information Sciences Institute
  • USC Viterbi School of Engineering
  • Rhodes College Symposium

9 projects

Understanding Knowledge and Intent

  • evaluating machine common-sense in language models
  • predicting research reproducibility
  • finding pragmatic differences between disciplines

Understanding Face and Gesture

  • self-supervised ASL morpheme identification
  • understanding the grammar of facial expressions
  • causes and effects of ASL/English mixing

Understanding Emotion

  • emulating microskills in psychotherapy
  • emulating emotion-focused therapy
  • natural text generation with mixed emotion

selected projects

Identifying Sign Language Morphemes

I use state-of-the-art pose estimation and facial expression recognition to produce linguistically-informed representations of American Sign Language phonemes such as handshape and movement. I then use semi- and self-supervised learning techniques to approximate the meaning of phoneme combinations.
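A minimal sketch of how the front end of this pipeline might look, assuming MediaPipe Holistic and OpenCV for pose estimation (my choice of toolkits, not necessarily the ones used in the project): it extracts one flattened hand-landmark vector per video frame as a crude stand-in for handshape and movement features.

    import cv2
    import mediapipe as mp

    mp_holistic = mp.solutions.holistic

    def hand_features(video_path):
        """Return one flattened (x, y, z) right-hand landmark vector per frame
        in which a hand is detected -- a rough proxy for handshape features."""
        features = []
        cap = cv2.VideoCapture(video_path)
        with mp_holistic.Holistic(static_image_mode=False) as holistic:
            while True:
                ok, frame = cap.read()
                if not ok:
                    break
                # MediaPipe expects RGB input; OpenCV reads frames as BGR.
                results = holistic.process(cv2.cvtColor(frame, cv2.COLOR_BGR2RGB))
                hand = results.right_hand_landmarks
                if hand is not None:
                    features.append(
                        [c for lm in hand.landmark for c in (lm.x, lm.y, lm.z)]
                    )
        cap.release()
        return features

Sequences of these per-frame vectors could then serve as input to the semi- and self-supervised models described above.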

Predicting Research Reproducibility

Many research papers in the social and behavioral sciences do not reproduce, constituting a crisis for those fields. This NSF-funded project seeks to model and predict paper reproducibility. My contributions increase pipeline performance by leveraging psycholinguistic features such as emotionality and coherence, as well as pragmatic features like document structure.
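As an illustration of the modeling side only, the sketch below fits a simple classifier over a hypothetical per-paper feature table; the file name, column names, and the choice of logistic regression with scikit-learn are assumptions for illustration, not the project's actual pipeline.

    import pandas as pd
    from sklearn.linear_model import LogisticRegression
    from sklearn.model_selection import cross_val_score

    # Hypothetical table: one row per paper, psycholinguistic and pragmatic
    # feature columns, plus a binary label for whether the paper reproduced.
    papers = pd.read_csv("paper_features.csv")
    X = papers[["emotionality", "coherence", "num_sections"]]
    y = papers["reproduced"]

    clf = LogisticRegression(max_iter=1000)
    print("mean CV accuracy:", cross_val_score(clf, X, y, cv=5).mean())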

Evaluating Machine Common Sense

Pretrained language models like RoBERTa claim state-of-the-art performance on comprehension, inference, and generation. But do they have the most basic level of common sense? This work operationalizes common sense via cloze testing and measures how disparate a language model's answers are. For example, given "Leopards have ___ on their bodies.", the model completes with "scars" and "tattoos" before "spots", indicating significant confusion about what leopards look like.
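Because the project names RoBERTa and cloze testing, this kind of ranking can be inspected directly with a fill-mask query; the sketch below uses the Hugging Face transformers pipeline (my choice of interface) on the leopard prompt.

    from transformers import pipeline

    # RoBERTa's mask token is <mask>; the pipeline returns candidate
    # completions ranked by probability.
    unmasker = pipeline("fill-mask", model="roberta-base")
    for guess in unmasker("Leopards have <mask> on their bodies.", top_k=5):
        print(f"{guess['token_str'].strip():>10}  {guess['score']:.3f}")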

résumé

Education

  • University of Southern California, Los Angeles, CA. Ph.D. Student (3rd year), Computer Science (3.51/4.00 GPA)
  • Rhodes College, Memphis, TN. Bachelor of Science, Computer Science (3.55/4.00 GPA)

Work Experience

Research Assistant
Fall 2019 – Present
  • Developed novel methods for (a) quantifying language model knowledge gaps and (b) abstract reasoning with knowledge graphs.
  • Developed a pipeline to predict research quality based on textual features, which entailed extracting and interpreting pragmatic and psycholinguistic features.
  • Established a novel research group to study American Sign Language morphology and how it can be used for translating unseen signs and complex grammatical constructions.
  • Served as the student representative for diversity, equity, and inclusion policy.
Teaching Assistant
Spring 2020 – Present
  • Lead a lab and hold office hours for CSCI102 “Fundamentals of Programming” and CSCI104 “Data Structures and Object Oriented Design”.
  • Manage the suite of automatic tests for compiling and grading each assignment.
Teaching Assistant
Fall 2016 – Spring 2019
  • Tutored students in COMP231 “Introduction to Systems Programming”.
Instructor
Fall 2016 – Fall 2019
  • Taught the fundamentals of programming, website development, and app development to 5th-12th grade students across 7 schools.

Skills

  • data & extraction: MySQL, JSON, Scrapy
  • programming: Python, C/C++, C#, Java, Jupyter
  • ai & ml: PyTorch, HuggingFace, scikit-learn, TensorFlow, reinforcement learning
  • nlp: RegEx, spaCy, NLTK, WordNet, LIWC, GloVe, transformers, LSTMs
  • analysis: cluster analysis, probabilistic soft logic, factor analysis
  • prototyping: HTML/CSS, JavaScript, Visual Studio
  • other & non-technical: psycholinguistics, computer vision, multimodal interaction

contact

I am currently looking for internship opportunities for Summer 2023. If my portfolio interests you, please email me at kezarlee[at]gmail[dot]com or click the button below.