Projects

Linguistics Lab | I-Complexity

This repository contains a set of scripts and functions written for the i-complexity project of the Brain Cognition and Language Lab at the University of California, San Diego. The scripts manipulate UniMorph data, which pairs word forms with their corresponding dimensions and cases. The goal was to make it easy to change the in-column sort order of each word's dimensional features.
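A minimal sketch of the reordering idea, not the repository's actual code. It assumes the usual tab-separated UniMorph layout (lemma, form, semicolon-separated feature bundle); the `FEATURE_DIMENSION` table and both function names are hypothetical and cover only a handful of tags for illustration.

```python
# Hypothetical lookup from feature tag to the dimension it belongs to.
# A real table would cover the full UniMorph schema.
FEATURE_DIMENSION = {
    "N": "pos", "V": "pos",
    "NOM": "case", "ACC": "case", "GEN": "case",
    "SG": "number", "PL": "number",
}


def reorder_features(bundle, dimension_order):
    """Reorder a semicolon-separated feature bundle by dimension."""
    rank = {dim: i for i, dim in enumerate(dimension_order)}

    def sort_key(feature):
        # Features with an unknown tag or unlisted dimension sort last.
        dimension = FEATURE_DIMENSION.get(feature)
        return rank.get(dimension, len(rank))

    # sorted() is stable, so features that tie keep their original order.
    return ";".join(sorted(bundle.split(";"), key=sort_key))


def reorder_line(line, dimension_order):
    """Apply the reordering to one 'lemma<TAB>form<TAB>features' line."""
    lemma, form, bundle = line.rstrip("\n").split("\t")
    return "\t".join([lemma, form, reorder_features(bundle, dimension_order)])


print(reorder_features("N;NOM;SG", ["pos", "number", "case"]))
```

Keeping the dimension order as a plain list means a new in-column sort order is a one-argument change.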

Every few days, I receive a Daily Digest from Medium.com containing suggested articles. While I enjoy this feature, many articles come with titles reeking of click-bait. I created this script to make the sifting process more efficient by summarizing every article in the email. This makes it pretty easy to see which articles you might actually want to read and which you can pass on. I built it to be the first thing people open when they start their day, so the info is presented right in their web browser, along with links to LinkedIn, Bloomberg, and Gmail.

Brand Perception Using Spotify and NLP (15 min.)

Here I collect data and identify metrics that can be used to measure “coolness” within a brand, based on Spotify audio features and song lyric data. As a side goal, I sought to develop a framework for a web app that will let users determine their own coolness from their Spotify data. To evaluate the approach, I treated myself as a brand and used audio features as the performance metrics. I identified 5 traits suggesting that my “brand” is perceived as cool, 2 showing that I am un-cool, and 3 that are still to be determined. To improve my brand, I should listen to more music associated with high status, with themes of money, status, and iconicity. Additionally, I should listen to more popular music.

Health Inspection Score Inflation (10 min.)

Going to a restaurant shouldn’t result in getting sick. In examining the Wake County health department’s health inspections, I found that as critical health violations increased, so did health inspection ratings. Not only that, but the most common violations are also the ones most often repeated. Violations such as improper food handling, unclean surfaces, and unsafe heating/cooling are all happening on a consistent basis. By looking at trends among the inspectors themselves, I found that some individuals are more likely than others to inflate scores, and that this trend is present regardless of experience level. I present Wake County with suggestions to increase objectivity and integrity in the inspection process so that underperforming restaurants receive accurate ratings.

Dubai Public Transit and Climate Change

In this notebook I analyze the vulnerability of Dubai’s public transit to both direct and indirect damages resulting from climate change. Most notable is network complexity, which can lead to convoluted bus routes, increased wait times, and increased damage due to high use. I identify several Dubai bus stations that are at high risk of failure due to network complexity, as they carry a high proportion of Dubai RTA’s weight. I suggest that Dubai RTA update the infrastructure at these stations and redistribute the routes that stop there to alleviate load and thus complexity. Such modifications would move Dubai toward a climate-resilient model of public transportation.
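One simple way to flag stations that carry a large share of the network is to count how many routes stop at each one. The sketch below is only a stand-in for the notebook's analysis: the route-share proxy, the 0.5 threshold, the function names, and the toy route data are all assumptions made for illustration.

```python
from collections import Counter


def station_route_share(routes):
    """Map each station to the fraction of routes that stop there.

    `routes` maps a route name to its list of station names.
    """
    counts = Counter()
    for stops in routes.values():
        # Use a set so a station visited twice on a loop counts once per route.
        counts.update(set(stops))
    total_routes = len(routes)
    return {station: n / total_routes for station, n in counts.items()}


def high_load_stations(routes, threshold=0.5):
    """Return stations served by at least `threshold` of all routes."""
    shares = station_route_share(routes)
    return sorted(s for s, share in shares.items() if share >= threshold)


# Toy data with placeholder names, not real Dubai RTA routes.
toy_routes = {
    "R1": ["A", "B", "C"],
    "R2": ["A", "C"],
    "R3": ["A", "D"],
    "R4": ["B", "D"],
}
print(high_load_stations(toy_routes, threshold=0.6))
```

A fuller treatment would likely use a graph centrality measure instead of a raw route count, but the route share is enough to show which stops concentrate load.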

Content Based Recommender

In my free time, I like to watch a bit of anime, but with so many shows to choose from it can be hard to pick something to watch. Here, I use a content-based approach to build an anime recommender. Relevant features included a long list of genres, member ratings, and episode length. Once the model was built, I found that it performed very close to the original database that the data was obtained from, despite not having access to user-specific data.
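A content-based recommender can be sketched with nothing more than genre vectors and cosine similarity. This is an illustration under stated assumptions, not the project's model: it uses genres only (no ratings or episode length), and the function names and catalog entries are placeholders.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length numeric vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # a show with no genres is similar to nothing
    return dot / (norm_a * norm_b)


def genre_vector(genres, vocabulary):
    """One-hot encode a set of genres against a fixed genre vocabulary."""
    return [1.0 if genre in genres else 0.0 for genre in vocabulary]


def recommend(title, catalog, top_n=3):
    """Return the `top_n` titles whose genres are most similar to `title`."""
    vocabulary = sorted({g for genres in catalog.values() for g in genres})
    target = genre_vector(catalog[title], vocabulary)
    scored = [
        (cosine_similarity(target, genre_vector(genres, vocabulary)), other)
        for other, genres in catalog.items()
        if other != title
    ]
    # Highest similarity first; ties broken alphabetically for determinism.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [other for _, other in scored[:top_n]]


# Placeholder catalog; the titles and genres are made up.
catalog = {
    "Show A": {"action", "adventure"},
    "Show B": {"action", "adventure", "comedy"},
    "Show C": {"romance"},
    "Show D": {"action"},
}
print(recommend("Show A", catalog, top_n=2))
```

Because the similarity depends only on item features, no user history is needed, which is the property the paragraph above relies on.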

The Infamous Titanic Dataset (5-10 min.)

I’ve been pretty busy over the summer with my work at the UCSD Brain and Cognition Lab and participating in the Facebook Data Challenge. On top of that, I’ve been doing a lot of self-studying to try and pick up a few new skills that I can apply to my work. I recently ran into some free time and was thinking about how funny/infamous the Titanic dataset is in pockets of the Data Science community. Next thing I knew, I had predicted which passengers aboard the Titanic would survive and which would perish. It definitely lacks impact, but this one was just for fun while I finish up projects in my professional work!

Image Editor and Generator

This is an older project, in which I built a couple of functions that allow the user to distort their images and add text on top of them. The idea was to replicate the “deep fried meme” look that was popular at the time. I implemented functionality for users to add their own captions in different fonts to any of their images, so it’s easy and fun to customize. These functions also work as a streamlined way to simply add text to images, free of any deep frying! It’s something fun to play around with, and you can even save your favorite creations!

Contact