About – CRIME IN LOS ANGELES

Research Question: “How does the socioeconomic status of neighborhoods in Los Angeles influence crime rates, and how has the COVID-19 pandemic impacted the relationship between socioeconomic status and crime?”

Sources

Our main source of data came from data.gov and can be found here: https://catalog.data.gov/dataset/crime-data-from-2020-to-present. We based the majority of our research questions on this dataset, which lists each individual crime reported in Los Angeles from 2020 onward. More information can be found in the Data Overview section. We also used Google Scholar and the UCLA Library to conduct external research on our research questions and to find academic sources that strengthen our arguments.

Processing

We used several different applications to make our visualizations. We used R to clean our data, writing functions to automate parts of the process, and also to build some of the visualizations. We then used Tableau to create some of the bar charts and maps that appear throughout our project. We used Voyant for our word clouds and Google Sheets for several of our charts.

Presentation

We presented our narrative by splitting it across several pages. The website starts with crucial background information about LA, then eases the reader into the four main pages, which follow a general sequential order. Each page explains a part of the narrative, and our last page (the conclusion) is a call to action that encapsulates the entire argument of the website.

Bios

Meera Srinivasan is a senior at UCLA studying Sociology with an interest in education policy. In the future, she hopes to work in policy that addresses major gaps in education access within the U.S. Meera served as our project manager and helped guide our project in the right direction. She also worked on the analysis of Central and West L.A. where she dove into the socioeconomic factors that influence LA’s crime rates.

Kevin Hamakawa is a sophomore at UCLA studying Statistics with an interest in Data Science. His main role in the project was working with the data, whether it be cleaning the data for easier use, distributing segments of the data to the different team members, or experimenting with the visualization process of the data.

Abigail Cardenas is a junior at UCLA studying Spanish Community and Culture with an interest in the creator space and influencer marketing. She serves as the content specialist on this project, ensuring that there is a clear narrative and that the audience understands the project. Abby specifically worked on assessing crime rates before and during COVID-19, and the trends that surfaced over this time.

Izak Bunda is a sophomore at UCLA studying Computer Science with an interest in software engineering. His main role in the project was to facilitate the website creation and maintenance. Izak specifically worked on the background pages: LAPD Divisions, Race and Injustice, and Economics of LA.

Haochen He is a junior at UCLA studying Cognitive Science and Global Studies with an interest in using technology to solve global issues. He serves as the data visualization specialist on this project, communicating the information effectively to the public. Haochen led the research and narrative on “Deep Dive on Types of Crimes”.

Data Overview, Cleaning, and Critique

We found our dataset on data.gov, where we chose a dataset containing crime data from 2020 to the present. It can be found here: https://catalog.data.gov/dataset/crime-data-from-2020-to-present.

Overview: The dataset contained exactly 703,807 rows (each row being an individual crime) and 28 columns. As a quick breakdown, these 28 columns included the date the crime was reported, the location of the crime, the type of crime, basic demographic information about the victim, and the type of weapon used, among others.

Cleaning: We quickly noticed the large number of variables in our dataset and wanted to remove any unnecessary columns. We ultimately kept 15 columns for our analysis and used R to “clean” some of the formatting of the dates and locations. For example, the “date” data was presented in DateTime (date + time) format, with the time always being 12:00, which made the time portion redundant. We therefore removed this “12:00” from our date columns, which made our time data easier to analyze. We also combined the Latitude and Longitude columns, which allowed us to plot individual crimes in software such as Palladio and Tableau.
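As a rough illustration of the two cleaning steps described above (dropping the constant time from the dates and combining latitude and longitude), here is a minimal sketch. It is written in Python rather than R and is not the project's actual cleaning script. The column names (`DATE OCC`, `LAT`, `LON`, `Crm Cd Desc`), the raw timestamp format, the combined `LatLon` column, and the sample rows are all assumptions made for the example.

```python
import csv
import io
from datetime import datetime

# Hypothetical sample in an assumed raw layout. These rows are made up for
# illustration and are not taken from the real dataset.
RAW_CSV = """DATE OCC,LAT,LON,Crm Cd Desc
01/08/2020 12:00:00 AM,34.0141,-118.2978,EXAMPLE CRIME A
03/25/2020 12:00:00 AM,34.0459,-118.2545,EXAMPLE CRIME B
"""

# Assumed timestamp format; the write-up only says the time was always 12:00.
RAW_DATE_FORMAT = "%m/%d/%Y %I:%M:%S %p"


def clean_row(row):
    """Return a copy of the row with a date-only field and combined coordinates."""
    cleaned = dict(row)
    # Keep only the calendar date, dropping the constant time component.
    parsed = datetime.strptime(cleaned["DATE OCC"], RAW_DATE_FORMAT)
    cleaned["DATE OCC"] = parsed.date().isoformat()
    # Merge the two coordinate columns into a single "lat,lon" string.
    lat = cleaned.pop("LAT")
    lon = cleaned.pop("LON")
    cleaned["LatLon"] = f"{lat},{lon}"
    return cleaned


def clean_csv(text):
    """Parse CSV text and apply clean_row to every record."""
    reader = csv.DictReader(io.StringIO(text))
    return [clean_row(row) for row in reader]


cleaned_rows = clean_csv(RAW_CSV)
for row in cleaned_rows:
    print(row["DATE OCC"], row["LatLon"], row["Crm Cd Desc"])
```

Whether a single combined coordinate column or two separate columns works better depends on the mapping tool, so the `LatLon` format here should be treated as one possible choice rather than a requirement.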

Critique: Unfortunately, our dataset was limited in several ways. For one, it only covers 2020 to the present. Because we wanted to analyze the impact of COVID-19 on the rate of crime, this left us with only around 3 months of data for our “pre-COVID” comparison. In addition, the dataset includes a lot of information about the victim of each crime but none about the perpetrator. As a result, we could not perform any meaningful analysis of the people committing crimes, which might have helped us arrive at stronger solutions for reducing crime in Los Angeles. Overall, the data was lacking in several areas that would have allowed us to explore more factors.