APRIL 25TH: GRIND TIME

My presentation for my CIS capstone happened today and I’m finally free. It’s time for me to lock in and finish writing this report. While I didn’t get around to rewriting my first report, I’m looking forward to putting the advice I received on it into action for report number 2. The main point I want to focus on is ensuring that any pictures or figures I include are relevant to the paper and actually help readers understand it. I have a tendency to include pictures in each section just because the sections feel barren to me, but that’s not very effective when trying to persuade readers.

APRIL 18TH: STRUCTURING A REPORT

Now that I have a subset of the datasets for each group, I’ve been running statistics on them, such as the type of protest (violent, non-violent, etc.) or the size of the protest. I’m in the process of figuring out what I want the overall focus of my report to be and what data I can use to accomplish that.

The main idea I have right now is seeing whether I can relate the reasons for protests to geographical location. For example, do protests revolving around education happen in states with a higher education standard, such as Massachusetts?

APRIL 11TH: SPLITTING THE DATA

I’ve decided how to group the different data points based on the actors present in the protest and now need to divide the datasets. Normally I’d do this through fancy indexing, but the actor columns I’m using for grouping contain string data, which doesn’t interact nicely with fancy indexing. Because of this, I’ll need to run through the datasets manually with loops, looking for substrings within each actor entry. This will take some time but will ultimately lead to multiple curated data frames for analysis.
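
As a rough sketch of that loop-and-substring approach, here is a plain-Python version. The keyword lists, the `actor` field name, and the sample events are all made up for illustration; the real groupings and column names aren’t shown in this post.

```python
from collections import defaultdict

# Hypothetical keyword lists; the actual groupings are not given in this post.
ACTOR_GROUPS = {
    "students": ["student", "teacher"],
    "labor": ["labor", "union", "worker"],
}

def split_by_actor(events, groups=ACTOR_GROUPS, actor_key="actor"):
    """Return {group name: [events whose actor text contains a keyword]}."""
    subsets = defaultdict(list)
    for event in events:
        # Lowercase once so the substring check is case-insensitive.
        actor_text = (event.get(actor_key) or "").lower()
        for name, keywords in groups.items():
            if any(word in actor_text for word in keywords):
                subsets[name].append(event)
    return dict(subsets)

# Made-up rows standing in for the protest data.
events = [
    {"actor": "Protesters (United States); Students (United States)", "size": 200},
    {"actor": "Protesters (United States); Labor Group (United States)", "size": 50},
    {"actor": "Protesters (United States)", "size": 10},
]
subsets = split_by_actor(events)
```

An event can land in more than one group if its actor text matches several keyword lists, and events that match nothing are left out. If the data is already in pandas, `Series.str.contains` should produce a boolean mask for the same kind of substring check, which might avoid the explicit loop.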

APRIL 4TH: GROUPING PROTESTERS

It seems like a sentiment analysis on the notes for each event is a dead end. While it’s technically possible, the notes lack many words that carry strong positive or negative connotations. Upon further inspection, the notes column doesn’t actually contain articles about the protests, but rather facts about them organized into a paragraph.

Because of this, I’ve decided to switch my focus over to grouping the types of people and organizations associated with protests. Once I have this, there are many different relationships I can look into, such as who participates in the most violent protests or which group amasses the largest number of total protesters.

MARCH 28TH: PROTEST MAPPING

Other than doing basic statistics for the datasets, I’ve been trying to map the locations of each of the riots. The United States was easy since I could basically reuse my code from the beginning of the semester, but mapping the Indian dataset has been difficult, as the geospatial files for Python aren’t documented very well. Outside of that, I also want to look into what types of corporations are associated with the majority of protests.

Lastly, I’ve just been revising my report on the Police Shootings data.

MARCH 21ST: A NEW FRONTIER

This week marks the second half of the semester and we’re moving on to a different dataset. The data that we’re looking at is for protests, both peaceful and violent, in the United States and India. I’ve only just started looking at the data, but one thing that sticks out to me is the notes column. I don’t know how successful it will be because the notes I’ve seen so far are very fact-driven, but I’m curious about running a sentiment analysis on the notes.

FEBRUARY 28TH: WRITING A REPORT

As the deadline for the report approaches, I’ve fully switched my focus over to writing it. One thing I wanted to try to include was statistical tests, mainly to compare the different groupings I made of the states for my gun law research.
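
To give an idea of what comparing two groupings could look like, here is a small sketch of an exact permutation test on the difference in means. This is just one possible test, not necessarily the one that ends up in the report, and the group names and numbers are invented for the example.

```python
from itertools import combinations

def mean(values):
    return sum(values) / len(values)

def exact_permutation_test(group_a, group_b):
    """Two-sided p-value for the difference in means between two groups.

    Enumerates every way of splitting the pooled values into groups of the
    original sizes, so it is only practical for small groups.
    """
    pooled = list(group_a) + list(group_b)
    observed = abs(mean(group_a) - mean(group_b))
    n_a = len(group_a)
    extreme = 0
    total = 0
    for idx in combinations(range(len(pooled)), n_a):
        chosen = set(idx)
        a = [pooled[i] for i in idx]
        b = [pooled[i] for i in range(len(pooled)) if i not in chosen]
        total += 1
        # Small tolerance guards against floating-point rounding.
        if abs(mean(a) - mean(b)) >= observed - 1e-12:
            extreme += 1
    return extreme / total

# Made-up rates for two hypothetical gun law groupings.
strict = [1.0, 2.0, 3.0, 4.0, 5.0]
lenient = [11.0, 12.0, 13.0, 14.0, 15.0]
p_value = exact_permutation_test(strict, lenient)
```

A small p-value suggests the gap between the two group means would be unlikely if the group labels were assigned at random. For groups much larger than this, the number of splits grows too quickly to enumerate, and sampling random shuffles would be the usual workaround.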

On the topic of gun laws, this week I’ve been looking into an interesting question: Does a lower proportion of crimes where the victims were armed with guns mean that the people who would’ve committed a crime didn’t, or are they just committing the crime armed with something different? This was actually a question my mother asked when I talked about the data with her last weekend. I’ve been trying to answer it using the average frequency of police shootings for each gun law grouping.

FEBRUARY 21ST: STATE GROUPINGS

This week has been very slow because we had no Tuesday classes. The one major change I’ve made is that rather than grouping by coast, I’ve decided to use predetermined regions in the US like the Northeast and Midwest. I’ve just been doing general statistics like mean, standard deviation, and so on for each of the different regions, but I plan on doing something more specific. I’ll decide what that more specific thing is this weekend.
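
For reference, the per-region statistics can be sketched roughly like this in plain Python. The state-to-region lookup is a partial, hypothetical one and the counts are made up; the real numbers come from the Police Shootings data.

```python
from collections import defaultdict
from statistics import mean, stdev

# Hypothetical, partial state-to-region lookup for illustration only.
REGION_OF = {"MA": "Northeast", "NY": "Northeast", "OH": "Midwest", "IL": "Midwest"}

def region_stats(counts_by_state, region_of=REGION_OF):
    """Return {region: {"mean": ..., "std": ...}} from per-state counts."""
    by_region = defaultdict(list)
    for state, count in counts_by_state.items():
        by_region[region_of[state]].append(count)
    return {
        region: {
            "mean": mean(values),
            # Sample standard deviation needs at least two values.
            "std": stdev(values) if len(values) > 1 else 0.0,
        }
        for region, values in by_region.items()
    }

counts = {"MA": 10, "NY": 30, "OH": 20, "IL": 40}  # made-up counts
stats = region_stats(counts)
```

Here `stdev` is the sample standard deviation; `statistics.pstdev` would be the population version if the regions are treated as complete rather than as samples.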

FEBRUARY 14TH: COAST TO COAST

After I finished grouping the states by gun regulations, I compared the rates of shootings where the victim was armed with a gun between the three groups. I’ll hold off on stating the outcome until I write my report.

My new goal focuses on the Eastern United States and the Western United States. When I was looking at the map of the US I created, it seemed like the majority of police shooting victims on the East Coast were black while the majority on the West Coast were Hispanic, and I’d like to confirm or deny this. I also noticed the West Coast shootings were more tightly grouped around major cities, which I might also look into.