SF Crime PSET 1
- Introduction
- Getting the Data
- Descriptive Statistics
- Geographic Information Systems
- Discussion Questions
# imports
import requests
from datascience import *
import matplotlib.pyplot as plt
import datetime
import folium
import time
import json
import os
from branca.colormap import linear
import branca.colormap
import pandas as pd
%matplotlib inline
1. Introduction
For this lab, we will be working with the San Francisco Police Department’s Incident Database. The dataset contains up-to-date information on incidents reported to the SFPD. Each observation is tagged with information about the incident’s location, type of infraction, and date/time. In this lab you will:
- Download the data through an Application Programming Interface (API)
- Explore the data with summary and descriptive statistics
- Map the incidents
Make sure to start early and ask lots of questions! The dataset, along with other publicly available data, is available at: https://data.sfgov.org/Public-Safety/Police-Department-Incidents/tmnf-yvry
2. Getting the Data
Write code that pulls the data into your environment with an API call. Here is the link to the API: https://data.sfgov.org/resource/PdId.json
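As a starting point, here is a minimal sketch of a paged API request. It assumes the Socrata resource endpoint built from the dataset identifier `tmnf-yvry` in the dataset link above, and the standard SoQL `$limit` and `$offset` paging parameters. The helper names `build_query` and `fetch_page` are hypothetical, and the endpoint may no longer be live. It uses only the standard library; `requests.get` would work equally well.

```python
import json
import urllib.parse
import urllib.request

# Assumption: resource endpoint derived from the dataset id "tmnf-yvry".
BASE_URL = "https://data.sfgov.org/resource/tmnf-yvry.json"


def build_query(limit=1000, offset=0):
    """Return the full request URL for one page of records."""
    params = {"$limit": limit, "$offset": offset}
    return BASE_URL + "?" + urllib.parse.urlencode(params)


def fetch_page(limit=1000, offset=0, timeout=30):
    """Download one page and return it as a list of dicts (needs network access)."""
    with urllib.request.urlopen(build_query(limit, offset), timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


# Building the URL has no side effects, so it is safe to inspect before fetching.
print(build_query(limit=5))
```

Calling `fetch_page` repeatedly with an increasing `offset` walks through the dataset one page at a time, which keeps each request small.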
sf_police = os.path.join('Police_Department_Sample_Incidents.csv')
df = pd.read_csv(sf_police)
df
| incidntnum | category | descript | dayofweek | date | time | pddistrict | resolution | address | x | y | location | pdid |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | 30325834 | OTHER OFFENSES | POSSESSION OF BURGLARY TOOLS | Wednesday | 03/19/2003 | 01:06:00 | MISSION | ARREST, BOOKED | 3100 Block of CESAR CHAVEZ ST | -122.412594 | 37.748166 | (37.7481660056223, -122.412594354828) | 3.032583e+12 |
1 | 150119979 | OTHER OFFENSES | TRAFFIC VIOLATION | Sunday | 02/08/2015 | 03:26:00 | INGLESIDE | ARREST, BOOKED | MISSION ST / COLLEGE AV | -122.424702 | 37.735370 | (37.7353699090873, -122.424702079332) | 1.501200e+13 |
2 | 130507219 | OTHER OFFENSES | RESISTING ARREST | Thursday | 06/20/2013 | 15:22:00 | CENTRAL | ARREST, CITED | 200 Block of SACRAMENTO ST | -122.398405 | 37.794412 | (37.7944120086129, -122.39840549202) | 1.305072e+13 |
3 | 40459865 | OTHER OFFENSES | LOST/STOLEN LICENSE PLATE | Wednesday | 04/21/2004 | 11:00:00 | INGLESIDE | NONE | 600 Block of 28TH ST | -122.436797 | 37.744751 | (37.7447510901057, -122.436796627403) | 4.045987e+12 |
4 | 70067529 | FORGERY/COUNTERFEITING | CHECKS, MAKE OR PASS FICTITIOUS | Friday | 01/05/2007 | 12:00:00 | INGLESIDE | NONE | 1600 Block of BURROWS ST | -122.419412 | 37.724888 | (37.7248875198607, -122.41941248624) | 7.006753e+12 |
5 | 106103162 | LARCENY/THEFT | PETTY THEFT OF PROPERTY | Thursday | 08/19/2010 | 14:05:00 | BAYVIEW | NONE | 1300 Block of REVERE AV | -122.385545 | 37.728980 | (37.728979731984, -122.385545453301) | 1.061032e+13 |
6 | 120276214 | ASSAULT | BATTERY WITH SERIOUS INJURIES | Saturday | 04/07/2012 | 03:05:00 | PARK | ARREST, BOOKED | 500 Block of CENTRAL AV | -122.444569 | 37.774618 | (37.7746181158166, -122.444568876114) | 1.202762e+13 |
7 | 170096642 | LARCENY/THEFT | PETTY THEFT SHOPLIFTING | Friday | 02/03/2017 | 15:00:00 | MISSION | NONE | 400 Block of CASTRO ST | -122.435150 | 37.761760 | (37.761759724359806, -122.43515009981229) | 1.700966e+13 |
8 | 40822096 | OTHER OFFENSES | DRIVERS LICENSE, SUSPENDED OR REVOKED | Monday | 07/19/2004 | 13:00:00 | TENDERLOIN | ARREST, CITED | EDDY ST / HYDE ST | -122.415885 | 37.783516 | (37.7835160563924, -122.415885065795) | 4.082210e+12 |
9 | 100487776 | LARCENY/THEFT | PETTY THEFT FROM A BUILDING | Friday | 05/07/2010 | 20:30:00 | SOUTHERN | NONE | 400 Block of MINNA ST | -122.407387 | 37.781069 | (37.7810688918781, -122.407387172098) | 1.004878e+13 |
10 | 160459555 | ROBBERY | ROBBERY OF A SERVICE STATION WITH A GUN | Sunday | 06/05/2016 | 18:35:00 | TARAVAL | NONE | 1100 Block of JUNIPERO SERRA BL | -122.472389 | 37.717501 | (37.7175010456614, -122.4723890410333) | 1.604596e+13 |
11 | 120107590 | LARCENY/THEFT | GRAND THEFT BICYCLE | Friday | 02/03/2012 | 19:00:00 | RICHMOND | NONE | 600 Block of 25TH AV | -122.484596 | 37.777213 | (37.7772129804302, -122.484595739169) | 1.201076e+13 |
12 | 100847910 | VANDALISM | MALICIOUS MISCHIEF, VANDALISM | Monday | 09/13/2010 | 00:01:00 | MISSION | NONE | 100 Block of GRANDVIEW AV | -122.440412 | 37.755446 | (37.7554456064661, -122.440411847156) | 1.008479e+13 |
13 | 30802688 | LARCENY/THEFT | PETTY THEFT FROM LOCKED AUTO | Saturday | 07/05/2003 | 16:00:00 | NORTHERN | NONE | FRANKLIN ST / ELLIS ST | -122.422654 | 37.783620 | (37.7836201224122, -122.422654162991) | 3.080269e+12 |
14 | 60098528 | NON-CRIMINAL | TARASOFF REPORT | Thursday | 01/26/2006 | 09:15:00 | TENDERLOIN | NONE | 300 Block of EDDY ST | -122.413791 | 37.783837 | (37.7838365565348, -122.413790972781) | 6.009853e+12 |
15 | 130149833 | MISSING PERSON | MISSING JUVENILE | Wednesday | 02/20/2013 | 09:00:00 | PARK | LOCATED | 100 Block of BELVEDERE ST | -122.449329 | 37.767774 | (37.7677738874748, -122.449328648219) | 1.301498e+13 |
16 | 41322974 | LARCENY/THEFT | GRAND THEFT FROM A BUILDING | Thursday | 11/18/2004 | 17:00:00 | CENTRAL | NONE | 900 Block of MASON ST | -122.410846 | 37.792316 | (37.7923158747647, -122.410845624227) | 4.132297e+12 |
17 | 100174597 | LARCENY/THEFT | PETTY THEFT SHOPLIFTING | Sunday | 02/21/2010 | 17:20:00 | SOUTHERN | ARREST, CITED | 800 Block of MARKET ST | -122.407634 | 37.784189 | (37.7841893501425, -122.407633520742) | 1.001746e+13 |
18 | 130034347 | OTHER OFFENSES | CONSPIRACY | Sunday | 01/13/2013 | 01:03:00 | TENDERLOIN | ARREST, BOOKED | 100 Block of ELLIS ST | -122.408271 | 37.785494 | (37.7854941424186, -122.408270724034) | 1.300343e+13 |
19 | 150034220 | VEHICLE THEFT | STOLEN AUTOMOBILE | Sunday | 01/11/2015 | 23:00:00 | INGLESIDE | NONE | 1200 Block of ALEMANY BL | -122.432269 | 37.730139 | (37.7301394309025, -122.432268894425) | 1.500342e+13 |
20 | 101033821 | DRUG/NARCOTIC | POSSESSION OF BASE/ROCK COCAINE FOR SALE | Saturday | 11/06/2010 | 22:00:00 | TENDERLOIN | ARREST, BOOKED | OFARRELL ST / JONES ST | -122.412971 | 37.785788 | (37.7857883766888, -122.412970537591) | 1.010338e+13 |
21 | 130473234 | SUSPICIOUS OCC | SUSPICIOUS OCCURRENCE | Saturday | 06/08/2013 | 18:00:00 | RICHMOND | UNFOUNDED | 300 Block of 7TH AV | -122.465490 | 37.781907 | (37.7819066124774, -122.465490381382) | 1.304732e+13 |
22 | 96024454 | LARCENY/THEFT | GRAND THEFT FROM LOCKED AUTO | Monday | 04/06/2009 | 14:30:00 | INGLESIDE | NONE | 1500 Block of TREAT AV | -122.412506 | 37.745680 | (37.7456802881475, -122.41250614556) | 9.602445e+12 |
23 | 70430788 | LARCENY/THEFT | GRAND THEFT FROM PERSON | Thursday | 04/26/2007 | 13:15:00 | NORTHERN | NONE | 600 Block of EDDY ST | -122.418382 | 37.783258 | (37.7832583770949, -122.418382008607) | 7.043079e+12 |
24 | 171022519 | LARCENY/THEFT | LOST PROPERTY, GRAND THEFT | Monday | 12/18/2017 | 18:45:00 | CENTRAL | NONE | CHESTNUT ST / POWELL ST | -122.411587 | 37.803962 | (37.80396234588253, -122.41158662771912) | 1.710225e+13 |
25 | 160131305 | LARCENY/THEFT | PETTY THEFT FROM LOCKED AUTO | Saturday | 02/13/2016 | 16:00:00 | SOUTHERN | NONE | HARRIET ST / FOLSOM ST | -122.406151 | 37.778081 | (37.77808112789121, -122.40615148216322) | 1.601313e+13 |
26 | 100587291 | SUSPICIOUS OCC | SUSPICIOUS OCCURRENCE | Saturday | 06/26/2010 | 03:55:00 | MISSION | NONE | 3000 Block of 16TH ST | -122.421083 | 37.764911 | (37.764910844226, -122.421082850193) | 1.005873e+13 |
27 | 50103165 | LARCENY/THEFT | GRAND THEFT FROM LOCKED AUTO | Wednesday | 01/26/2005 | 07:45:00 | CENTRAL | NONE | 300 Block of CLAY ST | -122.398573 | 37.795292 | (37.7952923975164, -122.398573051955) | 5.010317e+12 |
28 | 80437166 | OTHER OFFENSES | DRIVERS LICENSE, SUSPENDED OR REVOKED | Saturday | 04/26/2008 | 17:16:00 | BAYVIEW | ARREST, CITED | CESAR CHAVEZ ST / KANSAS ST | -122.402063 | 37.749398 | (37.7493979879214, -122.402063377785) | 8.043717e+12 |
29 | 110218113 | OTHER OFFENSES | MISCELLANEOUS INVESTIGATION | Wednesday | 03/16/2011 | 10:30:00 | PARK | NONE | 700 Block of CLAYTON ST | -122.448266 | 37.767909 | (37.7679091107584, -122.448266122025) | 1.102181e+13 |
... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
9970 | 90773615 | VEHICLE THEFT | STOLEN AUTOMOBILE | Wednesday | 07/29/2009 | 07:00:00 | SOUTHERN | NONE | 6TH ST / STEVENSON ST | -122.409696 | 37.781754 | (37.7817540588712, -122.409696355558) | 9.077362e+12 |
9971 | 171009573 | SUSPICIOUS OCC | SUSPICIOUS OCCURRENCE | Thursday | 12/14/2017 | 09:03:00 | MISSION | NONE | DUBOCE AV / WOODWARD ST | -122.420948 | 37.769979 | (37.76997927943881, -122.42094803971935) | 1.710096e+13 |
9972 | 30696867 | LARCENY/THEFT | GRAND THEFT FROM LOCKED AUTO | Sunday | 06/08/2003 | 02:30:00 | BAYVIEW | NONE | 400 Block of BARNEVELD AV | -122.403360 | 37.741657 | (37.7416571720538, -122.403359514965) | 3.069687e+12 |
9973 | 66005553 | LARCENY/THEFT | GRAND THEFT FROM LOCKED AUTO | Wednesday | 01/25/2006 | 22:00:00 | SOUTHERN | NONE | 800 Block of FOLSOM ST | -122.402246 | 37.781089 | (37.7810888153345, -122.402246029097) | 6.600555e+12 |
9974 | 41309477 | WARRANTS | WARRANT ARREST | Tuesday | 11/16/2004 | 13:20:00 | MISSION | ARREST, BOOKED | 16TH ST / HOFF ST | -122.420580 | 37.764997 | (37.7649968622982, -122.420580440119) | 4.130948e+12 |
9975 | 130576957 | SUSPICIOUS OCC | SUSPICIOUS OCCURRENCE | Thursday | 07/11/2013 | 23:00:00 | SOUTHERN | NONE | 9TH ST / FOLSOM ST | -122.411612 | 37.773768 | (37.7737679567236, -122.411612378034) | 1.305770e+13 |
9976 | 61239785 | MISSING PERSON | MISSING ADULT | Monday | 11/20/2006 | 05:30:00 | BAYVIEW | LOCATED | 1400 Block of PHELPS ST | -122.394439 | 37.736444 | (37.7364438996732, -122.394438859914) | 6.123979e+12 |
9977 | 130969346 | ASSAULT | AGGRAVATED ASSAULT OF POLICE OFFICER,BODILY FORCE | Friday | 11/15/2013 | 15:41:00 | SOUTHERN | ARREST, BOOKED | 800 Block of BRYANT ST | -122.403405 | 37.775421 | (37.775420706711, -122.403404791479) | 1.309693e+13 |
9978 | 130270018 | LARCENY/THEFT | PETTY THEFT FROM A BUILDING | Thursday | 03/21/2013 | 01:02:00 | PARK | NONE | 4100 Block of 17TH ST | -122.437906 | 37.762376 | (37.7623763079838, -122.437905731311) | 1.302700e+13 |
9979 | 160168693 | VANDALISM | MALICIOUS MISCHIEF, VANDALISM | Friday | 02/26/2016 | 15:25:00 | TARAVAL | ARREST, BOOKED | 2300 Block of NORIEGA ST | -122.488808 | 37.753752 | (37.75375194458355, -122.48880789335784) | 1.601687e+13 |
9980 | 40988529 | LARCENY/THEFT | PETTY THEFT WITH PRIOR | Sunday | 08/29/2004 | 17:00:00 | TENDERLOIN | ARREST, BOOKED | 100 Block of OFARRELL ST | -122.407244 | 37.786565 | (37.7865647607685, -122.407244087032) | 4.098853e+12 |
9981 | 70576586 | NON-CRIMINAL | LOST PROPERTY | Wednesday | 03/07/2007 | 08:40:00 | SOUTHERN | NONE | 800 Block of BRYANT ST | -122.403405 | 37.775421 | (37.775420706711, -122.403404791479) | 7.057659e+12 |
9982 | 136168471 | LARCENY/THEFT | GRAND THEFT FROM LOCKED AUTO | Saturday | 09/14/2013 | 13:00:00 | SOUTHERN | NONE | 200 Block of INTERSTATE80 HY | -122.365565 | 37.809671 | (37.8096707013239, -122.365565425353) | 1.361685e+13 |
9983 | 110861637 | RECOVERED VEHICLE | RECOVERED VEHICLE - STOLEN OUTSIDE SF | Tuesday | 10/25/2011 | 14:43:00 | BAYVIEW | NONE | 1300 Block of INGALLS ST | -122.382589 | 37.730465 | (37.7304652920239, -122.382588718232) | 1.108616e+13 |
9984 | 40345587 | VEHICLE THEFT | STOLEN AUTOMOBILE | Wednesday | 03/24/2004 | 21:00:00 | MISSION | NONE | 24TH ST / POTRERO AV | -122.406338 | 37.753004 | (37.7530042877269, -122.406338412693) | 4.034559e+12 |
9985 | 140819054 | OTHER OFFENSES | DRIVERS LICENSE, SUSPENDED OR REVOKED | Sunday | 09/28/2014 | 18:14:00 | RICHMOND | ARREST, CITED | FULTON ST / 30TH AV | -122.489539 | 37.772325 | (37.772324696627, -122.489538601712) | 1.408191e+13 |
9986 | 120709920 | BURGLARY | BURGLARY, HOT PROWL, FORCIBLE ENTRY | Wednesday | 09/05/2012 | 10:06:00 | TARAVAL | NONE | 100 Block of ELVERANO WY | -122.461329 | 37.730631 | (37.7306307008778, -122.461329068423) | 1.207099e+13 |
9987 | 90239322 | OTHER OFFENSES | HARASSING PHONE CALLS | Wednesday | 03/04/2009 | 17:00:00 | CENTRAL | NONE | 500 Block of GREEN ST | -122.407932 | 37.799577 | (37.7995771094892, -122.407931930502) | 9.023932e+12 |
9988 | 91140174 | ASSAULT | BATTERY | Thursday | 11/05/2009 | 00:20:00 | BAYVIEW | NONE | 17TH ST / TEXAS ST | -122.395810 | 37.765189 | (37.7651891001345, -122.395810281469) | 9.114017e+12 |
9989 | 80279742 | MISSING PERSON | MISSING ADULT | Friday | 03/07/2008 | 23:30:00 | TENDERLOIN | LOCATED | 300 Block of MASON ST | -122.409661 | 37.786439 | (37.7864394524764, -122.409660751795) | 8.027974e+12 |
9990 | 170927865 | ASSAULT | BATTERY, FORMER SPOUSE OR DATING RELATIONSHIP | Tuesday | 11/14/2017 | 11:21:00 | SOUTHERN | ARREST, BOOKED | 800 Block of BRYANT ST | -122.403405 | 37.775421 | (37.775420706711, -122.40340479147905) | 1.709279e+13 |
9991 | 160689633 | OTHER OFFENSES | VIOLATION OF RESTRAINING ORDER | Thursday | 08/25/2016 | 09:00:00 | MISSION | ARREST, BOOKED | 400 Block of ALVARADO ST | -122.431019 | 37.753944 | (37.753943582970585, -122.43101882896221) | 1.606896e+13 |
9992 | 80917287 | LARCENY/THEFT | GRAND THEFT FROM LOCKED AUTO | Thursday | 08/28/2008 | 20:00:00 | NORTHERN | NONE | GOLDEN GATE AV / FILLMORE ST | -122.431952 | 37.779565 | (37.7795647498663, -122.431952436364) | 8.091729e+12 |
9993 | 40813841 | ASSAULT | AGGRAVATED ASSAULT WITH A DEADLY WEAPON | Saturday | 07/17/2004 | 01:57:00 | MISSION | ARREST, BOOKED | 16TH ST / HOFF ST | -122.420580 | 37.764997 | (37.7649968622982, -122.420580440119) | 4.081384e+12 |
9994 | 146231278 | LARCENY/THEFT | GRAND THEFT FROM LOCKED AUTO | Wednesday | 11/05/2014 | 09:45:00 | CENTRAL | NONE | NORTHPOINT ST / LARKIN ST | -122.422009 | 37.805496 | (37.8054960276478, -122.422008950077) | 1.462313e+13 |
9995 | 140224390 | OTHER OFFENSES | DRIVERS LICENSE, SUSPENDED OR REVOKED | Sunday | 03/16/2014 | 18:30:00 | INGLESIDE | ARREST, CITED | PERSIA AV / MADRID ST | -122.432805 | 37.721618 | (37.7216179113708, -122.432804728335) | 1.402244e+13 |
9996 | 140531101 | VEHICLE THEFT | STOLEN AUTOMOBILE | Thursday | 06/26/2014 | 18:00:00 | SOUTHERN | NONE | 0 Block of GOUGH ST | -122.421206 | 37.772531 | (37.7725305162673, -122.421205551035) | 1.405311e+13 |
9997 | 70416857 | BURGLARY | BURGLARY OF RESIDENCE, UNLAWFUL ENTRY | Monday | 04/23/2007 | 09:15:00 | TARAVAL | ARREST, BOOKED | 1700 Block of 44TH AV | -122.503257 | 37.753978 | (37.7539775965639, -122.503257080708) | 7.041686e+12 |
9998 | 130481259 | DRUG/NARCOTIC | SALE OF MARIJUANA | Tuesday | 06/11/2013 | 16:07:00 | SOUTHERN | ARREST, BOOKED | 900 Block of MARKET ST | -122.408595 | 37.783707 | (37.7837069301545, -122.408595110869) | 1.304813e+13 |
9999 | 146101245 | LARCENY/THEFT | GRAND THEFT FROM UNLOCKED AUTO | Sunday | 05/18/2014 | 00:45:00 | SOUTHERN | NONE | HARRISON ST / 5TH ST | -122.401846 | 37.779032 | (37.7790324136251, -122.401846367522) | 1.461012e+13 |
10000 rows × 13 columns
#ts = Table(data.labels)
import pandas as pd
from sodapy import Socrata
# Unauthenticated client only works with public data sets. Note 'None'
# in place of application token, and no username or password:
client = Socrata("data.sfgov.org", None)
# Example authenticated client (needed for non-public datasets):
# client = Socrata("data.sfgov.org",
#                  MyAppToken,
#                  username="user@example.com",
#                  password="AFakePassword")
# First 2000 results, returned as JSON from API / converted to Python list of
# dictionaries by sodapy.
results = client.get("tmnf-yvry", limit=2000)
# Convert to pandas DataFrame
results_df = pd.DataFrame.from_records(results)
---------------------------------------------------------------------------
ModuleNotFoundError Traceback (most recent call last)
<ipython-input-17-c5d4879f0181> in <module>()
1 #ts = Table(data.labels)
2 import pandas as pd
----> 3 from sodapy import Socrata
4
5 # Unauthenticated client only works with public data sets. Note 'None'
ModuleNotFoundError: No module named 'sodapy'
import pandas as pd
df = pd.DataFrame(recs)
df.head()
0 | |
---|---|
0 | code |
1 | error |
2 | message |
3 | data |
4 | code |
data = Table.from_df(df.drop('location', axis=1))
data
---------------------------------------------------------------------------
NameError Traceback (most recent call last)
<ipython-input-10-cf486b52f37a> in <module>()
1 # data = Table.from_df(df.drop('location', axis=1))
----> 2 data
NameError: name 'data' is not defined
min(df['date'])
# making a table out of our json
# DO NOT USE
#data = Table.from_records(json_response)
#data.show(3)
data['y'] = data['y'].astype('float')
data['x'] = data['x'].astype('float')
---------------------------------------------------------------------------
NameError Traceback (most recent call last)
<ipython-input-9-8d418cf325f3> in <module>()
----> 1 data['y'] = data['y'].astype('float')
2 data['x'] = data['x'].astype('float')
NameError: name 'data' is not defined
QUESTION: What are the advantages to downloading data this way, instead of with a point-and-click action?
3. Descriptive Statistics
Plot the number of incidents per year from 2000-2017 (choose the appropriate type of plot). Have crime rates increased or decreased in general?
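The counting step behind this plot can be sketched in plain Python, independent of pandas or the datascience package. The dates below are a handful of values in the dataset's MM/DD/YYYY format, used only for illustration, and `incidents_per_year` is a hypothetical helper name.

```python
from collections import Counter
from datetime import datetime

# A few dates in the dataset's MM/DD/YYYY format, for illustration only.
sample_dates = ["03/19/2003", "02/08/2015", "06/20/2013", "01/13/2013", "01/11/2015"]


def incidents_per_year(dates):
    """Count incidents per calendar year, returned as a year-sorted dict."""
    years = (datetime.strptime(d, "%m/%d/%Y").year for d in dates)
    return dict(sorted(Counter(years).items()))


print(incidents_per_year(sample_dates))
```

Because the result is ordered by year, its keys and values can be passed directly to a line or bar plot.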
# creating a year column by parsing the 'date' column (formatted MM/DD/YYYY)
data['year'] = pd.DatetimeIndex(df['date']).year
data
agg_on_year = data.group('year')
agg_on_year.show()
agg_on_year.plot('year', 'count')
Looking just at 2017, what proportion of the total does each type of crime constitute? Use at least one table and at least one plot to support your answer.
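A proportion here is simply each category's count divided by the total number of incidents in the year. The sketch below shows that arithmetic on a toy list of categories; the data and the helper name `category_proportions` are illustrative assumptions, not part of the lab's dataset.

```python
from collections import Counter

# Toy category labels standing in for one year of incidents.
toy_categories = ["LARCENY/THEFT", "LARCENY/THEFT", "ASSAULT", "VANDALISM"]


def category_proportions(categories):
    """Map each category to its share of all incidents, largest share first."""
    counts = Counter(categories)
    total = sum(counts.values())
    return {cat: n / total for cat, n in counts.most_common()}


proportions = category_proportions(toy_categories)
for cat, share in proportions.items():
    print(f"{cat}: {share:.2f}")
```

The shares always sum to 1, which is a quick sanity check to run on your own table before plotting.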
agg_by_crime = data.where('year', 2017).group('category')
agg_by_crime['proportion'] = [count / sum(agg_by_crime.column('count')) for count in agg_by_crime.column('count')]
agg_by_crime.sort('proportion', descending=True)
agg_by_crime.sort('count', descending=True).barh('category', 'proportion')
Is there a relationship between day of week, time, and whether an incident occurs? Bonus: Is there a relationship between day/time and particular types of incidents?
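One way to look at day and time together is to count incidents for each (day, hour) pair rather than for each variable separately. The following is a minimal sketch on toy records in the dataset's `dayofweek` and `time` formats; `day_hour_counts` is a hypothetical helper name.

```python
from collections import Counter

# Toy (dayofweek, time) pairs in the dataset's formats, for illustration only.
toy_records = [
    ("Sunday", "03:26:00"),
    ("Sunday", "01:03:00"),
    ("Saturday", "03:05:00"),
    ("Saturday", "03:55:00"),
    ("Friday", "15:00:00"),
]


def day_hour_counts(records):
    """Count incidents per (day of week, hour of day) pair."""
    return Counter((day, int(t[:2])) for day, t in records)


counts = day_hour_counts(toy_records)
for (day, hour), n in sorted(counts.items()):
    print(day, hour, n)
```

A table of these joint counts can then be drawn as a heatmap or as one hourly bar chart per day.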
data.group('dayofweek').barh('dayofweek')
# making an hour column that can be grouped on
data['hour'] = [int(t[:2]) for t in data['time']]
data.group('hour').bar('hour')
Bonus: Are there any other interesting relationships in the data?
4. Geographic Information Systems (GIS)
Plot individual incidents in 2017 as points on a map of San Francisco. Does crime seem randomly distributed in space, or do incidents tend to cluster close together? Propose an explanation for your conclusion. Bonus: Shade the points by type of crime.
Hint: Use the basemap extension to the matplotlib package!
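# Table.where takes the column label and the value to match; 'year' == 2017 would just evaluate to False
twentyeighteen = data.where('year', 2017).sample(1000)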
twentyeighteen['y'] = twentyeighteen['y'].astype('float')
twentyeighteen['x'] = twentyeighteen['x'].astype('float')
mp = folium.Map(location=[37.7749, -122.4194])
for coords in zip(twentyeighteen['y'], twentyeighteen['x']):
    # folium expects (latitude, longitude), i.e. (y, x)
    folium.Marker(
        location=coords
    ).add_to(mp)
mp
from folium.plugins import HeatMap
mp = folium.Map(location=[37.7749, -122.4194])
HeatMap(list(zip(twentyeighteen['y'], twentyeighteen['x']))).add_to(mp)
mp
Merge the incidents data with either a Shapefile or GeoJSON file with information on the boundaries of neighborhoods in San Francisco.
The neighborhood data is available here: https://data.sfgov.org/Geographic-Locations-and-Boundaries/Analysis-Neighborhoods/p5b7-5n3h
The API endpoint: https://data.sfgov.org/resource/xfcw-9evu.json
*.geojson
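Merging points with neighborhood boundaries comes down to asking, for each incident, which polygon contains its coordinates. To build intuition for that step, here is a simplified ray-casting sketch in plain Python. The rectangular "neighborhoods" are made up, the function names are hypothetical, and the sketch ignores points exactly on a boundary and polygons with holes; a spatial library such as geopandas handles those cases and is the practical choice for the real data.

```python
def point_in_polygon(x, y, polygon):
    """Ray-casting test: True if (x, y) lies inside polygon, a list of (x, y) vertices."""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Only edges that straddle the horizontal line through y can be crossed.
        if (y1 > y) != (y2 > y):
            # x coordinate where the edge crosses that horizontal line
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def assign_neighborhood(x, y, neighborhoods):
    """Return the name of the first polygon containing the point, else None."""
    for name, polygon in neighborhoods.items():
        if point_in_polygon(x, y, polygon):
            return name
    return None


# Made-up rectangles in (longitude, latitude) order, for illustration only.
toy_neighborhoods = {
    "west_box": [(-122.52, 37.70), (-122.45, 37.70), (-122.45, 37.82), (-122.52, 37.82)],
    "east_box": [(-122.45, 37.70), (-122.35, 37.70), (-122.35, 37.82), (-122.45, 37.82)],
}

print(assign_neighborhood(-122.412594, 37.748166, toy_neighborhoods))
```

Counting how many incidents land in each polygon then gives the per-neighborhood totals needed for a choropleth.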
#import requests
#r = requests.get(url='https://data.sfgov.org/resource/xfcw-9evu.json')
sf_neighborhoods = os.path.join('SF Find Neighborhoods.geojson')
geo_json_data = json.load(open(sf_neighborhoods))
m = folium.Map([37.7749, -122.4194], zoom_start = 12)
m
# might be too big bc won't display
m = folium.Map(
    location=[37.7749, -122.4194], zoom_start=12
)
folium.GeoJson(geo_json_data).add_to(m)
m
Construct a choropleth map, coloring in each neighborhood by how many incidents it had in 2017. Bonus: Construct several maps that explore differences by day of week, time of year, time of day, etc.
twentyeighteen = twentyeighteen.to_df()
import geopandas as gpd
import shapely
shapely.speedups.enable()
twentyeighteen_spatial_points = gpd.GeoDataFrame(
    twentyeighteen.drop(['x', 'y'], axis=1),
    crs={'init': 'epsg:4326'},
    geometry=twentyeighteen.apply(lambda row: shapely.geometry.Point((row.x, row.y)), axis=1))
sf_polygons = gpd.GeoDataFrame.from_features(geo_json_data['features'])
sf_polygons.crs = {'init' :'epsg:4326'}
sf_spatial = gpd.sjoin(sf_polygons, twentyeighteen_spatial_points, how="inner", op="intersects")
crime_neighborhood = pd.DataFrame(sf_spatial).reset_index()
crime_neighborhood.head(5)
crime_neighborhood_agg = crime_neighborhood.groupby('name').size().reset_index()
crime_neighborhood_agg.head(5)
crime_neighborhood_agg.columns = ['neighborhood', 'crimes']
crime_neighborhood_agg.head(5)
Do you notice any patterns? Are there particular neighborhoods where crime concentrates more heavily?
m = folium.Map(
location=[37.7749, -122.4194], zoom_start = 12
)
m.choropleth(
    geo_data=geo_json_data,
    data=crime_neighborhood_agg,
    columns=['neighborhood', 'crimes'],
    key_on='feature.properties.name',
    fill_color='OrRd',
    threshold_scale=[10, 60, 100, 140],
    highlight=True
)
m
5. Discussion Questions
Based on the evidence from this lab assignment, why do you think “hot spots” policing became more popular in the last few decades? What are the pros and cons to this kind of approach?
What other sorts of data would help improve your analysis?
def append_and_follow(t, link, n):
    # Follow up to n OData "next" links, appending each page of records to table t
    if n == 0:
        print('Next link (if you want to continue)', link)
        return t
    time.sleep(3)  # pause between requests
    print(n)
    r = requests.get(link)
    js = r.json()
    return append_and_follow(t.append(Table.from_records(js['value'])), js['@odata.nextLink'], n-1)
starter = Table(('__id', 'address', 'category', 'date', 'dayofweek', 'descript', 'incidntnum',
'location', 'pddistrict', 'pdid', 'resolution', 'time', 'x', 'y'))
starting_url = 'https://data.sfgov.org/api/odata/v4/tmnf-yvry'
c = append_and_follow(starter, starting_url, 6)
c