finished assignment #2

Open · wants to merge 2 commits into master
72 changes: 63 additions & 9 deletions point_pattern.py
@@ -33,7 +33,8 @@ def read_geojson(input_file):
     """
     # Please use the python json module (imported above)
     # to solve this one.
-    gj = None
+    with open(input_file, 'r') as f:
+        gj = json.load(f)
     return gj


@@ -56,9 +57,15 @@ def find_largest_city(gj):
     population : int
         The population of the largest city
     """
-    city = None
+    temp = gj['features']
+    city = ""
+    max_population = 0
+
+    for i in temp:
+        if (i['properties']['pop_max'] > max_population):
+            max_population = i['properties']['pop_max']
+            city = i['properties']['name']
+
     return city, max_population
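An aside, not part of this PR: the same lookup can be written in one pass with Python's built-in max() and a key function. The function name below is hypothetical, and `gj` is assumed to be a GeoJSON-style dict as returned by read_geojson().

```python
def find_largest_city_alt(gj):
    # max() scans the features once, comparing each on its pop_max property
    biggest = max(gj['features'], key=lambda f: f['properties']['pop_max'])
    return biggest['properties']['name'], biggest['properties']['pop_max']

# Tiny hand-built GeoJSON-like dict, purely for illustration
sample = {'features': [
    {'properties': {'name': 'A', 'pop_max': 10}},
    {'properties': {'name': 'B', 'pop_max': 99}},
]}
```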
Contributor: I imagine that something along these lines, but more robust, will be part of your Twitter work?

Author: Kind of. So far I am pulling all the tweets from the Twitter API, parsing the JSON for a handful of attributes, and then writing them to a .csv (one column per tweet). However, when I start pulling from the Twitter Stream, it might be beneficial to just save summary stats from the JSON instead of every individual tweet, since I am not interested in everything from the stream data.

Contributor: I pulled maybe 30 days of geolocated data off the Twitter stream. The approach I landed on was to use MongoDB (it has a good Python API) to store the data (the raw tweet). Then I analyzed the data in an IPython notebook using a combination of MongoDB database queries and raw Python. That might be a viable workflow that preserves the raw data?



@@ -74,7 +81,16 @@ def write_your_own(gj):
     Do not forget to write the accompanying test in
     tests.py!
     """
-    return
+    # Finds how many megacities there are in the GeoJSON
+    temp = gj['features']
+    megacities = 0
+
+    for i in temp:
+        if(i['properties']['megacity'] == 1):
+            megacities += 1
+
+    return megacities
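As an aside (not part of the PR), the counting loop above can be collapsed into a single sum() over a generator expression; the function name here is hypothetical:

```python
def count_megacities(gj):
    # sum() adds one for every feature flagged as a megacity
    return sum(1 for f in gj['features'] if f['properties']['megacity'] == 1)

# Minimal stand-in data, purely for illustration
sample = {'features': [{'properties': {'megacity': 1}},
                       {'properties': {'megacity': 0}},
                       {'properties': {'megacity': 1}}]}
```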

def mean_center(points):
"""
@@ -93,8 +109,16 @@ def mean_center(points):
     y : float
         Mean y coordinate
     """
-    x = None
-    y = None
-
+    x = 0
+    y = 0
+
+    for point in points:
+        x += point[0]
+        y += point[1]
+
+    x = x / len(points)
+    y = y / len(points)
+
     return x, y
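A sketch of an equivalent alternative, not part of this PR, using zip() to split the coordinate sequences (the name is hypothetical):

```python
def mean_center_alt(points):
    xs, ys = zip(*points)  # unzip [(x, y), ...] into xs and ys
    return sum(xs) / len(points), sum(ys) / len(points)
```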

@@ -119,7 +143,20 @@ def average_nearest_neighbor_distance(points):
        Measure of Spatial Relationships in Populations. Ecology. 35(4)
        p. 445-453.
     """
-    mean_d = 0
+
+    nearest = []
+
+    for i, point in enumerate(points):
+        nearest.append(None)
+        for point2 in points:
+            if point is not point2:

Contributor: This works for this dataset, but what if the points are coincident?

+                dist = euclidean_distance(point, point2)
+                if nearest[i] == None:
+                    nearest[i] = dist
+                elif nearest[i] > dist:
+                    nearest[i] = dist
+
+    mean_d = sum(nearest) / len(points)
+
     return mean_d
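On the reviewer's coincident-points question: comparing by index rather than by object identity skips only the point itself, so duplicated coordinates are still compared with each other (yielding a nearest distance of 0) and the same tuple object appearing twice in the list cannot be skipped by accident. A hedged sketch, with a local stand-in for the module's euclidean_distance helper:

```python
import math

def euclidean_distance(a, b):
    # local stand-in for the helper defined elsewhere in point_pattern.py
    return math.hypot(a[0] - b[0], a[1] - b[1])

def average_nearest_neighbor_distance_safe(points):
    nearest = []
    for i, p in enumerate(points):
        # skip by index, not identity, so coincident points are handled
        dists = [euclidean_distance(p, q) for j, q in enumerate(points) if j != i]
        nearest.append(min(dists))
    return sum(nearest) / len(nearest)
```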

@@ -139,17 +176,34 @@ def minimum_bounding_rectangle(points):
        Corners of the MBR in the form [xmin, ymin, xmax, ymax]
     """
 
+    first = True
+    mbr = [0,0,0,0]
+
+    for point in points:
+        if first:
+            first = False
+            mbr[0] = point[0]
+            mbr[1] = point[1]
+            mbr[2] = point[0]
+            mbr[3] = point[1]
+
+        if point[0] < mbr[0]:
+            mbr[0] = point[0]
+        if point[1] < mbr[1]:
+            mbr[1] = point[1]
+        if point[0] > mbr[2]:
+            mbr[2] = point[0]
+        if point[1] > mbr[3]:
+            mbr[3] = point[1]
+
     return mbr
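Not part of the PR, but the first-iteration flag can be avoided entirely by taking min()/max() over the unzipped coordinate sequences (the name below is hypothetical):

```python
def minimum_bounding_rectangle_alt(points):
    xs, ys = zip(*points)  # separate x and y coordinate sequences
    return [min(xs), min(ys), max(xs), max(ys)]
```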


 def mbr_area(mbr):
     """
     Compute the area of a minimum bounding rectangle
     """
-    area = 0
-
+    area = (mbr[1] - mbr[3]) * (mbr[0] - mbr[2])
     return area


@@ -173,7 +227,7 @@ def expected_distance(area, n):
        The number of points
     """
 
-    expected = 0
+    expected = 0.5 * (area / n) ** 0.5
     return expected
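For context: this is the Clark and Evans (1954) expected mean nearest neighbor distance under complete spatial randomness, so dividing the observed mean distance by it gives the usual nearest neighbor index. A small sketch with a hypothetical function name, not part of this PR:

```python
def nearest_neighbor_ratio(observed, area, n):
    # R < 1 suggests clustering, R > 1 suggests dispersion
    expected = 0.5 * (area / n) ** 0.5
    return observed / expected
```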


2 changes: 1 addition & 1 deletion tests/tests.py
@@ -33,7 +33,7 @@ def test_write_your_own(self):
         point_pattern.py.
         """
         some_return = point_pattern.write_your_own(self.gj)
-        self.assertTrue(False)
+        self.assertEqual(some_return, 55)

class TestIterablePointPattern(unittest.TestCase):
"""