基于Python的地理空间分析(六):GeoDB数据可视化分析

引言

前面已经介绍了如何安装PostGIS、创建数据表、添加数据并执行基本的空间查询。我们将在此基础上进一步学习如何使用地理空间数据库进行分析和地图可视化。我们将采用美国新墨西哥州Albuquerque的犯罪数据作为例子完成一系列的分析和图表报告。

具体而言,在这篇Notebook里,我们打算创建一个关于犯罪的仪表盘(crime dashboard),大概可以分为三步:

  1. 创建数据库
  2. 创建数据分析工具
  3. 可视化分析结果

创建犯罪案件数据库

如前所述,我们采用的是美国新墨西哥州Albuquerque市的开放数据(http://www.cabq.gov/abq-data/ ),该市提供了犯罪事件、区域指挥(Area commands)和巡逻区(beats)等相关数据集,我们将通过联合使用这三个数据集进行初步的分析。

注:关于数据集中Layer不同字段的说明见 http://coagisweb.cabq.gov/arcgis/rest/services/public/APD_Incidents/MapServer/0

创建数据表

分别为犯罪事件、区域指挥和巡逻区创建相应的数据表,导入所需要的库:

1
2
3
4
import psycopg2 # connect to PostGIS
import requests # grab data
import datetime # convert geogson
from shapely.geometry import Point, Polygon, MultiPolygon, mapping
1
2
3
4
5
6
7
8
9
10
# create a connection
connection = psycopg2.connect(database='pythonspatial',
user='postgres', password='postgres')

cursor = connection.cursor()

proxies = {
'http': 'socks5://127.0.0.1:13579',
'https': 'socks5://127.0.0.1:13579',
}
1
2
3
4
5
6
7
8
9
10
11
12
13
14
# create the areacommand table with name and geometry field
# the area command filed has a lenth of 20 in the ArcServer service
cursor.execute('create table areacommand (id SERIAL PRIMARY KEY, name VARCHAR(20),'
'geom GEOMETRY)')

# create the beats table
cursor.execute('CREATE TABLE beats (id SERIAL PRIMARY KEY, beat VARCHAR(6),'
'agency VARCHAR(3), areacomm VARCHAR(15), geom GEOMETRY)')

# create the incidents table
cursor.execute('CREATE TABLE incidents (id SERIAL PRIMARY KEY, address VARCHAR(72), '
'crimetype VARCHAR(255), date DATE, geom GEOMETRY)')

connection.commit()

填充数据

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
# grab the area commands and insert them into table
url='http://coagisweb.cabq.gov/arcgis/rest/services/public/adminboundaries/MapServer/8/query'
params={"where":"1=1","outFields":"*","outSR":"4326","f":"json"}

# query the url passing parameters
r=requests.get(url,params=params, proxies=proxies)
data=r.json()

# insert the data
for acmd in data['features']:
polys=[]

# loop through the rings
# A polygon contains an array of rings and a spatialReference.
# Each ring is represented as an array of points.
for ring in acmd['geometry']['rings']:
polys.append(Polygon(ring))
p=MultiPolygon(polys)
name=acmd['attributes']['Area_Command']

# uses Shapely to convert the coordinates to WKT (p.wkt).
cursor.execute("INSERT INTO areacommand (name, geom) VALUES ('{}',ST_GeomFromText('{}'))".format(name, p.wkt))

connection.commit()

上面SQL查询的关键是INSERT INTO table (field, field) VALUES (value,value)命令,其中潜逃了ST_GeometryFromText()命令。ST_GeometryFromText和ST_GeomFromText一样,根据WKT描述的对象返回一个具体的geometry对象,WKT可以是POLYGON、MULTIPOLYGON、LINESTRING等。

此外,关于ESRI ArcServer查询的参数命令,可以查询: http://coagisweb.cabq.gov/arcgis/sdk/rest/index.html#/Query_Map_Service_Layer/02ss0000000r000000/

采用同样的套路,向beats表写入数据:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
url='http://coagisweb.cabq.gov/arcgis/rest/services/public/adminboundaries/MapServer/9/query'
params={"where":"1=1","outFields":"*","outSR":"4326","f":"json"}
r=requests.get(url,params=params, proxies=proxies)
data=r.json()

for acmd in data['features']:
polys=[]
for ring in acmd['geometry']['rings']:
polys.append(Polygon(ring))

p=MultiPolygon(polys)

beat = acmd['attributes']['BEAT']
agency = acmd['attributes']['AGENCY']
areacomm = acmd['attributes']['AREA_COMMA']

cursor.execute("INSERT INTO beats (beat, agency,areacomm,geom) VALUES('{}','{}','{}', ST_GeomFromText('{}'))".format(beat,agency,areacomm,p.wkt))

connection.commit()

最后向incidents表写入数据:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
url='http://coagisweb.cabq.gov/arcgis/rest/services/public/APD_Incidents/MapServer/0/query'
params={"where":"1=1","outFields":"*","outSR":"4326","f":"json"}
r=requests.get(url,params=params, proxies=proxies)
data=r.json()
for a in data["features"]:
address=a["attributes"]["BlockAddress"]
crimetype=a["attributes"]["IncidentType"]
if a['attributes']['date'] is None:
pass
else:
date = datetime.datetime.fromtimestamp(a['attributes']['date'] /1e3).date()

try:
p=Point(float(a["geometry"]["x"]),float(a["geometry"]["y"]))
cursor.execute(("INSERT INTO incidents (address,crimetype,date, geom) VALUES('{}','{}','{}',"
"ST_GeomFromText('{}'))").format(address,crimetype,str(date), p.wkt))
except KeyError:
pass

connection.commit()

与areacommand和beats表相比,在incidents表写入数据过程中,我们做了一定程度的数据清洗,因为可能有些记录缺少date(直接pass)或者坐标系信息(用try catch语句去捕获异常)。

现在我们已经完成了数据的导入工作,接下来我们就可以在PostgreSQL数据库中进行数据的查询并进一步分析和呈现数据。

数据查询结果可视化

之前,我们已经从地理空间数据库中查询数据,并返回WKT格式的数据,但是仅仅是一系列坐标点却并不是想象中的地图。接下来我们使用ipyleaflet和Jupyter实现查询结果的地图可视化。

ipyleaflet实现了在Jupyter中交互式操作地图,具体的代码、用例,可以访问:https://github.com/ellisonbg/ipyleaflet

准备工作

1
2
3
4
5
6
7
8
9
10
11
12
13
14
# Install ipyleaflet
!pip install ipyleaflet

# enable the extension for notbook
# can be skipped for notebook 5.3 and above
!jupyter nbextension enable --py --sys-prefix ipyleaflet

# enable widgetsnbextension for notebook
!jupyter nbextension enable --py --sys-prefix widgetsnbextension

# for jupyter lab
!jupyter labextension install jupyter-leaflet

!jupyter labextension install @jupyter-widgets/jupyterlab-manager

注意:以上命令实际应当在命令行中运行(去掉感叹号!)

重启过Jupyter之后就可以使用了。

ipyleaflet交互式操作地图

1
2
3
4
5
6
7
import psycopg2
from shapely.geometry import Point, Polygon, MultiPolygon
from shapely.wkb import loads
from shapely.wkt import dumps, loads
import datetime
import json
import ipyleaflet as ipl
1
2
3
4
5
connection = psycopg2.connect(database='pythonspatial', user='postgres', password='postgres')
cursor = connection.cursor()
cursor.execute('SELECT name, ST_AsGeoJSON(geom) from areacommand')
c = cursor.fetchall()
c[0][0]
'FOOTHILLS'

之前我们用ST_AsText()命令查询PostgreSQL数据库返回geometry数据,由于GeoJSON更加便于地图呈现,所以我们采用ST_AsGeoJSON()命令。以上代码返回了areacommand表中的所有记录,每条记录包括namegeometry两个字段。

ST_AsTextST_AsGeoJSON只是PostGIS获得geometry的17种方法里的两种,更多方法参见:https://postgis.net/docs/reference.html#Geometry_Accessors

接下来就该创建地图并显示获得的GeoJSON数据了:

1
2
3
4
5
6
7
8
9
10
center = [35.106196, -106.629515]
# the zoom level—the higher the number, the closer the zoom.
zoom = 9
map = ipl.Map(center=center, zoom=zoom)

for x in c:
layer = json.loads(x[1])
layer_geojson = ipl.GeoJSON(data=layer)
map.add_layer(layer_geojson)
map

ipyleaflet控件在转换为markdown文件时无法显示,所以我用folium来作图。

1
2
3
4
5
6
import folium
map = folium.Map(center, zoom_start=9,control_scale=True)
for x in c:
layer = json.loads(x[1])
folium.GeoJson(layer).add_to(map)
map