Take a GeoPackage file of global city data as an example (GeoPackage is an SQLite-based format for geospatial data). Download link: https://geodata.ucdavis.edu/gadm/gadm4.1/gadm_410-gpkg.zip
|
import geopandas as gpd
from sqlalchemy import create_engine

# Read the whole GeoPackage into a GeoDataFrame
gdf = gpd.read_file("gadm_410.gpkg")

# Create the database engine
db_string = "postgresql://postgres:xxxxxx@127.0.0.1:5432/geo"
engine = create_engine(db_string)

total = len(gdf)
chunk_size = 1000

# The dataset is large: writing it all at once raises a memory error, so write in batches
for i in range(0, total, chunk_size):
    sub_gdf = gdf.iloc[i:i + chunk_size]
    # print(i, end='\t,')
    # The first batch replaces any existing table; later batches append to it
    mode = "replace" if i == 0 else "append"
    sub_gdf.to_postgis(name="world_city", con=engine, if_exists=mode, index=False)
|
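The batching logic above can be pulled out of the database code so that it is easy to check on its own. A minimal sketch, assuming a hypothetical helper `batch_plan` (it is not part of geopandas) that yields the slice bounds and the `if_exists` mode for each batch:

```python
from typing import Iterator, Tuple


def batch_plan(total: int, chunk_size: int) -> Iterator[Tuple[int, int, str]]:
    """Yield (start, stop, if_exists) for each batch of rows.

    Hypothetical helper for illustration: the first batch replaces the
    target table, every later batch appends to it.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    for start in range(0, total, chunk_size):
        stop = min(start + chunk_size, total)
        mode = "replace" if start == 0 else "append"
        yield start, stop, mode


if __name__ == "__main__":
    for start, stop, mode in batch_plan(2500, 1000):
        print(start, stop, mode)
```

Each tuple would map onto one call of the form `gdf.iloc[start:stop].to_postgis(..., if_exists=mode)`.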
|
LOAD spatial;
CREATE TABLE world_city AS SELECT * FROM ST_Read('gadm_410.gpkg');
|
Alternatively, skip the import and use DuckDB as a bridge to query the gpkg database file directly
|
LOAD spatial;
SELECT UID FROM ST_Read('gadm_410.gpkg') WHERE ST_Contains(geom, ST_Point(76.3132597490079, 12.551483941107392)) OR ST_Contains(geom, ST_Point(98.28512001168576, 38.06662865215472));
|
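When there are more than a couple of points to test, the `WHERE` clause is easier to generate than to type. A minimal sketch, assuming a hypothetical helper `build_contains_query` that only builds the SQL text; the `column` argument is inserted as-is and is assumed to be a trusted column name:

```python
from typing import Iterable, Tuple


def build_contains_query(path: str,
                         points: Iterable[Tuple[float, float]],
                         column: str = "UID") -> str:
    """Build a query selecting rows whose geometry contains any of the points.

    Points are (longitude, latitude) pairs, matching ST_Point(x, y).
    """
    conditions = [
        f"ST_Contains(geom, ST_Point({float(lon)!r}, {float(lat)!r}))"
        for lon, lat in points
    ]
    if not conditions:
        raise ValueError("at least one point is required")
    safe_path = path.replace("'", "''")  # escape single quotes in the SQL literal
    where = " OR ".join(conditions)
    return f"SELECT {column} FROM ST_Read('{safe_path}') WHERE {where};"


if __name__ == "__main__":
    print(build_contains_query("gadm_410.gpkg", [(1.5, 2.25), (-3.0, 4.0)]))
```

The resulting string can be pasted into the DuckDB shell after `LOAD spatial;`.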
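This is a non-issue in principle: a gpkg file is itself a sqlite3 database file and can be opened directly with sqlite3. Spatial queries, however, require loading the SpatiaLite extension first. After many attempts, the author found that the query results were wrong: with the condition WHERE ST_Contains(geom, ST_Point(76.3132597490079, 12.551483941107392)) OR ST_Contains(geom, ST_Point(98.28512001168576, 38.06662865215472)), the query returned every record instead of 2. The cause remains a mystery.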
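One possible explanation, not verified against this file: SpatiaLite's spatial predicates return an integer, where -1 means "unknown" (for instance when an argument is not a geometry it recognizes), and SQLite treats any non-zero number in a `WHERE` clause as true. GeoPackage stores geometries in its own blob encoding, whose header starts with the bytes `GP`, so if SpatiaLite did not decode `geom`, every row could yield -1 and pass the filter. The sketch below uses only the standard `sqlite3` module; `fake_contains` is a hypothetical stand-in that merely mimics the -1 return value, and the table contents are made up for the demonstration:

```python
import sqlite3


def count_matches(predicate_sql: str) -> int:
    """Count rows of a toy table that pass the given WHERE predicate."""
    conn = sqlite3.connect(":memory:")
    try:
        # Stand-in for a spatial predicate that cannot decode its input
        conn.create_function("fake_contains", 2, lambda geom, point: -1)
        conn.execute("CREATE TABLE world_city (uid INTEGER, geom BLOB)")
        conn.executemany(
            "INSERT INTO world_city VALUES (?, ?)",
            [(i, b"GP\x00\x01") for i in range(5)],  # dummy GeoPackage-style blobs
        )
        row = conn.execute(
            f"SELECT COUNT(*) FROM world_city WHERE {predicate_sql}"
        ).fetchone()
        return row[0]
    finally:
        conn.close()


if __name__ == "__main__":
    print(count_matches("fake_contains(geom, 'p')"))      # bare predicate
    print(count_matches("fake_contains(geom, 'p') = 1"))  # explicit comparison
```

If this is indeed what happens, two things worth trying are comparing the predicate explicitly with `= 1` and converting the blob before testing it; SpatiaLite documents a `GeomFromGPB` function and a GPKG mode for that purpose.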