I have a 7 GB GeoJSON file that I would like to load into a PostGIS database. I have tried ogr2ogr, but it fails because the file is too big for ogr2ogr to load into memory and process.
Are there any alternatives for loading this GeoJSON file into PostGIS?
The ogr2ogr error I get is:
ERROR 2: CPLMalloc(): Out of memory allocating -611145182 bytes. This application has requested the Runtime to terminate it in an unusual way. Please contact the application's support team for more information.
Answer
The sample that you sent shows that it may be possible to split the file manually using an editor such as Notepad++.
1) For each chunk, create a header:
{"type":"FeatureCollection","features":[
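2) After the header, place as many features as you want, one per line and separated by commas (the last feature in a chunk must not be followed by a comma):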
{"geometry": {"type": "Point", "coordinates": [-103.422819, 20.686477]}, "type": "Feature", "id": "SG_3TspYXmaZcMIB8GxzXcayF_20.686477_-103.422819@1308163237", "properties": {"website": "http://www.buongiorno.com", "city": "M\u00e9xico D.F. ", "name": "Buongiorno", "tags": ["mobile", "vas", "community", "social-networking", "connected-devices", "android", "tablets", "smartphones"], "country": "MX", "classifiers": [{"category": "Professional", "type": "Services", "subcategory": "Computer Services"}], "href": "http://api.simplegeo.com/1.0/features/SG_3TspYXmaZcMIB8GxzXcayF_20.686477_-103.422819@1308163237.json", "address": "Le\u00f3n Tolstoi #18 PH Col. Anzures", "owner": "simplegeo", "postcode": "11590"}},
3) Finish the chunk with:
]}
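The three steps above amount to wrapping a batch of feature lines between the header and the closing ]}. Here is a minimal Python sketch of that idea, assuming each feature arrives as one line of text with a trailing comma, as in the sample. build_chunk is a hypothetical helper name, and the second sample feature is made up for illustration. The helper strips the trailing commas and re-joins the features, so the result parses as strict JSON:

```python
import json

HEADER = '{"type":"FeatureCollection","features":['
FOOTER = ']}'


def build_chunk(feature_lines):
    """Wrap raw feature lines in a FeatureCollection header and footer."""
    # Drop surrounding whitespace and the trailing comma from each line,
    # then re-join with commas so the last feature is not followed by one.
    features = [line.strip().rstrip(',') for line in feature_lines if line.strip()]
    return HEADER + "\n" + ",\n".join(features) + "\n" + FOOTER


# Two illustrative feature lines (the second one is invented sample data).
sample_lines = [
    '{"geometry": {"type": "Point", "coordinates": [-103.422819, 20.686477]}, '
    '"type": "Feature", "properties": {"name": "Buongiorno"}},\n',
    '{"geometry": {"type": "Point", "coordinates": [-99.1332, 19.4326]}, '
    '"type": "Feature", "properties": {"name": "Example"}},\n',
]
chunk = build_chunk(sample_lines)
parsed = json.loads(chunk)  # raises ValueError if the chunk is malformed
```

Parsing a chunk with json.loads before loading it is a cheap way to catch a stray comma or a missing bracket.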
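EDIT - Here is Python code that splits the file into chunks of a fixed size (in number of features). It assumes that the header, each feature, and the closing ]} are each on their own line:

class JsonFile(object):
    """Split a GeoJSON file that stores one feature per line into chunks."""

    def __init__(self, path):
        self.file = open(path, 'r')

    def split(self, csize):
        # The first line is the FeatureCollection header.
        header = self.file.readline()
        number = 0
        feature = self.file.readline()
        # Stop at end of file or at the closing "]}" line.
        while feature and feature.strip() != ']}':
            output = open("chunk %s.geojson" % number, 'w')
            output.write(header)
            lines = []
            # Collect up to csize features for this chunk.
            while len(lines) < csize and feature and feature.strip() != ']}':
                # Strip the newline and the trailing comma from each feature.
                lines.append(feature.strip().rstrip(','))
                feature = self.file.readline()
            # Re-join with commas so the last feature has no trailing comma.
            output.write(",\n".join(lines))
            output.write("\n]}\n")
            output.close()
            number += 1
        self.file.close()


if __name__ == "__main__":
    myfile = JsonFile('places_mx.geojson')
    myfile.split(2000)  # Number of features per chunk.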
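Once the chunks exist, each one can be loaded separately, so ogr2ogr only has to hold one chunk in memory at a time. The following is a sketch, assuming ogr2ogr is on the PATH; the connection string, the table name places_mx, and the helper names are placeholders for illustration. The -nln option sets the target table name, and -append adds each chunk to that table instead of recreating it:

```python
import glob
import subprocess


def ogr2ogr_command(chunk_path, pg_conn, table):
    """Build the ogr2ogr argument list that appends one chunk to a table."""
    return [
        "ogr2ogr",
        "-f", "PostgreSQL",  # output driver
        pg_conn,             # destination, e.g. "PG:dbname=gis" (placeholder)
        chunk_path,          # source chunk file
        "-nln", table,       # name of the target table
        "-append",           # add to the table rather than overwrite it
    ]


def load_chunks(pattern, pg_conn, table):
    """Run ogr2ogr once per chunk file matching the glob pattern."""
    for path in sorted(glob.glob(pattern)):
        # check_call raises CalledProcessError if ogr2ogr exits with an error.
        subprocess.check_call(ogr2ogr_command(path, pg_conn, table))


# Example call (placeholder connection string and table name):
# load_chunks("chunk *.geojson", "PG:dbname=gis user=postgres", "places_mx")
```

Passing the arguments as a list means the space in the chunk file names needs no shell quoting.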