There are three methods to convert the output of a Make XY Event Layer to a Feature Class:
- Copy Features
- Feature to Point
- Feature Class to Feature Class
Which one is the fastest? I need to convert a huge number of points stored in a CSV table (a 1.5 GB file), and I wonder what the best way to do it is.
I will use Background geoprocessing on a 64-bit machine.
Answer
Here's an example of the technique I mentioned in my comment on your question. Reading a CSV and populating a feature class with its values can be easy. The script assumes the Latitude field is the 1st column and the Longitude field is the 2nd column. You can change the coordinate system in the code to something other than WGS84, adjust the field types, etc.
# Author: John K. Tran
# Contact: jtran20@masonlive.gmu.edu
import arcpy
import os
import csv
print "arcpy imported"
"""
Input CSV can look like:
Lat,Lon,First Name,Last Name
12.34,56.78,Joe,Smith
0.98,7.65,Jane,Doe
65.43,43.21,Bob,Sagat
Make sure 'Lat' and 'Lon' fields are the 1st and 2nd column in the CSV respectively.
"""
incsv = r"C:\Users\MyName\Desktop\My Docs\Sample Data\SampleCSV.csv" # Change this to the path of your CSV file.
outfc = r"C:\Users\MyName\Desktop\My Docs\Sample Data\SampleGDB.gdb\SampleFC" # Change this to the path of your output FC.
spatialref = arcpy.SpatialReference(4326) # Create the spatial reference object as WGS84. Can modify if desired.
if not arcpy.Exists(outfc): # Create the output feature class if needed.
    arcpy.CreateFeatureclass_management(os.path.dirname(outfc), os.path.basename(outfc), "POINT", None, None, None, spatialref)
csv.register_dialect("xls", delimiter=",", lineterminator="\n") # Register the dialect for native CSV syntax in Microsoft Excel.
f = open(incsv, "r")
reader = csv.reader(f, dialect = "xls")
headers = reader.next() # Read the first line as the header names.
for header in headers[2:]: # Add fields for remaining columns if needed. Default is TEXT field.
    arcpy.AddField_management(outfc, header, "TEXT")
cursor = arcpy.da.InsertCursor(outfc, ['SHAPE@XY'] + headers[2:]) # Create InsertCursor.
count = 0
for row in reader:
    if count % 1000 == 0:
        print "processing row {0}".format(count)
    Ycoord = row[0] # Make sure 'Lat' is in the 1st column.
    Xcoord = row[1] # Make sure 'Lon' is in the 2nd column.
    newrow = [(float(Xcoord), float(Ycoord))] + row[2:]
    cursor.insertRow(newrow) # Insert point in FC for each row in CSV.
    count += 1
del cursor
f.close()
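The script above depends on 'Lat' and 'Lon' being the first two columns. If your CSV orders them differently, one possible variation is to look the columns up by header name and skip rows whose coordinates don't parse. This is only a sketch in plain Python (no arcpy needed to try it): the helper names `attribute_names` and `build_insert_rows` are hypothetical, and the default header names 'Lat' and 'Lon' are assumptions you would adapt to your file.

```python
def attribute_names(headers, lat_name="Lat", lon_name="Lon"):
    """Return the header names that are not the coordinate columns."""
    return [h for h in headers if h not in (lat_name, lon_name)]


def build_insert_rows(headers, rows, lat_name="Lat", lon_name="Lon"):
    """Yield [(x, y), attr1, attr2, ...] lists in the shape the script
    passes to cursor.insertRow() for a ['SHAPE@XY', ...] cursor."""
    lat_idx = headers.index(lat_name)  # raises ValueError if the header is missing
    lon_idx = headers.index(lon_name)
    attr_idx = [i for i in range(len(headers)) if i not in (lat_idx, lon_idx)]
    for row in rows:
        try:
            # X is longitude, Y is latitude, same order as in the script above.
            point = (float(row[lon_idx]), float(row[lat_idx]))
            attrs = [row[i] for i in attr_idx]
        except (ValueError, IndexError):
            continue  # skip rows with missing or non-numeric values
        yield [point] + attrs


# Hypothetical sample data with the coordinate columns in the middle.
sample_headers = ["FirstName", "Lat", "Lon", "LastName"]
sample_rows = [
    ["Joe", "12.34", "56.78", "Smith"],
    ["Bad", "not-a-number", "7.65", "Row"],
    ["Jane", "0.98", "7.65", "Doe"],
]
sample_fields = ["SHAPE@XY"] + attribute_names(sample_headers)
sample_result = list(build_insert_rows(sample_headers, sample_rows))
```

In the script, `sample_fields` would take the place of `['SHAPE@XY'] + headers[2:]`, and each yielded list would be handed to `cursor.insertRow()`.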
An example of a CSV that can be read is:
Lat,Lon,FirstName,LastName,Tone,Magnitude
26.61,-81.49,John,Smith,Dark,6.0
26.68,-81.47,Jim,Jones,Light,9.0
20.01,-155.87,Steve,Bruin,Medium,3.9
25.43,-80.45,Bob,Johnson,Dark,2.75
26.05,-80.11,Bill,Mayer,Light,26.6
100,000 CSV records took about 3-5 seconds for me to run.
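If you want to compare this against Copy Features, Feature to Point, or Feature Class to Feature Class on your own data, a small timing helper is one way to do it. This is a minimal sketch: `timed` is a hypothetical helper name, not part of arcpy, and the arcpy call in the comment (including the layer name "xy_layer") is only an illustration of how it might be used.

```python
from timeit import default_timer


def timed(func, *args, **kwargs):
    """Call func(*args, **kwargs) and return (result, elapsed_seconds)."""
    start = default_timer()
    result = func(*args, **kwargs)
    elapsed = default_timer() - start
    return result, elapsed


# Stand-in workload so the helper can be tried without arcpy.
total, seconds = timed(sum, range(100000))

# With arcpy available, the same helper could wrap a geoprocessing call, e.g.:
# result, seconds = timed(arcpy.CopyFeatures_management, "xy_layer", outfc)
```

Timing each method on the same subset of your CSV should give a fairer comparison than relying on numbers from someone else's machine.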