dbf - Why are Shapefiles limited to 2GB in size?

Thursday, 16 March 2017


I'm hoping somebody can clarify why .shp files are limited to 2GB in size. Having read through the ESRI considerations and technical description, I cannot understand why the limit exists.

Since shapefiles use dBASE for the .dbf component of the multifile format, they must abide by dBASE limits, which cap file size at 2GB. But that raises the same question: why does that limit exist? Does it have something to do with these formats being created when 32-bit OSes were widely used? If so, how does that influence the limit? I've seen posts giving the limit as 2^31 - 1 bytes, which is ~2.1GB, but that just means 32-bit addressing is used, and I am not sure how it fits here. Other posts mention that these formats use 32-bit offsets, specifically "32-bit offsets to 16-bit words", but I don't follow that either.

Answer

You're asking several History of Computing questions here. All the reasons you've listed are true. The maximum file size on the OS was 2GB. The largest signed 32-bit integer was 2^31 - 1, which as a byte count is about 2GB. The maximum file offset in the OSes was 2GB. But even once those were no longer obstacles, Esri explicitly stated that the format has a 2GB limit. Isn't that enough of a reason?
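To make the "maximum integer" point concrete, here is a minimal sketch of what happens when a byte offset has to fit in a signed 32-bit field. The helper name `fits_in_signed_32` is hypothetical and exists only for illustration; the big-endian `">i"` format is just one way to model such a field in Python.

```python
import struct

INT32_MAX = 2**31 - 1  # largest value a signed 32-bit integer can hold

def fits_in_signed_32(offset: int) -> bool:
    """Return True if `offset` can be stored in a signed 32-bit field."""
    try:
        struct.pack(">i", offset)  # ">i" = big-endian signed 32-bit integer
        return True
    except struct.error:
        return False

print(INT32_MAX)                          # 2147483647
print(fits_in_signed_32(INT32_MAX))       # True: last addressable byte offset
print(fits_in_signed_32(INT32_MAX + 1))   # False: one byte past the 2GB mark
```

Any tool that tracks file positions in a field like this cannot address a byte beyond roughly 2GB, whatever the operating system itself allows.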

There are scads of new formats that out-perform shapefile. File geodatabase is so much better that I haven't created an output shapefile this decade. But I've used input shapefiles because that was what was available, and I've generated new shapefiles with turn-of-the-millennium tools, because that's what was available then.

Has computing changed? Of course it has. Can you hack the shapefile format to 4GB or 8GB? Yes, but not without being non-conformant. And it's the conformance that is shapefile's greatest strength, and violating conformance is what will destroy whatever utility remains of the format.
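The 4GB and 8GB figures follow from the "32-bit offsets to 16-bit words" remark in the question: the shapefile specification stores file lengths and record offsets as signed 32-bit integers that count 16-bit words, not bytes. A rough sketch of the arithmetic, assuming that reading (the function name is hypothetical, and treating the field as unsigned is exactly the kind of non-conformant hack described above):

```python
WORD_BYTES = 2  # lengths and offsets are counted in 16-bit (2-byte) words

def max_addressable_bytes(bits: int, signed: bool) -> int:
    """Largest byte count reachable by a `bits`-wide count of 16-bit words."""
    max_value = 2**(bits - 1) - 1 if signed else 2**bits - 1
    return max_value * WORD_BYTES

signed_limit = max_addressable_bytes(32, signed=True)     # as the spec defines it
unsigned_limit = max_addressable_bytes(32, signed=False)  # non-conformant reading

print(signed_limit)    # 4294967294 bytes, just under 4GB
print(unsigned_limit)  # 8589934590 bytes, just under 8GB
```

So the on-disk fields could in principle describe more than 2GB, but a file that relies on this is no longer something every shapefile reader can be expected to open.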
