Saturday 26 October 2019

spatial database - Organization and tidiness of multiple copies of layers?



Back when I was at university I had an "organization and tidiness" problem – I was unorganized and kept my layers in different folders without distinct names, and hence had multiple copies of each layer.


Ever since I started working, I've improved a lot – I keep special folders with special subfolders, and I name my layers according to a system which lets me be a bit neater. But I still have to manage multiple copies of layers (AutoCAD and ArcGIS differ in how they deal with non-Latin languages, so I have to keep a copy adjusted for each program), so I'd like to hear about your experiences and maybe learn a few tips from you:



  1. How do you organize your layers? How do you name them? By name, date, contents, customer?

  2. How do you organize or deal with multiple copies (more acute: how do you update several copies at once)?



Note: I'm talking from the analyst/DBA POV and not from a web-developer's/web-manager's POV (I'm talking about organizing the layers for myself and maybe two more GIS workers, not more).



Answer



This is a wicked problem. We've tried various systems, which have all worked to varying degrees for a time, and eventually grown unwieldy and started to fall apart as more and more edge cases which don't quite fit are encountered. That said, each of the systems we've used is way better than nothing, proving the maxim that any system is better than no system.


Here is a thumbnail overview of our current practice:


Put everything except rasters into a file geodatabase, the fewer the better. Don't nest feature classes under feature datasets unless they are related in some manner (e.g. hydro>streams, hydro>lakes, hydro>wetlands, etc.). This leads to a big long list at the top of the fgdb but that is an acceptable evil.


Create layer files for all the feature classes and organize those instead. This gives a lot of freedom to name as needed, using unsupported characters etc.*, and the ability to move and rename as circumstances change. It also allows duplication without redundancy: for example, one set of layers grouped according to nominal scale (50k, 250k...), another by region (AK, YT...), a third by theme (caribou, land use, transportation...), and a fourth by client, while the datastore itself remains unchanged.


For duplicates use shortcuts instead of the layer files themselves, otherwise there are too many things to update when things change. Configure ArcCatalog to show shortcuts: Tools > Options > file types: .lnk. (Limitations: preview and metadata don't work, and you can't follow the shortcut to its source in ArcCatalog. This can be remedied by using symbolic links instead of shortcuts, see Link Shell Extension.)
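As a rough illustration of the symbolic-link variant, here is a minimal Python sketch that links one layer file into several grouping folders. The function name `link_layer` and the folder names in the usage note are made up for the example, not part of our setup. Note that on Windows, creating symbolic links normally needs administrator rights or Developer Mode.

```python
import os
from pathlib import Path
from typing import Iterable, List


def link_layer(source: Path, group_dirs: Iterable[Path]) -> List[Path]:
    """Create a symbolic link to one layer file in each grouping folder.

    `source` is the single real .lyr file; every entry in `group_dirs`
    receives a link with the same file name pointing back at it.
    """
    links = []
    for group in group_dirs:
        group.mkdir(parents=True, exist_ok=True)
        link = group / source.name
        if link.is_symlink():
            # Replace a stale link so the function can be re-run safely.
            link.unlink()
        elif link.exists():
            # Never overwrite a real file that happens to share the name.
            raise FileExistsError(f"{link} exists and is not a symbolic link")
        os.symlink(source, link)
        links.append(link)
    return links
```

Calling something like `link_layer(Path(r"Z:\Layers\Base\Roads (250k).lyr"), [Path(r"Z:\Layers\By Scale\250k"), Path(r"Z:\Layers\By Region\YT")])` (hypothetical paths) would leave one real layer file and two links, so only the original needs updating when things change.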


*(tip: add the Layers folder as a Start Menu toolbar so they're always at your finger tips.)



Z:\Layers\
    Base\
    Thematic\
    Reference\
    All Dressed Base (250k).lyr
    Administration Boundaries (1000k).lyr
    ...
Z:\Raster\
    Landsat\
    Orthos\
Z:\Data\
    Foo_50k.gdb
    Foo_250k.gdb
    NoScale.gdb

Map compositions and outputs (print files, PDFs, exports, etc.), which by nature are more dynamic and variable, are stored and organized differently, somewhere else. This is the part which has been harder for us. We currently use a dedicated drive with folders named according to Job# (if doing it again I'd use the date instead, e.g. '2010-10-26') and subfolders for project-specific data and results/deliverables. A spreadsheet index lists all the job numbers (folder names), their corresponding map titles and clients. Ex:



W:\Foo_0123\
    Foobarmap_001.mxd
    Docs\
        ReadMe.doc
    Data\
        buffers_2000m.shp
        gps_tracks.csv
    Output\
        Foobarmap_001.pdf
    Deliverables
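Creating the skeleton by script helps keep the folder names consistent. The following is a minimal sketch, not what we actually run: the function name `create_job_folder`, the `date_client` naming pattern and the placement of `Deliverables` alongside `Output` are assumptions made for the example, using the date-based naming I said I would prefer.

```python
from datetime import date
from pathlib import Path
from typing import Optional

# Subfolders taken from the example layout above; their placement is an assumption.
SUBFOLDERS = ("Docs", "Data", "Output", "Deliverables")


def create_job_folder(root: Path, client: str, when: Optional[date] = None) -> Path:
    """Create a job folder named '<ISO date>_<client>' with the standard subfolders."""
    when = when or date.today()
    job_dir = root / f"{when.isoformat()}_{client}"
    for name in SUBFOLDERS:
        # exist_ok makes the call safe to repeat for an existing job.
        (job_dir / name).mkdir(parents=True, exist_ok=True)
    return job_dir
```

For instance, `create_job_folder(Path("W:/"), "Foo", date(2010, 10, 26))` would create a folder named `2010-10-26_Foo` on the jobs drive with the four subfolders inside it.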

Keeping the index up to date is a friction point: people don't like to do it, avoid it, and are inconsistent with naming etc. (using a database instead of a spreadsheet would help). Using a numerical folder name convention also makes it very difficult to find the map for project X without the index, another notable source of friction. Ideally the index would be a clickable HTML page which is automatically generated from a db application. That is a whole 'nother project though.
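To give an idea of what the clickable page could look like, here is a small sketch that builds an HTML table from a CSV export of the spreadsheet, standing in for the db application we don't have. The function name `build_index_html` and the column names `job`, `title` and `client` are assumptions for the example.

```python
import csv
import html
from pathlib import Path


def build_index_html(index_csv: Path, jobs_root: Path) -> str:
    """Return an HTML table with one linked row per job listed in the CSV index."""
    rows = []
    with index_csv.open(newline="", encoding="utf-8") as fh:
        for rec in csv.DictReader(fh):
            # Link each job number to its folder under the jobs drive.
            folder_uri = (jobs_root / rec["job"]).resolve().as_uri()
            rows.append(
                '<tr><td><a href="{}">{}</a></td><td>{}</td><td>{}</td></tr>'.format(
                    html.escape(folder_uri, quote=True),
                    html.escape(rec["job"]),
                    html.escape(rec["title"]),
                    html.escape(rec["client"]),
                )
            )
    header = "<tr><th>Job</th><th>Map title</th><th>Client</th></tr>"
    return "<table>\n" + header + "\n" + "\n".join(rows) + "\n</table>"
```

Writing the returned string to a file on the jobs drive gives a page where each job number opens its folder. It does not remove the need to keep the index filled in, it only makes the index easier to use.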


Key principles:




  • Separate the slowly changing and often reused stuff from the dynamic and variable, and treat them differently.

  • Don't duplicate unnecessarily; use layer files and shortcuts/links wherever possible.

  • Don't change systems too frequently; give each a solid try.


I very much welcome examples of other structures, as I said we're not content with what we have. :)

