Thursday, 7 April 2016

qgis - How to only show labels for an arbitrary selection of items?


I'm curious how others go about solving this problem: You've created a map for something with a large number of features that are labeled. The customer/client asks that you only show labels for X, Y and Z, based on some seemingly arbitrary decision (e.g., what they deem important features). How would you go about doing this?


Some ideas:



  • Create a new string column for this special label and only fill in a value for the features they want to see (could result in duplicate information)

  • Create a new boolean column and flag the features they want to see with true, then use the conditional labeling in QGIS 1.8 to only display the label when the boolean is true



Answer



The second idea (to create a boolean attribute for selection) has many advantages:


(i) it clearly documents what needs to be labeled,



(ii) it is as permanent and portable as the underlying dataset,


(iii) it provides a simple and direct mechanism to determine which labels will appear (which is even portable to another GIS or plotting package),


(iv) it is even amenable to analysis in case there are ever questions about the relationships between these choices of labels and any other variables, and


(v) by parsimoniously encoding the client's choice, it creates no duplicate information.


There are some general database construction and management principles at work here, as wisely suggested in the question. One of them is that any coherent piece of information should be uniquely represented in the database if possible. (Information used as keys to implement joins and relates of course must appear in multiple places, because its function is to identify corresponding records in different tables.) There are excellent reasons for this principle, as anyone who has attempted to maintain a non-normalized relational database can attest. If you don't consistently remember to update, remove, or add this information in every table in which it appears, your database soon becomes internally inconsistent: it's corrupted, often irretrievably so.
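To make the duplication risk concrete, here is a minimal Python sketch comparing the two ideas from the question. The records and the field names (`name`, `special_label`, `show_label`) are hypothetical, chosen only for illustration; they are not taken from any real dataset or from QGIS itself.

```python
# Hypothetical feature records; all field names are illustrative assumptions.
features = [
    {"id": 1, "name": "Town Hall", "special_label": "Town Hall", "show_label": True},
    {"id": 2, "name": "Old Mill", "special_label": None, "show_label": False},
]

def labels_from_copy(rows):
    """Idea 1: the label text is a second copy of the name."""
    return [r["special_label"] for r in rows if r["special_label"]]

def labels_from_flag(rows):
    """Idea 2: the label text is always read from the single 'name' field."""
    return [r["name"] for r in rows if r["show_label"]]

# The feature is renamed, but only the 'name' column gets updated.
features[0]["name"] = "City Hall"

stale = labels_from_copy(features)    # still carries the old name
current = labels_from_flag(features)  # follows the rename automatically
```

With the copied string column, the rename has to be remembered in two places; with the flag, there is only one place where the name lives.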


Another principle is that in a good relational database design, each table should represent a single conceptual "entity": something that the data are modeling or a relationship among those things. When a client specifies a seemingly arbitrary selection of features, they are effectively specifying a subset of rows in a table. Mathematically, by the axiom of separation this is the same as flagging them with a boolean field. Thus, any meaningful "arbitrary" subset of things in a database can be represented by a boolean field and, conversely, such a field is a good way to store arbitrary subsets (or selections).
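The equivalence between a subset of rows and a boolean field can be sketched in a few lines of Python. This is only an illustration under assumed names: the `show_label` field and the sample records are hypothetical.

```python
# Hypothetical table rows and a client-specified "arbitrary" selection of ids.
features = [
    {"id": 1, "name": "X"},
    {"id": 2, "name": "W"},
    {"id": 3, "name": "Y"},
]
selected_ids = {1, 3}

def subset_to_flags(rows, ids):
    """Encode a subset of rows as a boolean field on every row."""
    return [dict(r, show_label=(r["id"] in ids)) for r in rows]

def flags_to_subset(rows):
    """Recover the subset from the boolean field."""
    return {r["id"] for r in rows if r["show_label"]}

flagged = subset_to_flags(features, selected_ids)
recovered = flags_to_subset(flagged)  # same set we started with
```

Going from the subset to the flags and back loses nothing, which is why the boolean field is a faithful way to store the selection.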


Yet another principle is that you should prefer using the underlying data management capabilities of the GIS to store information. The alternative is some ad hoc method based on the capability of the GIS to store information within its "project files" or in some other independent way. A typical example of this is the practice of manually choosing and placing the desired labels. Often it's quick and easy to do this. The problems arise whenever either a change is needed or the work needs to be reproduced; one or the other of these situations is practically inevitable. Manual placement of the labels is tantamount to storing information (namely, what subset of features should be labeled) outside the RDBMS in an extremely elliptical fashion: the selection is specified solely by which labels appear and which ones do not. Think about how you would then solve these follow-on problems:




  • The client wants the same labels to appear in a related but different map, part of a different project.





  • A question arises as to whether the labels are associated with some other attribute.




  • After making several changes to the labels over time, you are asked to revert to the original version.




In these cases, the work involved to solve the problem can be enormous: you have to redo the labeling all over again, or perform manual cross-checks against database tables, or find and restore an old archived project file. If the labels were instead represented by a boolean field in the database, the work would instead be almost trivial.
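As a rough illustration of how small that work becomes, the sketch below uses Python's built-in `sqlite3` module with an in-memory table. The table name, the column names (`category`, `show_label`), and the sample rows are assumptions made up for this example; a real layer would live in whatever data store the GIS uses.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE features ("
    " id INTEGER PRIMARY KEY,"
    " name TEXT,"
    " category TEXT,"
    " show_label INTEGER NOT NULL DEFAULT 0)"  # 1 = label it, 0 = don't
)
conn.executemany(
    "INSERT INTO features (name, category, show_label) VALUES (?, ?, ?)",
    [
        ("Town Hall", "civic", 1),
        ("Old Mill", "industrial", 0),
        ("Library", "civic", 1),
        ("Depot", "industrial", 0),
        ("Harbour", "transport", 1),
    ],
)

# Reusing the same labels in another map: just query the flag.
labeled = [
    row[0]
    for row in conn.execute(
        "SELECT name FROM features WHERE show_label = 1 ORDER BY id"
    )
]

# Are the labels associated with another attribute? Count flags per category.
by_category = dict(
    conn.execute(
        "SELECT category, SUM(show_label) FROM features GROUP BY category"
    )
)
```

Reverting to an earlier selection would similarly come down to restoring a single column from a backup of the table, rather than hunting for an old project file.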

