I am in the process of building a ggplot choropleth map of population in administrative areas in Wales. I have downloaded the Boundary-Line data from the Ordnance Survey and extracted what seems to be the right shapefile (community_ward_region.shp). Using R, I have got as far as reading in the shapefile.
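library(maptools)  # provides readShapePoly()
library(ggplot2)   # provides fortify(), used further down
# wards holds the path to the shapefile
shape <- readShapePoly(wards)
str(shape)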
Which gives me this promising output:
Formal class 'SpatialPolygonsDataFrame' [package "sp"] with 5 slots
..@ data :'data.frame': 1690 obs. of 4 variables:
.. ..$ NAME : Factor w/ 1507 levels "Abbey Cwmhir",..: 969 90 111 200 441 477 1455 249 255 305 ...
.. ..$ DESCRIPTIO: Factor w/ 4 levels "COMMUNITY","COMMUNITY WARD",..: 2 2 1 1 2 2 2 2 1 1 ...
.. ..$ COMMUNITY : Factor w/ 858 levels "Abbey Cwmhir",..: 67 67 81 128 152 152 152 152 157 190 ...
.. ..$ FILE_NAME : Factor w/ 23 levels "ABERTAWE_-_SWANSEA",..: 1 1 1 1 1 1 1 1 1 1 ...
.. ..- attr(*, "data_types")= chr [1:4] "C" "C" "C" "C"
..@ polygons :List of 1690
.. ..$ :Formal class 'Polygons' [package "sp"] with 5 slots
.. .. .. ..@ Polygons :List of 1
.. .. .. .. ..$ :Formal class 'Polygon' [package "sp"] with 5 slots
.. .. .. .. .. .. ..@ labpt : num [1:2] 259009 188524
.. .. .. .. .. .. ..@ area : num 1923892
.. .. .. .. .. .. ..@ hole : logi FALSE
.. .. .. .. .. .. ..@ ringDir: int 1
.. .. .. .. .. .. ..@ coords : num [1:1629, 1:2] 259413 259420 259427 259427 259432 ...
.. .. .. ..@ plotOrder: int 1
.. .. .. ..@ labpt : num [1:2] 259009 188524
.. .. .. ..@ ID : chr "0"
.. .. .. ..@ area : num 1923892
Now if I do this:
bar <- fortify(shape, region = "NAME")
I get a nice data frame called bar that looks pretty much as I expected:
> str(bar)
'data.frame': 4744053 obs. of 7 variables:
$ long : num 302962 302970 302974 303013 303015 ...
$ lat : num 280066 280076 280078 280097 280105 ...
$ order: int 1 2 3 4 5 6 7 8 9 10 ...
$ hole : logi FALSE FALSE FALSE FALSE FALSE FALSE ...
$ piece: Factor w/ 29 levels "1","2","3","4",..: 1 1 1 1 1 1 1 1 1 1 ...
$ group: Factor w/ 1762 levels "Abbey Cwmhir.1",..: 1 1 1 1 1 1 1 1 1 1 ...
$ id : chr "Abbey Cwmhir" "Abbey Cwmhir" "Abbey Cwmhir" "Abbey Cwmhir" ..
However, this is a large data frame and ggplot runs out of puff when trying to display it. In reality I only want to look at one area at a time. It looks as if the FILE_NAME factor in the shape object is what I want, as it mostly corresponds to counties and the major conurbations.
> unique(shape@data$FILE_NAME)
[1] ABERTAWE_-_SWANSEA
[2] BLAENAU_GWENT_-_BLAENAU_GWENT
[3] BRO_MORGANNWG_-_THE_VALE_OF_GLAMORGAN
[4] CAERDYDD_-_CARDIFF
[5] CAERFFILI_-_CAERPHILLY
[6] CASNEWYDD_-_NEWPORT
[7] CASTELL-NEDD_PORT_TALBOT_-_NEATH_PORT_TALBOT
[8] CONWY_-_CONWY
[9] GWYNEDD_-_GWYNEDD
[10] MERTHYR_TUDFUL_-_MERTHYR_TYDFIL
[11] PEN-Y-BONT_AR_OGWR_-_BRIDGEND
[12] POWYS_-_POWYS
[13] RHONDDA_CYNON_TAF_-_RHONDDA_CYNON_TAFF
[14] SIR BENFRO - PEMBROKESHIRE
[15] SIR_BENFRO_-_PEMBROKESHIRE
[16] SIR_CEREDIGION_-_CEREDIGION
[17] SIR_DDINBYCH_-_DENBIGHSHIRE
[18] SIR_FYNWY_-_MONMOUTHSHIRE
[19] SIR_GAERFYRDDIN_-_CARMARTHENSHIRE
[20] SIR_YNYS_MON_-_ISLE_OF_ANGLESEY
[21] SIR_Y_FFLINT_-_FLINTSHIRE
[22] TOR-FAEN_-_TORFAEN
[23] WRECSAM_-_WREXHAM
23 Levels: ABERTAWE_-_SWANSEA ... WRECSAM_-_WREXHAM
Q. How can I select only a subset of the data from the shape object I extract from the shapefile? For example, only the POWYS_-_POWYS parts? If I can somehow include the FILE_NAME in the data frame that is created with fortify then I could easily subset the bar data frame, but I don't know how to do that. Or is there a way to use fortify to extract only parts of the object?
Answer
Select a subset of the shapefile data using indexing:
sub.shape <- shape[shape$FILE_NAME == "POWYS_-_POWYS",]
fortify(sub.shape)
will then give you a much smaller data frame, containing only the polygons for that area.
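To see why this works, here is a minimal sketch on made-up data, assuming only that the sp package is installed. The names make_square, toy.shape, sub.toy and pembs.toy are hypothetical and the three squares stand in for real wards. Row indexing on a SpatialPolygonsDataFrame keeps the polygons and their attribute rows together, and %in% selects several levels at once, which would matter for Pembrokeshire because it appears under two spellings in the levels listed in the question.

```r
library(sp)

# Hypothetical helper: a unit square starting at x0, drawn clockwise
# so that sp treats it as an island rather than a hole.
make_square <- function(x0, id) {
  coords <- cbind(c(x0, x0, x0 + 1, x0 + 1, x0),
                  c(0, 1, 1, 0, 0))
  Polygons(list(Polygon(coords, hole = FALSE)), ID = id)
}

# Toy stand-in for the ward shapefile: three polygons plus attributes.
polys <- SpatialPolygons(list(make_square(0, "a"),
                              make_square(2, "b"),
                              make_square(4, "c")))
attrs <- data.frame(
  NAME      = c("Ward A", "Ward B", "Ward C"),
  FILE_NAME = c("POWYS_-_POWYS",
                "SIR BENFRO - PEMBROKESHIRE",
                "SIR_BENFRO_-_PEMBROKESHIRE"),
  row.names = c("a", "b", "c")  # must match the polygon IDs
)
toy.shape <- SpatialPolygonsDataFrame(polys, attrs)

# One area: logical row index, as in the answer above.
sub.toy <- toy.shape[toy.shape$FILE_NAME == "POWYS_-_POWYS", ]

# Several levels at once: both spellings of Pembrokeshire.
pembs.toy <- toy.shape[toy.shape$FILE_NAME %in%
                         c("SIR BENFRO - PEMBROKESHIRE",
                           "SIR_BENFRO_-_PEMBROKESHIRE"), ]

# Geometry and attribute table stay in step after subsetting.
length(sub.toy@polygons)
nrow(pembs.toy@data)
```

Because the subset is still a SpatialPolygonsDataFrame, it can be passed on to fortify in the same way as the full object.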