I have a dataset with a geometry column that contains both linestrings and multilinestrings. I want to keep the linestrings as they are and split each multilinestring into its component linestrings (which should increase the number of rows of the sf data frame). Unfortunately, when I use sf::st_cast("LINESTRING"), a warning tells me that only the first linestring of each multilinestring is kept. Is there a way to keep all linestrings from the multilinestring when using sf::st_cast? A reproducible example with the warning is below:
library(sf)
library(dplyr)
# sample dataframe - creating linestrings
df1 <- data.frame(lon = 1:10, lat = 1:10, var = c(1,1,1,2,2,2,3,3,4,4)) %>%
  st_as_sf(coords = c("lon", "lat"), dim = "XY") %>%
  group_by(var) %>%
  summarise(geometry = st_union(geometry), do_union = FALSE) %>%
  st_cast("LINESTRING")
# creating a multilinestring
df2 <- df1[1:2,] %>%
  mutate(var = c(1,1)) %>%
  group_by(var) %>%
  summarise(geometry = st_union(geometry), do_union = FALSE) %>%
  st_cast("MULTILINESTRING")
# combining the two
df <- rbind(df1, df2)
# trying to convert only the multilinestring to two linestrings,
# without changing the already existing linestrings
df <- df %>% st_cast("LINESTRING")
# Warning message:
# In st_cast.MULTILINESTRING(X[[i]], ...) : keeping first linestring only
I can do it manually by first casting everything to MULTILINESTRING and then casting everything to LINESTRING, as follows:
df <- df %>% st_cast("MULTILINESTRING") %>% st_cast("LINESTRING")
Is there a better way of doing this?
Answer
One alternative is to apply st_cast to "LINESTRING" over each row:
> do.call(rbind,lapply(1:nrow(df),function(i){st_cast(df[i,],"LINESTRING")}))
Simple feature collection with 6 features and 1 field
geometry type: LINESTRING
dimension: XY
bbox: xmin: 1 ymin: 1 xmax: 10 ymax: 10
epsg (SRID): NA
proj4string: NA
  var                   geometry
1   1 LINESTRING (1 1, 2 2, 3 3)
2   2 LINESTRING (4 4, 5 5, 6 6)
3   3       LINESTRING (7 7, 8 8)
4   4     LINESTRING (9 9, 10 10)
5   1 LINESTRING (1 1, 2 2, 3 3)
6   1 LINESTRING (4 4, 5 5, 6 6)
but that can't really be much better than the two-step cast:
> st_cast(st_cast(df, "MULTILINESTRING"),"LINESTRING")
Simple feature collection with 6 features and 1 field
geometry type: LINESTRING
dimension: XY
bbox: xmin: 1 ymin: 1 xmax: 10 ymax: 10
epsg (SRID): NA
proj4string: NA
  var                   geometry
1   1 LINESTRING (1 1, 2 2, 3 3)
2   2 LINESTRING (4 4, 5 5, 6 6)
3   3       LINESTRING (7 7, 8 8)
4   4     LINESTRING (9 9, 10 10)
5   1 LINESTRING (1 1, 2 2, 3 3)
6   1 LINESTRING (4 4, 5 5, 6 6)
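As a quick sanity check on a smaller object, the sketch below builds one LINESTRING and one two-part MULTILINESTRING by hand and confirms that nothing is dropped. The names l1, l2, x and out are made up for this illustration and are not part of the question's data.

```r
library(sf)

# Hypothetical stand-in data: two simple linestrings
l1 <- st_linestring(rbind(c(1, 1), c(2, 2), c(3, 3)))
l2 <- st_linestring(rbind(c(4, 4), c(5, 5), c(6, 6)))

# Mixed geometry column: row 1 is a LINESTRING, row 2 a MULTILINESTRING
x <- st_sf(
  var = c(1, 2),
  geometry = st_sfc(l1, st_multilinestring(list(l1, l2)))
)

# Promote everything to MULTILINESTRING first, then split into LINESTRINGs
out <- st_cast(st_cast(x, "MULTILINESTRING"), "LINESTRING")

nrow(out)                                    # 3
unique(as.character(st_geometry_type(out)))  # only "LINESTRING" remains
```

The single linestring comes back as one row and the multilinestring as two, each carrying the attributes of the row it came from. sf may warn that attributes are being repeated across the split geometries, which is expected with the two-step cast.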
I assume the two-cast version is what you mean in your last line, although you don't give code. This is probably pretty close to optimal. microbenchmark reckons the two-cast version is about five times faster than the per-row version on your little example:
Unit: milliseconds
  expr      min       lq      mean    median       uq       max neval
 apply 9.087103 9.411445 10.056437 10.061594 10.50437 12.969576   100
 casts 1.737474 1.819215  2.000212  1.866471  1.92306  4.406047   100
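The benchmark call itself is not shown, so the following is only a guess at how such a timing could be set up. The labels apply and casts and the 100 repetitions are taken from the output above. The small object x is a hypothetical stand-in for df, so absolute timings will differ.

```r
library(sf)
library(microbenchmark)

# Hypothetical stand-in for df: one LINESTRING and one MULTILINESTRING
l1 <- st_linestring(rbind(c(1, 1), c(2, 2), c(3, 3)))
l2 <- st_linestring(rbind(c(4, 4), c(5, 5), c(6, 6)))
x <- st_sf(
  var = c(1, 1),
  geometry = st_sfc(l1, st_multilinestring(list(l1, l2)))
)

bench <- microbenchmark(
  # cast each row separately, then bind the pieces back together
  apply = do.call(rbind, lapply(seq_len(nrow(x)), function(i) {
    st_cast(x[i, ], "LINESTRING")
  })),
  # cast the whole object twice
  casts = st_cast(st_cast(x, "MULTILINESTRING"), "LINESTRING"),
  times = 100
)
print(bench)
```

The per-row version pays for one st_cast call and one rbind step per feature, so the gap between the two would be expected to widen as the number of rows grows.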