r - Kriging using multiple spatial input variables and one spatial output (response) variable

Tuesday 4 July 2017

r - Kriging using multiple spatial input variables and one spatial output (response) variable

I have a data table with the following columns in a CSV file.

X    Y     Zinc      L    F     C
---  ---   -----     --   ---  ----

Of course, I can import the CSV file as a data frame

a <- read.csv('givendata.csv', header = T)

And convert this data into spatial data by

coordinates(a) = ~X + Y

Now, I wish to run kriging by using L, F and C as spatial input variables and Zinc as the output variable. How do I achieve this?

Edit: I am new to Kriging.

I am adding some more explanation to the question, based on a comment that I do not have an idea about linear modelling.

If I do the following

A1 <- read.csv('givendata.csv', header = T)
mymodel <- lm(Zinc ~ L + F + C, data = A1)

it gives the output, with all the regression coefficients and the constant. But, these observations have spatial inputs (see the meuse dataset), which is why it is referred to as a spatial variable (this is GIS.SE!).

Now suppose I were to take the following steps:

x.range <- (range(A1@coords[,1]))
y.range <- (range(A1@coords[,2]))

A1.grd <- expand.grid(x=seq(from=x.range[1], to=x.range[2], by=30),  
                    y=seq(from=y.range[1], to=y.range[2], by=30))


coordinates(A1.grd) <- ~x+y
gridded(A1.grd) <- TRUE

library(automap) 
kZinc <- autoKrige(Zinc ~ L + F + C, A1, A1.grd)

I get the following error:

Error in eval(expr, envir, enclos) : object 'L' not found

which is funny because the lm was working and the Kriging does not work!

Answer

As I stated in the comments, you need values of your explanatory variables at the prediction locations in order to make predictions. Here is a fully reproducible example:

library(automap)

Make 100 random data points since we don't have your data file:

A1 = data.frame(x=runif(100),y=runif(100), Zinc=runif(100), L=runif(100), F=runif(100), C=runif(100))
coordinates(A1)=~x+y

Predict over this grid (note I've used len=20 instead of by=30 from your code because my numbers are different):

x.range <- (range(A1@coords[,1]))
y.range <- (range(A1@coords[,2]))

A1.grd <- expand.grid(x=seq(from=x.range[1], to=x.range[2], len=20),  
                    y=seq(from=y.range[1], to=y.range[2], len=20))
coordinates(A1.grd) <- ~x+y
gridded(A1.grd) <- TRUE

Now let's reproduce what you did:

kZinc <- autoKrige(Zinc ~ L + F + C, A1, A1.grd)

That reproduces your error. Now I can see what your problem is, something impossible to tell from your initial post. Thank you for editing.

The problem is the model wants to predict Zinc at the grid locations, but you need values of L, F, and C at those locations, in order to do predictions. Let's just make some up to illustrate. I have a 20x20 grid so that's 400 points:

A1.grd$L=runif(400)
A1.grd$F=runif(400)
A1.grd$C=runif(400)

Moment of truth:

kZinc <- autoKrige(Zinc ~ L + F + C, A1, A1.grd)

And that works.

Now we have no idea what your L, F, and C represent. Normally they will be one of two kinds of things:

Spatially continuous quantities that you can get the value of at any (x,y) location, such as altitude, or distance to nearest road. Compute them over your (x,y) grid and do kriging as above.

Non-continuous quantities only valid at the sample level. For example if these are the amounts of Zinc in human blood samples you might have the age of the human sampled. How do you then do kriging predictions at a point? You don't have sample-specific measurements there because you haven't any samples there.

In this case you can literally make up some values. To continue my example you might choose the average age of people in your study, do the kriging with that, and then show a kriged map of "Zinc amount for 27 year olds". With more variables you may end up with something like "Zinc amount for unvaccinated 27 year old men" if your variables are vaccination status, age, and sex.

You could also do separate kriged maps for extreme values of your covariates, and show them on two plots, something like "Range of Zinc values for ages 12 (left figure) to 52 (right figure)".

Or you might find that your covariates aren't significant at all and just drop them.

Blog

Tuesday 4 July 2017

r - Kriging using multiple spatial input variables and one spatial output (response) variable

No comments:

Post a Comment

arcpy - Changing output name when exporting data driven pages to JPG?