R/extractEnv.R
extractEnv.Rd
Accesses and downloads environmental data from the IMOS THREDDS server and appends variables to detection data based on the date of detection
extractEnv(
  df,
  X = "longitude",
  Y = "latitude",
  datetime = "detection_timestamp",
  env_var,
  folder_name = NULL,
  verbose = TRUE,
  cache_layers = TRUE,
  crop_layers = TRUE,
  full_timeperiod = FALSE,
  fill_gaps = FALSE,
  buffer = NULL,
  nrt = FALSE,
  output_format = ".grd",
  .parallel = TRUE,
  .ncores = NULL
)
df: detection data as a data frame containing, at a minimum, X, Y and date-time fields
X: name of the column with the X coordinate or longitude (EPSG 4326)
Y: name of the column with the Y coordinate or latitude (EPSG 4326)
datetime: name of the column with the date-time stamp (Coordinated Universal Time; UTC)
env_var: environmental variable to extract; options include ('rs_sst', 'rs_sst_interpolated', 'rs_salinity', 'rs_chl', 'rs_turbidity', 'rs_npp', 'bathy', 'dist_to_land', 'rs_current')
folder_name: name of the folder within 'imos.cache' where downloaded rasters should be saved. The default, NULL, produces automatic folder names based on the study extent
verbose: should the function report which operation is being conducted? Set to FALSE to keep it quiet
cache_layers: should the extracted environmental data be cached within the working directory? If FALSE, the data are stored in a temporary folder and discarded after environmental extraction
crop_layers: should the extracted environmental data be cropped to the study site?
full_timeperiod: should environmental variables be extracted for each day across the full monitoring period? This is time and memory consuming for long projects
fill_gaps: should the function use a spatial buffer to estimate environmental variables for detections where data are missing? Default is FALSE to save computational time.
buffer: radius of the buffer (in m) around each detection from which environmental variables should be extracted. The median value of pixels that fall within the buffer is used if fill_gaps = TRUE. If NULL, a buffer is chosen based on the resolution of the environmental layer. A numeric value (in m) can be supplied to customise the buffer radius.
nrt: should Near Real-Time current data be used if Delayed-Mode current data are missing? Default is FALSE, in which case NAs are appended to current variables for years (currently, all years after 2020) when current data are missing. Note that Near Real-Time data are subject to less quality control than Delayed-Mode data.
output_format: file format for cached environmental layers. You can use gdal(drivers=TRUE) to see which drivers are available in your installation. The default format is '.grd'.
.parallel: should the function be run in parallel?
.ncores: number of cores to use when running in parallel. If none is provided, detectCores is used to determine the number.
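As a rough illustration of the core-count lookup mentioned for .ncores, the sketch below calls parallel::detectCores() directly. Keeping one core free and falling back to a single core when the count is unknown are assumptions made for this example only, not documented behaviour of extractEnv.

```r
# Query the number of cores reported by the operating system.
# detectCores() can return NA on some platforms, so guard against that.
available <- parallel::detectCores()

# Assumption for this sketch only: keep one core free, never go below 1.
n_cores <- if (is.na(available)) 1L else max(1L, available - 1L)

n_cores
```

The resulting value could then be supplied as .ncores = n_cores together with .parallel = TRUE.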
a data frame with the environmental variable appended as an extra column, based on the date of each detection
The extractEnv function allows the user to access, download and append a range of environmental variables to each detection within a telemetry dataset. We advise users to first undertake a quality control step using the runQC function before further analysis; however, the functionality to append environmental data will work on any dataset that has, at a minimum, spatial coordinates (i.e., latitude, longitude; in EPSG 4326) and a timestamp (in UTC) for each detection event. Quality-controlled environmental variables housed on the IMOS THREDDS server will be extracted for each specific coordinate at the specific timestamp where available. A summary table of the full range of environmental variables currently available can be accessed using the imos_variables function.
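For data that have not been through runQC, the minimum input described above can be prepared in base R. The sketch below is illustrative only: the data frame, its values and the local time zone are hypothetical, and the column names are chosen to match the defaults of the X, Y and datetime arguments.

```r
# Hypothetical detections recorded in local time (Australia/Brisbane, UTC+10).
detections <- data.frame(
  longitude  = c(146.82, 146.85),
  latitude   = c(-19.25, -19.20),
  local_time = c("2013-08-18 09:30:00", "2013-08-19 23:45:00")
)

# Parse the timestamps, declaring the time zone they were recorded in.
utc <- as.POSIXct(detections$local_time, tz = "Australia/Brisbane")

# Re-express the same instants in UTC; only the display time zone changes,
# so 09:30 in Brisbane becomes 23:30 UTC on the previous day.
attr(utc, "tzone") <- "UTC"
detections$detection_timestamp <- utc

# Basic sanity check that coordinates look like decimal degrees (EPSG 4326).
stopifnot(all(abs(detections$longitude) <= 180),
          all(abs(detections$latitude) <= 90))
```

A data frame prepared this way has the three fields that extractEnv expects by default.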
## Input example detection dataset that has been run through the quality control
## workflow (see 'runQC' function)
library(tidyverse)
data("TownsvilleReefQC")

## simplify & subset data for example speed-up
qc_data <-
  TownsvilleReefQC %>%
  unnest(cols = c(QC)) %>%
  ungroup() %>%
  filter(Detection_QC %in% c(1, 2)) %>%
  filter(filename == unique(filename)[1]) %>%
  slice(5:8)

## Extract daily interpolated sea surface temperature
## cache_layers set to FALSE for speed; fill_gaps set to TRUE to fill missing values
data_with_sst <-
  extractEnv(df = qc_data,
             X = "receiver_deployment_longitude",
             Y = "receiver_deployment_latitude",
             datetime = "detection_datetime",
             env_var = "rs_sst_interpolated",
             cache_layers = FALSE,
             crop_layers = TRUE,
             full_timeperiod = FALSE,
             fill_gaps = TRUE,
             folder_name = "test",
             .parallel = FALSE)
#> Extracting environmental data only on days detections were present; between 2013-08-18 and 2013-09-05 (3 days)
#> This may take a little while...
#> Accessing and downloading IMOS environmental variable: rs_sst_interpolated
#> Checking if files exist on IMOS server...
#>   |======================================================================| 100%
#> Extracting and appending environmental data
#> Filling gaps in environmental data by extracting median values from a 15km buffer around detections that fall on 'NA' values
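The gap-filling message above refers to taking the median of pixel values inside a buffer around a detection that falls on an NA cell. The idea can be sketched in base R on a toy grid. This is not how extractEnv implements it (the function works on downloaded raster layers in geographic coordinates); the grid, values, planar distances and 1.5 km radius here are all hypothetical.

```r
# Hypothetical 5 x 5 grid of sea surface temperature (deg C) with 1 km cells.
# Cell centres sit at x, y = 0.5, 1.5, ..., 4.5 km.
centres <- seq(0.5, 4.5, by = 1)
grid <- expand.grid(x = centres, y = centres)
grid$sst <- seq(20, 26, length.out = nrow(grid))

# Blank out the cell under the detection so a direct lookup returns NA.
detection <- c(x = 2.5, y = 2.5)
grid$sst[grid$x == detection[["x"]] & grid$y == detection[["y"]]] <- NA

# Planar distance (km) from the detection to every cell centre.
d <- sqrt((grid$x - detection[["x"]])^2 + (grid$y - detection[["y"]])^2)

# Median of the non-missing cells whose centres fall inside the buffer.
buffer_km <- 1.5
filled <- median(grid$sst[d <= buffer_km], na.rm = TRUE)

filled
```

The median is used in this sketch because the buffer argument is documented as returning a median value of the pixels that fall within the buffer.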