| Title: | Tools For Working With Naturdata_kalo |
|---|---|
| Description: | This R package provides tools for working with the naturdata_kalo database |
| Authors: | Lars Dalby [aut, cre] (ORCID: <https://orcid.org/0000-0002-7270-6999>) |
| Maintainer: | Lars Dalby <[email protected]> |
| License: | MIT + file LICENSE |
| Version: | 0.4.16 |
| Built: | 2026-05-29 11:05:32 UTC |
| Source: | https://gitlab.au.dk/ecos/tools/r-pkgs/naturdata |
Generate a condition string based on the input arguments.
This function is only to be used in replace_na_if().
get_condition(column, progIds, naturtypeIds = NULL, year = NULL, year_cond)get_condition(column, progIds, naturtypeIds = NULL, year = NULL, year_cond)
column |
name The name of the column to replace values in |
progIds |
int The progIds to modify values within |
naturtypeIds |
int The naturtypeIds to modify values within |
year |
int The year to use in the condition. See |
year_cond |
chr The condition to apply on |
Calculate number of pins hit by an aggregated species group as the unique number of pins hit by the group.
ndb_aggregate_pinpoint(pp)ndb_aggregate_pinpoint(pp)
pp |
chr The pinpoint hits as character string of the format "[1,2,4,16]" |
int The number of pins
tibble::tibble( aktId = 1, artId = 1:2, pinpoint = c("[3,7,9,11,13,14,16]", "[1,2,3]") ) |> dplyr::summarize(hyppighed = ndb_aggregate_pinpoint(pinpoint), .by = aktId)tibble::tibble( aktId = 1, artId = 1:2, pinpoint = c("[3,7,9,11,13,14,16]", "[1,2,3]") ) |> dplyr::summarize(hyppighed = ndb_aggregate_pinpoint(pinpoint), .by = aktId)
Coordinates are stored as WKT on the database in the column geomText
This function will extract the x and y from geomText and add them
as columns to data.
ndb_append_coords(data)ndb_append_coords(data)
data |
data.frame containing a WKT string in |
Coordinates in geomText are in UTM32 (EPSG 25832)
data with augmented with utm_x and utm_y
Calculate the average of e.g. vegetationheight which is recorded by
four measurements each in it own column. The average is only calculated
where there are at least min_n_values non-na values.
ndb_calc_avg(tbl, prefix, min_n_values = 3)ndb_calc_avg(tbl, prefix, min_n_values = 3)
tbl |
The table with the columns to average over. |
prefix |
chr The string with which the columns are selected
(using |
min_n_values |
int Number of required non-na values (Default 3). |
tibble Table with aktId and the calculated average as prefix_avg
Calculate the cover ratio as the ratio between species A to the sum of species A and species B The cover ratio is defined as
ndb_calc_cover_ratio(species_A, species_B)ndb_calc_cover_ratio(species_A, species_B)
species_A |
int The number of pins hitting species A. |
species_B |
int The number of pins hitting species B. |
The ratio of species A to the sum of species A and species B.
ndb_calc_cover_ratio(10L, 5L)ndb_calc_cover_ratio(10L, 5L)
The function will calculate the ratio between jord_C and jord_N while performing checks for whether both are within the detection limit. Lastly the function will ensure that the calculated ratio is within 10 and 60.
ndb_calc_jord_cn(tbl)ndb_calc_jord_cn(tbl)
tbl |
A table containing the columns jord_C, jord_N, jord_C_under_detektionsgraensen and jord_N_under_detektionsgraensen. |
tbl agumented with the jord_CN column
Art, Hybrid, Sektion, Superart are counted as unique observations. Form and Variant are counted as their parent species.
ndb_calc_n_arter(artsliste, artsnavne)ndb_calc_n_arter(artsliste, artsnavne)
artsliste |
data.frame with |
artsnavne |
data.frame returned by |
tibble with artId & n_arter
Function to calculate the number of large trunks. What constitutes a large trunk is dependent on the naturetype. Since the naturetype can vary over time and is sometimes interpreted from previous registrations n_store_stammer needs to be re-calculated (in some situations).
ndb_calc_n_store_stammer(nt, traeregistrering, traediameter)ndb_calc_n_store_stammer(nt, traeregistrering, traediameter)
nt |
A table with (at least) |
traeregistrering |
The table |
traediameter |
The list |
The circle diameter has varied over time between 10 and 15m. To accomodate
for that observations from the 10m circle are scaled up by the ratio of the
areas of two circles with radii 15 and 10 is given by:
Table with aktId and n_store_stammer_15m
Calculate a weighted assessment where the area within and outside the NATURA 2000 sites is used as weights.
ndb_calc_weighted_assessment(Ai, Au, ni, nu, vi, vu)ndb_calc_weighted_assessment(Ai, Au, ni, nu, vi, vu)
Ai |
Area inside NATURA 2000 sites |
Au |
Area outside the NATURA 2000 sites |
ni |
Number of plots inside NATURA 2000 |
nu |
Number of plots outside NATURA 2000 |
vi |
Proportion of plots that meet the criteria inside the NATURA 2000 sites |
vu |
Proportion of plots that meet the criteria outside the NATURA 2000 sites |
The weighted assessment is defined as:
The weighted assesment
Cases with hyppighed > 16 are set to NA. Cases where
hyppighed == 0 & isPinpoint == 1 are changed to
isPinpoint == 0 and isProevefelt == 1
ndb_clean_artsliste(tbl)ndb_clean_artsliste(tbl)
tbl |
tibble The raw.artsliste table |
tbl cleaned
The function will insert a value of half the detection limit
for values less than the detection limit (0.05 for jord_C and 0.01
for jord_N).
For pH values outside 2.5 - 8.5 will be replaced with NA. In case
vand_ph is missing vand_ph2 is used instead. The vand_ph2
is not included in the returned table.
ndb_clean_kemi(tbl)ndb_clean_kemi(tbl)
tbl |
tibble The raw.kemi table |
The ratio between jord_C and jord_N is also calculated and
added to the returned table.
tbl cleaned
This function will perform several clean-up steps.
Replace vegetationshojde > 300 with NA
Replace areas over 78.5 and <= 80 with 78.5
Replace areas over 80 with NA
Replace negative values with NA
ndb_clean_lysaaben_struktur(tbl)ndb_clean_lysaaben_struktur(tbl)
tbl |
tibble The raw.lysaaben_struktur table |
In addition NA's are replaced with 0 under these conditions #TODO
hoeljer_5m_observeret is not written to data as this is the raw interval strings which are not used currently.
tbl cleaned
The function will ensure that all species have artId and
that all naturtypeId's are valid (i.e. not marine or aquatic).
The function will also insert a value of half the detection limit
for values less than the detection limit (0.03 for plante_N and 0.01
for plante_P) and add julian date as a column.
Cases where samples have been taken from species that was not part of
the protocol are removed (Dunet dueurt, Søblomme, Ager-padderok,
Uomsat dødt organisk materiale).
Lastly a few cases where multiple artId for the same artsNavn
are set to the correct artID.
ndb_clean_plantekemi(tbl, nogo_naturtyper)ndb_clean_plantekemi(tbl, nogo_naturtyper)
tbl |
tibble The raw.plantekemi table |
nogo_naturtyper |
integer Vector of illegal naturtypeId's. |
The julianDate column is calculated using lubridate::yday().
To have valid column names when pivoting to wider formats
values with å in retning are replaced with aa.
Cases where nedbrydning != 1:5 are set to NA.
The comment column is dropped.
ndb_clean_skov_doedtved(tbl)ndb_clean_skov_doedtved(tbl)
tbl |
tibble The raw.skov_doedtved table |
tbl cleaned
Columns with coverage in the 5 m circle have areas adjusted to
78.5 m2 when values are between 78.5 - 80, coverage over 80 are
set to NA.
In all cases negative values are replaced with NA. For progId
133, 134 with naturtypeId 9999 NA are replaced with 0.
Count and area columns are converted to integer.
For for all coverage variables NAs are replaced with 0.
For n_hulheder_15m and n_raad_15m NAs are only
replaced from 2011 and onwards.
ndb_clean_skov_struktur(tbl)ndb_clean_skov_struktur(tbl)
tbl |
tibble The raw.skov_struktur table |
The sum of hulheder_15m and raad_15m is also calculated
by this function as n_hulheder_raad_samlet_15m
Currently just sets naturtypeId to NA when status
is not "Godkendt data" (statusId = 100).
ndb_clean_stamdata(tbl)ndb_clean_stamdata(tbl)
tbl |
tibble The raw.stamdata table |
tbl cleaned
The function assumes that server, database,
uid & pwd are stored in the .Renviron file
using the variable names NATURDATA_KALO_SERVER, NATURDATA_KALO_DB,
NATURDATA_KALO_UID & NATURDATA_KALO_PWD
ndb_connect()ndb_connect()
Note that the ODBC Driver 17 for SQL Server is required.
OdbcConnection
This wraps DBI::dbWriteTable() to create a new table
ndb_create_tbl(con, obj, schema = "data", table)ndb_create_tbl(con, obj, schema = "data", table)
con |
An OdbcConnection |
obj |
Table to write |
schema |
chr Name of the schema |
table |
chr Name of the table to create |
create_tbl will overwrite any existing table.
TRUE, invisibly
Simple wrapper around DBI::dbDisconnect()
ndb_disconnect(con)ndb_disconnect(con)
con |
OdbcConnection |
Filter artslisten to only have observations where isPinpoint,
isProvefelt or is5m has the value of 1.
ndb_filter_artsliste(artsliste)ndb_filter_artsliste(artsliste)
artsliste |
The artsliste table |
Filter on progId to only 117,119,131:132,142,144
ndb_filter_lysaaben_prog(tbl)ndb_filter_lysaaben_prog(tbl)
tbl |
A table that contains the column |
tbl filtered
Filter on progId to only 133, 134 or 140
ndb_filter_skov_prog(tbl)ndb_filter_skov_prog(tbl)
tbl |
A table that contains the column |
tbl filtered
Get a specified table from the database
ndb_get(con, schema, tbl)ndb_get(con, schema, tbl)
con |
Connection to the database |
schema |
chr Name of the schema where |
tbl |
chr Name of table |
Get the artsgruppe_planter table
ndb_get_artsgruppe(con)ndb_get_artsgruppe(con)
con |
Connection to the database |
A lazy table
lists.artsgruppe_planter
Get the artsliste table and filter it to only have
observations where isPinpoint, isProvefelt or is5m has the
value of 1.
ndb_get_artsliste_pf(con)ndb_get_artsliste_pf(con)
con |
Connection to the database |
data.artsliste
Get the list of artsscorer per species
ndb_get_artsscorer(con)ndb_get_artsscorer(con)
con |
Connection to the database |
A lazy table
lists.artsScore
Get the list of traits
ndb_get_arttraits(con)ndb_get_arttraits(con)
con |
Connection to the database |
A lazy table
lists.arttraits
Get the skov doedtved table
ndb_get_doedtved(con)ndb_get_doedtved(con)
con |
Connection to the database |
A lazy table
data.skov_doedtved
Get the list of Ellenberg values (the English version) plus T_EllTysk.
It is somewhat misleading that T_EllTysk is in lists.ellenberg_engelsk,
but it is not a mistake.
ndb_get_ellenberg(con)ndb_get_ellenberg(con)
con |
Connection to the database |
A lazy table
lists.ellenberg_engelsk
Get the list of lichen species names (i.e. species in class
Lecanoromycetes). The returned species list will have
a column isLav with all 1L's.
ndb_get_laver(con)ndb_get_laver(con)
con |
Connection to the database |
A tibble with lichen species names.
lists.artsNavne
Get the lysaaben struktur table
ndb_get_lysaaben_struktur(con)ndb_get_lysaaben_struktur(con)
con |
Connection to the database |
A lazy table
data.lysaaben_struktur
Get the list of bryophyte species names (i.e. species in phylum Bryophyta).
ndb_get_mosser(con)ndb_get_mosser(con)
con |
Connection to the database |
A tibble with bryophyte species names.
lists.artsNavne
Get a list of aktId's where formaal == "NOVANA" and
ansvarligMyndighed == "Milj\u00f8styrelsen"
ndb_get_novana_aktids(con)ndb_get_novana_aktids(con)
con |
Connection to the database |
A lazy table with progId, aktId & dato
data.stamdata
Return the plantekemi table
ndb_get_plantekemi(con)ndb_get_plantekemi(con)
con |
Connection to the database |
A lazy table
data.plantekemi
Query the database for schemas, tables and their columns. Useful when only given a column name and need to find out where it lives on the database.
ndb_get_schema(con)ndb_get_schema(con)
con |
OdbcConnection |
The rapportering scheme is rarely relevant and is not returned. The same goes for two tables trend_p4 and tilstand_p4#'
A dataframe with
chr, Schema name
chr, Table name
chr, Column name
Get the skov struktur table
ndb_get_skov_struktur(con)ndb_get_skov_struktur(con)
con |
Connection to the database |
A lazy table
data.skov_struktur
This table holds information about the diameter a trunk should have for it to be considered "big".
ndb_get_skov_trae_diam_naturtype(con)ndb_get_skov_trae_diam_naturtype(con)
con |
Connection to the database |
A lazy table
list.trae_diameter_naturtype
Get the skov traeregistrering table
ndb_get_skov_traeregistrering(con)ndb_get_skov_traeregistrering(con)
con |
Connection to the database |
A lazy table
data.skov_traeregistrering
Get the stamdata table
ndb_get_stamdata(con)ndb_get_stamdata(con)
con |
Connection to the database |
Get table from the database and filter to only rows with
formaal == "NOVANA" and ansvarligMyndighed == "Miljøstyrelsen".
Joins (both the semijoin on novana and the left_join on kv) are
done on columns progId and aktId.
ndb_get_tbl(con, tbl, novana, kv = NULL)ndb_get_tbl(con, tbl, novana, kv = NULL)
con |
Connection to the database |
tbl |
chr Name of table |
novana |
lazy table Table with the progId & aktId that are from the NOVANA monitoring program. |
kv |
lazy table An optional table with status ("Godkendt data", "Indtastning påbegyndt") for the data. |
The filtering is done via the dplyr::semi_join() on novana
which is a table of progId and aktId where only the
entries relevant for NOVANA are included.
lazy table
The names of the variables (aka. indikators) used on Naturdata_kalo are not well suited for use on a webpage or figure. This function will return a more meaningful, but verbose string.
ndb_label(data, ...)ndb_label(data, ...)
data |
Vector of variable names |
... |
Not used |
data as label
ndb_label("val_vegetationshoejde_avg_5m")ndb_label("val_vegetationshoejde_avg_5m")
A named vector of variable (aka. indicator) labels. Not particularly useful
on its own, but is the basis for the rename function ndb_label()
ndb_labelsndb_labels
A named vector of length 130
A named vector of variable (aka. indicator) names. Not particularly useful
on its own, but is the basis for the rename function ndb_rename()
ndb_namesndb_names
A named vector of length 130
The names of naturdata_kalo where updated in 2024. This function makes it easy to convert old variable names to the new ones.
ndb_rename(data, ...)ndb_rename(data, ...)
data |
Vector of names or a data.frame with old names |
... |
Not used |
Names in the deprecated Frekvens and Cover tables where not unique. To ensure that the function returns the correct names prefix the old names with "frekvens_" or "cover_". The ambigous names where klokkelyng, hedelyng, blåtop, stjerne-bredribbe and revling.
data renamed
ndb_rename("antal_unik") tibble::tibble( "enebaer_5m" = 1:3, "unknown" = "A") |> ndb_rename()ndb_rename("antal_unik") tibble::tibble( "enebaer_5m" = 1:3, "unknown" = "A") |> ndb_rename()
cover to percentFor communication of results it is often desirable
to use percent rather than . This function converts
measurements in to percent using the convention that
a 5m circle has an area of 78.5 and a 15m circle an area of
706.9
ndb_to_percent(value, radius = NA)ndb_to_percent(value, radius = NA)
value |
A measurement in |
radius |
Radius of the circle (either 5 or 15) |
Utility function to update the status, rundate and version in the import.jobs table
ndb_update_import_job(con, job, status, version)ndb_update_import_job(con, job, status, version)
con |
OdbcConnection |
job |
chr JobId e.g. "data.lysaaben_struktur" |
status |
chr The status which can be either running or complete |
version |
chr The current version of the project. This is passed to the script as a an input argument. See naturdata-raw-to-data |
TRUE, invisibly
This a modified version of ndb_write_to_db() where
the rows in name are written species by species to
avoid writing too large chunks.
ndb_write_artsliste(con, tbl, schema = "data", name)ndb_write_artsliste(con, tbl, schema = "data", name)
con |
OdbcConnection to |
tbl |
tibble The table to be written |
schema |
chr Name of the schema to write to |
name |
chr Name of table to be written to |
The function will first call DELETE FROM before appending the
contents of tbl.
ndb_write_to_db(con, tbl, schema = "data", name)ndb_write_to_db(con, tbl, schema = "data", name)
con |
OdbcConnection to |
tbl |
tibble The table to be written |
schema |
chr Name of the schema to write to |
name |
chr Name of table to be written to |
TRUE, invisibly
A simple vector of marine and aquatic naturtypeId's. These are used for filtering tables to contain only the terrestrial types.
nogo_naturtypernogo_naturtyper
A vector of 21 integer values
Replace NA in a column with 0 when conditions based on progId
and optionally naturtypeId are true.
replace_na_if( tbl, column, progIds, naturtypeIds = NULL, year = NULL, year_cond )replace_na_if( tbl, column, progIds, naturtypeIds = NULL, year = NULL, year_cond )
tbl |
Input table |
column |
name The name of the column to replace values in |
progIds |
int The progIds to modify values within |
naturtypeIds |
int The naturtypeIds to modify values within |
year |
int The year to use in the condition. See |
year_cond |
chr The condition to apply on |
tbl with column modified.