Title: | Download and Integrate Zooplankton Datasets from the Upper San Francisco Estuary |
---|---|
Description: | This package downloads and integrates zooplankton datasets from the Sacramento San Joaquin Delta. Datasets are manipulated into a consistent format and bound together, then differences in taxonomic resolution among datasets are resolved using one of two methods, depending on whether the user wishes to analyze community or taxa-specific trends. Ancillary environmental data are retained in the final dataset. Users can also filter the dataset by a number of parameters. |
Authors: | Samuel M Bashevkin [aut, cre] , Rosemary Hartman [aut] , Karrin Alstad [aut], Catarina Pien [aut] |
Maintainer: | Samuel M Bashevkin <[email protected]> |
License: | GPL-3 |
Version: | 2.5.0.9000 |
Built: | 2024-11-25 21:33:14 UTC |
Source: | https://github.com/InteragencyEcologicalProgram/zooper |
Length-weight equations for macro zooplankton to be used for biomass conversions. The equations relate length in mm to dry mass in milligrams. Dry mass can be converted to carbon mass by assuming 40% carbon content.
Uye, S. 1982. Length-weight relationships of important zooplankton from the Inland Sea of Japan. Journal of the Oceanographical Society of Japan 38:149–158.
biomass_macro
a tibble with 23 rows and 9 columns.
Current scientific name.
Taxonomic level of the taxa.
Preservative used to store sample before individuals were measured to develop the equations.
The type of weight measurement.
The number of individuals used in developing the equation.
Minimum length (mm) of individuals used in developing the equation.
Maximum length (mm) of individuals used in developing the equation.
Coefficient a in the equation Weight (mg) = a * Length (mm) ^ b.
Coefficient b in the equation Weight (mg) = a * Length (mm) ^ b.
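As a rough illustration of how the coefficients might be applied, the sketch below evaluates Weight (mg) = a * Length (mm) ^ b for a few lengths and then converts the result to carbon mass. The values of a and b are made up for illustration (real values would come from this table), the sketch assumes the equation returns dry mass, and the 40% carbon fraction is an assumption rather than a value taken from the package code.

```r
# Hypothetical coefficients, for illustration only.
a <- 0.005
b <- 2.5

# Assumed fraction of dry mass that is carbon (40%).
carbon_fraction <- 0.4

length_mm <- c(2, 4, 8)

# Dry mass in mg from the length-weight equation: Weight = a * Length^b
dry_mass_mg <- a * length_mm^b

# Carbon mass in micrograms (1 mg = 1000 micrograms)
carbon_ug <- dry_mass_mg * carbon_fraction * 1000

data.frame(length_mm, dry_mass_mg, carbon_ug)
```

Lengths outside the minimum and maximum used to develop an equation would be extrapolations, so it may be worth checking them against those two columns before converting.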
Average carbon biomass values for meso and micro zooplankton to be used for biomass conversions
biomass_mesomicro
a tibble with 44 rows and 4 columns.
Current scientific name.
Taxonomic level of the taxa.
Plankton lifestage.
Average carbon mass of an individual in micrograms.
Calculates taxa by life stage combos present in all source datasets
Commontaxer(Source_taxa_key, Taxa_level, Size_class)
Source_taxa_key |
A dataframe with columns named Source, Lifestage, SizeClass, and the value provided to the parameter |
Taxa_level |
Taxonomic level you would like to perform this calculation for. E.g., if you wish to determine all Genus x lifestage combinations present in all datasets, provide |
Size_class |
The size class(es) you would like this function to consider. You should generally only supply 1 size class. |
This function is designed to work on just one size class. To apply to multiple size classes, use map or apply functions to apply across size classes.
A tibble with a column for Taxa_level and another for Lifestage representing all combinations of these values present in all source datasets.
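The idea of "combinations present in all source datasets" can be sketched in base R as a set intersection across sources. This is only an illustration of the concept with a made-up key table, not the package's implementation; the source names and taxa below are hypothetical.

```r
# Toy source-by-taxon key; all values are made up for illustration.
key <- data.frame(
  Source    = c("A", "A", "B", "B", "C", "C"),
  Genus     = c("Genus1", "Genus2", "Genus1", "Genus2", "Genus1", "Genus3"),
  Lifestage = c("Adult", "Adult", "Adult", "Juvenile", "Adult", "Adult"),
  stringsAsFactors = FALSE
)

# Unique taxon-by-lifestage combinations within each source
combos <- unique(key[, c("Source", "Genus", "Lifestage")])
labels_by_source <- split(paste(combos$Genus, combos$Lifestage, sep = "|"),
                          combos$Source)

# Keep only the combinations that occur in every source
common <- Reduce(intersect, labels_by_source)
common
```

Here only the Genus1 adult combination survives, because Genus2 appears with different life stages in sources A and B and is absent from source C.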
Sam Bashevkin
Zoopsynther, crosswalk, SourceTaxaKeyer
## Not run:
library(rlang)
library(purrr)
SourceTaxaKey <- SourceTaxaKeyer(zoopComb, crosswalk)
Size_classes <- set_names(c("Micro", "Meso", "Macro"))
Commontax <- map(Size_classes, ~ Commontaxer(SourceTaxaKey, "Taxname", .))
## End(Not run)
A complete list of all valid taxonomic names included in the full dataset. Used to limit choices for filtering by taxa.
completeTaxaList
a character vector of length 454.
Taxnamefinder, Zoopsynther, zooper
A crosswalk table relating the taxonomic code used by each dataset to current scientific names, life stages, and taxonomic hierarchies.
crosswalk
a tibble with 404 rows and 34 variables
Taxonomic codes used in the Environmental Monitoring Program microzooplankton (43 µm mesh) dataset
Taxonomic codes used in the Environmental Monitoring Program mesozooplankton (160 µm mesh) dataset
Taxonomic codes used in the Environmental Monitoring Program macrozooplankton (505 µm mesh) dataset
Taxonomic codes used in the Townet Survey mesozooplankton (160 µm mesh) dataset
Taxonomic codes used in the Townet Survey macrozooplankton (505 µm mesh) dataset
Taxonomic codes used in the Fall Midwater Trawl mesozooplankton (160 µm mesh) dataset
Taxonomic codes used in the Fall Midwater Trawl macrozooplankton (505 µm mesh) dataset
Taxonomic codes used in the 20mm Survey mesozooplankton (160 µm mesh) dataset
Taxonomic codes used in the Fish Restoration Program mesozooplankton (150 µm mesh) dataset
Taxonomic codes used in the Fish Restoration Program macrozooplankton (500 µm mesh) dataset
Taxonomic codes used in the Directed Outflow Project mesozooplankton (150 µm mesh) dataset
Taxonomic codes used in the Directed Outflow Project macrozooplankton (500 µm mesh) dataset
Taxonomic codes used in the Yolo Bypass Fish Monitoring Program zooplankton dataset
Plankton lifestage
Current scientific name
Taxonomic level of the taxa
Phylum
Class
Order
Family
Genus
Species
Introduction year for non-native species
First year the Environmental Monitoring Program started counting this taxon
Last year the Environmental Monitoring Program counted this taxon
First year the Fall Midwater Trawl started counting this taxon
Last year the Fall Midwater Trawl counted this taxon
First year the 20mm Survey started counting this taxon
Last year the 20mm Survey counted this taxon
First year the 20mm Survey restarted counting this taxon
First year DOP started counting this taxon
Last year DOP counted this taxon
Zoopdownloader, Zoopsynther, zooper
Sums to least common denominator taxa, one taxonomic level at a time
LCD_Taxa( Data, Taxalevel, Groupers = c("Genus_g", "Family_g", "Order_g", "Class_g", "Phylum_g"), Response = "CPUE" )
Data |
Zooplankton dataset including columns named the same as the |
Taxalevel |
The value of Groupers on which to apply this function. |
Groupers |
A character vector of names of additional taxonomic levels to be removed in this step. This vector can include |
Response |
Which response variable(s) would you like for the zooplankton data? Choices are "CPUE" (catch per unit effort) and "BPUE" (carbon biomass per unit effort ( |
This function is designed to work on just one Taxalevel at a time. To apply to multiple Taxalevels, use map or apply functions to apply across taxonomic levels.
A tibble with sums calculated for each unique value in Data$Taxalevel. Sums will be excluded for grouping taxa that only contain 1 unique Taxname.
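The return value described above can be mimicked on a toy table in base R. The sketch sums CPUE within each family and drops families that contain only one unique Taxname. It is an illustration of the stated behaviour, not the package's code; the column SampleID and the choice to sum within samples are assumptions, and all values are made up.

```r
# Toy data; taxa, families, and counts are made up for illustration.
toy <- data.frame(
  SampleID = c("s1", "s1", "s1", "s2", "s2"),
  Taxname  = c("Taxon A1", "Taxon A2", "Taxon B1", "Taxon A1", "Taxon B1"),
  Family_g = c("Family A", "Family A", "Family B", "Family A", "Family B"),
  CPUE     = c(10, 5, 2, 4, 1),
  stringsAsFactors = FALSE
)

# Count unique Taxname values per family; keep families with more than one
n_taxa <- tapply(toy$Taxname, toy$Family_g, function(x) length(unique(x)))
keep <- names(n_taxa)[n_taxa > 1]

# Sum CPUE per sample within the retained families
family_sums <- aggregate(CPUE ~ SampleID + Family_g,
                         data = toy[toy$Family_g %in% keep, ],
                         FUN = sum)
family_sums
```

Family B is excluded because it holds a single Taxname, so a family-level sum would add nothing beyond the original record.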
Sam Bashevkin
Zoopsynther, crosswalk, zoopComb
## Not run:
library(dplyr)
df <- zoopComb %>%
  mutate(dplyr::across(tidyselect::all_of(c("Genus", "Family", "Order", "Class", "Phylum")),
                       list(g = ~ if_else(. %in% completeTaxaList, ., NA_character_)))) %>%
  select(-Phylum, -Class, -Order, -Family, -Genus, -Species, -Taxlifestage)
family_sums <- LCD_Taxa(df, "Family_g")
## End(Not run)
Computes a dataframe with all unique taxa by lifestage combinations present in each source and size class
SourceTaxaKeyer(Data, Crosswalk)
Data |
Zooplankton dataset. Must have a column named |
Crosswalk |
Crosswalk table (e.g., |
a tibble with the complete taxonomic information for each combination of source and size class.
Sam Bashevkin
Zoopsynther, crosswalk, zoopComb
SourceTaxaKey <- SourceTaxaKeyer(Data = dplyr::filter(zoopComb, Source!="YBFMP"), Crosswalk = crosswalk)
First dates sampled by each survey and size class
startDates
a tibble with 14 rows and 3 columns.
Abbreviated name of the source dataset. "EMP"=Environmental Monitoring Program, "FRP"=Fish Restoration Program, "FMWT"= Fall Midwater Trawl, "STN"= Townet Survey, "20mm" =20mm survey, "DOP" = Directed Outflow Project Lower Trophic Study, and "YBFMP"= Yolo Bypass Fish Monitoring Program.
Net size class. Micro corresponds to 43 (EMP) or 50 (YBFMP) µm mesh, Meso corresponds to 150 (FRP and DOP) or 160 (EMP, FMWT, STN, 20mm, YBFMP) µm mesh, and Macro corresponds to 500 (FRP and DOP) - 505 (EMP, FMWT, STN) µm mesh. However, prior to 1974 EMP macrozooplankton were sampled with a 930 µm mesh net.
Date first sample was collected.
Latitudes and longitudes for each zooplankton station.
stations
a tibble with 387 rows and 4 columns
Abbreviated name of the source dataset
Sampling station name
Latitude in decimal degrees
Longitude in decimal degrees
Latitudes and longitudes for EMP EZ stations on each sampling date from 2004 to present.
stationsEMPEZ
a tibble with 491 rows and 4 columns
Date sample was collected
Sampling station name
Latitude in decimal degrees
Longitude in decimal degrees
Helps filter the zooplankton dataset by returning a set of lowest-level taxa (i.e. the level taxa were recorded at when counted in plankton samples) within a vector of taxa (which can include taxa from any taxonomic level).
Taxnamefinder(Crosswalk, Taxa)
Crosswalk |
Crosswalk table (such as |
Taxa |
A character vector of taxa you wish to select. These taxa can be from any taxonomic level present in the list above. If using the built-in data and crosswalk, they should be present in the |
A character vector of scientific names contained within the vector of Taxa provided.
Sam Bashevkin
Taxnames <- Taxnamefinder(crosswalk, c("Calanoida", "Cyclopoida"))
Detects years when a species was present in the system (i.e., post-invasion for invasive species) but not counted in each zooplankton survey
Uncountedyears(Source, Size_class, Crosswalk, Start_year, Intro_lag)
Source |
String with the name of the source dataset (e.g., |
Size_class |
String with the name of the desired zooplankton size class (e.g., |
Crosswalk |
Crosswalk table like |
Start_year |
First year the |
Intro_lag |
Number of years buffer after a species is introduced when we expect surveys to start recording them. Effectively adds |
This function is designed to work on one source and size class at a time. To apply across multiple IDs, use the map or apply functions.
A tibble with columns for the Taxlifestage, Taxname, Lifestage, Source, Sizeclass, and then a list-column of years in which that particular taxon was not counted in the specified study and size class. Taxa that were counted in all applicable years are not included in the tibble.
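The logic can be pictured for a single taxon with a small base R helper. This is a hypothetical sketch, not the package's implementation: the function name, its arguments, and the rule that a taxon is expected from the later of the survey start year and the introduction year plus the lag are all assumptions made for illustration.

```r
# Hypothetical helper, not part of zooper: years in which a taxon was
# present and the survey was running, but the taxon was not counted.
uncounted_years <- function(intro_year, count_start, count_end,
                            survey_start, intro_lag, last_year) {
  # First year the survey could be expected to record the taxon.
  # An NA introduction year (e.g., a native species) is ignored.
  first_expected <- max(survey_start, intro_year + intro_lag, na.rm = TRUE)
  if (first_expected > last_year) {
    return(integer(0))
  }
  expected <- seq(first_expected, last_year)
  counted <- seq(count_start, count_end)
  setdiff(expected, counted)
}

# Made-up example: introduced in 1987, first counted in 1995
uncounted_years(intro_year = 1987, count_start = 1995, count_end = 2020,
                survey_start = 1972, intro_lag = 2, last_year = 2020)
```

With these made-up inputs the taxon is expected from 1989 onward, so the years 1989 through 1994 are flagged as uncounted.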
Sam Bashevkin
Zoopsynther, crosswalk, startDates
require(purrr)
require(dplyr)
require(lubridate)
datasets <- zooper::zoopComb %>%
  mutate(names = paste(Source, SizeClass, sep = "_")) %>%
  select(names, Source, SizeClass) %>%
  filter(Source %in% c("EMP", "FMWT", "twentymm")) %>%
  distinct()
BadYears <- map2_dfr(datasets$Source, datasets$SizeClass,
                     ~ Uncountedyears(Source = .x,
                                      Size_class = .y,
                                      Crosswalk = zooper::crosswalk,
                                      Start_year = zooper::startDates %>%
                                        filter(Source == .x & SizeClass == .y) %>%
                                        pull(Startdate) %>%
                                        year(),
                                      Intro_lag = 2))
A table listing the taxonomic names and life stages of plankton undersampled by each net mesh size (i.e. size class)
undersampled
a tibble with 27 rows and 3 columns
The size class of zooplankton intended to be captured by each net mesh size. Micro corresponds to 43 µm mesh and Meso corresponds to 150-160 µm mesh.
The scientific name of taxa undersampled by the corresponding mesh size class
The lifestage of each taxa undersampled by the corresponding mesh size class
Extract latest EDI files
This function extracts the latest version of a zooplankton EDI package and the list of files from that package
zoop_urls(Sources)
Sources |
Source datasets to be included. Choices include "EMP" (Environmental Monitoring Program), "FRP" (Fish Restoration Program), "FMWT" (Fall Midwater Trawl), "STN" (Townet Survey), "DOP" (Directed Outflow Project), and "20mm" (20mm survey). The YBFMP datasets cannot be used in this function due to taxonomic and life stage issues with that dataset. Defaults to |
A list with the files and/or URLs for each source dataset
Sam Bashevkin
Convert CPUE to biomass
This function converts zooplankton CPUE to carbon biomass (carbon biomass per unit effort, µg/m³) for taxa with conversion equations.
Zoopbiomass( Zoop, ZoopLengths, Biomass_mesomicro = zooper::biomass_mesomicro, Biomass_macro = zooper::biomass_macro )
Zoop |
Zooplankton count dataset |
ZoopLengths |
Zooplankton length dataset for macrozooplankton. |
Biomass_mesomicro |
The micro and meso zooplankton biomass conversion table. The default is |
Biomass_macro |
The macro zooplankton biomass conversion table. The default is |
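For micro and meso zooplankton, the conversion amounts to multiplying each CPUE value by an average carbon mass per individual. The base R sketch below shows that arithmetic on toy tables. It is not the function's implementation; the column name Carbon_mass_micro_g and all values are assumptions made for illustration.

```r
# Toy CPUE table (individuals per cubic metre); values are made up.
cpue <- data.frame(
  Taxname   = c("Taxon A", "Taxon A", "Taxon B"),
  Lifestage = c("Adult", "Juvenile", "Adult"),
  CPUE      = c(100, 250, 40),
  stringsAsFactors = FALSE
)

# Toy conversion table (micrograms of carbon per individual).
# The column name is an assumption, not confirmed by this document.
conversions <- data.frame(
  Taxname             = c("Taxon A", "Taxon A"),
  Lifestage           = c("Adult", "Juvenile"),
  Carbon_mass_micro_g = c(3.0, 0.5),
  stringsAsFactors = FALSE
)

# Left join so taxa without a conversion value stay in the table as NA
out <- merge(cpue, conversions, by = c("Taxname", "Lifestage"), all.x = TRUE)

# Biomass per unit effort = individuals per m^3 * carbon mass per individual
out$BPUE <- out$CPUE * out$Carbon_mass_micro_g
out
```

Taxon B has no conversion value in the toy table, so its BPUE is NA, which matches the statement that biomass is only available for taxa with conversion equations.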
All source zooplankton datasets combined into one tibble.
zoopComb
a tibble with 3,615,105 rows and 14 columns.
Abbreviated name of the source dataset. "EMP"=Environmental Monitoring Program, "FRP"=Fish Restoration Program, "FMWT"= Fall Midwater Trawl, "STN"= Townet Survey, "20mm" =20mm survey, "DOP" = Directed Outflow Project Lower Trophic Study, and "YBFMP"= Yolo Bypass Fish Monitoring Program.
Net size class. Micro corresponds to 43-50 µm mesh, Meso corresponds to 150-160 µm mesh, and Macro corresponds to 500-505 µm mesh. However, prior to 1974 EMP macrozooplankton were sampled with a 930 µm mesh net.
Volume (m³) of water sampled
Zooplankton life stage
Scientific name
Phylum
Class
Order
Family
Genus
Species
Combined Taxname and Lifestage
Unique ID of the zooplankton sample. This key and SizeClass link to the zoopEnvComb dataset
Catch per unit effort (number m⁻³)
Carbon biomass per unit effort (µg/m³)
Note that EMP Macro samples with QAQC flags (any value of AmphipodCode other than "A") have had their Amphipod CPUE set to NA in this integrated dataset. For more information on the source datasets see zooper.
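Because the catch data and the environmental data are stored separately, a common first step is to join them on the sample identifier. The sketch below shows the idea with toy tables in base R. The column names SampleID and Temperature are assumptions (column names are not shown in this listing), and the values are made up.

```r
# Toy catch table; one row per taxon per sample.
zoop <- data.frame(
  SampleID  = c("s1", "s1", "s2"),
  SizeClass = c("Meso", "Meso", "Meso"),
  Taxname   = c("Taxon A", "Taxon B", "Taxon A"),
  CPUE      = c(10, 5, 7),
  stringsAsFactors = FALSE
)

# Toy environmental table; one row per sample.
env <- data.frame(
  SampleID    = c("s1", "s2"),
  Temperature = c(18.2, 20.1),
  stringsAsFactors = FALSE
)

# Left join keeps every catch row and attaches the sample's environment
joined <- merge(zoop, env, by = "SampleID", all.x = TRUE)
joined
```

The same join could be written with dplyr::left_join(zoop, env, by = "SampleID") when dplyr is already loaded.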
Zoopdownloader, Zoopsynther, zooper
This function downloads all IEP zooplankton datasets from the internet, converts them to a consistent format, binds them together, and exports the combined dataset as .Rds R data files and/or an R object. Datasets currently include "EMP" (Environmental Monitoring Program), "FRP" (Fish Restoration Program), "FMWT" (Fall Midwater Trawl), "STN" (Townet Survey), "20mm" (20mm survey), "DOP" (Directed Outflow Project Lower Trophic Study), and "YBFMP" (Yolo Bypass Fish Monitoring Program).
Zoopdownloader( Data_sets = c("EMP_Meso", "FMWT_Meso", "STN_Meso", "20mm_Meso", "FRP_Meso", "EMP_Micro", "FRP_Macro", "EMP_Macro", "FMWT_Macro", "STN_Macro", "DOP_Meso", "DOP_Macro"), Biomass = TRUE, Data_folder = tempdir(), Save_object = TRUE, Return_object = FALSE, Return_object_type = "List", Redownload_data = FALSE, Download_method = "auto", Zoop_path = file.path(Data_folder, "zoopforzooper"), Env_path = file.path(Data_folder, "zoopenvforzooper"), Crosswalk = zooper::crosswalk, Stations = zooper::stations )
Data_sets |
Datasets to include in combined data. Choices include "EMP_Meso", "FMWT_Meso", "STN_Meso", "20mm_Meso", "FRP_Meso", "YBFMP_Meso", "EMP_Micro", "YBFMP_Micro", "FRP_Macro", "EMP_Macro", "FMWT_Macro", "STN_Macro", "DOP_Macro", and "DOP_Meso". Defaults to including all datasets except the two YBFMP datasets. |
Biomass |
Whether to add carbon biomass (carbon biomass per unit effort ( |
Data_folder |
Path to folder in which source datasets are stored, and to which you would like datasets to be downloaded if you set |
Save_object |
Should the combined data be saved to disk? Defaults to |
Return_object |
Should data be returned as an R object? If |
Return_object_type |
If |
Redownload_data |
Should source datasets be redownloaded from the internet? Defaults to |
Download_method |
Method used to download files. See argument |
Zoop_path |
File path specifying the folder and filename of the zooplankton dataset. Defaults to |
Env_path |
File path specifying the folder and filename of the dataset with accessory environmental parameters. Defaults to |
Crosswalk |
Crosswalk table to be used for conversions. Must have columns named for each unique combination of source and size class with an underscore separator, as well as all taxonomic levels Phylum through Species, Taxname (full scientific name) and Lifestage. See |
Stations |
Latitudes and longitudes for each unique station. See |
Note that EMP Macro samples with QAQC flags (any value of AmphipodCode other than "A") have had their Amphipod CPUE set to NA in this function. For more information on the source datasets see zooper.
If Return_object = TRUE, returns the combined dataset as a list or tibble, depending on whether Return_object_type is set to "List" or "Combined". If Save_object = TRUE, writes 2 .Rds files to disk: one with the zooplankton catch data and another with accessory environmental parameters.
Sam Bashevkin
Zoopsynther, crosswalk, stations, zooper
## Not run:
Data <- Zoopdownloader(Data_folder = tempdir(), Return_object = TRUE,
                       Save_object = FALSE, Redownload_data = TRUE)
## End(Not run)
Accessory environmental data from the combined zooplankton dataset. Not all datasets report all environmental parameters.
zoopEnvComb
a tibble with 44,690 rows and 20 columns
Abbreviated name of the source dataset. "EMP"=Environmental Monitoring Program, "FRP"=Fish Restoration Program, "FMWT"= Fall Midwater Trawl, "STN"= Townet Survey, "20mm" =20mm survey, "DOP" = Directed Outflow Project Lower Trophic Study, and "YBFMP"= Yolo Bypass Fish Monitoring Program.
Year sample was collected
Date sample was collected
Date and time sample was collected, if time was provided
Sample collection method identifying whether each tow was a surface, bottom, oblique, or vertical pump sample
Tidal stage
Station where sample was collected. This is the key that links to the stations dataset
Chlorophyll concentration in µg/L
Secchi depth in cm
Temperature in °C.
Total depth of the water column in m
Water turbidity in Nephelometric Turbidity Units
Water turbidity in Formazin Nephelometric Units
Intensity of Microcystis bloom coded qualitatively from 1-5 where 1 = absent, 2 = low, 3 = medium, 4 = high, 5 = very high
Water pH
Dissolved oxygen in mg/L
Surface salinity in PPT
Bottom salinity in PPT
Latitude in decimal degrees
Longitude in decimal degrees
Code indicating sample quality for EMP macro amphipod samples (A=valid, B=questionable [veg/algal bloom in net], C=not valid [error in lab], D=suspect [possible missing data])
Unique ID of the zooplankton sample. This is the key that links to the zoopComb dataset
Zoopdownloader, Zoopsynther, zooper
This package contains functions, lookup tables, and 2 built-in pre-combined datasets (one with the zooplankton data and another with the environmental data).
The built-in datasets were combined with the Zoopdownloader function. These may be out of date.
The EMP zooplankton survey is run by the California Department
of Fish and Wildlife. Zooplankton were first collected in 1972. It samples monthly at 17 fixed stations,
2 floating entrapment zone stations, and 3 stations in Carquinez Strait and San Pablo Bay that are only
sampled during high outflow and low salinity conditions. EMP samples using micro (43 µm),
meso (160 µm), and macro (505 µm) zooplankton nets. Note that additional Amphipod data with
quality issues (e.g., vegetation in net) are available in the EMP data publication.
Data are available here.
The 20-mm survey is run by the California Department of Fish and Wildlife.
Zooplankton were first collected in 1995. Zooplankton are collected concurrently with fish
samples at 41-55 fixed open-channel stations per year. Samples are collected twice per month
between March and July. Only mesozooplankton are collected, with a 160 µm mesh net.
Data are available here.
The FMWT and STN are run by the
California Department of Fish and Wildlife. FMWT samples are collected monthly between September
and December from a subset of the 122 fixed open-channel stations. STN samples are collected
monthly between June and August from 40 fixed open-channel stations. Macrozooplankton have been
collected since 2007 with a 505 µm mesh net while mesozooplankton have been collected
since 2005 with a 160 µm mesh net.
Data are available here.
Supplemental sampling from the Suisun Marsh Salinity Control Gate study data are also included
and those data can be found here.
FRP is run by the California Department of Fish and Wildlife.
Zooplankton were first collected in 2015. Samples are collected monthly between March and December
in shallow-water habitats near marshes. FRP samples with meso (150 µm) and macro (500 µm)
zooplankton nets. Data are available here.
The Directed Outflow Project Lower Trophic Study
is run by ICF for the United States Bureau of Reclamation. Zooplankton were first collected in fall 2017.
Samples were collected once every two weeks in 2017 and weekly thereafter. Sampling is conducted in the fall
and, starting in 2019, spring and summer seasons have also been sampled. Three sampling stations per region
are randomly selected for 5 regions (Suisun Bay, Suisun Marsh, Lower Sac. River, Cache Slough, Sac Ship Channel).
In 2017, stations were sampled in 3 additional regions: West of the Benicia Bridge, Lower San Joaquin, and
Upper Sac River. At each station, sample collection is attempted at both shoal (<=10 feet) and channel (>10 feet) habitat.
Channels are sampled at the surface and, if deeper than 20 feet, also at the bottom 1/2 to 1/3 of the water column.
DOP samples with meso (150 µm) and macro (500 µm) zooplankton nets.
Data are available here.
YBFMP is run by the California Department of Water Resources.
Zooplankton were first collected in 1999. Samples are from 2 sites, one in the Yolo Bypass and one in the Sacramento River.
In 1999 and 2000, samples were collected for a couple of months each year; sampling then expanded to roughly winter and spring from 2001 to 2010.
Since 2011, samples have been collected twice monthly during most of the year, and weekly when the bypass is inundated.
YBFMP samples with micro (50 µm) and meso (150 µm) zooplankton nets.
Data are available here.
Maintainer: Samuel M Bashevkin [email protected] (ORCID)
Authors:
Rosemary Hartman [email protected] (ORCID)
Karrin Alstad [email protected]
Catarina Pien [email protected] (ORCID)
This function returns an integrated zooplankton dataset with taxonomic issues resolved, according to user-specifications, along with important caveats about the data. It requires the output of the Zoopdownloader function to run. This can be provided either as a list or paths to saved .Rds files generated by the Zoopdownloader function. The function defaults to loading pre-packaged combined datasets (which may be outdated).
Zoopsynther( Data_type = NULL, Zoop = zooper::zoopComb, ZoopEnv = zooper::zoopEnvComb, Zoop_path = NULL, Env_path = NULL, Sources = c("EMP", "FRP", "FMWT", "STN", "20mm", "DOP"), Size_class = c("Micro", "Meso", "Macro"), Time_consistency = FALSE, Intro_lag = 2, Response = "CPUE", Taxa = NULL, Date_range = c(NA, NA), Months = NA, Years = NA, Sal_bott_range = NA, Sal_surf_range = NA, Temp_range = NA, Lat_range = NA, Long_range = NA, Reload_data = F, Redownload_data = F, All_env = T, Shiny = F, Crosswalk = zooper::crosswalk, Undersampled = zooper::undersampled, CompleteTaxaList = zooper::completeTaxaList, StartDates = zooper::startDates, ... )
Data_type |
What type of data are you looking for? This option allows you to choose a final output dataset for either community ( |
Zoop |
Zooplankton data. You must provide the "Zooplankton" element from the list returned from |
ZoopEnv |
Accessory environmental data. You must provide the "Environment" element from the list returned from |
Zoop_path |
If you wish to save time by saving the combined zooplankton datasets returned from the |
Env_path |
If you wish to save time by saving the combined zooplankton datasets returned from the |
Sources |
Source datasets to be included. Choices include "EMP" (Environmental Monitoring Program), "FRP" (Fish Restoration Program), "FMWT" (Fall Midwater Trawl), "STN" (Townet Survey), "DOP" (Directed Outflow Project), and "20mm" (20mm survey). The YBFMP datasets cannot be used in this function due to taxonomic and life stage issues with that dataset. Defaults to |
Size_class |
Zooplankton size classes (as defined by net mesh sizes) to be included in the integrated dataset. Choices include "Micro" (43 |
Time_consistency |
Would you like to apply a fix to enforce consistent taxonomic resolution over time? Only available for the Community option. |
Intro_lag |
Only applicable if |
Response |
Which response variable(s) would you like for the zooplankton data? Choices are "CPUE" (catch per unit effort) and "BPUE" (carbon biomass per unit effort ( |
Taxa |
If you only wish to include a subset of taxa, provide a character vector of the taxa you wish included. This can include taxa of any taxonomic level (e.g., |
Date_range |
Range of dates to include in the final dataset. To filter within a range of dates, include a character vector of 2 dates formatted in the yyyy-mm-dd format exactly, specifying the upper and lower bounds. To specify an infinite upper or lower bound (to include all values above or below a limit) input |
Months |
Months (as integers) to be included in the integrated dataset. If you wish to only include data from a subset of months, input a vector of integers corresponding to the months you wish to be included. Defaults to |
Years |
Years to be included in the integrated dataset. If you wish to only include data from a subset of years, input a vector of years you wish to be included. Defaults to |
Sal_bott_range |
Filter the data by bottom salinity values. Include a vector of length 2 specifying the minimum and maximum values you wish to include. To include all values above or below a limit, utilize Inf or -Inf for the upper or lower bound respectively. Defaults to |
Sal_surf_range |
Same as previous, but for surface salinity. |
Temp_range |
Same as |
Lat_range |
Latitude range to include in the final dataset. Include a vector of length 2 specifying the minimum and maximum values you wish to include, in decimal degree format. Defaults to |
Long_range |
Same as previous, but for longitude. Don't forget that Longitudes should be negative in the Delta! |
Reload_data |
If set to |
Redownload_data |
Should data be re-downloaded from the internet? If set to |
All_env |
Should all environmental parameters be included? Defaults to |
Shiny |
Is this function being used within the shiny app? If set to |
Crosswalk |
Crosswalk table to be used for conversions. Must have columns named for each unique combination of source and size class with an underscore separator, as well as all taxonomic levels Phylum through Species, Taxname (full scientific name) and Lifestage. See |
Undersampled |
A table listing the taxonomic names and life stages of plankton undersampled by each net mesh size (i.e. size class). See |
CompleteTaxaList |
Character vector of all taxonomic names in source datasets. Defaults to |
StartDates |
Tibble with the starting dates of each source dataset. Defaults to |
... |
Arguments passed to |
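The range arguments above all follow the same pattern: a vector of two bounds, where an open-ended side leaves that side unrestricted. The helper below is a hypothetical base R sketch of that pattern, not the function's internal code. It treats NA as an open bound, which is an assumption based on the Date_range = c(NA, NA) default in the usage, and it also accepts Inf and -Inf as described for the salinity ranges.

```r
# Hypothetical helper, not part of zooper: TRUE where x lies within
# range[1]..range[2]. An NA bound is treated as unbounded (assumption).
in_range <- function(x, range) {
  x <- as.numeric(x)          # Dates become days since 1970-01-01
  range <- as.numeric(range)
  lower <- if (is.na(range[1])) -Inf else range[1]
  upper <- if (is.na(range[2])) Inf else range[2]
  x >= lower & x <= upper
}

# Dates on or after 1990-10-01, with no upper limit
dates <- as.Date(c("1989-06-15", "1995-03-01", "2001-11-20"))
in_range(dates, as.Date(c("1990-10-01", NA)))

# Bottom salinity of at most 6, with no lower limit
salinity <- c(0.5, 6, 12.3)
in_range(salinity, c(-Inf, 6))
```

Converting everything to numeric keeps one code path for dates, salinity, temperature, and coordinates; both bounds are inclusive in this sketch.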
This function combines any combination of the zooplankton datasets (included as parameters)
and calculates least common denominator taxa to facilitate comparisons across datasets with differing
levels of taxonomic resolution. For more information on the source datasets see zooper.
An integrated zooplankton dataset.
The Data_type parameter toggles between two approaches to resolving differences in taxonomic resolution.
If you want all available data on given Taxa, use Data_type = "Taxa", but if you want to conduct a community
analysis, use Data_type = "Community".
Briefly, Data_type = "Community" optimizes for community-level analyses by taking all taxa x life stage
combinations that are not measured in every input dataset, and summing them up taxonomic levels to the lowest
taxonomic level they belong to that is covered by all datasets. Remaining Taxa x life stage combos that are not
covered in all datasets up to the phylum level (usually something like Annelida or Nematoda or Insect Pupae) are
removed from the final dataset. However, some taxa x life stage combos are retained if they are taxonomic levels
higher than species that are counted in some surveys, and a lower taxonomic level within this group is counted in all surveys.
For example, if we had 3 surveys where surveys A and B count Pseudodiaptomus forbesi, Pseudodiaptomus marinus,
and Pseudodiaptomus spp. (UnID) but survey C only counts P. forbesi and P. marinus then the
Pseudodiaptomus spp. (UnID) category would be retained after applying the community approach.
Data_type = "Taxa" optimizes for the Taxa-level user by maintaining all data at the original taxonomic level
(but it outputs warnings for taxa not measured in all datasets, which we call "orphans").
To facilitate comparisons across datasets, this option also sums data into general categories that are comparable
across all datasets and years: "summed groups." The new variable "Taxatype" identifies which taxa are summed groups
(Taxatype = "Summed group"), which are measured to the species level (Taxatype = "Species"), and which
are higher taxonomic groupings with the species designation unknown (Taxatype = "UnID species").
Sam Bashevkin
Zoopdownloader, Taxnamefinder, SourceTaxaKeyer, crosswalk, undersampled, zoopComb, zoopEnvComb, zooper
MyZoops <- Zoopsynther(Data_type = "Community", Sources = c("EMP", "FRP", "FMWT"), Size_class = "Meso", Date_range = c("1990-10-01", "2000-09-30"))