Package 'zooper'

Title: Download and Integrate Zooplankton Datasets from the Upper San Francisco Estuary
Description: This package downloads and integrates zooplankton datasets from the Sacramento San Joaquin Delta. Datasets are manipulated into a consistent format and bound together, then differences in taxonomic resolution among datasets are resolved using one of two methods, depending on whether the user wishes to analyze community or taxa-specific trends. Ancillary environmental data are retained in the final dataset. Users can also filter the dataset by a number of parameters.
Authors: Samuel M Bashevkin [aut, cre], Rosemary Hartman [aut], Karrin Alstad [aut], Catarina Pien [aut]
Maintainer: Samuel M Bashevkin <[email protected]>
License: GPL-3
Version: 2.5.0.9000
Built: 2024-11-25 21:33:14 UTC
Source: https://github.com/InteragencyEcologicalProgram/zooper

Help Index


Macro zooplankton length-weight equations

Description

Length-weight equations for macro zooplankton to be used for biomass conversions. The equations relate length in mm to dry mass in milligrams. Dry mass can be converted to carbon mass by assuming a carbon content of 40% (Uye 1982).

Uye, S. 1982. Length-weight relationships of important zooplankton from the Inland Sea of Japan. Journal of the Oceanographical Society of Japan 38:149–158.

Usage

biomass_macro

Format

a tibble with 23 rows and 9 columns.

Taxname

Current scientific name.

Level

Taxonomic level of the taxa.

Preservative

Preservative used to store sample before individuals were measured to develop the equations.

Weight_type

The type of weight measurement.

N

The number of individuals used in developing the equation.

Min_length

Minimum length (mm) of individuals used in developing the equation.

Max_length

Maximum length (mm) of individuals used in developing the equation.

a

Coefficient a in the equation Weight (mg) = a * Length (mm) ^ b.

b

Coefficient b in the equation Weight (mg) = a * Length (mm) ^ b.
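
As a minimal sketch of how these coefficients are applied, the snippet below evaluates Weight (mg) = a * Length (mm) ^ b for a few lengths and then converts dry mass to carbon mass. The values of a and b are made up for illustration and are not taken from biomass_macro, and the carbon fraction of 0.4 is an assumption based on the conversion mentioned in the Description.

```r
# Hypothetical coefficients for illustration only (not values from biomass_macro)
a <- 0.005
b <- 2.8

# Lengths of three measured individuals, in mm
length_mm <- c(2, 5, 10)

# Dry mass in mg from the length-weight equation: Weight = a * Length^b
dry_mass_mg <- a * length_mm^b

# Assumed carbon fraction of dry mass (an assumption, see the text above)
carbon_fraction <- 0.4

# Carbon mass in micrograms (1 mg = 1000 micrograms)
carbon_mass_ug <- dry_mass_mg * carbon_fraction * 1000
```

Lengths outside the Min_length to Max_length range fall outside the data used to develop an equation, so estimates for them are extrapolations.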

See Also

biomass_mesomicro


Meso and Micro zooplankton average biomass values

Description

Average carbon biomass values for meso and micro zooplankton to be used for biomass conversions.

Usage

biomass_mesomicro

Format

a tibble with 44 rows and 4 columns.

Taxname

Current scientific name.

Level

Taxonomic level of the taxa.

Lifestage

Plankton lifestage.

Carbon_mass_micrograms

Average carbon mass of an individual in micrograms.

See Also

biomass_macro


Detect common taxonomic names across all source datasets

Description

Identifies the taxa by life stage combinations that are present in all source datasets.

Usage

Commontaxer(Source_taxa_key, Taxa_level, Size_class)

Arguments

Source_taxa_key

A dataframe with columns named Source, Lifestage, SizeClass, and the value provided to the parameter Taxa_level. This dataframe should list all Taxa_level by Lifestage combinations present for each source dataset. You can provide it with the output of SourceTaxaKeyer.

Taxa_level

Taxonomic level you would like to perform this calculation for. E.g., if you wish to determine all Genus x lifestage combinations present in all datasets, provide Taxa_level = "Genus". The value provided here must be the name of a column in the dataset provided to Source_taxa_key.

Size_class

The size class(es) you would like this function to consider. You should generally only supply 1 size class.

Details

This function is designed to work on just one size class. To apply to multiple size classes, use map or apply functions to apply across size classes.

Value

A tibble with a column for Taxa_level and another for Lifestage representing all combinations of these values present in all source datasets.
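
The idea of "present in all source datasets" can be illustrated with base R on a toy key. This is only a sketch of the concept, not the package's implementation; the data frame and its values are hypothetical.

```r
# Toy source-taxa key (hypothetical values), one row per taxon x lifestage x source
key <- data.frame(
  Source    = c("EMP", "EMP", "FMWT", "FMWT", "STN"),
  SizeClass = "Meso",
  Lifestage = c("Adult", "Juvenile", "Adult", "Adult", "Adult"),
  Genus     = c("Eurytemora", "Eurytemora", "Eurytemora", "Acartia", "Eurytemora"),
  stringsAsFactors = FALSE
)

# Restrict to one size class, then build a combined taxon-lifestage label
meso <- key[key$SizeClass == "Meso", ]
meso$combo <- paste(meso$Genus, meso$Lifestage, sep = "|")

# Split the labels by source and keep only those found in every source
by_source <- split(meso$combo, meso$Source)
common <- Reduce(intersect, by_source)
```

Here only the Eurytemora adult combination appears in all three toy sources, so it is the only element of common.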

Author(s)

Sam Bashevkin

See Also

Zoopsynther, crosswalk, SourceTaxaKeyer

Examples

## Not run: 
library(rlang)
library(purrr)
SourceTaxaKey <- SourceTaxaKeyer(zoopComb, crosswalk)
Size_classes <- set_names(c("Micro", "Meso", "Macro"))
Commontax <- map(Size_classes, ~ Commontaxer(SourceTaxaKey, "Taxname", .))

## End(Not run)

All taxonomic names

Description

A complete list of all valid taxonomic names included in the full dataset. Used to limit choices for filtering by taxa.

Usage

completeTaxaList

Format

a character vector of length 454.

See Also

Taxnamefinder, Zoopsynther, zooper


Taxonomic crosswalk among datasets

Description

A crosswalk table relating the taxonomic code used by each dataset to current scientific names, life stages, and taxonomic hierarchies.

Usage

crosswalk

Format

a tibble with 404 rows and 34 variables

EMP_Micro

Taxonomic codes used in the Environmental Monitoring Program microzooplankton (43 μm mesh) dataset

EMP_Meso

Taxonomic codes used in the Environmental Monitoring Program mesozooplankton (160 μm mesh) dataset

EMP_Macro

Taxonomic codes used in the Environmental Monitoring Program macrozooplankton (505 μm mesh) dataset

STN_Meso

Taxonomic codes used in the Townet Survey mesozooplankton (160 μm mesh) dataset

STN_Macro

Taxonomic codes used in the Townet Survey macrozooplankton (505 μm mesh) dataset

FMWT_Meso

Taxonomic codes used in the Fall Midwater Trawl mesozooplankton (160 μm mesh) dataset

FMWT_Macro

Taxonomic codes used in the Fall Midwater Trawl macrozooplankton (505 μm mesh) dataset

twentymm_Meso

Taxonomic codes used in the 20mm Survey mesozooplankton (160 μm mesh) dataset

FRP_Meso

Taxonomic codes used in the Fish Restoration Program mesozooplankton (150 μm mesh) dataset

FRP_Macro

Taxonomic codes used in the Fish Restoration Program macrozooplankton (500 μm mesh) dataset

DOP_Meso

Taxonomic codes used in the Directed Outflow Project mesozooplankton (150 μm mesh) dataset

DOP_Macro

Taxonomic codes used in the Directed Outflow Project macrozooplankton (500 μm mesh) dataset

YBFMP

Taxonomic codes used in the Yolo Bypass Fish Monitoring Program zooplankton dataset

Lifestage

Plankton lifestage

Taxname

Current scientific name

Level

Taxonomic level of the taxa

Phylum

Phylum

Class

Class

Order

Order

Family

Family

Genus

Genus

Species

Species

Intro

Introduction year for non-native species

EMPstart

First year the Environmental Monitoring Program started counting this taxon

EMPend

Last year the Environmental Monitoring Program counted this taxon

FMWTstart

First year the Fall Midwater Trawl started counting this taxon

FMWTend

Last year the Fall Midwater Trawl counted this taxon

twentymmstart

First year the 20mm Survey started counting this taxon

twentymmend

Last year the 20mm Survey counted this taxon

twentymmstart2

First year the 20mm Survey restarted counting this taxon

DOPstart

First year DOP started counting this taxon

DOPend

Last year DOP counted this taxon

See Also

Zoopdownloader, Zoopsynther, zooper


Apply LCD approach for "Taxa" option

Description

Sums to least common denominator taxa, one taxonomic level at a time

Usage

LCD_Taxa(
  Data,
  Taxalevel,
  Groupers = c("Genus_g", "Family_g", "Order_g", "Class_g", "Phylum_g"),
  Response = "CPUE"
)

Arguments

Data

Zooplankton dataset including columns named the same as the Groupers, a Taxname column, "CPUE", and no other taxonomic identifying columns.

Taxalevel

The value of Groupers on which to apply this function.

Groupers

A character vector of names of additional taxonomic levels to be removed in this step. This vector can include Taxalevel and, if so, it will be removed from the vector within the function so Taxalevel is preserved.

Response

Which response variable(s) would you like for the zooplankton data? Choices are "CPUE" (catch per unit effort) and "BPUE" (carbon biomass per unit effort (μg/m³)). Defaults to Response = "CPUE".

Details

This function is designed to work on just one Taxalevel at a time. To apply to multiple Taxalevels, use map or apply functions to apply across taxonomic levels.

Value

A tibble with sums calculated for each unique value in Data$Taxalevel. Sums will be excluded for grouping taxa that only contain 1 unique Taxname.
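
The behavior described in Value (sum within each grouping taxon, then drop groups that contain only one unique Taxname) can be sketched in base R. The data frame below is a toy example with hypothetical CPUE values, and the code is an illustration of the idea rather than the function's actual implementation.

```r
# Toy data (hypothetical values): CPUE by Taxname within one sample
zoop <- data.frame(
  SampleID = "S1",
  Taxname  = c("Acartia", "Acartiella", "Eurytemora affinis"),
  Family_g = c("Acartiidae", "Acartiidae", "Temoridae"),
  CPUE     = c(10, 5, 7),
  stringsAsFactors = FALSE
)

# Sum CPUE within each sample and family
sums <- aggregate(CPUE ~ SampleID + Family_g, data = zoop, FUN = sum)

# Count the unique Taxnames that feed into each family
n_taxa <- tapply(zoop$Taxname, zoop$Family_g, function(x) length(unique(x)))

# Keep only families that pool more than one Taxname
keep <- names(n_taxa)[n_taxa > 1]
family_sums <- sums[sums$Family_g %in% keep, ]
```

In this toy case only Acartiidae is retained, because the Temoridae sum would simply duplicate the single Taxname it contains.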

Author(s)

Sam Bashevkin

See Also

Zoopsynther, crosswalk, zoopComb

Examples

## Not run: 
library(dplyr)
df <- zoopComb%>%
  mutate(dplyr::across(tidyselect::all_of(c("Genus", "Family", "Order", "Class", "Phylum")),
                   list(g=~if_else(.%in%completeTaxaList, ., NA_character_))))%>%
  select(-Phylum, -Class, -Order, -Family, -Genus, -Species, -Taxlifestage)
family_sums <- LCD_Taxa(df, "Family_g")

## End(Not run)

Unique taxa by lifestage combinations present in each source and size class

Description

Computes a dataframe with all unique taxa by lifestage combinations present in each source and size class

Usage

SourceTaxaKeyer(Data, Crosswalk)

Arguments

Data

Zooplankton dataset. Must have a column named Source with the names of the source datasets and a column named SizeClass with the names of the zooplankton size classes.

Crosswalk

Crosswalk table (e.g., crosswalk) with columns named "Phylum", "Class", "Order", "Family", "Genus", "Taxname", "Lifestage", and column names corresponding to each unique value of paste(data$Source, data$SizeClass, sep="_").

Value

a tibble with the complete taxonomic information for each combination of source and size class.
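
Because the Crosswalk must contain one column per source and size class combination, it can help to check for missing columns before calling the function. The sketch below uses toy inputs; crosswalk_cols stands in for names(Crosswalk) and is hypothetical.

```r
# Toy dataset with the two required identifier columns
data <- data.frame(
  Source    = c("EMP", "EMP", "FMWT"),
  SizeClass = c("Meso", "Macro", "Meso"),
  stringsAsFactors = FALSE
)

# Column names the crosswalk is expected to contain
needed_cols <- unique(paste(data$Source, data$SizeClass, sep = "_"))

# Hypothetical column names of a crosswalk table, standing in for names(Crosswalk)
crosswalk_cols <- c("EMP_Meso", "EMP_Macro", "Taxname", "Lifestage")

# Any combination listed here has no matching crosswalk column
missing_cols <- setdiff(needed_cols, crosswalk_cols)
```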

Author(s)

Sam Bashevkin

See Also

Zoopsynther, crosswalk, zoopComb

Examples

SourceTaxaKey <- SourceTaxaKeyer(Data = dplyr::filter(zoopComb, Source!="YBFMP"),
Crosswalk = crosswalk)

Start dates

Description

First dates sampled by each survey and size class

Usage

startDates

Format

a tibble with 14 rows and 3 columns.

Source

Abbreviated name of the source dataset. "EMP" = Environmental Monitoring Program, "FRP" = Fish Restoration Program, "FMWT" = Fall Midwater Trawl, "STN" = Townet Survey, "20mm" = 20mm Survey, "DOP" = Directed Outflow Project Lower Trophic Study, and "YBFMP" = Yolo Bypass Fish Monitoring Program.

SizeClass

Net size class. Micro corresponds to 43 (EMP) or 50 (YBFMP) μm mesh, Meso corresponds to 150 (FRP and DOP) or 160 (EMP, FMWT, STN, 20mm, YBFMP) μm mesh, and Macro corresponds to 500 (FRP and DOP) or 505 (EMP, FMWT, STN) μm mesh. However, prior to 1974 EMP macrozooplankton were sampled with a 930 μm mesh net.

Startdate

Date first sample was collected.

See Also

Uncountedyears, Zoopsynther


Station locations

Description

Latitudes and longitudes for each zooplankton station.

Usage

stations

Format

a tibble with 387 rows and 4 columns

Source

Abbreviated name of the source dataset

Station

Sampling station name

Latitude

Latitude in decimal degrees

Longitude

Longitude in decimal degrees

See Also

Zoopdownloader, zooper


EMP EZ Station locations

Description

Latitudes and longitudes for EMP EZ stations on each sampling date from 2004 to present.

Usage

stationsEMPEZ

Format

a tibble with 491 rows and 4 columns

Date

Date sample was collected

Station

Sampling station name

Latitude

Latitude in decimal degrees

Longitude

Longitude in decimal degrees

See Also

Zoopdownloader, zooper


Finds all of the lowest-level (i.e. counted) taxonomic names within a vector of taxa

Description

Helps filter the zooplankton dataset by returning a set of lowest-level taxa (i.e. the level taxa were recorded at when counted in plankton samples) within a vector of taxa (which can include taxa from any taxonomic level).

Usage

Taxnamefinder(Crosswalk, Taxa)

Arguments

Crosswalk

Crosswalk table (such as crosswalk) with columns named "Phylum", "Class", "Order", "Family", "Genus", "Species", and "Taxname." "Taxname" corresponds to the full scientific name of the taxonomic level assigned to the plankter when recorded in the dataset.

Taxa

A character vector of taxa you wish to select. These taxa can be from any taxonomic level present in the list above. If using the built-in data and crosswalk, they should be present in the completeTaxaList.

Value

A character vector of scientific names contained within the vector of Taxa provided.
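
Conceptually, the lookup returns every Taxname whose taxonomic hierarchy contains one of the requested taxa at any level. The base R sketch below illustrates that idea on a toy crosswalk with only three hierarchy columns; it is not the package's implementation, and the table is hypothetical.

```r
# Toy crosswalk (hypothetical subset of columns and rows)
cw <- data.frame(
  Order   = c("Calanoida", "Calanoida", "Cyclopoida", "Cyclopoida"),
  Genus   = c("Eurytemora", "Acartia", "Limnoithona", "Oithona"),
  Taxname = c("Eurytemora affinis", "Acartia", "Limnoithona tetraspina", "Oithona"),
  stringsAsFactors = FALSE
)

# Requested taxa can come from different taxonomic levels
taxa <- c("Calanoida", "Limnoithona")

# Flag rows where any searched level matches a requested taxon
levels_to_search <- c("Order", "Genus", "Taxname")
hit <- Reduce(`|`, lapply(cw[levels_to_search], function(col) col %in% taxa))

# Lowest-level names falling within the requested taxa
taxnames <- unique(cw$Taxname[hit])
```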

Author(s)

Sam Bashevkin

See Also

completeTaxaList, Zoopsynther

Examples

Taxnames <- Taxnamefinder(crosswalk, c("Calanoida", "Cyclopoida"))

Detect dates when a species was not counted

Description

Detects years when a species was present in the system (i.e., post-invasion for invasive species) but not counted in each zooplankton survey

Usage

Uncountedyears(Source, Size_class, Crosswalk, Start_year, Intro_lag)

Arguments

Source

String with the name of the source dataset (e.g., Source="EMP").

Size_class

String with the name of the desired zooplankton size class (e.g., Size_class="Meso").

Crosswalk

Crosswalk table like crosswalk or another table in the same format.

Start_year

First year the Source survey started sampling zooplankton.

Intro_lag

Number of years buffer after a species is introduced when we expect surveys to start recording them. Effectively adds Intro_lag years to the introduction year of each species.

Details

This function is designed to work on one source and size class at a time. To apply across multiple IDs, use the map or apply functions.

Value

A tibble with columns for the Taxlifestage, Taxname, Lifestage, Source, Sizeclass, and then a list-column of years in which that particular taxon was not counted in the specified study and size class. Taxa that were counted in all applicable years are not included in the tibble.
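
The logic can be pictured as comparing the years a taxon was expected to be recorded with the years it was actually counted. The sketch below is a simplified illustration under stated assumptions, not the function's implementation: the taxa table, its column names (Survey_start, Survey_end), and final_year are all hypothetical.

```r
start_year <- 1972   # first year of the hypothetical survey
intro_lag  <- 2      # years allowed between introduction and first counts
final_year <- 2020   # hypothetical last year of the record

# Hypothetical crosswalk-style rows
taxa <- data.frame(
  Taxname      = c("Taxon A", "Taxon B"),
  Intro        = c(NA, 1993),             # NA = no introduction year recorded
  Survey_start = c(1972, 2000),           # first year the survey counted the taxon
  Survey_end   = c(NA_real_, NA_real_),   # NA = still counted
  stringsAsFactors = FALSE
)

uncounted <- lapply(seq_len(nrow(taxa)), function(i) {
  # First year the taxon is expected in the record
  first_expected <- max(start_year, taxa$Intro[i] + intro_lag, na.rm = TRUE)
  # Last year the taxon was counted
  last_counted <- if (is.na(taxa$Survey_end[i])) final_year else taxa$Survey_end[i]
  # Expected years that fall outside the counted years
  setdiff(first_expected:final_year, taxa$Survey_start[i]:last_counted)
})
names(uncounted) <- taxa$Taxname

# Drop taxa that were counted in all applicable years
uncounted <- uncounted[lengths(uncounted) > 0]
```

With these toy inputs, Taxon B is expected from 1995 (1993 plus a 2-year lag) but was not counted until 2000, so 1995 to 1999 are flagged.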

Author(s)

Sam Bashevkin

See Also

Zoopsynther, crosswalk, startDates

Examples

require(purrr)
require(dplyr)
require(lubridate)

datasets<-zooper::zoopComb%>%
 mutate(names=paste(Source, SizeClass, sep="_"))%>%
 select(names, Source, SizeClass)%>%
 filter(Source%in%c("EMP", "FMWT", "twentymm"))%>%
 distinct()

BadYears<-map2_dfr(datasets$Source, datasets$SizeClass,
 ~ Uncountedyears(Source = .x,
                  Size_class = .y,
                  Crosswalk = zooper::crosswalk,
                  # First sampling year for this source and size class
                  Start_year = zooper::startDates%>%
                    filter(Source==.x & SizeClass==.y)%>%
                    pull(Startdate)%>%
                    year(),
                  Intro_lag=2))

Taxa undersampled in each size class

Description

A table listing the taxonomic names and life stages of plankton undersampled by each net mesh size (i.e. size class)

Usage

undersampled

Format

a tibble with 27 rows and 3 columns

SizeClass

The size class of zooplankton intended to be captured by each net mesh size. Micro corresponds to 43 μm mesh and Meso corresponds to 150-160 μm mesh.

Taxname

The scientific name of taxa undersampled by the corresponding mesh size class

Lifestage

The lifestage of each taxa undersampled by the corresponding mesh size class

See Also

Zoopsynther, zooper


Extract latest EDI files

Description

This function extracts the latest version of a zooplankton EDI package and the list of files from that package.

Usage

zoop_urls(Sources)

Arguments

Sources

Source datasets to be included. Choices include "EMP" (Environmental Monitoring Program), "FRP" (Fish Restoration Program), "FMWT" (Fall Midwater Trawl), "STN" (Townet Survey), "DOP" (Directed Outflow Project), and "20mm" (20mm survey). The YBFMP datasets cannot be used in this function due to taxonomic and life stage issues with that dataset. Defaults to Sources = c("EMP", "FRP", "FMWT", "STN", "20mm", "DOP").

Value

A list with the files and/or URLs for each source dataset

Author(s)

Sam Bashevkin


Convert CPUE to biomass

Description

This function converts zooplankton CPUE to carbon biomass (carbon biomass per unit effort (μg/m³)) for taxa with conversion equations.

Usage

Zoopbiomass(
  Zoop,
  ZoopLengths,
  Biomass_mesomicro = zooper::biomass_mesomicro,
  Biomass_macro = zooper::biomass_macro
)

Arguments

Zoop

Zooplankton count dataset

ZoopLengths

Zooplankton length dataset for macrozooplankton.

Biomass_mesomicro

The micro and meso zooplankton biomass conversion table. The default is biomass_mesomicro

Biomass_macro

The macro zooplankton biomass conversion table. The default is biomass_macro
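
For meso and micro zooplankton, one way to picture the conversion is as CPUE multiplied by the average carbon mass per individual from a table shaped like biomass_mesomicro. The sketch below uses toy tables with hypothetical values and assumes this simple multiplication; it is an illustration, not the function's implementation.

```r
# Toy count data (hypothetical values), CPUE in number per cubic meter
cpue <- data.frame(
  Taxname   = c("Taxon A", "Taxon B"),
  Lifestage = c("Adult", "Adult"),
  CPUE      = c(100, 20),
  stringsAsFactors = FALSE
)

# Toy conversion table with the same columns as biomass_mesomicro (minus Level)
carbon <- data.frame(
  Taxname                = "Taxon A",
  Lifestage              = "Adult",
  Carbon_mass_micrograms = 2.5,
  stringsAsFactors = FALSE
)

# Left join so taxa without a conversion value are kept with NA
out <- merge(cpue, carbon, by = c("Taxname", "Lifestage"), all.x = TRUE)

# Carbon biomass per unit effort in micrograms per cubic meter
out$BPUE <- out$CPUE * out$Carbon_mass_micrograms
```

Taxa without a matching conversion value end up with BPUE = NA, which is consistent with biomass being provided only for taxa with conversion equations.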


Combined zooplankton dataset

Description

All source zooplankton datasets combined into one tibble.

Usage

zoopComb

Format

a tibble with 3,615,105 rows and 14 columns.

Source

Abbreviated name of the source dataset. "EMP" = Environmental Monitoring Program, "FRP" = Fish Restoration Program, "FMWT" = Fall Midwater Trawl, "STN" = Townet Survey, "20mm" = 20mm Survey, "DOP" = Directed Outflow Project Lower Trophic Study, and "YBFMP" = Yolo Bypass Fish Monitoring Program.

SizeClass

Net size class. Micro corresponds to 43-50 μm mesh, Meso corresponds to 150-160 μm mesh, and Macro corresponds to 500-505 μm mesh. However, prior to 1974 EMP macrozooplankton were sampled with a 930 μm mesh net.

Volume

Volume (m³) of water sampled

Lifestage

Zooplankton life stage

Taxname

Scientific name

Phylum

Phylum

Class

Class

Order

Order

Family

Family

Genus

Genus

Species

Species

Taxlifestage

Combined Taxname and Lifestage

SampleID

Unique ID of the zooplankton sample. This key and SizeClass link to the zoopEnvComb dataset

CPUE

Catch per unit effort (number/m³)

BPUE

Carbon biomass per unit effort (μg/m³)

Details

Note that EMP Macro samples with QAQC flags (any value of AmphipodCode other than "A") have had their Amphipod CPUE set to NA in this integrated dataset. For more information on the source datasets see zooper.

See Also

Zoopdownloader, Zoopsynther, zooper


Downloads and combines zooplankton datasets collected by the Interagency Ecological Program from the Sacramento-San Joaquin Delta

Description

This function downloads all IEP zooplankton datasets from the internet, converts them to a consistent format, binds them together, and exports the combined dataset as .Rds R data files and/or an R object. Datasets currently include "EMP" (Environmental Monitoring Program), "FRP" (Fish Restoration Program), "FMWT" (Fall Midwater Trawl), "STN" (Townet Survey), "20mm" (20mm survey), "DOP" (Directed Outflow Project Lower Trophic Study), and "YBFMP" (Yolo Bypass Fish Monitoring Program).

Usage

Zoopdownloader(
  Data_sets = c("EMP_Meso", "FMWT_Meso", "STN_Meso", "20mm_Meso", "FRP_Meso",
    "EMP_Micro", "FRP_Macro", "EMP_Macro", "FMWT_Macro", "STN_Macro", "DOP_Meso",
    "DOP_Macro"),
  Biomass = TRUE,
  Data_folder = tempdir(),
  Save_object = TRUE,
  Return_object = FALSE,
  Return_object_type = "List",
  Redownload_data = FALSE,
  Download_method = "auto",
  Zoop_path = file.path(Data_folder, "zoopforzooper"),
  Env_path = file.path(Data_folder, "zoopenvforzooper"),
  Crosswalk = zooper::crosswalk,
  Stations = zooper::stations
)

Arguments

Data_sets

Datasets to include in combined data. Choices include "EMP_Meso", "FMWT_Meso", "STN_Meso", "20mm_Meso", "FRP_Meso", "YBFMP_Meso", "EMP_Micro", "YBFMP_Micro", "FRP_Macro", "EMP_Macro", "FMWT_Macro", "STN_Macro", "DOP_Macro", and "DOP_Meso". Defaults to including all datasets except the two YBFMP datasets.

Biomass

Whether to add carbon biomass (carbon biomass per unit effort (μg/m³)) to the dataset (where conversion equations and required data are available). Defaults to Biomass = TRUE.

Data_folder

Path to folder in which source datasets are stored, and to which you would like datasets to be downloaded if you set Redownload_data = TRUE. If you do not want to store every source dataset, you can leave this at the default tempdir(). If you do not wish to redownload these datasets every time you run the function, you can set this to a directory on your computer and run the function in the future with Redownload_data = FALSE, which will load the source datasets from Data_folder instead of downloading them again.

Save_object

Should the combined data be saved to disk? Defaults to Save_object = TRUE.

Return_object

Should data be returned as an R object? If TRUE, the function will return the full combined dataset. Defaults to Return_object = FALSE.

Return_object_type

If Return_object = TRUE, should data be returned as a combined dataframe (Return_object_type = "Combined") or a list with component "Zooplankton" containing the zooplankton data and component "Environment" containing the environmental data (Return_object_type = "List", the default). A list is required to feed data into the Zoopsynther function without saving the combined dataset to disk.

Redownload_data

Should source datasets be redownloaded from the internet? Defaults to Redownload_data = FALSE.

Download_method

Method used to download files. See the method argument of download.file for options. Defaults to "auto".

Zoop_path

File path specifying the folder and filename of the zooplankton dataset. Defaults to Zoop_path = file.path(Data_folder, "zoopforzooper").

Env_path

File path specifying the folder and filename of the dataset with accessory environmental parameters. Defaults to Env_path = file.path(Data_folder, "zoopenvforzooper").

Crosswalk

Crosswalk table to be used for conversions. Must have columns named for each unique combination of source and size class with an underscore separator, as well as all taxonomic levels Phylum through Species, Taxname (full scientific name) and Lifestage. See crosswalk (the default) for an example.

Stations

Latitudes and longitudes for each unique station. See stations (the default) for an example.

Details

Note that EMP Macro samples with QAQC flags (any value of AmphipodCode other than "A") have had their Amphipod CPUE set to NA in this function. For more information on the source datasets see zooper.

Value

If Return_object = TRUE, returns the combined dataset as a list or tibble, depending on whether Return_object_type is set to "List" or "Combined". If Save_object = TRUE, writes 2 .Rds files to disk: one with the zooplankton catch data and another with accessory environmental parameters.

Author(s)

Sam Bashevkin

See Also

Zoopsynther, crosswalk, stations, zooper

Examples

## Not run: 
Data <- Zoopdownloader(Data_folder = tempdir(), Return_object = TRUE,
Save_object = FALSE, Redownload_data = TRUE)

## End(Not run)

Environmental data

Description

Accessory environmental data from the combined zooplankton dataset. Not all datasets report all environmental parameters.

Usage

zoopEnvComb

Format

a tibble with 44,690 rows and 20 columns

Source

Abbreviated name of the source dataset. "EMP" = Environmental Monitoring Program, "FRP" = Fish Restoration Program, "FMWT" = Fall Midwater Trawl, "STN" = Townet Survey, "20mm" = 20mm Survey, "DOP" = Directed Outflow Project Lower Trophic Study, and "YBFMP" = Yolo Bypass Fish Monitoring Program.

Year

Year sample was collected

Date

Date sample was collected

Datetime

Date and time sample was collected, if time was provided

TowType

Sample collection method identifying whether each tow was a surface, bottom, oblique, or vertical pump sample

Tide

Tidal stage

Station

Station where sample was collected. This is the key that links to the stations dataset

Chl

Chlorophyll concentration in μg/L

Secchi

Secchi depth in cm

Temperature

Temperature in °C.

BottomDepth

Total depth of the water column in m

TurbidityNTU

Water turbidity in Nephelometric Turbidity Units

TurbidityFNU

Water turbidity in Formazin Nephelometric Units

Microcystis

Intensity of Microcystis bloom coded qualitatively from 1-5 where 1 = absent, 2 = low, 3 = medium, 4 = high, 5 = very high

pH

Water pH

DO

Dissolved oxygen in mg/L

SalSurf

Surface salinity in PPT

SalBott

Bottom salinity in PPT

Latitude

Latitude in decimal degrees

Longitude

Longitude in decimal degrees

AmphipodCode

Code indicating sample quality for EMP macro amphipod samples (A=valid, B=questionable [veg/algal bloom in net], C=not valid [error in lab], D=suspect [possible missing data])

SampleID

Unique ID of the zooplankton sample. This is the key that links to the zoopComb dataset
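
Since SampleID is the key shared with zoopComb, environmental values can be attached to catch records with an ordinary join. The sketch below uses two toy tables with hypothetical values in place of zoopComb and zoopEnvComb.

```r
# Toy catch table standing in for zoopComb (hypothetical values)
zoop <- data.frame(
  SampleID = c("S1", "S1", "S2"),
  Taxname  = c("Taxon A", "Taxon B", "Taxon A"),
  CPUE     = c(10, 3, 8),
  stringsAsFactors = FALSE
)

# Toy environmental table standing in for zoopEnvComb (hypothetical values)
env <- data.frame(
  SampleID    = c("S1", "S2"),
  Temperature = c(18.2, 21.5),
  SalSurf     = c(0.5, 6.1),
  stringsAsFactors = FALSE
)

# Left join: every catch record keeps its row and gains the sample's environment
joined <- merge(zoop, env, by = "SampleID", all.x = TRUE)

# Example filter on an environmental variable after joining
warm <- joined[joined$Temperature > 20, ]
```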

See Also

Zoopdownloader, Zoopsynther, zooper


zooper: A package for integrating zooplankton datasets from the Sacramento San Joaquin Delta

Description

This package contains functions, lookup tables, and 2 built-in pre-combined datasets (one with the zooplankton data and another with the environmental data).

zooper functions

zooper lookup tables

zooper pre-combined (with the Zoopdownloader function) zooplankton datasets. These may be out of date

Source datasets

Environmental Monitoring Program (EMP)

The EMP zooplankton survey is run by the California Department of Fish and Wildlife. Zooplankton were first collected in 1972. It samples monthly at 17 fixed stations, 2 floating entrapment zone stations, and 3 stations in Carquinez Strait and San Pablo Bay that are only sampled during high outflow and low salinity conditions. EMP samples using micro (43 μm), meso (160 μm), and macro (505 μm) zooplankton nets. Note that additional Amphipod data with quality issues (e.g., vegetation in net) are available in the EMP data publication. Data are available here.

20-mm Survey

The 20-mm survey is run by the California Department of Fish and Wildlife. Zooplankton were first collected in 1995. Zooplankton are collected concurrently with fish samples at 41-55 fixed open-channel stations per year. Samples are collected twice per month between March and July. Only mesozooplankton are collected, with a 160 μm mesh net. Data are available here.

Fall Midwater Trawl (FMWT) and Summer Townet Survey (STN)

The FMWT and STN are run by the California Department of Fish and Wildlife. FMWT samples are collected monthly between September and December from a subset of the 122 fixed open-channel stations. STN samples are collected monthly between June and August from 40 fixed open-channel stations. Macrozooplankton have been collected since 2007 with a 505 μm mesh net while mesozooplankton have been collected since 2005 with a 160 μm mesh net. Data are available here. Supplemental sampling data from the Suisun Marsh Salinity Control Gate study are also included, and those data can be found here.

Fish Restoration Program (FRP)

FRP is run by the California Department of Fish and Wildlife. Zooplankton were first collected in 2015. Samples are collected monthly between March and December in shallow-water habitats near marshes. FRP samples with meso (150 μm) and macro (500 μm) zooplankton nets. Data are available here.

Directed Outflow Project Lower Trophic Study (DOP)

The Directed Outflow Project Lower Trophic Study is run by ICF for the United States Bureau of Reclamation. Zooplankton were first collected in fall 2017. Samples were collected once every two weeks in 2017 and weekly thereafter. Sampling is conducted in the fall and, starting in 2019, spring and summer seasons have also been sampled. Three sampling stations per region are randomly selected for 5 regions (Suisun Bay, Suisun Marsh, Lower Sac. River, Cache Slough, Sac Ship Channel). In 2017, stations were sampled in 3 additional regions: West of the Benicia Bridge, Lower San Joaquin, and Upper Sac River. At each station, sample collection is attempted at both shoal (<=10 feet) and channel (>10 feet) habitat. Channels are sampled at the surface and, if deeper than 20 feet, also at the bottom 1/2 to 1/3 of the water column. DOP samples with meso (150 μm) and macro (500 μm) zooplankton nets. Data are available here.

Yolo Bypass Fish Monitoring Program (YBFMP)

YBFMP is run by the California Department of Water Resources. Zooplankton were first collected in 1999. Samples are from 2 sites, one in the Yolo Bypass and one in the Sacramento River. In 1999 and 2000, samples were collected for a couple of months each year; sampling then expanded to roughly winter and spring from 2001 to 2010. Since 2011, samples have been collected twice monthly during most of the year, and weekly when the bypass is inundated. YBFMP samples with micro (50 μm) and meso (150 μm) zooplankton nets. Data are available here.

Author(s)

Maintainer: Samuel M Bashevkin [email protected] (ORCID)

Authors:

See Also

Useful links:


Integrates zooplankton datasets collected by the Interagency Ecological Program from the Sacramento-San Joaquin Delta

Description

This function returns an integrated zooplankton dataset with taxonomic issues resolved, according to user-specifications, along with important caveats about the data. It requires the output of the Zoopdownloader function to run. This can be provided either as a list or paths to saved .Rds files generated by the Zoopdownloader function. The function defaults to loading pre-packaged combined datasets (which may be outdated).

Usage

Zoopsynther(
  Data_type = NULL,
  Zoop = zooper::zoopComb,
  ZoopEnv = zooper::zoopEnvComb,
  Zoop_path = NULL,
  Env_path = NULL,
  Sources = c("EMP", "FRP", "FMWT", "STN", "20mm", "DOP"),
  Size_class = c("Micro", "Meso", "Macro"),
  Time_consistency = FALSE,
  Intro_lag = 2,
  Response = "CPUE",
  Taxa = NULL,
  Date_range = c(NA, NA),
  Months = NA,
  Years = NA,
  Sal_bott_range = NA,
  Sal_surf_range = NA,
  Temp_range = NA,
  Lat_range = NA,
  Long_range = NA,
  Reload_data = F,
  Redownload_data = F,
  All_env = T,
  Shiny = F,
  Crosswalk = zooper::crosswalk,
  Undersampled = zooper::undersampled,
  CompleteTaxaList = zooper::completeTaxaList,
  StartDates = zooper::startDates,
  ...
)

Arguments

Data_type

What type of data are you looking for? This option allows you to choose a final output dataset for either community (Data_type = "Community"; the default) or Taxa-specific (Data_type = "Taxa") analyses. NOTE: If you set Data_type="Community" we do not recommend utilizing the Taxa argument. See below for more explanation of this argument.

Zoop

Zooplankton data. You must provide the "Zooplankton" element from the list returned from Zoopdownloader(Save_object = FALSE, Return_object = TRUE, Return_object_type="List"). The default argument provides the built-in (and possibly outdated) version of this combined dataset. If you instead wish to provide paths to saved datasets from the Zoopdownloader function, set Data_list = NULL and provide Zoop_path.

ZoopEnv

Accessory environmental data. You must provide the "Environment" element from the list returned from Zoopdownloader(Save_object = FALSE, Return_object = TRUE, Return_object_type="List"). The default argument provides the built-in (and possibly outdated) version of this combined dataset. If you instead wish to provide paths to saved datasets from the Zoopdownloader function, set Data_list = NULL and provide Env_path.

Zoop_path

If you wish to save time by saving the combined zooplankton datasets returned from the Zoopdownloader function to disk, provide here the path to the combined zooplankton dataset on disk. You must also set Data_list = NULL.

Env_path

If you wish to save time by saving the combined zooplankton datasets returned from the Zoopdownloader function to disk, provide here the path to the combined accessory environmental data on disk. You must also set Data_list = NULL.

Sources

Source datasets to be included. Choices include "EMP" (Environmental Monitoring Program), "FRP" (Fish Restoration Program), "FMWT" (Fall Midwater Trawl), "STN" (Townet Survey), "DOP" (Directed Outflow Project), and "20mm" (20mm survey). The YBFMP datasets cannot be used in this function due to taxonomic and life stage issues with that dataset. Defaults to Sources = c("EMP", "FRP", "FMWT", "STN", "20mm", "DOP").

Size_class

Zooplankton size classes (as defined by net mesh sizes) to be included in the integrated dataset. Choices include "Micro" (43 μm), "Meso" (150-160 μm), and "Macro" (500-505 μm). Defaults to Size_class = c("Micro", "Meso", "Macro").

Time_consistency

Would you like to apply a fix to enforce consistent taxonomic resolution over time? Only available for the Community option.

Intro_lag

Only applicable if Time_consistency = TRUE. How many years after a species is introduced should we expect surveys to start counting them? Defaults to 2.

Response

Which response variable(s) would you like for the zooplankton data? Choices are "CPUE" (catch per unit effort) and "BPUE" (carbon biomass per unit effort (μg/m³)). Defaults to Response = "CPUE".

Taxa

If you only wish to include a subset of taxa, provide a character vector of the taxa you wish to include. This can include taxa of any taxonomic level (e.g., Taxa = "Calanoida" to include only calanoids). NOTE: we do not recommend using this feature together with Data_type = "Community". This argument is better suited to selecting higher-level taxa. If you wish to include just one or a few species, it would be faster to filter the output of Zoopdownloader to those taxa. Defaults to NULL, which includes all taxa.

Date_range

Range of dates to include in the final dataset. To filter within a range of dates, provide a character vector of 2 dates formatted exactly as yyyy-mm-dd, specifying the lower and upper bounds in that order. To specify an infinite lower or upper bound (to include all values below or above a limit), input NA for that bound. Defaults to Date_range = c(NA, NA), which includes all dates.

Months

Months (as integers) to be included in the integrated dataset. If you wish to only include data from a subset of months, input a vector of integers corresponding to the months you wish to be included. Defaults to Months = NA, which includes all months.

Years

Years to be included in the integrated dataset. If you wish to only include data from a subset of years, input a vector of the years you wish to be included. Defaults to Years = NA, which includes all years.

Sal_bott_range

Filter the data by bottom salinity values. Include a vector of length 2 specifying the minimum and maximum values you wish to include. To include all values above or below a limit, utilize Inf or -Inf for the upper or lower bound respectively. Defaults to Sal_bott_range = NA, which includes all bottom salinities.

Sal_surf_range

Same as previous, but for surface salinity.

Temp_range

Same as Sal_bott_range but for surface temperature.

Lat_range

Latitude range to include in the final dataset. Include a vector of length 2 specifying the minimum and maximum values you wish to include, in decimal degree format. Defaults to Lat_range = NA, which includes all latitudes.

Long_range

Same as previous, but for longitude. Don't forget that longitudes should be negative in the Delta!

Reload_data

If set to Reload_data = TRUE, runs the Zoopdownloader function to re-combine source datasets. To include local versions of the datasets without redownloading them from online, set Reload_data = TRUE and Redownload_data = FALSE. Defaults to Reload_data = FALSE.

Redownload_data

Should data be re-downloaded from the internet? If set to Redownload_data = TRUE, runs Zoopdownloader(Redownload_data=Redownload_data, Zoop_path=Zoop_path, Env_path=Env_path, ...). Defaults to Redownload_data = FALSE.

All_env

Should all environmental parameters be included? Defaults to All_env = TRUE.

Shiny

Is this function being used within the shiny app? If set to Shiny = TRUE, outputs a list with the integrated dataset as one component and the caveats as the other component. Defaults to Shiny = FALSE.

Crosswalk

Crosswalk table to be used for conversions. Must have columns named for each unique combination of source and size class with an underscore separator, as well as all taxonomic levels Phylum through Species, Taxname (full scientific name) and Lifestage. See crosswalk (the default) for an example.

Undersampled

A table listing the taxonomic names and life stages of plankton undersampled by each net mesh size (i.e. size class). See undersampled (the default) for an example.

CompleteTaxaList

Character vector of all taxonomic names in source datasets. Defaults to completeTaxaList.

StartDates

Tibble with the starting dates of each source dataset. Defaults to startDates.

...

Arguments passed to Zoopdownloader.

Details

This function combines any combination of the zooplankton datasets (included as parameters) and calculates least common denominator taxa to facilitate comparisons across datasets with differing levels of taxonomic resolution. For more information on the source datasets see zooper.

Value

An integrated zooplankton dataset.

Data type

The Data_type parameter toggles between two approaches to resolving differences in taxonomic resolution. If you want all available data on given taxa, use Data_type = "Taxa"; if you want to conduct a community analysis, use Data_type = "Community".

Briefly, Data_type = "Community" optimizes for community-level analyses by taking all taxa x life stage combinations that are not measured in every input dataset, and summing them up taxonomic levels to the lowest taxonomic level they belong to that is covered by all datasets. Remaining Taxa x life stage combos that are not covered in all datasets up to the phylum level (usually something like Annelida or Nematoda or Insect Pupae) are removed from the final dataset. However, some taxa x life stage combos are retained if they are taxonomic levels higher than species that are counted in some surveys, and a lower taxonomic level within this group is counted in all surveys. For example, if we had 3 surveys where surveys A and B count Pseudodiaptomus forbesi, Pseudodiaptomus marinus, and Pseudodiaptomus spp. (UnID) but survey C only counts P. forbesi and P. marinus then the Pseudodiaptomus spp. (UnID) category would be retained after applying the community approach.
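The roll-up idea behind the community approach can be sketched on toy data in base R. This is a deliberately simplified illustration, not the code Zoopsynther runs: it only rolls species up one level (to genus), ignores life stages, and never removes taxa. The surveys A, B, and C, the CPUE values, and the object names (counts, LCD_taxon, community) are all invented for the example.

```r
# Toy illustration only: NOT zooper's internal implementation.
# Surveys A and B count two Pseudodiaptomus species; survey C only counts the genus.
# Acartiella sinensis is counted to species by all three surveys.
counts <- data.frame(
  Source  = c("A", "A", "A", "B", "B", "B", "C", "C"),
  Taxname = c("Pseudodiaptomus forbesi", "Pseudodiaptomus marinus", "Acartiella sinensis",
              "Pseudodiaptomus forbesi", "Pseudodiaptomus marinus", "Acartiella sinensis",
              "Pseudodiaptomus", "Acartiella sinensis"),
  Genus   = c("Pseudodiaptomus", "Pseudodiaptomus", "Acartiella",
              "Pseudodiaptomus", "Pseudodiaptomus", "Acartiella",
              "Pseudodiaptomus", "Acartiella"),
  CPUE    = c(10, 5, 4, 8, 2, 6, 20, 1),   # made-up numbers
  stringsAsFactors = FALSE
)

# How many surveys record each taxon name?
n_sources   <- length(unique(counts$Source))
taxon_cover <- tapply(counts$Source, counts$Taxname, function(s) length(unique(s)))
in_all      <- names(taxon_cover)[taxon_cover == n_sources]

# Taxa missing from at least one survey are relabelled with their genus,
# then CPUE is summed within each survey and relabelled taxon.
counts$LCD_taxon <- ifelse(counts$Taxname %in% in_all, counts$Taxname, counts$Genus)
community <- aggregate(CPUE ~ Source + LCD_taxon, data = counts, FUN = sum)
print(community)
```

In this toy case the two Pseudodiaptomus species are summed to the genus in surveys A and B so that they are comparable with survey C, while Acartiella sinensis stays at the species level because every survey counts it.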

Data_type = "Taxa" optimizes for the Taxa-level user by maintaining all data at the original taxonomic level (but it outputs warnings for taxa not measured in all datasets, which we call "orphans"). To facilitate comparisons across datasets, this option also sums data into general categories that are comparable across all datasets and years: "summed groups." The new variable "Taxatype" identifies which taxa are summed groups (Taxatype = "Summed group"), which are measured to the species level (Taxatype = "Species"), and which are higher taxonomic groupings with the species designation unknown (Taxatype = "UnID species").
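Because summed groups are built from the other rows, it is usually worth separating them before totalling anything. The sketch below uses a small hand-made data frame as a stand-in for output from Data_type = "Taxa"; the Taxname labels and CPUE values are invented, and only the Taxatype values come from the description above.

```r
# Hypothetical stand-in for a few rows of Data_type = "Taxa" output (made-up values).
taxa_out <- data.frame(
  Taxname  = c("Pseudodiaptomus forbesi", "Pseudodiaptomus marinus",
               "Pseudodiaptomus (unidentified)", "Pseudodiaptomus (summed)"),
  Taxatype = c("Species", "Species", "UnID species", "Summed group"),
  CPUE     = c(10, 5, 3, 18),
  stringsAsFactors = FALSE
)

# Keep summed groups apart from the original-resolution rows so that
# the same organisms are not counted twice in a total.
summed_groups <- subset(taxa_out, Taxatype == "Summed group")
original_rows <- subset(taxa_out, Taxatype != "Summed group")

sum(original_rows$CPUE)   # total from species and UnID rows
summed_groups$CPUE        # the same total, carried by the summed group
```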

Author(s)

Sam Bashevkin

See Also

Zoopdownloader, Taxnamefinder, SourceTaxaKeyer, crosswalk, undersampled, zoopComb, zoopEnvComb, zooper

Examples

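library(zooper)

# Community-level dataset of mesozooplankton from three surveys, water years 1991-2000
MyZoops <- Zoopsynther(Data_type = "Community",
                       Sources = c("EMP", "FRP", "FMWT"),
                       Size_class = "Meso",
                       Date_range = c("1990-10-01", "2000-09-30"))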
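The open-ended bounds accepted by the filtering arguments (NA for Date_range, Inf or -Inf for the salinity ranges) can be pictured with a base R sketch. The helper in_date_range is hypothetical and is not part of zooper; treating both bounds as inclusive is an assumption made for the illustration.

```r
# Hypothetical helper, not part of zooper: TRUE for dates inside Date_range.
# An NA bound is treated as "no limit"; bounds are assumed to be inclusive.
in_date_range <- function(dates, Date_range = c(NA, NA)) {
  dates <- as.Date(dates)
  lower <- as.Date(Date_range[1])
  upper <- as.Date(Date_range[2])
  keep  <- rep(TRUE, length(dates))
  if (!is.na(lower)) keep <- keep & dates >= lower
  if (!is.na(upper)) keep <- keep & dates <= upper
  keep
}

sample_dates <- c("1989-12-31", "1995-06-15", "2001-01-01")

both_bounds <- in_date_range(sample_dates, c("1990-10-01", "2000-09-30"))
upper_only  <- in_date_range(sample_dates, c(NA, "2000-09-30"))
no_bounds   <- in_date_range(sample_dates)

# Numeric ranges work the same way with Inf or -Inf as the open bound.
salinity   <- c(0.1, 2.5, 18)
sal_range  <- c(0.5, Inf)
above_half <- salinity >= sal_range[1] & salinity <= sal_range[2]
```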