Title: | Integrated Dataset of Discrete Water Quality in the San Francisco Estuary |
Description: | Produce an integrated dataset of discrete water quality measurements using any combination of the 17 source datasets included. |
Authors: | Samuel M Bashevkin [aut, cre] |
Maintainer: | Samuel M Bashevkin <[email protected]> |
License: | GPL-3 |
Version: | |
Built: | 2025-03-01 06:55:06 UTC |
Source: | https://github.com/sbashevkin/discretewq |
Water quality data from the California Department of Fish and Wildlife Bay Study.
a tibble with 23,080 rows and 14 variables
Name of the source dataset.
Station where sample was collected.
Latitude in decimal degrees.
Longitude in decimal degrees.
Were lat/long coordinates collected in the field (TRUE), or do they represent the location of a fixed station (FALSE)?
Date sample was collected.
Date and time of sample collection.
Bottom depth (m).
Tidal stage.
Secchi depth (cm).
Temperature (°C) at surface.
Temperature (°C) at bottom.
Specific conductance (µS cm-1) at surface.
Specific conductance (µS cm-1) at bottom.
More metadata and information on methods are available here.
This package contains the source datasets and a function that combines any subset of them into an integrated dataset.
Maintainer: Samuel M Bashevkin [email protected] (ORCID)
Dave Bosworth [email protected] (ORCID)
Sarah Perry [email protected] (ORCID)
Elizabeth B Stumpner [email protected] (ORCID)
Other contributors:
Rosemary Hartman [email protected] (ORCID) [contributor]
Water quality data from the United States Fish and Wildlife Service Delta Juvenile Fish Monitoring Program.
a tibble with 150,488 rows and 9 variables
Name of the source dataset.
Station where sample was collected.
Latitude in decimal degrees.
Longitude in decimal degrees.
Date sample was collected.
Date and time of sample collection.
Secchi depth (cm).
Temperature in °C.
Specific conductance (µS cm-1) at surface.
Dissolved oxygen (mg/L) at surface.
More metadata and information on methods are available here.
Water quality data from the ICF/USBR Directed Outflow Project.
a tibble with 3,473 rows and 16 variables
Name of the source dataset.
Station where sample was collected. Includes Station_Code and Habitat from the source dataset because multiple habitats are collected at each station.
Latitude at start of zooplankton tow in decimal degrees.
Longitude at start of zooplankton tow in decimal degrees.
Were lat/long coordinates collected in the field (TRUE), or do they represent the location of a fixed station (FALSE)?
Date sample was collected.
Date and time of sample collection.
Bottom depth at start of trawl (m).
Secchi depth (cm).
Temperature (°C) at surface.
Salinity at surface.
Specific conductance (µS cm-1) at surface.
Dissolved oxygen (mg/L) at surface.
pH (dimensionless) at surface.
Turbidity (FNU) at surface.
Chlorophyll-a concentration (µg L-1) at surface.
More metadata and information on methods are available here.
Water quality data from the United States Fish and Wildlife Service Enhanced Delta Smelt Monitoring Program.
a tibble with 30,957 rows and 14 variables
Name of the source dataset.
Station where sample was collected. This represents an identifier for the unique EDSM target location. Multiple tows (and water quality samples) were often collected at each target location on a day.
Latitude in decimal degrees. This is the actual latitude of the sample collection, not the latitude of the target location.
Longitude in decimal degrees. This is the actual longitude of the sample collection, not the longitude of the target location.
Were lat/long coordinates collected in the field (TRUE), or do they represent the location of a fixed station (FALSE)?
Date sample was collected.
Date and time of sample collection.
Bottom depth (m).
Tidal stage.
Secchi depth (cm).
Temperature in °C.
Temperature (°C) at bottom.
Specific conductance (µS cm-1) at surface.
Dissolved oxygen (mg/L) at surface.
Dissolved oxygen (mg/L) at bottom.
More metadata and information on methods are available here.
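Because several tows (and water quality samples) can share one target location and date, some analyses may want a single value per station and day. The following is only a minimal sketch in base R using invented data; the column names `Station`, `Date`, and `Temperature` are assumptions made for illustration, and averaging is just one possible way to summarize repeated samples.

```r
# Toy data standing in for EDSM records (values are invented for illustration;
# the column names Station, Date, and Temperature are assumptions)
edsm_toy <- data.frame(
  Station = c("A", "A", "A", "B"),
  Date = as.Date(c("2020-01-06", "2020-01-06", "2020-01-07", "2020-01-06")),
  Temperature = c(10.0, 11.0, 12.0, 9.5)
)

# Average temperature across tows within each station and date
daily_mean <- aggregate(Temperature ~ Station + Date, data = edsm_toy, FUN = mean)

# Sort for readability
daily_mean <- daily_mean[order(daily_mean$Station, daily_mean$Date), ]
print(daily_mean)
```

Whether a mean, a median, or the first sample of the day is most appropriate depends on the analysis.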
Water quality data from the California Department of Water Resources Environmental Monitoring Program.
a tibble with 17,366 rows and 68 variables
Name of the source dataset.
Station where sample was collected.
Latitude in decimal degrees.
Longitude in decimal degrees.
Were lat/long coordinates collected in the field (TRUE), or do they represent the location of a fixed station (FALSE)?
Date sample was collected.
Date and time sample was collected.
Notes or comments.
Bottom depth (m).
Tidal stage (always High Slack).
Microcystis bloom intensity on a qualitative scale from 1 to 5 where 1 = absent, 2 = low, 3 = medium, 4 = high, and 5 = very high.
Whether the Chlorophyll value is lower than reported ("<" because it is below the reporting limit and the reporting limit is used as the value), or reported as the measured value "=".
Chlorophyll concentration (µg L-1).
Secchi depth (cm).
Temperature (°C) at surface.
Temperature (°C) at bottom.
Specific conductance (µS cm-1) at surface.
Specific conductance (µS cm-1) at bottom.
Dissolved oxygen (mg/L) at surface.
Dissolved oxygen (mg/L) at bottom.
Dissolved oxygen percent (dimensionless) at surface.
Dissolved oxygen percent (dimensionless) at bottom.
pH (dimensionless) at surface.
pH (dimensionless) at bottom.
Turbidity (NTU) at surface.
Turbidity (NTU) at bottom.
Turbidity (FNU) at surface.
Turbidity (FNU) at bottom.
Whether the Pheophytin value is lower than reported ("<" because it is below the reporting limit and the reporting limit is used as the value), or reported as the measured value "=".
Pheophytin concentration (µg L-1).
Whether the Total Alkalinity value is lower than reported ("<" because it is below the reporting limit and the reporting limit is used as the value), or reported as the measured value "=".
Total Alkalinity (mg/L).
Whether the Total Ammonia value is lower than reported ("<" because it is below the reporting limit and the reporting limit is used as the value), or reported as the measured value "=".
Total Ammonia (mg/L).
Whether the Dissolved Ammonia value is lower than reported ("<" because it is below the reporting limit and the reporting limit is used as the value), or reported as the measured value "=".
Dissolved Ammonia (mg/L). If DissAmmonia_Sign is <, this is equal to the reporting limit, NA = RL unknown.
Whether the Dissolved Bromide value is lower than reported ("<" because it is below the reporting limit and the reporting limit is used as the value), or reported as the measured value "=".
Dissolved bromide (mg/L).
Whether the Dissolved Calcium value is lower than reported ("<" because it is below the reporting limit and the reporting limit is used as the value), or reported as the measured value "=".
Dissolved calcium (mg/L).
Whether the Total Chloride value is lower than reported ("<" because it is below the reporting limit and the reporting limit is used as the value), or reported as the measured value "=".
Total chloride (mg/L).
Whether the Dissolved Chloride value is lower than reported ("<" because it is below the reporting limit and the reporting limit is used as the value), or reported as the measured value "=".
Dissolved chloride (mg/L).
Whether the Dissolved Nitrate Nitrite value is lower than reported ("<" because it is below the reporting limit and the reporting limit is used as the value), or reported as the measured value "=".
Dissolved Nitrate and Nitrite (mg/L). If DissNitrateNitrite_Sign is <, this is equal to the reporting limit, with NA = RL unknown.
Whether the DOC value is lower than reported ("<" because it is below the reporting limit and the reporting limit is used as the value), or reported as the measured value "=".
Dissolved organic carbon (mg/L).
Whether the TOC value is lower than reported ("<" because it is below the reporting limit and the reporting limit is used as the value), or reported as the measured value "=".
Total Organic Carbon (mg/L).
Whether the DON value is lower than reported ("<" because it is below the reporting limit and the reporting limit is used as the value), or reported as the measured value "=".
Dissolved Organic Nitrogen (mg/L).
Whether the Total Organic Nitrogen value is lower than reported ("<" because it is below the reporting limit and the reporting limit is used as the value), or reported as the measured value "=".
Total Organic Nitrogen (mg/L).
Whether the Dissolved Ortho-phosphate value is lower than reported ("<" because it is below the reporting limit and the reporting limit is used as the value), or reported as the measured value "=".
Dissolved Ortho-phosphate (mg/L). If DissOrthophos_Sign is <, this is equal to the reporting limit, with NA = RL unknown.
Whether the Total Phosphate value is lower than reported ("<" because it is below the reporting limit and the reporting limit is used as the value), or reported as the measured value "=".
Total phosphorus (mg/L).
Whether the Dissolved Silica value is lower than reported ("<" because it is below the reporting limit and the reporting limit is used as the value), or reported as the measured value "=".
Dissolved silica (mg/L).
Whether the Total Dissolved Solids value is lower than reported ("<" because it is below the reporting limit and the reporting limit is used as the value), or reported as the measured value "=".
Total Dissolved Solids (mg/L).
Whether the TSS value is lower than reported ("<" because it is below the reporting limit and the reporting limit is used as the value), or reported as the measured value "=".
Total Suspended Solids (mg/L).
Whether the VSS value is lower than reported ("<" because it is below the reporting limit and the reporting limit is used as the value), or reported as the measured value "=".
Volatile Suspended Solids (mg/L).
Whether the TKN value is lower than reported ("<" because it is below the reporting limit and the reporting limit is used as the value), or reported as the measured value "=".
Total Kjeldahl Nitrogen (mg/L).
More metadata and information on methods are available here.
For more information on _Sign variables, see sign_variables.
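The _Sign columns make it possible to separate measured values from values censored at the reporting limit. The sketch below uses invented data to show one way such a pair of columns might be handled. `DissAmmonia_Sign` is named in the descriptions above, while `DissAmmonia` as the name of the value column is an assumption. Substituting half the reporting limit is a common but debatable convention and is not something this package is stated to do.

```r
# Toy data; values are invented, and DissAmmonia is an assumed column name
emp_toy <- data.frame(
  DissAmmonia_Sign = c("=", "<", "=", "<"),
  DissAmmonia = c(0.12, 0.05, 0.30, NA)  # NA = reporting limit unknown
)

# Flag censored records (value is below the reporting limit)
emp_toy$censored <- emp_toy$DissAmmonia_Sign == "<"

# One simple convention: replace censored values with half the reporting limit.
# Rows with an unknown reporting limit stay NA.
emp_toy$DissAmmonia_half_rl <- ifelse(
  emp_toy$censored,
  emp_toy$DissAmmonia / 2,
  emp_toy$DissAmmonia
)
print(emp_toy)
```

Keeping the original value and sign columns alongside any substituted column leaves the censoring information available for methods that model it directly.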
Water quality data from the California Department of Fish and Wildlife Fall Midwater Trawl.
a tibble with 29,237 rows and 16 variables
Name of the source dataset.
Station where sample was collected.
Latitude in decimal degrees.
Longitude in decimal degrees.
Date sample was collected.
Date and time of sample collection.
Bottom depth (m).
Tidal stage.
Microcystis bloom intensity on a qualitative scale from 1 to 5 where 1 = absent, 2 = low, 3 = medium, 4 = high, and 5 = very high.
Secchi depth (cm).
Was Secchi depth estimated? Y/N
Temperature (°C) at surface.
Temperature (°C) at bottom.
Specific conductance (µS cm-1) at surface.
Specific conductance (µS cm-1) at bottom.
Turbidity (NTU) at surface.
More metadata and information on methods are available here.
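The qualitative Microcystis scale can be easier to read in summaries and plots when the numeric scores are turned into labeled, ordered categories. This is a small sketch with invented scores; the labels come from the scale described above, and the object names are hypothetical.

```r
# Invented Microcystis scores on the 1 to 5 scale (NA = not recorded)
microcystis_score <- c(1, 3, 5, 2, NA)

# Map scores to the labels from the scale description, keeping their order
microcystis_label <- factor(
  microcystis_score,
  levels = 1:5,
  labels = c("absent", "low", "medium", "high", "very high"),
  ordered = TRUE
)
print(microcystis_label)
```

An ordered factor keeps comparisons such as `microcystis_label >= "medium"` meaningful.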
Water quality data from the California Department of Water Resources North Central Region Office.
a tibble with 10,652 rows and 49 variables
Name of the source dataset.
Station where sample was collected.
Latitude in decimal degrees.
Longitude in decimal degrees.
Date sample was collected.
Date and time sample was collected.
Secchi depth (cm).
Microcystis bloom intensity on a qualitative scale from 1 to 5 where 1 = absent, 2 = low, 3 = medium, 4 = high, and 5 = very high.
Temperature (°C) at surface.
Specific conductance (µS cm-1) at surface.
Dissolved oxygen (mg/L) at surface.
Dissolved oxygen (mg/L) at bottom.
pH (dimensionless) at surface.
Turbidity (NTU) at surface.
Turbidity (FNU) at surface.
Whether the Total Alkalinity value is lower than reported ("<" because it is below the reporting limit and the reporting limit is used as the value), or reported as the measured value "=".
Total Alkalinity (mg/L).
Whether the Dissolved Ammonia value is lower than reported ("<" because it is below the reporting limit and the reporting limit is used as the value), or reported as the measured value "=".
Dissolved Ammonia (mg/L). If DissAmmonia_Sign is <, this is equal to the reporting limit.
Whether the Dissolved Bromide value is lower than reported ("<" because it is below the reporting limit and the reporting limit is used as the value), or reported as the measured value "=".
Dissolved bromide (mg/L). If DissBromide_Sign is <, this is equal to the reporting limit.
Whether the Dissolved Calcium value is lower than reported ("<" because it is below the reporting limit and the reporting limit is used as the value), or reported as the measured value "=".
Dissolved calcium (mg/L). If DissCalcium_Sign is <, this is equal to the reporting limit.
Whether the Dissolved chloride value is lower than reported ("<" because it is below the reporting limit and the reporting limit is used as the value), or reported as the measured value "=".
Dissolved chloride (mg/L).
Whether the Chlorophyll value is lower than reported ("<" because it is below the reporting limit and the reporting limit is used as the value), or reported as the measured value "=".
Chlorophyll concentration (µg L-1).
Whether the Pheophytin value is lower than reported ("<" because it is below the reporting limit and the reporting limit is used as the value), or reported as the measured value "=".
Pheophytin concentration (µg L-1).
Whether the Dissolved Nitrate Nitrite value is lower than reported ("<" because it is below the reporting limit and the reporting limit is used as the value), or reported as the measured value "=".
Dissolved Nitrate and Nitrite (mg/L). If DissNitrateNitrite_Sign is <, this is equal to the reporting limit.
Whether the Dissolved Organic Carbon value is lower than reported ("<" because it is below the reporting limit and the reporting limit is used as the value), or reported as the measured value "=".
Dissolved organic carbon (mg/L). If DOC_Sign is <, this is equal to the reporting limit.
Whether the Total Organic Carbon value is lower than reported ("<" because it is below the reporting limit and the reporting limit is used as the value), or reported as the measured value "=".
Total Organic Carbon (mg/L). If TOC_Sign is <, this is equal to the reporting limit.
Whether the Dissolved Organic Nitrogen value is lower than reported ("<" because it is below the reporting limit and the reporting limit is used as the value), or reported as the measured value "=".
Dissolved Organic Nitrogen (mg/L). If DON_Sign is <, this is equal to the reporting limit.
Whether the Dissolved Orthophos value is lower than reported ("<" because it is below the reporting limit and the reporting limit is used as the value), or reported as the measured value "=".
Dissolved Ortho-phosphate (mg/L). If DissOrthophos_Sign is <, this is equal to the reporting limit.
Whether the Total Phosphate value is lower than reported ("<" because it is below the reporting limit and the reporting limit is used as the value), or reported as the measured value "=".
Total phosphorus (mg/L). If TotPhos_Sign is <, this is equal to the reporting limit.
Whether the Total Suspended Solids value is lower than reported ("<" because it is below the reporting limit and the reporting limit is used as the value), or reported as the measured value "=".
Total suspended solids (mg/L). If TSS_Sign is <, this is equal to the reporting limit.
Whether the Volatile Suspended Solids value is lower than reported ("<" because it is below the reporting limit and the reporting limit is used as the value), or reported as the measured value "=".
Volatile suspended solids (mg/L). If VSS_Sign is <, this is equal to the reporting limit.
Whether the Total Kjeldahl Nitrogen value is lower than reported ("<" because it is below the reporting limit and the reporting limit is used as the value), or reported as the measured value "=". "NA" indicates reporting limit unknown.
Total Kjeldahl Nitrogen (mg/L). If TKN_Sign is <, this is equal to the reporting limit.
Whether the Total Dissolved Solids value is lower than reported ("<" because it is below the reporting limit and the reporting limit is used as the value), or reported as the measured value "=".
Total Dissolved Solids (mg/L).
Contact Jared Frantzich [email protected] for more information.
For more information on _Sign variables, see sign_variables.
Water quality data from the California Department of Water Resources Stockton Dissolved Oxygen monitoring.
a tibble with 3,112 rows and 16 variables
Name of the source dataset.
Station where sample was collected.
Latitude in decimal degrees.
Longitude in decimal degrees.
Date sample was collected.
Date and time of sample collection.
Microcystis bloom intensity on a qualitative scale from 1 to 5 where 1 = absent, 2 = low, 3 = medium, 4 = high, and 5 = very high.
Secchi depth (cm).
Temperature (°C) at surface.
Temperature (°C) at bottom.
Specific conductance (µS cm-1) at surface.
Specific conductance (µS cm-1) at bottom.
Dissolved oxygen (mg/L) at surface.
Dissolved oxygen (mg/L) at bottom.
pH (dimensionless) at surface.
pH (dimensionless) at bottom.
More metadata and information on methods are available here.
Water quality data from the California Department of Fish and Wildlife Spring Kodiak Trawl.
a tibble with 4,505 rows and 13 variables
Name of the source dataset.
Station where sample was collected.
Latitude in decimal degrees.
Longitude in decimal degrees.
Were lat/long coordinates collected in the field (TRUE), or do they represent the location of a fixed station (FALSE)?
Date sample was collected.
Date and time of sample collection.
Bottom depth (m).
Tidal stage.
Secchi depth (cm).
Temperature (°C) at surface.
Specific conductance (µS cm-1) at surface.
More metadata and information on methods are available here.
Water quality data from the California Department of Fish and Wildlife Smelt Larva Survey.
a tibble with 2,889 rows and 12 variables
Name of the source dataset.
Station where sample was collected.
Latitude in decimal degrees.
Longitude in decimal degrees.
Date sample was collected.
Date and time of sample collection.
Bottom depth (m).
Tidal stage.
Secchi depth (cm).
Temperature (°C) at surface.
Specific conductance (µS cm-1) at surface.
More metadata and information on methods are available here.
Water quality data from the California Department of Fish and Wildlife Summer Townet Survey.
a tibble with 8,074 rows and 16 variables
Name of the source dataset.
Station where sample was collected.
Latitude in decimal degrees.
Longitude in decimal degrees.
Date sample was collected.
Date and time of sample collection.
Bottom depth (m).
Tidal stage.
Microcystis bloom intensity on a qualitative scale from 1 to 5 where 1 = absent, 2 = low, 3 = medium, 4 = high, and 5 = very high.
Secchi depth (cm).
Temperature (°C) at surface.
Temperature (°C) at bottom.
Specific conductance (µS cm-1) at surface.
Specific conductance (µS cm-1) at bottom.
Turbidity (NTU) at surface.
More metadata and information on methods are available here.
Water quality data from the UC Davis Suisun Marsh Fish Study.
a tibble with 14,206 rows and 11 variables
Name of the source dataset.
Station where sample was collected.
Latitude in decimal degrees.
Longitude in decimal degrees.
Date sample was collected.
Date and time of sample collection.
Bottom depth (m).
Tidal stage.
Secchi depth (cm).
Temperature (°C) at surface.
Specific conductance (µS cm-1) at surface.
Dissolved oxygen (mg/L) at surface.
Dissolved oxygen percent (dimensionless) at surface.
More metadata and information on methods are available here.
Water quality data from the California Department of Fish and Wildlife 20mm survey.
a tibble with 10,469 rows and 14 variables
Name of the source dataset.
Station where sample was collected.
Latitude in decimal degrees.
Longitude in decimal degrees.
Were lat/long coordinates collected in the field (TRUE), or do they represent the location of a fixed station (FALSE)?
Date sample was collected.
Date and time of sample collection.
Bottom depth (m).
Tidal stage.
Secchi depth (cm).
Temperature (°C) at surface.
Specific conductance (µS cm-1) at surface.
Specific conductance (µS cm-1) at bottom.
More metadata and information on methods are available here.
Water quality data from the United States Bureau of Reclamation Sacramento Deepwater Ship Channel cruises.
a tibble with 904 rows and 13 variables
Name of the source dataset.
Station where sample was collected.
Latitude in decimal degrees.
Longitude in decimal degrees.
Date sample was collected.
Date and time of sample collection.
Bottom depth (m). Only 1 value per station, probably an average?
Depth (m) of surface sample.
Depth (m) of bottom sample.
Chlorophyll concentration (µg L-1).
Temperature (°C) at surface.
Temperature (°C) at bottom.
Specific conductance (µS cm-1) at surface.
Discrete water quality data from the USGS California Water Science Center.
a tibble with 16,751 rows and 19 variables
Name of the source dataset.
Station where sample was collected.
Latitude in decimal degrees.
Longitude in decimal degrees.
Date sample was collected.
Date and time sample was collected.
Whether the Chlorophyll value is estimated (extrapolated at low end) or reported as measured.
Chlorophyll concentration (µg L-1).
Whether the Dissolved Ammonia value is lower than reported ("<" because it is below the reporting limit and the reporting limit is used as the value), estimated "~", or reported as the measured value "=".
Dissolved Ammonia (mg/L). If DissAmmonia_Sign is <, this is equal to the reporting limit, NA = RL unknown.
Whether the Dissolved Nitrate Nitrite value is lower than reported ("<" because it is below the reporting limit and the reporting limit is used as the value), estimated "~", or reported as the measured value "=".
Dissolved Nitrate and Nitrite (mg/L).
Dissolved Organic Carbon (mg/L).
Whether the Dissolved Orthophosphate value is lower than reported ("<" because it is below the reporting limit and the reporting limit is used as the value), estimated "~", or reported as the measured value "=".
Dissolved Ortho-phosphate (mg/L).
Dissolved oxygen (mg/L) at surface.
pH (dimensionless) at surface.
Temperature (°C) at surface.
Specific conductance (µS cm-1) at surface.
More metadata and information on methods are available here for data and here for metadata.
For more information on _Sign variables, see sign_variables.
Water quality data from the United States Geological Survey San Francisco Bay Water Quality Survey.
a tibble with 23,923 rows and 22 variables
Name of the source dataset.
Station where sample was collected.
Latitude in decimal degrees.
Longitude in decimal degrees.
Date sample was collected.
Date and time of sample collection. Reported as an average when the collection times varied among the surface and bottom WQ and nutrient parameters.
Depth (m) of surface sample. Reported as an average when surface depths varied among the WQ parameters.
Depth (m) of bottom sample. Reported as an average when bottom depths varied among the WQ parameters.
Temperature (°C) at surface.
Temperature (°C) at bottom.
Salinity at surface.
Salinity at bottom.
Chlorophyll concentration (µg L-1) at surface.
Dissolved oxygen (mg/L) at surface.
Dissolved oxygen (mg/L) at bottom.
Dissolved oxygen percent (dimensionless) at surface.
Dissolved oxygen percent (dimensionless) at bottom.
Depth (m) paired with nutrient sampling (range: 0-5 m). Reported as an average when surface depths varied among the nutrient parameters.
Dissolved Nitrate and Nitrite (mg/L).
Dissolved Ammonia (mg/L).
Dissolved Ortho-phosphate (mg/L).
Dissolved Silica (mg/L).
More metadata and information on methods are available here for data from 1969-2015 and here for data from 2016-2019.
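Several fields in this dataset are described as averages across parameters that were collected at slightly different times or depths. As a rough illustration of what such an average looks like in R (not a description of how the package computes it), the sketch below averages two invented collection times and two invented depths. All object names and values are hypothetical.

```r
# Invented collection times for one station visit (UTC chosen for illustration)
collection_times <- as.POSIXct(
  c("2019-06-01 10:00:00", "2019-06-01 10:30:00"),
  tz = "UTC"
)

# mean() on POSIXct values returns the midpoint as a POSIXct
avg_time <- mean(collection_times)
print(avg_time)

# Sample depths (m) can be averaged the same way
sample_depths <- c(1.0, 2.0)
avg_depth <- mean(sample_depths)
print(avg_depth)
```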
Imports, filters, and processes water quality datasets and outputs an integrated dataset
wq(Sources = NULL, Start_year = NULL, End_year = NULL)
Sources |
Character vector of data sources for the water quality variables. No default, this must be specified.
Choices include "20mm" (20mm Survey, |
Start_year |
Earliest year you would like included in the dataset. Must be an integer. Defaults to year |
End_year |
Latest year you would like included in the dataset. Must be an integer. Defaults to the current year. |
An integrated dataset
Data <- wq( Sources = c( "20mm", "Baystudy", "DJFMP", "DOP", "EDSM", "EMP", "FMWT", "NCRO", "SDO", "SKT", "SLS", "STN", "Suisun", "USBR", "USGS_CAWSC", "USGS_SFBS", "YBFMP" ) )
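The Start_year and End_year arguments can narrow the result to a range of years. The sketch below is an assumption-laden illustration rather than an official example: it uses three of the source names from the call above, passes integer years as the argument descriptions require, and only calls `wq()` if the discretewq package happens to be installed. The variable names are hypothetical.

```r
# A subset of the source names used in the example above
sources <- c("EMP", "FMWT", "STN")

# Year limits; the L suffix makes them integers, as the arguments require
start_year <- 2000L
end_year <- 2010L

# Only call wq() when the discretewq package is available
if (requireNamespace("discretewq", quietly = TRUE)) {
  Data <- discretewq::wq(
    Sources = sources,
    Start_year = start_year,
    End_year = end_year
  )
  # Inspect the structure of the integrated dataset
  str(Data)
}
```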
Water quality data from the California Department of Water Resources Yolo Bypass Fish Monitoring Program.
a tibble with 8,883 rows and 14 variables
Name of the source dataset.
Station where sample was collected.
Latitude in decimal degrees.
Longitude in decimal degrees.
Date sample was collected.
Date and time sample was collected.
Tidal stage ('overtopping' refers to periods of floodplain inundation that drown out tidal effects).
Microcystis bloom intensity on a qualitative scale from 1 to 5 where 1 = absent, 2 = low, 3 = medium, 4 = high, and 5 = very high.
Secchi depth (cm).
Temperature (°C) at surface.
Specific conductance (µS cm-1) at surface.
Dissolved oxygen (mg/L) at surface.
pH (dimensionless) at surface.
Notes or comments.
More metadata and information on methods are available here and here.