- Add a new attribute to EHydro class called survey_grid. It's a geopandas.GeoDataFrame that includes the survey grid of the eHydro dataset which is a 35-km hexagonal grid.
- Remove dependency on dask.
- Move all NLCD related functions to a separate module called nlcd. This doesn't affect the API since the functions are still available under pygeohydro namespace.
This release provides access to three new datasets:
- USACE Hydrographic Surveys (eHydro)
- USGS Short-Term Network (STN) Flood Event Data (contributed by Fernando Aristizabal)
- NLCD 2021
- Add support for getting topobathymetry data from USACE Hydrographic Surveys (eHydro). The new class is called EHydro and gives users the ability to subset the eHydro dataset by geometry, ID, or SQL queries.
- Add new stnfloodevents module with STNFloodEventData class for retrieving flood event data from the USGS Short-Term Network (STN) RESTful Service. This Python API abstracts away RESTful principles and produces analysis-ready data in geo-referenced GeoDataFrames, DataFrames, lists, or dictionaries as desired. The core class methods available are data_dictionary, get_all_data, and get_filtered_data. These class methods retrieve the data dictionaries by type, get all the available data by type, and make filtered requests for data by type, respectively. The four types of data are instruments, peaks, hwms, and sites. Contributed by Fernando Aristizabal.
- Add a wrapper function for the STNFloodEventData class called stn_flood_event.
- Add support for the new NLCD data (2021) for the three supported layers.
From release 0.15 onward, all minor versions of HyRiver packages will be pinned. This ensures that previous minor versions of HyRiver packages cannot be installed with later minor releases. For example, if you have py3dep==0.14.x installed, you cannot install pydaymet==0.15.x. This is to ensure that the API is consistent across all minor versions.
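The pinning rule above can be illustrated with a small check. This is only a sketch, assuming plain MAJOR.MINOR.PATCH version strings; the helper names minor_series and same_minor are hypothetical and are not part of any HyRiver package.

```python
def minor_series(version: str) -> tuple[int, int]:
    """Return the (major, minor) pair of a MAJOR.MINOR.PATCH version string."""
    major, minor, *_ = version.split(".")
    return int(major), int(minor)


def same_minor(installed: dict[str, str]) -> bool:
    """Check that every package in the mapping shares one (major, minor) series."""
    return len({minor_series(v) for v in installed.values()}) <= 1


# Hypothetical environments: the first is consistent, the second mixes 0.14 and 0.15.
print(same_minor({"py3dep": "0.14.2", "pydaymet": "0.14.0"}))
print(same_minor({"py3dep": "0.14.2", "pydaymet": "0.15.0"}))
```

In practice the package metadata enforces this at install time; the sketch only shows what "same minor version" means.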
- Add a new option to NWIS.get_info, called nhd_info, for retrieving NHDPlus related info on the sites. This will make two new service calls that might slow down the function, so it's disabled by default.
- Update links in NID to the latest CSV and GPKG versions of the NID dataset.
- Add two new properties to NID to access the entire NID dataset. You can use NID.df to access the CSV version as a pandas.DataFrame and NID.gdf to access the GPKG version as a geopandas.GeoDataFrame. Installing pyogrio is highly recommended for much faster reading of the GPKG version.
- Refactor NID.bygeom to use the new NID.gdf property for spatial querying of the dataset. This change should make the query much faster.
- For now, retain compatibility with shapely<2 while supporting shapely>=2.
- Add a new function, called nlcd_area_percent, for computing the percentages of natural, developed, and impervious areas within geometries of a given GeoDataFrame. This function uses imperviousness and land use/land cover data from NLCD to compute the area percentages of the natural, developed, and impervious areas. For more information please refer to the function's documentation.
- Add a new column to the dataframe returned by NWIS.get_info, called nhd_comid, and rename drain_sqkm to nhd_areasqkm. The new drainage area is the best available estimate of stations' drainage area, extracted from the NHDPlus. The new nhd_comid column makes it easier to link stations to NHDPlus.
- In get_camels, return qobs with negative values set to NaN. Also, add a new variable called Newman_2017 to both datasets for identifying the 531 stations that were used in Newman et al. (2017).
- Add a new function, called streamflow_fillna, for filling missing streamflow values (NaN) with day-of-year average values.
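The day-of-year filling idea can be sketched with the standard library alone. This is not the streamflow_fillna implementation; the function name fill_with_doy_mean and the sample data are made up for illustration.

```python
import math
from collections import defaultdict
from datetime import date


def fill_with_doy_mean(dates, values):
    """Replace NaN entries with the mean of valid values on the same day of year."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    # First pass: accumulate valid observations per day of year.
    for d, v in zip(dates, values):
        if not math.isnan(v):
            doy = d.timetuple().tm_yday
            sums[doy] += v
            counts[doy] += 1
    # Second pass: fill gaps where at least one valid observation exists.
    filled = []
    for d, v in zip(dates, values):
        doy = d.timetuple().tm_yday
        if math.isnan(v) and counts[doy]:
            v = sums[doy] / counts[doy]
        filled.append(v)
    return filled


dates = [date(2000, 1, 1), date(2001, 1, 1), date(2002, 1, 1)]
flow = [10.0, float("nan"), 20.0]
print(fill_with_doy_mean(dates, flow))
```

A gap stays NaN when no valid value exists for that day of year. Note that the day-of-year index shifts by one after February in leap years, which this sketch does not correct for.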
- Bump the minimum required version of shapely to 2.0, and use its new API.
- Sync all minor versions of HyRiver packages to 0.14.0.
- Improve performance of all NLCD functions by merging two methods of the NLCD class and also reducing the memory footprint of the functions.
- Add initial support for SensorThings API. Currently, the SensorThings class only supports the Things endpoint. Users need to provide a valid OData filter. The class has an odata_helper function that can be used to generate and validate OData filters. Additionally, using the sensor_info and sensor_property functions, users can request information about sensors themselves or their properties.
- Simplify geometry validation by using pygeoutils.geo2polygon function in ssebopeta_bygeom.
- Fully migrate setup.cfg and setup.py to pyproject.toml.
- Convert relative imports to absolute with absolufy-imports.
- Sync all patch versions of HyRiver packages to x.x.12.
- The NID service has changed some of its endpoints to use Federal ID instead of Dam ID. This change affects the NID.inventory_byid function. This function now accepts Federal IDs instead of dam IDs.
- Refactor the show_versions function to improve performance and print the output in a nicer table-like format.
- Use the new pygeoogc.streaming_download function in huc_wb_full to improve performance and reduce code complexity.
- Skip version 0.13.9 so the minor version of all HyRiver packages becomes the same.
- Modify the codebase based on the latest changes in geopandas related to empty dataframes.
- Use pyright for static type checking instead of mypy and address all typing issues that it raised.
- Add a function called huc_wb_full that returns the full watershed boundary GeoDataFrame of a given HUC level. If only a subset of HUCs is needed, the pygeohydro.WBD class should be used. The full dataset is downloaded from the National Map's WBD staged products.
- Add a new function called irrigation_withdrawals for retrieving estimated monthly water use for irrigation by 12-digit hydrologic unit in the CONUS for 2015 from ScienceBase.
- Add a new property to NID, called data_units, for indicating the units of NID dataset variables.
- The get_us_states function now accepts conus as a subset_key, which is equivalent to contiguous.
- Add get_us_states to __init__ file, so it can be loaded directly, e.g., gh.get_us_states("TX").
- Modify the codebase based on Refurb suggestions.
- Significant performance improvements in NWIS.get_streamflow especially for large requests by refactoring the timezone handling.
- Fix the dam types and purposes mapping dictionaries in NID class.
- Add two new functions for retrieving soil properties across the US:
  - soil_properties: Porosity, available water capacity, and field capacity,
  - soil_gnatsgo: Soil properties from the gNATSGO database.
- Add a new helper function called state_lookup_table for getting a lookup table of US states and their counties. This can be particularly useful for mapping the numeric state_cd and county_cd codes that NWIS returns to state names/codes.
- Add support for getting individual state geometries using get_us_states function by passing their two-letter state code. Also, use TIGER 2022 data for the US states and counties instead of TIGER 2021.
- Remove proplot as a dependency and use matplotlib instead.
- Add the missing PyPI classifiers for the supported Python versions.
- Append "Error" to all exception classes for conforming to PEP-8 naming conventions.
- Deprecate ssebopeta_byloc since it was replaced by ssebopeta_bycoords in version 0.13.0.
- Bump the minimum versions of pygeoogc and pygeoutils to 0.13.5 and that of async-retriever to 0.3.5.
- Add a new argument to the NID.inventory_byid function for staging the entire NID dataset prior to inventory queries. There is a new public method called NID.stage_nid_inventory that can be used to download the entire NID dataset and save it as a feather file. This is useful for inventory queries with a large number of IDs and is much more efficient than querying the NID web service.
- Fix the background value in the cover_statistics function, which should have been 127, not 0. Also, drop the background value from the returned statistics.
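As an illustration of excluding the background value from cover statistics, here is a minimal sketch. It is not the cover_statistics implementation; the function name cover_percentages and the pixel values are hypothetical, and only the background value 127 comes from the entry above.

```python
from collections import Counter

BACKGROUND = 127  # background value named in the entry above


def cover_percentages(pixels):
    """Return {class_value: percent} computed over non-background pixels only."""
    counts = Counter(p for p in pixels if p != BACKGROUND)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {cls: 100.0 * n / total for cls, n in counts.items()}


# Hypothetical flattened raster: two land cover classes plus background pixels.
pixels = [11, 11, 41, 127, 127, 41, 41, 11]
print(cover_percentages(pixels))
```

Because background pixels are dropped before counting, the remaining percentages sum to 100 over the actual land cover classes.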
- Set the minimum supported version of Python to 3.8 since many of the dependencies such as xarray, pandas, rioxarray have dropped support for Python 3.7.
- Remove USGS prefixes from the input station IDs in NWIS.get_streamflow function. Also, check if the remaining parts of the IDs are all digits and throw an exception otherwise. Additionally, make sure that IDs have at least 8 chars by adding leading zeros (:issue_hydro:`99`).
- Use micromamba for running tests and use nox for linting in CI.
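The station ID handling described above (strip the prefix, validate digits, pad to 8 characters) might look roughly like the following. This is a sketch, not the NWIS.get_streamflow code; the function name normalize_station_id is hypothetical, and the optional hyphen after the USGS prefix is an assumption.

```python
def normalize_station_id(station_id: str) -> str:
    """Strip a leading 'USGS' or 'USGS-' prefix, validate digits, and zero-pad to 8 chars."""
    sid = station_id.strip()
    if sid.upper().startswith("USGS"):
        # Assumption: the prefix may be followed by a hyphen, e.g. "USGS-01031500".
        sid = sid[4:].lstrip("-")
    if not sid.isdigit():
        raise ValueError(f"Station ID must contain only digits: {station_id!r}")
    return sid.zfill(8)


print(normalize_station_id("USGS-1031500"))
print(normalize_station_id("01031500"))
```

IDs longer than 8 digits pass through unchanged, since zfill only pads shorter strings.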
- Add a new function called get_us_states to the helpers module for obtaining a GeoDataFrame of the US states. It has an optional argument for returning the contiguous states, continental states, commonwealths, or US territories. The data are retrieved from the Census' TIGER 2021 database.
- In the NID class keep the valid_fields property as a pandas.Series instead of a list, so it can be searched more easily via its str accessor.
- Refactor the plot.signatures function to use proplot instead of matplotlib.
- Improve performance of NWIS.get_streamflow by not validating the layer name when instantiating the WaterData class. Also, make the function more robust by checking if streamflow data is available for each station and throw a warning if not.
- Fix an issue in NWIS.get_streamflow where -9999 values were not being filtered out. According to NWIS, these values are reserved for ice-affected data. This fix sets these values to numpy.nan.
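The sentinel masking mentioned above can be sketched as follows. The actual fix uses numpy.nan; this version uses math.nan from the standard library to stay dependency-free, and the function name mask_sentinel is hypothetical.

```python
import math

NO_DATA = -9999  # sentinel value mentioned in the entry above


def mask_sentinel(values, sentinel=NO_DATA):
    """Return a copy of values with every sentinel entry replaced by NaN."""
    return [math.nan if v == sentinel else v for v in values]


# Hypothetical discharge series with one ice-affected record.
print(mask_sentinel([1.5, -9999, 3.0]))
```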
- Add a new flag to nlcd_* functions called ssl for disabling SSL verification.
- Add a new function called get_camels for getting the CAMELS dataset. The function returns a geopandas.GeoDataFrame that includes basin-level attributes for all 671 stations in the dataset and an xarray.Dataset that contains streamflow data for all 671 stations and their basin-level attributes.
- Add a new function named overland_roughness for getting the overland roughness values from land cover data.
- Add a new class called WBD for getting watershed boundary (HUC) data.
from pygeohydro import WBD
wbd = WBD("huc4")
hudson = wbd.byids("huc4", ["0202", "0203"])
Remove caching-related arguments from all functions since now they can be set globally via three environmental variables:
- HYRIVER_CACHE_NAME: Path to the caching SQLite database.
- HYRIVER_CACHE_EXPIRE: Expiration time for cached requests in seconds.
- HYRIVER_CACHE_DISABLE: Disable reading/writing from/to the cache file.
You can do this like so:
import os
os.environ["HYRIVER_CACHE_NAME"] = "path/to/file.sqlite"
os.environ["HYRIVER_CACHE_EXPIRE"] = "3600"
os.environ["HYRIVER_CACHE_DISABLE"] = "true"
- Write nodata attribute using rioxarray in nlcd_bygeom since the clipping operation of rioxarray uses this value as the fill value.
- Return a named tuple instead of a dict of percentages in the cover_statistics function. It makes accessing the values easier.
- Add pycln as a new pre-commit hook for removing unused imports.
- Remove time zone info from the inputs to plot.signatures to avoid issues with the matplotlib backend.
- Fix an issue in plot.signatures where the new matplotlib version requires a numpy array instead of a pandas.DataFrame.
- Replace no data values of data in ssebopeta_bygeom with np.nan before converting it to mm/day.
- Fix an inconsistency issue with CRS projection when using UTM in nlcd_*. Use EPSG:3857 for all reprojections and get the data from NLCD in the same projection. (:issue_hydro:`85`)
- Improve performance of nlcd_* functions by reducing number of service calls.
- Add type checking with typeguard and fix type hinting issues raised by typeguard.
- Refactor show_versions to ensure getting correct versions of all dependencies.
- The NWIS.get_info now returns a geopandas.GeoDataFrame instead of a pandas.DataFrame.
- Fix a bug in NWIS.get_streamflow where the drainage area might not be computed correctly if target stations are not located at the outlet of their watersheds.
- Use the three new ar.retrieve_* functions instead of the old ar.retrieve function to improve type hinting and to make the API more consistent.
- Fix an issue with NWIS.get_streamflow where the time zone of the data was not being correctly determined when it was given as a US-specific abbreviation such as CST.
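US-specific abbreviations such as CST are not recognized by generic time zone parsers, so one way to handle them is an explicit lookup table of fixed UTC offsets. The following is a sketch of that idea, not the approach used in NWIS.get_streamflow; the table US_TZ_OFFSETS and the function to_utc are hypothetical.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical lookup table of fixed UTC offsets (hours) for common US abbreviations.
US_TZ_OFFSETS = {
    "EST": -5, "EDT": -4,
    "CST": -6, "CDT": -5,
    "MST": -7, "MDT": -6,
    "PST": -8, "PDT": -7,
}


def to_utc(naive_local: datetime, tz_abbr: str) -> datetime:
    """Attach the fixed offset for tz_abbr to a naive timestamp and convert it to UTC."""
    offset = timezone(timedelta(hours=US_TZ_OFFSETS[tz_abbr]))
    return naive_local.replace(tzinfo=offset).astimezone(timezone.utc)


# Noon CST corresponds to 18:00 UTC.
print(to_utc(datetime(2020, 1, 1, 12, 0), "CST"))
```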
- Add support for getting instantaneous streamflow from NWIS in addition to the daily streamflow by adding freq argument to NWIS.get_streamflow that can be either iv or dv. The default is dv to retain the previous behavior of the function.
- Convert the time zone of the streamflow data to UTC.
- Add attributes of the requested stations as attrs parameter to the returned pandas.DataFrame. (:issue_hydro:`75`)
- Add a new flag to NWIS.get_streamflow for returning the streamflow as xarray.Dataset. This dataset has two dimensions: time and station_id. It has ten variables, which include discharge and nine other station attributes. (:issue_hydro:`75`)
- Add drain_sqkm from GagesII to NWIS.get_info.
- Show drain_sqkm in the interactive map generated by interactive_map.
- Add two new functions for getting NLCD data: nlcd_bygeom and nlcd_bycoords. The new nlcd_bycoords function returns a geopandas.GeoDataFrame with the NLCD layers as columns and input coordinates, which should be a list of (lon, lat) tuples, as the geometry column. Moreover, the new nlcd_bygeom function now accepts a geopandas.GeoDataFrame as the input. In this case, it returns a dict with keys as indices of the input geopandas.GeoDataFrame. (:issue_hydro:`80`)
- The previous nlcd function is being deprecated. For now, it calls nlcd_bygeom internally and retains the old behavior. This function will be removed in future versions.
- The ssebop_byloc is being deprecated and replaced by ssebop_bycoords. The new function accepts a pandas.DataFrame as input that should include three columns: id, x, and y. It returns an xarray.Dataset with two dimensions: time and location_id. The id column from the input is used as the location_id dimension. The ssebop_byloc function still retains the old behavior and will be removed in future versions.
- Set the request caching's expiration time to never expire. Add two flags to all functions to control the caching: expire_after and disable_caching.
- Replace NID class with the new RESTful-based web service of National Inventory of Dams. The new NID service is very different from the old one, so this is considered a breaking change.
- Improve exception handling in NWIS.get_info when NWIS returns an error message rather than 500s web service error.
- The NWIS.get_streamflow function now checks if the site info dataset contains any duplicates. Therefore, all the remaining station numbers will be unique. This prevents an issue with setting attrs where duplicate indexes cause an exception when being converted to a dict. (:issue_hydro:`75`)
- Add all the missing types so mypy --strict passes.
- Add support for the Water Quality Portal Web Services. (:issue_hydro:`72`)
- Add support for two versions of NID web service. The original NID web service is considered version 2 and the new NID is considered version 3. You can pass the version number to the NID like so NID(2). The default version is 2.
- Fix an issue with background percentage calculation in cover_statistics.
- Add a new map service for National Inventory of Dams (NID).
- Use importlib-metadata for getting the version instead of pkg_resources to decrease import time as discussed in this issue.
- Refactor cover_statistics to address an issue with wrong category names and also improve performance for large datasets by using numpy's functions.
- Fix an issue with detecting wrong number of stations in NWIS.get_streamflow. Also, improve filtering out stations whose start/end dates don't match the user-requested interval.
The highlight of this release is adding support for NLCD 2019 and significant improvements in NWIS support.
- Add support for the recently released version of NLCD (2019), including the impervious descriptor layer. Highlights of the new database are:
NLCD 2019 now offers land cover for years 2001, 2004, 2006, 2008, 2011, 2013, 2016, 2019, and impervious surface and impervious descriptor products now updated to match each date of land cover. These products update all previously released versions of land cover and impervious products for CONUS (NLCD 2001, NLCD 2006, NLCD 2011, NLCD 2016) and are not directly comparable to previous products. NLCD 2019 land cover and impervious surface product versions of previous dates must be downloaded for proper comparison. NLCD 2019 also offers an impervious surface descriptor product that identifies the type of each impervious surface pixel. This product identifies types of roads, wind tower sites, building locations, and energy production sites to allow deeper analysis of developed features.
—MRLC
- Add support for all the supported regions of NLCD database (CONUS, AK, HI, and PR).
- Add support for passing multiple years to the NLCD function, like so {"cover": [2016, 2019]}.
- Add plot.descriptor_legends function to plot the legend for the impervious descriptor layer.
- New features in NWIS class are:
  - Remove query_* methods since it's not convenient to pass them directly as a dictionary.
  - Add a new function called get_parameter_codes to query parameters and get information about them.
  - To decrease complexity of get_streamflow method add a new private function to handle some tasks.
  - For handling more of NWIS's services make retrieve_rdb more general.
- Add a new argument called nwis_kwds to interactive_map so any NWIS specific keywords can be passed for filtering stations.
- Improve exception handling in get_info method and simplify and improve its performance for getting HCDN.
- Migrate to using AsyncRetriever for handling communications with web services.
- Drop support for Python 3.6 since many of the dependencies such as xarray and pandas have done so.
- Remove get_nid and get_nid_codes functions since NID now has an ArcGIS RESTful service.
- Add a new class called NID for accessing the recently released National Inventory of Dams web service. This service is based on ArcGIS's RESTful service. So now the user just needs to instantiate the class like so NID() and with three methods of AGRBase class, the user can retrieve the data. These methods are: bygeom, byids, and bysql. Moreover, it has an attrs property that includes descriptions of the database fields with their units.
- Refactor NWIS.get_info to be more generic by accepting any valid queries that are documented at USGS Site Web Service.
- Allow for passing a list of queries to NWIS.get_info and use async_retriever that significantly improves the network response time.
- Add two new flags to interactive_map for limiting the stations to those with daily values (dv=True) and/or instantaneous values (iv=True). This function now includes a link to stations webpage on USGS website.
- Use persistent caching for all send/receive requests that can significantly improve the network response time.
- Explicitly include all the hard dependencies in setup.cfg.
- Refactor interactive_map and NWIS.get_info to make them more efficient and reduce their code complexity.
- Add announcement regarding the new name for the software stack, HyRiver.
- Improve pip installation and release workflow.
- Add lxml to deps.
- The official first release of PyGeoHydro with a new name and logo.
- Replace cElementTree with ElementTree since it's been deprecated by defusedxml.
- Make mypy checks more strict and fix all the errors and prevent possible bugs.
- Speed up CI testing by using mamba and caching.
- Rename hydrodata package to PyGeoHydro for publication on JOSS.
- In NWIS.get_info, drop rows that don't have mean daily discharge data instead of slicing.
- Speed up Github Actions by using mamba and caching.
- Improve pip installation by adding pyproject.toml.
- Add support for the National Inventory of Dams (NID) via get_nid function.
- Fix an issue with NWIS.get_info method where stations with False values as their hcdn_2009 value were returned as None instead.
- Bump versions of packages across the stack to the same version.
- Use the new PyNHD function for getting basins, NLDI.get_basins.
- Made mypy checks more strict and added all the missing type annotations.
- Fixed the issue with WaterData due to the recent changes on the server side.
- Updated the examples based on the latest changes across the stack.
- Add support for multipolygon.
- Remove the fill_hole argument.
- Fix a warning in nlcd regarding performing division on nan values.
- Replaced simplejson with orjson to speed-up JSON operations.
- Explicitly sort the time dimension of the ssebopeta_bygeom function.
- Fix an issue with the nlcd function where high resolution requests fail.
- Added a new argument to plot.signatures for controlling the vertical position of the plot title, called title_ypos. This could be useful for multi-line titles.
- Fixed an issue with the nlcd function where None layers were not dropped, causing the function to fail.
This version divides PyGeoHydro into six standalone Python libraries, so many of the changes listed below belong to modules and functions that are now in separate packages. This decision was made to reduce the complexity of the code base and to allow users to install only the packages that they need without having to install all the PyGeoHydro dependencies.
- The services module is now a separate package called PyGeoOGC and is set as a requirement for PyGeoHydro. PyGeoOGC is a leaner package with much fewer dependencies and is suitable for people who might only need an interface to web services.
- Unified function names for getting feature by ID and by box.
- Combined start and end arguments into a tuple argument called dates across the code base.
- Rewrote NLDI function and moved most of its classmethods to Station so now Station class has more cohesion.
- Removed exploratory functionality of ArcGISREST, since it's more convenient to do so from a browser. Now, base_url is a required argument.
- Renamed in_crs in datasets and services functions to geo_crs for geometry and box_crs for bounding box inputs.
- Re-wrote the signatures function from scratch using NamedTuple to improve readability and efficiency. Now, the daily argument should be just a pandas.DataFrame or pandas.Series and the column names are used for legends.
- Removed utils.geom_mask function and replaced it with rasterio.mask.mask.
- Removed width as an input in functions with raster output since resolution is almost always the preferred way to request data. This change made the code more readable.
- Renamed two functions: ArcGISRESTful and wms_bybox. These functions now return requests.Response type output.
- onlyipv4 is now a class method in RetrySession.
- The plot.signatures function now assumes that the input time series are in mm/day.
- Added a flag to get_streamflow function in the NWIS class to convert from cms to mm/day which is useful for plotting hydrologic signatures using the signatures functions.
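The cms to mm/day conversion mentioned above divides the daily discharge volume by the drainage area. A minimal sketch follows, assuming the drainage area is given in square kilometers; the function name cms_to_mmday is hypothetical and is not the flag or helper used in the NWIS class.

```python
def cms_to_mmday(q_cms: float, area_sqkm: float) -> float:
    """Convert discharge in m^3/s to runoff depth in mm/day over an area in km^2."""
    seconds_per_day = 86_400
    area_m2 = area_sqkm * 1e6  # km^2 to m^2
    depth_m_per_day = q_cms * seconds_per_day / area_m2
    return depth_m_per_day * 1000.0  # m to mm


# Hypothetical station: 10 m^3/s over 864 km^2 is about 1 mm/day.
print(cms_to_mmday(10.0, 864.0))
```

Expressing streamflow as a depth makes it directly comparable with precipitation, which is why it suits signature plots.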
- Remove soft requirements from the env files.
- Refactored requests functions into a single class and a separate file.
- Made all the classes available directly from PyGeoHydro.
- Added CodeFactor to the Github pipeline and addressed some issues that CodeFactor found.
- Added Bandit to check the code for security issues.
- Improved docstrings and documentation.
- Added customized exceptions for better exception handling.
- Added pytest fixtures to improve the tests speed.
- Refactored daymet and nwis_siteinfo functions to reduce code complexity and improve readability.
- Major refactoring of the code base while adding type hinting.
- The input geometry (or bounding box) can be provided in any projection and the necessary re-projections are done under the hood.
- Refactored the method for getting object IDs in ArcGISREST class to improve robustness and efficiency.
- Refactored Daymet class to improve readability.
- Add Deepsource for further code quality checking.
- Automatic handling of large WMS requests (more than 8 million pixels, i.e., width x height).
- The json_togeodf function now accepts both a single (Geo)JSON or a list of them.
- Refactored plot.signatures using add_gridspec for a much cleaner code.
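One way to handle a WMS request that exceeds a pixel limit is to split the bounding box into strips that each stay under the limit. The sketch below shows that idea only; it is not necessarily how the package does it. The function name split_bbox is hypothetical, and only the 8 million pixel threshold comes from the list above.

```python
import math

MAX_PIXELS = 8_000_000  # threshold quoted in the list above


def split_bbox(bbox, resolution, max_pixels=MAX_PIXELS):
    """Split (west, south, east, north) into vertical strips of at most max_pixels each."""
    west, south, east, north = bbox
    width = math.ceil((east - west) / resolution)
    height = math.ceil((north - south) / resolution)
    if width * height <= max_pixels:
        return [bbox]
    # Widest strip (in pixel columns) that keeps columns * height under the limit.
    max_cols = max(1, max_pixels // height)
    n_strips = math.ceil(width / max_cols)
    step = (east - west) / n_strips
    return [
        (west + i * step, south, min(west + (i + 1) * step, east), north)
        for i in range(n_strips)
    ]


# Hypothetical 4000 x 4000 pixel request (16 million pixels) at unit resolution.
for tile in split_bbox((0, 0, 4000, 4000), 1):
    print(tile)
```

The sketch splits only along the horizontal axis, so a request taller than max_pixels rows would still exceed the limit; a fuller version would tile in both directions.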
- Added access to WaterData's GeoServer databases.
- Added access to the remaining NLDI database (Water Quality Portal and Water Data Exchange).
- Created a Binder for launching a computing environment on the cloud and testing PyGeoHydro.
- Added a URL repository for the supported services called ServiceURL
- Added support for FEMA web services for flood maps and FWS for wetlands.
- Added a new function called wms_toxarray for converting WMS request responses to xarray.DataArray or xarray.Dataset.
- Re-projection issues for function with input geometry.
- Start and end variables not being initialized when coords was used in Station.
- Geometry mask for xarray.DataArray
- WMS output re-projections
- Refactor requests session
- Improve overall code quality based on CodeFactor suggestions
- Migrate to Github Actions from TravisCI
- Add to conda-forge
- Remove pqdm and arcgis2geojson dependencies
- Added threading capability to the flow accumulation function
- Generalized WFS to include both by bbox and by featureID
- Migrate RTD to pip from conda.
- Changed HCDN database source to GagesII database
- Increased robustness of functions that need network connections
- Made the flow accumulation output a pandas Series for better handling of time series input
- Combined DEM, slope, and aspect in a class called NationalMap.
- Installation from pip installs all the dependencies
- An almost complete re-writing of the code base and not backward-compatible
- New website design
- Added vector accumulation
- Added base classes and functions for accessing any ArcGIS REST, WFS, or WMS service
- Standalone functions for creating datasets from responses and masking the data
- Added threading using pqdm to speed up the downloads
- Interactive map for exploring USGS stations
- Replaced OpenTopography with 3DEP
- Added HCDN database for identifying natural watersheds
- Added new databases: NLDI, NHDPlus V2, OpenTopography, gridded Daymet, and SSEBop
- The gridded data are returned as xarray DataArrays
- Removed dependency on StreamStats and replaced it by NLDI
- Improved overall robustness and efficiency of the code
- Not backward compatible
- Added code style enforcement with isort, black, flake8 and pre-commit
- Added a new shiny logo!
- New installation method
- Changed OpenTopography base url to their new server
- Fixed NLCD legend and statistics bug
- Clipped the obtained NLCD data using the watershed geometry
- Added support for specifying the year for getting NLCD
- Removed direct NHDPlus data download dependency by using StreamStats and USGS APIs
- Renamed get_lulc function to get_nlcd
- Simplified import method
- Changed usage from rst format to ipynb
- Auto-formatting with the black python package
- Change docstring format based on Sphinx
- Fixed pytest warnings and changed its working directory
- Added an example notebook with data files
- Added docstring for all the functions
- Added Module section to the documentation
- Fixed py7zr issue
- Changed 7z extractor from pyunpack to py7zr
- Fixed some linting issues.
- First release on PyPI.