Releases: nsidc/earthaccess
v0.6.0
v0.5.3
Enhancements
- We can search by DOI at the granule level: if a collection is found, earthaccess will grab the `concept_id` from the CMR record and search using it.
- We will be able to use pattern matching on the granule file names! Closes #198. Combining the two, we could have searches like:
```python
results = earthaccess.search_data(
    doi="10.5067/SLREF-CDRV3",
    granule_name="2005-*.nc",
    count=100
)
```
- If using a remote Dask cluster, earthaccess will open the files using HTTPS links and will switch on the fly to S3 links if the cluster is in us-west-2. Thanks to @jrbourbeau! This change implemented a thin wrapper around `fsspec.AbstractFileSystem` (see the sketch after this list).
- The granule representation removed the spatial output in favor of a simpler `is_cloud_hosted` until we have a nicer spatial formatter (it was a blob of JSON).
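As a rough sketch of how this plays out in practice (the DOI reuses the SSHA example above; the Dask client setup and counts are illustrative, not taken from this release):
```python
# Sketch: the same code works on a laptop (HTTPS links) or on a Dask cluster
# running in us-west-2 (direct S3 links); earthaccess picks the links for us.
import earthaccess
import xarray as xr
from dask.distributed import Client

earthaccess.login()
client = Client()  # local or remote Dask cluster

results = earthaccess.search_data(doi="10.5067/SLREF-CDRV3", count=10)
files = earthaccess.open(results)  # file-like objects backed by HTTPS or S3
ds = xr.open_mfdataset(files)
```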
Bugs fixed
- The `size()` method for granules had a typo and returned 0 all the time; this was fixed.
- HTTPS sessions returned to `trust_env=False`; with a `True` value the session would read the `.netrc` and send both basic auth and tokens at the same time, causing an authentication error with most services.
Documentation improvements
- Reorganized docs to include resources and end-to-end examples
- README now uses the SSHA dataset from PODAAC, as it is simpler to explain and work with compared to ATL data; addresses #241
- SSL and EMIT examples are included in the documentation; they are executed end to end on CI
- Added a minimal example of `search_data()` filtering, thanks @andypbarrett!
CI Maintenance:
- Integration tests are now in a separate file
- Integration tests are going to run only on pushes to main
- Documentation is only going to be updated when we update main
- PODAAC has already migrated all their data to the cloud, so there is no point in having it in the `on_prem` tests
Contributors to this release
@MattF-NSIDC @jrbourbeau @mrocklin @andypbarrett @betolink
v0.5.2
v0.5.1
This release fixes #212 and implements more testing for the Auth and S3 credentials endpoints. Eventually they are going to support bearer tokens, but only ASF does at the moment.
- Fix call to `S3Credentials`
- Fix readthedocs
- Removed `python_magic` from core dependencies (will fix Windows for conda)
- Updated example notebooks to use the new top-level API
- Support EARTHDATA_USERNAME and EARTHDATA_PASSWORD, same as in IcePyx (work in progress with @JessicaS11); see the sketch after this list
- Once logged in we can access our profile (and email):
```python
import earthaccess

auth = earthaccess.login()
profile = auth.user_profile
email = profile["email_address"]
```
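A minimal sketch of the environment-variable flow; the `strategy="environment"` name and the placeholder credentials are assumptions, not taken from these notes:
```python
# Sketch: logging in with credentials from environment variables.
# The "environment" strategy name is assumed here, mirroring "netrc"/"interactive".
import os
import earthaccess

os.environ["EARTHDATA_USERNAME"] = "my_edl_user"      # placeholder
os.environ["EARTHDATA_PASSWORD"] = "my_edl_password"  # placeholder

auth = earthaccess.login(strategy="environment")
print(auth.authenticated)
```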
v0.5.0
This release fixes some bugs and brings new capabilities to the top-level API.
```python
import earthaccess

auth = earthaccess.login()
```
`login()` will automatically try all strategies; there is no need to specify one. If our credentials are not found, it will ask the user to provide them interactively.
```python
s3_credentials = earthaccess.get_s3_credentials(daac="PODAAC")
# use them with your favorite library, e.g. boto3

# another thing we can do with our auth instance is to refresh our EDL tokens
auth.refresh_tokens()
```
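For example, a hedged sketch of handing those credentials to boto3; the dictionary keys (`accessKeyId`, `secretAccessKey`, `sessionToken`) are assumed to match what the DAAC S3 credential endpoints return, and the bucket/prefix are placeholders:
```python
# Sketch: build a boto3 client from the temporary Earthdata S3 credentials.
# Key names are assumptions based on the DAAC credential endpoints.
import boto3
import earthaccess

earthaccess.login()
creds = earthaccess.get_s3_credentials(daac="PODAAC")

s3 = boto3.client(
    "s3",
    aws_access_key_id=creds["accessKeyId"],
    aws_secret_access_key=creds["secretAccessKey"],
    aws_session_token=creds["sessionToken"],
    region_name="us-west-2",  # Earthdata Cloud buckets live in us-west-2
)
# s3.list_objects_v2(Bucket="some-daac-bucket", Prefix="some/prefix")  # placeholders
```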
We can also get authenticated fsspec sessions:
url = "https://data.lpdaac.earthdatacloud.nasa.gov/lp-prod-protected/EMITL2ARFL.001/EMIT_L2A_RFL_001_20220903T163129_2224611_012/EMIT_L2A_RFL_001_20220903T163129_2224611_012.nc"
fs = earthaccess.get_fsspec_https_session()
with fs.open(lpcloud_url) as f:
data = f.read(10)
data
Or we can use them in tandem with xarray/rioxarray:
```python
import xarray as xr

ds = xr.open_mfdataset(earthaccess.open([url]))
ds
```
v0.4.7 AGU Edition
Bug fixes:
- Direct access streaming: `.open()` now works with granules from results when we run the code in us-west-2 (see the sketch after this list)
- python-magic is a dev dependency, moved to the dev section in pyproject.toml
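A rough illustration of the fixed flow, assuming the code runs in us-west-2; the concept_id simply reuses the POCLOUD example from the v0.4.0 notes below:
```python
# Sketch: search for cloud-hosted granules and stream them directly from S3.
# Assumes this runs inside us-west-2 (e.g. an EC2 instance or a hosted notebook).
import earthaccess
import xarray as xr

earthaccess.login()
results = earthaccess.search_data(
    concept_id="C2036880672-POCLOUD",
    temporal=("2018-01-01", "2018-01-31"),
    count=10,
)
ds = xr.open_mfdataset(earthaccess.open(results))
ds
```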
v0.4.6
This is the first formal release under the new name. 0.4.6 will be available on both PyPI and conda-forge.
The first thing to mention is the new API notation, which should evolve to support all the use cases:
```python
import earthaccess

earthaccess.login(strategy="netrc")
granules = earthaccess.search_data(params)
earthaccess.download(granules, local_path="./test")
```
is equivalent to
```python
from earthdata import Store, Auth, DataGranules

auth = Auth()
auth.login(strategy="netrc")
store = Store(auth)
granules = DataGranules().params(params).get()
store.get(granules, local_path="./test")
```
We can still use the classes the same way, but eventually we should support only the module-level API.
Features:
- Search datasets by DOI, e.g.:
```python
datasets = earthaccess.search_datasets(
    doi="10.5067/AQR50-3Q7CS",
    cloud_hosted=True
)
```
Searching by DOI should usually return only one dataset, but I'm not sure what would happen if the same data is also in the cloud; to be sure, we can use the `cloud_hosted` parameter if we want to operate on the AWS-hosted version.
The documentation is starting to be updated, and soon we should have a "gallery" with more examples of how to use the library.
earthaccess
First release under the new name. PyPI was updated and the current earthaccess package installs v0.4.5; conda-forge is still pending.
The old notation is still supported; we can import the classes and instantiate them the same way, but having a simpler notation is probably a better idea. From now on we can do the following:
```python
import earthaccess

earthaccess.login(strategy="netrc")
granules = earthaccess.search_data(params)
earthaccess.download(granules, local_path="./test")
```
and voila!
This is still beta, and the thought is that we can have a stable package starting with v0.5.0; we need to add more tests and deal with EULAs, as they represent a big obstacle for programmatic access, especially for new accounts with NASA.
v0.4.1
This is a minor release with some bug fixes, but the last one with the old name. The next release will come with the `earthaccess` name.
- `store.get()` had a bug when we used it with empty lists
- GESDISC didn't have S3 credential endpoints
- LP DAAC changed its S3 credential endpoint
- Documentation from super classes was not showing due to a new change in mkdocstrings; had to re-implement the inherited members and call `super()`
v0.4.0
earthdata can now persist the user's credentials into a `.netrc` file:
```python
from earthdata import Auth, DataCollections, DataGranules, Store

auth = Auth().login(strategy="netrc")
# are we authenticated?
if not auth.authenticated:
    # ask for credentials and persist them in a .netrc file
    auth.login(strategy="interactive", persist=True)
```
We can also renew our CMR token to make sure our authenticated queries work:
auth.refresh_token()
collections = DataCollections(auth).concept_id("c-some-restricted-dataset").get()
We can get authenticated `fsspec` file sessions; closes #41:
```python
store = Store(auth)
fs = store.get_https_session()
# we can use fsspec to get any granule from any DAAC!
fs.get("https://DAAC/granule", "./data")
```
We can use `Store` to get our files from a URL list; closes #43:
```python
store = Store(auth)
files = store.get(["https://GRANULE_URL"], "./data/")
```
Lastly, we can stream certain datasets directly into xarray (even if we are not in AWS):
```python
%%time
import xarray as xr

query_results = DataGranules().concept_id("C2036880672-POCLOUD").temporal("2018-01-01", "2018-12-31").get()
ds = xr.open_mfdataset(store.open(query_results))
ds
```