
Releases: nsidc/earthaccess

v0.6.0

20 Sep 21:55
b3572d7

Bug fixes

  • earthaccess.search_datasets() and earthaccess.search_data() can find restricted datasets #296
  • distributed serialization fixed for EarthAccessFile #301 and #276

New features

  • earthaccess.get_s3fs_session() can now use search results to find the right set of S3 credentials
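
A minimal sketch of the new behavior (the results keyword argument is an assumption based on the description above; the DOI is borrowed from the v0.5.3 notes):

import earthaccess

earthaccess.login()

# run any search; the credentials can then be inferred from its results
results = earthaccess.search_data(doi="10.5067/SLREF-CDRV3", count=1)

# assumed keyword: pass the results so the session uses the right DAAC's credentials
fs = earthaccess.get_s3fs_session(results=results)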

v0.5.3

01 Aug 15:44

Enhancements

  • We can now search by DOI at the granule level: if a matching collection is found, earthaccess will grab its concept_id from the CMR record and search with it.
  • We can use pattern matching on granule file names! (closes #198) Combining the two, we can run searches like:
results = earthaccess.search_data(
    doi="10.5067/SLREF-CDRV3",
    granule_name="2005-*.nc",
    count=100
)
  • If using a remote Dask cluster, earthaccess will open the files using HTTPS links and switch on the fly to S3 links if the cluster is in us-west-2 (see the sketch after this list). Thanks to @jrbourbeau! This change implemented a thin wrapper around fsspec.AbstractFileSystem.

  • The granule representation removed the spatial output (it was a blob of JSON) in favor of a simpler is_cloud_hosted flag, until we have a nicer spatial formatter.
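
A hedged sketch of the Dask behavior above, assuming a dask.distributed cluster and the results object from the search example:

import earthaccess
from dask.distributed import Client

# hypothetical scheduler address; any remote cluster would do
client = Client("tcp://scheduler-address:8786")

# earthaccess.open() returns file-like objects; the fsspec wrapper uses
# HTTPS links and switches to S3 links when the cluster is in us-west-2
files = earthaccess.open(results)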

Bugs fixed

  • The size() method for granules had a typo and always returned 0; this is fixed
  • HTTPS sessions went back to trust_env=False; with a True value, the session reads the .netrc and sends both basic auth and tokens at the same time, causing an authentication error with most services

Documentation improvements

  • Reorganized the docs to include resources and end-to-end examples
  • The README now uses the SSHA dataset from PODAAC, as it is simpler to explain and work with than ATL data (addresses #241)
  • SSL and EMIT examples are included in the documentation; they are executed end to end on CI
  • Added a minimal example of search_data() filtering, thanks @andypbarrett!

CI maintenance

  • Integration tests live in a separate file
  • Integration tests now run only on pushes to main
  • Documentation is updated only when main is updated
  • PODAAC has already migrated all their data to the cloud, so there is no point in keeping it in the on_prem tests

Contributors to this release

@MattF-NSIDC @jrbourbeau @mrocklin @andypbarrett @betolink

🚀

v0.5.2

21 Apr 00:02
618311d
  • Deprecated Benedict as the dict data structure in favor of the built-in Python dict. Thanks @psarka!
  • Fixed the NSIDC S3 credentials endpoint
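
A quick, illustrative way to exercise the fixed endpoint, using the top-level helper introduced in v0.5.0:

import earthaccess

earthaccess.login()

# fetch temporary S3 credentials from the (now fixed) NSIDC endpoint
s3_credentials = earthaccess.get_s3_credentials(daac="NSIDC")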

v0.5.1

20 Mar 14:34
9bae957

This release fixes #212 and implements more testing for the Auth and S3Credentials endpoints. Eventually these endpoints will support bearer tokens, but only ASF does at the moment.

  • Fix call to S3Credentials
  • Fix readthedocs
  • Removed python-magic from core dependencies (this will fix Windows for conda)
  • Updated example notebooks to use the new top level API
  • Support the EARTHDATA_USERNAME and EARTHDATA_PASSWORD environment variables, same as in IcePyx (work in progress with @JessicaS11); see the sketch below
  • Once logged in, we can access our profile (and email) with:
auth = earthaccess.login()

profile = auth.user_profile
email = profile["email_address"]
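
And a minimal sketch of the environment-variable login mentioned above (the values are placeholders; the "environment" strategy name is an assumption):

import os

import earthaccess

# placeholder credentials; in practice these come from the shell or CI secrets
os.environ["EARTHDATA_USERNAME"] = "my_edl_user"
os.environ["EARTHDATA_PASSWORD"] = "my_edl_password"

auth = earthaccess.login(strategy="environment")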

v0.5.0

23 Feb 16:43
3a2d171

This release fixes some bugs and brings new capabilities to the top-level API:

import earthaccess

auth = earthaccess.login()

login() will automatically try all strategies; there is no need to specify one. If our credentials are not found, it will ask the user to provide them interactively.

s3_credentials = earthaccess.get_s3_credentials(daac="PODAAC")
# use them with your fav library, e.g. boto3
# another thing we can do with our auth instance is to refresh our EDL tokens
auth.refresh_tokens()
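
For instance, a hedged boto3 sketch (the credential key names are an assumption about the EDL S3 endpoint response format):

import boto3

# assumed key names in the credentials dictionary
s3 = boto3.client(
    "s3",
    aws_access_key_id=s3_credentials["accessKeyId"],
    aws_secret_access_key=s3_credentials["secretAccessKey"],
    aws_session_token=s3_credentials["sessionToken"],
)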

We can also get authenticated fsspec sessions:

url = "https://data.lpdaac.earthdatacloud.nasa.gov/lp-prod-protected/EMITL2ARFL.001/EMIT_L2A_RFL_001_20220903T163129_2224611_012/EMIT_L2A_RFL_001_20220903T163129_2224611_012.nc"

fs = earthaccess.get_fsspec_https_session()
with fs.open(url) as f:
    data = f.read(10)
data

Or we can use them in tandem with xarray/rioxarray:

import xarray as xr

ds = xr.open_mfdataset(earthaccess.open([url]))
ds

This release fixes #195 and #187, and completes #167.

v0.4.7 AGU Edition

12 Dec 03:59

Bug fixes:

  • Direct-access streaming: .open() now works with granules from search results when we run the code in us-west-2 (see the sketch below)
  • python-magic is a dev dependency; it moved to the dev section in pyproject.toml
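
A hedged sketch of the streaming fix, borrowing the class-based notation and concept_id from the other examples in these notes (assumes the code runs in us-west-2):

import xarray as xr
from earthaccess import Auth, DataGranules, Store

auth = Auth().login(strategy="netrc")
store = Store(auth)

results = DataGranules().concept_id("C2036880672-POCLOUD").get()

# store.open() streams the granules directly instead of downloading them
ds = xr.open_mfdataset(store.open(results))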

v0.4.6

08 Dec 20:31

This is the first formal release under the new name. 0.4.6 will be available on both PyPI and conda-forge.

The first thing to mention is the new API notation, which should evolve to support all the use cases:

import earthaccess

earthaccess.login(strategy="netrc")

granules = earthaccess.search_data(params)

earthaccess.download(granules, local_path="./test")

is equivalent to

from earthdata import Store, Auth, DataGranules

auth = Auth()
auth.login(strategy="netrc")
store = Store(auth)

granules = DataGranules().params(params).get()

store.get(granules, local_path="./test")

We can still use the classes the same way, but eventually only the module-level API will be supported.

Features:

  • search datasets by DOI, e.g.
datasets = earthaccess.search_datasets(
    doi="10.5067/AQR50-3Q7CS"
    cloud_hosted=True
)

Searching by DOI should usually return only one dataset, but the same data may also be available in the cloud; to be sure, we can use the cloud_hosted parameter if we want to operate on the AWS-hosted version.

The documentation is starting to get updated, and soon we should have a "gallery" with more examples of how to use the library.

earthaccess

05 Dec 03:15
Pre-release

First release under the new name: PyPI has been updated and the current earthaccess package installs v0.4.5; conda-forge is still pending.

The old notation is still supported; we can import the classes and instantiate them the same way, but having a simpler notation is probably a better idea. From now on we can do the following:

import earthaccess

earthaccess.login(strategy="netrc")

granules = earthaccess.search_data(params)

earthaccess.download(granules, local_path="./test")

and voila!

This is still beta, and the thought is that we can have a stable package starting with v0.5.0. We need to add more tests and deal with EULAs, as they represent a big obstacle for programmatic access, especially for new NASA accounts.

v0.4.1

02 Nov 20:30

This is a minor release with some bug fixes, and the last one under the old name. The next release will come with the earthaccess name.

  • store.get() had a bug when called with empty lists
  • GESDISC didn't have S3 credential endpoints
  • LP DAAC changed its S3 credential endpoint
  • Documentation from superclasses was not showing due to a change in mkdocstrings; we had to re-implement the inherited members and call super()

v0.4.0

16 Aug 23:43

earthdata can now persist the user's credentials into a .netrc file:

from earthdata import Auth, DataCollections, DataGranules, Store

auth = Auth().login(strategy="netrc")
# are we authenticated?
if not auth.authenticated:
    # ask for credentials and persist them in a .netrc file
    auth.login(strategy="interactive", persist=True)

We can also renew our CMR token to make sure our authenticated queries work:

auth.refresh_token()
collections = DataCollections(auth).concept_id("c-some-restricted-dataset").get()

We can get authenticated fsspec file sessions (closes #41):

store = Store(auth)

fs = store.get_https_session()
# we can use fsspec to get any granule from any DAAC!
fs.get("https://DAAC/granule", "./data")

We can use Store to get our files from a URL list (closes #43):

store = Store(auth)
files = store.get(["https://GRANULE_URL"], "./data/")

Lastly, we can stream certain datasets directly into xarray (even if we are not in AWS):

%%time 
import xarray as xr

query_results = (
    DataGranules()
    .concept_id("C2036880672-POCLOUD")
    .temporal("2018-01-01", "2018-12-31")
    .get()
)
ds = xr.open_mfdataset(store.open(query_results))
ds