This is the API providing most of the bibliographic data to the NYPL Research Catalog front-end. Check the current swagger for the machine-readable API contract.
For local development, it's easiest to just use local node binaries:
nvm use; npm i
nvm use; ENV=qa LOCAL=true npm start
Note that when developing locally, if connecting to an IP ACL-protected index (a practice we're currently deprecating), you may need to add your IP to the access control policy of the relevant ES domain. If your IP has not been authorized, you will see errors such as the following in the application logs:
error: Error connecting to index: 403: {"Message":"User: anonymous is not authorized to perform: es:ESHttpPost because no resource-based policy allows the es:ESHttpPost action"}
Docker files are included for deployment and can be used locally.
To start the container with AWS creds so that the app can decrypt config from config/*:
AWS_ACCESS_KEY_ID=... AWS_SECRET_ACCESS_KEY=... docker-compose up
After making changes, rebuild the image:
docker-compose build
Or, equivalently, to build and run the image and container directly:
docker image build -t discovery-api:local .
docker container rm discovery-api
docker run --name discovery-api -e ENV=qa -e AWS_ACCESS_KEY_ID=... -e AWS_SECRET_ACCESS_KEY=... -p 8082:8082 -it discovery-api:local
This app uses a "PRs Target Main, Merge to Deployment Branches" git workflow. main has the latest-and-greatest commits; production should represent what's in our production environment. Because we deploy often, main and production will often be in parity.
- Cut a feature branch off of main
- Commit changes to your feature branch
- File a pull request against main and assign a reviewer
- After the PR is accepted, merge into main
- Merge main > qa
- Confirm app deploys to QA and run appropriate testing
- Merge main > production
App deploys through GH Actions to ECS when updates are made to deployment branches:
- qa: discovery-api-qa.nypl.org
- qa2: discovery-api-qa2.nypl.org
- production: discovery-api-production.nypl.org
Run all tests:
npm test
Almost all HTTP dependencies are rerouted to fixtures (except for requesting nypl-core mapping files). All fixtures can be updated dynamically (using creds in ./config/production.env) via the following:
Run tests and automatically build any missing Elasticsearch or SCSB fixtures:
UPDATE_FIXTURES=if-missing npm test
The above command can be used to fill in missing fixtures as new tests are written or ES queries change.
As ES queries change, some auto-generated fixtures may no longer be used by any tests. Remove them with this flag:
REMOVE_UNUSED_FIXTURES=true npm test
Note that other Platform API fixtures (e.g. requests on the Bib service like bibs/sierra-nypl/1234) must be fetched and saved manually and then enabled in a before hook via fixtures.enableDataApiFixtures({ %requestpath% : %fixturepath%, ... }). (There's not a great reason for the extra work required to create and use other Platform API fixtures except that there are fewer of them and they tend not to need to change as ES queries change.)
The following summarises the kinds of queries that are possible with this app. See the swagger for the complete OpenAPI 2.0 spec.
Match by keyword:
/resources?q=war peace
Match by exact phrase:
/resources?q="war and peace"
In general, the q param accepts Elastic "Query String Query" strings, which support field-specific and/or boolean searches.
For example, you could use q alone without filters to retrieve records matching war with a dateStartYear overlap on 1999-2012:
/resources?q=war dateStartYear:[1999 TO 2012]
Or get things created after 1999:
/resources?q=dateStartYear:>1999
Or match "war" or "peace":
/resources?q=war OR peace
Check the Elastic "Query String Query" strings docs for more information.
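The example queries above contain spaces, quotes, colons, brackets, and a '>' character, all of which need escaping when sent from code. A minimal sketch, assuming a client that builds the path itself; buildSearchPath is a hypothetical helper, not part of this app:

```javascript
// Hypothetical helper (not part of this app): builds a /resources search path
// from a raw "Query String Query" string. encodeURIComponent escapes spaces,
// quotes, colons, brackets, and '>' so the query syntax survives transport.
function buildSearchPath (q) {
  return `/resources?q=${encodeURIComponent(q)}`
}

// The queries shown in this section, unencoded.
const exampleQueries = [
  'war peace',
  '"war and peace"',
  'war dateStartYear:[1999 TO 2012]',
  'dateStartYear:>1999',
  'war OR peace'
]

for (const q of exampleQueries) {
  console.log(buildSearchPath(q))
}
```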
Filters are applied using a filters param that expects this syntax on the query string:
/resources?filters[property1]=value1&filters[property2]=value2
Where property* is one of: 'owner', 'subjectLiteral', 'holdingLocation', 'deliveryLocation', 'language', 'materialType', 'mediaType', 'carrierType', 'publisher', 'contributor', 'creator', 'issuance', 'createdYear', 'dateAfter', or 'dateBefore'.
The value given should be exact. Do not use quotes.
For example, to filter by English language:
/resources?filters[language]=lang:eng
By contributor (literal):
/resources?filters[contributorLiteral]=Dostoyevsky, Fyodor, 1821-1881.
Filters can be combined across different properties to form a boolean AND. This will match only English books written by Dostoyevsky:
/resources?filters[language]=lang:eng&filters[contributorLiteral]=Dostoyevsky, Fyodor, 1821-1881.
Using two filters for the same property combines them as a boolean OR, but you must add explicit, distinct indexes to the duplicated filter (the ordering of the parameters does not have to match the index order). This is because the AWS API Gateway deserializes the filter parameters to a JSON object, so specifying two filters with the same property key will cause one to be overwritten. It may help, therefore, to think of the filters parameters as a serialized JSON object. For example, this will match Dostoyevsky books written in English OR Russian:
/resources?filters[language][0]=lang:eng&filters[language][1]=lang:rus&filters[contributorLiteral]=Dostoyevsky, Fyodor, 1821-1881.
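Because the filters are effectively a serialized JSON object, a client can hold them as a plain object and emit explicit indexes only for array values. The following is a sketch under that assumption; serializeFilters is a hypothetical helper, and leaving the brackets unencoded mirrors the examples above rather than any documented requirement:

```javascript
// Hypothetical helper (not part of this app): serializes a filters object into
// the bracketed query-string syntax. A single value becomes
// filters[prop]=value; an array becomes filters[prop][0]=..., filters[prop][1]=...
// so that no key is repeated and nothing is overwritten on deserialization.
function serializeFilters (filters) {
  const pairs = []
  for (const [property, value] of Object.entries(filters)) {
    if (Array.isArray(value)) {
      value.forEach((v, index) => {
        pairs.push(`filters[${property}][${index}]=${encodeURIComponent(v)}`)
      })
    } else {
      pairs.push(`filters[${property}]=${encodeURIComponent(value)}`)
    }
  }
  return pairs.join('&')
}

// English OR Russian, AND the given contributor.
const filterQuery = serializeFilters({
  language: ['lang:eng', 'lang:rus'],
  contributorLiteral: 'Dostoyevsky, Fyodor, 1821-1881.'
})
console.log(`/resources?${filterQuery}`)
```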
Filter by publisher:
/resources?filters[publisher]=Oxford University Press,
Filter by date range (resources created anywhere inside the range given, inclusive):
/resources?filters[dateAfter]=1999&filters[dateBefore]=2012
Note that dateStartYear and dateEndYear are often very broad, causing the above to match many things catalogued with range 999-9999. To match against the specific catalogued "created" year, use createdYear:
/resources?filters[createdYear]=1999
All search queries support:
- page: Integer. Page number to retrieve. Default 1.
- per_page: Integer. Number of results to retrieve at a time. Default 50. Valid range 0-100.
All search queries support sorting on:
- title: Case insensitive sort on title. Default ascending.
- date: Sort on dateStartYear. Default descending.
- creator: Case insensitive sort on first creator. (Note the "first" creator may not be the best creator.) Default ascending.
To set a non-default direction use sort_direction=(asc|desc). To sort by relevance (i.e. keyword query), omit the sort param.
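A client may want to validate paging and sorting options before sending them. The sketch below is one way to do that; buildPagingAndSort is a hypothetical helper, and the lower bound of 1 for page is an assumption (only the default is documented above):

```javascript
// Hypothetical helper (not part of this app): validates paging and sorting
// options against the documented values and returns a query-string fragment.
const SORT_FIELDS = ['title', 'date', 'creator']

function buildPagingAndSort ({ page = 1, perPage = 50, sort, sortDirection } = {}) {
  // Assumption: pages start at 1 (the documented default).
  if (!Number.isInteger(page) || page < 1) {
    throw new RangeError('page must be an integer >= 1')
  }
  if (!Number.isInteger(perPage) || perPage < 0 || perPage > 100) {
    throw new RangeError('per_page must be an integer in 0-100')
  }
  const params = new URLSearchParams({ page: String(page), per_page: String(perPage) })
  // Omitting sort leaves the results ordered by relevance.
  if (sort !== undefined) {
    if (!SORT_FIELDS.includes(sort)) throw new RangeError(`unsupported sort: ${sort}`)
    params.set('sort', sort)
    if (sortDirection !== undefined) {
      if (!['asc', 'desc'].includes(sortDirection)) {
        throw new RangeError('sort_direction must be asc or desc')
      }
      params.set('sort_direction', sortDirection)
    }
  }
  return params.toString()
}

console.log(`/resources?q=war&${buildPagingAndSort({ page: 2, perPage: 25, sort: 'title', sortDirection: 'desc' })}`)
```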
All searches above can be retrieved as aggregations. To fetch the standard set of aggregations, append '/aggregations' to a search path. For example:
/resources/aggregations?q=dateStartYear:{1999 TO *}
All aggregations (no filter):
/resources/aggregations
To fetch a specific aggregation (especially useful when fetching more than the default number of buckets):
/resources/aggregation/[aggregation id]
For example to fetch the first 100 subject aggregations:
/resources/aggregation/subject?per_page=100
Note that page= is not supported for aggregations (ES doesn't seem to offer a way to jump to an offset: https://www.elastic.co/guide/en/elasticsearch/reference/current/search-aggregations-bucket-terms-aggregation.html#_size).
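The two aggregation endpoints above differ only in path shape, which can be sketched as a pair of small path builders. Both function names are hypothetical and not part of this app:

```javascript
// Hypothetical helpers (not part of this app) for the aggregation endpoints.

// Standard set of aggregations, optionally scoped by a q string.
function allAggregationsPath (q) {
  return q === undefined
    ? '/resources/aggregations'
    : `/resources/aggregations?q=${encodeURIComponent(q)}`
}

// A single aggregation by id. per_page controls the number of buckets;
// page is deliberately not offered because it is not supported here.
function singleAggregationPath (aggregationId, perPage) {
  const base = `/resources/aggregation/${encodeURIComponent(aggregationId)}`
  return perPage === undefined ? base : `${base}?per_page=${perPage}`
}

console.log(allAggregationsPath('dateStartYear:{1999 TO *}'))
console.log(singleAggregationPath('subject', 100))
```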
Fetch a single resource by bib id:
/resources/b15704876
Or by any item @id:
/resources/b15704876-i25375512
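As a small illustration of the two lookups above, assuming an item @id is the bib id and the item id joined by a hyphen (as the example suggests); resourcePath is a hypothetical helper:

```javascript
// Hypothetical helper (not part of this app): builds the path for a bib, or
// for an item within a bib. Assumes the item @id has the form
// "<bibId>-<itemId>", inferred from the example in this section.
function resourcePath (bibId, itemId) {
  return itemId === undefined
    ? `/resources/${bibId}`
    : `/resources/${bibId}-${itemId}`
}

console.log(resourcePath('b15704876'))
console.log(resourcePath('b15704876', 'i25375512'))
```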
There is currently one feature flag in this app, which is 'no-on-site-edd'. When it is set, all onsite items have an eddRequestable property of false.
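The effect of the flag can be sketched as follows. This is an illustration only, not the app's implementation: the isOnsite property and the array-of-flags shape are assumptions made for the example.

```javascript
// Hypothetical sketch (not this app's code). Assumptions: enabled flags arrive
// as an array of strings, and each item carries a boolean `isOnsite` property.
function applyNoOnSiteEdd (items, enabledFlags) {
  if (!enabledFlags.includes('no-on-site-edd')) return items
  // With the flag set, every onsite item is marked as not EDD-requestable;
  // other items are returned unchanged.
  return items.map((item) =>
    item.isOnsite ? { ...item, eddRequestable: false } : item
  )
}

const sampleItems = [
  { id: 'i1', isOnsite: true, eddRequestable: true },
  { id: 'i2', isOnsite: false, eddRequestable: true }
]
console.log(applyNoOnSiteEdd(sampleItems, ['no-on-site-edd']))
```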
NB: numAvailable and numItem*Parsed counts do not explicitly exclude the e-item, but e-items are not indexed with statuses, volumes, or date ranges, and are therefore not actually included in these counts.
NB: As the table above indicates, there is a mismatch between what the front end and API regard as "electronic items". As far as the API is concerned, there is only at most ONE electronic item, which can have many electronic locator values. numElectronicResources counts these locator values, but the other item count values treat all the electronic resources as a single item.