Merge pull request #246 from KrKOo/header-blacklist
UrlStorage blacklist of forwarded headers
brainstorm authored May 19, 2024
2 parents 4cff66c + da9b008 commit b856d0c
Showing 5 changed files with 131 additions and 15 deletions.
28 changes: 21 additions & 7 deletions htsget-config/README.md
@@ -171,19 +171,21 @@ To use `S3Storage`, build htsget-rs with the `s3-storage` feature enabled, and s
`UrlStorage` is another storage backend which can be used to serve data from a remote HTTP URL. When using this storage backend, htsget-rs will fetch data from a `url` which is set in the config. It will also forward any headers received with the initial query, which is useful for authentication.
To use `UrlStorage`, build htsget-rs with the `url-storage` feature enabled, and set the following options under `[resolvers.storage]`:

| Option | Description | Type | Default |
|--------------------------------------|------------------------------------------------------------------------------------------------------------------------------------------------------------------|--------------------------|-----------------------------------------------------------------------------------------------------------------|
| <span id="url">`url`</span> | The URL to fetch data from. | HTTP URL | `"https://127.0.0.1:8081/"` |
| <span id="url">`response_url`</span> | The URL to return to the client for fetching tickets. | HTTP URL | `"https://127.0.0.1:8081/"` |
| `forward_headers` | When constructing the URL tickets, copy HTTP headers received in the initial query. Note, the headers received with the query are always forwarded to the `url`. | Boolean | `true` |
| `tls` | Additionally enables client authentication, or sets non-native root certificates for TLS. See [TLS](#tls) for more details. | TOML table | TLS is always allowed, however the default performs no client authentication and uses native root certificates. |
| Option | Description | Type | Default |
|--------------------------------------|------------------------------------------------------------------------------------------------------------------------------|--------------------------|-----------------------------------------------------------------------------------------------------------------|
| <span id="url">`url`</span> | The URL to fetch data from. | HTTP URL | `"https://127.0.0.1:8081/"` |
| <span id="url">`response_url`</span> | The URL to return to the client for fetching tickets. | HTTP URL | `"https://127.0.0.1:8081/"` |
| `forward_headers` | When constructing the URL tickets, copy HTTP headers received in the initial query. | Boolean | `true` |
| `header_blacklist` | List of headers that should not be forwarded. | Array of headers | `[]` |
| `tls` | Additionally enables client authentication, or sets non-native root certificates for TLS. See [TLS](#tls) for more details. | TOML table | TLS is always allowed, however the default performs no client authentication and uses native root certificates. |

When using `UrlStorage`, the following requests will be made to the `url`.
* `GET` request to fetch only the headers of the data file (e.g. `GET /data.bam`, with `Range: bytes=0-<end_of_bam_header>`).
* `GET` request to fetch the entire index file (e.g. `GET /data.bam.bai`).
* `HEAD` request on the data file to get its length (e.g. `HEAD /data.bam`).

All headers received in the initial query will be included when making these requests.
By default, all headers received in the initial query will be included when making these requests. To exclude certain headers from being forwarded, set the `header_blacklist` option. Blacklisted headers are removed both from the requests made to `url` and from the URL tickets.
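
The filtering described above can be sketched as follows. This is a minimal illustration using only the standard library, not the code from this commit: the function name `remove_blacklisted` and the representation of headers as name/value pairs are hypothetical, and the case-insensitive comparison is an assumption based on HTTP header names being case-insensitive.

```rust
/// Returns the headers that remain after dropping every blacklisted name.
///
/// Hypothetical helper for illustration only; it is not taken from htsget-rs.
fn remove_blacklisted(
    headers: &[(String, String)],
    blacklist: &[String],
) -> Vec<(String, String)> {
    headers
        .iter()
        // HTTP header names are case-insensitive, so compare them that way.
        // Assumption: the real implementation may normalise names differently.
        .filter(|(name, _)| {
            !blacklist
                .iter()
                .any(|blocked| blocked.eq_ignore_ascii_case(name))
        })
        .cloned()
        .collect()
}
```

Under these assumptions, a blacklist of `["Host"]` would drop a `host` header from the forwarded set while leaving a header such as `Authorization` untouched, and an empty blacklist (the default) would forward everything.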


For example, a `resolvers` value of:
```toml
@@ -222,6 +224,18 @@ bucket = 'bucket'
```

`UrlStorage` can only be specified manually.
Example of a resolver with `UrlStorage`:
```toml
[[resolvers]]
regex = ".*"
substitution_string = "$0"

[resolvers.storage]
url = "http://localhost:8080"
response_url = "https://example.com"
forward_headers = true
header_blacklist = ["Host"]
```

There are additional examples of config files located under [`examples/config-files`][examples-config-files].

1 change: 1 addition & 0 deletions htsget-config/src/resolver.rs
@@ -537,6 +537,7 @@ mod tests {
inner: InnerUrl::from_str("https://example.com/").unwrap(),
}),
true,
vec![],
client,
);

14 changes: 14 additions & 0 deletions htsget-config/src/storage/url.rs
@@ -25,6 +25,7 @@ pub struct UrlStorage {
url: ValidatedUrl,
response_url: ValidatedUrl,
forward_headers: bool,
header_blacklist: Vec<String>,
#[serde(skip_serializing)]
tls: TlsClientConfig,
}
@@ -35,6 +36,7 @@ pub struct UrlStorageClient {
url: ValidatedUrl,
response_url: ValidatedUrl,
forward_headers: bool,
header_blacklist: Vec<String>,
client: Client,
}

@@ -63,6 +65,7 @@ impl TryFrom<UrlStorage> for UrlStorageClient {
storage.url,
storage.response_url,
storage.forward_headers,
storage.header_blacklist,
client,
))
}
@@ -74,12 +77,14 @@ impl UrlStorageClient {
url: ValidatedUrl,
response_url: ValidatedUrl,
forward_headers: bool,
header_blacklist: Vec<String>,
client: Client,
) -> Self {
Self {
url,
response_url,
forward_headers,
header_blacklist,
client,
}
}
@@ -99,6 +104,11 @@ impl UrlStorageClient {
self.forward_headers
}

/// Get the headers that should not be forwarded.
pub fn header_blacklist(&self) -> &[String] {
&self.header_blacklist
}

/// Get an owned client by cloning.
pub fn client_cloned(&self) -> Client {
self.client.clone()
@@ -142,6 +152,7 @@ impl UrlStorage {
url: InnerUrl,
response_url: InnerUrl,
forward_headers: bool,
header_blacklist: Vec<String>,
tls: TlsClientConfig,
) -> Self {
Self {
@@ -150,6 +161,7 @@ impl UrlStorage {
inner: response_url,
}),
forward_headers,
header_blacklist,
tls,
}
}
@@ -182,6 +194,7 @@ impl Default for UrlStorage {
url: default_url(),
response_url: default_url(),
forward_headers: true,
header_blacklist: vec![],
tls: TlsClientConfig::default(),
}
}
@@ -206,6 +219,7 @@ mod tests {
"https://example.com".parse::<InnerUrl>().unwrap(),
"https://example.com".parse::<InnerUrl>().unwrap(),
true,
vec![],
client_config,
));

1 change: 1 addition & 0 deletions htsget-search/src/htsget/from_storage.rs
@@ -104,6 +104,7 @@ impl<S> ResolveResponse for HtsGetFromStorage<S> {
url_storage_config.url().clone(),
url_storage_config.response_url().clone(),
url_storage_config.forward_headers(),
url_storage_config.header_blacklist().to_vec(),
));
searcher.search(query.clone()).await
}