This repository has been archived by the owner on Aug 14, 2024. It is now read-only.

docs(self-hosted): external storage configurations #1269

Closed
wants to merge 5 commits into from
3 changes: 2 additions & 1 deletion src/components/sidebar.tsx
@@ -112,7 +112,8 @@ export default () => {
 <SidebarLink to="/self-hosted/geolocation/">Geolocation</SidebarLink>
 <SidebarLink to="/self-hosted/sso/">Single Sign-On (SSO)</SidebarLink>
 <SidebarLink to="/self-hosted/csp/">Content Security Policy (CSP)</SidebarLink>
-<SidebarLink to="/self-hosted/reverse-proxy">Reverse Proxy</SidebarLink>
+<SidebarLink to="/self-hosted/reverse-proxy/">Reverse Proxy</SidebarLink>
+<SidebarLink to="/self-hosted/external-storage/">External Storage</SidebarLink>
 <SidebarLink to="/self-hosted/troubleshooting/">Troubleshooting</SidebarLink>
 <SidebarLink to="/self-hosted/support/">Support</SidebarLink>
 </ul>
84 changes: 84 additions & 0 deletions src/docs/self-hosted/external-storage.mdx
@@ -0,0 +1,84 @@
---
title: External Storage
@stayallive, May 16, 2024:

Maybe we should generalize to "Data Storage" or something.

This way this document can explain where the data is stored by default and can list alternatives if there are any.

Could have the following sections:

  • Sentry (with a general explanation about postgres, clickhouse and kafka maybe)
    • Filestore (Uploads, Replays)
      • Database
      • Object Storage
    • Nodestore (Event data)
      • Database
  • Vroom (Profiles)
    • Docker volume
    • Object Storage

We should probably either rename those sections to what they specifically store or explain that in the intro, because "Vroom" is not very descriptive. If it's explained that this component is responsible for (ingesting and) storing profiling data, it makes a lot more sense.

Maybe wait until someone else also chimes in before rewriting the whole thing, in case I'm off base with this outline, but this sounds like a document I would love to have had when I started my self-hosted adventures 👍

For the Object Storage thing we might want to link to the relevant documentation instead of adding examples for every option under the sun because otherwise there is no bound to the size of this document.

---

In some cases, storing Sentry data on local disk is not feasible, and it is better to offload it to bucket storage (such as AWS S3 or Google Cloud Storage).
Contributor Author

This seems a bit confusing, but I think we could add another page after this one about "Unsupported Workflows", where we can spell out the kinds of things we can't offer support for (external Redis, external Postgres, installing third-party plugins to extend functionality).

See @azaslavsky's comment here #1269 (comment)


<Alert title="Note" level="info">
After changing configuration files, re-run the <code>./install.sh</code> script to rebuild and restart the containers. See the <Link to="/self-hosted/#configuration">configuration section</Link> for more information.
</Alert>

## Sentry

The Sentry service has an abstraction called "filestore" that handles storing attachments, source maps (release artifacts), and replays. Configure filestore for Sentry in the `sentry/config.yml` file.

### Google Cloud Storage backend

The `gcs` backend points to `sentry.filestore.gcs.GoogleCloudStorage`. You will need to set the `GOOGLE_APPLICATION_CREDENTIALS` environment variable. For more information, refer to the [Google Cloud documentation for setting up authentication](https://cloud.google.com/storage/docs/reference/libraries#setting_up_authentication).

```yaml
filestore.backend: "gcs"
filestore.options:
  bucket_name: "..."
```
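
Because `GOOGLE_APPLICATION_CREDENTIALS` normally holds the path to a service-account JSON key file, a quick check before re-running the install script can catch a missing or mistyped path. The following is a minimal Python sketch under that assumption; the helper name `check_gcs_credentials` is hypothetical and not part of Sentry, and it has to run in the same environment (for example, inside the container) where the variable is set.

```python
import os


def check_gcs_credentials(environ):
    """Return a list of problems with GOOGLE_APPLICATION_CREDENTIALS (empty if none)."""
    path = environ.get("GOOGLE_APPLICATION_CREDENTIALS")
    if not path:
        return ["GOOGLE_APPLICATION_CREDENTIALS is not set"]
    if not os.path.isfile(path):
        # The variable is set but does not point at an existing file.
        return [f"credentials file not found: {path}"]
    return []


# Check the current process environment and report anything that looks wrong.
for problem in check_gcs_credentials(os.environ):
    print(problem)
```

The check only verifies that the file exists; it does not validate the key or the bucket permissions.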

### S3 backend
Member

Just to make sure, have you tried the Azure/s3 compatible backend without issues? We're using GCS so wanted to make sure


<Alert title="Note" level="warning">
Although S3 support is available, it is not thoroughly tested and is not used by Sentry SaaS internally. As a result, the Sentry team can offer only limited support for it.
</Alert>

The `s3` backend, used for S3-compatible storage, points to `sentry.filestore.s3.S3Boto3Storage`.

```yaml
filestore.backend: 's3'
filestore.options:
  bucket_acl: 'private'
  default_acl: 'private'
  access_key: '<REDACTED>'
  secret_key: '<REDACTED>'
  bucket_name: 'my-bucket'
  region_name: 'auto'
  endpoint_url: 'https://<REDACTED>' # If you're not using AWS.
  addressing_style: 'path' # For regular AWS S3, use "auto" or "virtual". For other S3-compatible APIs like MinIO or Ceph, use "path".
  signature_version: 's3v4'
```

Refer to [botocore configuration](https://botocore.amazonaws.com/v1/documentation/api/latest/reference/config.html) for valid configuration values.
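
The comments in the S3 example above boil down to one decision: whether a custom `endpoint_url` is set. A minimal Python sketch of that rule follows. The helper name `s3_filestore_options` and the sample values are hypothetical and not part of Sentry; the function only builds the mapping that would go under `filestore.options`.

```python
def s3_filestore_options(bucket_name, access_key, secret_key,
                         region_name="auto", endpoint_url=None):
    """Build a filestore.options mapping for the S3 backend (illustrative only)."""
    options = {
        "bucket_acl": "private",
        "default_acl": "private",
        "access_key": access_key,
        "secret_key": secret_key,
        "bucket_name": bucket_name,
        "region_name": region_name,
        "signature_version": "s3v4",
        # Regular AWS S3 works with "auto" or "virtual" addressing.
        "addressing_style": "auto",
    }
    if endpoint_url is not None:
        # Other S3-compatible APIs (MinIO, Ceph): custom endpoint, path-style addressing.
        options["endpoint_url"] = endpoint_url
        options["addressing_style"] = "path"
    return options


aws = s3_filestore_options("my-bucket", "my-access-key", "my-secret-key",
                           region_name="us-west-1")
minio = s3_filestore_options("my-bucket", "my-access-key", "my-secret-key",
                             endpoint_url="https://minio.example.com")
```

Here `aws` omits `endpoint_url` and keeps `auto` addressing, while `minio` carries the custom endpoint and switches to `path`.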

## Vroom

Vroom is the service that handles profiling. By default, profiling data is saved on the local filesystem. In a self-hosted deployment, you can store it elsewhere by overriding the `SENTRY_BUCKET_PROFILES` environment variable. Depending on the backend you choose, additional environment variables may also be needed.

### Google Cloud Storage backend

You will need to set the `GOOGLE_APPLICATION_CREDENTIALS` environment variable. For more information, refer to the [Google Cloud documentation for setting up authentication](https://cloud.google.com/storage/docs/reference/libraries#setting_up_authentication).

```bash
gs://my-bucket
```
Comment on lines +53 to +59
Contributor Author

I haven't tested this. I only know how to configure this. Can you guys test this out on your dogfood instance?


### S3 backend

<Alert title="Note" level="warning">
Although S3 support is available, it is not thoroughly tested and is not used by Sentry SaaS internally. As a result, the Sentry team can offer only limited support for it.
</Alert>

```bash
# For regular AWS S3
s3://my-bucket?awssdk=v1&region=us-west-1&endpoint=amazonaws.com

# For other S3-compatible APIs
s3://my-bucket?awssdk=v1&region=any-region&endpoint=minio.yourcompany.com&s3ForcePathStyle=true&disableSSL=true
```

Additional environment variables should be provided:
- `AWS_ACCESS_KEY=foobar`
- `AWS_SECRET_KEY=foobar`
Contributor

nit (here and elsewhere): change the value from foobar to something else, to make clear that the two keys will be different in practice. I would suggest something like your_secret_key or similar.

- `AWS_SESSION_TOKEN=foobar` (optional)

Further explanation on the query string options:
- `region`: The AWS region for requests.
- `endpoint`: The endpoint URL (hostname only or fully qualified URI).
- `disableSSL`: A value of "true" disables SSL when sending requests.
- `s3ForcePathStyle`: A value of "true" forces the request to use path-style addressing.
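
To double-check a bucket URL before assigning it to `SENTRY_BUCKET_PROFILES`, the query string can be split into its options with the Python standard library. This is only a sketch for inspecting the value; it is not how Vroom itself parses the URL, and the helper name `describe_bucket_url` is hypothetical.

```python
from urllib.parse import parse_qs, urlsplit


def describe_bucket_url(url):
    """Split a bucket URL into scheme, bucket name, and query options."""
    parts = urlsplit(url)
    # keep_blank_values retains options written without a value.
    query = parse_qs(parts.query, keep_blank_values=True)
    return {
        "scheme": parts.scheme,
        "bucket": parts.netloc,
        # If an option is repeated, keep its last value.
        "options": {name: values[-1] for name, values in query.items()},
    }


info = describe_bucket_url(
    "s3://my-bucket?awssdk=v1&region=any-region"
    "&endpoint=minio.yourcompany.com&s3ForcePathStyle=true&disableSSL=true"
)
for name, value in info["options"].items():
    print(f"{name} = {value!r}")
```

An option that shows up with an empty value in this listing was written without `=true` and is worth a second look.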