Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

elastic-package uses error.message as an indicator of pipeline failure #1392

Open
efd6 opened this issue Aug 14, 2023 · 9 comments
Open

elastic-package uses error.message as an indicator of pipeline failure #1392

efd6 opened this issue Aug 14, 2023 · 9 comments
Labels
good first issue Good for newcomers

Comments

@efd6
Copy link
Contributor

efd6 commented Aug 14, 2023

The elastic-package test failure criteria include a check for a non-present error.message field to indicate that there has not been a pipeline failure during pipeline processing. This use is at odds with the documented semantics of the field.

For error in general

These fields can represent errors of any kind.

Use them for errors that happen while fetching events or in cases where the event itself contains an error.

and for the specific field

Error message.

Together there indicate that the field should be available for use by data sources to include source-specific error states for events.

This is currently not possible because a data stream that uses error.message to hold a data stream-specific error state would be identified as an ingest/processing failure by elastic-package.

The event.kind field does support marking pipeline failures with the "pipeline_error" value. So it is possible to distinguish pipeline failures from other uses of error.message, but this is currently brittle as the pipeline failure event kind is not automatic while setting error.message is.

In order for error.message to be fully open to use in its documented semantics, elastic-package would need to not rely on its state to determine whether pipeline failure has occurred, and if event.kind is used to signal pipeline failure, that pipelines all set that field correctly in the case of a pipeline failure.

@jsoriano
Copy link
Member

In order for error.message to be fully open to use in its documented semantics, elastic-package would need to not rely on its state to determine whether pipeline failure has occurred

Yes, I think it would make sense to stop relying on this, and fail only if event.kind is pipeline_error.

if event.kind is used to signal pipeline failure, that pipelines all set that field correctly in the case of a pipeline failure.

Would it make sense to automatically add an on_failure handler that adds event.kind to all pipelines managed by fleet? This could be added in build time by elastic-package or on install time by Fleet.
@juliaElastic @joshdover wdyt? Do you know if we have something similar now?

@jsoriano
Copy link
Member

FTR, teams have been manually adding this on_failure handler to many packages elastic/integrations#6582

@joshdover
Copy link

We have a fleet final pipeline that could easily be updated to do this for all Fleet-managed pipelines: https://github.com/elastic/kibana/blob/12a10d98559fb6fee98d81c7bbb22cf654187bde/x-pack/plugins/fleet/server/constants/fleet_es_assets.ts#L18

@MakoWish
Copy link

MakoWish commented Aug 15, 2023

FTR, teams have been manually adding this on_failure handler to many packages elastic/integrations#6582

There are several PR's still pending approval/merge, but they should all be done once those are merged.

The only thing we may want to consider changing is that they were all initially done with a set processor, but I discussed with @efd6 about using an append processor instead. The reason for suggesting a change to append is to prevent the pipeline error from removing other values from event.kind that may affect the events' validity for Detection Rules, displaying on dashboards that are filtered by event.kind, and potentially other situations as well.

@ishleenk17
Copy link

Would it make sense to automatically add an on_failure handler that adds event.kind to all pipelines managed by fleet? This could be added in build time by elastic-package or on install time by Fleet.

If this is being done at Fleet level, we will not need the change done in pipelines . Eg: here

@ltflb-bgdi
Copy link

ltflb-bgdi commented Jun 21, 2024

Is there any timeline on this? I just ran into it (again) while working on a fix for the aws.cloudfront_log integration.
If skipping errors based on event.kind: pipeline_error you may add a configuration option to silence specific errors on testing.

@kpollich
Copy link
Member

We don't have a timeline on this, but it seems like a reasonable improvement to take on. I'm putting this in an upcoming sprint for the 8.17 cycle tentatively.

@jsoriano
Copy link
Member

Maybe we can disable the check based on error.message in cases where we can check if there were failures based on the failure store (#1832).

@ltflb-bgdi
Copy link

As a workaround I use a specific tag in the test log config and only add the error message if this tag does not exist.
This is far from ideal but does the job for the moment.

IMO the best solution would still be to ignore the presence of error.message in case of event.kind: pipeline_error, or even better only rely on event.kind: pipeline_error since this is the official way to indicate pipeline errors.

Side note: As of today, not all pipelines populate the event.kind field in case of errors (e.g. aws.cloudfront_logs).
Thus, errors of these pipelines do not show up in the built-in ingest error search, since this search filters for both error.message and event.kind: pipeline_error. I strongly recommend to make sure all existing pipelines add pipeline_error to the error handlers where missing.

In case where adding this field is automatically added, please make sure that it is only added to the "global" pipeline on_failure object. I.e. on_failure handlers of processors itself must not lead to pipeline_errors. Use-case. Catching a processor error, do some stuff and proceed without error must still be supported.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
good first issue Good for newcomers
Projects
None yet
Development

No branches or pull requests

7 participants