Skip to content

Adds a new dlq_custom_codes option to customize DLQ codes #1067

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 11 commits into from
Sep 15, 2022

Conversation

andsel
Copy link
Contributor

@andsel andsel commented Feb 16, 2022

Release notes

Adds dlq_custom_codes option to customize the list of HTTP codes that are considered eligible for DLQ.

What does this PR do?

Introduced a new configuration option, named dlq_custom_codes which accepts a list of HTTP error codes.
The list is to be considered as an addition to the already existing codes that the plugin uses (DLQ_CODES, DOC_SUCCESS_CODES, DOC_CONFLICT_CODE defined in module Common). If the list has an intersection with those error codes then it's reported as a configuration error.

Why is it important/What is the impact to the user?

This is a feature requested in #697, in some cases some error codes returned by Elasticsearch should be handled as DLQ messaged and not retried for ever. For example writing to a read only index report a 403, the index status would never change automatically so it's best to move these events to DLQ for a later reprocessing.
Under normal conditions the user shouldn't customize this list, but it's an helper in some circumstances.

Checklist

  • My code follows the style guidelines of this project
  • [ ] I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation
  • [ ] I have made corresponding change to the default configuration files (and/or docker env variables)
  • I have added tests that prove my fix is effective or that my feature works

Author's Checklist

  • test a version of the plugin with a real ES

How to test this PR locally

  • launch an ES instance, for example docker-compose -f es_and_kb.yml up
    es_and_kb.txt
  • from http://localhost:5601 Kibana UI (credential: elastic, changeme) create and index and put into read only:
PUT /test_index
PUT test_index/_settings 
{ 
  "index.blocks.read_only" : true 
}
  • configure a pipeline to write into:
input {
	stdin {}
}

output {
	elasticsearch {
		index => "test_index"
		hosts => "http://localhost:9200"
		user => "elastic"
		password => "changeme"
		dlq_custom_codes => [403]
	}
}	
  • enable DLQ in config/logstash.yml :
dead_letter_queue.enable: true
  • start the pipeline and generate an input and stop
  • check that the string you typed is present in DLQ
strings data/dead_letter_queue/main/1.log

Related issues

Use cases

HTTP error codes used to decide if an event index reply has to be moved into the DLQ are 400 and 404. These codes are hard coded, this PR aims the user to customize this list extending other than 400 and 404.

@andsel andsel marked this pull request as draft February 16, 2022 13:34
@andsel andsel changed the title Adds a new setting to list DLQ codes Adds a new dlq_custom_codes option to customize DLQ codes Feb 16, 2022
@andsel andsel marked this pull request as ready for review February 16, 2022 15:49
@andsel andsel self-assigned this Feb 17, 2022
@andsel andsel added the on hold label Feb 17, 2022
@andsel andsel marked this pull request as draft February 17, 2022 10:59
@andsel andsel force-pushed the feature/customizable_dlq_codes branch from f114b01 to 0698864 Compare August 18, 2022 09:35
@andsel andsel marked this pull request as ready for review August 18, 2022 10:39
@andsel andsel removed the on hold label Aug 18, 2022
@kaisecheng kaisecheng self-requested a review August 18, 2022 12:50
Comment on lines +531 to +532
This list is an addition to the ordinary error codes considered for this feature, 400 and 404.
It's considered a configuration error to re-use the same predefined codes for success, DLQ or conflict.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks like we already handle 200/201 as explicit success, 400/404 as DLQ, and 409 as conflict, and are declaring it a configuration error to re-use any of these.

If a user can't use this to intercept 409 conflicts, or 404's attempting to update records that don't exist, what error codes do we expect? Only authorization-related ones like 403? Others? Do we have real-world use-cases driving this feature?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This PR aim to solve request #697 which explicitly state that 403 error code could be eventually added as a DLQ trigger code.
While intersection with existing DLQ codes (400 and 404) could be seen as not necessarily a conflict, because both trigger the same feature, using error codes already configured as success (200/201) document conflict error (409) generates confusion in the precedence order when the code process responses.

Suppose that also 409 is customized as a dlq_custom_code what should take precedence? Processing it as a failed action or as DLQ event?

To keep the things simple I imagined to keep out of the set the codes that already has a meaning in the plugin.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@yaauie seems that this needs your feedback?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes. I can unblock this PR.

We can consider a later feature to enable this explicit custom-code routing to also intercept 400/404/409 that are already routed by the implementation, and if and when we do so we have a path to do it in a non-breaking way.

Co-authored-by: Ry Biesemeyer <yaauie@users.noreply.github.com>
@andsel andsel force-pushed the feature/customizable_dlq_codes branch from 5a81f0a to 2e763ef Compare August 22, 2022 13:36
@andsel andsel requested review from yaauie and kaisecheng August 24, 2022 15:42
Copy link
Contributor

@kaisecheng kaisecheng left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Once the CI is fixed, I can approve.

Copy link
Contributor

@kaisecheng kaisecheng left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. CI red due to missing 7.17.7-SNAPSHOT docker image

Copy link
Contributor

@yaauie yaauie left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM 👍🏼

Comment on lines +531 to +532
This list is an addition to the ordinary error codes considered for this feature, 400 and 404.
It's considered a configuration error to re-use the same predefined codes for success, DLQ or conflict.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes. I can unblock this PR.

We can consider a later feature to enable this explicit custom-code routing to also intercept 400/404/409 that are already routed by the implementation, and if and when we do so we have a path to do it in a non-breaking way.

@yaauie
Copy link
Contributor

yaauie commented Sep 15, 2022

I've merged in upstream and resolved conflicting version bump in changelog/gemspec, and will merge/release on successful CI.

@yaauie yaauie merged commit bbba6ae into logstash-plugins:main Sep 15, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Make 403 response configurable for going into DLQ
4 participants