Skip to content

DATAES-799 - Support optimistic locking for full update scenario using seq_no + primary_term #441

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 25 commits into from
Apr 29, 2020

Conversation

rpuch
Copy link
Contributor

@rpuch rpuch commented Apr 26, 2020

There is a couple of issues that need resolving. Please provide your opinion on them.

  1. SeqNoPrimaryTerm property is not written to a Document nor it is read from a Document during the mapping; it is 'implicitly transient', even if the corresponding field is marked with a @Field annotation. But if someone marks such a property with a @Field annotation, it will be added to a mapping. This looks like a (minor) discrepancy.
    a. Should we leave it as it is now?
    b. Or, if such a field is marked as a @Field, should we both map it (to/from a Document) and add it it a mapping?
    c. Or should we avoid reading/writing/adding it to a mapping in any case, even if it is explicitly annotated with @Field?
  2. ElasticsearchTemplate did not use any exception translator. As a result, Elasticsearch-specific exceptions are thrown whenever they occur. It seems logical to throw an OptimisticLockingFailureException if a seq_no-related conflict occurs, so I added such a conversion in just one method (indexing one). We need to decide whether it is ok to use an exception translator there (maybe it was not used there by design?), and, if yes, whether it should be used everywhere where ActionFuture result is obtained. My suggestion for now is to just use it in one indexing method, but expand its usage in a subsequent PR dedicated to exception translation in ElasticsearchTemplate as this PR is already pretty large. Anyway, I added a TODO.
  3. if_seq_no and version are not allowed by Elasticsearch in the same index request, it just returns an error for such a request. It may happen that an entity class has both SeqNoPrimaryTerm and @Version properties. In such a case, we could de one of the following:
    a. Throw an exception meaning that such a mapping is invalid
    b. Only send if_seq_no and do not send version. Here, the @Version property only works in a 'read-only' mode: it allows to peek at what version was assigned by the Elasticsearch cluster to a document.
    I've implemented option b for now, because it seems that the use-case with 'read-only' version field may be useful. But, at the same time, a programmer may be confused by his 'version-based optimistic locking' being silently disabled, so option a also seems viable.
  • You have read the Spring Data contribution guidelines.
  • There is a ticket in the bug tracker for the project in our JIRA.
  • You use the code formatters provided here and have them applied to your changes. Don’t submit any formatting related changes.
  • You submit test cases (unit or integration tests) that back your changes.
  • You added yourself as author in the headers of the classes you touched. Amend the date range in the Apache license header if needed. For new types, add the license header (copy from another file and set the current year only).

rpuch added 10 commits April 25, 2020 12:12
…g seq_no + primary_term.

Add SeqNoPrimaryTerm support to ElasticsearchPersistentEntity
…g seq_no + primary_term.

Add seqNo- and primaryTerm-related methods to Document
…g seq_no + primary_term.

Fill SeqNoPrimaryTerm during mapping when reading responses
…g seq_no + primary_term.

Pass seq_no and primary_term to IndexRequest
…g seq_no + primary_term.

Ignore SeqNoPrimaryTerm property during mapping
…g seq_no + primary_term.

Introduce SequenceNumbers
…g seq_no + primary_term.

Make sure that search responses contain seq_no and primary_term when the entity needs SeqNoPrimaryTerm to be filled
…g seq_no + primary_term.

Make sure that usage of SeqNoPrimaryTerm together with @Version does not cause index rejections
…g seq_no + primary_term.

Make tests for ElasticsearchTemplate pass
…g seq_no + primary_term.

Implement seq_no + primary_term support for ReactiveElasticsearchTemplate
@sothawo
Copy link
Collaborator

sothawo commented Apr 26, 2020

Just had a short glimpse, I will have a close look the next days

As for your questions:

  1. answer c). I don't think that these values should be mapped in the the _source of an elasticsearch document. When an entity is written that has a SeqNoPrimaryTerm these values are only used for constructing the request parameters and flags. If it were added to the document and then written to the _source, the value in elasticsearch that is in the
    document is different from what will be returned for _seq_no and _primary_term because on the writing elasticsearch increased the values. So it must not be written to the mappings as well. The MappingBuilder must check if a property is of type SeqNoPrimaryTerm and if so, skip it. We might consider logging a warning if the property has a @Field annotation.

  2. Having the ElasticsearchExceptionTranslator is a pretty recent addition, that we only added to the rest client (the reactive part had it before) after we changed all the calls to go through the execute(callback) method. We did not do this on purpose on the transport client up to now, as we hope that Elasticsearch 8 will be out soon, so we can get
    rid of the transport client then, and we try to keep the work on that part low. So use for this case as you do, but I don't think we need to rework this in the whole transport client.

  3. Not sure about that. In the Elasticsearch documentation about optimistic looking, there is no mention of the _version field. I would prefer answer b, but with an additional warning logged.

…g seq_no + primary_term.

Add refresh() calls
rpuch added 5 commits April 27, 2020 16:17
…g seq_no + primary_term.

Never add SeqNoPrimaryTerm property to mapping. Warn if it is marked for inclusion with @field or a similar annotation.
…g seq_no + primary_term.

Remove a TODO and leave ExceptionTranslator hard-wired for now
…g seq_no + primary_term.

Warn if SeqNoPrimaryTerm property is defined for a versioned entity.
…g seq_no + primary_term.

Refresh before searching
@rpuch
Copy link
Contributor Author

rpuch commented Apr 27, 2020

  1. I've implemented option c) (under no circumstances SeqNoPrimaryTerm property gets mapped or put into mapping). A warning is logged if such a property is annotated with @Field or a similar annotation.
  2. Usage of an exception translator is left only in 'index' scenario for ElasticsearchTemplate.
  3. Option b) is implemented. A warning is logged if an entity is both versioned and contains SeqNoPrimaryTerm property.

Copy link
Collaborator

@sothawo sothawo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the biggest thing is how I would change the SeqNoPrimaryTerm class; the rest I think are minor topics.

* @author Roman Puchkovskiy
* @since 4.0
*/
public class SeqNoPrimaryTerm {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why should the fields be nullable? A document either has a SeqNoPrimaryTerm field set, then the fields of this are not null, or the document does not have it.

This should be an immutable value object. I would make this a final class, change the sequenceNumber and primaryTerm to long. The only constructor should be private and a factory method public static SeqNoPrimaryTerm of(long sequenceNumber, long primaryTerm) would allow for the creation of this object. Inside this method assert that the values aren't below the allowed values.

No need to set the fields in this object after creation. When a SeqNoPrimaryTermis created on reading a response both values are there or none.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@sothawo An immutable value object makes a lot of sense, but I was keeping the following in mind. Imagine that this feature is used in a web-application. A browser makes a request to a RESTful API and gets a response with the data of the entity for editing. Such data must include seqNo and primaryTerm, because they will need to be sent back to the server upon entity save process when the user is done editing. This means that there should be a way to reconstruct the SeqNoPrimaryTerm instance from the data sent from the client. Also, it would be great if such a reconstruction could be done automatically from, let's say, JSON, so that we do not force a programmer to reconstruct it manually.

A mutable structure makes it easy. But having this class an immutable value object, we force a programmer to call SeqNoPrimaryTerm.of() manually in some kind of a converter.

Is this ok?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've transformed SeqNoPrimaryTerm to an immutable value object, but the problem of its usage in a JSON DTO remains open and, I believe, needs an additional discussion.

Comment on lines 259 to 263
if (property.isVersionProperty()) {
if (hasSeqNoPrimaryTermProperty()) {
logger.warn("Both SeqNoPrimaryTerm and @Version properties are defined on {}. Version will not be sent in index requests when seq_no is sent!", getType());
}
}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
if (property.isVersionProperty()) {
if (hasSeqNoPrimaryTermProperty()) {
logger.warn("Both SeqNoPrimaryTerm and @Version properties are defined on {}. Version will not be sent in index requests when seq_no is sent!", getType());
}
}
if (hasVersionProperty() && hasSeqNoPrimaryTermProperty()) {
logger.warn("Both SeqNoPrimaryTerm and @Version properties are defined on {}. Version will not be sent in index requests when seq_no is sent!", getType());
}

We can remove then the duplicate warning from line 254.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we just apply the suggested change, the warning will be emitted for each property added after both SeqNoPrimaryTerm property and a versioned property have been added, so there can be a lot of such warnings per entity class. I've made the check twice for the 2 cases: when SeqNoPrimaryTerm property is added before and after the version property.

I've extracted the duplicated logging to a common private method. Is it enough? If not, please elaborate.

@@ -167,6 +167,13 @@ private void mapEntity(XContentBuilder builder, @Nullable ElasticsearchPersisten
return;
}

if (property.isSeqNoPrimaryTermProperty()) {
logger.warn("Property {} of {} is annotated for inclusion in mapping, but its type is " + //
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

skipping the property here is correct, but the warning should only be logged, when property.isAnnotationPresent(Field.class)is true.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

that's skipping the property when it is of the correct type and it is annotated with @Field; skipping must be done as well when there is no @Field

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oops, my bad. Fixed, this time correctly, I hope :)

*/
@Override
public boolean hasSeqNo() {
return this.seqNo != null && SequenceNumbers.isAssignedSeqNo(this.seqNo);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
return this.seqNo != null && SequenceNumbers.isAssignedSeqNo(this.seqNo);
return this.seqNo != null;

I wouldn't do the validity check here but at the place where a SeqNoPrimaryTermis constructed. Here in the MapDocument these are just numeric values coming from ES.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

*/
@Override
public boolean hasPrimaryTerm() {
return this.primaryTerm != null && SequenceNumbers.isAssignedPrimaryTerm(this.primaryTerm);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
return this.primaryTerm != null && SequenceNumbers.isAssignedPrimaryTerm(this.primaryTerm);
return this.primaryTerm != null;

same as above

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

Comment on lines 221 to 230
private SeqNoPrimaryTerm fillSeqNoPrimaryTermFromDocument(Document document) {
SeqNoPrimaryTerm seqNoPrimaryTerm = new SeqNoPrimaryTerm();
if (document.hasSeqNo()) {
seqNoPrimaryTerm.setSequenceNumber(document.getSeqNo());
}
if (document.hasPrimaryTerm()) {
seqNoPrimaryTerm.setPrimaryTerm(document.getPrimaryTerm());
}
return seqNoPrimaryTerm;
}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

would not be needed, when the factory method SeqNoPrimaryTerm.of(...) is used like above

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This logic has been moved to SeqNoPrimaryTerm.ofAssigned() which returns an Optional.

* @author Roman Puchkovskiy
* @since 4.0
*/
public class SequenceNumbers {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think we need this class at all if we change SeqNoPrimaryTermto an immutable data class and put these checks in the factory method.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The class has been removed

rpuch and others added 7 commits April 28, 2020 12:42
Co-Authored-By: Peter-Josef Meisch <[email protected]>
Co-Authored-By: Peter-Josef Meisch <[email protected]>
…g seq_no + primary_term.

Move SeqNoPrimaryTerm to core.query package
…g seq_no + primary_term.

Make SeqNoPrimaryTerm an immutable value object.
Only interpret seq_no and primary_term validity at the moment when SeqNoPrimaryTerm instance is constructed.
Remove SequenceNumbers.
…g seq_no + primary_term.

Reduce duplication.
…g seq_no + primary_term.

Only log a warning if a SeqNoPrimaryTerm property is annotated with @field.
@rpuch
Copy link
Contributor Author

rpuch commented Apr 28, 2020

Thank you for the review.

I've resolved (or commented on) all the issues. Please let me know if something still needs to be changed.

Copy link
Collaborator

@sothawo sothawo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

mostly comments about the SeqNoPrimaryTerm class and instance creation.

private final long sequenceNumber;
private final long primaryTerm;

/**
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

your argument about the constructor or setter to be able to pass an entity out via REST and get it in again is valid (although I would never give out such information vi a REST API).

So we could do:

  • create one public constructor that takes the two arguments (Jackson is fine with that)
  • in the constructor, we should check that sequenceNumber >= 0 and primaryTerm >= 1. If not, throw an IllegaArgumentException. This constructor is invoked at two places:
    1. When mapping a ES response. If we have the two values there, they are assigned from Elasticsearch and this check should not fail.
    2. When the values come in from outside, this exception is fine if wrong values come in.
  • remove the factory of methods

@@ -167,6 +167,13 @@ private void mapEntity(XContentBuilder builder, @Nullable ElasticsearchPersisten
return;
}

if (property.isSeqNoPrimaryTermProperty()) {
logger.warn("Property {} of {} is annotated for inclusion in mapping, but its type is " + //
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

that's skipping the property when it is of the correct type and it is annotated with @Field; skipping must be done as well when there is no @Field

Comment on lines 201 to 206
if (targetEntity.hasSeqNoPrimaryTermProperty() && document.hasSeqNo() && document.hasPrimaryTerm()) {
SeqNoPrimaryTerm.ofAssigned(document.getSeqNo(), document.getPrimaryTerm()).ifPresent(seqNoPrimaryTerm -> {
ElasticsearchPersistentProperty property = targetEntity.getRequiredSeqNoPrimaryTermProperty();
targetEntity.getPropertyAccessor(result).setProperty(property, seqNoPrimaryTerm);
});
}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
if (targetEntity.hasSeqNoPrimaryTermProperty() && document.hasSeqNo() && document.hasPrimaryTerm()) {
SeqNoPrimaryTerm.ofAssigned(document.getSeqNo(), document.getPrimaryTerm()).ifPresent(seqNoPrimaryTerm -> {
ElasticsearchPersistentProperty property = targetEntity.getRequiredSeqNoPrimaryTermProperty();
targetEntity.getPropertyAccessor(result).setProperty(property, seqNoPrimaryTerm);
});
}
if (targetEntity.hasSeqNoPrimaryTermProperty() && document.hasSeqNo() && document.hasPrimaryTerm()) {
SeqNoPrimaryTerm seqNoPrimaryTerm = new SeqNoPrimaryTerm(document.getSeqNo(), document.getPrimaryTerm());
ElasticsearchPersistentProperty property = targetEntity.getRequiredSeqNoPrimaryTermProperty();
targetEntity.getPropertyAccessor(result).setProperty(property, seqNoPrimaryTerm);
}

if the SeqNoPrimaryTerm constructor is added as I wrote there

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Elasticsearch can actually return 'unassigned' values for seq_no and primary_term. Earlier you suggested to return them 'as is', just as numbers, without any interpretation, from Document.getSeqNo() and Document.getPrimaryTerm(), so the analysis of their 'validity' was moved to the 'optional factory' method, namely SeqNoPrimaryTerm.ofAssigned(). But if we remove that factory method and just invoke a constructor that blows up on an 'unassigned' value, it will throw an exception in such cases.

I suggest to do the following:

  1. Make the constructor public to allow Jackson use it for instance construction
  2. Still leave ofAssigned() and use it here (in the mapping code) to sort out the cases when Elasticsearch returns unassigned values
  3. Add assertions to the constructor to avoid garbage going from some kind of a REST API.

What do you think?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Where is this documented? Then I'd suggest in MappingElasticsearchConverter:202 to check the values of document.getSeqNo() and document.getPrimaryTerm() with a simple if check and set the SeqNoPrimaryTerm if these are correct. This is the only place where this check is used.

As https://www.elastic.co/guide/en/elasticsearch/reference/current/optimistic-concurrency-control.html does not talk about potentially unassigned values I think it might be confusing to introduce this into the code of SeqNoPrimaryTerm.

We still need the `IllegalArgumentException in case of the user setting some bad values.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The search API does not return seq_no and primary_term by default, you need to specifically ask it to return the real values (using seq_no_primary_term field), otherwise it returns 'unassigned' values. We send seq_no_primary_term=true when doing searches, but it seems a bit fragile to hope that unassigned values will never be returned by Elasticserch. Maybe a user will manually send a request without seq_no_primary_term and will want to use our mapping code to map, or we've omitted some API that needs this to be specifically enabled.

…g seq_no + primary_term.

Still skip a property if it is not annotated with @field
…g seq_no + primary_term.

Remove static factories from SeqNoPrimaryTerm, make its constructor public, add assertions to the constructor
@rpuch
Copy link
Contributor Author

rpuch commented Apr 29, 2020

I've removed static factories from SeqNoPrimaryTerm, made its constructor public and added assertions to the constructor.

The current batch seems to be closed. Please let me know if something is missng.

@sothawo sothawo merged commit 9b620b3 into spring-projects:master Apr 29, 2020
@sothawo
Copy link
Collaborator

sothawo commented Apr 29, 2020

Thanks a lot for this contribution!

@rpuch
Copy link
Contributor Author

rpuch commented Apr 29, 2020

Thank you for your review and suggestions :)

@sothawo
Copy link
Collaborator

sothawo commented Apr 29, 2020

you're welcome! Glad to have someone to provide such quality contributions.

@rpuch rpuch deleted the DATAES-799 branch April 29, 2020 20:18
@benweet
Copy link

benweet commented Jun 29, 2020

Nice contribution indeed.

@rpuch SeqNoPrimaryTerm doesn't seem to be updated once indexing succeeds (or am I missing something?). Could it be updated in AbstractElasticsearchTemplate.save() like populateIdIfNecessary() does?

Thanks.

@sothawo
Copy link
Collaborator

sothawo commented Jun 30, 2020

thanks for finding this @benweet. I created an issue for this

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants