The batch job was simple. Fetch a list of incoming records, map each one to a JPA entity, collect them in a HashSet to deduplicate by business logic, persist the batch, then check the set to trigger follow-up work on the saved entities. The code looked correct. The tests passed. In production, the follow-up work never ran. Every set.contains(entity) call returned false. Entities were in the database. The set had no idea.
This bug was not in the business logic. It was not in the save. It was in equals() and hashCode(), and it was introduced the moment someone reached for the obvious implementation.
Why HashSet Loses Your Entity After Persist
A HashSet stores objects in buckets. The bucket is chosen by calling hashCode() on the object at insertion time. When you call contains(), Java computes the bucket from the current hashCode() and looks there. If hashCode() returns a different value than it did at insertion, Java looks in the wrong bucket and finds nothing.
JPA entities with @GeneratedValue have an id field that is null before persist. The database assigns a value during flush. Before persist, hashCode(id) resolves to Objects.hashCode(null), which is 0. After persist, hashCode(id) resolves to Long.hashCode(1001L), which is something else entirely. So the entity going in had one hashCode. Coming out of saveAll(), it has a different one. Java looks in the new bucket. The entity still lives in the old bucket. contains() returns false.
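The bucket mismatch can be reproduced without a database. The sketch below is a plain-Java simulation, not Hibernate code: BatchItem is a hypothetical stand-in for an entity, and setting the id field by hand stands in for the database assigning the key on flush.

```java
import java.util.HashSet;
import java.util.Objects;
import java.util.Set;

class IdHashDemo {

    // Hypothetical stand-in for a JPA entity; no persistence provider involved.
    static class BatchItem {
        Long id; // null until "persisted"

        @Override
        public boolean equals(Object o) {
            if (this == o) return true;
            if (!(o instanceof BatchItem)) return false;
            return Objects.equals(id, ((BatchItem) o).id);
        }

        @Override
        public int hashCode() {
            return Objects.hashCode(id); // 0 while id is null
        }
    }

    // Reports whether the set still finds the item after its id is assigned.
    static boolean foundAfterIdAssigned() {
        Set<BatchItem> batch = new HashSet<>();
        BatchItem item = new BatchItem();
        batch.add(item);   // stored under the hash of a null id
        item.id = 1001L;   // simulates the generated key arriving on flush
        return batch.contains(item);
    }

    public static void main(String[] args) {
        // prints "found after id assigned: false"
        System.out.println("found after id assigned: " + foundAfterIdAssigned());
    }
}
```

The object is the very same instance that was added, yet the lookup misses, because the set recorded the old hash at insertion and never recomputes it.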
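This is not a Hibernate bug. It is a violation of what hash-based collections require of their elements. The Javadoc for Set warns that the behavior of a set is not specified if an element changes in a way that affects equals() comparisons while it is in the set, and hashCode() is bound by the same rule: the value must not change while the object is stored. Using a mutable field like id breaks that requirement.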
The Contract
Java's equals() and hashCode() contract has three requirements that matter here.
First: if a.equals(b), then a.hashCode() == b.hashCode(). No exceptions.
Second: if a.hashCode() != b.hashCode(), then !a.equals(b). The contrapositive of the first rule.
Third: hashCode() must be consistent. Multiple calls on the same object must return the same value as long as nothing used in the equals() comparison has changed.
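The third requirement is where JPA entities get into trouble. "As long as nothing used in equals has changed" sounds safe, but id is used in equals, and id changes. The letter of the contract then permits a new hashCode, while any hash-based collection already holding the entity silently breaks. Treat that escape hatch as meant for fields that are never reassigned, not for fields that get assigned by an external system mid-lifecycle.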
What You Probably Wrote
IDE-generated equals() and hashCode() based on id. Or Lombok's @EqualsAndHashCode with onlyExplicitlyIncluded = true and @EqualsAndHashCode.Include on the id field. Both look reasonable. Both produce the same broken behavior for any entity that goes into a hash-based collection before being saved.
The version with no configuration, just plain @EqualsAndHashCode, is worse. Lombok includes every non-static, non-transient field. If any of those fields change, the hashCode changes. If the entity has lazy-loaded associations, things get worse in a different way.
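To make the all-fields case concrete, here is a hand-written approximation of equality over every field. It is a sketch, not Lombok's actual generated code; Shipment and its status field are hypothetical names chosen for illustration.

```java
import java.util.HashSet;
import java.util.Objects;
import java.util.Set;

class AllFieldsHashDemo {

    // Hypothetical plain class whose equals/hashCode cover every field,
    // roughly the set of fields an unconfigured @EqualsAndHashCode would use.
    static class Shipment {
        Long id;
        String status;

        Shipment(Long id, String status) {
            this.id = id;
            this.status = status;
        }

        @Override
        public boolean equals(Object o) {
            if (this == o) return true;
            if (!(o instanceof Shipment)) return false;
            Shipment other = (Shipment) o;
            return Objects.equals(id, other.id) && Objects.equals(status, other.status);
        }

        @Override
        public int hashCode() {
            return Objects.hash(id, status);
        }
    }

    // Reports whether the set still finds the shipment after an ordinary field update.
    static boolean foundAfterStatusChange() {
        Set<Shipment> pending = new HashSet<>();
        Shipment shipment = new Shipment(1L, "NEW");
        pending.add(shipment);
        shipment.status = "SHIPPED"; // a routine business update, id untouched
        return pending.contains(shipment);
    }
}
```

Here the id never changes, and the entry is lost anyway: any setter call on a field that feeds hashCode() is enough.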
The Lombok Trap
Adding @EqualsAndHashCode to a JPA entity class tells Lombok to generate methods based on every field it can see. That includes @ManyToOne and @OneToMany associations.

    @Entity
    @EqualsAndHashCode // no configuration: every field below takes part in equality
    public class Order {

        @Id
        @GeneratedValue
        private Long id;

        // Lazy association: reading it may trigger a query or fail outside a session.
        @ManyToOne(fetch = FetchType.LAZY)
        private Customer customer;
    }

When anything calls equals() on an Order (during a Set lookup, during Hibernate dirty checking, anywhere), Lombok's generated method accesses every field, including customer. Accessing a lazy association outside of an active Hibernate session throws LazyInitializationException. Inside a session, it fires a database query. Neither is what you wanted from an equality check.
There is a second problem. Two Order objects representing the same database row can return different hashCode() values depending on what Hibernate has loaded into memory. The first object has customer initialized. The second has a proxy. Their field states differ. Their hashCodes differ. They are not equal according to Lombok, even though they represent the same row.
The annotation to reach for on a JPA entity is not @EqualsAndHashCode. It is nothing. Write the methods explicitly, with full control over which fields they touch.
The Fix: UUID in the Constructor
Assign a UUID when the entity is constructed, before it ever reaches Hibernate. Base equals() and hashCode() on that UUID only. The UUID is stable from construction through persist through any number of subsequent loads from the database.

    @Entity
    public class Order {

        // Database primary key and foreign-key target; null until persist.
        @Id
        @GeneratedValue
        private Long id;

        // Assigned at construction, so it is stable for the object's whole life.
        @Column(nullable = false, unique = true, updatable = false)
        private UUID uuid = UUID.randomUUID();

        public UUID getUuid() {
            return uuid;
        }

        @Override
        public boolean equals(Object o) {
            if (this == o) return true;
            if (!(o instanceof Order)) return false;
            // Go through the getter: on a Hibernate proxy the field itself may be unset.
            return uuid.equals(((Order) o).getUuid());
        }

        @Override
        public int hashCode() {
            return uuid.hashCode();
        }
    }

The database id stays as the primary key and the target of foreign keys. It is not replaced. It just does not participate in equality. To enforce identity at the database level, the UUID column needs a unique constraint. The unique = true attribute adds it when Hibernate generates the schema; if you manage the schema with hand-written migrations, add the constraint there. Setting updatable = false prevents anyone from overwriting the value through Hibernate.
Create an entity in memory, add it to a HashSet, persist it, look it up with contains(): it will be found. UUID did not change. Neither did the hashCode. Same bucket throughout.
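That walkthrough can be checked in plain Java. This is a simulation under the same assumptions as before: OrderLike is a hypothetical stand-in for the Order entity without the JPA annotations, and assigning id by hand stands in for persist.

```java
import java.util.HashSet;
import java.util.Set;
import java.util.UUID;

class UuidHashDemo {

    // Hypothetical stand-in for the Order entity, without JPA annotations.
    static class OrderLike {
        Long id;                                      // null until "persisted"
        private final UUID uuid = UUID.randomUUID();  // fixed at construction

        @Override
        public boolean equals(Object o) {
            if (this == o) return true;
            if (!(o instanceof OrderLike)) return false;
            return uuid.equals(((OrderLike) o).uuid);
        }

        @Override
        public int hashCode() {
            return uuid.hashCode();
        }
    }

    // Reports whether the set still finds the order after its id is assigned.
    static boolean foundAfterIdAssigned() {
        Set<OrderLike> batch = new HashSet<>();
        OrderLike order = new OrderLike();
        batch.add(order);
        order.id = 1001L; // simulates the generated key arriving on flush
        return batch.contains(order);
    }

    public static void main(String[] args) {
        // prints "found after id assigned: true"
        System.out.println("found after id assigned: " + foundAfterIdAssigned());
    }
}
```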
What to Check Right Now
Grep your codebase for @EqualsAndHashCode on classes that also have @Entity. Every match is a problem. Remove the Lombok annotation and replace it with an explicit implementation based on a stable field.
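If a one-line grep is awkward in your setup, the same search can be sketched in Java. This is a naive text match and only a starting point: the class name, the method names, and the default src/main/java root are assumptions, and it can over-report (for example when the annotation names appear only in comments).

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.regex.Pattern;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class EntityLombokScan {

    // \b keeps @Entity from matching longer names such as @EntityListeners.
    private static final Pattern ENTITY = Pattern.compile("@Entity\\b");
    private static final Pattern LOMBOK = Pattern.compile("@EqualsAndHashCode\\b");

    // Lists .java files under sourceRoot that mention both annotations.
    static List<Path> findSuspects(Path sourceRoot) throws IOException {
        try (Stream<Path> files = Files.walk(sourceRoot)) {
            return files
                .filter(p -> p.toString().endsWith(".java"))
                .filter(EntityLombokScan::mentionsBoth)
                .sorted()
                .collect(Collectors.toList());
        }
    }

    private static boolean mentionsBoth(Path file) {
        try {
            String text = Files.readString(file);
            return ENTITY.matcher(text).find() && LOMBOK.matcher(text).find();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) throws IOException {
        // Assumed default source root; pass another path as the first argument.
        Path root = Path.of(args.length > 0 ? args[0] : "src/main/java");
        findSuspects(root).forEach(System.out::println);
    }
}
```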
Then find @Entity classes used as keys in a Map or stored in a Set. If equals() and hashCode() on those classes touch id or any field that changes after construction, the collection is unreliable.
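One way to turn that check into something a test suite can run is a small helper that puts an object into a HashSet, applies a mutation, and reports whether the object is still found. The helper and the two sample classes below are hypothetical, a sketch rather than a library API.

```java
import java.util.HashSet;
import java.util.Objects;
import java.util.Set;
import java.util.function.Consumer;

class HashStabilityCheck {

    // True when the mutation leaves hashCode() unchanged and a HashSet
    // that already held the entity still finds it afterwards.
    static <T> boolean survivesMutation(T entity, Consumer<T> mutation) {
        Set<T> holder = new HashSet<>();
        holder.add(entity);
        int before = entity.hashCode();
        mutation.accept(entity);
        return before == entity.hashCode() && holder.contains(entity);
    }

    // Hypothetical class with the problematic id-based equality.
    static class IdKeyed {
        Long id;

        @Override
        public boolean equals(Object o) {
            return o instanceof IdKeyed && Objects.equals(id, ((IdKeyed) o).id);
        }

        @Override
        public int hashCode() {
            return Objects.hashCode(id);
        }
    }

    // Hypothetical class keyed on a value fixed at construction.
    static class TokenKeyed {
        Long id;
        final String token;

        TokenKeyed(String token) {
            this.token = token;
        }

        @Override
        public boolean equals(Object o) {
            return o instanceof TokenKeyed && token.equals(((TokenKeyed) o).token);
        }

        @Override
        public int hashCode() {
            return token.hashCode();
        }
    }
}
```

Called as survivesMutation(new IdKeyed(), e -> e.id = 7L), the helper returns false; the same mutation on a TokenKeyed instance returns true. In a real project the mutation would be whatever assigns the generated id, which is an assumption about how your tests are set up.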
The failure tends not to show up in unit tests because unit tests usually persist once, check once, and never put an entity in a collection before saving. It shows up in production batch jobs, multi-step workflows, and anywhere that collects entities before the transaction commits. By the time the bug report arrives, the code has been in production for months and the test suite says everything is fine.