Deep Dive: Pagination over JPA @OneToMany Relationships

TL;DR:

When paginating over a @ManyToOne relationship: use a fetch join, @EntityGraph, or .setHint("javax.persistence.fetchgraph", entityManager.getEntityGraph("…")). Nothing else needs much attention.
When paginating over a @OneToMany relationship: use @BatchSize(size = 100 to 1000) or the hibernate.default_batch_fetch_size setting.
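
To make the two TL;DR items concrete, here is a minimal mapping sketch. It assumes Hibernate 5 with the javax.persistence API on the classpath (newer versions use the jakarta.persistence package instead). The Team and Member entities are hypothetical names invented for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import javax.persistence.Entity;
import javax.persistence.FetchType;
import javax.persistence.GeneratedValue;
import javax.persistence.Id;
import javax.persistence.ManyToOne;
import javax.persistence.OneToMany;
import org.hibernate.annotations.BatchSize;

// Hypothetical entities, for illustration only.
@Entity
class Team {
    @Id @GeneratedValue
    Long id;

    // Lazy collection. When one team's members are first touched, Hibernate may
    // load the members of up to 100 pending teams with a single IN query.
    @OneToMany(mappedBy = "team", fetch = FetchType.LAZY)
    @BatchSize(size = 100)
    List<Member> members = new ArrayList<>();
}

@Entity
class Member {
    @Id @GeneratedValue
    Long id;

    // To-one side: can be fetch-joined (or loaded via an entity graph) in a paged query.
    @ManyToOne(fetch = FetchType.LAZY)
    Team team;
}
```

Instead of annotating each collection, the same behavior can be requested globally through the hibernate.default_batch_fetch_size property (in Spring Boot, spring.jpa.properties.hibernate.default_batch_fetch_size).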

Paginating over a JPA @OneToMany relationship without care can cause an OOM. Many blog posts recommend applying @BatchSize(size = 1000) as the fix.
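
A likely cause of the OOM: when the collection is fetch-joined, each parent row is repeated once per child, so a row-level LIMIT no longer matches a page of parents. Hibernate then reads the whole result and paginates in memory (it logs warning HHH000104). The sketch below mimics that row duplication in plain Java; the class name and sample data are invented for illustration and no real database is involved.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

public class JoinPagingSketch {

    // Builds the rows of "parent JOIN child": one {parentId, childId} row per child.
    static List<long[]> joinedRows(int parents, int childrenPerParent) {
        List<long[]> rows = new ArrayList<>();
        for (long p = 1; p <= parents; p++) {
            for (long c = 1; c <= childrenPerParent; c++) {
                rows.add(new long[] {p, c});
            }
        }
        return rows;
    }

    // Applies a row-level LIMIT and returns the distinct parent ids that survive.
    static Set<Long> parentsInFirstRows(List<long[]> rows, int limit) {
        Set<Long> parentIds = new LinkedHashSet<>();
        for (long[] row : rows.subList(0, Math.min(limit, rows.size()))) {
            parentIds.add(row[0]);
        }
        return parentIds;
    }

    public static void main(String[] args) {
        List<long[]> rows = joinedRows(3, 4);              // 3 parents x 4 children
        System.out.println(rows.size());                   // prints 12
        // A "page of 2" taken at row level covers only the first parent.
        System.out.println(parentsInFirstRows(rows, 2));   // prints [1]
    }
}
```

Because a row-level limit of 2 yields one parent rather than two, the limit cannot be pushed into the SQL, which is why the full joined result ends up in memory.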

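Curiously, once @BatchSize is applied, code that would normally trigger the 1+N problem behaves as if Hibernate knew in advance what I was about to fetch: the ids I am going to look up later are loaded ahead of time in a single SQL IN clause.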

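After digging through the source code, I found that the org.hibernate.engine.spi.BatchFetchQueue class implements this bit of magic.
When an entity is loaded, the addUninitializedCollection method of org.hibernate.engine.internal.StatefulPersistenceContext is called for its lazy collection. If the batch size configured for that collection is greater than 1, the collection is added to the BatchFetchQueue.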

@Override
public void addUninitializedCollection(CollectionPersister persister, PersistentCollection collection, Serializable id) {
	final CollectionEntry ce = new CollectionEntry( collection, persister, id, flushing );
	addCollection( collection, ce, id );
	if ( persister.getBatchSize() > 1 ) {
		getBatchFetchQueue().addBatchLoadableCollection( collection, ce );
	}
}

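Later, when code iterates over a @OneToMany collection that has not been loaded yet (it is still an uninitialized lazy wrapper), org.hibernate.loader.collection.LegacyBatchingCollectionInitializer.initialize runs. At that point it also pulls from the BatchFetchQueue the keys of the collections registered when the entities were loaded earlier, and loads them together.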

public static class LegacyBatchingCollectionInitializer extends BatchingCollectionInitializer {
	private final int[] batchSizes;
	private final Loader[] loaders;

	public LegacyBatchingCollectionInitializer(
			QueryableCollection persister,
			int[] batchSizes,
			Loader[] loaders) {
		super( persister );
		this.batchSizes = batchSizes;
		this.loaders = loaders;
	}

	@Override
	public void initialize(Serializable id, SharedSessionContractImplementor session) throws HibernateException {
		// Ask the queue for the requested key plus other pending keys, up to batchSizes[0].
		Serializable[] batch = session.getPersistenceContextInternal().getBatchFetchQueue()
				.getCollectionBatch( collectionPersister(), id, batchSizes[0] );

		// Use the first batch size that the pending keys can fill completely.
		for ( int i=0; i<batchSizes.length-1; i++) {
			final int smallBatchSize = batchSizes[i];
			if ( batch[smallBatchSize-1]!=null ) {
				Serializable[] smallBatch = new Serializable[smallBatchSize];
				System.arraycopy(batch, 0, smallBatch, 0, smallBatchSize);
				loaders[i].loadCollectionBatch( session, smallBatch, collectionPersister().getKeyType() );
				return; //EARLY EXIT!
			}
		}

		// Not enough pending keys for any batch: load only the requested collection.
		loaders[batchSizes.length-1].loadCollection( session, id, collectionPersister().getKeyType() );
	}
}
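
The selection loop in initialize can be isolated as a small standalone sketch. It assumes, as the excerpt implies, that the batch array is filled from the front with trailing slots left null, and that the batch sizes are ordered largest first. The sizes {10, 5, 1} and all names below are hypothetical, not the values Hibernate actually computes.

```java
public class BatchSizeSelectionSketch {

    /**
     * Mirrors the selection loop in initialize(): returns how many keys would be
     * loaded in one query, given the candidate keys and the descending batch sizes.
     */
    static int chooseLoadSize(Object[] batch, int[] batchSizes) {
        for (int i = 0; i < batchSizes.length - 1; i++) {
            int size = batchSizes[i];
            // A non-null slot at index size-1 means at least `size` keys are pending.
            if (batch[size - 1] != null) {
                return size;
            }
        }
        return 1; // fall back to loading only the requested collection
    }

    // Helper: an array of length `capacity` whose first `pending` slots are filled.
    static Object[] batchWith(int pending, int capacity) {
        Object[] batch = new Object[capacity];
        for (int i = 0; i < pending; i++) {
            batch[i] = Long.valueOf(i + 1);
        }
        return batch;
    }

    public static void main(String[] args) {
        int[] sizes = {10, 5, 1}; // hypothetical sizes, largest first
        System.out.println(chooseLoadSize(batchWith(10, 10), sizes)); // prints 10
        System.out.println(chooseLoadSize(batchWith(7, 10), sizes));  // prints 5
        System.out.println(chooseLoadSize(batchWith(1, 10), sizes));  // prints 1
    }
}
```

With 7 pending keys only 5 are loaded in this round; in this model the remaining 2 would be picked up by a later initialization.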

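This is how Hibernate can know ids that have not even been iterated over yet. It feels similar to Django's prefetch_related, yet different: Django fetches all the related objects together when the queryset is loaded, whereas Hibernate only remembers the owning entities at load time and, when a collection is actually used, fetches the ones remembered in the queue a batch at a time.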
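
The register-then-load-lazily flow can be modeled in plain Java. This is a deliberately simplified sketch, not Hibernate's implementation: it uses a single fixed batch size, fake result rows, and a query log instead of real SQL. All names are hypothetical.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.Iterator;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

public class BatchFetchQueueSketch {

    private final int batchSize;
    private final Set<Long> pendingOwnerIds = new LinkedHashSet<>(); // uninitialized collections
    private final Map<Long, List<String>> loaded = new HashMap<>();
    final List<String> queryLog = new ArrayList<>();

    BatchFetchQueueSketch(int batchSize) {
        this.batchSize = batchSize;
    }

    // Called when an owner entity is loaded: remember its key, run no query yet.
    void registerUninitialized(long ownerId) {
        if (batchSize > 1) {
            pendingOwnerIds.add(ownerId);
        }
    }

    // Called on first access to one owner's collection.
    List<String> children(long ownerId) {
        if (!loaded.containsKey(ownerId)) {
            List<Long> batch = new ArrayList<>();
            batch.add(ownerId);
            pendingOwnerIds.remove(ownerId);
            // Top up the batch with other owners that were registered earlier.
            Iterator<Long> it = pendingOwnerIds.iterator();
            while (it.hasNext() && batch.size() < batchSize) {
                batch.add(it.next());
                it.remove();
            }
            queryLog.add("select ... where owner_id in " + batch);
            for (Long id : batch) {
                loaded.put(id, List.of("child-of-" + id)); // fake result rows
            }
        }
        return loaded.get(ownerId);
    }

    public static void main(String[] args) {
        BatchFetchQueueSketch queue = new BatchFetchQueueSketch(3);
        for (long id = 1; id <= 5; id++) {
            queue.registerUninitialized(id); // the paged parent query returned 5 owners
        }
        for (long id = 1; id <= 5; id++) {
            queue.children(id); // iterate the lazy collections
        }
        queue.queryLog.forEach(System.out::println); // two queries: ids 1-3, then 4-5
    }
}
```

With a batch size of 1 nothing is registered, so the same loop issues one query per owner, which is the familiar 1+N pattern.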