spring - How to ensure batch remove and save with Hibernate - Stack Overflow

I am currently trying to implement a scheduled task that takes data inserted by a third party and processes it into my database. I use JPA repositories everywhere. First I take the external data and divide it into batches. Then I iterate over the batches, transform the data, delete the existing data by a foreign key, and save the new data in its place.
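The batching step described above can be sketched in plain Java. This is only an illustration: `BatchPartitionSketch` and its `partition` helper are hypothetical names, and the service below uses `ListUtils.partition` instead.

```java
import java.util.ArrayList;
import java.util.List;

final class BatchPartitionSketch {

    // Splits items into consecutive chunks of at most batchSize elements.
    // A list shorter than batchSize yields one chunk; an empty list yields no chunks.
    static <T> List<List<T>> partition(List<T> items, int batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        List<List<T>> batches = new ArrayList<>();
        for (int from = 0; from < items.size(); from += batchSize) {
            int to = Math.min(from + batchSize, items.size());
            batches.add(new ArrayList<>(items.subList(from, to)));
        }
        return batches;
    }

    public static void main(String[] args) {
        List<Long> ids = List.of(1L, 2L, 3L, 4L, 5L);
        System.out.println(partition(ids, 2));  // [[1, 2], [3, 4], [5]]
        System.out.println(partition(ids, 50)); // [[1, 2, 3, 4, 5]]
    }
}
```

With a helper like this, a list smaller than the batch size comes back as a single batch, so no separate size check is needed before partitioning.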

The problem is that after processing a batch of data (for example, around 2,000 records), some rows are missing from the result tables even though all external records are marked as migrated.

I use a Microsoft SQL Server database, with hibernate.order_inserts and hibernate.order_updates both set to true.

Service:

@Slf4j
@Component
@AllArgsConstructor
public class GenerateInternalEntityJobScheduler implements JobScheduler {

    private final ExternalDataJpaRepo externalDataJpaRepo;
    private final InternalRepoOne internalRepoOne;
    private final InternalService internalService;
    private final InternalRepoTwo internalRepoTwo;

    private static final int BATCH_SIZE = 50;

    @Override
    @Transactional
    public void run() {
        List<Long> idsMatchingCriteria = externalDataJpaRepo.findAllIdsByMigratedIsFalseAndContentIsNotNull();
        List<List<Long>> idsBatches;

        if (idsMatchingCriteria.size() < BATCH_SIZE) {
            idsBatches = List.of(idsMatchingCriteria);
        } else {
            idsBatches = ListUtils.partition(idsMatchingCriteria, BATCH_SIZE);
        }
        for (List<Long> list : idsBatches) {
            List<ExternalEntity> externalEntitiesToProcess =
                externalDataJpaRepo.findAllById(list);
            List<Long> savedIds = persistBatchOfTransformedEntities(externalEntitiesToProcess);
            // Mark as migrated only the external rows whose foreign key was saved
            List<ExternalEntity> migrated = externalEntitiesToProcess
                .stream()
                .filter(cj -> savedIds.contains(cj.getForeignKey()))
                .toList();
            migrated.forEach(cj -> cj.setMigrated(true));
            externalDataJpaRepo.saveAll(migrated);
        }
    }

    List<Long> persistBatchOfTransformedEntities(List<ExternalEntity> externalEntities) {

        //Some transformations on content field - simple string manipulation

        List<InternalEntityOne> internalEntityOneToUpdate = internalRepoOne.getAllByForeignKeyIn(foreignIdsList);
        List<InternalEntityTwo> internalEntityTwoToDelete = internalRepoTwo.getAllByForeignKeyIn(foreignIdsList);

        List<InternalEntityOne> savedInternalEntitiesOne = persistInternalEntitiesOne(internalEntityOneToUpdate, transformedInternalEntitiesOne);

        internalRepoTwo.deleteAll(internalEntityTwoToDelete);
        internalRepoTwo.flush();

        internalRepoTwo.saveAllAndFlush(transformedInternalEntitiesTwo);

        return savedInternalEntitiesOne.stream().map(InternalEntityOne::getForeignKey).toList();
    }

    private List<InternalEntityOne> persistInternalEntitiesOne(List<InternalEntityOne> reasonsToUpdate,
                                                               List<InternalEntityOne> transformedReasons) {
        // build foreign key - object maps to facilitate access
        reasonsToUpdate.forEach(toUpdate -> {
            var sourceForUpdate = mapOfTransformedEntities.get(toUpdate.getForeignKey());
            toUpdate.setContent(sourceForUpdate.getContent());
            toUpdate.setCreationTime(LocalDateTime.now());
        });
        List<InternalEntityOne> newExternalEntities = transformedReasons.stream()
            .filter(entity -> !mapOfEntitiesToUpdate.containsKey(entity.getForeignKey()))
            .toList();
        return internalRepoOne.saveAllAndFlush(Stream.concat(reasonsToUpdate.stream(), newExternalEntities.stream()).toList());
    }
}
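The marking step in run() matches external rows to saved rows by foreign key only. The plain-Java sketch below isolates that filter under two assumptions that the code above does not confirm: that several external rows in one batch can share a foreign key, and that the foreign key map keeps only the last row per key. `MigratedFlagSketch` and `ExternalRow` are hypothetical stand-ins, not the real classes.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

final class MigratedFlagSketch {

    // Hypothetical stand-in for ExternalEntity: only the fields the filter needs.
    record ExternalRow(long id, long foreignId, String content) {}

    // Assumed shape of the "foreign key -> object" map:
    // a later row with the same foreign key replaces an earlier one.
    static Map<Long, ExternalRow> indexByForeignId(List<ExternalRow> rows) {
        Map<Long, ExternalRow> byForeignId = new LinkedHashMap<>();
        for (ExternalRow row : rows) {
            byForeignId.put(row.foreignId(), row);
        }
        return byForeignId;
    }

    // Same logic as the filter in run(): every row whose foreign key
    // appears among the saved keys is treated as migrated.
    static List<Long> idsMarkedMigrated(List<ExternalRow> rows, Set<Long> savedForeignIds) {
        return rows.stream()
                .filter(row -> savedForeignIds.contains(row.foreignId()))
                .map(ExternalRow::id)
                .toList();
    }

    public static void main(String[] args) {
        List<ExternalRow> batch = List.of(
                new ExternalRow(1L, 100L, "a"),
                new ExternalRow(2L, 100L, "b"),
                new ExternalRow(3L, 200L, "c"));
        Map<Long, ExternalRow> saved = indexByForeignId(batch);
        System.out.println(saved.size());                              // 2
        System.out.println(idsMarkedMigrated(batch, saved.keySet())); // [1, 2, 3]
    }
}
```

Under those assumptions, three external rows are marked as migrated while only two entries survive in the map. Whether the real data contains duplicate foreign keys would have to be checked against the external table.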

Here are my entities:

@Entity
@Data
@Table(name = "external_table")
@AllArgsConstructor
@NoArgsConstructor
public class ExternalEntity {

    @Id
    @Column(name = "id")
    private Long id;

    @Column(name = "foreign_id")
    private Long foreignId;

    @Column(name = "content")
    private String content;

    @Column(name = "migrated")
    private Boolean migrated;

    //... other fields

}

@Entity
@Getter
@Setter
@NoArgsConstructor
@Table(name = "uzasadnienie_klasyfikacja_final")
public class ClassifiedReasonToLegalis {

    @Id
    @GeneratedValue(strategy = GenerationType.IDENTITY)
    @Column(name = "id")
    private Long id;

    @Column(name = "foreign_id")
    private Long foreignId;

    @Column(name = "content")
    private String content;

}


@Entity
@Getter
@Setter
@Table(name = "internal_table_one")
public class InternalEntityOne {

    @Id
    @GeneratedValue(strategy = GenerationType.IDENTITY)
    @Column(name = "id")
    private Long id;

    @Column(name = "order_id")
    private Integer orderId;

    @Column(name = "foreign_id")
    private Long foreignId;

    @Column(name = "content")
    private String content;
}

I tried moving each step into a separate transaction by annotating the main service with @Transactional(propagation = Propagation.NOT_SUPPORTED) and running each block of operations (delete + flush + save), or each single request, in a new transaction. For that I wrote a utility class:

@Component
public class TransactionUtil {

    @Transactional
    public <T> T withTransaction(Supplier<T> supplier) {
        return supplier.get();
    }

    @Transactional
    public void withTransaction(Runnable runnable) {
        runnable.run();
    }

    @Transactional(propagation = Propagation.REQUIRES_NEW)
    public <T> T withNewTransaction(Supplier<T> supplier) {
        return supplier.get();
    }

    @Transactional(propagation = Propagation.REQUIRES_NEW)
    public void withNewTransaction(Runnable runnable) {
        runnable.run();
    }

}

I also tried persisting the data in one request by preparing all the transformed data first and persisting it at the end. Nothing helped.

Below is my HibernateConfig:

@Configuration
@EnableTransactionManagement
@EnableJpaRepositories(basePackages = "my.package")
public class HibernateConfig {

    @Bean
    public PlatformTransactionManager transactionManager(EntityManagerFactory factory) {
        return new JpaTransactionManager(factory);
    }

    @Bean
    public LocalContainerEntityManagerFactoryBean entityManagerFactory(DataSource dataSource) {

        LocalContainerEntityManagerFactoryBean factory = new LocalContainerEntityManagerFactoryBean();

        factory.setDataSource(dataSource);
        factory.setJpaVendorAdapter(jpaVendorAdapter());
        factory.setPackagesToScan("my.package");
        factory.setJpaProperties(additionalProperties());

        return factory;
    }

    private JpaVendorAdapter jpaVendorAdapter() {
        HibernateJpaVendorAdapter vendorAdapter = new HibernateJpaVendorAdapter();
        vendorAdapter.setGenerateDdl(Boolean.FALSE);
        vendorAdapter.setShowSql(Boolean.FALSE);
        return vendorAdapter;
    }

    @Bean
    public HibernateExceptionTranslator hibernateExceptionTranslator() {
        return new HibernateExceptionTranslator();
    }


    private Properties additionalProperties() {
        Properties properties = new Properties();
        properties.setProperty("hibernate.order_inserts", "true");
        properties.setProperty("hibernate.order_updates", "true");
        return properties;
    }
}
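For reference, hibernate.order_inserts and hibernate.order_updates only control the order in which statements are issued. JDBC batching itself is governed by hibernate.jdbc.batch_size, which the configuration above does not set. The sketch below shows where that property would go; the class name `BatchingPropertiesSketch` is hypothetical, and the value 50 is an assumption chosen to match BATCH_SIZE in the service, not a tested setting.

```java
import java.util.Properties;

final class BatchingPropertiesSketch {

    // Same two properties as in HibernateConfig, plus a JDBC batch size.
    static Properties additionalProperties() {
        Properties properties = new Properties();
        properties.setProperty("hibernate.order_inserts", "true");
        properties.setProperty("hibernate.order_updates", "true");
        // Assumption: 50 mirrors BATCH_SIZE; without this property
        // Hibernate does not group statements into JDBC batches.
        properties.setProperty("hibernate.jdbc.batch_size", "50");
        return properties;
    }

    public static void main(String[] args) {
        System.out.println(additionalProperties().getProperty("hibernate.jdbc.batch_size")); // 50
    }
}
```

Note that Hibernate does not batch inserts for entities whose IDs use GenerationType.IDENTITY, as the internal entities above do, so this setting would mainly affect updates and deletes there.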
