machine learning - TrainingArguments: Do "packing" and "group_by_length" counteract each other?

In Hugging Face's TrainingArguments and SFTConfig (which inherits from TrainingArguments), there are two arguments that can be passed when initializing SFTConfig():

  • group_by_length: Whether or not to group together samples of roughly the same length in the training dataset (to minimize padding applied and be more efficient). Only useful if applying dynamic padding.
  • packing: Whether to pack multiple sequences into a fixed-length format. Uses max_length to define sequence length.
config = SFTConfig(..., 
                   group_by_length=True, 
                   packing=True, ...)

Both arguments aim to reduce the amount of padding. However, when packing=True, it seems pointless to also set group_by_length=True, because packing already produces sequences of (nearly) fixed length and leaves little padding to save. Should both be used to improve training efficiency, or do they counteract each other?
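As a rough sanity check on that intuition, here is a small pure-Python simulation. It is only a sketch and does not use TRL or reproduce its actual implementation: fully sorted batching stands in for group_by_length, a simple concatenate-and-chunk scheme stands in for packing, and the helper names (padding_fraction, make_batches, pack_lengths) as well as the sample lengths, batch size and max_length value are made up for illustration.

```python
import random

def padding_fraction(batches):
    """Share of token slots that are padding when every batch
    is padded to the length of its longest sequence."""
    slots = sum(len(batch) * max(batch) for batch in batches)
    real = sum(sum(batch) for batch in batches)
    return 1 - real / slots

def make_batches(lengths, batch_size):
    """Split a list of sequence lengths into consecutive batches."""
    return [lengths[i:i + batch_size] for i in range(0, len(lengths), batch_size)]

def pack_lengths(lengths, max_length):
    """Idealized packing: concatenate all tokens and cut the stream
    into chunks of max_length (the last chunk may be shorter)."""
    total = sum(lengths)
    full, remainder = divmod(total, max_length)
    return [max_length] * full + ([remainder] if remainder else [])

rng = random.Random(0)                       # fixed seed for repeatability
lengths = [rng.randint(16, 512) for _ in range(1000)]
batch_size, max_length = 8, 1024             # illustrative values only

# No packing: batches in arbitrary order vs. batches of similar lengths.
frac_random = padding_fraction(make_batches(lengths, batch_size))
frac_grouped = padding_fraction(make_batches(sorted(lengths), batch_size))

# Packing first, then batching with and without length grouping.
packed = pack_lengths(lengths, max_length)
frac_packed = padding_fraction(make_batches(packed, batch_size))
frac_packed_grouped = padding_fraction(make_batches(sorted(packed), batch_size))

print(f"no packing, arbitrary order: {frac_random:.3f}")
print(f"no packing, length-grouped:  {frac_grouped:.3f}")
print(f"packing, arbitrary order:    {frac_packed:.3f}")
print(f"packing, length-grouped:     {frac_packed_grouped:.3f}")
```

Under these assumptions, grouping by length cuts padding sharply when packing is off, but once the sequences are packed almost all of them already have length max_length, so grouping has next to nothing left to remove. Whether the real TRL packing strategies behave the same way is exactly what the question is asking.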
