I'm trying to fine-tune a model using SFTTrainer from trl.
This is what my SFTConfig arguments look like:
from trl import SFTConfig

training_arguments = SFTConfig(
    output_dir=output_dir,
    num_train_epochs=num_train_epochs,
    per_device_train_batch_size=per_device_train_batch_size,
    gradient_accumulation_steps=gradient_accumulation_steps,
    optim=optim,
    save_steps=save_steps,
    logging_steps=logging_steps,
    learning_rate=learning_rate,
    weight_decay=weight_decay,
    fp16=fp16,
    bf16=bf16,
    max_grad_norm=max_grad_norm,
    max_steps=max_steps,
    warmup_ratio=warmup_ratio,
    group_by_length=group_by_length,
    lr_scheduler_type=lr_scheduler_type,
    report_to="tensorboard",
    dataset_text_field="instruction",
    max_seq_length=None,
    packing=False,
    gradient_checkpointing=False,
)
And this is my SFTTrainer block:
trainer = SFTTrainer(
    model=model,
    train_dataset=dataset,
    peft_config=peft_config,
    tokenizer=tokenizer,
    args=training_arguments,
)
The error comes from the internal function SFTTrainer._prepare_model_for_kbit_training:

"""Prepares a quantized model for kbit training."""
prepare_model_kwargs = {
    "use_gradient_checkpointing": args.gradient_checkpointing,
    "gradient_checkpointing_kwargs": args.gradient_checkpointing_kwargs or {},
}
I tried passing gradient_checkpointing as False and gradient_checkpointing_kwargs as an empty dictionary, but no luck.
How can I avoid this error?
Comment (rehaqds): With gradient_checkpointing_kwargs={'use_reentrant':False}?
1 Answer

Using gradient_checkpointing_kwargs={'use_reentrant':False} instead of gradient_checkpointing=False might work.
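For reference, here is a minimal sketch of how that suggestion could be applied to the SFTConfig from the question. This is not a confirmed fix: only the arguments relevant to the change are shown, the output_dir value is a placeholder, and the remaining arguments from the question would be kept as they were.

from trl import SFTConfig

# Minimal sketch (assumed, not verified): keep the other arguments from the
# question unchanged; only the checkpointing-related setting differs.
training_arguments = SFTConfig(
    output_dir="./results",  # placeholder path
    dataset_text_field="instruction",
    max_seq_length=None,
    packing=False,
    report_to="tensorboard",
    # Pass the checkpointing kwargs instead of gradient_checkpointing=False:
    gradient_checkpointing_kwargs={"use_reentrant": False},
)

Setting use_reentrant=False selects PyTorch's non-reentrant checkpointing implementation, which generally behaves better than the reentrant one when most parameters are frozen, as they are with a PEFT/LoRA setup.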