Edit: Adding a reprex with 10 data points.
import pandas as pd

data = {
    "time": [327, 330, 333, 336, 339, 342, 345, 348, 351, 354],
    "resistance": [140000.0, 139000.0, 137000.0, 136000.0, 134000.0,
                   134000.0, 133000.0, 132000.0, 132000.0, 131000.0]
}
df = pd.DataFrame(data)
I am trying to fit an XGBRegressor to predict the future behavior of a sensor. The data is seasonal, with 6 cycles. When I fit the model, the training RMSE keeps decreasing, while the testing RMSE barely decreases and then starts increasing. If I change the learning rate to 1 and the max depth to 3, the model overfits the training data, but its prediction on the testing data is a straight line.

[Figure: Prediction of training data]

Here is the code for the model:
from xgboost import XGBRegressor

model = XGBRegressor(n_estimators=1000, early_stopping_rounds=50,
                     learning_rate=0.01, max_depth=5)
model.fit(X_train, Y_train,
          eval_set=[(X_train, Y_train), (X_test, Y_test)],
          verbose=10)

# Make predictions on the training data
Y_train_pred = model.predict(X_train)
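The snippet above relies on X_train, X_test, Y_train and Y_test, which are not shown. As a rough sketch only, here is one way such a split and the RMSE could be computed on the 10-point sample. The 80/20 chronological split, the `rmse` helper and the "repeat the last training value" baseline are assumptions for illustration, not the setup actually used, and only the time column is used for brevity.

```python
import numpy as np
import pandas as pd

# The 10-point sample from the top of the post.
df = pd.DataFrame({
    "time": [327, 330, 333, 336, 339, 342, 345, 348, 351, 354],
    "resistance": [140000.0, 139000.0, 137000.0, 136000.0, 134000.0,
                   134000.0, 133000.0, 132000.0, 132000.0, 131000.0],
})

# Assumption: chronological split, first 80% for training, the rest for testing.
split = int(len(df) * 0.8)
X_train, X_test = df[["time"]].iloc[:split], df[["time"]].iloc[split:]
Y_train, Y_test = df["resistance"].iloc[:split], df["resistance"].iloc[split:]


def rmse(y_true, y_pred):
    """Root mean squared error between two equal-length sequences."""
    diff = np.asarray(y_true, dtype=float) - np.asarray(y_pred, dtype=float)
    return float(np.sqrt(np.mean(diff ** 2)))


# Naive "straight line" baseline: repeat the last training value.
baseline_pred = np.full(len(Y_test), Y_train.iloc[-1])
baseline_rmse = rmse(Y_test, baseline_pred)

print(f"train rows: {len(X_train)}, test rows: {len(X_test)}")
print(f"baseline test RMSE: {baseline_rmse:.1f}")
```

A constant baseline like this gives a reference value for the test RMSE: a model whose test prediction is a flat line should land close to it.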
For reference: X has 3 features. The first is time (consecutive points are 3 seconds apart); the other two are:
sin_time = 0.5 * np.sin(time) * 2 * np.sin(time) * time
sin_2pi_time = np.sin(2 * np.pi * time)
Y is the resistance.
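For concreteness, here is a minimal sketch of how these three features could be built from the 10-point sample. The name `features` and the column order are assumptions. Two things stand out when the formulas are evaluated: the sin_time expression simplifies to time * sin(time)^2, and with integer time values sin(2 * pi * time) is zero up to floating-point error.

```python
import numpy as np
import pandas as pd

# The 10-point sample from the top of the post.
df = pd.DataFrame({
    "time": [327, 330, 333, 336, 339, 342, 345, 348, 351, 354],
    "resistance": [140000.0, 139000.0, 137000.0, 136000.0, 134000.0,
                   134000.0, 133000.0, 132000.0, 132000.0, 131000.0],
})

time = df["time"].to_numpy(dtype=float)

# Feature matrix with the three columns described above (order is assumed).
features = pd.DataFrame({
    "time": time,
    "sin_time": 0.5 * np.sin(time) * 2 * np.sin(time) * time,
    "sin_2pi_time": np.sin(2 * np.pi * time),
})
X = features
Y = df["resistance"]

# 0.5 * sin(t) * 2 * sin(t) * t is the same as t * sin(t)**2.
same_as_simplified = np.allclose(features["sin_time"], time * np.sin(time) ** 2)

# For integer t, sin(2*pi*t) is only floating-point noise around zero.
max_abs_sin_2pi = float(np.abs(features["sin_2pi_time"]).max())

print(features.round(6))
print("sin_time equals t*sin(t)^2:", same_as_simplified)
print("largest |sin_2pi_time|:", max_abs_sin_2pi)
```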
I tried changing the parameters, but with no luck: even if I overfit the model on the training data, the prediction is a straight line.

[Figure: Prediction of testing data]
Here is the decision tree with a max depth of 5, visualized with graphviz.
Also, here are the importances of the 3 features, as printed by feature_importances_:
[9.9349368e-01 5.7454295e-03 7.6084811e-04]
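To make these numbers easier to read, here is a small sketch that pairs each value with a feature name. The mapping assumes the columns of X are ordered time, sin_time, sin_2pi_time, which the post does not state explicitly.

```python
import numpy as np

# Assumed column order of X; not confirmed by the post.
feature_names = ["time", "sin_time", "sin_2pi_time"]

# Values copied from the feature_importances_ output above.
importances = np.array([9.9349368e-01, 5.7454295e-03, 7.6084811e-04])

# Sort from most to least important and print each as a percentage.
ranked = sorted(zip(feature_names, importances), key=lambda p: p[1], reverse=True)
for name, value in ranked:
    print(f"{name:>13}: {value:.4%}")

print("sum of importances:", importances.sum())
```

Under that assumed order, nearly all of the importance falls on the raw time column, and the two sine features together account for well under 1%.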