calculator_model_test

The base model and the training dataset are not recorded for this model. On the evaluation set it reaches a validation loss of 0.1169 after the final epoch (step 240 in the results table below).

Model description

More information needed

The following hyperparameters were used during training:

learning_rate: 0.001
train_batch_size: 512
eval_batch_size: 512
seed: 42
optimizer: AdamW, fused PyTorch implementation (OptimizerNames.ADAMW_TORCH_FUSED), with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
lr_scheduler_type: linear
num_epochs: 40
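
As a rough illustration of what `lr_scheduler_type: linear` implies for this run, the sketch below computes the learning rate after a given number of optimizer steps. It assumes zero warmup steps (none are listed above) and a decay to zero at the last step, and it takes 6 steps per epoch from the results table. The name `linear_lr` is hypothetical and not part of any library.

```python
BASE_LR = 0.001       # learning_rate from the list above
NUM_EPOCHS = 40       # num_epochs from the list above
STEPS_PER_EPOCH = 6   # read off the results table (step 6 at epoch 1.0)
TOTAL_STEPS = NUM_EPOCHS * STEPS_PER_EPOCH  # 240 optimizer steps in total


def linear_lr(step: int, base_lr: float = BASE_LR, total_steps: int = TOTAL_STEPS) -> float:
    """Learning rate after `step` optimizer steps under linear decay to zero.

    Assumption: no warmup phase, since the card does not list one.
    """
    if not 0 <= step <= total_steps:
        raise ValueError("step must lie in [0, total_steps]")
    return base_lr * (total_steps - step) / total_steps


if __name__ == "__main__":
    # Print the rate at the start, quarter point, midpoint, and end of training.
    for step in (0, 60, 120, 240):
        print(step, linear_lr(step))
```

Under these assumptions the rate is halved by the midpoint of training (step 120, epoch 20) and reaches zero at step 240.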

Training Loss	Epoch	Step	Validation Loss
2.9502	1.0	6	2.2802
2.0549	2.0	12	1.7738
1.5997	3.0	18	1.4024
1.2779	4.0	24	1.1706
1.0815	5.0	30	1.0180
0.9888	6.0	36	0.9056
0.8501	7.0	42	0.8061
0.8250	8.0	48	0.7222
0.7338	9.0	54	0.6823
0.6561	10.0	60	0.6417
0.6298	11.0	66	0.6212
0.6047	12.0	72	0.5623
0.5525	13.0	78	0.5455
0.5400	14.0	84	0.4992
0.5045	15.0	90	0.4876
0.4888	16.0	96	0.4948
0.4609	17.0	102	0.4694
0.4535	18.0	108	0.4192
0.4206	19.0	114	0.4057
0.4170	20.0	120	0.3800
0.3543	21.0	126	0.3535
0.3753	22.0	132	0.3321
0.3173	23.0	138	0.3216
0.3158	24.0	144	0.2960
0.3168	25.0	150	0.2865
0.2720	26.0	156	0.2633
0.2552	27.0	162	0.2425
0.2640	28.0	168	0.2167
0.2276	29.0	174	0.1935
0.2228	30.0	180	0.1803
0.2011	31.0	186	0.1693
0.1959	32.0	192	0.1507
0.1897	33.0	198	0.1418
0.1649	34.0	204	0.1337
0.1770	35.0	210	0.1313
0.1466	36.0	216	0.1262
0.1632	37.0	222	0.1231
0.1553	38.0	228	0.1192
0.1453	39.0	234	0.1173
0.1475	40.0	240	0.1169
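
To make the trend in the table easier to check, the following sketch scans the validation loss column for epochs where the loss rose relative to the previous epoch and reports the epoch with the lowest loss. The values are copied from the table above; the helper names `epochs_with_regression` and `best_epoch` are hypothetical.

```python
# Validation loss per epoch, copied from the table above (epochs 1 to 40).
VAL_LOSS = [
    2.2802, 1.7738, 1.4024, 1.1706, 1.0180, 0.9056, 0.8061, 0.7222,
    0.6823, 0.6417, 0.6212, 0.5623, 0.5455, 0.4992, 0.4876, 0.4948,
    0.4694, 0.4192, 0.4057, 0.3800, 0.3535, 0.3321, 0.3216, 0.2960,
    0.2865, 0.2633, 0.2425, 0.2167, 0.1935, 0.1803, 0.1693, 0.1507,
    0.1418, 0.1337, 0.1313, 0.1262, 0.1231, 0.1192, 0.1173, 0.1169,
]


def epochs_with_regression(losses):
    """Return the 1-based epochs whose loss is higher than the previous epoch's."""
    return [i + 1 for i in range(1, len(losses)) if losses[i] > losses[i - 1]]


def best_epoch(losses):
    """Return (1-based epoch, loss) for the lowest loss in the list."""
    idx = min(range(len(losses)), key=losses.__getitem__)
    return idx + 1, losses[idx]


if __name__ == "__main__":
    print("epochs where validation loss rose:", epochs_with_regression(VAL_LOSS))
    print("best epoch and loss:", best_epoch(VAL_LOSS))
```

On these numbers the validation loss rises only once, at epoch 16 (0.4876 to 0.4948), and the lowest value is the final one, 0.1169 at epoch 40. The improvement over the last few epochs is small, which is consistent with the learning rate approaching zero under a linear schedule.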

Safetensors

Model size: 7.8M params
Tensor type: F32
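
For a sense of scale, the parameter count and tensor type give an estimate of the size of the stored weights. The sketch below is a back-of-the-envelope calculation only: 7.8M is a rounded figure, file metadata is ignored, and the function name `weight_bytes` is hypothetical.

```python
PARAM_COUNT = 7.8e6  # "7.8M params" as reported; a rounded figure

# Bytes per element for common tensor types; only F32 applies to this model.
BYTES_PER_PARAM = {"F32": 4, "F16": 2, "BF16": 2}


def weight_bytes(param_count: float, tensor_type: str = "F32") -> float:
    """Approximate size in bytes of the raw weights for a given tensor type."""
    return param_count * BYTES_PER_PARAM[tensor_type]


if __name__ == "__main__":
    size = weight_bytes(PARAM_COUNT)
    print(f"{size / 1e6:.1f} MB ({size / 2**20:.1f} MiB)")
```

At 4 bytes per parameter this comes to roughly 31.2 MB of weights in F32.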
