calculator_model_test1

This model is a fine-tuned version of an unspecified base model on an unspecified dataset. It achieves the following results on the evaluation set:

  • Loss: 0.0812

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.001
  • train_batch_size: 512
  • eval_batch_size: 512
  • seed: 42
  • optimizer: ADAMW_TORCH_FUSED with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: linear
  • num_epochs: 40
  • mixed_precision_training: Native AMP
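The hyperparameters above, combined with the 240 total steps in the results table, pin down a few derived quantities. The sketch below is a back-of-the-envelope check, not part of the original training code: the step counts and batch size are copied from this card, and the linear-decay formula assumes the standard no-warmup "linear" schedule.

```python
# Values copied from the hyperparameter list and results table above.
learning_rate = 1e-3
train_batch_size = 512
num_epochs = 40
total_steps = 240  # final "Step" value in the results table

# 240 steps / 40 epochs = 6 optimizer steps per epoch, so the training
# split holds at most 6 * 512 = 3072 examples.
steps_per_epoch = total_steps // num_epochs
max_train_examples = steps_per_epoch * train_batch_size

def linear_lr(step, base_lr=learning_rate, total=total_steps):
    """Linear decay from base_lr to 0 with no warmup (assumed schedule)."""
    return base_lr * max(0.0, 1.0 - step / total)

print(steps_per_epoch)            # 6
print(max_train_examples)         # 3072
print(linear_lr(120))             # 0.0005 (halfway through training)
```

Note these are upper bounds: the last batch of each epoch may be smaller than 512, so the true dataset size could be slightly below 3072.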

Training results

| Training Loss | Epoch | Step | Validation Loss |
|---------------|-------|------|-----------------|
| 2.9229        | 1.0   | 6    | 2.2326          |
| 2.0049        | 2.0   | 12   | 1.7103          |
| 1.5420        | 3.0   | 18   | 1.3654          |
| 1.2160        | 4.0   | 24   | 1.0594          |
| 1.0152        | 5.0   | 30   | 0.9351          |
| 0.9045        | 6.0   | 36   | 0.8313          |
| 0.8259        | 7.0   | 42   | 0.7719          |
| 0.7393        | 8.0   | 48   | 0.6679          |
| 0.6706        | 9.0   | 54   | 0.6165          |
| 0.6159        | 10.0  | 60   | 0.5675          |
| 0.5727        | 11.0  | 66   | 0.5366          |
| 0.5439        | 12.0  | 72   | 0.4941          |
| 0.5177        | 13.0  | 78   | 0.4715          |
| 0.4960        | 14.0  | 84   | 0.4656          |
| 0.4791        | 15.0  | 90   | 0.4849          |
| 0.4674        | 16.0  | 96   | 0.4290          |
| 0.4400        | 17.0  | 102  | 0.4014          |
| 0.4173        | 18.0  | 108  | 0.3741          |
| 0.3855        | 19.0  | 114  | 0.3403          |
| 0.3612        | 20.0  | 120  | 0.3264          |
| 0.3450        | 21.0  | 126  | 0.3056          |
| 0.3191        | 22.0  | 132  | 0.2809          |
| 0.2951        | 23.0  | 138  | 0.2809          |
| 0.2872        | 24.0  | 144  | 0.2439          |
| 0.2758        | 25.0  | 150  | 0.2237          |
| 0.2452        | 26.0  | 156  | 0.2104          |
| 0.2263        | 27.0  | 162  | 0.1879          |
| 0.2083        | 28.0  | 168  | 0.1739          |
| 0.1974        | 29.0  | 174  | 0.1584          |
| 0.1776        | 30.0  | 180  | 0.1451          |
| 0.1669        | 31.0  | 186  | 0.1376          |
| 0.1565        | 32.0  | 192  | 0.1234          |
| 0.1502        | 33.0  | 198  | 0.1140          |
| 0.1403        | 34.0  | 204  | 0.1007          |
| 0.1341        | 35.0  | 210  | 0.0964          |
| 0.1284        | 36.0  | 216  | 0.0911          |
| 0.1234        | 37.0  | 222  | 0.0865          |
| 0.1154        | 38.0  | 228  | 0.0845          |
| 0.1172        | 39.0  | 234  | 0.0826          |
| 0.1149        | 40.0  | 240  | 0.0812          |
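As a quick sanity check on the trajectory above, the snippet below (with the first- and last-epoch values copied from the table; it is an illustration, not output from the training run) quantifies the overall improvement: validation loss falls almost monotonically, with only a small bump at epoch 15 and a plateau at epochs 22–23.

```python
# Validation-loss endpoints copied from the results table.
first_val_loss = 2.2326   # epoch 1
final_val_loss = 0.0812   # epoch 40

# Relative reduction over the 40 epochs: roughly 96%.
reduction = (first_val_loss - final_val_loss) / first_val_loss
print(f"validation loss reduced by {reduction:.1%}")  # ~96.4%
```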

Framework versions

  • Transformers 5.0.0
  • Pytorch 2.10.0+cpu
  • Datasets 4.0.0
  • Tokenizers 0.22.2
Model size

  • 7.8M params (Safetensors, F32)