calculator_model_test1

This model is a fine-tuned version of an unspecified base model on an unspecified dataset. It achieves the following results on the evaluation set:

  • Loss: 0.0812

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.001
  • train_batch_size: 512
  • eval_batch_size: 512
  • seed: 42
  • optimizer: ADAMW_TORCH_FUSED with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: linear
  • num_epochs: 40
  • mixed_precision_training: Native AMP
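The hyperparameters above, combined with the 240 total steps in the results table, pin down a few derived quantities. The sketch below is a back-of-the-envelope check, not part of the original training code: the step counts and batch size are copied from this card, and the linear-decay formula assumes the standard no-warmup "linear" schedule.

```python
# Values copied from the hyperparameter list and results table above.
learning_rate = 1e-3
train_batch_size = 512
num_epochs = 40
total_steps = 240  # final "Step" value in the results table

# 240 steps / 40 epochs = 6 optimizer steps per epoch, so the training
# split holds at most 6 * 512 = 3072 examples.
steps_per_epoch = total_steps // num_epochs
max_train_examples = steps_per_epoch * train_batch_size

def linear_lr(step, base_lr=learning_rate, total=total_steps):
    """Linear decay from base_lr to 0 with no warmup (assumed schedule)."""
    return base_lr * max(0.0, 1.0 - step / total)

print(steps_per_epoch)            # 6
print(max_train_examples)         # 3072
print(linear_lr(120))             # 0.0005 (halfway through training)
```

Note these are upper bounds: the last batch of each epoch may be smaller than 512, so the true dataset size could be slightly below 3072.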

Training results

| Training Loss | Epoch | Step | Validation Loss |
|---------------|-------|------|-----------------|
| 2.9229        | 1.0   | 6    | 2.2326          |
| 2.0049        | 2.0   | 12   | 1.7103          |
| 1.5420        | 3.0   | 18   | 1.3654          |
| 1.2160        | 4.0   | 24   | 1.0594          |
| 1.0152        | 5.0   | 30   | 0.9351          |
| 0.9045        | 6.0   | 36   | 0.8313          |
| 0.8259        | 7.0   | 42   | 0.7719          |
| 0.7393        | 8.0   | 48   | 0.6679          |
| 0.6706        | 9.0   | 54   | 0.6165          |
| 0.6159        | 10.0  | 60   | 0.5675          |
| 0.5727        | 11.0  | 66   | 0.5366          |
| 0.5439        | 12.0  | 72   | 0.4941          |
| 0.5177        | 13.0  | 78   | 0.4715          |
| 0.4960        | 14.0  | 84   | 0.4656          |
| 0.4791        | 15.0  | 90   | 0.4849          |
| 0.4674        | 16.0  | 96   | 0.4290          |
| 0.4400        | 17.0  | 102  | 0.4014          |
| 0.4173        | 18.0  | 108  | 0.3741          |
| 0.3855        | 19.0  | 114  | 0.3403          |
| 0.3612        | 20.0  | 120  | 0.3264          |
| 0.3450        | 21.0  | 126  | 0.3056          |
| 0.3191        | 22.0  | 132  | 0.2809          |
| 0.2951        | 23.0  | 138  | 0.2809          |
| 0.2872        | 24.0  | 144  | 0.2439          |
| 0.2758        | 25.0  | 150  | 0.2237          |
| 0.2452        | 26.0  | 156  | 0.2104          |
| 0.2263        | 27.0  | 162  | 0.1879          |
| 0.2083        | 28.0  | 168  | 0.1739          |
| 0.1974        | 29.0  | 174  | 0.1584          |
| 0.1776        | 30.0  | 180  | 0.1451          |
| 0.1669        | 31.0  | 186  | 0.1376          |
| 0.1565        | 32.0  | 192  | 0.1234          |
| 0.1502        | 33.0  | 198  | 0.1140          |
| 0.1403        | 34.0  | 204  | 0.1007          |
| 0.1341        | 35.0  | 210  | 0.0964          |
| 0.1284        | 36.0  | 216  | 0.0911          |
| 0.1234        | 37.0  | 222  | 0.0865          |
| 0.1154        | 38.0  | 228  | 0.0845          |
| 0.1172        | 39.0  | 234  | 0.0826          |
| 0.1149        | 40.0  | 240  | 0.0812          |
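As a quick sanity check on the trajectory above, the snippet below (with the first- and last-epoch values copied from the table; it is an illustration, not output from the training run) quantifies the overall improvement: validation loss falls almost monotonically, with only a small bump at epoch 15 and a plateau at epochs 22–23.

```python
# Validation-loss endpoints copied from the results table.
first_val_loss = 2.2326   # epoch 1
final_val_loss = 0.0812   # epoch 40

# Relative reduction over the 40 epochs: roughly 96%.
reduction = (first_val_loss - final_val_loss) / first_val_loss
print(f"validation loss reduced by {reduction:.1%}")  # ~96.4%
```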

Framework versions

  • Transformers 5.0.0
  • Pytorch 2.10.0+cpu
  • Datasets 4.0.0
  • Tokenizers 0.22.2
Model size

  • 7.8M params (Safetensors, F32)