## Model Description
This model was pretrained using the Unsloth framework. Only the token-embedding matrix and the lm_head were trained; all other weights were kept frozen. Approximately 15% of the Turkish Wikipedia dataset was used during training, which took about 4 hours on a single A100 GPU.
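As a rough illustration (not the authors' exact training script), freezing everything except the embeddings and the output head with the standard transformers API might look like the sketch below. The parameter-name patterns `embed_tokens` and `lm_head` follow Gemma's naming, and the base-model id is taken from the model lineage listed further down.

```python
# Illustrative sketch, not the exact training script: freeze every weight
# except the token embeddings and the lm_head before continued pretraining.
import torch
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained(
    "google/gemma-3-4b-pt",  # base checkpoint per the model lineage below
    torch_dtype=torch.bfloat16,
)

for name, param in model.named_parameters():
    # Gemma names its embedding matrix "embed_tokens"; Gemma also ties
    # lm_head to the embeddings, so matching either pattern covers both.
    param.requires_grad = ("embed_tokens" in name) or ("lm_head" in name)

trainable = sum(p.numel() for p in model.parameters() if p.requires_grad)
total = sum(p.numel() for p in model.parameters())
print(f"Trainable parameters: {trainable:,} of {total:,}")
```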
## Training Details

### Hyperparameters
- Epochs: 1
- Per-device batch size: 32
- Gradient accumulation steps: 2
- Effective batch size: 64 (32 × 2)
- Learning rate: 5e-5
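Expressed with transformers `TrainingArguments`, the settings above would look roughly like this; `output_dir` and any unlisted arguments are placeholders or library defaults, not values from the card.

```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="gemma3-4b-freezed-pretrain",  # placeholder, not the real path
    num_train_epochs=1,
    per_device_train_batch_size=32,
    gradient_accumulation_steps=2,  # effective batch size: 32 * 2 = 64
    learning_rate=5e-5,
)
```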
## Model Information
- Developed by: AhmetSemih
- License: Apache 2.0
## Model Lineage
google/gemma-3-4b-pt → google/gemma-3-4b-it → AhmetSemih/tr-gemma-128k-4b → AhmetSemih/gemma3-4b-freezed-pretrain_final
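## Usage
A minimal way to try the checkpoint, assuming a transformers version with Gemma 3 support; the prompt and generation settings are illustrative, not from the card.

```python
from transformers import pipeline

generator = pipeline(
    "text-generation",
    model="AhmetSemih/gemma3-4b-freezed-pretrain_final",
)

# Turkish prompt, since the model was pretrained on Turkish Wikipedia.
out = generator("Türkiye'nin başkenti", max_new_tokens=50)
print(out[0]["generated_text"])
```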