rankalign-v6-gemma-2-2b-it-d0.15-e2-hc-b2d-dbl-all-nv1-ng1-vlo-fsx-sm0.1
Fine-tuned checkpoint of google/gemma-2-2b-it from the rankalign project.
Training Details
| Field | Value |
|---|---|
| Base model | google/gemma-2-2b-it |
| Version | v6 |
| Task | hypernym-concat-bananas-to-dogs-double-all |
| Epoch | 2 |
| Delta | 0.15 |
| Typicality correction | none |
| Length normalization | False |
| Preference loss weight | 1 |
| NLL validator weight | 1 |
| NLL generator weight | 1 |
| Validator log-odds | True |
| Force same-x | True |
| Semi-supervised ratio | 0.1 |
| Labeled-only ratio | None |
Reproducibility
Original checkpoint name: v6-google--gemma-2-2b-it-delta0.15-epoch2--hypernym-concat-bananas-to-dogs-double-all--d2g--random--alpha1.0--full-completion--nllv1.0--nllg1.0--force-same-x--vallogodds--semi0.1
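The original checkpoint name appears to encode several of the training settings from the table above. The following is a minimal sketch of recovering them with regular expressions. The helper `parse_checkpoint_name`, the dictionary keys, and the mapping of name fragments to fields (for example `nllv` to the NLL validator weight, `semi` to the semi-supervised ratio) are assumptions inferred from the table, not part of the rankalign code.

```python
import re

# Original checkpoint name, copied from the line above.
ORIGINAL_NAME = (
    "v6-google--gemma-2-2b-it-delta0.15-epoch2"
    "--hypernym-concat-bananas-to-dogs-double-all"
    "--d2g--random--alpha1.0--full-completion"
    "--nllv1.0--nllg1.0--force-same-x--vallogodds--semi0.1"
)

# Hypothetical mapping from name fragments to numeric fields (an assumption).
NUMERIC_PATTERNS = {
    "nll_validator_weight": r"--nllv([0-9.]+)",
    "nll_generator_weight": r"--nllg([0-9.]+)",
    "semi_supervised_ratio": r"--semi([0-9.]+)",
}


def parse_checkpoint_name(name):
    """Extract a few training settings from a checkpoint name (illustrative)."""
    fields = {}

    # Delta and epoch appear together as "delta<float>-epoch<int>".
    match = re.search(r"delta([0-9.]+)-epoch([0-9]+)", name)
    if match:
        fields["delta"] = float(match.group(1))
        fields["epoch"] = int(match.group(2))

    # Numeric weights and ratios follow a short prefix.
    for key, pattern in NUMERIC_PATTERNS.items():
        match = re.search(pattern, name)
        if match:
            fields[key] = float(match.group(1))

    # Boolean settings are encoded by the presence of a fragment.
    fields["force_same_x"] = "--force-same-x" in name
    fields["validator_log_odds"] = "--vallogodds" in name
    return fields


parsed = parse_checkpoint_name(ORIGINAL_NAME)

if __name__ == "__main__":
    for key, value in parsed.items():
        print(f"{key}: {value}")
```

Under these assumptions, the parsed values can be compared against the Training Details table as a quick consistency check.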
To evaluate, run the evaluation script once for each hypernym task:
python scripts/eval_by_claude.py \
--model TAUR-dev/rankalign-v6-gemma-2-2b-it-d0.15-e2-hc-b2d-dbl-all-nv1-ng1-vlo-fsx-sm0.1 \
--task hypernym-bananas \
--split_type random --gen-shots zero --disc-shots few --validator-log-odds --save-scores-csv \
--self-typicality
python scripts/eval_by_claude.py \
--model TAUR-dev/rankalign-v6-gemma-2-2b-it-d0.15-e2-hc-b2d-dbl-all-nv1-ng1-vlo-fsx-sm0.1 \
--task hypernym-bazookas \
--split_type random --gen-shots zero --disc-shots few --validator-log-odds --save-scores-csv \
--self-typicality
python scripts/eval_by_claude.py \
--model TAUR-dev/rankalign-v6-gemma-2-2b-it-d0.15-e2-hc-b2d-dbl-all-nv1-ng1-vlo-fsx-sm0.1 \
--task hypernym-cabinets \
--split_type random --gen-shots zero --disc-shots few --validator-log-odds --save-scores-csv \
--self-typicality
python scripts/eval_by_claude.py \
--model TAUR-dev/rankalign-v6-gemma-2-2b-it-d0.15-e2-hc-b2d-dbl-all-nv1-ng1-vlo-fsx-sm0.1 \
--task hypernym-cars \
--split_type random --gen-shots zero --disc-shots few --validator-log-odds --save-scores-csv \
--self-typicality
python scripts/eval_by_claude.py \
--model TAUR-dev/rankalign-v6-gemma-2-2b-it-d0.15-e2-hc-b2d-dbl-all-nv1-ng1-vlo-fsx-sm0.1 \
--task hypernym-chairs \
--split_type random --gen-shots zero --disc-shots few --validator-log-odds --save-scores-csv \
--self-typicality
python scripts/eval_by_claude.py \
--model TAUR-dev/rankalign-v6-gemma-2-2b-it-d0.15-e2-hc-b2d-dbl-all-nv1-ng1-vlo-fsx-sm0.1 \
--task hypernym-crows \
--split_type random --gen-shots zero --disc-shots few --validator-log-odds --save-scores-csv \
--self-typicality
python scripts/eval_by_claude.py \
--model TAUR-dev/rankalign-v6-gemma-2-2b-it-d0.15-e2-hc-b2d-dbl-all-nv1-ng1-vlo-fsx-sm0.1 \
--task hypernym-diapers \
--split_type random --gen-shots zero --disc-shots few --validator-log-odds --save-scores-csv \
--self-typicality
python scripts/eval_by_claude.py \
--model TAUR-dev/rankalign-v6-gemma-2-2b-it-d0.15-e2-hc-b2d-dbl-all-nv1-ng1-vlo-fsx-sm0.1 \
--task hypernym-dogs \
--split_type random --gen-shots zero --disc-shots few --validator-log-odds --save-scores-csv \
--self-typicality
python scripts/eval_by_claude.py \
--model TAUR-dev/rankalign-v6-gemma-2-2b-it-d0.15-e2-hc-b2d-dbl-all-nv1-ng1-vlo-fsx-sm0.1 \
--task hypernym-dolls \
--split_type random --gen-shots zero --disc-shots few --validator-log-odds --save-scores-csv \
--self-typicality
python scripts/eval_by_claude.py \
--model TAUR-dev/rankalign-v6-gemma-2-2b-it-d0.15-e2-hc-b2d-dbl-all-nv1-ng1-vlo-fsx-sm0.1 \
--task hypernym-ducklings \
--split_type random --gen-shots zero --disc-shots few --validator-log-odds --save-scores-csv \
--self-typicality
python scripts/eval_by_claude.py \
--model TAUR-dev/rankalign-v6-gemma-2-2b-it-d0.15-e2-hc-b2d-dbl-all-nv1-ng1-vlo-fsx-sm0.1 \
--task hypernym-elephants \
--split_type random --gen-shots zero --disc-shots few --validator-log-odds --save-scores-csv \
--self-typicality
python scripts/eval_by_claude.py \
--model TAUR-dev/rankalign-v6-gemma-2-2b-it-d0.15-e2-hc-b2d-dbl-all-nv1-ng1-vlo-fsx-sm0.1 \
--task hypernym-guns \
--split_type random --gen-shots zero --disc-shots few --validator-log-odds --save-scores-csv \
--self-typicality
python scripts/eval_by_claude.py \
--model TAUR-dev/rankalign-v6-gemma-2-2b-it-d0.15-e2-hc-b2d-dbl-all-nv1-ng1-vlo-fsx-sm0.1 \
--task hypernym-hammers \
--split_type random --gen-shots zero --disc-shots few --validator-log-odds --save-scores-csv \
--self-typicality
python scripts/eval_by_claude.py \
--model TAUR-dev/rankalign-v6-gemma-2-2b-it-d0.15-e2-hc-b2d-dbl-all-nv1-ng1-vlo-fsx-sm0.1 \
--task hypernym-helmets \
--split_type random --gen-shots zero --disc-shots few --validator-log-odds --save-scores-csv \
--self-typicality
python scripts/eval_by_claude.py \
--model TAUR-dev/rankalign-v6-gemma-2-2b-it-d0.15-e2-hc-b2d-dbl-all-nv1-ng1-vlo-fsx-sm0.1 \
--task hypernym-jackets \
--split_type random --gen-shots zero --disc-shots few --validator-log-odds --save-scores-csv \
--self-typicality
python scripts/eval_by_claude.py \
--model TAUR-dev/rankalign-v6-gemma-2-2b-it-d0.15-e2-hc-b2d-dbl-all-nv1-ng1-vlo-fsx-sm0.1 \
--task hypernym-kayaks \
--split_type random --gen-shots zero --disc-shots few --validator-log-odds --save-scores-csv \
--self-typicality
python scripts/eval_by_claude.py \
--model TAUR-dev/rankalign-v6-gemma-2-2b-it-d0.15-e2-hc-b2d-dbl-all-nv1-ng1-vlo-fsx-sm0.1 \
--task hypernym-kites \
--split_type random --gen-shots zero --disc-shots few --validator-log-odds --save-scores-csv \
--self-typicality
python scripts/eval_by_claude.py \
--model TAUR-dev/rankalign-v6-gemma-2-2b-it-d0.15-e2-hc-b2d-dbl-all-nv1-ng1-vlo-fsx-sm0.1 \
--task hypernym-mirrors \
--split_type random --gen-shots zero --disc-shots few --validator-log-odds --save-scores-csv \
--self-typicality
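The commands above differ only in the `--task` value, so they can be generated from a list. The following is a minimal sketch that builds the same 18 commands and prints them as a dry run. The names `MODEL`, `TASK_SUFFIXES`, and `build_eval_command` are hypothetical helpers introduced here; the script path and flags are copied from the commands above.

```python
import shlex

# Model repository id, copied from the --model argument above.
MODEL = (
    "TAUR-dev/rankalign-v6-gemma-2-2b-it-d0.15-e2-hc-b2d-dbl-all"
    "-nv1-ng1-vlo-fsx-sm0.1"
)

# Task suffixes, copied from the --task arguments above.
TASK_SUFFIXES = [
    "bananas", "bazookas", "cabinets", "cars", "chairs", "crows",
    "diapers", "dogs", "dolls", "ducklings", "elephants", "guns",
    "hammers", "helmets", "jackets", "kayaks", "kites", "mirrors",
]


def build_eval_command(task_suffix):
    """Return the argument list for one evaluation run."""
    return [
        "python", "scripts/eval_by_claude.py",
        "--model", MODEL,
        "--task", f"hypernym-{task_suffix}",
        "--split_type", "random",
        "--gen-shots", "zero",
        "--disc-shots", "few",
        "--validator-log-odds",
        "--save-scores-csv",
        "--self-typicality",
    ]


commands = [build_eval_command(suffix) for suffix in TASK_SUFFIXES]

if __name__ == "__main__":
    # Dry run: print each command instead of executing it.
    for command in commands:
        print(shlex.join(command))
```

To execute the runs instead of printing them, each argument list could be passed to `subprocess.run(command, check=True)` from the directory that contains `scripts/eval_by_claude.py`.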