Clear previous stable model files before V2 publication
- README.md +0 -183
- artifact_sync.log +0 -121
- best_model.pt +0 -3
- train.log +0 -194
- training_metadata.json +0 -1
- training_summary.json +0 -1
README.md
DELETED
@@ -1,183 +0,0 @@
----
-tags:
-- bioassay
-- chemistry
-- drug-discovery
-- ranking
-- assay-conditioning
-- rdkit
-- qwen
-library_name: pytorch
----
-
-# BioAssayAlign Qwen3-Embedding-0.6B Compatibility
-
-BioAssayAlign is an assay-conditioned small-molecule ranking model.
-
-Given a bioassay description and a list of candidate SMILES strings, it returns a compatibility score and ranks compounds for that assay.
-
-This model is not a chatbot, not a generative chemistry model, and not a potency predictor. It is a retrieval/ranking model trained on a frozen public bioassay dataset derived from PubChem BioAssay and ChEMBL.
-
-## What This Model Does
-
-Input:
-- a bioassay description, preferably with metadata
-- a list of candidate SMILES strings
-
-Output:
-- one score per molecule
-- a ranked list of molecules for that assay
-
-Practical use cases:
-- rank compounds for a new or proposed assay
-- prioritize molecules before wet-lab screening
-- compare ranking behavior across assay formulations
-
-## Model Design
-
-This artifact uses:
-- assay encoder: `Qwen/Qwen3-Embedding-0.6B` (frozen)
-- molecule representation:
-  - Morgan fingerprints, radii 2 and 3, 2048 bits each
-  - chirality-aware fingerprints
-  - MACCS keys
-  - 30 RDKit descriptors
-- compatibility head:
-  - trainable projection layers and scorer MLP
-
-The final score is learned by the compatibility head. It is not just a raw dot product.
-
-## Training Data
-
-The model was trained on the frozen prepared subset:
-- dataset repo: `lighteternal/bioassayalign-frozen-v1`
-- prepared subdir: `prepared/compat_a10_full_v1`
-
-Prepared subset size:
-- assays: `11,314`
-- candidate-pool rows: `1,094,277`
-- train groups: `196,385`
-
-Split counts:
-- train assays: `9,062`
-- val assays: `1,128`
-- test assays: `1,124`
-
-Each train group contains:
-- one assay
-- one active compound
-- multiple explicit same-assay inactive compounds
-
-## Main Results
-
-Best validation checkpoint:
-- validation mean AUPRC: `0.6157`
-- best epoch: `4`
-
-Held-out test metrics:
-- test mean AUPRC: `0.6384`
-- test random-baseline AUPRC: `0.2655`
-- test hit@10: `0.9760`
-- test mean AUROC: `0.7865`
-- test mean nDCG@50: `0.7598`
-
-Post-run robustness checks:
-- canonical assay text: `0.6391` mean AUPRC
-- natural-sentence assay text: `0.6377`
-- reordered template: `0.6384`
-- dot-only ablation: `0.4411`
-
-Interpretation:
-- the model is robust to reasonable assay text reformatting
-- the learned scorer materially outperforms raw dot-product scoring
-
-## How To Use
-
-### CLI
-
-From the BioAssayAlign repo:
-
-```bash
-python -m bioassayalign.cli score-compatibility \
-  --model-dir /path/to/model_dir \
-  --assay-title "BTK kinase inhibitor binding assay" \
-  --description "In vitro kinase-domain binding assay for Bruton's tyrosine kinase" \
-  --organism "Homo sapiens" \
-  --readout "binding" \
-  --target-uniprot P06239 \
-  --smiles "CCO" \
-  --smiles "Cc1nc(Nc2ncc(C(=O)Nc3ccc(CN4CCN(C)CC4)c(C(F)(F)F)c3)cc2C#N)cc(C#Cc2cnc(N)c3ccc(Cl)cc23)n1"
-```
-
-### Python
-
-```python
-from bioassayalign.compat_inference import load_compatibility_model, rank_compounds
-from bioassayalign.compat_inference import AssayQuery, serialize_assay_query
-
-model = load_compatibility_model("/path/to/model_dir")
-assay_text = serialize_assay_query(
-    AssayQuery(
-        title="BTK kinase inhibitor binding assay",
-        description="In vitro kinase-domain binding assay for Bruton's tyrosine kinase",
-        organism="Homo sapiens",
-        readout="binding",
-        target_uniprot=["P06239"],
-    )
-)
-
-results = rank_compounds(
-    model,
-    assay_text=assay_text,
-    smiles_list=[
-        "CCO",
-        "Cc1nc(Nc2ncc(C(=O)Nc3ccc(CN4CCN(C)CC4)c(C(F)(F)F)c3)cc2C#N)cc(C#Cc2cnc(N)c3ccc(Cl)cc23)n1",
-    ],
-)
-```
-
-## Input Guidance
-
-Best practice is to provide structured assay information and let the application serialize it into a stable format.
-
-Recommended fields:
-- title
-- description
-- organism
-- readout
-- assay format
-- assay type
-- target UniProt IDs
-
-The model is fairly robust to reasonable format changes, but missing metadata can still reduce performance.
-
-SMILES should be canonicalized with RDKit when possible. Invalid SMILES should be rejected before scoring.
-
-## Limitations
-
-- The score is a ranking score, not a calibrated probability.
-- The model does not predict exact potency values such as IC50.
-- Some assays remain hard and are ranked only moderately well.
-- The current molecule side is still feature-engineered rather than fully chemistry-transformer-native.
-- Public assay data contains label noise and assay heterogeneity.
-
-## Recommended Interpretation
-
-Use this model as:
-- a compound prioritization tool
-- a ranking feature in a larger discovery workflow
-- a way to compare candidate sets for a given assay
-
-Do not use it as:
-- a standalone medicinal chemistry decision engine
-- a substitute for experimental validation
-- a potency regressor
-
-## Provenance
-
-Project repo:
-- `https://github.com/lighteternal/bioassayalign-private`
-
-Evaluation artifacts:
-- training summary is included in this model repo
-- post-run robustness suite is stored separately in the corresponding post-run artifact repo
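
Note on the metrics quoted in the deleted README: the project's own evaluation code is not part of this commit, so the following is only a minimal sketch of how per-assay ranking metrics of this kind are commonly computed. It assumes average precision as the AUPRC estimate and "at least one active in the top k" as hit@k. The function names `average_precision` and `hit_at_k` and the toy scores are hypothetical; only the three reported figures at the bottom come from the README.

```python
def average_precision(scores, labels):
    """Average precision for one assay (a common estimate of AUPRC)."""
    # Rank compounds from highest to lowest score.
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    hits = 0
    precision_sum = 0.0
    for rank, (_, is_active) in enumerate(ranked, start=1):
        if is_active:
            hits += 1
            precision_sum += hits / rank  # precision at this active's rank
    return precision_sum / hits if hits else 0.0


def hit_at_k(scores, labels, k=10):
    """True if at least one active appears among the top-k ranked compounds."""
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    return any(is_active for _, is_active in ranked[:k])


# Hypothetical toy assay: four compounds, two of them active.
toy_scores = [0.9, 0.8, 0.3, 0.1]
toy_labels = [1, 0, 1, 0]
toy_ap = average_precision(toy_scores, toy_labels)
toy_hit = hit_at_k(toy_scores, toy_labels, k=1)

# Figures quoted in the deleted README (held-out test split).
test_auprc, random_auprc, dot_only_auprc = 0.6384, 0.2655, 0.4411
lift_over_random = test_auprc / random_auprc      # ratio to the random baseline
gain_over_dot_only = test_auprc - dot_only_auprc  # absolute gain of the learned scorer
```

Read this way, the reported test AUPRC is roughly 2.4 times the random baseline, and the learned scorer adds about 0.20 AUPRC over the dot-only ablation.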
artifact_sync.log
DELETED
@@ -1,121 +0,0 @@
-[2026-03-08T09:44:47+00:00] Starting artifact sync to lighteternal/BioAssayAlign-Qwen3-Embedding-0.6B-Compat-Hybrid
-Repo created: https://huggingface.co/lighteternal/BioAssayAlign-Qwen3-Embedding-0.6B-Compat-Hybrid
-Found 2 candidate files to upload
-Running validation checks on files to upload...
-Validation checks complete.
-Starting upload...
-All files have been processed! Exiting worker.
-All files have been processed! Exiting worker.
-INFO:huggingface_hub._upload_large_folder:All files have been processed! Exiting worker.
-All files have been processed! Exiting worker.
-INFO:huggingface_hub._upload_large_folder:All files have been processed! Exiting worker.
-All files have been processed! Exiting worker.
-INFO:huggingface_hub._upload_large_folder:All files have been processed! Exiting worker.
-All files have been processed! Exiting worker.
-INFO:huggingface_hub._upload_large_folder:All files have been processed! Exiting worker.
-All files have been processed! Exiting worker.
-INFO:huggingface_hub._upload_large_folder:All files have been processed! Exiting worker.
-All files have been processed! Exiting worker.
-INFO:huggingface_hub._upload_large_folder:All files have been processed! Exiting worker.
-All files have been processed! Exiting worker.
-INFO:huggingface_hub._upload_large_folder:All files have been processed! Exiting worker.
-
----------- 2026-03-08 09:44:58 (0:00:10) ----------
-Files: hashed 2/2 (344.0/344.0) | pre-uploaded: 0/0 (0.0/344.0) | committed: 2/2 (344.0/344.0) | ignored: 0
-Workers: hashing: 0 | get upload mode: 0 | pre-uploading: 0 | committing: 0 | waiting: 0
----------------------------------------------------
-INFO:huggingface_hub._upload_large_folder:
----------- 2026-03-08 09:44:58 (0:00:10) ----------
-Files: hashed 2/2 (344.0/344.0) | pre-uploaded: 0/0 (0.0/344.0) | committed: 2/2 (344.0/344.0) | ignored: 0
-Workers: hashing: 0 | get upload mode: 0 | pre-uploading: 0 | committing: 0 | waiting: 0
----------------------------------------------------
-You are about to upload a large folder to the Hub using `hf upload-large-folder`. This is a new feature so feedback is very welcome!
-
-A few things to keep in mind:
-- Repository limits still apply: https://huggingface.co/docs/hub/repositories-recommendations
-- Do not start several processes in parallel.
-- You can interrupt and resume the process at any time. The script will pick up where it left off except for partially uploaded files that would have to be entirely reuploaded.
-- Do not upload the same folder to several repositories. If you need to do so, you must delete the `./.cache/huggingface/` folder first.
-
-Some temporary metadata will be stored under `outputs/hf-compatibility-20260308-114319/.cache/huggingface`.
-- You must not modify those files manually.
-- You must not delete the `./.cache/huggingface/` folder while a process is running.
-- You can delete the `./.cache/huggingface/` folder to reinitialize the upload state when process is not running. Files will have to be hashed and preuploaded again, except for already committed files.
-
-If the process output is too verbose, you can disable the progress bars with `--no-bars`. You can also entirely disable the status report with `--no-report`.
-
-For more details, run `hf upload-large-folder --help` or check the documentation at https://huggingface.co/docs/huggingface_hub/guides/upload#upload-a-large-folder.
-
-
-
----------- 2026-03-08 09:44:47 (0:00:00) ----------
-Files: hashed 2/2 (344.0/344.0) | pre-uploaded: 0/0 (0.0/344.0) (+2 unsure) | committed: 0/2 (0.0/344.0) | ignored: 0
-Workers: hashing: 0 | get upload mode: 2 | pre-uploading: 0 | committing: 0 | waiting: 6
----------------------------------------------------
-[2026-03-08T09:44:58+00:00] Finished artifact sync to lighteternal/BioAssayAlign-Qwen3-Embedding-0.6B-Compat-Hybrid
-[2026-03-08T10:04:58+00:00] Starting artifact sync to lighteternal/BioAssayAlign-Qwen3-Embedding-0.6B-Compat-Hybrid
-Repo created: https://huggingface.co/lighteternal/BioAssayAlign-Qwen3-Embedding-0.6B-Compat-Hybrid
-Found 4 candidate files to upload
-Running validation checks on files to upload...
-Validation checks complete.
-Starting upload...
-Ignored metadata for 'train.log' (outdated). Will re-compute hash.
-Ignored metadata for 'artifact_sync.log' (outdated). Will re-compute hash.
-All files have been processed! Exiting worker.
-All files have been processed! Exiting worker.
-INFO:huggingface_hub._upload_large_folder:All files have been processed! Exiting worker.
-All files have been processed! Exiting worker.
-INFO:huggingface_hub._upload_large_folder:All files have been processed! Exiting worker.
-All files have been processed! Exiting worker.
-INFO:huggingface_hub._upload_large_folder:All files have been processed! Exiting worker.
-All files have been processed! Exiting worker.
-INFO:huggingface_hub._upload_large_folder:All files have been processed! Exiting worker.
-All files have been processed! Exiting worker.
-INFO:huggingface_hub._upload_large_folder:All files have been processed! Exiting worker.
-All files have been processed! Exiting worker.
-INFO:huggingface_hub._upload_large_folder:All files have been processed! Exiting worker.
-All files have been processed! Exiting worker.
-INFO:huggingface_hub._upload_large_folder:All files have been processed! Exiting worker.
-
----------- 2026-03-08 10:05:08 (0:00:10) ----------
-Files: hashed 4/4 (19.4M/19.4M) | pre-uploaded: 1/1 (19.3M/19.4M) | committed: 4/4 (19.4M/19.4M) | ignored: 0
-Workers: hashing: 0 | get upload mode: 0 | pre-uploading: 0 | committing: 0 | waiting: 0
----------------------------------------------------
-INFO:huggingface_hub._upload_large_folder:
----------- 2026-03-08 10:05:08 (0:00:10) ----------
-Files: hashed 4/4 (19.4M/19.4M) | pre-uploaded: 1/1 (19.3M/19.4M) | committed: 4/4 (19.4M/19.4M) | ignored: 0
-Workers: hashing: 0 | get upload mode: 0 | pre-uploading: 0 | committing: 0 | waiting: 0
----------------------------------------------------
-You are about to upload a large folder to the Hub using `hf upload-large-folder`. This is a new feature so feedback is very welcome!
-
-A few things to keep in mind:
-- Repository limits still apply: https://huggingface.co/docs/hub/repositories-recommendations
-- Do not start several processes in parallel.
-- You can interrupt and resume the process at any time. The script will pick up where it left off except for partially uploaded files that would have to be entirely reuploaded.
-- Do not upload the same folder to several repositories. If you need to do so, you must delete the `./.cache/huggingface/` folder first.
-
-Some temporary metadata will be stored under `outputs/hf-compatibility-20260308-114319/.cache/huggingface`.
-- You must not modify those files manually.
-- You must not delete the `./.cache/huggingface/` folder while a process is running.
-- You can delete the `./.cache/huggingface/` folder to reinitialize the upload state when process is not running. Files will have to be hashed and preuploaded again, except for already committed files.
-
-If the process output is too verbose, you can disable the progress bars with `--no-bars`. You can also entirely disable the status report with `--no-report`.
-
-For more details, run `hf upload-large-folder --help` or check the documentation at https://huggingface.co/docs/huggingface_hub/guides/upload#upload-a-large-folder.
-
-
-
----------- 2026-03-08 10:04:58 (0:00:00) ----------
-Files: hashed 3/4 (29.5K/19.4M) | pre-uploaded: 0/0 (0.0/19.4M) (+4 unsure) | committed: 0/4 (0.0/19.4M) | ignored: 0
-Workers: hashing: 1 | get upload mode: 3 | pre-uploading: 0 | committing: 0 | waiting: 4
----------------------------------------------------
-[2026-03-08T10:05:08+00:00] Finished artifact sync to lighteternal/BioAssayAlign-Qwen3-Embedding-0.6B-Compat-Hybrid
-[2026-03-08T10:07:22+00:00] Starting artifact sync to lighteternal/BioAssayAlign-Qwen3-Embedding-0.6B-Compat-Hybrid
-Repo created: https://huggingface.co/lighteternal/BioAssayAlign-Qwen3-Embedding-0.6B-Compat-Hybrid
-Found 5 candidate files to upload
-Running validation checks on files to upload...
-Validation checks complete.
-Starting upload...
-Ignored metadata for 'train.log' (outdated). Will re-compute hash.
-Ignored metadata for 'artifact_sync.log' (outdated). Will re-compute hash.
-Ignored metadata for 'best_model.pt' (outdated). Will re-compute hash.
best_model.pt
DELETED
@@ -1,3 +0,0 @@
-version https://git-lfs.github.com/spec/v1
-oid sha256:4cb0cbef2264870b6e0fbbf6715973d4ca38f86d30da1031b9c7581fb1c90be8
-size 19337314
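
Note on the three deleted lines of `best_model.pt`: they are a Git LFS pointer, not the checkpoint weights themselves. The sketch below shows one way such a pointer could be read. It assumes the usual `key value` layout of LFS pointer files; `parse_lfs_pointer` is a hypothetical helper, not part of any library.

```python
def parse_lfs_pointer(text):
    """Parse the 'key value' lines of a Git LFS pointer file into a dict."""
    fields = {}
    for line in text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        key, _, value = line.partition(" ")
        fields[key] = value
    # The oid field has the form '<hash algorithm>:<hex digest>'.
    algo, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "hash_algo": algo,
        "digest": digest,
        "size_bytes": int(fields["size"]),
    }


# Pointer content copied from the deleted best_model.pt entry in this commit.
POINTER = """version https://git-lfs.github.com/spec/v1
oid sha256:4cb0cbef2264870b6e0fbbf6715973d4ca38f86d30da1031b9c7581fb1c90be8
size 19337314
"""

pointer = parse_lfs_pointer(POINTER)
size_mb = pointer["size_bytes"] / 1_000_000  # decimal megabytes
```

The recorded size, about 19.3 MB, appears consistent with the `19.3M` pre-upload figure in `artifact_sync.log`.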
train.log
DELETED
@@ -1,194 +0,0 @@
-{"assay_model_name":"Qwen/Qwen3-Embedding-0.6B","assays":11314,"event":"compatibility_train_preflight","test_assays":1124,"train_groups":196385,"val_assays":1128}
-
-
-[09:46:33] WARNING: not removing hydrogen atom without neighbors
-[09:46:34] WARNING: not removing hydrogen atom without neighbors
-[09:46:35] WARNING: not removing hydrogen atom without neighbors
-[09:46:47] WARNING: not removing hydrogen atom without neighbors
-[09:46:57] WARNING: not removing hydrogen atom without neighbors
-[09:46:59] WARNING: not removing hydrogen atom without neighbors
-[09:46:59] WARNING: not removing hydrogen atom without neighbors
-[09:47:14] WARNING: not removing hydrogen atom without neighbors
-[09:47:18] WARNING: not removing hydrogen atom without neighbors
-[09:47:44] WARNING: not removing hydrogen atom without neighbors
-[09:47:45] WARNING: not removing hydrogen atom without neighbors
-[09:47:50] WARNING: not removing hydrogen atom without neighbors
-[09:47:58] WARNING: not removing hydrogen atom without neighbors
-[09:48:30] WARNING: not removing hydrogen atom without neighbors
-[09:48:32] WARNING: not removing hydrogen atom without neighbors
-[09:48:35] WARNING: not removing hydrogen atom without neighbors
-[09:48:53] WARNING: not removing hydrogen atom without neighbors
-[09:48:54] WARNING: not removing hydrogen atom without neighbors
-[09:49:08] WARNING: not removing hydrogen atom without neighbors
-[09:49:10] WARNING: not removing hydrogen atom without neighbors
-[09:49:18] WARNING: not removing hydrogen atom without neighbors
-[09:49:32] WARNING: not removing hydrogen atom without neighbors
-[09:49:44] WARNING: not removing hydrogen atom without neighbors
-[09:49:49] WARNING: not removing hydrogen atom without neighbors
-[09:49:53] WARNING: not removing hydrogen atom without neighbors
-[09:50:04] WARNING: not removing hydrogen atom without neighbors
-[09:50:06] WARNING: not removing hydrogen atom without neighbors
-[09:50:13] WARNING: not removing hydrogen atom without neighbors
-[09:50:30] WARNING: not removing hydrogen atom without neighbors
-[09:50:38] WARNING: not removing hydrogen atom without neighbors
-[09:50:48] WARNING: not removing hydrogen atom without neighbors
-[09:50:58] WARNING: not removing hydrogen atom without neighbors
-[09:51:03] WARNING: not removing hydrogen atom without neighbors
-[09:51:49] WARNING: not removing hydrogen atom without neighbors
-[09:51:50] WARNING: not removing hydrogen atom without neighbors
-[09:52:28] WARNING: not removing hydrogen atom without neighbors
-[09:52:39] WARNING: not removing hydrogen atom without neighbors
-[09:52:40] WARNING: not removing hydrogen atom without neighbors
-[09:52:56] WARNING: not removing hydrogen atom without neighbors
-[09:53:05] WARNING: not removing hydrogen atom without neighbors
-[09:53:12] WARNING: not removing hydrogen atom without neighbors
-[09:53:38] WARNING: not removing hydrogen atom without neighbors
-[09:53:38] WARNING: not removing hydrogen atom without neighbors
-[09:54:33] WARNING: not removing hydrogen atom without neighbors
-[09:54:36] WARNING: not removing hydrogen atom without neighbors
-[09:54:38] WARNING: not removing hydrogen atom without neighbors
-[09:54:49] WARNING: not removing hydrogen atom without neighbors
-[09:54:59] WARNING: not removing hydrogen atom without neighbors
-[09:55:13] WARNING: not removing hydrogen atom without neighbors
-[09:55:20] WARNING: not removing hydrogen atom without neighbors
-[09:55:32] WARNING: not removing hydrogen atom without neighbors
-[09:55:33] WARNING: not removing hydrogen atom without neighbors
-[09:56:01] WARNING: not removing hydrogen atom without neighbors
-[09:56:23] WARNING: not removing hydrogen atom without neighbors
-[09:56:35] WARNING: not removing hydrogen atom without neighbors
-[09:56:48] WARNING: not removing hydrogen atom without neighbors
-[09:56:52] WARNING: not removing hydrogen atom without neighbors
-[09:57:02] WARNING: not removing hydrogen atom without neighbors
-[09:57:03] WARNING: not removing hydrogen atom without neighbors
-[09:57:08] WARNING: not removing hydrogen atom without neighbors
-[09:57:19] WARNING: not removing hydrogen atom without neighbors
-[09:57:23] WARNING: not removing hydrogen atom without neighbors
-[09:57:41] WARNING: not removing hydrogen atom without neighbors
-[09:58:03] WARNING: not removing hydrogen atom without neighbors
-[09:58:15] WARNING: not removing hydrogen atom without neighbors
-[09:58:15] WARNING: not removing hydrogen atom without neighbors
-[09:58:15] WARNING: not removing hydrogen atom without neighbors
-[09:58:32] WARNING: not removing hydrogen atom without neighbors
-[09:58:48] WARNING: not removing hydrogen atom without neighbors
-[09:58:49] WARNING: not removing hydrogen atom without neighbors
-[09:58:52] WARNING: not removing hydrogen atom without neighbors
-[09:59:11] WARNING: not removing hydrogen atom without neighbors
-[09:59:15] WARNING: not removing hydrogen atom without neighbors
-[09:59:15] WARNING: not removing hydrogen atom without neighbors
-[09:59:16] WARNING: not removing hydrogen atom without neighbors
-[10:00:14] WARNING: not removing hydrogen atom without neighbors
-[10:00:17] WARNING: not removing hydrogen atom without neighbors
-[10:00:36] WARNING: not removing hydrogen atom without neighbors
-[10:01:11] WARNING: not removing hydrogen atom without neighbors
-[10:01:13] WARNING: not removing hydrogen atom without neighbors
-[10:01:46] WARNING: not removing hydrogen atom without neighbors
-[10:01:46] WARNING: not removing hydrogen atom without neighbors
-[10:02:09] WARNING: not removing hydrogen atom without neighbors
-[10:02:15] WARNING: not removing hydrogen atom without neighbors
-[10:02:24] WARNING: not removing hydrogen atom without neighbors
-[10:02:31] WARNING: not removing hydrogen atom without neighbors
-[10:02:37] WARNING: not removing hydrogen atom without neighbors
-[10:02:41] WARNING: not removing hydrogen atom without neighbors
-[10:02:47] WARNING: not removing hydrogen atom without neighbors
-[10:02:49] WARNING: not removing hydrogen atom without neighbors
-[10:02:55] WARNING: not removing hydrogen atom without neighbors
-[10:03:11] WARNING: not removing hydrogen atom without neighbors
-[10:03:41] WARNING: not removing hydrogen atom without neighbors
-[10:04:04] WARNING: not removing hydrogen atom without neighbors
-[10:04:07] WARNING: not removing hydrogen atom without neighbors
-[10:04:10] WARNING: not removing hydrogen atom without neighbors
-[10:04:16] WARNING: not removing hydrogen atom without neighbors
-[10:04:23] WARNING: not removing hydrogen atom without neighbors
-{"epoch":1,"event":"compatibility_train_step","learning_rate":0.0015,"loss":2.3644846296310424,"step":50}
-{"epoch":1,"event":"compatibility_train_step","learning_rate":0.0015,"loss":2.2224977469444274,"step":100}
-{"epoch":1,"event":"compatibility_train_step","learning_rate":0.0015,"loss":2.1262132080396015,"step":150}
-{"epoch":1,"event":"compatibility_train_step","learning_rate":0.0015,"loss":2.050005639195442,"step":200}
-{"epoch":1,"event":"compatibility_train_step","learning_rate":0.0015,"loss":1.9856570315361024,"step":250}
-{"epoch":1,"event":"compatibility_train_step","learning_rate":0.0015,"loss":1.9291270116964976,"step":300}
-{"epoch":1,"event":"compatibility_train_step","learning_rate":0.0015,"loss":1.8802630840029035,"step":350}
-{"epoch":1,"event":"compatibility_eval","step":384,"train_loss":1.8510667420667055,"val_assays":1128.0,"val_hit_at_10":0.9698581560283688,"val_mean_auprc":0.5811560403542264,"val_mean_auroc":0.7526937154907752,"val_mean_ndcg50":0.7133790025640342,"val_random_auprc_baseline":0.25914365280814605}
-{"epoch":1,"event":"compatibility_checkpoint","step":384,"val_mean_auprc":0.5811560403542264}
-{"epoch":2,"event":"compatibility_train_step","learning_rate":0.0015,"loss":1.2624734193086624,"step":400}
-{"epoch":2,"event":"compatibility_train_step","learning_rate":0.0015,"loss":1.285111969167536,"step":450}
-{"epoch":2,"event":"compatibility_train_step","learning_rate":0.0015,"loss":1.2734571222601265,"step":500}
-{"epoch":2,"event":"compatibility_train_step","learning_rate":0.0015,"loss":1.2746561370700238,"step":550}
-{"epoch":2,"event":"compatibility_train_step","learning_rate":0.0015,"loss":1.2721572882599301,"step":600}
-{"epoch":2,"event":"compatibility_train_step","learning_rate":0.0015,"loss":1.2670463139849497,"step":650}
-{"epoch":2,"event":"compatibility_train_step","learning_rate":0.0015,"loss":1.2620471285868295,"step":700}
-{"epoch":2,"event":"compatibility_train_step","learning_rate":0.0015,"loss":1.251940252350979,"step":750}
-{"epoch":2,"event":"compatibility_eval","step":768,"train_loss":1.250132103412103,"val_assays":1128.0,"val_hit_at_10":0.9725177304964538,"val_mean_auprc":0.6066642714499427,"val_mean_auroc":0.7693562606437709,"val_mean_ndcg50":0.7335741265847184,"val_random_auprc_baseline":0.25914365280814605}
-{"epoch":2,"event":"compatibility_checkpoint","step":768,"val_mean_auprc":0.6066642714499427}
-{"epoch":3,"event":"compatibility_train_step","learning_rate":0.0015,"loss":0.8434912599623203,"step":800}
-{"epoch":3,"event":"compatibility_train_step","learning_rate":0.0015,"loss":0.8598075865245447,"step":850}
-{"epoch":3,"event":"compatibility_train_step","learning_rate":0.0015,"loss":0.8738713797294733,"step":900}
-{"epoch":3,"event":"compatibility_train_step","learning_rate":0.0015,"loss":0.8842672653250642,"step":950}
-{"epoch":3,"event":"compatibility_train_step","learning_rate":0.0015,"loss":0.8921179429724299,"step":1000}
-{"epoch":3,"event":"compatibility_train_step","learning_rate":0.0015,"loss":0.8977129036653126,"step":1050}
-{"epoch":3,"event":"compatibility_train_step","learning_rate":0.0015,"loss":0.9017457371375647,"step":1100}
-{"epoch":3,"event":"compatibility_train_step","learning_rate":0.0015,"loss":0.9038390150557014,"step":1150}
-{"epoch":3,"event":"compatibility_eval","step":1152,"train_loss":0.9039763145274141,"val_assays":1128.0,"val_hit_at_10":0.9725177304964538,"val_mean_auprc":0.6130965132147473,"val_mean_auroc":0.7743368827669517,"val_mean_ndcg50":0.7393465535132597,"val_random_auprc_baseline":0.25914365280814605}
-{"epoch":3,"event":"compatibility_checkpoint","step":1152,"val_mean_auprc":0.6130965132147473}
-{"epoch":4,"event":"compatibility_train_step","learning_rate":0.0015,"loss":0.5691286449631056,"step":1200}
-{"epoch":4,"event":"compatibility_train_step","learning_rate":0.0015,"loss":0.5835598956565468,"step":1250}
-{"epoch":4,"event":"compatibility_train_step","learning_rate":0.0015,"loss":0.5985121682688996,"step":1300}
-{"epoch":4,"event":"compatibility_train_step","learning_rate":0.0015,"loss":0.611610070473016,"step":1350}
-{"epoch":4,"event":"compatibility_train_step","learning_rate":0.0015,"loss":0.6229955629715996,"step":1400}
-{"epoch":4,"event":"compatibility_train_step","learning_rate":0.0015,"loss":0.6305465821251773,"step":1450}
-{"epoch":4,"event":"compatibility_train_step","learning_rate":0.0015,"loss":0.6393588328327255,"step":1500}
-{"epoch":4,"event":"compatibility_eval","step":1536,"train_loss":0.6438652696563849,"val_assays":1128.0,"val_hit_at_10":0.974290780141844,"val_mean_auprc":0.6156788594420544,"val_mean_auroc":0.7752029119692252,"val_mean_ndcg50":0.7416433895605566,"val_random_auprc_baseline":0.25914365280814605}
-{"epoch":4,"event":"compatibility_checkpoint","step":1536,"val_mean_auprc":0.6156788594420544}
-{"epoch":5,"event":"compatibility_train_step","learning_rate":0.0015,"loss":0.3648080038172858,"step":1550}
-{"epoch":5,"event":"compatibility_train_step","learning_rate":0.0015,"loss":0.3697587540373206,"step":1600}
-{"epoch":5,"event":"compatibility_train_step","learning_rate":0.0015,"loss":0.38665713683555003,"step":1650}
-{"epoch":5,"event":"compatibility_train_step","learning_rate":0.0015,"loss":0.40182235001063926,"step":1700}
-{"epoch":5,"event":"compatibility_train_step","learning_rate":0.0015,"loss":0.417484822395806,"step":1750}
-{"epoch":5,"event":"compatibility_train_step","learning_rate":0.0015,"loss":0.4296003230141871,"step":1800}
-{"epoch":5,"event":"compatibility_train_step","learning_rate":0.0015,"loss":0.43660333050284417,"step":1850}
-{"epoch":5,"event":"compatibility_train_step","learning_rate":0.0015,"loss":0.44509183509009226,"step":1900}
-{"epoch":5,"event":"compatibility_eval","step":1920,"train_loss":0.44760717629254265,"val_assays":1128.0,"val_hit_at_10":0.9698581560283688,"val_mean_auprc":0.616495170683535,"val_mean_auroc":0.774194749407381,"val_mean_ndcg50":0.7393880053786596,"val_random_auprc_baseline":0.25914365280814605}
-{"epoch":6,"event":"compatibility_train_step","learning_rate":0.0015,"loss":0.2463798110683759,"step":1950}
-{"epoch":6,"event":"compatibility_train_step","learning_rate":0.0015,"loss":0.2634849963709712,"step":2000}
-{"epoch":6,"event":"compatibility_train_step","learning_rate":0.0015,"loss":0.27135874732182574,"step":2050}
-{"epoch":6,"event":"compatibility_train_step","learning_rate":0.0015,"loss":0.2822398060725795,"step":2100}
-{"epoch":6,"event":"compatibility_train_step","learning_rate":0.0015,"loss":0.2926340575451436,"step":2150}
-{"epoch":6,"event":"compatibility_train_step","learning_rate":0.0015,"loss":0.3024993966200522,"step":2200}
-{"epoch":6,"event":"compatibility_train_step","learning_rate":0.0015,"loss":0.3087626133452762,"step":2250}
|
| 156 |
-
{"epoch":6,"event":"compatibility_train_step","learning_rate":0.0015,"loss":0.31494911136595827,"step":2300}
|
| 157 |
-
{"epoch":6,"event":"compatibility_eval","step":2304,"train_loss":0.31519552278224877,"val_assays":1128.0,"val_hit_at_10":0.9636524822695035,"val_mean_auprc":0.6162184635996248,"val_mean_auroc":0.7743091610223706,"val_mean_ndcg50":0.7404873365964237,"val_random_auprc_baseline":0.25914365280814605}
|
| 158 |
-
{"epoch":7,"event":"compatibility_train_step","learning_rate":0.0015,"loss":0.18549147313055786,"step":2350}
|
| 159 |
-
{"epoch":7,"event":"compatibility_train_step","learning_rate":0.0015,"loss":0.19420565377610424,"step":2400}
|
| 160 |
-
{"epoch":7,"event":"compatibility_train_step","learning_rate":0.0015,"loss":0.1998668096143089,"step":2450}
|
| 161 |
-
{"epoch":7,"event":"compatibility_train_step","learning_rate":0.0015,"loss":0.2092713903331635,"step":2500}
|
| 162 |
-
{"epoch":7,"event":"compatibility_train_step","learning_rate":0.0015,"loss":0.21497820314716518,"step":2550}
|
| 163 |
-
{"epoch":7,"event":"compatibility_train_step","learning_rate":0.0015,"loss":0.222306654939579,"step":2600}
|
| 164 |
-
{"epoch":7,"event":"compatibility_train_step","learning_rate":0.0015,"loss":0.22804353384010365,"step":2650}
|
| 165 |
-
{"epoch":7,"event":"compatibility_eval","step":2688,"train_loss":0.23221179631038824,"val_assays":1128.0,"val_hit_at_10":0.9636524822695035,"val_mean_auprc":0.6132282464013502,"val_mean_auroc":0.7746565082001043,"val_mean_ndcg50":0.7372136719423908,"val_random_auprc_baseline":0.25914365280814605}
|
| 166 |
-
{"epoch":8,"event":"compatibility_train_step","learning_rate":0.0015,"loss":0.15015127634008726,"step":2700}
|
| 167 |
-
{"epoch":8,"event":"compatibility_train_step","learning_rate":0.0015,"loss":0.15300160502233812,"step":2750}
|
| 168 |
-
{"epoch":8,"event":"compatibility_train_step","learning_rate":0.0015,"loss":0.15580453457576887,"step":2800}
|
| 169 |
-
{"epoch":8,"event":"compatibility_train_step","learning_rate":0.0015,"loss":0.1627114885199217,"step":2850}
|
| 170 |
-
{"epoch":8,"event":"compatibility_train_step","learning_rate":0.0015,"loss":0.16738012355734716,"step":2900}
|
| 171 |
-
{"epoch":8,"event":"compatibility_train_step","learning_rate":0.0015,"loss":0.17170712226674756,"step":2950}
|
| 172 |
-
{"epoch":8,"event":"compatibility_train_step","learning_rate":0.0015,"loss":0.17700709311816937,"step":3000}
|
| 173 |
-
{"epoch":8,"event":"compatibility_train_step","learning_rate":0.0015,"loss":0.18268438453338423,"step":3050}
|
| 174 |
-
{"epoch":8,"event":"compatibility_eval","step":3072,"train_loss":0.18441374099211813,"val_assays":1128.0,"val_hit_at_10":0.9707446808510638,"val_mean_auprc":0.6141168309407303,"val_mean_auroc":0.7739918524357319,"val_mean_ndcg50":0.7395481035833408,"val_random_auprc_baseline":0.25914365280814605}
|
| 175 |
-
{"epoch":9,"event":"compatibility_train_step","learning_rate":0.0015,"loss":0.12521490481282985,"step":3100}
|
| 176 |
-
{"epoch":9,"event":"compatibility_train_step","learning_rate":0.0015,"loss":0.13352976825374824,"step":3150}
|
| 177 |
-
{"epoch":9,"event":"compatibility_train_step","learning_rate":0.0015,"loss":0.13558898825431243,"step":3200}
|
| 178 |
-
{"epoch":9,"event":"compatibility_train_step","learning_rate":0.0015,"loss":0.13953530081035045,"step":3250}
|
| 179 |
-
{"epoch":9,"event":"compatibility_train_step","learning_rate":0.0015,"loss":0.14306921724295407,"step":3300}
|
| 180 |
-
{"epoch":9,"event":"compatibility_train_step","learning_rate":0.0015,"loss":0.14744489575675923,"step":3350}
|
| 181 |
-
{"epoch":9,"event":"compatibility_train_step","learning_rate":0.0015,"loss":0.15277234837412834,"step":3400}
|
| 182 |
-
{"epoch":9,"event":"compatibility_train_step","learning_rate":0.0015,"loss":0.1570363242829603,"step":3450}
|
| 183 |
-
{"epoch":9,"event":"compatibility_eval","step":3456,"train_loss":0.1574952612457667,"val_assays":1128.0,"val_hit_at_10":0.9698581560283688,"val_mean_auprc":0.6166779819480247,"val_mean_auroc":0.775318689795727,"val_mean_ndcg50":0.7403587250173033,"val_random_auprc_baseline":0.25914365280814605}
|
| 184 |
-
{"epoch":10,"event":"compatibility_train_step","learning_rate":0.0015,"loss":0.11384461100467226,"step":3500}
|
| 185 |
-
{"epoch":10,"event":"compatibility_train_step","learning_rate":0.0015,"loss":0.11280246621909294,"step":3550}
|
| 186 |
-
{"epoch":10,"event":"compatibility_train_step","learning_rate":0.0015,"loss":0.115515859099105,"step":3600}
|
| 187 |
-
{"epoch":10,"event":"compatibility_train_step","learning_rate":0.0015,"loss":0.1211496555659267,"step":3650}
|
| 188 |
-
{"epoch":10,"event":"compatibility_train_step","learning_rate":0.0015,"loss":0.12314411196246987,"step":3700}
|
| 189 |
-
{"epoch":10,"event":"compatibility_train_step","learning_rate":0.0015,"loss":0.1273973005488008,"step":3750}
|
| 190 |
-
{"epoch":10,"event":"compatibility_train_step","learning_rate":0.0015,"loss":0.13036615805440518,"step":3800}
|
| 191 |
-
{"epoch":10,"event":"compatibility_eval","step":3840,"train_loss":0.13350581079960464,"val_assays":1128.0,"val_hit_at_10":0.9716312056737588,"val_mean_auprc":0.6150183917880391,"val_mean_auroc":0.7742141511107639,"val_mean_ndcg50":0.7402258258282143,"val_random_auprc_baseline":0.25914365280814605}
|
| 192 |
-
{"best_epoch":4,"best_val_mean_auprc":0.6156788594420544,"epoch":10,"event":"compatibility_early_stop","step":3840}
|
| 193 |
-
{"best_epoch":4,"best_metrics":{"epoch":4,"event":"compatibility_eval","step":1536,"train_loss":0.6438652696563849,"val_assays":1128.0,"val_hit_at_10":0.974290780141844,"val_mean_auprc":0.6156788594420544,"val_mean_auroc":0.7752029119692252,"val_mean_ndcg50":0.7416433895605566,"val_random_auprc_baseline":0.25914365280814605},"best_val_mean_auprc":0.6156788594420544,"created_at":"2026-03-08T10:07:20.070829+00:00","event":"compatibility_train_complete","metadata_path":"outputs/hf-compatibility-20260308-114319/training_metadata.json","output_dir":"outputs/hf-compatibility-20260308-114319","test_metrics":{"test_assays":1124.0,"test_hit_at_10":0.9759786476868327,"test_mean_auprc":0.6384098191264667,"test_mean_auroc":0.7865290084811912,"test_mean_ndcg50":0.7598000983788723,"test_random_auprc_baseline":0.2655493166806895},"train_groups":196385}
|
| 194 |
-
{"best_epoch":4,"best_metrics":{"epoch":4,"event":"compatibility_eval","step":1536,"train_loss":0.6438652696563849,"val_assays":1128.0,"val_hit_at_10":0.974290780141844,"val_mean_auprc":0.6156788594420544,"val_mean_auroc":0.7752029119692252,"val_mean_ndcg50":0.7416433895605566,"val_random_auprc_baseline":0.25914365280814605},"best_val_mean_auprc":0.6156788594420544,"created_at":"2026-03-08T10:07:20.070829+00:00","event":"compatibility_train_complete","metadata_path":"outputs/hf-compatibility-20260308-114319/training_metadata.json","output_dir":"outputs/hf-compatibility-20260308-114319","test_metrics":{"test_assays":1124.0,"test_hit_at_10":0.9759786476868327,"test_mean_auprc":0.6384098191264667,"test_mean_auroc":0.7865290084811912,"test_mean_ndcg50":0.7598000983788723,"test_random_auprc_baseline":0.2655493166806895},"train_groups":196385}
|
training_metadata.json
DELETED
|
@@ -1 +0,0 @@
|
|
| 1 |
-
{"compat_prepared_manifest_sha256":"4914586b2e3b50b50421bda92517e75b6ef579532211bd88ee62724c23285518","config":{"assay_batch_size":64,"assay_model_name":"Qwen/Qwen3-Embedding-0.6B","assay_task_description":"Given a bioassay description and metadata, represent the assay for ranking compatible small molecules.","batch_size":512,"dropout":0.12,"early_stopping_min_delta":0.001,"early_stopping_patience":6,"fingerprint_bits":2048,"fingerprint_radii":[2,3],"hidden_dim":1024,"learning_rate":0.0015,"log_every_steps":50,"manifest_path":"data/hf_compatibility_prepared/prepared/compat_a10_full_v1/DATASET_MANIFEST.json","max_epochs":40,"max_train_groups":0,"output_dir":"outputs/hf-compatibility-20260308-114319","prepared_dir":"data/hf_compatibility_prepared/prepared/compat_a10_full_v1","projection_dim":512,"seed":3407,"use_chirality":true,"use_maccs":true,"use_rdkit_descriptors":true,"weight_decay":0.0001},"created_at":"2026-03-08T10:04:35.734104+00:00","feature_counts":{"assays":11314,"molecule_dim":4293,"molecules":415917,"train_groups":196385},"framework":"pytorch_head_only_compatibility_ranking","manifest_sha256":"e4766477b64860952258cb4b76567b83061d5de44bb5f3b322ecdfe54f19910b","molecule_feature_spec":{"descriptor_mean":[398.1134033203125,3.2746927738189697,83.00942993164062,27.86279296875,1.5429785251617432,5.28916597366333,5.361064910888672,3.433716297149658,2.5244076251983643,0.9093088507652283,0.5992806553840637,0.306065171957016,7.550338268280029,0.9157764911651611,1.0545973777770996,0.0012430364731699228,13.69067668914795,0.007576030679047108,0.6998150944709778,3.2688372135162354,3.1023762226104736,0.4586491882801056,0.010610289871692657,0.3970455527305603,0.24474835395812988,0.05192382261157036,0.0060973702929914,13.792627334594727,0.02658703550696373,0.08220870792865753],"descriptor_names":["mol_wt","logp","tpsa","heavy_atoms","hbd","hba","rot_bonds","ring_count","aromatic_rings","aliphatic_rings","saturated_rings","fraction_csp3","heteroatoms","amide_bonds","fragments","formal_charge","max_atomic_num","metal_atom_count","halogen_count","nitrogen_count","oxygen_count","sulfur_count","phosphorus_count","fluorine_count","chlorine_count","bromine_count","iodine_count","aromatic_atom_count","spiro_atoms","bridgehead_atoms"],"descriptor_std":[159.82997131347656,1.820827603340149,59.71180725097656,11.337861061096191,2.0899147987365723,2.7096076011657715,4.585837364196777,1.3724687099456787,1.160254716873169,1.0937219858169556,0.946469247341156,0.19584713876247406,4.076158046722412,1.4967330694198608,0.30233079195022583,0.10375913232564926,7.15020751953125,0.12212952226400375,1.1844125986099243,2.331219434738159,2.53597092628479,0.6712223887443542,0.12304858863353729,1.0357189178466797,0.573158860206604,0.24683673679828644,0.09013637900352478,6.031373500823975,0.1726188212633133,0.539986789226532],"fingerprint_bits":2048,"fingerprint_radii":[2,3],"use_chirality":true,"use_maccs":true,"use_rdkit_descriptors":true},"prepared_manifest":{"file_hashes":{"DATASET_MANIFEST.json":"e4766477b64860952258cb4b76567b83061d5de44bb5f3b322ecdfe54f19910b","compat_assays.parquet":"a44525a1d11f556d93bf5c407d3be32354ed27188c7fdd3dd1db9c048f663429","compat_candidate_pools.parquet":"94e0f483466ce9a3d2b1f2c652cf2806488b00d9b4a954c323297e0d4a05b7a5","compat_train_groups.parquet":"f506eb0b30402d66ae4a29565d34e03b8756e16d74a51db1a5fd7ffa6b375690"},"prepared_at":"2026-03-07T22:32:05.283930+00:00","row_counts":{"compat_assays":11314,"compat_candidate_pools":1094277,"compat_train_groups":196385},"seed":3407,"selection_report":{"avg_candidates_per_assay":96.71884391019975,"candidate_pool_rows":1094277,"dropped_after_conflicts_or_caps":1475,"eligible_after_count_thresholds":11314,"mean_train_groups_per_train_assay":1.0,"selected_assays":11314,"total_conflicting_compounds_removed":33273,"train_groups":196385},"sharding":{"merged_from":["shard-00-of-08","shard-01-of-08","shard-02-of-08","shard-03-of-08","shard-04-of-08","shard-05-of-08","shard-06-of-08","shard-07-of-08"],"num_shards":8},"source_dataset_hashes":{"assays_sha256":"4b220df37625a4b006bb5232871c956c150648fb7eac0448b17067f76a06b7b5","measurements_sha256":"1c7cb702e7d694f4c4750e139f09280419ec9d5bcd115f9d0311dfe4c2985ade"},"source_manifest_sha256":"e4766477b64860952258cb4b76567b83061d5de44bb5f3b322ecdfe54f19910b","split_counts":{"test":1124,"train":9062,"val":1128},"strategy":"compatibility_ranking_v1_sharded_merge","thresholds":{"max_actives_per_assay":32,"max_inactives_per_assay":128,"max_train_groups":0,"min_actives_per_assay":4,"min_inactives_per_assay":16,"negative_sets_per_positive":1,"negatives_per_example":7},"train_group_split_counts":{"test":0,"train":196385,"val":0}}}
|
training_summary.json
DELETED
|
@@ -1 +0,0 @@
|
|
| 1 |
-
{"best_epoch":4,"best_metrics":{"epoch":4,"event":"compatibility_eval","step":1536,"train_loss":0.6438652696563849,"val_assays":1128.0,"val_hit_at_10":0.974290780141844,"val_mean_auprc":0.6156788594420544,"val_mean_auroc":0.7752029119692252,"val_mean_ndcg50":0.7416433895605566,"val_random_auprc_baseline":0.25914365280814605},"best_val_mean_auprc":0.6156788594420544,"created_at":"2026-03-08T10:07:20.070829+00:00","event":"compatibility_train_complete","metadata_path":"outputs/hf-compatibility-20260308-114319/training_metadata.json","output_dir":"outputs/hf-compatibility-20260308-114319","test_metrics":{"test_assays":1124.0,"test_hit_at_10":0.9759786476868327,"test_mean_auprc":0.6384098191264667,"test_mean_auroc":0.7865290084811912,"test_mean_ndcg50":0.7598000983788723,"test_random_auprc_baseline":0.2655493166806895},"train_groups":196385}
|