BERT-updated
Standard BERT architecture with flash_attention_2 and sdpa support added.
This is a shared code repository — it contains no pretrained weights. It is used as the code backend for biological sequence models that share the vanilla BERT architecture (post-LN transformer, learned absolute position embeddings) but have model-specific vocabularies and hyperparameters:
Each of those repos stores weights, tokenizer, and config; their auto_map in
config.json points here for the modeling code.
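As a rough illustration of how such a pointer can be read, the sketch below splits an `auto_map` entry of the form `repo_id--module.ClassName`, which is the convention transformers uses for remote code hosted in another repo. The module file name `modeling_bert`, the class name `BertModel`, and the helper `split_auto_map_entry` are assumptions made for this example; check a model repo's actual config.json for the real values.

```python
from typing import Optional, Tuple

# Hypothetical excerpt of a model repo's config.json, written as a Python dict.
# The module name "modeling_bert" and class name "BertModel" are assumptions.
config = {
    "auto_map": {
        "AutoModel": "Taykhoom/BERT-updated--modeling_bert.BertModel",
    },
}


def split_auto_map_entry(entry: str) -> Tuple[Optional[str], str, str]:
    """Split 'repo_id--module.ClassName' into (repo_id, module, class_name).

    An entry without '--' refers to code stored in the same repo,
    so repo_id is returned as None in that case.
    """
    repo_id, sep, class_ref = entry.partition("--")
    if not sep:
        repo_id, class_ref = None, entry
    module, _, class_name = class_ref.rpartition(".")
    return repo_id, module, class_name


repo_id, module, class_name = split_auto_map_entry(config["auto_map"]["AutoModel"])
print(repo_id, module, class_name)
```

With an entry like this, loading a model repo with `trust_remote_code=True` fetches the modeling code from the shared repo while weights and tokenizer come from the model repo.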
What was changed from stock transformers.BertModel
The standard HF BertModel (transformers 4.57.6) supports sdpa but not
flash_attention_2. This repo adds a complete attn_implementation dispatch:
| Backend | Class | Notes |
|---|---|---|
| eager | BertSelfAttention | Standard scaled dot-product, identical to original BERT |
| sdpa | BertSdpaSelfAttention | F.scaled_dot_product_attention, bool mask -> additive float mask |
| flash_attention_2 | BertFlashSelfAttention | flash_attn_varlen_func for padded inputs, flash_attn_func for unpadded |
The rest of the architecture (embeddings, FFN, pooler, weight layout) is unchanged.
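The two mask-handling notes for sdpa and flash_attention_2 can be sketched in plain PyTorch. This is a minimal illustration, not the code in this repo: the helper names `bool_to_additive_mask` and `cu_seqlens_from_mask` are hypothetical, and the example runs on CPU without flash-attn installed. The first helper builds the additive float mask passed to `F.scaled_dot_product_attention`; the second computes the cumulative sequence lengths that variable-length kernels such as `flash_attn_varlen_func` take for padded batches.

```python
import torch
import torch.nn.functional as F


def bool_to_additive_mask(attention_mask: torch.Tensor, dtype: torch.dtype) -> torch.Tensor:
    """Turn a (batch, seq) keep-mask (1/True = real token) into a
    (batch, 1, 1, seq) additive mask: 0 where attended, dtype-min where padded."""
    keep = attention_mask.to(torch.bool)[:, None, None, :]
    additive = torch.zeros(keep.shape, dtype=dtype, device=attention_mask.device)
    return additive.masked_fill(~keep, torch.finfo(dtype).min)


def cu_seqlens_from_mask(attention_mask: torch.Tensor):
    """Return int32 cumulative sequence lengths of shape (batch + 1,)
    and the longest unpadded length in the batch."""
    seqlens = attention_mask.sum(dim=-1, dtype=torch.int32)
    cu_seqlens = F.pad(torch.cumsum(seqlens, dim=0, dtype=torch.int32), (1, 0))
    return cu_seqlens, int(seqlens.max())


# Two sequences of true length 3 and 2, padded to length 4.
mask = torch.tensor([[1, 1, 1, 0], [1, 1, 0, 0]])
q = k = v = torch.randn(2, 2, 4, 8)  # (batch, heads, seq, head_dim)

# sdpa path: the additive mask broadcasts over heads and query positions.
out = F.scaled_dot_product_attention(
    q, k, v, attn_mask=bool_to_additive_mask(mask, q.dtype)
)

# varlen path: offsets marking where each sequence starts in the packed batch.
cu_seqlens, max_len = cu_seqlens_from_mask(mask)
print(out.shape, cu_seqlens.tolist(), max_len)
```

When no padding is present, every sequence has the same length and the offsets are not needed, which is why the unpadded case can call `flash_attn_func` directly.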
Usage
Do not load this repo directly. Load one of the model repos listed above:
```python
from transformers import AutoTokenizer, AutoModel

tokenizer = AutoTokenizer.from_pretrained("Taykhoom/RNABERT", trust_remote_code=True)
model = AutoModel.from_pretrained("Taykhoom/RNABERT", trust_remote_code=True)

# Flash Attention 2
model = AutoModel.from_pretrained(
    "Taykhoom/UTRBERT-3mer",
    trust_remote_code=True,
    attn_implementation="flash_attention_2",
)
```
Credits
Modeling code authored primarily by Claude Code and reviewed manually by Taykhoom Dalal.
License
Apache 2.0.