deepakint committed on
Commit 1a5890f · verified · 1 Parent(s): d016824

Remove emoji checkmarks and warning signs

Files changed (1)
  1. README.md +11 -11
README.md CHANGED
@@ -190,9 +190,9 @@ for label, names in sorted(grouped.items()):
  | Parameter | Value |
  |---|---|
  | Learning rate | 2e-05 |
- | Batch size | 16 (×2 gradient accumulation = 32 effective) |
  | Epochs | 3 |
- | Optimizer | AdamW (β₁=0.9, β₂=0.999, ε=1e-08) |
  | LR scheduler | Cosine with 10% warmup |
  | Seed | 42 |

@@ -206,19 +206,19 @@ for label, names in sorted(grouped.items()):

  **Note:** The best checkpoint (epoch ~2, lowest validation loss 0.0606) was selected as the final model, achieving **90.6% F1**.

- ## Strengths & Limitations

  ### Strengths
- - **Cross-domain**: Works on patents, papers, news, and political documents with a single model
- - **Multilingual**: Handles both English and German text
- - **Rich entity types**: 15 entity types covering people, organizations, locations, biological entities, diseases, instruments, and more
- - **Fast**: ~5ms per document on CPU — suitable for processing millions of documents
- - **Long context**: Inherits ModernBERT's 8,192 token context window

  ### Limitations
- - ⚠️ **Conference/product names**: May fragment uncommon compound names (e.g., "NeurIPS" split tokens) — use confidence thresholding (>0.5) to filter
- - ⚠️ **Languages**: Optimized for English and German; other languages may work but are untested
- - ⚠️ **Domain drift**: Performance is best on patent, scientific, political, and news text — may degrade on informal text (social media, chat)

  ## Recommended Post-Processing

 
  | Parameter | Value |
  |---|---|
  | Learning rate | 2e-05 |
+ | Batch size | 16 (x2 gradient accumulation = 32 effective) |
  | Epochs | 3 |
+ | Optimizer | AdamW |
  | LR scheduler | Cosine with 10% warmup |
  | Seed | 42 |

 
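For readers checking the hyperparameter table in this hunk, here is a minimal sketch of how the effective batch size and the "cosine with 10% warmup" schedule could be computed. It assumes a single device, a linear warmup ramp, and a decay to zero; the README states none of these. `total_steps` is a hypothetical value chosen for illustration, and the function names are not taken from the repository.

```python
import math

BASE_LR = 2e-5          # "Learning rate" row
PER_DEVICE_BATCH = 16   # "Batch size" row
GRAD_ACCUM_STEPS = 2    # "x2 gradient accumulation"
WARMUP_FRACTION = 0.10  # "10% warmup"


def effective_batch_size(per_device=PER_DEVICE_BATCH,
                         accum=GRAD_ACCUM_STEPS,
                         num_devices=1):
    """Samples contributing to one optimizer update (assumes num_devices=1)."""
    return per_device * accum * num_devices


def lr_at_step(step, total_steps,
               base_lr=BASE_LR, warmup_fraction=WARMUP_FRACTION):
    """Linear warmup to base_lr, then cosine decay to zero (assumed shape)."""
    warmup_steps = int(total_steps * warmup_fraction)
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return base_lr * 0.5 * (1.0 + math.cos(math.pi * progress))


if __name__ == "__main__":
    total_steps = 1000  # hypothetical; depends on dataset size and epochs
    print(effective_batch_size())
    for s in (0, 50, 100, 550, 1000):
        print(s, lr_at_step(s, total_steps))
```

With these assumptions the peak learning rate is reached at the end of warmup (step 100 of 1000) and falls to half of its peak midway through the remaining steps.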

  **Note:** The best checkpoint (epoch ~2, lowest validation loss 0.0606) was selected as the final model, achieving **90.6% F1**.

+ ## Strengths and Limitations

  ### Strengths
+ - **Cross-domain**: Works on patents, papers, news, and political documents with a single model
+ - **Multilingual**: Handles both English and German text
+ - **Rich entity types**: 15 entity types covering people, organizations, locations, biological entities, diseases, instruments, and more
+ - **Fast**: ~5ms per document on CPU — suitable for processing millions of documents
+ - **Long context**: Inherits ModernBERT's 8,192 token context window

  ### Limitations
+ - **Conference/product names**: May fragment uncommon compound names (e.g., "NeurIPS" split into tokens) — use confidence thresholding (>0.5) to filter
+ - **Languages**: Optimized for English and German; other languages may work but are untested
+ - **Domain drift**: Performance is best on patent, scientific, political, and news text — may degrade on informal text (social media, chat)

  ## Recommended Post-Processing
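
The confidence-thresholding advice in the Limitations list (keep predictions scoring above 0.5) can be sketched as follows. This is an illustration, not the repository's own post-processing code. It assumes each prediction is a dict with a `score` key, similar to what an aggregated token-classification pipeline typically returns. The sample predictions and the `ORG` label are hypothetical.

```python
def filter_by_confidence(entities, threshold=0.5):
    """Keep only predictions whose score is strictly above the threshold."""
    return [ent for ent in entities if ent["score"] > threshold]


if __name__ == "__main__":
    # Hypothetical predictions: one clean span plus two low-confidence
    # fragments of the kind the Limitations list warns about.
    predictions = [
        {"entity_group": "ORG", "word": "NeurIPS", "score": 0.93, "start": 0, "end": 7},
        {"entity_group": "ORG", "word": "Neur", "score": 0.31, "start": 20, "end": 24},
        {"entity_group": "ORG", "word": "IPS", "score": 0.27, "start": 24, "end": 27},
    ]
    for ent in filter_by_confidence(predictions):
        print(ent["entity_group"], ent["word"], ent["score"])
```

A strict `>` comparison matches the ">0.5" wording in the README; whether a score of exactly 0.5 should pass is a choice the source leaves open.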