# Qwen 3.5 2B – TFLite (.tflite)
Qwen 3.5 2B in raw TFLite format for on-device inference with the TFLite Interpreter API.
For LiteRT-LM Engine usage, use the bundled version instead: paulsp94/Qwen3.5-2B-LiteRT-LM
## What's this
Raw .tflite model file – use this if you're building your own inference pipeline with the TFLite Interpreter API directly. If you want the ready-to-use LiteRT-LM bundle with tokenizer included, use the LiteRT-LM version instead.
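If you're going that route, here is a minimal sketch of loading the model with the standard TensorFlow Lite Python Interpreter and inspecting what the converted graph actually expects. The tensor names, shapes, and any named signatures depend on how the export was done, so treat them as things to verify rather than guarantees of this repo:

```python
# Minimal sketch: load the .tflite file and inspect its inputs/outputs.
# Requires TensorFlow (pip install tensorflow); file name taken from the
# Files section below.
import tensorflow as tf

interpreter = tf.lite.Interpreter(model_path="qwen35_2b.tflite")
interpreter.allocate_tensors()

# Print what the converted graph expects and returns before wiring up
# a generation loop.
for detail in interpreter.get_input_details():
    print("input:", detail["name"], detail["shape"], detail["dtype"])
for detail in interpreter.get_output_details():
    print("output:", detail["name"], detail["shape"], detail["dtype"])

# LLM conversions often expose named signatures (e.g. a prefill/decode
# split); list them and use a signature runner if present.
print(interpreter.get_signature_list())
```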
## Architecture

|  |  |
|---|---|
| Base model | Qwen/Qwen3.5-2B |
| Layers | 24 total: 18× GatedDeltaNet linear + 6× GQA full attention |
| Quantization | int8 dynamic |
| Format | TFLite (.tflite) |
| Size | ~1.9 GB |
## Files
- `qwen35_2b.tflite` – The converted model
- `tokenizer.json` – BPE tokenizer (you'll need to handle tokenization yourself, see the sketch below)
- `tokenizer_config.json` – Tokenizer configuration
- `config.json` – Original model config
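Since the raw .tflite carries no tokenizer, one option (an assumption, not something bundled here) is to load `tokenizer.json` with the Hugging Face `tokenizers` library and feed the resulting IDs to the interpreter:

```python
# Tokenization sketch using the `tokenizers` library
# (pip install tokenizers); any loader that understands tokenizer.json works.
from tokenizers import Tokenizer

tok = Tokenizer.from_file("tokenizer.json")

# Encode a prompt into token IDs for the model's input...
ids = tok.encode("Give me a short introduction to large language models.").ids
print(ids[:10])

# ...and decode generated IDs back into text.
print(tok.decode(ids))
```

Note that any chat-template formatting described in `tokenizer_config.json` has to be applied to the prompt text before encoding; the tokenizer itself won't do that for you.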
## Conversion
Source: allot/tools/model-export