Falcon-H1-Tiny
A series of extremely small, yet powerful language models redefining capabilities at small scale.
For more details about the training protocol of this model, please refer to the Falcon-H1-Tiny technical blogpost.
Currently, to use this model you can rely on Hugging Face transformers, vLLM, sglang, llama.cpp, ollama, or the mlx library. You should use this model for Python code generation or Python fill-in-the-middle (FIM) tasks. The FIM format is the following:
<|prefix|>{prefix}<|suffix|>{suffix}<|middle|>
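For example, with Hugging Face transformers you can assemble a FIM prompt in the format above and generate the missing middle. This is a minimal sketch, assuming the non-GGUF checkpoint is published as tiiuae/Falcon-H1-Tiny-90M-Coder (the counterpart of the GGUF repository used below); the prefix/suffix snippets are purely illustrative:

```python
# Minimal FIM sketch with Hugging Face transformers.
# Assumption: the (non-GGUF) checkpoint id is tiiuae/Falcon-H1-Tiny-90M-Coder.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "tiiuae/Falcon-H1-Tiny-90M-Coder"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

prefix = "def fibonacci(n):\n    "
suffix = "\n    return a"
# Assemble the fill-in-the-middle prompt exactly as documented above.
prompt = f"<|prefix|>{prefix}<|suffix|>{suffix}<|middle|>"

inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=64)
# Drop the prompt tokens and keep only the generated middle section.
middle = tokenizer.decode(
    outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True
)
print(middle)
```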
llama.cpp
You can find all GGUF files compatible with llama.cpp in our official collection; an example setup could be:
brew install llama.cpp
pip install huggingface_hub
hf download tiiuae/Falcon-H1-Tiny-90M-Coder-GGUF Falcon-H1-Tiny-90M-Coder-GGUF-Q8_0.gguf --local-dir ./
llama-cli -m ./Falcon-H1-Tiny-90M-Coder-GGUF-Q8_0.gguf -cnv
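llama.cpp also ships a local HTTP server. This is a minimal sketch, assuming you start it with llama-server -m ./Falcon-H1-Tiny-90M-Coder-GGUF-Q8_0.gguf on the default host and port (127.0.0.1:8080) and query its native /completion endpoint; the FIM prompt is illustrative:

```python
# Query a locally running llama-server over its native /completion endpoint.
# Assumption: the server was started with
#   llama-server -m ./Falcon-H1-Tiny-90M-Coder-GGUF-Q8_0.gguf
# and listens on the default host/port (127.0.0.1:8080).
import json
import urllib.request

prompt = "<|prefix|>def add(a, b):\n    <|suffix|>\n<|middle|>"
payload = json.dumps({"prompt": prompt, "n_predict": 64}).encode("utf-8")

req = urllib.request.Request(
    "http://127.0.0.1:8080/completion",
    data=payload,
    headers={"Content-Type": "application/json"},
)
with urllib.request.urlopen(req) as resp:
    # The native endpoint returns the generated text under "content".
    print(json.loads(resp.read())["content"])
```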
ollama
ollama run hf.co/tiiuae/Falcon-H1-Tiny-90M-Coder-GGUF:Q8_0
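Besides the interactive CLI, the Ollama daemon exposes a local REST API. This is a minimal sketch, assuming the daemon is running on its default port (11434) and the model has been pulled with the command above; the FIM prompt is illustrative:

```python
# Call the local Ollama REST API for the model pulled above.
# Assumption: the Ollama daemon is running on its default port (11434).
import json
import urllib.request

payload = json.dumps({
    "model": "hf.co/tiiuae/Falcon-H1-Tiny-90M-Coder-GGUF:Q8_0",
    "prompt": "<|prefix|>def add(a, b):\n    <|suffix|>\n<|middle|>",
    "stream": False,  # return a single JSON object instead of a stream
}).encode("utf-8")

req = urllib.request.Request(
    "http://127.0.0.1:11434/api/generate",
    data=payload,
    headers={"Content-Type": "application/json"},
)
with urllib.request.urlopen(req) as resp:
    print(json.loads(resp.read())["response"])
```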
For a detailed evaluation of the Falcon-H1-Tiny series, please refer to our technical blogpost.
If the Falcon-H1-Tiny family of models was helpful to your work, feel free to cite us:
@misc{falcon_h1_tiny,
  title={Falcon-H1-Tiny: A series of extremely small, yet powerful language models redefining capabilities at small scale},
  author={Falcon-LLM Team},
  year={2026},
}
Available GGUF quantizations: 1-bit, 2-bit, 3-bit, 4-bit, 5-bit, 6-bit, 8-bit, and 16-bit.