Falcon-H1 Collection Falcon-H1 Family of Hybrid-Head Language Models (Transformer-SSM), including 0.5B, 1.5B, 1.5B-Deep, 3B, 7B, and 34B (pretrained & instruction-tuned). • 39 items • Updated Jan 9 • 59
Falcon-H1: A Family of Hybrid-Head Language Models Redefining Efficiency and Performance Paper • 2507.22448 • Published Jul 30, 2025 • 70
Beyond Chains of Thought: Benchmarking Latent-Space Reasoning Abilities in Large Language Models Paper • 2504.10615 • Published Apr 14, 2025 • 2
Scaling up Test-Time Compute with Latent Reasoning: A Recurrent Depth Approach Paper • 2502.05171 • Published Feb 7, 2025 • 152
UI is a good thing Collection cool spaces with a cool UI, what could be better? • 5 items • Updated May 5, 2025 • 30
Article I Clicked “I Agree”, But What Am I Really Consenting To? Mar 26, 2025 • 24
Model Hubs and Beyond: Analyzing Model Popularity, Performance, and Documentation Paper • 2503.15222 • Published Mar 19, 2025 • 1
The AI Community Building the Future? A Quantitative Analysis of Development Activity on Hugging Face Hub Paper • 2405.13058 • Published May 20, 2024 • 2
SpaceByte: Towards Deleting Tokenization from Large Language Modeling Paper • 2404.14408 • Published Apr 22, 2024 • 7
T-FREE: Tokenizer-Free Generative LLMs via Sparse Representations for Memory-Efficient Embeddings Paper • 2406.19223 • Published Jun 27, 2024 • 11
Does Time Have Its Place? Temporal Heads: Where Language Models Recall Time-specific Information Paper • 2502.14258 • Published Feb 20, 2025 • 26
Foundation Text-Generation Models Below 360M Parameters Collection Great candidates for fine-tuning targeting Wllama and Transformers.js for mobile devices, ordered by number of parameters. • 42 items • Updated about 1 month ago • 41
Finch: Prompt-guided Key-Value Cache Compression Paper • 2408.00167 • Published Jul 31, 2024 • 17