🦊 Fox 1.5 Nova

Fox 1.5 Nova is Teo's code generation model, fine-tuned for competitive programming, systems design, and real-world code patterns across 50+ languages.

πŸ† Comparison

| Metric | 🦊 Fox 1.5 Nova (Qwen3B) | Claude Mythos |
|---|---|---|
| Parameters | ~3B | ~200B+ |
| Speed | ~2.6 tok/s (4-bit) | N/A (API only) |
| Size | 2GB (4-bit) / 5.8GB (fp16) | ~80GB |
| RAM Required | ~8GB | ~256GB |
| VRAM Required | ~4GB | N/A |
| Cost | Free | $5-25 / 1M tokens |
| Runs on CPU | ✅ Yes | ❌ No |
| Internet Required | ❌ No | ✅ Yes |

📊 Benchmark Results

| Test Case | Tokens | Time | Speed |
|---|---|---|---|
| Prime checker | 52 | 20.5s | 2.5 tok/s |
| Binary search | 88 | 33.3s | 2.6 tok/s |
| Stack class | 45 | 17.1s | 2.6 tok/s |
| Quicksort | 84 | 31.8s | 2.6 tok/s |
| Fibonacci DP | 72 | 27.5s | 2.6 tok/s |
| Average | - | - | 2.6 tok/s |
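Speed here is total generated tokens divided by wall-clock generation time (e.g. 52 tokens / 20.5s ≈ 2.5 tok/s for the prime checker). A minimal timing helper in that style, shown as an illustration rather than the exact script behind these numbers:

```python
import time

def tokens_per_second(model, tokenizer, prompt, max_new_tokens=128):
    """Rough throughput estimate: generated tokens / wall-clock seconds."""
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    start = time.perf_counter()
    outputs = model.generate(**inputs, max_new_tokens=max_new_tokens)
    elapsed = time.perf_counter() - start
    # Count only the newly generated tokens, not the prompt
    generated = outputs.shape[1] - inputs["input_ids"].shape[1]
    return generated / elapsed
```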

Code Quality Examples

Prime checker output:

```python
def is_prime(n):
    if n < 2:
        return False
    for i in range(2, int(n**0.5) + 1):
        if n % i == 0:
            return False
    return True
```
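A quick sanity check of the generated function:

```python
>>> is_prime(97)
True
>>> is_prime(100)
False
```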

Binary search output:

```python
def binary_search(arr, target):
    left, right = 0, len(arr) - 1
    while left <= right:
        mid = (left + right) // 2
        if arr[mid] == target:
            return mid
        elif arr[mid] < target:
            left = mid + 1
        else:
            right = mid - 1
    return -1
```
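The same kind of spot check for the search (index 3 holds the 7; a missing value returns -1):

```python
>>> binary_search([1, 3, 5, 7, 9, 11], 7)
3
>>> binary_search([1, 3, 5, 7, 9, 11], 4)
-1
```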

📊 Specs

| Metric | Value |
|---|---|
| Base Model | Qwen2.5-3B-Instruct |
| Fine-tune Method | QLoRA (4-bit NF4) |
| LoRA r | 16 |
| LoRA alpha | 32 |
| Max Length | 1024 tokens |
| Trainable Params | ~30M |
| Training Steps | 250 |
| Epochs | 10 |
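For reference, a minimal peft/bitsandbytes setup matching these hyperparameters. The target modules are an assumption (the card does not list them); attention plus MLP projections is what lands near the ~30M trainable-parameter figure on a Qwen2.5-3B base:

```python
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model

# 4-bit NF4 base weights, per the fine-tune method above
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)
base = AutoModelForCausalLM.from_pretrained(
    "Qwen/Qwen2.5-3B-Instruct",
    quantization_config=bnb_config,
    device_map="auto",
)

# r=16, alpha=32 from the specs table; target_modules is an assumption
lora = LoraConfig(
    r=16,
    lora_alpha=32,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],
    task_type="CAUSAL_LM",
)
model = get_peft_model(base, lora)
model.print_trainable_parameters()  # roughly ~30M trainable
```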

💻 Hardware

  • Training: NVIDIA RTX 3050 (6GB VRAM) via QLoRA + Unsloth
  • Inference: ~4GB VRAM (4-bit) or 8GB+ RAM

🚀 Usage

```python
from transformers import AutoTokenizer, AutoModelForCausalLM, BitsAndBytesConfig

model_name = "teolm30/Fox-1.5-Nova"

# 4-bit NF4 quantization keeps VRAM use around ~4GB
bnb_config = BitsAndBytesConfig(load_in_4bit=True, bnb_4bit_quant_type="nf4")
tokenizer = AutoTokenizer.from_pretrained(model_name, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_name, quantization_config=bnb_config, device_map="auto"
)

prompt = "Write a Python LRU cache"
messages = [{"role": "user", "content": prompt}]
text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)

# Send inputs to wherever device_map placed the model, rather than hard-coding "cuda"
inputs = tokenizer(text, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=512, do_sample=True, temperature=0.7)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
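`do_sample=True` with `temperature=0.7` gives varied completions across runs; pass `do_sample=False` instead for deterministic, greedy decoding when you want reproducible output.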

⚠️ Limitations

  • Speed limited to ~2.6 tok/s at 4-bit (faster at fp16 with more VRAM; see the loading sketch after this list)
  • Smaller 3B model, optimized for local deployment on modest hardware
  • For a larger 7B variant, see teolm30/Fox-1.5-Nova-7B
  • No built-in tool use (use the OpenClaw agent framework)
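If you have the VRAM for it, the fp16 weights (5.8GB on disk, per the comparison table) skip the quantization overhead; a minimal loading sketch, assuming a stock transformers setup:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "teolm30/Fox-1.5-Nova"
tokenizer = AutoTokenizer.from_pretrained(model_name, trust_remote_code=True)

# fp16 instead of 4-bit NF4: more VRAM, faster decoding
model = AutoModelForCausalLM.from_pretrained(
    model_name, torch_dtype=torch.float16, device_map="auto"
)
```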

🦊 Built by the FoxModelClaw agent for Teo's FoxOS development.
