# Flystral – LoRA Fine-tuned Ministral 3B for Drone Flight Control
LoRA adapter for real-time drone telemetry prediction from camera images, built for the Louise AI Safety Drone Escort system.
## What it does

Given a drone camera frame, the model outputs a telemetry vector (velocity, orientation, altitude adjustments) that drives autonomous flight control. This lets the drone react to visual obstacles and environmental conditions in real time during pedestrian escort missions.
## Training
| Parameter | Value |
|---|---|
| Base model | mistralai/Ministral-3-3B-Instruct-2512-BF16 |
| Method | LoRA (PEFT) |
| LoRA rank (r) | 4 |
| LoRA alpha | 8 |
| Target modules | q_proj, v_proj |
| Task type | CAUSAL_LM |
| Steps | 500 |
| Learning rate | 2e-4 |
| Gradient accumulation | 8 |
| Grad clipping | 0.3 |
| Precision | bfloat16 |
| Hardware | Google Colab T4 GPU (15 GB VRAM) |
| Training time | ~35 minutes |
| PEFT version | 0.18.1 |
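The hyperparameters in the table map directly onto a PEFT `LoraConfig`. A minimal sketch, taking only the rank, alpha, target modules, and task type from the table and leaving everything else at PEFT defaults (the actual training script is not published in this card):

```python
from peft import LoraConfig

# Values from the training table above; all other LoraConfig
# fields are left at their PEFT defaults.
lora_config = LoraConfig(
    r=4,
    lora_alpha=8,
    target_modules=["q_proj", "v_proj"],
    task_type="CAUSAL_LM",
)
```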
## Dataset

AirSim RGB+Depth Drone Flight 10K – 1,000 RGB frames (320×320) from the Microsoft AirSim simulator, each paired with a NumPy telemetry array containing velocity/orientation data.
Each training example pairs a drone camera image with a telemetry vector (50 float values) representing the drone's state. The model learns to predict these vectors from visual input.
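Since the base model is a causal LM, each 50-float telemetry array has to be serialized into a text target the model can learn to emit. A minimal sketch of one plausible serialization; the helper name and the fixed-precision format are assumptions, as the card does not document the exact encoding used in training:

```python
def telemetry_to_text(vec, precision: int = 3) -> str:
    """Serialize a telemetry vector (in training, a 50-float NumPy
    array; here any iterable of floats) into a plain-text target
    string. Hypothetical format, not the card's documented one."""
    return " ".join(f"{v:.{precision}f}" for v in vec)

# Example: an all-zero 50-value state vector
print(telemetry_to_text([0.0] * 50)[:11])  # "0.000 0.000"
```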
## Training loss

```
Step  64/500  loss=10.6414
Step 128/500  loss=9.5537
Step 192/500  loss=7.0885
Step 256/500  loss=4.6498
Step 320/500  loss=3.1225
Step 384/500  loss=2.4410
Step 448/500  loss=1.9873
Step 500/500  loss=1.7251
```
Loss decreased from 10.6 to 1.7 over 500 steps, indicating the adapter learned to map visual features to telemetry predictions.
## Usage

```python
import torch
from transformers import AutoProcessor, Mistral3ForConditionalGeneration
from peft import PeftModel
from PIL import Image

# Load the base model and merge the LoRA adapter into its weights
processor = AutoProcessor.from_pretrained("mistralai/Ministral-3-3B-Instruct-2512-BF16")
model = Mistral3ForConditionalGeneration.from_pretrained(
    "mistralai/Ministral-3-3B-Instruct-2512-BF16",
    torch_dtype=torch.bfloat16,
)
model = PeftModel.from_pretrained(model, "BenBarr/flystral")
model = model.merge_and_unload().cuda().eval()

# Build a single-image chat prompt
img = Image.open("drone_frame.jpg").convert("RGB")
messages = [{"role": "user", "content": [
    {"type": "image"},
    {"type": "text", "text": "Output the raw telemetry for this frame."},
]}]
text = processor.apply_chat_template(messages, add_generation_prompt=True)
inputs = processor(text=text, images=[img], return_tensors="pt").to("cuda")

# Greedy decode, then strip the prompt tokens from the output
with torch.no_grad():
    output_ids = model.generate(**inputs, max_new_tokens=200, do_sample=False)
result = processor.decode(output_ids[0][inputs["input_ids"].shape[-1]:], skip_special_tokens=True)
print(result)  # Telemetry vector: vx, vy, vz, yaw_rate, ...
```
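The decoded `result` is free text, so it must be parsed back into floats before it can drive flight control. A hedged sketch of such a parser; the `parse_telemetry` helper and its tolerance for stray non-numeric tokens are assumptions, not part of the released code:

```python
import re

def parse_telemetry(text: str, expected_len: int = 50) -> list[float]:
    """Extract a float vector from the model's text output.
    Hypothetical helper: pulls every numeric literal out of the
    generation and validates the count against the expected length."""
    values = [float(m) for m in
              re.findall(r"-?\d+(?:\.\d+)?(?:[eE][+-]?\d+)?", text)]
    if len(values) != expected_len:
        raise ValueError(f"expected {expected_len} values, got {len(values)}")
    return values
```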
## Architecture
The adapter sits in the Louise multi-agent drone escort system:
- Flystral (this model) – flight control from camera images
- Helpstral – safety/threat assessment from camera images (Pixtral 12B)
- Louise – conversational safety companion (Ministral 3B)
When the fine-tuned endpoint is available, Flystral uses this adapter. When offline, it falls back to agentic mode on the base Ministral 3B via the Mistral API with function calling.
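A sketch of how that availability check and fallback could be wired up; the health-check URL, timeout, and mode names are placeholders, not part of the released system:

```python
import urllib.request
import urllib.error

FLYSTRAL_ENDPOINT = "http://localhost:8000/health"  # hypothetical endpoint URL

def flystral_available(timeout: float = 2.0) -> bool:
    """Probe the fine-tuned endpoint; any network failure means
    the caller should fall back to agentic mode on the base model."""
    try:
        with urllib.request.urlopen(FLYSTRAL_ENDPOINT, timeout=timeout) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError):
        return False

mode = "flystral-adapter" if flystral_available() else "ministral-3b-agentic"
```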
## Developed by

Ben Barrett – Mistral Worldwide Hackathon 2026