Project ARGUS — LFM2.5-VL Military Satellite Detection Adapter

Autonomous Reconnaissance & Ground Understanding System, built for the Liquid AI x DPhi Space "AI in Space" hackathon.

Overview

This is a LoRA adapter fine-tuned on top of LiquidAI/LFM2.5-VL-450M for military object detection in satellite imagery. It enables the base VLM to output structured JSON tactical reports directly from overhead reconnaissance images — replacing traditional multi-stage YOLO detection pipelines with a single unified inference pass.

Key Capabilities

| Capability | Description |
|---|---|
| Military Vehicle Detection | Tanks, APCs, trucks, artillery, civilian vehicles |
| Aerial Asset Detection | Aircraft, helicopters, UAVs at airfields |
| Naval Detection | Ships, submarines, harbor installations |
| Infrastructure Analysis | Bridges, storage tanks, port cranes, helipads |
| Threat Assessment | LOW / MEDIUM / HIGH classification per target |
| Tactical Reasoning | Natural-language assessment for each detection |

Training Details

  • Base Model: LiquidAI/LFM2.5-VL-450M
  • Method: QLoRA (4-bit) via Unsloth
  • LoRA Config: r=16, alpha=32, all linear layers
  • Trainable Parameters: 1,376,256 / 450,095,104 (0.31%)
  • Training Data: 3,512 samples (MVRSD military vehicles + DOTA aerial objects)
  • Epochs: 3 (1,317 steps)
  • Final Loss: 0.4017
  • Hardware: NVIDIA T4 GPU
  • Training Time: ~60 minutes
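As a quick sanity check on the numbers above, the stated trainable-parameter count works out to the quoted 0.31% of the full model:

```python
# Verify the trainable-parameter ratio quoted in the training details.
trainable_params = 1_376_256
total_params = 450_095_104

ratio = trainable_params / total_params
print(f"Trainable fraction: {ratio:.2%}")  # prints "Trainable fraction: 0.31%"
```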

Datasets

| Dataset | Samples | Source |
|---|---|---|
| MVRSD | 12 (demo) | Military Vehicle Remote Sensing Dataset |
| DOTA | 3,500 | Large-scale aerial object detection |

Usage

With PEFT (recommended)

```python
from transformers import AutoProcessor, AutoModelForImageTextToText
from peft import PeftModel
from PIL import Image

# Load base + adapter
base = AutoModelForImageTextToText.from_pretrained(
    "LiquidAI/LFM2.5-VL-450M",
    device_map="auto",
    torch_dtype="auto",
)
model = PeftModel.from_pretrained(base, "johnny711/argus-lfm-lora")
model = model.merge_and_unload()  # merge for faster inference

processor = AutoProcessor.from_pretrained("LiquidAI/LFM2.5-VL-450M")

# Run detection
image = Image.open("satellite_image.jpg")
prompt = """You are an orbital intelligence analyst examining satellite imagery \
from a defense reconnaissance satellite at ~800 km altitude.

Detect ALL military-relevant objects visible in this image. For each object, provide:
- "label": specific type of object
- "bbox": normalized bounding box [x1, y1, x2, y2] in [0, 1]
- "threat_level": "LOW", "MEDIUM", or "HIGH"
- "confidence": 0.0 to 1.0
- "reasoning": brief tactical assessment

Return a JSON array. If no targets visible, return: []"""

messages = [{"role": "user", "content": [
    {"type": "image", "image": image},
    {"type": "text", "text": prompt},
]}]

inputs = processor.apply_chat_template(
    messages, add_generation_prompt=True,
    return_tensors="pt", return_dict=True, tokenize=True,
).to(model.device)

outputs = model.generate(
    **inputs,
    max_new_tokens=512,
    do_sample=True,   # temperature only takes effect when sampling is enabled
    temperature=0.1,
)
new_tokens = outputs[:, inputs["input_ids"].shape[1]:]
result = processor.batch_decode(new_tokens, skip_special_tokens=True)[0]
print(result)
```
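The model returns the detections as text, and small VLMs can occasionally wrap the JSON in stray tokens. A defensive parsing sketch (the `parse_detections` helper is our own illustration, not part of the model's API):

```python
import json

def parse_detections(raw: str) -> list:
    """Extract the outermost JSON array from the model's raw text output.

    Falls back to an empty list (the model's "no targets" convention)
    when no well-formed array can be found.
    """
    start, end = raw.find("["), raw.rfind("]")
    if start == -1 or end == -1 or end < start:
        return []
    try:
        return json.loads(raw[start:end + 1])
    except json.JSONDecodeError:
        return []

detections = parse_detections('[{"label": "Tank", "threat_level": "HIGH"}]')
print(detections[0]["label"])  # prints "Tank"
```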

Example Output

```json
[
  {
    "label": "Small Military Vehicle",
    "bbox": [0.0, 0.3438, 0.0645, 0.0664],
    "threat_level": "LOW",
    "confidence": 0.85,
    "reasoning": "Small Military Vehicle detected near tree cover, likely concealed staging area"
  },
  {
    "label": "Naval Vessel",
    "bbox": [0.0547, 0.5625, 0.0664, 0.0527],
    "threat_level": "HIGH",
    "confidence": 0.82,
    "reasoning": "Naval Vessel visible in desert terrain, limited concealment"
  }
]
```

Project ARGUS Architecture

```
Satellite Image (GigaPixel)
        |
  [Phase 1] LFM2.5-VL + LoRA  -->  JSON detections (this model)
        |
  [Phase 2] Depth Anything 3   -->  3D reality check (decoy filtering)
        |
  [Phase 3] Report Assembly    -->  Tactical JSON downlink (bytes, not GB)
```

The Problem: Military satellites capture massive images but have limited downlink bandwidth. Sending gigabytes of raw imagery to ground stations adds hours of latency.

Our Solution: Run AI at the edge (in orbit). This adapter enables a 450M-parameter VLM to perform unified detection, classification, and tactical reasoning in a single inference pass — producing a tiny JSON report instead of raw imagery.
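To make the bandwidth argument concrete, here is a back-of-the-envelope comparison. The sizes are illustrative assumptions (a 4 GiB capture, a ~2 KiB report), not measured mission figures:

```python
# Illustrative downlink comparison: raw capture vs. JSON tactical report.
raw_image_bytes = 4 * 1024**3   # assume a 4 GiB gigapixel capture
json_report_bytes = 2 * 1024    # assume a ~2 KiB JSON report

savings = raw_image_bytes / json_report_bytes
print(f"Downlink reduction: ~{savings:,.0f}x")  # prints "Downlink reduction: ~2,097,152x"
```

Even if the report grows to tens of kilobytes, the reduction stays above five orders of magnitude.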

Developed by

License

Apache 2.0


This lfm2_vl model was trained 2x faster with Unsloth
