rally-2b-v2

Browser-oriented ONNX export of a Gemma 4 Heretic checkpoint packaged for WebGPU / Transformers.js.

Capabilities

  • Supported inputs: text, image, audio, video

Version Notes

  • This is the enhanced browser v2 package for this model family.
  • Compared to the v1 package at thomasjvu/rally-2b, it adds support for audio and video inputs.
  • The lighter v1 package remains available at thomasjvu/rally-2b if you only need text and image inputs.

Provenance

  • Source model: /home/jovyan/work/heretic-to-onnx/build/phala_gpu_tee/rally-2b-direct/inputs/source
  • Base model for inherited processor assets: google/gemma-4-E2B-it
  • Architecture family: gemma4_conditional_generation
  • Expected architecture: Gemma4ForConditionalGeneration
  • Target dtype: q4f16
  • Target device: webgpu

Expected ONNX Sessions

  • vision_encoder_q4f16.onnx
  • audio_encoder_q4f16.onnx
  • embed_tokens_q4f16.onnx
  • decoder_model_merged_q4f16.onnx
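
As a quick sanity check, the session list above can be compared against a repo file listing. The sketch below is illustrative only: the helper names (expectedOnnxFiles, missingOnnxFiles) are hypothetical, and the "onnx/" folder prefix is an assumption based on the usual Transformers.js repo layout, not something this card states.

```typescript
// Session base names taken from the list above.
const SESSION_NAMES = [
  "vision_encoder",
  "audio_encoder",
  "embed_tokens",
  "decoder_model_merged",
] as const;

// Build the expected file paths for a given dtype suffix.
// ASSUMPTION: files live under "onnx/" and follow "<name>_<dtype>.onnx".
function expectedOnnxFiles(dtype: string = "q4f16", folder: string = "onnx"): string[] {
  return SESSION_NAMES.map((name) => `${folder}/${name}_${dtype}.onnx`);
}

// Return the expected files that are absent from a repo file listing.
function missingOnnxFiles(repoFiles: string[], dtype: string = "q4f16"): string[] {
  const present = new Set(repoFiles);
  return expectedOnnxFiles(dtype).filter((file) => !present.has(file));
}
```

Calling missingOnnxFiles with the repo's file listing returns an empty array when all four sessions are present under the assumed layout.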

Usage

Load this repo with Transformers.js on the WebGPU backend; the loader reads the transformers.js_config metadata shipped with the model.
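
If you prefer to state the settings explicitly instead of relying on the shipped metadata, the load options might look like the sketch below. It only builds a plain options object from the target device and dtype listed under Provenance. The function name buildLoadOptions is hypothetical, and the per-session dtype keys are an assumption that mirrors the session file names; explicit options may be redundant if transformers.js_config already sets them.

```typescript
// Shape of the options object assumed here: a device string plus a
// per-session dtype map.
type LoadOptions = {
  device: string;
  dtype: Record<string, string>;
};

// Build explicit load options from the values listed in this card.
// ASSUMPTION: per-session dtype keys match the ONNX session base names.
function buildLoadOptions(): LoadOptions {
  const sessions = ["vision_encoder", "audio_encoder", "embed_tokens", "decoder_model_merged"];
  const dtype: Record<string, string> = {};
  for (const session of sessions) {
    dtype[session] = "q4f16"; // target dtype from the Provenance section
  }
  return { device: "webgpu", dtype }; // target device from the Provenance section
}
```

An object like this would typically be passed as the second argument to a from_pretrained call in Transformers.js. Which model and processor classes resolve for this architecture depends on the installed Transformers.js version, which this card does not specify.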
