# gte_compressed

A compact multilingual sentence encoder, compressed 26x from Alibaba-NLP/gte-multilingual-base.

## Model Details

| Property | Value |
|---|---|
| Base model | Alibaba-NLP/gte-multilingual-base |
| Architecture | `new` (encoder) |
| Hidden dim | 384 (from 768) |
| Layers | 4 (from 12) |
| Intermediate dim | 1536 |
| Attention heads | 6 |
| Vocab size | 8,675 (from 250,048) |
| Parameters | ~10.6M |
| Model size (FP32) | 48.8 MB |
| Compression | 26x |
| Distilled | No |
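
The reduced dimensions can be read back from the checkpoint's config; a quick check, assuming the standard `transformers` config field names apply to this custom architecture:

```python
from transformers import AutoConfig

cfg = AutoConfig.from_pretrained("gte_compressed", trust_remote_code=True)
print(cfg.hidden_size, cfg.num_hidden_layers, cfg.intermediate_size)  # 384 4 1536
print(cfg.num_attention_heads, cfg.vocab_size)                        # 6 8675
```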

## Quick Start

```python
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("gte_compressed", trust_remote_code=True)

sentences = [
    "Hello, how are you?",
    "안녕하세요, 잘 지내세요?",  # Korean
    "こんにちは、元気ですか？",  # Japanese
    "你好，你好吗？",  # Chinese
]

embeddings = model.encode(sentences)
print(embeddings.shape)  # (4, 384)
```
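
The returned embeddings can be compared directly; a minimal follow-up (continuing the snippet above), assuming sentence-transformers >= 3.0 for the built-in `similarity` helper:

```python
# Cosine similarity between all sentence pairs; the four greetings above
# should score relatively high against one another.
scores = model.similarity(embeddings, embeddings)
print(scores.shape)  # (4, 4)
```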

## MTEB Evaluation Results

**Overall average: 29.76%** (mean over all 25 tasks)

| Task Group | Average |
|---|---|
| Classification | 34.67% |
| Clustering | 27.32% |
| STS | 27.56% |
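
The per-task scores below were measured on MTEB; a minimal reproduction sketch, assuming the `mteb` Python package (v1 API) is installed:

```python
import mteb
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("gte_compressed", trust_remote_code=True)

# Evaluate a single task from the STS table below; extend the task list
# to reproduce the other scores.
tasks = mteb.get_tasks(tasks=["STSBenchmark"])
results = mteb.MTEB(tasks=tasks).run(model, output_folder="mteb_results")
```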

### Classification

| Task | Average | Details |
|---|---|---|
| AmazonCounterfactualClassification | 56.65% | en: 60.42%, en-ext: 59.21%, ja: 53.79%, de: 53.19% |
| Banking77Classification | 23.46% | default: 23.46% |
| ImdbClassification | 53.23% | default: 53.23% |
| MTOPDomainClassification | 29.15% | es: 33.54%, th: 29.9%, en: 29.72%, hi: 29.19%, de: 28.41% |
| MassiveIntentClassification | 12.03% | zh-CN: 22.47%, en: 20.25%, id: 18.92%, vi: 18.23%, tr: 17.5% |
| MassiveScenarioClassification | 15.96% | zh-CN: 26.89%, en: 22.5%, id: 21.64%, ms: 21.62%, es: 21.45% |
| ToxicConversationsClassification | 49.59% | default: 49.59% |
| TweetSentimentExtractionClassification | 37.27% | default: 37.27% |

### Clustering

| Task | Average | Details |
|---|---|---|
| ArXivHierarchicalClusteringP2P | 46.6% | default: 46.6% |
| ArXivHierarchicalClusteringS2S | 46.16% | default: 46.16% |
| BiorxivClusteringP2P.v2 | 9.23% | default: 9.23% |
| MedrxivClusteringP2P.v2 | 19.99% | default: 19.99% |
| MedrxivClusteringS2S.v2 | 18.6% | default: 18.6% |
| StackExchangeClustering.v2 | 38.67% | default: 38.67% |
| StackExchangeClusteringP2P.v2 | 31.97% | default: 31.97% |
| TwentyNewsgroupsClustering.v2 | 7.32% | default: 7.32% |

### STS

| Task | Average | Details |
|---|---|---|
| BIOSSES | 14.57% | default: 14.57% |
| SICK-R | 39.12% | default: 39.12% |
| STS12 | 33.18% | default: 33.18% |
| STS13 | 33.48% | default: 33.48% |
| STS14 | 30.91% | default: 30.91% |
| STS15 | 36.95% | default: 36.95% |
| STS17 | 13.25% | en-en: 47.75%, es-es: 45.85%, ar-ar: 28.57%, ko-ko: 25.03%, en-ar: 12.16% |
| STS22.v2 | 10.85% | zh: 41.25%, it: 35.25%, ar: 32.81%, es: 32.76%, tr: 18.57% |
| STSBenchmark | 35.75% | default: 35.75% |

## Training

Created via multi-method model compression, with no additional distillation or fine-tuning (a hypothetical sketch of the pruning steps follows the list):

  1. Teacher: Alibaba-NLP/gte-multilingual-base (12 layers, 768-dim hidden, 277M params)
  2. Layer pruning: 12 → 4 layers (uniform selection across depth)
  3. Hidden-dim reduction: 768 → 384
  4. Vocab pruning: 250,048 → 8,675 tokens (most frequent tokens up to 90% cumulative frequency)
  5. Resulting compression: 26x
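
The compression code itself is not part of this card; below is a hypothetical sketch of steps 2 and 4 (uniform layer selection and frequency-based vocab pruning). The index spacing, stand-in corpus, and variable names are illustrative assumptions, and step 3's hidden-dim reduction is omitted because the method used is not stated.

```python
from collections import Counter

from transformers import AutoTokenizer

# Step 2: uniform layer selection, 12 -> 4. Evenly spaced indices are one
# plausible "uniform" choice; the exact kept indices are not stated.
num_teacher_layers, num_student_layers = 12, 4
stride = num_teacher_layers / num_student_layers
keep = [round(i * stride) for i in range(num_student_layers)]  # [0, 3, 6, 9]

# Step 4: prune the vocab to the smallest set of tokens covering 90% of
# cumulative token frequency over a reference corpus (stand-in corpus here).
tokenizer = AutoTokenizer.from_pretrained("Alibaba-NLP/gte-multilingual-base")
corpus = ["Hello, how are you?", "안녕하세요, 잘 지내세요?", "你好，你好吗？"]
counts = Counter(tok for text in corpus for tok in tokenizer.tokenize(text))
total = sum(counts.values())
kept_tokens, cum = [], 0
for token, freq in counts.most_common():
    kept_tokens.append(token)
    cum += freq
    if cum / total >= 0.90:
        break
print(f"kept {len(kept_tokens)} of {len(counts)} observed token types")
```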

## Supported Languages (18)

ko, en, ja, zh, es, fr, de, pt, it, ru, ar, hi, th, vi, id, tr, nl, pl
