pplx-embed-v1-0.6b-mlx

MLX conversion of perplexity-ai/pplx-embed-v1-0.6b for Apple Silicon.

This is a standard embedding model. It takes a list of texts and returns one embedding matrix for the batch.

Important Loading Note

This artifact is not loadable through vanilla mlx_lm.load() because MLX-LM does not natively support Perplexity's custom bidirectional_pplx_qwen3 model type. The repository therefore bundles a small pplx_mlx_convert loader package for it.

Source Code

Conversion and validation code lives in https://github.com/thehumanworks/pplx-mlx.

Install

pip install mlx mlx-lm transformers huggingface_hub numpy

Usage

import sys
from huggingface_hub import snapshot_download

# Download the converted weights plus the bundled loader package.
repo_path = snapshot_download("agentmish/pplx-embed-v1-0.6b-mlx")
# Make the bundled pplx_mlx_convert package importable.
sys.path.insert(0, repo_path)

from pplx_mlx_convert import load_embedder

embedder = load_embedder(repo_path)
texts = [
    "Scientists explore the universe driven by curiosity.",
    "Children learn through curious exploration.",
    "Historical discoveries began with curious questions.",
]

embeddings = embedder.encode(texts)
print(embeddings.shape)  # (3, 1024)
print(embeddings.dtype)  # int8

By default, the model produces unnormalized int8 embeddings; use cosine similarity to compare them. embedder.encode(..., quantization="none") returns float32 pooled embeddings instead, and embedder.encode(..., quantization="binary") returns binary tanh embeddings.
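Because the int8 embeddings are unnormalized, comparisons should use cosine similarity rather than raw dot products. A minimal NumPy sketch, using random int8 vectors as placeholders for real embedder.encode(...) rows:

```python
import numpy as np

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    """Cosine similarity; cast int8 inputs to float32 to avoid overflow."""
    a = a.astype(np.float32)
    b = b.astype(np.float32)
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# Placeholder int8 vectors standing in for embedder.encode(texts) output.
rng = np.random.default_rng(0)
emb = rng.integers(-128, 127, size=(3, 1024), dtype=np.int8)

# Pairwise similarity matrix for the batch.
sims = [[cosine_similarity(emb[i], emb[j]) for j in range(3)] for i in range(3)]
```

The float32-before-normalizing step matters: taking dot products directly in int8 would overflow for 1024-dimensional vectors.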

Conversion Details

  • Source model: perplexity-ai/pplx-embed-v1-0.6b
  • Source revision: see conversion.json
  • Converted dtype: bfloat16
  • Embedding dimension: 1024
  • Output root expected by this workspace: artifacts/mlx/pplx-embed-v1-0.6b

Validation

Local MLX smoke validation passed: the raw float embeddings are finite and the int8 embedding output has the expected shape (2, 1024).

Compared against the original Transformers remote-code float32 model on sample text inputs:

  • cosine similarities: 0.9997950, 0.9997987, 0.9997995
  • max absolute int8 delta: 1; mean absolute int8 delta: 0.191

The MLX artifact is bfloat16 while the reference path used float32, so int8 values are not expected to be bit-identical.
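The delta metrics above can be computed along these lines; the two arrays below are synthetic stand-ins for the MLX and Transformers int8 outputs, not real model results:

```python
import numpy as np

# Synthetic stand-ins for the MLX int8 output and the float32 reference path.
rng = np.random.default_rng(1)
mlx_int8 = rng.integers(-128, 127, size=(3, 1024), dtype=np.int8)

# Perturb a subset of values by +1 to mimic bfloat16-vs-float32 rounding drift.
ref_int8 = mlx_int8.copy()
flip = rng.random(mlx_int8.shape) < 0.2
ref_int8[flip] = (ref_int8[flip].astype(np.int16) + 1).astype(np.int8)

# Subtract in a wider dtype so the difference cannot overflow int8.
delta = np.abs(mlx_int8.astype(np.int16) - ref_int8.astype(np.int16))
max_delta = int(delta.max())
mean_delta = float(delta.mean())
```

A max absolute delta of 1 means every quantized value lands in the same bucket as the reference or an adjacent one, which is the expected outcome of a precision change rather than a conversion bug.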

License

The source model is MIT licensed. This conversion preserves the MIT license.
