# Resemble Enhance FP16 Quantized
FP16 (half-precision) quantized version of Resemble Enhance for mobile deployment.
## Model Information
- Original Model: ResembleAI/resemble-enhance
- Quantization: FP16 (half precision)
- Size Reduction: 50% (from FP32)
- Parameters: 356,414,076
- Model Size: 679.81 MB (both figures can be verified with the sketch below)
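
The parameter count and on-disk size can be checked directly against the checkpoint. This is a minimal sketch that assumes the file is a flat state dict of FP16 tensors (as in the loading example further down); if the weights are nested under a key such as `module`, index into that dict first.

```python
import torch

# Inspect the FP16 checkpoint and reproduce the figures above.
state_dict = torch.load("mp_rank_00_model_states_fp16.pt", map_location="cpu")
tensors = [t for t in state_dict.values() if torch.is_tensor(t)]

num_params = sum(t.numel() for t in tensors)
size_mb = sum(t.numel() * t.element_size() for t in tensors) / (1024 ** 2)

print(f"Parameters: {num_params:,}")       # expected: 356,414,076
print(f"Approx. size: {size_mb:.2f} MB")   # expected: ~679.81 MB (2 bytes per FP16 value)
```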
## Usage
This FP16 quantized model is optimized for:
- iOS devices: Compatible with Apple Neural Engine (ANE)
- Mobile deployment: Reduced memory footprint
- Faster inference: 2-3x faster than FP32 on supported hardware
### Loading the Model
```python
import torch

# Load the FP16 state dict
state_dict = torch.load("mp_rank_00_model_states_fp16.pt", map_location="cpu")

# Instantiate the model architecture and convert it to FP16
# before loading the half-precision weights.
model = YourResembleEnhanceModel()  # replace with the Resemble Enhance model class
model = model.half()                # convert parameters to FP16
model.load_state_dict(state_dict)
model.eval()
```
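
After loading, a dummy forward pass is a quick way to confirm that the weights and dtype are consistent. The input shape below (one second of 44.1 kHz mono audio) is an assumption for illustration; match it to the actual forward signature of the model class you instantiated.

```python
import torch

# Dummy forward pass to confirm the FP16 weights load and run.
dummy_audio = torch.randn(1, 44100, dtype=torch.float16)  # assumed input shape

with torch.no_grad():
    output = model(dummy_audio)

print(output.shape, output.dtype)  # expect an FP16 tensor
```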
### Conversion to CoreML (iOS)
For iOS deployment, convert to CoreML:
```python
import coremltools as ct
import torch

# coremltools expects a traced (TorchScript) module rather than the raw nn.Module.
# Trace with a representative input; the shape here is illustrative only.
# If tracing fails for FP16 ops on CPU, trace in FP32 instead (model.float());
# coremltools uses FP16 compute precision for ML Programs by default.
example_input = torch.randn(1, 44100, dtype=torch.float16)
traced_model = torch.jit.trace(model, example_input)

mlmodel = ct.convert(
    traced_model,
    inputs=[ct.TensorType(name="input", shape=example_input.shape)],
    minimum_deployment_target=ct.target.iOS16,
)

# ML Program models (the default for iOS15+ targets) are saved as .mlpackage
mlmodel.save("ResembleEnhanceFP16.mlpackage")
```
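
Before integrating the package into an iOS app, you can sanity-check it from Python with coremltools (prediction requires macOS). This sketch assumes the input name `"input"` declared during conversion and reuses the illustrative shape from tracing.

```python
import numpy as np
import coremltools as ct

# Load the converted package and run a test prediction (macOS only).
mlmodel = ct.models.MLModel("ResembleEnhanceFP16.mlpackage")

dummy_audio = np.random.randn(1, 44100).astype(np.float32)
prediction = mlmodel.predict({"input": dummy_audio})
print(prediction.keys())  # inspect the output feature name(s)
```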
## Performance
- Size: 679.81 MB (50% reduction from FP32)
- Inference Speed: 2-3x faster than FP32 on the Apple Neural Engine (see the benchmark sketch below)
- Quality: Minimal perceptual loss compared to FP32
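
The speed figures above refer to the Apple Neural Engine after CoreML conversion. For a rough comparison on your own machine, the sketch below times FP16 against FP32 forward passes in PyTorch; the input shape and iteration count are arbitrary, `model` is the FP16 model loaded earlier, and the result reflects your local backend rather than the ANE.

```python
import time
import torch

def avg_latency_ms(model, dummy_input, iterations=20):
    """Average forward-pass wall-clock latency in milliseconds."""
    with torch.no_grad():
        for _ in range(3):                    # warm-up runs
            model(dummy_input)
        start = time.perf_counter()
        for _ in range(iterations):
            model(dummy_input)
    return (time.perf_counter() - start) / iterations * 1000

dummy = torch.randn(1, 44100, dtype=torch.float16)  # illustrative input shape
print(f"FP16: {avg_latency_ms(model, dummy):.1f} ms")

# .float() converts the model in place; reload the FP16 weights afterwards if needed.
print(f"FP32: {avg_latency_ms(model.float(), dummy.float()):.1f} ms")
```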
## Original Model
This is a quantized version of ResembleAI/resemble-enhance.
For more information about the original model, please refer to the original repository.
## License
This model follows the same license as the original Resemble Enhance model.