Abstract: Post-training quantization (PTQ) is an effective solution for deploying deep neural networks on edge devices with limited resources. PTQ is especially attractive because it does not require ...
Abstract: Post-training quantization (PTQ) has emerged as a practical approach to compress large neural networks, making them highly efficient for deployment. However, effectively reducing these ...
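To make the idea concrete, here is a minimal sketch of weight-only post-training quantization, assuming symmetric per-tensor int8 with a scale derived from the weight's maximum magnitude; the function names are illustrative, not any particular library's API:

```python
import numpy as np

def quantize_int8(w):
    """Symmetric per-tensor int8 PTQ: w ~= scale * q, no retraining needed."""
    scale = np.abs(w).max() / 127.0          # map the largest weight to 127
    q = np.clip(np.round(w / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q, scale):
    """Recover an approximation of the original float weights."""
    return q.astype(np.float32) * scale

w = np.array([0.5, -1.2, 0.03, 1.27], dtype=np.float32)
q, scale = quantize_int8(w)
w_hat = dequantize(q, scale)
max_err = float(np.abs(w - w_hat).max())     # bounded by ~scale / 2
```

Each weight is stored in 1 byte instead of 4, and the rounding error per weight is at most half a quantization step, which is why PTQ can shrink a trained network substantially with little accuracy loss.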
We tried out Google’s new family of multimodal models, whose compact variants are small enough to run on local devices. They work well.
Beats Q8_0 perplexity at half the size, and even beats F16. APEX outperforms Unsloth Dynamic 2.0 (UD) quantizations on perplexity, HellaSwag, and inference speed while being 2x smaller: APEX ...
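The comparisons above lean on perplexity, so it is worth recalling what the metric measures: the exponential of the average negative log-likelihood per token, where lower is better. A minimal sketch (illustrative only, not the evaluation harness used in these benchmarks):

```python
import math

def perplexity(token_logprobs):
    """Perplexity = exp of the mean negative log-likelihood per token."""
    nll = -sum(token_logprobs) / len(token_logprobs)
    return math.exp(nll)

# A model that assigns each token probability 1/4 has perplexity 4:
# it is, on average, as uncertain as a uniform choice among 4 tokens.
uniform_4 = [math.log(0.25)] * 8
ppl = perplexity(uniform_4)
```

This is why a quantization that "beats F16 perplexity" is a strong claim: it means the compressed model assigns the evaluation text higher average probability than the uncompressed one.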