  1. Model Quantization: Concepts, Methods, and Why It Matters

    Nov 24, 2025 · Model quantization makes it possible to deploy increasingly complex deep learning models in resource-constrained environments without sacrificing significant model …

  2. What is Quantization - GeeksforGeeks

    Nov 6, 2025 · Quantization is a model optimization technique that reduces the precision of numerical values such as weights and activations in models to make them faster and more …

  3. Model optimization techniques - AWS Prescriptive Guidance

    Learn about optimization techniques to improve gen AI model performance such as pruning, quantization, model compilation, speculative decoding, and artifact storage.

  4. A Visual Guide to Quantization - by Maarten Grootendorst

    Jul 22, 2024 · In this post, I will introduce the field of quantization in the context of language modeling and explore concepts one by one to develop an intuition about the field. We will …

  5. Quantization - Hugging Face

    Quantization is a technique to reduce the computational and memory costs of running inference by representing the weights and activations with low-precision data types like 8-bit integer …

  6. Model Quantization 1: Basic Concepts | by Florian June | Medium

    Oct 24, 2023 · Quantization of deep learning models is a memory optimization technique that reduces memory space by sacrificing some accuracy. In the era of large language models, …

  7. Quantization — PyTorch 2.9 documentation

    Oct 9, 2019 · The Quantization API Reference contains documentation of quantization APIs, such as quantization passes, quantized tensor operations, and supported quantized modules and …

  8. Post-training quantization | TensorFlow Model Optimization

    Aug 3, 2022 · Improve latency, processing, and power usage, and get access to integer-only hardware accelerators by making sure both weights and activations are quantized. This …

  9. Model Quantization: Deep Learning Optimization | Ultralytics

    Model quantization is a transformative technique in machine learning designed to reduce the computational and memory costs of running neural networks.

  10. Quantization Tutorial in TensorFlow for ML Models

    Jul 23, 2025 · What is Quantization in Machine Learning? Quantization in machine learning refers to the process of reducing the precision of a model's weights and activations from floating …