Model Quantization

nvidia.com
https://developer.nvidia.com › blog › model...
Model Quantization: Concepts, Methods, and Why It Matters
Nov 24, 2025 · Model quantization makes it possible to deploy increasingly complex deep learning models in resource-constrained environments without sacrificing significant model …
geeksforgeeks.org
https://www.geeksforgeeks.org › deep-learning › ...
What is Quantization - GeeksforGeeks
Nov 6, 2025 · Quantization is a model optimization technique that reduces the precision of numerical values such as weights and activations in models to make them faster and more …
amazon.com
https://docs.aws.amazon.com › ... › model-optimization.html
Model optimization techniques - AWS Prescriptive Guidance
Learn about optimization techniques to improve gen AI model performance such as pruning, quantization, model compilation, speculative decoding, and artifact storage.
maartengrootendorst.com
https://newsletter.maartengrootendorst.com › a...
A Visual Guide to Quantization - by Maarten Grootendorst
Jul 22, 2024 · In this post, I will introduce the field of quantization in the context of language modeling and explore concepts one by one to develop an intuition about the field. We will …
huggingface.co
https://huggingface.co › ... › concept_guides › quantization
Quantization - Hugging Face
Quantization is a technique to reduce the computational and memory costs of running inference by representing the weights and activations with low-precision data types like 8-bit integer …
medium.com
https://medium.com › @florian_algo
Model Quantization 1: Basic Concepts | by Florian June | Medium
Oct 24, 2023 · Quantization of deep learning models is a memory optimization technique that reduces memory space by sacrificing some accuracy. In the era of large language models, …
pytorch.org
https://docs.pytorch.org › docs › stable › quantization
Quantization — PyTorch 2.9 documentation
Oct 9, 2019 · The Quantization API Reference contains documentation of quantization APIs, such as quantization passes, quantized tensor operations, and supported quantized modules and …
tensorflow.org
https://www.tensorflow.org › model_optimization › ...
Post-training quantization | TensorFlow Model Optimization
Aug 3, 2022 · Improve latency, processing, and power usage, and get access to integer-only hardware accelerators by making sure both weights and activations are quantized. This …
ultralytics.com
https://www.ultralytics.com › glossary › model-quantization
Model Quantization: Deep Learning Optimization | Ultralytics
Model quantization is a transformative technique in machine learning designed to reduce the computational and memory costs of running neural networks.
geeksforgeeks.org
https://www.geeksforgeeks.org › deep-learning › ...
Quantization Tutorial in TensorFlow for ML Models
Jul 23, 2025 · What is Quantization in Machine Learning? Quantization in machine learning refers to the process of reducing the precision of a model's weights and activations from floating …
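
Several of the snippets above describe quantization as representing floating-point weights and activations with low-precision types such as 8-bit integers. A minimal sketch of that idea follows, assuming a simple per-tensor affine (scale and zero-point) scheme in pure Python. The function names `quantize_int8` and `dequantize` are hypothetical and are not taken from any of the listed libraries, which use their own APIs and more elaborate calibration.

```python
def quantize_int8(values):
    """Map a list of floats to int8 codes with an affine scheme (illustrative only)."""
    qmin, qmax = -128, 127
    # Widen the range to include 0.0 so that zero maps exactly to an integer code.
    lo = min(min(values), 0.0)
    hi = max(max(values), 0.0)
    # Step size between adjacent integer codes; fall back to 1.0 for all-zero input.
    scale = (hi - lo) / (qmax - qmin) or 1.0
    # Integer code that represents the real value 0.0.
    zero_point = round(qmin - lo / scale)
    # Round to the nearest code and clamp into the int8 range.
    codes = [max(qmin, min(qmax, round(v / scale) + zero_point)) for v in values]
    return codes, scale, zero_point


def dequantize(codes, scale, zero_point):
    """Recover approximate floats from int8 codes."""
    return [(c - zero_point) * scale for c in codes]


if __name__ == "__main__":
    weights = [-1.0, 0.0, 0.5, 2.0]
    codes, scale, zero_point = quantize_int8(weights)
    restored = dequantize(codes, scale, zero_point)
    print(codes, scale, zero_point)
    print(restored)
```

Each restored value differs from the original by at most one quantization step (`scale`), which is the accuracy cost the snippets refer to; the benefit is that each value needs 8 bits instead of 32.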
