From 5526644bb8851252609a1b1050708347596ccb8e Mon Sep 17 00:00:00 2001
From: cyndwith
Date: Wed, 11 Sep 2024 15:45:40 -0700
Subject: [PATCH 1/3] Reorganizing the documentation for Quark v0.2.0

- Moving getting started, user guide, example, APIs to PyTorch and ONNX frameworks
- Formating the landing page to have overview
---
 docs/index.rst | 35 ++++++++++++++++++++-------
 docs/onnx/index.rst | 20 +++++++++++++++
 docs/onnx/onnx_adv_features.rst | 20 +++++++++++++++
 docs/onnx/onnx_apis.rst | 24 ++++++++++++++++++
 docs/onnx/onnx_examples.rst | 17 +++++++++++++
 docs/onnx/user_guide.rst | 27 +++++++++++++++++++++
 docs/onnx_overview.rst | 19 +++++++++++++++
 docs/pytorch/index.rst | 23 ++++++++++++++++++
 docs/pytorch/pytorch_adv_features.rst | 20 +++++++++++++++
 docs/pytorch/pytorch_apis.rst | 21 ++++++++++++++++
 docs/pytorch/pytorch_examples.rst | 18 ++++++++++++++
 docs/pytorch/user_guide.rst | 26 ++++++++++++++++++++
 docs/pytorch_overview.rst | 18 ++++++++++++++
 13 files changed, 279 insertions(+), 9 deletions(-)
 create mode 100644 docs/onnx/index.rst
 create mode 100644 docs/onnx/onnx_adv_features.rst
 create mode 100644 docs/onnx/onnx_apis.rst
 create mode 100644 docs/onnx/onnx_examples.rst
 create mode 100644 docs/onnx/user_guide.rst
 create mode 100644 docs/onnx_overview.rst
 create mode 100644 docs/pytorch/index.rst
 create mode 100644 docs/pytorch/pytorch_adv_features.rst
 create mode 100644 docs/pytorch/pytorch_apis.rst
 create mode 100644 docs/pytorch/pytorch_examples.rst
 create mode 100644 docs/pytorch/user_guide.rst
 create mode 100644 docs/pytorch_overview.rst

diff --git a/docs/index.rst b/docs/index.rst
index f130f06..41c6d41 100644
--- a/docs/index.rst
+++ b/docs/index.rst
@@ -9,16 +9,33 @@ Welcome to Quark's documentation!
 
 .. toctree::
    :maxdepth: 1
-
-   What's New
-   Quark Overview
+   :caption: Release Notes
+
+   Release V0.2.0
+
+.. toctree::
+   :maxdepth: 1
+   :caption: Getting Started
+
    Installation
-   Getting Started
-   Highlight Features
-   User Guide
-   APIs
-   Examples
-   Release Note
+   Quark Overview
+
+.. toctree::
+   :maxdepth: 1
+   :caption: PyTorch
+
+   Quark with PyTorch
+
+.. toctree::
+   :maxdepth: 1
+   :caption: ONNX
+
+   Quark with ONNX
+
+.. toctree::
+   :maxdepth: 1
+   :caption: FAQ
+
    FAQ
 
 ..
diff --git a/docs/onnx/index.rst b/docs/onnx/index.rst
new file mode 100644
index 0000000..76fe8d5
--- /dev/null
+++ b/docs/onnx/index.rst
@@ -0,0 +1,20 @@
+Quark with ONNX!
+===================
+
+.. toctree::
+   :maxdepth: 1
+
+   Getting Started
+   User Guide
+   Examples
+   APIs
+   Advanced Features
+
+..
+  ------------
+
+  #####################################
+  License
+  #####################################
+
+  Quark is licensed under MIT License. Refer to the LICENSE file for the full license text and copyright notice.
diff --git a/docs/onnx/onnx_adv_features.rst b/docs/onnx/onnx_adv_features.rst
new file mode 100644
index 0000000..a4e1177
--- /dev/null
+++ b/docs/onnx/onnx_adv_features.rst
@@ -0,0 +1,20 @@
+Advanced Features
+==================
+
+This page introduces some key features of Quark. Please refere to the
+`user guide <./user_guide.html>`__ for the more details of other features
+of Quark.
+
+
+Quark for ONNX
+--------------
+
+- `AdaRound and AdaQuant <./tutorial_adaround_adaquant.html>`__
+- `Mixed Precision <./tutorial_mix_precision.html>`__
+
+.. raw:: html
+
+
diff --git a/docs/onnx/onnx_apis.rst b/docs/onnx/onnx_apis.rst
new file mode 100644
index 0000000..dd7448d
--- /dev/null
+++ b/docs/onnx/onnx_apis.rst
@@ -0,0 +1,24 @@
+Quark APIs for ONNX
+===================
+
+**User facing APIs:**
+
+.. toctree::
+   :maxdepth: 2
+
+   Quantization <../autoapi/quark/onnx/quantization/api/index.rst>
+   Optimization <../autoapi/quark/onnx/optimize/index.rst>
+   Calibration <../autoapi/quark/onnx/calibrate/index.rst>
+   ONNX Quantizer <../autoapi/quark/onnx/onnx_quantizer/index.rst>
+   QDQ Quantizer <../autoapi/quark/onnx/qdq_quantizer/index.rst>
+   Configuration <../autoapi/quark/onnx/quantization/config/config/index.rst>
+   Quantization Utilities <../autoapi/quark/onnx/quant_utils/index.rst>
+
+..
+  ------------
+
+  #####################################
+  License
+  #####################################
+
+  Quark is licensed under MIT License. Refer to the LICENSE file for the full license text and copyright notice.
\ No newline at end of file
diff --git a/docs/onnx/onnx_examples.rst b/docs/onnx/onnx_examples.rst
new file mode 100644
index 0000000..6b90549
--- /dev/null
+++ b/docs/onnx/onnx_examples.rst
@@ -0,0 +1,17 @@
+Examples
+========
+
+Examples to run Quark for ONNX.
+
+* `Image Classification Quantization <../quark_example_onnx_image_classification_gen.html>`__
+* `Fast Finetune AdaRound <../quark_examples_onnx_adaround_gen.html>`__
+* `Fast Finetune AdaQuant <../quark_example_onnx_adaquant_gen.html>`__
+* `Mixed Precision <../quark_onnx_example_mixed_precision_gen.html>`__
+* `Cross-Layer Equalization <../quark_example_onnx_cle_gen.html>`__
+
+.. raw:: html
+
+
diff --git a/docs/onnx/user_guide.rst b/docs/onnx/user_guide.rst
new file mode 100644
index 0000000..8de0931
--- /dev/null
+++ b/docs/onnx/user_guide.rst
@@ -0,0 +1,27 @@
+Quark for ONNX
+==============
+
+There are several steps to quantize a floating-point model with
+``Quark for ONNX``:
+
+1. Load original float model
+2. Set quantization configuration
+3. Define datareader
+4. Use the Quark API to perform in-place replacement of the model's modules with quantized module.
+
+More details:
+
+* `Configuring Quark for ONNX <./onnx/user_guide_config_description.html>`__
+* `Adding Calibration Datasets <./onnx/user_guide_datareader.html>`__
+* `Feature Description <./onnx/user_guide_feature_description.html>`__
+* `Supported Datatype and OpType <./onnx/user_guide_supported_optype_datatype.html>`__
+* `Accuracy Improvement <./onnx/user_guide_accuracy_improvement.html>`__
+* `Optional Utilities <./onnx/user_guide_optional_utilities.html>`__
+* `Tools <./onnx/user_guide_tools.html>`__
+
+.. raw:: html
+
+
diff --git a/docs/onnx_overview.rst b/docs/onnx_overview.rst
new file mode 100644
index 0000000..6b45d5f
--- /dev/null
+++ b/docs/onnx_overview.rst
@@ -0,0 +1,19 @@
+ONNX
+====
+
+.. toctree::
+   :maxdepth: 1
+
+   Getting Started
+   User Guide
+   Examples
+   APIs
+   Advanced Features
+
+
+.. raw:: html
+
+
\ No newline at end of file
diff --git a/docs/pytorch/index.rst b/docs/pytorch/index.rst
new file mode 100644
index 0000000..edd3056
--- /dev/null
+++ b/docs/pytorch/index.rst
@@ -0,0 +1,23 @@
+Quark with PyTorch!
+===================
+
+**Quark** is a deep learning model quantization toolkit for quantizing models from PyTorch, ONNX and other frameworks.
+It provides easy-to-use APIs for quantization and more advanced features than native frameworks, in support for multiple HW backends.
+
+.. toctree::
+   :maxdepth: 1
+
+   Getting Started
+   User Guide
+   Examples
+   APIs
+   Advanced Features
+
+..
+  ------------
+
+  #####################################
+  License
+  #####################################
+
+  Quark is licensed under MIT License. Refer to the LICENSE file for the full license text and copyright notice.
diff --git a/docs/pytorch/pytorch_adv_features.rst b/docs/pytorch/pytorch_adv_features.rst
new file mode 100644
index 0000000..6d12364
--- /dev/null
+++ b/docs/pytorch/pytorch_adv_features.rst
@@ -0,0 +1,20 @@
+Advanced Features
+==================
+
+This page introduces some key features of Quark. Please refere to the
+`user guide <./user_guide.html>`__ for the more details of other features
+of Quark.
+
+Quark for PyTorch
+-----------------
+
+- `Bridge from Quark to llama.cpp <./tutorial_gguf.html>`__
+- `Using MX (Microscaling) with Quark <./tutorial_mx.html>`__
+
+
+.. raw:: html
+
+
diff --git a/docs/pytorch/pytorch_apis.rst b/docs/pytorch/pytorch_apis.rst
new file mode 100644
index 0000000..6b5a06e
--- /dev/null
+++ b/docs/pytorch/pytorch_apis.rst
@@ -0,0 +1,21 @@
+Quark APIs for PyTorch
+======================
+
+**User facing APIs:**
+
+.. toctree::
+   :maxdepth: 1
+
+   Quantization <../autoapi/quark/torch/quantization/api/index.rst>
+   Export <../autoapi/quark/torch/export/api/index.rst>
+   Quantizer Configuration <../autoapi/quark/torch/quantization/config/config/index.rst>
+   Exporter Configuration <../autoapi/quark/torch/export/config/config/index.rst>
+
+..
+  ------------
+
+  #####################################
+  License
+  #####################################
+
+  Quark is licensed under MIT License. Refer to the LICENSE file for the full license text and copyright notice.
\ No newline at end of file
diff --git a/docs/pytorch/pytorch_examples.rst b/docs/pytorch/pytorch_examples.rst
new file mode 100644
index 0000000..90331a3
--- /dev/null
+++ b/docs/pytorch/pytorch_examples.rst
@@ -0,0 +1,18 @@
+Examples
+========
+
+Examples to run Quark for Pytorch.
+
+* `Language Model Quantization & Export <../quark_example_torch_llm_gen.html>`__
+* `Diffusion Model Quantization & Export <../quark_example_torch_diffusers_gen.html>`__
+* `Vision Model Quantization using Quark FX Graph Mode <../quark_example_torch_vision_gen.html>`__
+* `Extension for Pytorch-light (AMD internal project) <../quark_example_torch_pytorch_light_gen.html>`__
+* `Extension for Brevitas <../quark_example_torch_brevitas_gen.html>`__
+
+
+.. raw:: html
+
+
diff --git a/docs/pytorch/user_guide.rst b/docs/pytorch/user_guide.rst
new file mode 100644
index 0000000..727ce83
--- /dev/null
+++ b/docs/pytorch/user_guide.rst
@@ -0,0 +1,26 @@
+Quark for PyTorch
+==========
+
+There are several steps to quantize a floating-point model with
+``Quark for PyTorch``:
+
+1. Load original float model
+2. Set quantization configuration
+3. Define dataloader
+4. Use the Quark API to perform in-place replacement of the model's modules with quantized module.
+5. (Optional) Export quantized model to other format such as ONNX
+
+More details:
+
+* `Configuring Quark for PyTorch <./pytorch/user_guide_config_description.html>`__
+* `Adding Calibration Datasets <./pytorch/user_guide_dataloader.html>`__
+* `Exporting for ONNX & Json-Safetensors & GGUF <./pytorch/user_guide_exporting.html>`__
+* `Feature Description <./pytorch/user_guide_feature_description.html>`__
+
+
+.. raw:: html
+
+
diff --git a/docs/pytorch_overview.rst b/docs/pytorch_overview.rst
new file mode 100644
index 0000000..4600194
--- /dev/null
+++ b/docs/pytorch_overview.rst
@@ -0,0 +1,18 @@
+PyTorch
+=======
+
+.. toctree::
+   :maxdepth: 1
+
+   Getting Started
+   User Guide
+   Examples
+   APIs
+   Advanced Features
+
+.. raw:: html
+
+

From 614e0c0d35a9951134f34cc4dbbad39f07dde095 Mon Sep 17 00:00:00 2001
From: cyndwith
Date: Wed, 11 Sep 2024 16:59:20 -0700
Subject: [PATCH 2/3] clean-up and move files to pytorch/onnx directory

- Removed some of the duplicate files
---
 docs/apis.rst | 43 -----------
 docs/onnx/onnx_examples.rst | 10 +--
 .../{ => onnx}/quark_example_onnx_cle_gen.rst | 0
 ..._example_onnx_image_classification_gen.rst | 0
 .../quark_examples_onnx_adaround_gen.rst | 0
 ...quark_onnx_example_mixed_precision_gen.rst | 0
 ..._onnx_image_classification_example_gen.rst | 0
 docs/onnx/user_guide.rst | 14 ++--
 docs/pytorch/pytorch_examples.rst | 8 +-
 .../quark_example_onnx_adaquant_gen.rst | 0
 .../quark_example_torch_brevitas_gen.rst | 0
 .../quark_example_torch_diffusers_gen.rst | 0
 .../quark_example_torch_llm_gen.rst | 0
 .../quark_example_torch_pytorch_light_gen.rst | 0
 .../quark_example_torch_vision_gen.rst | 0
 .../quark_torch_brevitas_example_gen.rst | 0
 .../quark_torch_diffusers_example_gen.rst | 6 +-
 .../quark_torch_llm_example_gen.rst | 0
 .../quark_torch_pytorch_light_example_gen.rst | 0
 .../quark_torch_vision_example_gen.rst | 0
 docs/pytorch/user_guide.rst | 10 +--
 docs/tutorial.rst | 15 ----
 docs/whats_new.rst | 73 -------------------
 23 files changed, 26 insertions(+), 153 deletions(-)
 delete mode 100644 docs/apis.rst
 rename docs/{ => onnx}/quark_example_onnx_cle_gen.rst (100%)
 rename docs/{ => onnx}/quark_example_onnx_image_classification_gen.rst (100%)
 rename docs/{ => onnx}/quark_examples_onnx_adaround_gen.rst (100%)
 rename docs/{ => onnx}/quark_onnx_example_mixed_precision_gen.rst (100%)
 rename docs/{ => onnx}/quark_onnx_image_classification_example_gen.rst (100%)
 rename docs/{ => pytorch}/quark_example_onnx_adaquant_gen.rst (100%)
 rename docs/{ => pytorch}/quark_example_torch_brevitas_gen.rst (100%)
 rename docs/{ => pytorch}/quark_example_torch_diffusers_gen.rst (100%)
 rename docs/{ => pytorch}/quark_example_torch_llm_gen.rst (100%)
 rename docs/{ => pytorch}/quark_example_torch_pytorch_light_gen.rst (100%)
 rename docs/{ => pytorch}/quark_example_torch_vision_gen.rst (100%)
 rename docs/{ => pytorch}/quark_torch_brevitas_example_gen.rst (100%)
 rename docs/{ => pytorch}/quark_torch_diffusers_example_gen.rst (98%)
 rename docs/{ => pytorch}/quark_torch_llm_example_gen.rst (100%)
 rename docs/{ => pytorch}/quark_torch_pytorch_light_example_gen.rst (100%)
 rename docs/{ => pytorch}/quark_torch_vision_example_gen.rst (100%)
 delete mode 100644 docs/tutorial.rst
 delete mode 100644 docs/whats_new.rst

diff --git a/docs/apis.rst b/docs/apis.rst
deleted file mode 100644
index 672e2b0..0000000
--- a/docs/apis.rst
+++ /dev/null
@@ -1,43 +0,0 @@
-Quark APIs
-====
-
-**User facing APIs:**
-
-Quark for Pytorch
-~~~~~~~~~~~~~~~~~
-
-.. toctree::
-   :maxdepth: 1
-
-   Quantization
-   Export
-   Quantizer Configuration
-   Exporter Configuration
-
-..
-  ------------
-
-Quark for ONNX
-~~~~~~~~~~~~~~~~~
-
-.. toctree::
-   :maxdepth: 1
-
-   Quantization
-   Optimization
-   Calibration
-   ONNX Quantizer
-   QDQ Quantizer
-   Configuration
-   Quantization Utilities
-
-
-
-..
-  ------------
-
-  #####################################
-  License
-  #####################################
-
-  Quark is licensed under MIT License. Refer to the LICENSE file for the full license text and copyright notice.
\ No newline at end of file
diff --git a/docs/onnx/onnx_examples.rst b/docs/onnx/onnx_examples.rst
index 6b90549..40576b7 100644
--- a/docs/onnx/onnx_examples.rst
+++ b/docs/onnx/onnx_examples.rst
@@ -3,11 +3,11 @@ Examples
 
 Examples to run Quark for ONNX.
 
-* `Image Classification Quantization <../quark_example_onnx_image_classification_gen.html>`__
-* `Fast Finetune AdaRound <../quark_examples_onnx_adaround_gen.html>`__
-* `Fast Finetune AdaQuant <../quark_example_onnx_adaquant_gen.html>`__
-* `Mixed Precision <../quark_onnx_example_mixed_precision_gen.html>`__
-* `Cross-Layer Equalization <../quark_example_onnx_cle_gen.html>`__
+* `Image Classification Quantization <./quark_example_onnx_image_classification_gen.html>`__
+* `Fast Finetune AdaRound <./quark_examples_onnx_adaround_gen.html>`__
+* `Fast Finetune AdaQuant <./quark_example_onnx_adaquant_gen.html>`__
+* `Mixed Precision <./quark_onnx_example_mixed_precision_gen.html>`__
+* `Cross-Layer Equalization <./quark_example_onnx_cle_gen.html>`__
 
 .. raw:: html
 
diff --git a/docs/quark_example_onnx_cle_gen.rst b/docs/onnx/quark_example_onnx_cle_gen.rst
similarity index 100%
rename from docs/quark_example_onnx_cle_gen.rst
rename to docs/onnx/quark_example_onnx_cle_gen.rst
diff --git a/docs/quark_example_onnx_image_classification_gen.rst b/docs/onnx/quark_example_onnx_image_classification_gen.rst
similarity index 100%
rename from docs/quark_example_onnx_image_classification_gen.rst
rename to docs/onnx/quark_example_onnx_image_classification_gen.rst
diff --git a/docs/quark_examples_onnx_adaround_gen.rst b/docs/onnx/quark_examples_onnx_adaround_gen.rst
similarity index 100%
rename from docs/quark_examples_onnx_adaround_gen.rst
rename to docs/onnx/quark_examples_onnx_adaround_gen.rst
diff --git a/docs/quark_onnx_example_mixed_precision_gen.rst b/docs/onnx/quark_onnx_example_mixed_precision_gen.rst
similarity index 100%
rename from docs/quark_onnx_example_mixed_precision_gen.rst
rename to docs/onnx/quark_onnx_example_mixed_precision_gen.rst
diff --git a/docs/quark_onnx_image_classification_example_gen.rst b/docs/onnx/quark_onnx_image_classification_example_gen.rst
similarity index 100%
rename from docs/quark_onnx_image_classification_example_gen.rst
rename to docs/onnx/quark_onnx_image_classification_example_gen.rst
diff --git a/docs/onnx/user_guide.rst b/docs/onnx/user_guide.rst
index 8de0931..0bc6f17 100644
--- a/docs/onnx/user_guide.rst
+++ b/docs/onnx/user_guide.rst
@@ -11,13 +11,13 @@ There are several steps to quantize a floating-point model with
 
 More details:
 
-* `Configuring Quark for ONNX <./onnx/user_guide_config_description.html>`__
-* `Adding Calibration Datasets <./onnx/user_guide_datareader.html>`__
-* `Feature Description <./onnx/user_guide_feature_description.html>`__
-* `Supported Datatype and OpType <./onnx/user_guide_supported_optype_datatype.html>`__
-* `Accuracy Improvement <./onnx/user_guide_accuracy_improvement.html>`__
-* `Optional Utilities <./onnx/user_guide_optional_utilities.html>`__
-* `Tools <./onnx/user_guide_tools.html>`__
+* `Configuring Quark for ONNX <./user_guide_config_description.html>`__
+* `Adding Calibration Datasets <./user_guide_datareader.html>`__
+* `Feature Description <./user_guide_feature_description.html>`__
+* `Supported Datatype and OpType <./user_guide_supported_optype_datatype.html>`__
+* `Accuracy Improvement <./user_guide_accuracy_improvement.html>`__
+* `Optional Utilities <./user_guide_optional_utilities.html>`__
+* `Tools <./user_guide_tools.html>`__
 
 .. raw:: html
 
diff --git a/docs/pytorch/pytorch_examples.rst b/docs/pytorch/pytorch_examples.rst
index 90331a3..5df7d4e 100644
--- a/docs/pytorch/pytorch_examples.rst
+++ b/docs/pytorch/pytorch_examples.rst
@@ -3,10 +3,10 @@ Examples
 
 Examples to run Quark for Pytorch.
 
-* `Language Model Quantization & Export <../quark_example_torch_llm_gen.html>`__
-* `Diffusion Model Quantization & Export <../quark_example_torch_diffusers_gen.html>`__
-* `Vision Model Quantization using Quark FX Graph Mode <../quark_example_torch_vision_gen.html>`__
-* `Extension for Pytorch-light (AMD internal project) <../quark_example_torch_pytorch_light_gen.html>`__
+* `Language Model Quantization & Export <./quark_example_torch_llm_gen.html>`__
+* `Diffusion Model Quantization & Export <./quark_example_torch_diffusers_gen.html>`__
+* `Vision Model Quantization using Quark FX Graph Mode <./quark_example_torch_vision_gen.html>`__
+* `Extension for Pytorch-light (AMD internal project) <./quark_example_torch_pytorch_light_gen.html>`__
 * `Extension for Brevitas <../quark_example_torch_brevitas_gen.html>`__
 
 
diff --git a/docs/quark_example_onnx_adaquant_gen.rst b/docs/pytorch/quark_example_onnx_adaquant_gen.rst
similarity index 100%
rename from docs/quark_example_onnx_adaquant_gen.rst
rename to docs/pytorch/quark_example_onnx_adaquant_gen.rst
diff --git a/docs/quark_example_torch_brevitas_gen.rst b/docs/pytorch/quark_example_torch_brevitas_gen.rst
similarity index 100%
rename from docs/quark_example_torch_brevitas_gen.rst
rename to docs/pytorch/quark_example_torch_brevitas_gen.rst
diff --git a/docs/quark_example_torch_diffusers_gen.rst b/docs/pytorch/quark_example_torch_diffusers_gen.rst
similarity index 100%
rename from docs/quark_example_torch_diffusers_gen.rst
rename to docs/pytorch/quark_example_torch_diffusers_gen.rst
diff --git a/docs/quark_example_torch_llm_gen.rst b/docs/pytorch/quark_example_torch_llm_gen.rst
similarity index 100%
rename from docs/quark_example_torch_llm_gen.rst
rename to docs/pytorch/quark_example_torch_llm_gen.rst
diff --git a/docs/quark_example_torch_pytorch_light_gen.rst b/docs/pytorch/quark_example_torch_pytorch_light_gen.rst
similarity index 100%
rename from docs/quark_example_torch_pytorch_light_gen.rst
rename to docs/pytorch/quark_example_torch_pytorch_light_gen.rst
diff --git a/docs/quark_example_torch_vision_gen.rst b/docs/pytorch/quark_example_torch_vision_gen.rst
similarity index 100%
rename from docs/quark_example_torch_vision_gen.rst
rename to docs/pytorch/quark_example_torch_vision_gen.rst
diff --git a/docs/quark_torch_brevitas_example_gen.rst b/docs/pytorch/quark_torch_brevitas_example_gen.rst
similarity index 100%
rename from docs/quark_torch_brevitas_example_gen.rst
rename to docs/pytorch/quark_torch_brevitas_example_gen.rst
diff --git a/docs/quark_torch_diffusers_example_gen.rst b/docs/pytorch/quark_torch_diffusers_example_gen.rst
similarity index 98%
rename from docs/quark_torch_diffusers_example_gen.rst
rename to docs/pytorch/quark_torch_diffusers_example_gen.rst
index 78d8854..7229fc2 100644
--- a/docs/quark_torch_diffusers_example_gen.rst
+++ b/docs/pytorch/quark_torch_diffusers_example_gen.rst
@@ -40,6 +40,7 @@ Run with SDXL Without Quantization
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 
 - Run original SDXL:
+
 --------------------------------------
 
 .. code::
@@ -51,6 +52,7 @@ Calibration and Export SafeTensor
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 
 - Run Calibration:
+
 --------------------------------------
 
 .. code::
@@ -61,6 +63,7 @@ Load SafeTensor and Test
 ~~~~~~~~~~~~~~~~~~~~~~~~
 
 - Load and Test:
+
 --------------------------------------
 
 .. code::
@@ -68,9 +71,10 @@ Load SafeTensor and Test
    python quantize_sdxl.py --input_scheme {'per-tensor'} --weight_scheme {'per-tensor', 'per-channel'} --test_data_tsv_file_path {your calibration dataset file path} --load --test
 
 Load SafeTensor and Run with a prompt
-~~~~~~~~~~~~~~~~~~~~~~~~
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 
 - Load and Run:
+
 --------------------------------------
 
 .. code::
diff --git a/docs/quark_torch_llm_example_gen.rst b/docs/pytorch/quark_torch_llm_example_gen.rst
similarity index 100%
rename from docs/quark_torch_llm_example_gen.rst
rename to docs/pytorch/quark_torch_llm_example_gen.rst
diff --git a/docs/quark_torch_pytorch_light_example_gen.rst b/docs/pytorch/quark_torch_pytorch_light_example_gen.rst
similarity index 100%
rename from docs/quark_torch_pytorch_light_example_gen.rst
rename to docs/pytorch/quark_torch_pytorch_light_example_gen.rst
diff --git a/docs/quark_torch_vision_example_gen.rst b/docs/pytorch/quark_torch_vision_example_gen.rst
similarity index 100%
rename from docs/quark_torch_vision_example_gen.rst
rename to docs/pytorch/quark_torch_vision_example_gen.rst
diff --git a/docs/pytorch/user_guide.rst b/docs/pytorch/user_guide.rst
index 727ce83..c1b5b0e 100644
--- a/docs/pytorch/user_guide.rst
+++ b/docs/pytorch/user_guide.rst
@@ -1,5 +1,5 @@
 Quark for PyTorch
-==========
+=================
 
 There are several steps to quantize a floating-point model with
 ``Quark for PyTorch``:
@@ -12,10 +12,10 @@ There are several steps to quantize a floating-point model with
 
 More details:
 
-* `Configuring Quark for PyTorch <./pytorch/user_guide_config_description.html>`__
-* `Adding Calibration Datasets <./pytorch/user_guide_dataloader.html>`__
-* `Exporting for ONNX & Json-Safetensors & GGUF <./pytorch/user_guide_exporting.html>`__
-* `Feature Description <./pytorch/user_guide_feature_description.html>`__
+* `Configuring Quark for PyTorch <./user_guide_config_description.html>`__
+* `Adding Calibration Datasets <./user_guide_dataloader.html>`__
+* `Exporting for ONNX & Json-Safetensors & GGUF <./user_guide_exporting.html>`__
+* `Feature Description <./user_guide_feature_description.html>`__
 
 
 .. raw:: html
diff --git a/docs/tutorial.rst b/docs/tutorial.rst
deleted file mode 100644
index 8026cd5..0000000
--- a/docs/tutorial.rst
+++ /dev/null
@@ -1,15 +0,0 @@
-Tutorial
-========
-
-Quark for PyTorch
------------------
-
-- `Bridge from Quark to llama.cpp <./pytorch/tutorial_gguf.html>`__
-- `Using MX (Microscaling) with Quark <./pytorch/tutorial_mx.html>`__
-
-.. raw:: html
-
-
diff --git a/docs/whats_new.rst b/docs/whats_new.rst
deleted file mode 100644
index 60f8cb2..0000000
--- a/docs/whats_new.rst
+++ /dev/null
@@ -1,73 +0,0 @@
-What's New
-==========
-
-New Features (Version 0.2.0)
-----------------------------
-
-- **Quark for PyTorch**
-
-  - **PyTorch Quantizer Enhancements**:
-
-    - Post Training Quantization (PTQ) and Quantization-Aware Training (QAT) are now supported in FX graph mode.
-    - Introduced quantization support of the following modules: torch.nn.Conv2d.
-
-  - **Data Types**:
-
-    - `OCP Microscaling (MX) is supported. Valid element data types include INT8, FP8_E4M3, FP4, FP6_E3M2, and FP6_E2M3. <./pytorch/tutorial_mx.html>`__
-
-  - **Export Capabilities**:
-
-    - `Quantized models can now be exported in GGUF format. The exported GGUF model is runnable with llama.cpp. Only Llama2 is supported for now. <./pytorch/tutorial_gguf.html>`__
-    - Introduced Quark's native Json-Safetensors export format, which is identical to AutoFP8 and AutoAWQ when used for FP8 and AWQ quantization.
-
-  - **Model Support**:
-
-    - Added support for SDXL model quantization in eager mode, including fp8 per-channel and per-tensor quantization.
-    - Added support for PTQ and QAT of CNN models in graph mode, including architectures like ResNet.
-
-  - **Integration with other toolkits**:
-
-    - Provided the integrated example with APL(AMD Pytorch-light,internal project name), supporting the invocation of APL's INT-K, BFP16, and BRECQ.
-    - Introduced the experimental Quark extension interface, enabling seamless integration of Brevitas for Stable Diffusion and Imagenet classification model quantization.
-
-- **Quark for ONNX**
-
-  - **ONNX Quantizer Enhancements**:
-
-    - Multiple optimization and refinement strategies for different deployment backends.
-    - Supported automatic mixing precision to balance accuracy and performance.
-
-  - **Quantization Strategy**:
-
-    - Supported symmetric and asymmetric quantization.
-    - Supported float scale, INT16 scale and power-of-two scale.
-    - Supported static quantization and weight-only quantization.
-
-  - **Quantization Granularity**:
-
-    - Supported for per-tensor and per-channel granularity.
-
-  - **Data Types**:
-
-    - Multiple data types are supported, including INT32/UINT32, Float16, Bfloat16, INT16/UINT16, INT8/UINT8 and BFP.
-
-  - **Calibration Methods**:
-
-    - MinMax, Entropy and Percentile for float scale.
-    - MinMax for INT16 scale.
-    - NonOverflow and MinMSE for power-of-two scale.
-
-  - **Custom operations**:
-
-    - "BFPFixNeuron" which supports block floating-point data type.
-    - "VitisQuantizeLinear" and "VitisDequantizeLinear" which support INT32/UINT32, Float16, Bfloat16, INT16/UINT16 quantization.
-    - "VitisInstanceNormalization" and "VitisLSTM" which have customized Bfloat16 kernels.
-    - All custom operations only support running on CPU.
-
-  - **Advanced Quantization Algorithms**:
-
-    - Supported CLE, BiasCorrection, AdaQuant, AdaRound and SmoothQuant.
-
-  - **Operating System Support**:
-
-    - Linux and Windows.

From 24d1e27839533dc8bb90d34ce4f04632abce49f0 Mon Sep 17 00:00:00 2001
From: cyndwith
Date: Wed, 11 Sep 2024 17:03:07 -0700
Subject: [PATCH 3/3] clean-up and remove the duplicate files for examples, getting started, user guide etc.
---
 docs/example.rst | 28 ---------------------
 docs/example_gen.rst | 32 ------------------------
 docs/getting_started.rst | 14 -----------
 docs/highlight_features.rst | 24 ------------------
 docs/user_guide.rst | 49 -------------------------------------
 5 files changed, 147 deletions(-)
 delete mode 100644 docs/example.rst
 delete mode 100644 docs/example_gen.rst
 delete mode 100644 docs/getting_started.rst
 delete mode 100644 docs/highlight_features.rst
 delete mode 100644 docs/user_guide.rst

diff --git a/docs/example.rst b/docs/example.rst
deleted file mode 100644
index d031bc1..0000000
--- a/docs/example.rst
+++ /dev/null
@@ -1,28 +0,0 @@
-Examples
-========
-
-Quark for Pytorch
------------------
-
-* `Language Model Quantization & Export <./quark_example_torch_llm_gen.html>`__
-* `Diffusion Model Quantization & Export <./quark_example_torch_diffusers_gen.html>`__
-* `Vision Model Quantization using Quark FX Graph Mode <./quark_example_torch_vision_gen.html>`__
-* `Extension for Pytorch-light (AMD internal project) <./quark_example_torch_pytorch_light_gen.html>`__
-* `Extension for Brevitas <./quark_example_torch_brevitas_gen.html>`__
-
-
-Quark for ONNX
---------------
-
-* `Image Classification Quantization <./quark_example_onnx_image_classification_gen.html>`__
-* `Fast Finetune AdaRound <./quark_examples_onnx_adaround_gen.html>`__
-* `Fast Finetune AdaQuant <./quark_example_onnx_adaquant_gen.html>`__
-* `Mixed Precision <./quark_onnx_example_mixed_precision_gen.html>`__
-* `Cross-Layer Equalization <./quark_example_onnx_cle_gen.html>`__
-
-.. raw:: html
-
-
diff --git a/docs/example_gen.rst b/docs/example_gen.rst
deleted file mode 100644
index 1820fba..0000000
--- a/docs/example_gen.rst
+++ /dev/null
@@ -1,32 +0,0 @@
-Examples
-========
-
-Quark for Pytorch
------------------
-
-- `Language Model Quantization &
-  Export <./quark_torch_llm_example_gen.html>`__
-
-- `Diffusion Model Quantization &
-  Export <./quark_torch_diffusers_example_gen.html>`__
-
-- `Vision Model Quantization using Quark FX Graph
-  Mode <./quark_torch_vision_example_gen.html>`__
-
-- `Extension for Pytorch-light(AMD internal
-  project) <./quark_torch_pytorch_light_example_gen.html>`__
-
-- `Extension for Brevitas <./quark_torch_brevitas_example_gen.html>`__
-
-Quark for ONNX
---------------
-
-- `Image Classification
-  Quantization <./quark_onnx_image_classification_example_gen.html>`__
-
-.. raw:: html
-
-
diff --git a/docs/getting_started.rst b/docs/getting_started.rst
deleted file mode 100644
index c9c16f5..0000000
--- a/docs/getting_started.rst
+++ /dev/null
@@ -1,14 +0,0 @@
-Getting Started
-===============
-
-This page will introduce how to run Quark for the first time.
-
-- `Getting Started with Quark for PyTorch <./pytorch/getting_started.html>`__
-- `Getting Started with Quark for ONNX <./onnx/getting_started.html>`__
-
-.. raw:: html
-
-
diff --git a/docs/highlight_features.rst b/docs/highlight_features.rst
deleted file mode 100644
index 0a03e06..0000000
--- a/docs/highlight_features.rst
+++ /dev/null
@@ -1,24 +0,0 @@
-Highlight Features
-==================
-
-This page introduces some key features of Quark. Please refere to the
-`user guide <./user_guide.html>`__ for the more details of other features
-of Quark.
-
-Quark for PyTorch
------------------
-
-- `Bridge from Quark to llama.cpp <./pytorch/tutorial_gguf.html>`__
-- `Using MX (Microscaling) with Quark <./pytorch/tutorial_mx.html>`__
-
-Quark for ONNX
---------------
-
-- `AdaRound and AdaQuant <./onnx/tutorial_adaround_adaquant.html>`__
-- `Mixed Precision <./onnx/tutorial_mix_precision.html>`__
-.. raw:: html
-
-
diff --git a/docs/user_guide.rst b/docs/user_guide.rst
deleted file mode 100644
index 37ccfce..0000000
--- a/docs/user_guide.rst
+++ /dev/null
@@ -1,49 +0,0 @@
-User Guide
-==========
-
-Quark for PyTorch
------------------
-
-There are several steps to quantize a floating-point model with
-``Quark for PyTorch``:
-
-1. Load original float model
-2. Set quantization configuration
-3. Define dataloader
-4. Use the Quark API to perform in-place replacement of the model's modules with quantized module.
-5. (Optional) Export quantized model to other format such as ONNX
-
-More details:
-
-* `Configuring Quark for PyTorch <./pytorch/user_guide_config_description.html>`__
-* `Adding Calibration Datasets <./pytorch/user_guide_dataloader.html>`__
-* `Exporting for ONNX & Json-Safetensors & GGUF <./pytorch/user_guide_exporting.html>`__
-* `Feature Description <./pytorch/user_guide_feature_description.html>`__
-
-Quark for ONNX
---------------
-
-There are several steps to quantize a floating-point model with
-``Quark for ONNX``:
-
-1. Load original float model
-2. Set quantization configuration
-3. Define datareader
-4. Use the Quark API to perform in-place replacement of the model's modules with quantized module.
-
-More details:
-
-* `Configuring Quark for ONNX <./onnx/user_guide_config_description.html>`__
-* `Adding Calibration Datasets <./onnx/user_guide_datareader.html>`__
-* `Feature Description <./onnx/user_guide_feature_description.html>`__
-* `Supported Datatype and OpType <./onnx/user_guide_supported_optype_datatype.html>`__
-* `Accuracy Improvement <./onnx/user_guide_accuracy_improvement.html>`__
-* `Optional Utilities <./onnx/user_guide_optional_utilities.html>`__
-* `Tools <./onnx/user_guide_tools.html>`__
-
-.. raw:: html
-
-