# ExecuTorch

> The Neutron partitioner API allows for configuration of the model delegation to Neutron. Passing a `NeutronPartitioner` instance with no additional parameters will run as much of the model as possible...

## Pages

- [Nxp Partitioner](backends-nxp-nxp-partitioner.md)
- [Samsung Op Support](backends-samsung-samsung-op-support.md)
- [Backend Partitioner](backends-template-backend-partitioner.md)
- [Vulkan Op Support](backends-vulkan-vulkan-op-support.md)
- [Xnnpack Partitioner](backends-xnnpack-xnnpack-partitioner.md)
- [Prerequisite | ETRecord - ExecuTorch Record](etrecord.md)
- [Runtime API Reference](executorch-runtime-api-reference.md)
- [Export API Reference](export-to-executorch-api-reference.md)
- [Setting Up ExecuTorch](getting-started-setup.md)
- [Inspector APIs](model-inspector.md)
- [Runtime Python API Reference](runtime-python-api-reference.md)
- [Advanced](advanced-topics-section.md)
- [Backends](android-backends.md)
- [Examples & Demos](android-examples.md): [Working with LLMs - Android Examples](https://github.com/meta-pytorch/executorch-examples/blob/main/llm/android/Ll...
- [Android](android-section.md)
- [API Life Cycle and Deprecation Policy](api-life-cycle.md): Each API of ExecuTorch falls into one of the following life cycle states:
- [API](api-section.md)
- [Cadence Xtensa Backend (Legacy / Outdated)](archive-backends-cadence-legacy.md): **⚠️ THIS DOCUMENTATION IS OUTDATED AND NO LONGER MAINTAINED**
- [Backend & Delegates](backend-delegate-advanced.md)
- [Third-Party Dependency Management for Backend Delegates](backend-delegates-dependencies.md): Disclaimer: We are planning to restructure the repository around delegates.
- [Integrating a Backend Delegate into ExecuTorch](backend-delegates-integration.md): Disclaimer: We are planning to restructure the repository around delegates.
- [Backend Development](backend-development.md)
- [Arm Ethos-U Backend](backends-arm-ethos-u-arm-ethos-u-overview.md): The Arm® Ethos™-U backend targets Edge/IoT-type AI use-cases by enabling optimal execution of quantized mod...
- [Partitioner API](backends-arm-ethos-u-arm-ethos-u-partitioner.md): The `EthosUPartitioner` controls which parts of a model are delegated to the Arm Ethos-U backend. Below is a reference ...
- [Quantization](backends-arm-ethos-u-arm-ethos-u-quantization.md): The Arm Ethos-U delegate only supports the execution of quantized models. To quantize a model so that it is supported by...
- [Arm Ethos-U Troubleshooting](backends-arm-ethos-u-arm-ethos-u-troubleshooting.md): This page describes common issues that you may encounter when using the Arm Ethos-U backend and how to debug and reso...
- [Arm Ethos-U Backend Tutorials](backends-arm-ethos-u-tutorials-arm-ethos-u-tutorials.md): **→ {doc}`ethos-u-getting-started`**
- [Getting Started Tutorial](backends-arm-ethos-u-tutorials-ethos-u-getting-started.md)
- [Arm VGF Backend](backends-arm-vgf-arm-vgf-overview.md): The Arm® VGF backend is the ExecuTorch solution for lowering PyTorch models to VGF compatible hardware.
- [Partitioner API](backends-arm-vgf-arm-vgf-partitioner.md): The `VgfPartitioner` controls which parts of a model are delegated to the Arm VGF backend. Below is a reference of the ...
- [Quantization](backends-arm-vgf-arm-vgf-quantization.md): The Arm VGF delegate can be used to execute quantized models. To quantize a model so that it is supported by this delega...
- [Arm VGF Troubleshooting](backends-arm-vgf-arm-vgf-troubleshooting.md): This page describes common issues that you may encounter when using the Arm VGF backend and how to debug and resolve ...
- [Arm VGF Backend Tutorials](backends-arm-vgf-tutorials-arm-vgf-tutorials.md): **→ {doc}`vgf-getting-started`**
- [Getting Started Tutorial](backends-arm-vgf-tutorials-vgf-getting-started.md)
- [Op support](backends-coreml-coreml-op-support.md): The Core ML backend supports almost all PyTorch operators.
- [Core ML Backend](backends-coreml-coreml-overview.md): The Core ML delegate is the ExecuTorch solution to take advantage of Apple's [Core ML framework](https://developer.apple....
- [Partitioner API](backends-coreml-coreml-partitioner.md): The Core ML partitioner API allows for configuration of the model delegation to Core ML. Passing a `CoreMLPartitioner`...
- [Quantization](backends-coreml-coreml-quantization.md): To quantize a PyTorch model for the Core ML backend, use the `CoreMLQuantizer`. `Quantizers` are backend specific, w...
- [Troubleshooting](backends-coreml-coreml-troubleshooting.md): This page describes common issues that you may encounter when using the Core ML backend and how to debug and resolve ...
- [MPS Backend](backends-mps-mps-overview.md): The MPS delegate is the ExecuTorch solution to take advantage of Apple's GPU for on-device ML using the [MPS Graph](https...
- [NXP eIQ Neutron Backend](backends-nxp-nxp-overview.md): This page introduces the NXP eIQ Neutron backend.
- [NXP eIQ Neutron Quantization](backends-nxp-nxp-quantization.md): The eIQ Neutron NPU requires the delegated operators to be quantized. To quantize the PyTorch model for the Neutron b...
- [Preparing a Model for NXP eIQ Neutron Backend](backends-nxp-tutorials-nxp-basic-tutorial.md): This guide demonstrates the use of the ExecuTorch AoT flow to convert a PyTorch model to ExecuTorch
- [NXP Tutorials](backends-nxp-tutorials-nxp-tutorials.md): **→ {doc}`nxp-basic-tutorial` — Lower and run a model on the NXP eIQ Neutron backend.**
- [Samsung Exynos Backend](backends-samsung-samsung-overview.md): ExecuTorch's Samsung Exynos backend enables the execution of ExecuTorch models on
- [Partitioner API](backends-samsung-samsung-partitioner.md): The `EnnPartitioner` API is the primary entrypoint when exporting a model to the Samsung
- [Quantization](backends-samsung-samsung-quantization.md): The Exynos backend currently supports executing statically quantized 8-bit models.
- [{BACKEND_NAME} Architecture and Internals](backends-template-backend-arch-internals.md): This page covers internal implementation details of the backend, and is mainly aimed at contributors and heavy power ...
- [Backend Template](backends-template-backend-overview.md): Provide a brief overview/description of the backend. At a high-level, what does it do? Consider linking to top-level ...
- [{BACKEND_NAME} Quantization](backends-template-backend-quantization.md): Document quantization schemes and flows for the backend. This should include a description of each scheme and a code ...
- [{BACKEND_NAME} Troubleshooting](backends-template-backend-troubleshooting.md): This page describes common issues that you may encounter when using the {BACKEND_NAME} backend and how to debug and r...
- [Using {FEATURE} on {BACKEND_NAME}](backends-template-guides-backend-basic-guide.md): This is a placeholder guide.
- [{BACKEND_NAME} Guides](backends-template-guides-backend-guides.md): **→ {doc}`{backend_name}-basic-guide` — Guide description.**
- [Preparing a Model for {BACKEND_NAME}](backends-template-tutorials-backend-basic-tutorial.md): This is a placeholder tutorial.
- [{BACKEND_NAME} Tutorials](backends-template-tutorials-backend-tutorials.md): **→ {doc}`{backend_name}-basic-tutorial` — Lower and run a model on the {BACKEND_NAME} backend.**
- [Exporting Llama 3.2 1B/3B Instruct to ExecuTorch Vulkan and running on device](backends-vulkan-tutorials-etvk-llama-tutorial.md): This tutorial assumes that you have a working local copy of the ExecuTorch repo,
- [Executing and profiling an ExecuTorch Vulkan model on device](backends-vulkan-tutorials-etvk-profiling-tutorial.md): This tutorial assumes that you have a working local copy of the ExecuTorch repo,
- [Vulkan Backend Tutorials](backends-vulkan-tutorials-vulkan-tutorials.md): **→ {doc}`etvk-profiling-tutorial`**
- [Vulkan Backend](backends-vulkan-vulkan-overview.md): The ExecuTorch Vulkan (ET-VK) backend enables ExecuTorch models to execute on
- [Partitioner API](backends-vulkan-vulkan-partitioner.md): is a Python class that controls what operators in a model can or should be
- [Quantization](backends-vulkan-vulkan-quantization.md): The Vulkan backend currently supports execution of quantized linear layers,
- [Troubleshooting](backends-vulkan-vulkan-troubleshooting.md): This page describes common issues that you may encounter when using the Vulkan
- [Architecture and Internals](backends-xnnpack-xnnpack-arch-internals.md): This is a high-level overview of the ExecuTorch XNNPACK backend delegate. This high performance delegate is aimed to ...
- [XNNPACK Backend](backends-xnnpack-xnnpack-overview.md): The XNNPACK delegate is the ExecuTorch solution for CPU execution on mobile CPUs. [XNNPACK](https://github.com/google...
- [Quantization](backends-xnnpack-xnnpack-quantization.md): The XNNPACK delegate can also be used as a backend to execute symmetrically quantized models. To quantize a PyTorch m...
- [Troubleshooting](backends-xnnpack-xnnpack-troubleshooting.md): This page describes common issues that you may encounter when using the XNNPACK backend and how to debug and resolve ...
- [Cadence Xtensa Backend](backends-cadence.md): In this tutorial we will walk you through the process of getting set up to build ExecuTorch for Cadence Xtensa DSPs an...
- [MediaTek Backend](backends-mediatek.md): The MediaTek backend enables acceleration of PyTorch models on edge devices with MediaTek Neuron Processing Units (NP...
- [Backends](backends-overview.md): ExecuTorch backends provide hardware acceleration for specific hardware targets, enabling models to run efficiently o...
- [Qualcomm AI Engine Backend](backends-qualcomm.md): In this tutorial we will walk you through the process of getting started to
- [Building and Running ExecuTorch with OpenVINO Backend](build-run-openvino.md): In this tutorial we will walk you through the process of setting up the prerequisites, building the OpenVINO backend libr...
- [Bundled Program -- a Tool for ExecuTorch Model Validation](bundled-io.md): `BundledProgram` is a wrapper around the core ExecuTorch program designed to help users wrap test cases with the ...
- [Backend Dialect](compiler-backend-dialect.md): _Backend dialect_ is a special variant of [edge dialect](ir-exir.md), because it contains backend specific nodes and ...
- [Custom Compiler Passes and Partitioners](compiler-custom-compiler-passes.md): Passes can be roughly categorized into a couple of axes:
- [Understanding Backends and Delegates](compiler-delegate-and-partitioner.md): Audience: Vendors, Backend Delegate developers, who are interested in integrating their own compilers and hardware as...
- [Compiler Entry Points](compiler-entry-points.md)
- [Compiler & IR](compiler-ir-advanced.md)
- [Memory Planning](compiler-memory-planning.md): Audience: Backend integrators and embedded developers who are interested in customizing the regions of memory ExecuTo...
- [Concepts](concepts.md): This page provides an overview of key concepts and terms used throughout the ExecuTorch documentation. It is intended...
- [Contributing to ExecuTorch](contributing.md): Thank you for your interest in contributing to ExecuTorch! We want to make
- [Debugging Delegation](debug-backend-delegate.md): We provide a list of utility functions to give users insight into what happened to the graph modules during the `to_backe...
- [Delegate Debugging](delegate-debugging.md): [Delegate backends](compiler-delegate-and-partitioner.md) are a prominent component of on-device models due to their ...
- [Backends](desktop-backends.md)
- [Desktop & Laptop Platforms](desktop-section.md)
- [Tools](developer-tools.md)
- [Introduction to the ExecuTorch Developer Tools](devtools-overview.md): ExecuTorch has been designed with [productivity](intro-overview.md) as one of its core objectives and the ExecuTorch ...
- [Devtools Tutorial](devtools-tutorial.md): Please refer to the [Developer Tools tutorial](tutorials/devtools-integration-tutorial) for a w...
- [Edge](edge-platforms-section.md)
- [Backends](embedded-backends.md)
- [Embedded Systems](embedded-section.md)
- [Prerequisite | ETDump - ExecuTorch Dump](etdump.md): ETDump (ExecuTorch Dump) is one of the core components of the ExecuTorch Developer Tools. It is the mechanism through...
- [Lowering a Model as a Delegate](examples-end-to-end-to-lower-model-to-delegate.md): Audience: ML Engineers, who are interested in applying delegates to accelerate their program at runtime.
- [Examples](examples.md)
- [Exporting to ExecuTorch](export-overview.md): One of the important steps in getting your PyTorch programs ready for execution
- [Running an ExecuTorch Model Using the Module Extension in C++](extension-module.md): **Author:** [Anthony Shoumikhin](https://github.com/shoumikhin)
- [Managing Tensor Memory in C++](extension-tensor.md): **Author:** [Anthony Shoumikhin](https://github.com/shoumikhin)
- [File Formats](file-formats-advanced.md)
- [Architecture and Components](getting-started-architecture.md): This page describes the technical architecture of ExecuTorch and its individual components. This document is targeted...
- [Getting Started with ExecuTorch](getting-started.md): This section is intended to describe the necessary steps to take a PyTorch model and run it using ExecuTorch. To use ...
- [Welcome to the ExecuTorch Documentation](index.md)
- [How ExecuTorch Works](intro-how-it-works.md): This page describes how ExecuTorch works and its key benefits.
- [ExecuTorch Overview](intro-overview.md): **ExecuTorch** is an end-to-end solution for enabling on-device inference
- [Intro](intro-section.md)
- [Backends](ios-backends.md)
- [Examples & Demos](ios-examples.md): [iOS LLM Examples Repository](https://github.com/meta-pytorch/executorch-examples/tree/main/llm/apple)
- [iOS](ios-section.md)
- [Export IR Specification](ir-exir.md): Export IR is an intermediate representation (IR) for the result of
- [Definition of the Core ATen Operator Set](ir-ops-set-definition.md): This page provides the description and background of the Core ATen Operator Set (opset). This page is recommended rea...
- [IR Specification](ir-specification.md)
- [Kernel Library Deep Dive](kernel-library-advanced.md)
- [Kernel Registration](kernel-library-custom-aten-kernel.md): At the last stage of [ExecuTorch model exporting](export-overview.md), we lower the operators in the dialect to the _...
- [Overview of ExecuTorch’s Kernel Libraries](kernel-library-overview.md): This page provides a description of the Portable Kernel Library and the Optimized Kernel Library, which are the defau...
- [Kernel Library Selective Build](kernel-library-selective-build.md): _Selective build_ is a build mode in ExecuTorch that uses model metadata to guide the ExecuTorch build. This build mode c...
- [Kernel Library](kernel-library.md)
- [Run Llama 3 3B Instruct on Android (with Qualcomm AI Engine Direct Backend)](llm-build-run-llama3-qualcomm-ai-engine-direct-backend.md): This tutorial demonstrates how to export and run the Llama 3 3B Instruct model on a Qualcomm device using the Qualcom...
- [Exporting custom LLMs](llm-export-custom-llm.md): If you have your own PyTorch model that is an LLM, this guide will show you how to manually export and lower to Execu...
- [Exporting LLMs with HuggingFace's Optimum ExecuTorch](llm-export-llm-optimum.md): [Optimum ExecuTorch](https://github.com/huggingface/optimum-executorch) provides a streamlined way to export Hugging ...
- [Exporting LLMs](llm-export-llm.md): Instead of needing to manually write code to call torch.export(), use ExecuTorch's assortment of lowering APIs, or ev...
- [Deploying LLMs to ExecuTorch](llm-getting-started.md): ExecuTorch is designed to support all types of machine learning models, and LLMs are no exception.
- [Llama on ExecuTorch](llm-llama.md): See
- [Running LLMs on iOS](llm-run-on-ios.md): ExecuTorch’s LLM-specific runtime provides experimental Objective-C and Swift components around the core...
- [Running LLMs with C++](llm-run-with-c-plus-plus.md): This guide explains how to use ExecuTorch's C++ runner library to run LLM models that have been exported to the `.pte...
- [LLMs](llm-working-with-llms.md)
- [Markdown in Sphinx Tips and Tricks](markdown-sphinx-tips-tricks.md): In this repository, you can use both Markdown and reStructuredText to author
- [Memory Planning Inspection in ExecuTorch](memory-planning-inspection.md): After the [Memory Planning](concepts.md#memory-planning) pass of ExecuTorch, memory allocation information is stored ...
- [Debugging Models in ExecuTorch](model-debugging.md): With the ExecuTorch Developer Tools, users can debug their models for numerical inaccuracies and extract model outputs...
- [New Contributor Guide](new-contributor-guide.md): Welcome to **ExecuTorch** — a runtime for efficient deployment of PyTorch AI models to edge devices, including mobile...
- [Pico2: A simple MNIST Tutorial](pico2-tutorial.md): Deploy your PyTorch models directly to the Raspberry Pi Pico2 microcontroller with ExecuTorch.
- [Desktop & Laptop](platforms-desktop.md): ExecuTorch supports desktop and laptop deployment across Linux, macOS, and Windows.
- [Embedded Platforms](platforms-embedded.md): ExecuTorch supports embedded devices from microcontrollers to edge devices.
- [Portable C++ Programming](portable-cpp-programming.md): NOTE: This document covers the code that needs to build for and execute in
- [`.ptd` file format](ptd-file-format.md): ExecuTorch `.ptd` files are serialized as modified binary flatbuffer
- [`.pte` file format](pte-file-format.md): ExecuTorch `.pte` program files are serialized as modified binary flatbuffer
- [Quantization & Optimization](quantization-optimization.md)
- [Quantization Overview](quantization-overview.md): Quantization is a technique that reduces the precision of numbers used in a model’s computations and stored weights—t...
- [Quantization](quantization.md)
- [Quick Start](quick-start-section.md)
- [ExecuTorch on Raspberry Pi](raspberry-pi-llama-tutorial.md): This tutorial demonstrates how to deploy **Llama models on Raspberry Pi 4/5 devices** using ExecuTorch:
- [Detailed C++ Runtime APIs Tutorial](running-a-model-cpp-tutorial.md): **Author:** [Jacob Szwejbka](https://github.com/JacobSzwejbka)
- [Backend Delegate Implementation and Linking](runtime-backend-delegate-implementation-and-linking.md): Please refer to:
- [Runtime & Integration](runtime-integration-advanced.md)
- [ExecuTorch Runtime Overview](runtime-overview.md): This document discusses the design of the ExecuTorch runtime, which executes
- [Runtime Platform Abstraction Layer (PAL)](runtime-platform-abstraction-layer.md): The ExecuTorch _Platform Abstraction Layer_ (PAL) provides a way for execution
- [Profiling Models in ExecuTorch](runtime-profiling.md): Profiling in ExecuTorch gives users access to these runtime metrics:
- [Runtime](runtime.md)
- [Success Stories](success-stories.md)
- [Support](support-section.md)
- [Tools](tools-section.md)
- [TITLE](tutorial-template.md)
- [Building and Running ExecuTorch with XNNPACK Backend](tutorial-xnnpack-delegate-lowering.md): The following tutorial will familiarize you with leveraging the ExecuTorch XNNPACK Delegate for accelerating your ML ...
- [Usage](usage.md): This section describes how to use ExecuTorch. It covers everything from
- [Using ExecuTorch on Android](using-executorch-android.md): 🚀 Quick Start: __New to ExecuTorch__? Jump to Using AAR from Maven Central for the ...
- [Building from Source](using-executorch-building-from-source.md): ExecuTorch uses [CMake](https://cmake.org/) as the primary build system.
- [Using ExecuTorch with C++](using-executorch-cpp.md): In order to support a wide variety of devices, from high-end mobile phones down to tiny embedded systems, ExecuTorch ...
- [Model Export and Lowering](using-executorch-export.md): This section describes the process of taking a PyTorch model and converting it to the runtime format used by ExecuTorch. ...
- [Frequently Asked Questions](using-executorch-faqs.md): This page summarizes frequently asked questions and provides guidance on issues that commonly occur when adopting Exe...
- [Using ExecuTorch on iOS](using-executorch-ios.md): ExecuTorch supports both iOS and macOS via Objective-C, Swift, and C++. ExecuTorch also provides backends to leverage...
- [Runtime Integration](using-executorch-runtime-integration.md): This section describes options for configuring and customizing the ExecuTorch runtime. While the pre-built packages a...
- [Profiling and Debugging](using-executorch-troubleshooting.md): To facilitate model and runtime integration, ExecuTorch provides tools to profile model resource utilization, numeric...
- [Visualize a Model using ModelExplorer](visualize.md): The [visualization_utils.py](../../devtools/visualization/visualization_utils.py) contains functions for