# ExecuTorch

> The Neutron partitioner API allows for configuration of the model delegation to Neutron. Passing a `NeutronPartitioner` instance with no additional parameters will run as much of the model as possible...

## Pages

- [Nxp Partitioner](backends-nxp-nxp-partitioner.md)
- [Samsung Op Support](backends-samsung-samsung-op-support.md)
- [Backend Partitioner](backends-template-backend-partitioner.md)
- [Vulkan Op Support](backends-vulkan-vulkan-op-support.md)
- [Xnnpack Partitioner](backends-xnnpack-xnnpack-partitioner.md)
- [Prerequisite | ETRecord - ExecuTorch Record](etrecord.md)
- [Runtime API Reference](executorch-runtime-api-reference.md)
- [Export API Reference](export-to-executorch-api-reference.md)
- [Setting Up ExecuTorch](getting-started-setup.md)
- [Inspector APIs](model-inspector.md)
- [Runtime Python API Reference](runtime-python-api-reference.md)
- [Advanced](advanced-topics-section.md)
- [Backends](android-backends.md)
- [Examples & Demos](android-examples.md): [Working with LLMs - Android Examples](https://github.com/meta-pytorch/executorch-examples/blob/main/llm/android/Ll...
- [Android](android-section.md)
- [API Life Cycle and Deprecation Policy](api-life-cycle.md): Each API of ExecuTorch falls into one of the following life cycle states:
- [API](api-section.md)
- [Cadence Xtensa Backend (Legacy / Outdated)](archive-backends-cadence-legacy.md): **⚠️ THIS DOCUMENTATION IS OUTDATED AND NO LONGER MAINTAINED**
- [Backend & Delegates](backend-delegate-advanced.md)
- [Third-Party Dependency Management for Backend Delegates](backend-delegates-dependencies.md): Disclaimer: We are planning to restructure the repository around delegates.
- [Integrating a Backend Delegate into ExecuTorch](backend-delegates-integration.md): Disclaimer: We are planning to restructure the repository around delegates.
- [Backend Development](backend-development.md)
- [Arm Ethos-U Backend](backends-arm-ethos-u-arm-ethos-u-overview.md): The Arm® Ethos™-U backend targets Edge/IoT-type AI use-cases by enabling optimal execution of quantized mod...
- [Partitioner API](backends-arm-ethos-u-arm-ethos-u-partitioner.md): The `EthosUPartitioner` controls which parts of a model are delegated to the Arm Ethos-U backend. Below is a reference ...
- [Quantization](backends-arm-ethos-u-arm-ethos-u-quantization.md): The Arm Ethos-U delegate only supports the execution of quantized models. To quantize a model so that it is supported by...
- [Arm Ethos-U Troubleshooting](backends-arm-ethos-u-arm-ethos-u-troubleshooting.md): This page describes common issues that you may encounter when using the Arm Ethos-U backend and how to debug and reso...
- [Arm Ethos-U Backend Tutorials](backends-arm-ethos-u-tutorials-arm-ethos-u-tutorials.md): **→ {doc}`ethos-u-getting-started`**
- [Getting Started Tutorial](backends-arm-ethos-u-tutorials-ethos-u-getting-started.md)
- [Arm VGF Backend](backends-arm-vgf-arm-vgf-overview.md): The Arm® VGF backend is the ExecuTorch solution for lowering PyTorch models to VGF compatible hardware.
- [Partitioner API](backends-arm-vgf-arm-vgf-partitioner.md): The `VgfPartitioner` controls which parts of a model are delegated to the Arm VGF backend. Below is a reference of the ...
- [Quantization](backends-arm-vgf-arm-vgf-quantization.md): The Arm VGF delegate can be used to execute quantized models. To quantize a model so that it is supported by this delega...
- [Arm VGF Troubleshooting](backends-arm-vgf-arm-vgf-troubleshooting.md): This page describes common issues that you may encounter when using the Arm VGF backend and how to debug and resolve ...
- [Arm VGF Backend Tutorials](backends-arm-vgf-tutorials-arm-vgf-tutorials.md): **→ {doc}`vgf-getting-started`**
- [Getting Started Tutorial](backends-arm-vgf-tutorials-vgf-getting-started.md)
- [Op support](backends-coreml-coreml-op-support.md): The Core ML backend supports almost all PyTorch operators.
- [Core ML Backend](backends-coreml-coreml-overview.md): The Core ML delegate is the ExecuTorch solution to take advantage of Apple's [Core ML framework](https://developer.apple....
- [Partitioner API](backends-coreml-coreml-partitioner.md): The Core ML partitioner API allows for configuration of the model delegation to Core ML. Passing a `CoreMLPartitioner`...
- [Quantization](backends-coreml-coreml-quantization.md): To quantize a PyTorch model for the Core ML backend, use the `CoreMLQuantizer`. `Quantizers` are backend specific, w...
- [Troubleshooting](backends-coreml-coreml-troubleshooting.md): This page describes common issues that you may encounter when using the Core ML backend and how to debug and resolve ...
- [MPS Backend](backends-mps-mps-overview.md): The MPS delegate is the ExecuTorch solution to take advantage of Apple's GPU for on-device ML using the [MPS Graph](https...
- [NXP eIQ Neutron Backend](backends-nxp-nxp-overview.md): This page introduces the NXP eIQ Neutron backend.
- [NXP eIQ Neutron Quantization](backends-nxp-nxp-quantization.md): The eIQ Neutron NPU requires the delegated operators to be quantized. To quantize the PyTorch model for the Neutron b...
- [Preparing a Model for NXP eIQ Neutron Backend](backends-nxp-tutorials-nxp-basic-tutorial.md): This guide demonstrates the use of the ExecuTorch AoT flow to convert a PyTorch model to ExecuTorch
- [NXP Tutorials](backends-nxp-tutorials-nxp-tutorials.md): **→ {doc}`nxp-basic-tutorial` — Lower and run a model on the NXP eIQ Neutron backend.**
- [Samsung Exynos Backend](backends-samsung-samsung-overview.md): ExecuTorch's Samsung Exynos backend enables the execution of ExecuTorch models on
- [Partitioner API](backends-samsung-samsung-partitioner.md): The `EnnPartitioner` API is the primary entrypoint when exporting a model to the Samsung
- [Quantization](backends-samsung-samsung-quantization.md): The Exynos backend currently supports executing statically quantized 8-bit models.
- [{BACKEND_NAME} Architecture and Internals](backends-template-backend-arch-internals.md): This page covers internal implementation details of the backend, and is mainly aimed at contributors and heavy power ...
- [Backend Template](backends-template-backend-overview.md): Provide a brief overview/description of the backend. At a high-level, what does it do? Consider linking to top-level ...
- [{BACKEND_NAME} Quantization](backends-template-backend-quantization.md): Document quantization schemes and flows for the backend. This should include a description of each scheme and a code ...
- [{BACKEND_NAME} Troubleshooting](backends-template-backend-troubleshooting.md): This page describes common issues that you may encounter when using the {BACKEND_NAME} backend and how to debug and r...
- [Using {FEATURE} on {BACKEND_NAME}](backends-template-guides-backend-basic-guide.md): This is a placeholder guide.
- [{BACKEND_NAME} Guides](backends-template-guides-backend-guides.md): **→ {doc}`{backend_name}-basic-guide` — Guide description.**
- [Preparing a Model for {BACKEND_NAME}](backends-template-tutorials-backend-basic-tutorial.md): This is a placeholder tutorial.
- [{BACKEND_NAME} Tutorials](backends-template-tutorials-backend-tutorials.md): **→ {doc}`{backend_name}-basic-tutorial` — Lower and run a model on the {BACKEND_NAME} backend.**
- [Exporting Llama 3.2 1B/3B Instruct to ExecuTorch Vulkan and running on device](backends-vulkan-tutorials-etvk-llama-tutorial.md): This tutorial assumes that you have a working local copy of the ExecuTorch repo,
- [Executing and profiling an ExecuTorch Vulkan model on device](backends-vulkan-tutorials-etvk-profiling-tutorial.md): This tutorial assumes that you have a working local copy of the ExecuTorch repo,
- [Vulkan Backend Tutorials](backends-vulkan-tutorials-vulkan-tutorials.md): **→ {doc}`etvk-profiling-tutorial`**
- [Vulkan Backend](backends-vulkan-vulkan-overview.md): The ExecuTorch Vulkan (ET-VK) backend enables ExecuTorch models to execute on
- [Partitioner API](backends-vulkan-vulkan-partitioner.md): is a Python class that controls what operators in a model can or should be
- [Quantization](backends-vulkan-vulkan-quantization.md): The Vulkan backend currently supports execution of quantized linear layers,
- [Troubleshooting](backends-vulkan-vulkan-troubleshooting.md): This page describes common issues that you may encounter when using the Vulkan
- [Architecture and Internals](backends-xnnpack-xnnpack-arch-internals.md): This is a high-level overview of the ExecuTorch XNNPACK backend delegate. This high performance delegate is aimed to ...
- [XNNPACK Backend](backends-xnnpack-xnnpack-overview.md): The XNNPACK delegate is the ExecuTorch solution for CPU execution on mobile CPUs. [XNNPACK](https://github.com/google...
- [Quantization](backends-xnnpack-xnnpack-quantization.md): The XNNPACK delegate can also be used as a backend to execute symmetrically quantized models. To quantize a PyTorch m...
- [Troubleshooting](backends-xnnpack-xnnpack-troubleshooting.md): This page describes common issues that you may encounter when using the XNNPACK backend and how to debug and resolve ...
- [Cadence Xtensa Backend](backends-cadence.md): In this tutorial we will walk you through the process of getting set up to build ExecuTorch for Cadence Xtensa DSPs an...
- [MediaTek Backend](backends-mediatek.md): The MediaTek backend enables acceleration of PyTorch models on edge devices with MediaTek Neuron Processing Units (NP...
- [Backends](backends-overview.md): ExecuTorch backends provide hardware acceleration for specific hardware targets, enabling models to run efficiently o...
- [Qualcomm AI Engine Backend](backends-qualcomm.md): In this tutorial we will walk you through the process of getting started to
- [Building and Running ExecuTorch with OpenVINO Backend](build-run-openvino.md): In this tutorial we will walk you through the process of setting up the prerequisites, building the OpenVINO backend libr...
- [Bundled Program -- a Tool for ExecuTorch Model Validation](bundled-io.md): `BundledProgram` is a wrapper around the core ExecuTorch program designed to help users wrap test cases with the ...
- [Backend Dialect](compiler-backend-dialect.md): _Backend dialect_ is a special variant of [edge dialect](ir-exir.md), because it contains backend specific nodes and ...
- [Custom Compiler Passes and Partitioners](compiler-custom-compiler-passes.md): Passes can be roughly categorized into a couple of axes:
- [Understanding Backends and Delegates](compiler-delegate-and-partitioner.md): Audience: Vendors, Backend Delegate developers, who are interested in integrating their own compilers and hardware as...
- [Compiler Entry Points](compiler-entry-points.md)
- [Compiler & IR](compiler-ir-advanced.md)
- [Memory Planning](compiler-memory-planning.md): Audience: Backend integrators and embedded developers who are interested in customizing the regions of memory ExecuTo...
- [Concepts](concepts.md): This page provides an overview of key concepts and terms used throughout the ExecuTorch documentation. It is intended...
- [Contributing to ExecuTorch](contributing.md): Thank you for your interest in contributing to ExecuTorch! We want to make
- [Debugging Delegation](debug-backend-delegate.md): We provide a list of utility functions to give users insight into what happened to the graph modules during the `to_backe...
- [Delegate Debugging](delegate-debugging.md): [Delegate backends](compiler-delegate-and-partitioner.md) are a prominent component of on-device models due to their ...
- [Backends](desktop-backends.md)
- [Desktop & Laptop Platforms](desktop-section.md)
- [Tools](developer-tools.md)
- [Introduction to the ExecuTorch Developer Tools](devtools-overview.md): ExecuTorch has been designed with [productivity](intro-overview.md) as one of its core objectives and the ExecuTorch ...
- [Devtools Tutorial](devtools-tutorial.md): Please refer to the [Developer Tools tutorial](tutorials/devtools-integration-tutorial) for a w...
- [Edge](edge-platforms-section.md)
- [Backends](embedded-backends.md)
- [Embedded Systems](embedded-section.md)
- [Prerequisite | ETDump - ExecuTorch Dump](etdump.md): ETDump (ExecuTorch Dump) is one of the core components of the ExecuTorch Developer Tools. It is the mechanism through...
- [Lowering a Model as a Delegate](examples-end-to-end-to-lower-model-to-delegate.md): Audience: ML Engineers, who are interested in applying delegates to accelerate their program at runtime.
- [Examples](examples.md)
- [Exporting to ExecuTorch](export-overview.md): One of the important steps in getting your PyTorch programs ready for execution
- [Running an ExecuTorch Model Using the Module Extension in C++](extension-module.md): **Author:** [Anthony Shoumikhin](https://github.com/shoumikhin)
- [Managing Tensor Memory in C++](extension-tensor.md): **Author:** [Anthony Shoumikhin](https://github.com/shoumikhin)
- [File Formats](file-formats-advanced.md)
- [Architecture and Components](getting-started-architecture.md): This page describes the technical architecture of ExecuTorch and its individual components. This document is targeted...
- [Getting Started with ExecuTorch](getting-started.md): This section is intended to describe the necessary steps to take a PyTorch model and run it using ExecuTorch. To use ...
- [Welcome to the ExecuTorch Documentation](index.md)
- [How ExecuTorch Works](intro-how-it-works.md): This page describes how ExecuTorch works and its key benefits.
- [ExecuTorch Overview](intro-overview.md): **ExecuTorch** is an end-to-end solution for enabling on-device inference
- [Intro](intro-section.md)
- [Backends](ios-backends.md)
- [Examples & Demos](ios-examples.md): [iOS LLM Examples Repository](https://github.com/meta-pytorch/executorch-examples/tree/main/llm/apple)
- [iOS](ios-section.md)
- [Export IR Specification](ir-exir.md): Export IR is an intermediate representation (IR) for the result of
- [Definition of the Core ATen Operator Set](ir-ops-set-definition.md): This page provides the description and background of the Core ATen Operator Set (opset). This page is recommended rea...
- [IR Specification](ir-specification.md)
- [Kernel Library Deep Dive](kernel-library-advanced.md)
- [Kernel Registration](kernel-library-custom-aten-kernel.md): At the last stage of [ExecuTorch model exporting](export-overview.md), we lower the operators in the dialect to the _...
- [Overview of ExecuTorch’s Kernel Libraries](kernel-library-overview.md): This page provides a description of the Portable Kernel Library and the Optimized Kernel Library, which are the defau...
- [Kernel Library Selective Build](kernel-library-selective-build.md): _Selective build_ is a build mode in ExecuTorch that uses model metadata to guide the ExecuTorch build. This build mode c...
- [Kernel Library](kernel-library.md)
- [Run Llama 3 3B Instruct on Android (with Qualcomm AI Engine Direct Backend)](llm-build-run-llama3-qualcomm-ai-engine-direct-backend.md): This tutorial demonstrates how to export and run the Llama 3 3B Instruct model on a Qualcomm device using the Qualcom...
- [Exporting custom LLMs](llm-export-custom-llm.md): If you have your own PyTorch model that is an LLM, this guide will show you how to manually export and lower to Execu...
- [Exporting LLMs with HuggingFace's Optimum ExecuTorch](llm-export-llm-optimum.md): [Optimum ExecuTorch](https://github.com/huggingface/optimum-executorch) provides a streamlined way to export Hugging ...
- [Exporting LLMs](llm-export-llm.md): Instead of needing to manually write code to call torch.export(), use ExecuTorch's assortment of lowering APIs, or ev...
- [Deploying LLMs to ExecuTorch](llm-getting-started.md): ExecuTorch is designed to support all types of machine learning models, and LLMs are no exception.
- [Llama on ExecuTorch](llm-llama.md): See
- [Running LLMs on iOS](llm-run-on-ios.md): ExecuTorch’s LLM-specific runtime provides experimental Objective-C and Swift components around the core...
- [Running LLMs with C++](llm-run-with-c-plus-plus.md): This guide explains how to use ExecuTorch's C++ runner library to run LLM models that have been exported to the `.pte...
- [LLMs](llm-working-with-llms.md)
- [Markdown in Sphinx Tips and Tricks](markdown-sphinx-tips-tricks.md): In this repository, you can use both Markdown and reStructuredText to author
- [Memory Planning Inspection in ExecuTorch](memory-planning-inspection.md): After the [Memory Planning](concepts.md#memory-planning) pass of ExecuTorch, memory allocation information is stored ...
- [Debugging Models in ExecuTorch](model-debugging.md): With the ExecuTorch Developer Tools, users can debug their models for numerical inaccuracies and extract model outputs...
- [New Contributor Guide](new-contributor-guide.md): Welcome to **ExecuTorch** — a runtime for efficient deployment of PyTorch AI models to edge devices, including mobile...
- [Pico2: A simple MNIST Tutorial](pico2-tutorial.md): Deploy your PyTorch models directly to the Raspberry Pi Pico2 microcontroller with ExecuTorch.
- [Desktop & Laptop](platforms-desktop.md): ExecuTorch supports desktop and laptop deployment across Linux, macOS, and Windows.
- [Embedded Platforms](platforms-embedded.md): ExecuTorch supports embedded devices from microcontrollers to edge devices.
- [Portable C++ Programming](portable-cpp-programming.md): NOTE: This document covers the code that needs to build for and execute in
- [`.ptd` file format](ptd-file-format.md): ExecuTorch `.ptd` files are serialized as modified binary flatbuffer
- [`.pte` file format](pte-file-format.md): ExecuTorch `.pte` program files are serialized as modified binary flatbuffer
- [Quantization & Optimization](quantization-optimization.md)
- [Quantization Overview](quantization-overview.md): Quantization is a technique that reduces the precision of numbers used in a model’s computations and stored weights—t...
- [Quantization](quantization.md)
- [Quick Start](quick-start-section.md)
- [ExecuTorch on Raspberry Pi](raspberry-pi-llama-tutorial.md): This tutorial demonstrates how to deploy **Llama models on Raspberry Pi 4/5 devices** using ExecuTorch:
- [Detailed C++ Runtime APIs Tutorial](running-a-model-cpp-tutorial.md): **Author:** [Jacob Szwejbka](https://github.com/JacobSzwejbka)
- [Backend Delegate Implementation and Linking](runtime-backend-delegate-implementation-and-linking.md): Please refer to:
- [Runtime & Integration](runtime-integration-advanced.md)
- [ExecuTorch Runtime Overview](runtime-overview.md): This document discusses the design of the ExecuTorch runtime, which executes
- [Runtime Platform Abstraction Layer (PAL)](runtime-platform-abstraction-layer.md): The ExecuTorch _Platform Abstraction Layer_ (PAL) provides a way for execution
- [Profiling Models in ExecuTorch](runtime-profiling.md): Profiling in ExecuTorch gives users access to these runtime metrics:
- [Runtime](runtime.md)
- [Success Stories](success-stories.md)
- [Support](support-section.md)
- [Tools](tools-section.md)
- [TITLE](tutorial-template.md)
- [Building and Running ExecuTorch with XNNPACK Backend](tutorial-xnnpack-delegate-lowering.md): The following tutorial will familiarize you with leveraging the ExecuTorch XNNPACK Delegate for accelerating your ML ...
- [Usage](usage.md): This section describes how to use ExecuTorch. It covers everything from
- [Using ExecuTorch on Android](using-executorch-android.md): 🚀 Quick Start: __New to ExecuTorch__? Jump to Using AAR from Maven Central for the ...
- [Building from Source](using-executorch-building-from-source.md): ExecuTorch uses [CMake](https://cmake.org/) as the primary build system.
- [Using ExecuTorch with C++](using-executorch-cpp.md): In order to support a wide variety of devices, from high-end mobile phones down to tiny embedded systems, ExecuTorch ...
- [Model Export and Lowering](using-executorch-export.md): This section describes the process of taking a PyTorch model and converting it to the runtime format used by ExecuTorch. ...
- [Frequently Asked Questions](using-executorch-faqs.md): This page summarizes frequently asked questions and provides guidance on issues that commonly occur when adopting Exe...
- [Using ExecuTorch on iOS](using-executorch-ios.md): ExecuTorch supports both iOS and macOS via Objective-C, Swift, and C++. ExecuTorch also provides backends to leverage...
- [Runtime Integration](using-executorch-runtime-integration.md): This section describes options for configuring and customizing the ExecuTorch runtime. While the pre-built packages a...
- [Profiling and Debugging](using-executorch-troubleshooting.md): To facilitate model and runtime integration, ExecuTorch provides tools to profile model resource utilization, numeric...
- [Visualize a Model using ModelExplorer](visualize.md): The [visualization_utils.py](../../devtools/visualization/visualization_utils.py) contains functions for