NXP Backend: Add infrastructure for pre processing passes in edge dialect #13183
base: main
Conversation
🔗 Helpful Links
🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/13183
Note: Links to docs will display an error until the docs builds have been completed.
✅ No Failures
As of commit 35c8809 with merge base c8a0706.
This comment was automatically generated by Dr. CI and updates every 15 minutes.
@pytorchbot label "release notes: nxp" module "nxp backend"
Didn't find following labels among repository labels: module,nxp backend
partitioner=[partitioner],
compile_config=EdgeCompileConfig(_check_ir_validity=False),
)
edge_program_manager = edge_program_manager.to_backend(partitioner)
You have introduced back the use of the `export_to_edge` and `edge_program_manager.to_edge` functions, which were replaced by the `to_edge_transform_and_lower` function. The passes we introduce (edge passes) shall run after the export to edge dialect but before partitioning. `to_edge_transform_and_lower` has an optional parameter `transform_passes`, which I suppose serves this purpose.
@Pop-korn, please update the code to use this call in `executorch_pipeline.py` and `aot_neutron_compile.py` if feasible. This should also resolve the conflict you have with the main branch. Also rebase onto main. Then please mark the PR as ready to merge.
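For illustration, a minimal sketch of that call shape, assuming the current `executorch.exir` API (`exported_program`, `MoveViewCopyPass`, and `neutron_partitioner` are placeholder names, not code from this PR):

from executorch.exir import EdgeCompileConfig, to_edge_transform_and_lower

# Edge passes go in transform_passes; they run after the conversion to the
# edge dialect and before the partitioner is invoked.
edge_program_manager = to_edge_transform_and_lower(
    exported_program,
    transform_passes=[MoveViewCopyPass()],  # placeholder edge pass instance
    partitioner=[neutron_partitioner],      # placeholder partitioner instance
    compile_config=EdgeCompileConfig(_check_ir_validity=False),
)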
I have resolved the conflict and rebased onto main.
The `to_edge_transform_and_lower` parameter `transform_passes` only runs a single iteration of the passes. Therefore, if a pass makes a change that would allow an earlier pass to make another change, it will not get the chance to do so. The number of iterations is limited by the `steps` attribute of the `PassManager`, which cannot be set through the current interface. The only way to use a different `steps` value (that I know of) is to modify the class attribute directly (call `PassManager.steps = 10` before calling `to_edge_transform_and_lower`), which is not good practice as it has a global effect.
I understand that the `to_edge_transform_and_lower` function is the preferred way for program lowering. So do we want to:
1. Use `to_edge_transform_and_lower` and only run the passes once.
2. Use `to_edge_transform_and_lower` and globally modify the `steps` class attribute, to run multiple iterations of the passes.
3. Use the current implementation, i.e. manually export to edge, run the passes ourselves (using `PassManager(steps=10)`), and then explicitly export to backend (with partitioning).
Which of these solutions is preferred? Or is there a different solution I am not seeing? @digantdesai
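For concreteness, a minimal sketch of option 3, assuming the `torch.fx` `PassManager` signature (the pass names are placeholders):

from torch.fx.passes.infra.pass_manager import PassManager

# steps=10 bounds the fixed-point iteration for this instance only, instead
# of mutating the PassManager class attribute globally.
pass_manager = PassManager(
    passes=[MoveViewCopyPass(), OtherEdgePass()],  # placeholder edge passes
    steps=10,
)
# The manager re-runs the pass list until no pass reports a modification,
# or until `steps` iterations have been executed.
result = pass_manager(edge_program_manager.exported_program().graph_module)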
# The graph has now changed, and we shouldn't keep iterating through it. Return the new graph and the parent
# class will call this pass again.
return PassResult(graph_module, True)
@Pop-korn, in principle this does not differ from the initial draft. Every time you make a modification, you return, and the caller `EdgePassManager` starts a new iteration. Can you do all the graph modifications and return only then?
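A sketch of that shape, using `torch.fx`'s `PassBase` (the two helpers are hypothetical stand-ins for the actual cluster logic, not code from this PR):

import torch
from torch.fx.passes.infra.pass_base import PassBase, PassResult

class MoveViewCopyPass(PassBase):  # placeholder name
    def call(self, graph_module: torch.fx.GraphModule) -> PassResult:
        # Snapshot the matches first, so the loop never iterates over a
        # graph that is being mutated underneath it.
        matches = [n for n in graph_module.graph.nodes if _needs_own_cluster(n)]  # hypothetical predicate
        for node in matches:
            _wrap_in_own_qdq_cluster(graph_module, node)  # hypothetical rewrite helper
        graph_module.recompile()
        # Report once, after all modifications, instead of once per modification.
        return PassResult(graph_module, modified=bool(matches))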
@digantdesai, @Pop-korn noticed that in some of the passes the code iterates over a changing graph. E.g. in https://github.com/pytorch/executorch/blob/main/backends/xnnpack/_passes/fuse_batch_norm.py#L41, `graph.nodes` gets modified in-place in https://github.com/pytorch/executorch/blob/main/backends/xnnpack/_passes/fuse_batch_norm.py#L208 every time a match is found.
@Pop-korn, can you please confirm? @digantdesai Is this legit and intended?
Responding to the first paragraph:
We can safely do all the modifications at once, and then return. We would be modifying the list of nodes while iterating over it, but it shouldn't have negative side effects, as we are only inserting nodes. The downside is that it is not good practice. Do you think that implementation would be preferred?
The second paragraph:
As far as I can tell, the XNNPack batch_norm fusion pass does indeed modify the graph while iterating over it. When the pass removes a `batch_norm` node, the next iteration of the `for` loop will skip a node. If (somehow) the graph contained the sequence `convolution` -> `batch_norm` -> `batch_norm`, only the first `batch_norm` node would be fused, and the second one would be skipped.
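The skip is the usual hazard of removing elements from a sequence while iterating over it. A standalone Python illustration (a plain list, not the fx linked-list iterator, but the analogous behavior):

# Removing the current element shifts its successor into the current slot,
# so the iterator's next step jumps over it.
nodes = ["convolution", "batch_norm", "batch_norm"]
for node in nodes:
    if node == "batch_norm":
        nodes.remove(node)

print(nodes)  # ['convolution', 'batch_norm'] -- the second batch_norm was never visited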
…edge dialect programs.
…heir own QDQ clusters. A PyTorch model can contain a `Linear` operator with 4D IO. After quantization, it gets its own QDQ cluster. But after lowering to edge, `view_copy` operators are added within the cluster, right before and after the `Linear` (now `addmm`/`mm`). This does not follow the QDQ schema and causes issues later down the pipeline. Therefore, pre-processing passes at the edge dialect level were implemented to move the `view_copy` nodes into their own QDQ clusters.
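Schematically, a sketch of the issue described above (simplified; the exact quantize/dequantize variants and arguments are omitted, and this is not actual pass output):

# after quantization:      dequantize -> linear -> quantize
# after lowering to edge:  dequantize -> view_copy -> addmm -> view_copy -> quantize
# after the new pass:      dequantize -> view_copy -> quantize ->
#                          dequantize -> addmm -> quantize ->
#                          dequantize -> view_copy -> quantize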
Force-pushed from 5daaed4 to 35c8809
Summary
Add infrastructure for running pre-processing passes on edge dialect programs.
Add pre-processing pass to move `view_copy` nodes into their own QDQ clusters.
Test plan
Unit test provided.