Transformation
If you want to know more details about transformation passes, please take a look at the section "Transformation Pass" in the Internals chapter.
Submodules
Transformation Passes
Base Class
Guide to writing QONNX transformations
Your transformation must inherit from the Transformation abstract base class.
Your transformation's apply function should take in a ModelWrapper and return a tuple (transformed_model: ModelWrapper, model_was_changed: bool).
The transformations are meant to be applied using the .transform function in ModelWrapper. This makes a deep copy of the input model by default, so you don’t have to.
model_was_changed indicates whether your transformation made any changes to the model. If you know your transformation needs to be called only once and repeated calls have no further effect, you can return False even if the model was changed.
You MUST return model_was_changed=False at some point when your transformation is called multiple times, otherwise apply_repeated() will loop infinitely.
If you cannot guarantee that the transformation will reach a fixed point, you must declare this, return model_was_changed = False and let the user manually re-apply the transform.
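A minimal sketch of a custom transformation following these rules (the pass, its name and its behavior are hypothetical, purely for illustration):

    from qonnx.core.modelwrapper import ModelWrapper
    from qonnx.transformation.base import Transformation

    class DropIdentityNodes(Transformation):
        """Hypothetical example pass: remove Identity nodes by rewiring consumers."""

        def apply(self, model):
            graph_modified = False
            for node in model.graph.node:
                if node.op_type == "Identity":
                    # rewire all consumers of the Identity output to its input
                    consumers = model.find_consumers(node.output[0]) or []
                    for consumer in consumers:
                        for i, inp in enumerate(consumer.input):
                            if inp == node.output[0]:
                                consumer.input[i] = node.input[0]
                    model.graph.node.remove(node)
                    graph_modified = True
                    break  # the node list was mutated, so stop and request a re-run
            # model_was_changed=True makes .transform() call this pass again
            return model, graph_modified

    model = ModelWrapper("model.onnx")  # hypothetical filename
    model = model.transform(DropIdentityNodes())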
- class qonnx.transformation.base.NodeLocalTransformation(num_workers=None)
Bases:
Transformation
Parent class for transformations that can be executed locally on a single node by accessing and modifying the attributes of only that node. This allows the transformation to be automatically parallelized. Transformations subclassing NodeLocalTransformation must implement the abstract method applyNodeLocal(). A read-only copy of the model is available as the member variable ref_input_model, but any modifications there will be disregarded.
To control the degree of parallelization, specify the num_workers argument in the constructor, using one of the following values:
- None: use the NUM_DEFAULT_WORKERS environment variable
- 0: use all available CPU cores
- any other int > 0: set the number of parallel workers
- apply(model)
- abstract applyNodeLocal(node)
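A node-local pass only touches the node it is handed and returns a (node, was_changed) tuple; a minimal hypothetical sketch, assuming model is a ModelWrapper instance:

    from qonnx.transformation.base import NodeLocalTransformation

    class ClearNodeDocStrings(NodeLocalTransformation):
        """Hypothetical example: clear every node's doc_string field, in parallel."""

        def applyNodeLocal(self, node):
            if node.doc_string:
                node.doc_string = ""
            # a single pass suffices, so report that no further changes are needed
            return (node, False)

    # num_workers=0 uses all available CPU cores (see above)
    model = model.transform(ClearNodeDocStrings(num_workers=0))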
qonnx.transformation.batchnorm_to_affine
- class qonnx.transformation.batchnorm_to_affine.BatchNormToAffine
Bases:
Transformation
Replaces any test-time BatchNorm layers with Mul-Add layers.
- apply(model)
qonnx.transformation.bipolar_to_xnor
- class qonnx.transformation.bipolar_to_xnor.ConvertBipolarMatMulToXnorPopcount
Bases:
Transformation
Convert MatMul nodes with all-bipolar inputs to XnorPopcountMatMul and associated result correction.
- apply(model)
qonnx.transformation.change_3d_tensors_to_4d
- class qonnx.transformation.change_3d_tensors_to_4d.Change3DTo4DTensors
Bases:
Transformation
Replaces 3D tensors with 4D tensors assuming the following format: [N, C, H] -> [N, C, H, 1]. The attributes of a (specific) set of supported nodes are changed accordingly. If the graph contains unsupported nodes, a warning is raised and the transformation is not applied.
- apply(model)
qonnx.transformation.change_batchsize
- class qonnx.transformation.change_batchsize.ChangeBatchSize(bsize)
Bases:
Transformation
Change the batch size dimension to the given value for the entire graph by changing it for the global input/output and removing all intermediate shapes (will need a call to shape inference to restore shapes). Will attempt to handle any Reshape nodes with constant shape parameters by changing the batch size dimension value in the parameter.
- apply(model: ModelWrapper)
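Since the pass removes intermediate shapes, it is typically followed by shape inference; a usage sketch with a hypothetical model file:

    from qonnx.core.modelwrapper import ModelWrapper
    from qonnx.transformation.change_batchsize import ChangeBatchSize
    from qonnx.transformation.infer_shapes import InferShapes

    model = ModelWrapper("model.onnx")  # hypothetical filename
    model = model.transform(ChangeBatchSize(16))
    model = model.transform(InferShapes())  # restore the removed intermediate shapes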
qonnx.transformation.change_datalayout
- class qonnx.transformation.change_datalayout.ChangeDataLayoutQuantAvgPool2d
Bases:
Transformation
Replace each QuantAvgPool2d node using the (N,C,H,W) data layout with a QuantAvgPool2dNHWC node using the (N,H,W,C) data layout, surrounded by the appropriate Transpose nodes.
- apply(model)
qonnx.transformation.channels_last
- class qonnx.transformation.channels_last.AbsorbChanFirstIntoMatMul
Bases:
Transformation
Removes a channels-first Transpose node if it sits in front of a Flatten and MatMul (or Gemm) node.
The channels-first Transpose is fused into the initializer of the Quant node acting as the weight tensor for the MatMul/Gemm node. Reshape nodes with shape [1, -1] are also supported instead of Flatten nodes. Independent of whether the flattening operation was performed by a Flatten node or a Reshape node, a Flatten node will be reinserted in front of the MatMul node.
- Note: This transformation removes some of the tensor shapes on the downstream path, so running shape inference afterwards is advised.
- apply(model)
- class qonnx.transformation.channels_last.ConvertToChannelsLastAndClean(make_input_channels_last=False)
Bases:
Transformation
Converts data layout dependent nodes to ChannelsLast nodes and inserts the required Transpose nodes. It then tries to eliminate as many Transposes as possible and moves the remaining ones as far upstream as possible.
- Parameters:
make_input_channels_last (bool) – Also makes the input of the network channels last, otherwise a transpose node will be left at the beginning of the network. Defaults to False
- apply(model)
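A usage sketch, assuming model is a ModelWrapper instance:

    from qonnx.transformation.channels_last import ConvertToChannelsLastAndClean

    # make_input_channels_last=True also converts the graph input itself,
    # so no Transpose node is left at the beginning of the network
    model = model.transform(ConvertToChannelsLastAndClean(make_input_channels_last=True))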
- class qonnx.transformation.channels_last.InsertChannelsLastDomainsAndTrafos
Bases:
Transformation
Inserts the ChannelsLast domain where required and also inserts the required transposes.
- apply(model)
- class qonnx.transformation.channels_last.MoveChanFirstDownstream
Bases:
Transformation
Moves channels-first Transpose nodes further downstream.
- apply(model)
- class qonnx.transformation.channels_last.MoveChanLastUpstream
Bases:
Transformation
Moves channels-last Transpose nodes further upstream.
- apply(model)
- class qonnx.transformation.channels_last.RemoveConsecutiveChanFirstAndChanLastTrafos
Bases:
Transformation
Remove two consecutive transformations that cancel each other out: (ChannelsLast -> ChannelsFirst) followed by (ChannelsFirst -> ChannelsLast). More concretely, the first converts to channels-first and the second back to channels-last.
- apply(model)
qonnx.transformation.create_generic_partitions
- class qonnx.transformation.create_generic_partitions.PartitionFromDict(partitioning={}, partition_dir=None)
Bases:
Transformation
Split a graph into partitions. Each resulting partition node has a model attribute indicating the path to the subordinate onnx file. Cleanup and InferShapes() transformations should be applied first.
This transformation builds on PartitionFromLambda() and takes a dictionary that defines partitions based on node indices.
Argument 0: partitioning
- Dictionary with the following format: { partition_id : node_index_list }
- Example: {0 : [3,4,5], 1 : range(10, 15)}
Argument 1 (optional): partition_dir
- Manually define where to save the partition models
- apply(model)
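A usage sketch based on the dictionary format above (node indices are hypothetical), assuming model is a cleaned and shape-inferred ModelWrapper:

    from qonnx.transformation.create_generic_partitions import PartitionFromDict

    # nodes 3-5 become partition 0, nodes 10-14 become partition 1
    model = model.transform(PartitionFromDict({0: [3, 4, 5], 1: range(10, 15)}))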
- class qonnx.transformation.create_generic_partitions.PartitionFromLambda(partitioning=<function PartitionFromLambda.<lambda>>, partition_dir=None)
Bases:
Transformation
Split a graph into partitions. Each resulting partition node has a model attribute indicating the path to the subordinate onnx file. Cleanup and InferShapes() transformations should be applied first.
Argument 0: partitioning
- Function performing the mapping: node -> partition_id (int or string)
- Partitions may not cover the graph completely (nodes mapped to -1 are retained)
- Mapping must return -1 for GenericPartition nodes
Argument 1 (optional): partition_dir
- Manually define where to save the partition models
- apply(model)
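A usage sketch with a hypothetical mapping function, assuming model is a cleaned and shape-inferred ModelWrapper:

    from qonnx.transformation.create_generic_partitions import PartitionFromLambda

    # map all Conv nodes to partition 0; -1 keeps a node in the main graph
    model = model.transform(PartitionFromLambda(
        partitioning=lambda node: 0 if node.op_type == "Conv" else -1))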
qonnx.transformation.double_to_single_float
- class qonnx.transformation.double_to_single_float.DoubleToSingleFloat
Bases:
Transformation
Convert any float64 initializers to float32.
- apply(model)
qonnx.transformation.expose_intermediate
- class qonnx.transformation.expose_intermediate.ExposeIntermediateTensorsLambda(tensor_filter=<function ExposeIntermediateTensorsLambda.<lambda>>)
Bases:
Transformation
- apply(model: ModelWrapper)
- class qonnx.transformation.expose_intermediate.ExposeIntermediateTensorsPatternList(pattern_list, dynamic_only=True)
Bases:
ExposeIntermediateTensorsLambda
- pattern_filter(tname, model)
qonnx.transformation.extend_partition
- class qonnx.transformation.extend_partition.ExtendPartition(extend_index)
Bases:
Transformation
Extends GenericPartition type nodes by inserting the graph pointed to by the model attribute.
Argument 0: extend_index
- List that contains the node indices of the GenericPartition nodes
- apply(model)
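A usage sketch with hypothetical node indices, assuming model is a ModelWrapper instance:

    from qonnx.transformation.extend_partition import ExtendPartition

    # re-inline the GenericPartition nodes found at node indices 3 and 7
    model = model.transform(ExtendPartition([3, 7]))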
qonnx.transformation.extract_conv_bias
- class qonnx.transformation.extract_conv_bias.ExtractBiasFromConv
Bases:
Transformation
Extracts the (optional) Bias from a Conv(Transpose) node and inserts it behind the Conv(Transpose) node as an Add node.
- apply(model)
qonnx.transformation.extract_quant_scale_zeropt
- class qonnx.transformation.extract_quant_scale_zeropt.ExtractQuantScaleZeroPt
Bases:
Transformation
Extract any non-identity scale and zero-point Quant inputs as separate Div/Mul (for scale) and Add/Sub (for zero-point) nodes, preceding and following the Quant node.
- apply(model: ModelWrapper)
qonnx.transformation.fold_constants
- class qonnx.transformation.fold_constants.FoldConstants(exclude_op_types=['Quant', 'BipolarQuant'])
Bases:
Transformation
Replace the output of a node with const-only inputs with a precomputed result. Skip any op types given in exclude_op_types.
- apply(model)
- class qonnx.transformation.fold_constants.FoldConstantsFiltered(match_filter_fxn)
Bases:
Transformation
Replace the output of a node with const-only inputs with a precomputed result. Use the match_filter_fxn(model, node) function to decide which nodes are eligible for constant folding.
- apply(model)
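A sketch of a hypothetical filter function, assuming model is a ModelWrapper instance:

    from qonnx.transformation.fold_constants import FoldConstantsFiltered

    # match_filter_fxn receives (model, node) and returns True for eligible nodes;
    # this hypothetical filter only folds const-input Reshape nodes
    model = model.transform(FoldConstantsFiltered(
        lambda model, node: node.op_type == "Reshape"))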
qonnx.transformation.gemm_to_matmul
- class qonnx.transformation.gemm_to_matmul.GemmToMatMul
Bases:
Transformation
Converts Gemm nodes into a MatMul and an Add node. This transformation is built to support version 9 of the Gemm node, as documented here: https://github.com/onnx/onnx/blob/master/docs/Changelog.md#Gemm-9 However, earlier and later versions of the node are likely to work as well. Explicitly not supported is the optionality of input C in versions >=11 and the broadcast attribute of versions <=6.
- apply(model)
qonnx.transformation.general
- class qonnx.transformation.general.ApplyConfig(config, node_filter=<function ApplyConfig.<lambda>>)
Bases:
Transformation
Applies node properties (attributes) from either a config dict or its JSON representation given as a filename. The JSON file can specify default values for particular op_types, as well as values for nodes with particular names. Example dict:
{
    # set kernel_size = 3 for all nodes with op_type=Im2Col
    "Defaults": {"kernel_size": [3, ["Im2Col"]]},
    # set kernel_size = 7 for the particular node with name Im2Col_0
    "Im2Col_0": {"kernel_size": 7}
}
- apply(model)
- class qonnx.transformation.general.ConvertDivToMul
Bases:
Transformation
Convert divide by constant nodes to multiply by constant nodes.
- apply(model)
- class qonnx.transformation.general.ConvertSubToAdd
Bases:
Transformation
Convert subtract-a-constant nodes to add-a-constant nodes.
- apply(model)
- class qonnx.transformation.general.GiveRandomTensorNames
Bases:
Transformation
Give random tensor names to all tensors.
- apply(model)
- class qonnx.transformation.general.GiveReadableTensorNames
Bases:
Transformation
Give more human-readable names to all internal tensors. You should apply GiveUniqueNodeNames prior to this transform to avoid empty node names, as the readable names are based on the node names.
- apply(model)
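Since the readable tensor names are derived from node names, the two passes are typically chained in this order (assuming model is a ModelWrapper instance):

    from qonnx.transformation.general import GiveReadableTensorNames, GiveUniqueNodeNames

    model = model.transform(GiveUniqueNodeNames())
    model = model.transform(GiveReadableTensorNames())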
- class qonnx.transformation.general.GiveUniqueNodeNames(prefix='')
Bases:
Transformation
Give unique names to each node in the graph using enumeration, starting with given prefix (if specified in the constructor).
- apply(model)
- class qonnx.transformation.general.GiveUniqueParameterTensors
Bases:
Transformation
Make every parameter tensor unique. The aim is to avoid affecting other nodes apart from the one the system is currently operating on.
- apply(model)
- class qonnx.transformation.general.MovePadAttributeToTensor
Bases:
Transformation
Move padding info from attribute into input tensor for Pad nodes.
- apply(model)
- class qonnx.transformation.general.RemoveStaticGraphInputs
Bases:
Transformation
Remove any top-level graph inputs that have initializers.
- apply(model)
- class qonnx.transformation.general.RemoveUnusedTensors
Bases:
Transformation
Remove any unused tensors in the graph by removing any initializers, ValueInfo and tensor annotations associated with them. Unused tensors do not appear as input/output for any graph nodes.
- apply(model)
- class qonnx.transformation.general.SortGraph
Bases:
Transformation
Returns the model with its node list sorted topologically. Any ONNX graph to be executed must have a topologically sorted node list, as dictated by the ONNX standard.
- apply(model)
qonnx.transformation.infer_data_layouts
- class qonnx.transformation.infer_data_layouts.InferDataLayouts
Bases:
Transformation
Try to infer data layout annotations for all input/intermediate/output tensors based on inputs and node type.
- apply(model)
qonnx.transformation.infer_datatypes
- class qonnx.transformation.infer_datatypes.InferDataTypes
Bases:
Transformation
Infer QONNX DataType info for all intermediate/output tensors based on inputs and node type.
- apply(model)
- qonnx.transformation.infer_datatypes.infer_mac_result_dtype(idtypes, possible_negation)
- qonnx.transformation.infer_datatypes.is_scaled_int(x)
qonnx.transformation.infer_shapes
- class qonnx.transformation.infer_shapes.InferShapes
Bases:
Transformation
Ensure every tensor in the model has a specified shape (ValueInfo).
- apply(model)
qonnx.transformation.insert_topk
- class qonnx.transformation.insert_topk.InsertTopK(k=5, axis=-1, largest=1, sorted=1)
Bases:
Transformation
Add TopK node at the network output and replace the graph output with the TopK indices.
- apply(model)
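A usage sketch for a classification network, assuming model is a ModelWrapper instance:

    from qonnx.transformation.insert_topk import InsertTopK

    # the graph output now yields the indices of the 5 largest scores
    model = model.transform(InsertTopK(k=5))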
qonnx.transformation.lower_convs_to_matmul
- class qonnx.transformation.lower_convs_to_matmul.LowerConvsToMatMul
Bases:
Transformation
Replace Conv layers with pairs of Im2Col-MatMul layers, plus Transpose layers to keep the original data layout.
- apply(model)
qonnx.transformation.make_input_chanlast
- class qonnx.transformation.make_input_chanlast.MakeInputChannelsLast
Bases:
Transformation
For networks with an input using the NCx data layout, add a transpose node at the beginning and mark the input as using NxC (channels-last).
- apply(model)
qonnx.transformation.merge_onnx_models
- class qonnx.transformation.merge_onnx_models.MergeONNXModels(pre_model)
Bases:
Transformation
Merges two models. The model passed to the transformation (pre_model) will be inserted before the model the transformation is applied on, and the resulting merged model is returned. This transformation tries to connect graph.output[0] of the pre model to graph.input[0] of the post model. If more than one input or output exists, a warning is raised.
- apply(model)
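A usage sketch that prepends a hypothetical preprocessing model:

    from qonnx.core.modelwrapper import ModelWrapper
    from qonnx.transformation.merge_onnx_models import MergeONNXModels

    pre_model = ModelWrapper("preproc.onnx")  # hypothetical filename
    # pre_model is executed first, then the model the transform is applied on
    model = model.transform(MergeONNXModels(pre_model))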
qonnx.transformation.pruning
- class qonnx.transformation.pruning.ApplyMasks(prune_spec: Dict)
Bases:
Transformation
Apply the given sparsity masks in prune_spec to the appropriately named tensors in the model. These masks are only annotations, no actual pruning is performed at this stage.
- apply(model: ModelWrapper) Tuple[ModelWrapper, bool]
- class qonnx.transformation.pruning.PropagateMasks(lossy: bool = True)
Bases:
Transformation
Propagate the sparsity masks in the network to relevant upstream and downstream layers. Some initial sparsity masks must have been applied, either manually or with the ApplyMasks transformation. Note that not all layer types are supported; see the update_node_mask function for details.
- apply(model: ModelWrapper) Tuple[ModelWrapper, bool]
- class qonnx.transformation.pruning.PruneChannels(prune_spec: Dict, lossy: bool = True)
Bases:
Transformation
Prune channels from specified tensors and their dependencies from a model, as specified by the dictionary given in prune_spec. This dictionary must be formatted as {tensor_name : {axis : {channels}}}. See test_pruning.py for examples. If lossy is True, the transformation will aggressively prune all relevant upstream/downstream layers around the specified tensors. This is good for maintaining the consistency of layer shapes, but may introduce a larger accuracy penalty. If lossy is False, the pruning will be more conservative to preserve the numerical ranges (e.g. biases won't be pruned in the downstream layers) but this may lead to inconsistent shapes in the network.
- apply(model: ModelWrapper) Tuple[ModelWrapper, bool]
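A sketch of the prune_spec format with a hypothetical tensor name, assuming model is a ModelWrapper instance:

    from qonnx.transformation.pruning import PruneChannels

    # remove channels 0 and 5 along axis 0 of the (hypothetical) tensor "MatMul_0_param0"
    prune_spec = {"MatMul_0_param0": {0: {0, 5}}}
    model = model.transform(PruneChannels(prune_spec, lossy=True))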
- class qonnx.transformation.pruning.RemoveMaskedChannels(lossy: bool = True)
Bases:
Transformation
Remove channels indicated by sparsity masks on the model. The sparsity mask annotations will be removed after they have been processed for each tensor. Does not perform any shape consistency checking and may result in a broken graph.
- apply(model: ModelWrapper) Tuple[ModelWrapper, bool]
- qonnx.transformation.pruning.ensure_masktype_is_dict(mask)
- qonnx.transformation.pruning.merge_dicts_of_sets(dict1, dict2)
- qonnx.transformation.pruning.remove_masked_tensor_channels(tensor_or_shape, mask, axis)
- qonnx.transformation.pruning.update_node_mask(node, masks_in, masks_out, lossy=True)
qonnx.transformation.qcdq_to_qonnx
- class qonnx.transformation.qcdq_to_qonnx.QCDQToQuant
Bases:
Transformation
Fuse a chain of nodes, specifically QuantizeLinear+DequantizeLinear, back into a QONNX Quant node. This transform finds chains of QuantizeLinear followed by DequantizeLinear and fuses them into a single QONNX Quant node. If a Clip node is found between the QuantizeLinear and DequantizeLinear, it is taken into account for the Quant bitwidth calculation.
Input: a model potentially quantized with QuantizeLinear, (optional) Clip and DequantizeLinear nodes.
Output: a model with QuantizeLinear, Clip and DequantizeLinear chains re-fused into QONNX Quant nodes.
- apply(model: ModelWrapper) Tuple[ModelWrapper, bool]
- qonnx.transformation.qcdq_to_qonnx.extract_elem_type(elem_type: int, clip_range=None) Tuple[int, int, bool]
Return Quant attribute specification based on element type and (optional) clipping range. Returns: (bitwidth, signed, is_narrow_qnt)
qonnx.transformation.qonnx_to_qcdq
- class qonnx.transformation.qonnx_to_qcdq.QuantToQCDQ
Bases:
Transformation
Replace QONNX Quant-style quantization nodes with QuantizeLinear -> Clip -> DequantizeLinear (QCDQ)-style quantization nodes. The following restrictions apply to the Quant nodes:
- the scale, zero-point and bitwidth inputs for Quant must be statically specified by an initializer
- the bitwidth must be an integer in the range [2, 8]
- the zero-point tensor must be zero
- the scale must be a scalar value or 1D tensor
- the rounding_mode attribute must be ROUND
BipolarQuant is not (yet) supported.
- apply(model: ModelWrapper)
qonnx.transformation.quant_constant_folding
- class qonnx.transformation.quant_constant_folding.FoldTransposeIntoQuantInit
Bases:
Transformation
Fuses a Transpose node into the initializers of a Quant node.
- apply(model: ModelWrapper)
- qonnx.transformation.quant_constant_folding.is_quant_init(node: NodeProto, model: ModelWrapper)
qonnx.transformation.quantize_graph
- class qonnx.transformation.quantize_graph.QuantizeGraph(quantnode_map)
Bases:
Transformation
This transformation can be used to introduce a Quant node for a specific type of node in the graph. Users can specify the location of the Quant node by providing the input and output index as parameters.
- Expectations:
ONNX model in the ModelWrapper format.
Model must be cleaned using qonnx.util.cleanup.cleanup_model().
Batch size must be set.
- Steps to transform are:
Step 1: Find the input for the Quant node.
Step 2: Find the consumer of the Quant node output.
Step 3: Find the shape for the output tensor of the Quant node.
Note: The output tensor of the Quant node must have the same shape as the consumer of the input to the Quant node.
- Input:
A dict "quantnode_map" specifying the criterion, positions, and input parameters like scale, bitwidth, zeropoint, and others for a specific Quant node.
- Criterion:
- name: This will allow users to add Quant nodes for specific nodes like "Conv_0" and "Gemm_0". Note: using this, users can have Quant nodes with different parameters, e.g. quantizing "Conv_0" and "Conv_1" with bitwidths of 4 and 6, respectively.
- op_type: This will allow users to add Quant nodes for all nodes of a particular op_type, such as "Conv", "Gemm", and others. Note: all Quant nodes created using the op_type criterion will have the same input parameters (scale, zeropoint, bitwidth, and others).
- name and op_type: In this case, Quant nodes will be added with precedence given to "name" over "op_type".
- Positions: ("input", index) or ("output", index)
"input": indicates that the user wants to quantize the input of the selected node.
"output": indicates that the user wants to quantize the output of the selected node.
index: refers to the input/output index to quantize (a node can have multiple inputs and outputs).
Parameters (to quant node) are provided as (scale, zeropoint, bitwidth, narrow, signed, rounding_mode)
Inputs: scale, zeropoint, bitwidth.
Attributes: narrow, signed, rounding_mode.
- Assert:
The input is a dictionary with node names/op_types as keys and lists of quant positions as values.
The input dictionary must contain at least one MAC node (Conv, Gemm, MatMul) for the transformation.
- Return:
Returns a model with new quant nodes created at the positions specified using the “quantnode_map”.
- Example:
quantnode_map = {
    "name": {
        "Conv_0": [(("input", 0), (1, 0, 8, 0, 1, "ROUND")),
                   (("input", 1), (1, 0, 8, 0, 1, "ROUND")),
                   (("output", 0), (1, 0, 8, 0, 1, "ROUND"))],
        "Conv_1": [(("input", 0), (1, 0, 8, 0, 1, "ROUND"))],
        "Conv_2": [(("input", 1), (1, 0, 8, 0, 1, "ROUND")),
                   (("output", 0), (1, 0, 8, 0, 1, "ROUND"))]},
    "op_type": {
        "Gemm": [(("input", 0), (1, 0, 8, 0, 1, "ROUND")),
                 (("input", 1), (1, 0, 8, 0, 1, "ROUND")),
                 (("input", 2), (1, 0, 8, 0, 1, "ROUND")),
                 (("output", 0), (1, 0, 8, 0, 1, "ROUND"))]}}
- apply(model)
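With the quantnode_map from the example above, the transformation is applied as usual (assuming model is a cleaned ModelWrapper):

    from qonnx.transformation.quantize_graph import QuantizeGraph

    model = model.transform(QuantizeGraph(quantnode_map))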
- qonnx.transformation.quantize_graph.adjust_graph(model, input_positions, node_name, quantized_nodes)
- qonnx.transformation.quantize_graph.create_quantnode(model, quantnode_input, quantnode_output_shape, scale_value, zeropoint_value, bitwidth_value, narrow, signed, rounding_mode)
qonnx.transformation.rebalance_conv
- class qonnx.transformation.rebalance_conv.RebalanceIm2Col(extract_channels)
Bases:
Transformation
For certain hardware that prefers channel parallelism over feature map spatial parallelism, it is possible to reshape the inputs to an Im2Col node to move some of the spatial dimension into the channels dimension. This transformation attempts to find such Im2Col nodes, adds a Reshape node in front of each and alters their kernel/stride sizes accordingly. See the implementation for the full list of conditions checked; one example of rebalancing is provided in the unit test for this transformation (test_rebalance_conv.py).
- apply(model)
qonnx.transformation.remove
- class qonnx.transformation.remove.RemoveIdentityOps(atol=1e-05)
Bases:
Transformation
Remove identity ops like Add/Sub with zero or Mul/Div with one. A tolerance value (defaults to 1e-05) can be specified during init for the comparison to zero/one.
- apply(model)
- class qonnx.transformation.remove.RemoveUnusedNodes
Bases:
Transformation
Remove nodes which do not contribute to any top-level output in the graph, either directly or indirectly.
- apply(model: ModelWrapper)
- qonnx.transformation.remove.remove_node_and_rewire(model, node)
qonnx.transformation.resize_conv_to_deconv
- class qonnx.transformation.resize_conv_to_deconv.ResizeConvolutionToDeconvolution(maintain_bit_width: bool = False)
Bases:
Transformation
Replaces resize convolution layers (e.g., nearest neighbor upsample + same-padded convolution) with deconvolution layers using the weight convolution algorithm. Currently does not support resize convolutions that use bilinear or bicubic upsampling.
- apply(model)
qonnx.transformation.subpixel_to_deconv
- class qonnx.transformation.subpixel_to_deconv.SubPixelToDeconvolution
Bases:
Transformation
Replaces sub-pixel convolution layers (i.e., same-padded convolution + depth2space) with deconvolution layers using the weight shuffle algorithm. Currently does not support same-padded convolutions with biases.
- apply(model)
finn.transformation.move_reshape
- class finn.transformation.move_reshape.RemoveCNVtoFCFlatten
Bases:
Transformation
Removes a flatten node if it is between two fpgadataflow nodes. For an NHWC-Conv to FC transition, the preceding transpose is absorbed. The flatten operation can also be implemented by a reshape node.
- apply(model)