Developer Guide: Processing Pipeline

This document details the step-by-step technical process executed by the ProcessingEngine class (processing_engine.py) when processing a single asset. A new instance of ProcessingEngine is created for each processing task to ensure state isolation.

The ProcessingEngine.process() method orchestrates the following pipeline based solely on the provided SourceRule object and the static Configuration object passed during engine initialization. It contains no internal prediction, classification, or fallback logic. All necessary overrides and static configuration values are accessed directly from these inputs.

The pipeline steps are:

  1. Workspace Setup (_setup_workspace):

    • Creates a temporary directory using tempfile.mkdtemp() to isolate the processing of the current asset.
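A minimal sketch of this step (the standalone function name and prefix here are illustrative; the real helper lives in processing_engine.py):

```python
import tempfile

# Illustrative sketch of _setup_workspace; the prefix is an assumption.
def setup_workspace(prefix: str = "procengine_") -> str:
    # mkdtemp creates a uniquely named directory, readable and writable
    # only by the creating user, isolating this asset's intermediate files.
    return tempfile.mkdtemp(prefix=prefix)
```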
  2. Input Extraction (_extract_input):

    • If the input is a supported archive type (.zip, .rar, .7z), it's extracted into the temporary workspace using the appropriate library (zipfile, rarfile, or py7zr).
    • If the input is a directory, its contents are copied into the temporary workspace.
    • Includes basic error handling for invalid or password-protected archives.
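A hedged sketch of the extraction dispatch (only the stdlib zipfile path and the directory-copy path are shown; the rarfile and py7zr branches follow the same pattern):

```python
import os
import shutil
import zipfile

# Sketch of _extract_input: dispatch on input type, with basic error
# handling for invalid or password-protected zip archives.
def extract_input(source: str, workspace: str) -> None:
    if os.path.isdir(source):
        # Directory input: copy its contents into the workspace.
        shutil.copytree(source, workspace, dirs_exist_ok=True)
    elif source.lower().endswith(".zip"):
        try:
            with zipfile.ZipFile(source) as zf:
                zf.extractall(workspace)
        except (zipfile.BadZipFile, RuntimeError) as exc:
            # zipfile raises RuntimeError for password-protected members.
            raise ValueError(f"Cannot extract {source}: {exc}") from exc
    else:
        raise ValueError(f"Unsupported input: {source}")
```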
  3. Prediction and Rule Generation (Handled Externally):

    • Before the ProcessingEngine is invoked, either the PredictionHandler (rule-based) or the LLMPredictionHandler (LLM-based) is used (typically triggered by the GUI) to analyze the input files and generate a SourceRule object.
    • This SourceRule object contains the predicted classifications (item_type, asset_type, etc.) and any initial overrides based on the chosen prediction method (preset rules or LLM interpretation).
    • The GUI allows the user to review and modify these predicted rules before processing begins.
    • The final, potentially user-modified, SourceRule object is the primary input to the ProcessingEngine.
  4. File Inventory (_inventory_and_classify_files):

    • Scans the contents of the temporary workspace.
    • This step primarily inventories the files present. Classification itself (determining item_type, etc.) has already been performed by the external prediction handler and is stored in the input SourceRule; the engine simply uses those classifications.
    • Stores the file paths and their associated rules from the SourceRule in self.classified_files.
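The inventory step can be sketched as a walk over the workspace that attaches each file to its rule. This is illustrative only: the SourceRule's per-file classifications are modeled as a plain dict (filename → classification), not the real attribute layout:

```python
import os

# Sketch of the inventory step; rule_classifications is a stand-in for the
# per-file classifications carried by the SourceRule.
def inventory_files(workspace: str, rule_classifications: dict) -> dict:
    classified = {}
    for root, _dirs, names in os.walk(workspace):
        for name in names:
            path = os.path.join(root, name)
            # Files without an entry in the rule fall back to "Unrecognised".
            classified[path] = rule_classifications.get(name, "Unrecognised")
    return classified
```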
  5. Base Metadata Determination (_determine_base_metadata, _determine_single_asset_metadata):

    • Determines the base asset name, category, and archetype using the explicit values provided in the input SourceRule object and the static configuration from the Configuration object. Overrides (such as supplier_identifier, asset_type, and asset_name_override), including supplier overrides from the GUI, are taken directly from the SourceRule.
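The precedence involved (explicit SourceRule overrides win over static Configuration defaults) can be sketched as follows. All field names here are assumptions for illustration, not the real SourceRule or Configuration attributes:

```python
# Hypothetical sketch of override precedence: values set on the SourceRule
# take priority, with the static Configuration supplying defaults.
def determine_base_metadata(rule: dict, config: dict) -> dict:
    return {
        "asset_name": rule.get("asset_name_override") or rule["detected_name"],
        "supplier": rule.get("supplier_identifier") or config["default_supplier"],
        "asset_type": rule.get("asset_type") or config["default_asset_type"],
    }
```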
  6. Skip Check:

    • If the overwrite flag (passed during initialization) is False, the tool checks if the final output directory for the determined asset name already exists and contains a metadata.json file.
    • If both exist, processing for this specific asset is skipped, marked as "skipped", and the pipeline moves to the next asset (if processing multiple assets from one source) or finishes.
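The check reduces to a small predicate (the function name is illustrative; the metadata.json filename comes from the text above):

```python
import os

# Sketch of the skip check: skip only when overwrite is off AND the final
# output directory already contains a metadata.json file.
def should_skip(output_dir: str, overwrite: bool) -> bool:
    if overwrite:
        return False
    return os.path.isfile(os.path.join(output_dir, "metadata.json"))
```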
  7. Map Processing (_process_maps):

    • Iterates through the files classified as texture maps for the current asset based on the SourceRule. Configuration values used in this step, such as target resolutions, bit depth rules, and output format rules, are retrieved directly from the static Configuration object or explicit overrides in the SourceRule.
    • Loads the image using cv2.imread (handling grayscale and unchanged flags). Converts BGR to RGB internally for consistency (except for saving non-EXR formats).
    • Handles Glossiness-to-Roughness inversion where needed: loads the gloss map, inverts it as 1.0 minus the normalized image, and prioritizes the gloss source if both gloss and roughness maps exist.
    • Resizes the image to the target resolutions defined in IMAGE_RESOULTIONS (from Configuration) using cv2.resize (INTER_LANCZOS4 for downscaling); resolution checks generally prevent upscaling.
    • Determines the output bit depth based on MAP_BIT_DEPTH_RULES (from Configuration) or overrides in the SourceRule.
    • Determines the output file format (.jpg, .png, .exr) based on a hierarchy of rules defined in the Configuration or overrides in the SourceRule.
    • Converts the NumPy array data type appropriately before saving (e.g., float to uint8/uint16 with scaling).
    • Saves the processed map using cv2.imwrite (converting RGB back to BGR if saving to non-EXR formats). Includes fallback logic (e.g., attempting PNG if saving 16-bit EXR fails).
    • Calculates image statistics (Min/Max/Mean) using _calculate_image_stats on normalized float64 data for the CALCULATE_STATS_RESOLUTION (from Configuration).
    • Determines the aspect ratio change string (e.g., "EVEN", "X150") using _normalize_aspect_ratio_change.
    • Stores details about each processed map (path, resolution, format, stats, etc.) in processed_maps_details_asset.
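Two of the sub-steps above, the gloss inversion and the pre-save data type conversion, can be sketched in NumPy alone (the engine itself works on cv2-loaded arrays, but the arithmetic is the same):

```python
import numpy as np

# NumPy-only sketches; function names are illustrative.
def gloss_to_roughness(gloss: np.ndarray) -> np.ndarray:
    # Normalize integer pixel data to 0-1 by the dtype's max, then invert.
    info = np.iinfo(gloss.dtype)
    normalized = gloss.astype(np.float64) / info.max
    return 1.0 - normalized

def float_to_uint(img: np.ndarray, bit_depth: int) -> np.ndarray:
    # Scale normalized float data to 8- or 16-bit integers before saving.
    dtype = np.uint8 if bit_depth == 8 else np.uint16
    max_val = np.iinfo(dtype).max
    return np.clip(np.round(img * max_val), 0, max_val).astype(dtype)
```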
  8. Map Merging (_merge_maps_from_source):

    • Iterates through the MAP_MERGE_RULES defined in the Configuration.
    • Identifies the required source map files needed as input for each merge rule based on the classified files in the SourceRule.
    • Determines common resolutions available across the required input maps.
    • Loads the necessary source map channels for each common resolution (using a helper _load_and_transform_source which includes caching).
    • Converts inputs to normalized float32 (0-1).
    • Injects default channel values (from rule defaults in Configuration or overrides in SourceRule) if an input channel is missing.
    • Merges channels using cv2.merge.
    • Determines output bit depth and format based on rules in Configuration or overrides in SourceRule. Handles potential JPG 16-bit conflict by forcing 8-bit.
    • Saves the merged map using the _save_image helper (includes data type/color space conversions and fallback).
    • Stores details about each merged map in merged_maps_details_asset.
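The core of the merge, combining channels and injecting defaults for missing inputs, can be sketched as below. The rule format (a list of (map_name, default) pairs) is an assumption, and np.stack on the last axis stands in for cv2.merge:

```python
import numpy as np

# Sketch of channel merging with default injection; names are illustrative.
def merge_channels(sources: dict, rule: list, shape: tuple) -> np.ndarray:
    channels = []
    for map_name, default in rule:
        if map_name in sources:
            # Inputs are normalized float32 (0-1) per the step above.
            channels.append(sources[map_name].astype(np.float32))
        else:
            # Missing input: inject the rule's default as a flat channel.
            channels.append(np.full(shape, default, dtype=np.float32))
    return np.stack(channels, axis=-1)
```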
  9. Metadata File Generation (_generate_metadata_file):

    • Collects all determined information for the current asset: base metadata, details from processed_maps_details_asset and merged_maps_details_asset, list of ignored files, source preset used, etc. This information is derived from the input SourceRule and the processing results.
    • Writes this collected data into the metadata.json file within the temporary workspace using json.dump.
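The serialization itself is straightforward json.dump into the workspace (the payload keys here are illustrative; the real file aggregates the full set of fields listed above):

```python
import json
import os

# Sketch of metadata serialization; key names are examples only.
def write_metadata(workspace: str, asset_meta: dict, maps: list) -> str:
    payload = {"asset": asset_meta, "maps": maps}
    path = os.path.join(workspace, "metadata.json")
    with open(path, "w", encoding="utf-8") as fh:
        json.dump(payload, fh, indent=2)
    return path
```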
  10. Output Organization (_organize_output_files):

    • Creates the final structured output directory: <output_base_dir>/<supplier_name>/<asset_name>/. The supplier_name used here is derived from the SourceRule, ensuring that supplier overrides from the GUI are respected in the output path.
    • Creates subdirectories Extra/, Unrecognised/, and Ignored/ within the asset directory.
    • Moves the processed maps, merged maps, model files, metadata.json, and files classified as Extra, Unrecognised, or Ignored from the temporary workspace into their respective locations in the final output directory structure.
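A sketch of the layout logic: the subdirectory names come from the text above, while the classification-to-destination mapping is an assumption for illustration:

```python
import os
import shutil

# Sketch of _organize_output_files: build <out_base>/<supplier>/<asset>/,
# create the fixed subdirectories, then move files by classification.
def organize_output(workspace: str, out_base: str, supplier: str,
                    asset: str, classified: dict) -> str:
    asset_dir = os.path.join(out_base, supplier, asset)
    for sub in ("Extra", "Unrecognised", "Ignored"):
        os.makedirs(os.path.join(asset_dir, sub), exist_ok=True)
    for path, kind in classified.items():
        # Extra/Unrecognised/Ignored files go to their subdirectory;
        # everything else (maps, metadata.json) goes to the asset root.
        if kind in ("Extra", "Unrecognised", "Ignored"):
            dest = os.path.join(asset_dir, kind)
        else:
            dest = asset_dir
        shutil.move(path, os.path.join(dest, os.path.basename(path)))
    return asset_dir
```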
  11. Workspace Cleanup (_cleanup_workspace):

    • Removes the temporary workspace directory and its contents using shutil.rmtree(). This is called within a finally block to ensure cleanup is attempted even if errors occur during processing.
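The try/finally pattern guarantees the workspace is removed even when a pipeline step raises (a generic sketch; the real engine wraps its own process() body this way):

```python
import shutil
import tempfile

# Sketch of the cleanup guarantee: the finally block runs whether
# process_fn returns normally or raises.
def process_with_cleanup(process_fn):
    workspace = tempfile.mkdtemp()
    try:
        return process_fn(workspace)
    finally:
        shutil.rmtree(workspace, ignore_errors=True)
```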
  12. (Optional) Blender Script Execution:

    • If triggered via CLI arguments (--nodegroup-blend, --materials-blend) or GUI controls, the orchestrator (main.py or gui/processing_handler.py) executes the corresponding Blender scripts (blenderscripts/*.py) using subprocess.run after the ProcessingEngine.process() call completes successfully for an asset batch. See Developer Guide: Blender Integration Internals for more details.
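The orchestrator's shell-out can be sketched as below. The --background and --python flags are Blender's standard headless CLI; the executable and file paths are placeholders, and the exact invocation in main.py / gui/processing_handler.py may differ:

```python
import subprocess

# Sketch: build the headless Blender invocation, then execute it.
def build_blender_command(blender_exe, blend_file, script_path):
    return [blender_exe, "--background", blend_file, "--python", script_path]

def run_blender_script(blender_exe, blend_file, script_path):
    cmd = build_blender_command(blender_exe, blend_file, script_path)
    # check=True raises CalledProcessError if Blender exits non-zero.
    return subprocess.run(cmd, capture_output=True, text=True, check=True)
```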

This pipeline, executed by the ProcessingEngine, provides a clear and explicit processing flow based on the complete rule set provided by the GUI or other interfaces.