Developer Guide: Processing Pipeline

This document details the step-by-step technical process executed by the ProcessingEngine class (processing_engine.py) when processing a single asset. A new instance of ProcessingEngine is created for each processing task to ensure state isolation.

The ProcessingEngine.process() method orchestrates the following pipeline based solely on the provided SourceRule object and the static Configuration object passed during engine initialization. It contains no internal prediction, classification, or fallback logic. All necessary overrides and static configuration values are accessed directly from these inputs.

The pipeline steps are:

  1. Workspace Preparation (External):

    • Before the ProcessingEngine is invoked, the calling code (e.g., main.ProcessingTask, monitor._process_archive_task) is responsible for setting up a temporary workspace.
    • This typically involves using utils.workspace_utils.prepare_processing_workspace, which creates a temporary directory and extracts the input source (archive or folder) into it.
    • The path to this prepared workspace is passed to the ProcessingEngine during initialization.
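
As a rough illustration of this external step, the sketch below creates a temporary directory and fills it from either a folder or an archive using only the standard library. The function name prepare_workspace_sketch is hypothetical; the actual utils.workspace_utils.prepare_processing_workspace may have a different signature and behavior.

```python
import shutil
import tempfile
from pathlib import Path


def prepare_workspace_sketch(source: Path) -> Path:
    """Hypothetical stand-in for prepare_processing_workspace."""
    workspace = Path(tempfile.mkdtemp(prefix="asset_ws_"))
    if source.is_dir():
        # Folder input: copy its contents into the fresh workspace.
        shutil.copytree(source, workspace, dirs_exist_ok=True)
    else:
        # Archive input: the format is inferred from the file extension.
        shutil.unpack_archive(str(source), str(workspace))
    return workspace
```

The returned path is what the caller would then hand to the ProcessingEngine at initialization.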
  2. Prediction and Rule Generation (External):

    • Also handled before the ProcessingEngine is invoked.
    • One of three components analyzes the input files and generates a SourceRule object: the RuleBasedPredictionHandler, the LLMPredictionHandler (triggered by the GUI), or utils.prediction_utils.generate_source_rule_from_archive (used by the Monitor).
    • This SourceRule contains predicted classifications and initial overrides.
    • If using the GUI, the user can modify these rules.
    • The final SourceRule object is the primary input to the ProcessingEngine.process() method.
  3. File Inventory (_inventory_and_classify_files):

    • Scans the contents of the already prepared temporary workspace.
    • This step primarily inventories the files present. The classification (determining item_type, etc.) is taken directly from the input SourceRule. The item_type for each file (within the FileRule objects of the SourceRule) is expected to be a key from Configuration.FILE_TYPE_DEFINITIONS.
    • Stores the file paths and their associated rules from the SourceRule in self.classified_files.
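
The inventory idea can be sketched as follows: walk the workspace, look up each file's item_type in the rules, and reject any type that is not a configured key. This is a simplified sketch, not the engine's code. The flat dictionary of relative paths stands in for the FileRule objects, the FILE_TYPE_DEFINITIONS subset is assumed, and skipping files that have no rule is an assumption made here for illustration.

```python
from pathlib import Path
from typing import Dict

# Assumed subset of Configuration.FILE_TYPE_DEFINITIONS keys (illustrative).
FILE_TYPE_DEFINITIONS = {"MAP_COL": {}, "MAP_ROUGH": {}, "FILE_IGNORE": {}}


def inventory_files(workspace: Path, item_types_by_relpath: Dict[str, str]) -> Dict[str, str]:
    """Pair each workspace file with the item_type supplied by the rules."""
    classified: Dict[str, str] = {}
    for path in sorted(workspace.rglob("*")):
        if not path.is_file():
            continue
        rel = path.relative_to(workspace).as_posix()
        if rel not in item_types_by_relpath:
            # Assumption: files without a rule are left out; nothing is guessed.
            continue
        item_type = item_types_by_relpath[rel]
        if item_type not in FILE_TYPE_DEFINITIONS:
            raise ValueError(f"Unknown item_type {item_type!r} for {rel}")
        classified[rel] = item_type
    return classified
```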
  4. Base Metadata Determination (_determine_base_metadata, _determine_single_asset_metadata):

    • Determines the base asset name, category, and archetype using the explicit values provided in the input SourceRule and the static Configuration. Overrides (like supplier_identifier, asset_type, asset_name_override) are taken directly from the SourceRule. The asset_type (within the AssetRule object of the SourceRule) is expected to be a key from Configuration.ASSET_TYPE_DEFINITIONS.
  5. Skip Check:

    • If the overwrite flag is False, checks if the final output directory already exists and contains metadata.json.
    • If so, processing for this asset is skipped.
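
The skip condition reduces to a small predicate. The function name should_skip is illustrative; only the overwrite flag and the metadata.json check come from the description above.

```python
from pathlib import Path


def should_skip(output_dir: Path, overwrite: bool) -> bool:
    """Return True when existing output should be left untouched."""
    if overwrite:
        return False
    # is_file() is False when the directory itself does not exist.
    return (output_dir / "metadata.json").is_file()
```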
  6. Map Processing (_process_maps):

    • Iterates through files classified as maps in the SourceRule.
    • Loads images (cv2.imread).
    • Glossiness-to-Roughness Inversion:
      • The system identifies a map as a gloss map if its input filename contains "MAP_GLOSS" (case-insensitive).
      • If such a map is intended to become a roughness map (e.g., its item_type or item_type_override in the SourceRule effectively designates it as roughness), its colors are inverted.
      • After inversion, the map is treated as a "MAP_ROUGH" type for subsequent processing steps.
      • This filename-driven approach is the primary mechanism for triggering gloss-to-roughness inversion within the processing engine. For this transformation, it replaces reliance on older contextual flags (such as file_rule.is_gloss_source) and on the general gloss_map_identifiers from the configuration.
    • Resizes images based on Configuration.
    • Determines output bit depth and format based on Configuration and SourceRule.
    • Converts data types and saves images (cv2.imwrite).
    • The output filename uses the standard_type alias (e.g., COL, NRM) retrieved from the Configuration.FILE_TYPE_DEFINITIONS based on the file's effective item_type.
    • Calculates image statistics.
    • Stores processed map details.
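
The gloss-to-roughness inversion described in this step can be sketched with NumPy alone. The helper names are hypothetical, and the sketch assumes unsigned integer images (as cv2.imread typically returns) or float images normalized to the range 0 to 1; the engine's actual handling may differ.

```python
import numpy as np


def is_gloss_filename(filename: str) -> bool:
    """Case-insensitive check for the "MAP_GLOSS" marker in a filename."""
    return "MAP_GLOSS" in filename.upper()


def invert_map(image: np.ndarray) -> np.ndarray:
    """Invert pixel values so a gloss map can serve as a roughness map."""
    if np.issubdtype(image.dtype, np.unsignedinteger):
        # e.g. 255 - x for uint8, 65535 - x for uint16.
        return np.iinfo(image.dtype).max - image
    # Assumption: float images are normalized to [0, 1].
    return 1.0 - image
```

Subtracting from the dtype's maximum keeps the result in the original bit depth, so no precision is lost before the later bit-depth conversion.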
  7. Map Merging (_merge_maps_from_source):

    • Iterates through MAP_MERGE_RULES in Configuration.
    • Identifies required source maps by checking the item_type_override within the SourceRule (specifically in the FileRule for each file). Both item_type and item_type_override are expected to be keys from Configuration.FILE_TYPE_DEFINITIONS. Files with a base item_type of "FILE_IGNORE" are explicitly excluded from consideration.
    • Loads source channels, handling missing inputs with defaults from Configuration or SourceRule.
    • Merges channels (cv2.merge).
    • Determines output format/bit depth and saves the merged map.
    • Stores merged map details.
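
A minimal sketch of channel merging with default fill-in is shown below. It uses np.dstack in place of cv2.merge so that it depends only on NumPy; for single-channel inputs of equal shape the two produce the same H x W x N layout. The function name, the rule layout, and the map types other than MAP_ROUGH are assumptions made for illustration.

```python
from typing import Dict, Sequence, Tuple

import numpy as np


def merge_channels(
    sources: Dict[str, np.ndarray],
    inputs: Sequence[str],
    defaults: Dict[str, int],
    shape: Tuple[int, int],
) -> np.ndarray:
    """Stack one single-channel map per input, filling gaps with defaults."""
    channels = []
    for map_type in inputs:
        channel = sources.get(map_type)
        if channel is None:
            # Missing input: use a constant plane with the configured default.
            channel = np.full(shape, defaults[map_type], dtype=np.uint8)
        channels.append(channel)
    return np.dstack(channels)
```

For example, a hypothetical rule with inputs ("MAP_AO", "MAP_ROUGH", "MAP_METAL") and no metal map available would yield a three-channel image whose last channel holds the default value everywhere.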
  8. Metadata File Generation (_generate_metadata_file):

    • Collects asset metadata, processed/merged map details, ignored files list, etc., primarily from the SourceRule and internal processing results.
    • Writes data to metadata.json in the temporary workspace.
  9. Output Organization (_organize_output_files):

    • Determines the final output directory using the global OUTPUT_DIRECTORY_PATTERN and the final filename using the global OUTPUT_FILENAME_PATTERN (both from the Configuration object). The utils.path_utils module combines these with the base output directory and asset-specific data (like asset name, map type, resolution, etc.) to construct the full path for each file.
    • Creates the final structured output directory (<output_base_dir>/<supplier_name>/<asset_name>/), using the supplier name from the SourceRule.
    • Moves processed maps, merged maps, models, metadata, and other classified files from the temporary workspace to the final output directory.
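
Pattern-based path construction might look like the sketch below. The placeholder syntax (str.format braces), the token names, and the two pattern values are assumptions for illustration; the real patterns and the utils.path_utils API are not shown in this guide.

```python
from pathlib import Path

# Hypothetical pattern values; the real ones live in the Configuration object.
OUTPUT_DIRECTORY_PATTERN = "{supplier}/{asset_name}"
OUTPUT_FILENAME_PATTERN = "{asset_name}_{map_type}_{resolution}.{ext}"


def build_output_path(base_dir: Path, **tokens: str) -> Path:
    """Fill both patterns with asset-specific tokens and join the result."""
    directory = OUTPUT_DIRECTORY_PATTERN.format(**tokens)
    filename = OUTPUT_FILENAME_PATTERN.format(**tokens)
    return base_dir / directory / filename
```

Called with supplier="Acme", asset_name="Oak", map_type="COL", resolution="4K", and ext="png", this sketch produces a path ending in Acme/Oak/Oak_COL_4K.png under the base directory.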
  10. Workspace Cleanup (External):

    • After the ProcessingEngine.process() method completes (successfully or with errors), the calling code is responsible for cleaning up the temporary workspace directory created in Step 1. This is often done in a finally block in the same code that called utils.workspace_utils.prepare_processing_workspace.
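
The cleanup guarantee is the usual try/finally pattern. In the sketch below, run_with_workspace and its callback are hypothetical; the callback stands in for constructing a ProcessingEngine and calling process().

```python
import shutil
import tempfile
from pathlib import Path
from typing import Callable


def run_with_workspace(process: Callable[[Path], None]) -> None:
    """Run a processing callback and always remove its workspace."""
    workspace = Path(tempfile.mkdtemp(prefix="asset_ws_"))
    try:
        process(workspace)
    finally:
        # Runs on success and on error, so no temporary data is left behind.
        shutil.rmtree(workspace, ignore_errors=True)
```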
  11. (Optional) Blender Script Execution (External):

    • If triggered (e.g., via CLI arguments or GUI controls), the orchestrating code (e.g., main.ProcessingTask) executes the corresponding Blender scripts (blenderscripts/*.py) using subprocess.run after the ProcessingEngine.process() call completes successfully.
    • Note: Centralized logic for this was intended for utils/blender_utils.py, but this utility has not yet been implemented. See Developer Guide: Blender Integration Internals for more details.
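
A hedged sketch of how orchestrating code could launch a Blender script with subprocess.run follows. The --background, --python, and -- options are standard Blender command-line arguments; the function names, the executable path, and the script arguments are hypothetical, and the project's actual invocation may differ.

```python
import subprocess
from pathlib import Path
from typing import List, Sequence


def build_blender_command(
    blender_exe: str, blend_file: Path, script: Path, script_args: Sequence[str]
) -> List[str]:
    """Assemble a headless Blender invocation that runs one Python script."""
    return [
        blender_exe,
        "--background",          # run without the UI
        str(blend_file),
        "--python", str(script),
        "--",                    # everything after this is left for the script
        *script_args,
    ]


def run_blender_script(
    blender_exe: str, blend_file: Path, script: Path, script_args: Sequence[str]
) -> subprocess.CompletedProcess:
    """Run the command; check=True raises CalledProcessError on failure."""
    command = build_blender_command(blender_exe, blend_file, script, script_args)
    return subprocess.run(command, check=True, capture_output=True, text=True)
```

Keeping command construction separate from execution makes the invocation easy to log and to test without a Blender installation.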

This pipeline, executed by the ProcessingEngine, provides a clear and explicit processing flow based on the complete rule set provided by the GUI or other interfaces.