Rusfort ce26d54a5d Pre-Codebase-review commit :3
Codebase deduplication and cleanup refactor

Documentation updated as well

Preferences update

Removed testfiles from repository
2025-05-03 13:19:25 +02:00


Developer Guide: Processing Pipeline

This document details the step-by-step technical process executed by the ProcessingEngine class (processing_engine.py) when processing a single asset. A new instance of ProcessingEngine is created for each processing task to ensure state isolation.

The ProcessingEngine.process() method orchestrates the following pipeline based solely on the provided SourceRule object and the static Configuration object passed during engine initialization. It contains no internal prediction, classification, or fallback logic. All necessary overrides and static configuration values are accessed directly from these inputs.
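
The contract described above can be sketched as follows. This is a minimal illustration, not the real class from processing_engine.py: the dataclasses are simplified stand-ins, `output_base_dir` and `run_task` are hypothetical names, and only `supplier_identifier`, `asset_type`, and `asset_name_override` are taken from this guide.

```python
from dataclasses import dataclass
from pathlib import Path


@dataclass(frozen=True)
class Configuration:
    """Hypothetical stand-in for the static configuration object."""
    output_base_dir: Path


@dataclass
class SourceRule:
    """Hypothetical stand-in; the real SourceRule carries far more data."""
    supplier_identifier: str
    asset_type: str
    asset_name_override: str


class ProcessingEngine:
    """Illustrative skeleton only."""

    def __init__(self, config: Configuration, workspace: Path) -> None:
        self.config = config
        self.workspace = workspace
        self.classified_files: dict = {}  # per-instance state, never shared

    def process(self, rule: SourceRule) -> dict:
        # Every value is read from the rule or the configuration.
        # Nothing is predicted, classified, or guessed here.
        return {
            "supplier": rule.supplier_identifier,
            "asset_type": rule.asset_type,
            "asset_name": rule.asset_name_override,
            "output_base_dir": self.config.output_base_dir,
        }


def run_task(config: Configuration, workspace: Path, rule: SourceRule) -> dict:
    """Create a fresh engine for each task so no state leaks between assets."""
    engine = ProcessingEngine(config, workspace)
    return engine.process(rule)
```

Because each task builds its own engine, anything stored on one instance (such as `classified_files`) is invisible to the next task.
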

The pipeline steps are:

  1. Workspace Preparation (External):

    • Before the ProcessingEngine is invoked, the calling code (e.g., main.ProcessingTask, monitor._process_archive_task) is responsible for setting up a temporary workspace.
    • This typically involves using utils.workspace_utils.prepare_processing_workspace, which creates a temporary directory and extracts the input source (archive or folder) into it.
    • The path to this prepared workspace is passed to the ProcessingEngine during initialization.
  2. Prediction and Rule Generation (External):

    • Also handled before the ProcessingEngine is invoked.
    • The input files are analyzed by the RuleBasedPredictionHandler, the LLMPredictionHandler (triggered by the GUI), or utils.prediction_utils.generate_source_rule_from_archive (used by the Monitor), which generates a SourceRule object.
    • This SourceRule contains predicted classifications and initial overrides.
    • If using the GUI, the user can modify these rules.
    • The final SourceRule object is the primary input to the ProcessingEngine.process() method.
  3. File Inventory (_inventory_and_classify_files):

    • Scans the contents of the already prepared temporary workspace.
    • This step primarily inventories the files present. The classification (determining item_type, etc.) is taken directly from the input SourceRule.
    • Stores the file paths and their associated rules from the SourceRule in self.classified_files.
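
A minimal sketch of the inventory step, assuming the rules can be looked up by workspace-relative path. The `file_rules` mapping and the function name are hypothetical; the real method reads its rules from the SourceRule object.

```python
from pathlib import Path
from typing import Dict, List, Tuple


def inventory_workspace(
    workspace: Path, file_rules: Dict[str, dict]
) -> Tuple[Dict[str, dict], List[str]]:
    """Pair each file found on disk with its rule from the SourceRule.

    `file_rules` is a hypothetical mapping of workspace-relative POSIX paths
    to rule data. Files without a rule are reported, never classified here.
    """
    classified: Dict[str, dict] = {}
    unmatched: List[str] = []
    for path in sorted(workspace.rglob("*")):
        if not path.is_file():
            continue
        rel = path.relative_to(workspace).as_posix()
        if rel in file_rules:
            classified[rel] = file_rules[rel]
        else:
            unmatched.append(rel)
    return classified, unmatched
```

The function only looks files up; it never inspects names or contents to infer a type, which mirrors the rule that classification comes from the SourceRule.
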
  4. Base Metadata Determination (_determine_base_metadata, _determine_single_asset_metadata):

    • Determines the base asset name, category, and archetype using the explicit values provided in the input SourceRule and the static Configuration. Overrides (like supplier_identifier, asset_type, asset_name_override) are taken directly from the SourceRule.
  5. Skip Check:

    • If the overwrite flag is False, checks if the final output directory already exists and contains metadata.json.
    • If so, processing for this asset is skipped.
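
The skip check can be sketched as below. The function name is hypothetical; the logic follows the two bullets above.

```python
from pathlib import Path


def should_skip(output_dir: Path, overwrite: bool) -> bool:
    """Return True when an earlier run already produced metadata.json."""
    if overwrite:
        return False
    # is_file() is also False when output_dir itself does not exist.
    return (output_dir / "metadata.json").is_file()
```
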
  6. Map Processing (_process_maps):

    • Iterates through files classified as maps in the SourceRule.
    • Loads images (cv2.imread).
    • Handles Glossiness-to-Roughness inversion.
    • Resizes images based on Configuration.
    • Determines output bit depth and format based on Configuration and SourceRule.
    • Converts data types and saves images (cv2.imwrite).
    • Calculates image statistics.
    • Stores processed map details.
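
As an illustration of the Glossiness-to-Roughness inversion, here is a minimal NumPy sketch. It assumes unsigned integer images (8-bit or 16-bit) or float images normalized to the 0..1 range; the function name is hypothetical and the real implementation may differ.

```python
import numpy as np


def invert_gloss_to_rough(gloss: np.ndarray) -> np.ndarray:
    """Invert a glossiness map into roughness, preserving the dtype."""
    if np.issubdtype(gloss.dtype, np.integer):
        # Unsigned integer images: reflect around the dtype maximum
        # (255 for uint8, 65535 for uint16).
        return np.iinfo(gloss.dtype).max - gloss
    # Float images are assumed to be normalized to the 0..1 range.
    return (1.0 - gloss).astype(gloss.dtype)
```

Reflecting around the dtype maximum keeps the result in the same bit depth as the input, so the later bit-depth decision is unaffected by the inversion.
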
  7. Map Merging (_merge_maps_from_source):

    • Iterates through MAP_MERGE_RULES in Configuration.
    • Identifies required source maps based on SourceRule.
    • Loads source channels, handling missing inputs with defaults from Configuration or SourceRule.
    • Merges channels (cv2.merge).
    • Determines output format/bit depth and saves the merged map.
    • Stores merged map details.
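
A minimal sketch of channel packing with default values for missing inputs. It uses `np.dstack` as a stand-in for `cv2.merge` so the example needs only NumPy, and it assumes 8-bit single-channel planes of equal size. The function name, parameter names, and channel names are hypothetical.

```python
from typing import Dict, Optional, Sequence, Tuple

import numpy as np


def merge_channels(
    sources: Dict[str, Optional[np.ndarray]],
    order: Sequence[str],
    defaults: Dict[str, int],
    shape: Tuple[int, int],
) -> np.ndarray:
    """Pack single-channel maps into one image, filling gaps with defaults."""
    planes = []
    for name in order:
        plane = sources.get(name)
        if plane is None:
            # Missing input: substitute a constant plane from the defaults.
            plane = np.full(shape, defaults[name], dtype=np.uint8)
        planes.append(plane)
    # Stack the 2D planes along a new last axis: (H, W) -> (H, W, channels).
    return np.dstack(planes)
```
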
  8. Metadata File Generation (_generate_metadata_file):

    • Collects asset metadata, processed/merged map details, ignored files list, etc., primarily from the SourceRule and internal processing results.
    • Writes data to metadata.json in the temporary workspace.
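
Writing the metadata file can be sketched as below. The function name and the example keys are hypothetical; the actual schema of metadata.json is not described in this guide.

```python
import json
from pathlib import Path


def write_metadata(workspace: Path, metadata: dict) -> Path:
    """Serialize the collected metadata to metadata.json in the workspace."""
    target = workspace / "metadata.json"
    with target.open("w", encoding="utf-8") as fh:
        json.dump(metadata, fh, indent=2, sort_keys=True)
    return target
```
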
  9. Output Organization (_organize_output_files):

    • Creates the final structured output directory (<output_base_dir>/<supplier_name>/<asset_name>/), using the supplier name from the SourceRule.
    • Moves processed maps, merged maps, models, metadata, and other classified files from the temporary workspace to the final output directory.
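
A minimal sketch of the output organization step, assuming the files to move are given as workspace-relative paths. The function and parameter names are hypothetical.

```python
import shutil
from pathlib import Path
from typing import Iterable


def organize_output(
    workspace: Path,
    output_base_dir: Path,
    supplier_name: str,
    asset_name: str,
    relative_files: Iterable[str],
) -> Path:
    """Move files into <output_base_dir>/<supplier_name>/<asset_name>/."""
    final_dir = output_base_dir / supplier_name / asset_name
    final_dir.mkdir(parents=True, exist_ok=True)
    for rel in relative_files:
        destination = final_dir / rel
        # Recreate any subfolders the file had inside the workspace.
        destination.parent.mkdir(parents=True, exist_ok=True)
        shutil.move(str(workspace / rel), str(destination))
    return final_dir
```
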
  10. Workspace Cleanup (External):

    • After the ProcessingEngine.process() method returns, whether successfully or with errors, the calling code is responsible for removing the temporary workspace directory created in Step 1. This is typically done in a finally block in the same code that called utils.workspace_utils.prepare_processing_workspace.
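
The caller-side pattern can be sketched as follows. Here `tempfile.mkdtemp` stands in for utils.workspace_utils.prepare_processing_workspace, whose exact signature is not shown in this guide, and `run_with_workspace` is a hypothetical name.

```python
import shutil
import tempfile
from pathlib import Path
from typing import Callable


def run_with_workspace(process: Callable[[Path], None]) -> Path:
    """Create a workspace, run the task, and always remove the workspace."""
    # Stand-in for prepare_processing_workspace (an assumption).
    workspace = Path(tempfile.mkdtemp(prefix="asset_ws_"))
    try:
        process(workspace)
    finally:
        # Runs on success and on error, so no temporary data is left behind.
        shutil.rmtree(workspace, ignore_errors=True)
    return workspace
```

An exception raised by the task still propagates to the caller; the finally block only guarantees that the directory is gone by then.
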
  11. (Optional) Blender Script Execution (External):

    • If triggered (e.g., via CLI arguments or GUI controls), the orchestrating code (e.g., main.ProcessingTask) executes the corresponding Blender scripts (blenderscripts/*.py) using subprocess.run after the ProcessingEngine.process() call completes successfully.
    • Note: Centralized logic for this was intended for utils/blender_utils.py, but this utility has not yet been implemented. See Developer Guide: Blender Integration Internals for more details.
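
Since utils/blender_utils.py does not exist yet, the following is only a sketch of how an orchestrator might invoke a script with `subprocess.run`. The function names, the idea of passing a .blend file, and the example paths are assumptions; `--background` and `--python` are standard Blender command-line options.

```python
import subprocess
from pathlib import Path
from typing import List


def build_blender_command(blender_exe: str, script: Path, blend_file: Path) -> List[str]:
    """Build the argument list for a headless Blender script run."""
    return [blender_exe, "--background", str(blend_file), "--python", str(script)]


def run_blender_script(blender_exe: str, script: Path, blend_file: Path) -> bool:
    """Run the script and report success based on the process exit code."""
    result = subprocess.run(
        build_blender_command(blender_exe, script, blend_file),
        capture_output=True,
        text=True,
        check=False,  # inspect the return code instead of raising
    )
    return result.returncode == 0
```

Passing the arguments as a list avoids shell quoting problems with paths that contain spaces.
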

This pipeline, executed by the ProcessingEngine, provides a clear and explicit processing flow based on the complete rule set provided by the GUI or other interfaces.