# Developer Guide: Processing Pipeline

This document details the step-by-step technical process executed by the `ProcessingEngine` class (`processing_engine.py`) when processing a single asset. A new instance of `ProcessingEngine` is created for each processing task to ensure state isolation.

The `ProcessingEngine.process()` method orchestrates the following pipeline based *solely* on the provided `SourceRule` object and the static `Configuration` object passed during engine initialization. It contains no internal prediction, classification, or fallback logic. All necessary overrides and static configuration values are accessed directly from these inputs.

The pipeline steps are:

1.  **Workspace Setup (`_setup_workspace`)**:
    *   Creates a temporary directory using `tempfile.mkdtemp()` to isolate the processing of the current asset.

2.  **Input Extraction (`_extract_input`)**:
    *   If the input is a supported archive type (`.zip`, `.rar`, `.7z`), it is extracted into the temporary workspace using the appropriate library (`zipfile`, `rarfile`, or `py7zr`).
    *   If the input is a directory, its contents are copied into the temporary workspace.
    *   Includes basic error handling for invalid or password-protected archives.
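
The extraction step can be sketched with the standard library alone. This is a minimal illustration, not the engine's actual code: `extract_input` is a hypothetical name, only the `.zip` and directory branches are shown, and `.rar`/`.7z` inputs would need `rarfile` or `py7zr` as noted above.

```python
import shutil
import zipfile
from pathlib import Path


def extract_input(input_path: Path, workspace: Path) -> None:
    """Populate an existing workspace from a .zip archive or a directory (sketch)."""
    if input_path.is_dir():
        # dirs_exist_ok allows copying into the already-created temp directory.
        shutil.copytree(input_path, workspace, dirs_exist_ok=True)
    elif input_path.suffix.lower() == ".zip":
        try:
            with zipfile.ZipFile(input_path) as archive:
                archive.extractall(workspace)
        except zipfile.BadZipFile as exc:
            raise ValueError(f"Invalid ZIP archive: {input_path}") from exc
        except RuntimeError as exc:
            # zipfile raises RuntimeError when an encrypted member needs a password.
            raise ValueError(f"Could not extract {input_path}: {exc}") from exc
    else:
        raise ValueError(f"Unsupported input type: {input_path}")
```

Wrapping the library-specific exceptions in a single error type keeps the caller's handling uniform regardless of the archive format.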

3.  **File Inventory and Classification (`_inventory_and_classify_files`)**:
    *   Scans the contents of the temporary workspace.
    *   Uses the pre-compiled regex patterns from the loaded `Configuration` object and the explicit rules and predicted classifications from the input `SourceRule` object to classify each file. The classification is based on the data already determined by the `PredictionHandler` and potentially modified by the user in the GUI.
    *   Stores the classification results (including source path, determined map type, potential variant suffix, etc.) in `self.classified_files`.
    *   Sorts potential map variants based on the order provided in the `SourceRule` or static configuration.
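
As a rough illustration of regex-driven classification, the sketch below matches file stems against pre-compiled patterns and extracts an optional numeric variant suffix. The pattern table, the map type keys, and the function names are all hypothetical; the real patterns come from the `Configuration` object and the real decisions from the `SourceRule`.

```python
import re
from pathlib import Path

# Hypothetical patterns for illustration only.
MAP_PATTERNS = {
    "COL": re.compile(r"_(col|color|albedo|diffuse)(?:[-_]?(\d+))?$", re.IGNORECASE),
    "NRM": re.compile(r"_(nrm|normal)(?:[-_]?(\d+))?$", re.IGNORECASE),
    "ROUGH": re.compile(r"_(rough|roughness)(?:[-_]?(\d+))?$", re.IGNORECASE),
}


def classify_file(path: Path) -> dict:
    """Return a classification record for one file, based on its stem."""
    for map_type, pattern in MAP_PATTERNS.items():
        match = pattern.search(path.stem)
        if match:
            return {"source_path": path, "map_type": map_type, "variant": match.group(2)}
    return {"source_path": path, "map_type": None, "variant": None}


def sort_variants(entries: list) -> list:
    """Order entries by numeric variant suffix; entries without a suffix sort first."""
    return sorted(entries, key=lambda entry: int(entry["variant"] or 0))
```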

4.  **Base Metadata Determination (`_determine_base_metadata`, `_determine_single_asset_metadata`)**:
    *   Determines the base asset name, category, and archetype using the explicit values provided in the input `SourceRule` object and the static configuration from the `Configuration` object. Overrides (like `supplier_identifier`, `asset_type`, and `asset_name_override`), including supplier overrides from the GUI, are taken directly from the `SourceRule`.
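
One way to picture how overrides are applied is a simple precedence merge: static values first, then any override present on the rule. The sketch assumes that precedence and uses a hypothetical `SourceRuleSketch` dataclass that carries only the three override fields named above; the real `SourceRule` is richer.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class SourceRuleSketch:
    """Hypothetical stand-in holding only the override fields named above."""
    supplier_identifier: Optional[str] = None
    asset_type: Optional[str] = None
    asset_name_override: Optional[str] = None


def resolve_base_metadata(rule: SourceRuleSketch, config_defaults: dict) -> dict:
    """Start from static configuration values, then apply explicit rule overrides."""
    resolved = dict(config_defaults)
    for field in ("supplier_identifier", "asset_type", "asset_name_override"):
        value = getattr(rule, field)
        if value is not None:
            resolved[field] = value
    return resolved
```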

5.  **Skip Check**:
    *   If the `overwrite` flag (passed during initialization) is `False`, the tool checks if the final output directory for the determined asset name already exists and contains a `metadata.json` file.
    *   If both exist, processing for this specific asset is skipped, marked as "skipped", and the pipeline moves to the next asset (if processing multiple assets from one source) or finishes.
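
The skip condition reduces to two filesystem checks. A minimal sketch, assuming the output layout `<output_base_dir>/<supplier_name>/<asset_name>/` described in the Output Organization step; `should_skip` is a hypothetical helper name.

```python
from pathlib import Path


def should_skip(output_base_dir: Path, supplier_name: str, asset_name: str,
                overwrite: bool) -> bool:
    """Skip only when overwrite is off and a finished asset directory already exists."""
    if overwrite:
        return False
    asset_dir = Path(output_base_dir) / supplier_name / asset_name
    return asset_dir.is_dir() and (asset_dir / "metadata.json").is_file()
```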

6.  **Map Processing (`_process_maps`)**:
    *   Iterates through the files classified as texture maps for the current asset based on the `SourceRule`. Configuration values used in this step, such as target resolutions, bit depth rules, and output format rules, are retrieved directly from the static `Configuration` object or explicit overrides in the `SourceRule`.
    *   Loads the image using `cv2.imread` (handling grayscale and unchanged flags). Converts BGR to RGB internally for consistency; the data is converted back to BGR only when saving to non-EXR formats.
    *   Handles Glossiness-to-Roughness inversion if necessary: loads the gloss map, inverts it as `1.0 - img/norm`, and prioritizes the gloss source if both gloss and roughness maps exist.
    *   Resizes the image to the target resolutions defined in `IMAGE_RESOLUTIONS` (from `Configuration`) using `cv2.resize` (`INTER_LANCZOS4` for downscaling). Checks generally prevent upscaling.
    *   Determines the output bit depth based on `MAP_BIT_DEPTH_RULES` (from `Configuration`) or overrides in the `SourceRule`.
    *   Determines the output file format (`.jpg`, `.png`, `.exr`) based on a hierarchy of rules defined in the `Configuration` or overrides in the `SourceRule`.
    *   Converts the NumPy array data type appropriately before saving (e.g., float to uint8/uint16 with scaling).
    *   Saves the processed map using `cv2.imwrite` (converting RGB back to BGR if saving to non-EXR formats). Includes fallback logic (e.g., attempting PNG if saving 16-bit EXR fails).
    *   Calculates image statistics (Min/Max/Mean) using `_calculate_image_stats` on normalized float64 data at the resolution given by `CALCULATE_STATS_RESOLUTION` (from `Configuration`).
    *   Determines the aspect ratio change string (e.g., `"EVEN"`, `"X150"`) using `_normalize_aspect_ratio_change`.
    *   Stores details about each processed map (path, resolution, format, stats, etc.) in `processed_maps_details_asset`.
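
The numeric parts of this step (gloss inversion, data type conversion before saving, and statistics) can be sketched with NumPy alone. The helper names below are hypothetical and the sketch assumes the only integer inputs are unsigned; image loading, resizing, and saving via `cv2` are deliberately left out.

```python
import numpy as np


def to_float01(img: np.ndarray) -> np.ndarray:
    """Normalize an unsigned-integer image to float32 in [0, 1]; floats pass through."""
    if np.issubdtype(img.dtype, np.integer):
        return img.astype(np.float32) / np.iinfo(img.dtype).max
    return img.astype(np.float32)


def gloss_to_roughness(gloss: np.ndarray) -> np.ndarray:
    """Invert a gloss map: roughness = 1.0 - img / norm."""
    return 1.0 - to_float01(gloss)


def to_output_dtype(img01: np.ndarray, bit_depth: int) -> np.ndarray:
    """Scale normalized float data to uint8 or uint16 with rounding and clipping."""
    if bit_depth not in (8, 16):
        raise ValueError(f"Unsupported bit depth: {bit_depth}")
    max_value = 255 if bit_depth == 8 else 65535
    dtype = np.uint8 if bit_depth == 8 else np.uint16
    return np.rint(np.clip(img01, 0.0, 1.0) * max_value).astype(dtype)


def image_stats(img01: np.ndarray) -> dict:
    """Compute min/max/mean on float64 data, as the stats step describes."""
    data = img01.astype(np.float64)
    return {"min": float(data.min()), "max": float(data.max()), "mean": float(data.mean())}
```

Clipping before the cast matters because values slightly outside 0-1 would otherwise wrap around when converted to an unsigned integer type.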

7.  **Map Merging (`_merge_maps_from_source`)**:
    *   Iterates through the `MAP_MERGE_RULES` defined in the `Configuration`.
    *   Identifies the required *source* map files needed as input for each merge rule based on the classified files in the `SourceRule`.
    *   Determines common resolutions available across the required input maps.
    *   Loads the necessary source map channels for each common resolution (using a helper `_load_and_transform_source` which includes caching).
    *   Converts inputs to normalized float32 (0-1).
    *   Injects default channel values (from rule `defaults` in `Configuration` or overrides in `SourceRule`) if an input channel is missing.
    *   Merges channels using `cv2.merge`.
    *   Determines output bit depth and format based on rules in `Configuration` or overrides in `SourceRule`. If a rule would produce 16-bit JPG output, the conflict is resolved by forcing 8-bit.
    *   Saves the merged map using the `_save_image` helper (includes data type/color space conversions and fallback).
    *   Stores details about each merged map in `merged_maps_details_asset`.
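
The core of the merge logic (common resolutions, default channel injection, channel stacking, and the JPG bit depth guard) can be sketched as follows. All function names and the rule shape are hypothetical, and `np.dstack` is swapped in for `cv2.merge` to keep the sketch free of the OpenCV dependency; both stack 2D planes into one multi-channel array.

```python
import numpy as np


def common_resolutions(available: dict) -> set:
    """Intersect the resolution keys available for every required input map."""
    sets = [set(keys) for keys in available.values()]
    return set.intersection(*sets) if sets else set()


def merge_channels(inputs: dict, channel_order: list, defaults: dict,
                   shape: tuple) -> np.ndarray:
    """Stack normalized float32 planes, filling missing channels with defaults."""
    planes = []
    for channel in channel_order:
        plane = inputs.get(channel)
        if plane is None:
            plane = np.full(shape, defaults[channel], dtype=np.float32)
        planes.append(np.asarray(plane, dtype=np.float32))
    return np.dstack(planes)


def resolve_bit_depth(output_format: str, requested_bits: int) -> int:
    """Force 8-bit when a 16-bit request targets JPG output."""
    if output_format.lower().lstrip(".") in {"jpg", "jpeg"} and requested_bits > 8:
        return 8
    return requested_bits
```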

8.  **Metadata File Generation (`_generate_metadata_file`)**:
    *   Collects all determined information for the current asset: base metadata, details from `processed_maps_details_asset` and `merged_maps_details_asset`, list of ignored files, source preset used, etc. This information is derived from the input `SourceRule` and the processing results.
    *   Writes this collected data into the `metadata.json` file within the temporary workspace using `json.dump`.
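
Writing the metadata file is a plain `json.dump` call. In the sketch below, the function name and the key names (`processed_maps`, `merged_maps`, `ignored_files`) are illustrative assumptions; the actual schema is defined by the engine.

```python
import json
from pathlib import Path


def write_metadata(workspace: Path, base_metadata: dict, processed_maps: dict,
                   merged_maps: dict, ignored_files: list) -> Path:
    """Collect per-asset results and write metadata.json into the workspace."""
    metadata = {
        **base_metadata,
        "processed_maps": processed_maps,
        "merged_maps": merged_maps,
        "ignored_files": ignored_files,
    }
    path = Path(workspace) / "metadata.json"
    with path.open("w", encoding="utf-8") as handle:
        json.dump(metadata, handle, indent=4)
    return path
```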

9.  **Output Organization (`_organize_output_files`)**:
    *   Creates the final structured output directory: `<output_base_dir>/<supplier_name>/<asset_name>/`. The `supplier_name` used here is derived from the `SourceRule`, ensuring that supplier overrides from the GUI are respected in the output path.
    *   Creates subdirectories `Extra/`, `Unrecognised/`, and `Ignored/` within the asset directory.
    *   Moves the processed maps, merged maps, model files, `metadata.json`, and files classified as Extra, Unrecognised, or Ignored from the temporary workspace into their respective locations in the final output directory structure.
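
A minimal sketch of the directory creation and file moves, using only `pathlib` and `shutil`. The function name and the `file_buckets` argument (a mapping from subdirectory name, or `""` for the asset root, to workspace-relative file names) are assumptions made for illustration.

```python
import shutil
from pathlib import Path

SUBDIRS = ("Extra", "Unrecognised", "Ignored")


def organize_output(workspace: Path, output_base_dir: Path, supplier_name: str,
                    asset_name: str, file_buckets: dict) -> Path:
    """Create the asset directory tree and move workspace files into it."""
    asset_dir = Path(output_base_dir) / supplier_name / asset_name
    asset_dir.mkdir(parents=True, exist_ok=True)
    for sub in SUBDIRS:
        (asset_dir / sub).mkdir(exist_ok=True)
    for bucket, names in file_buckets.items():
        if bucket and bucket not in SUBDIRS:
            raise ValueError(f"Unknown output bucket: {bucket}")
        dest_dir = asset_dir / bucket if bucket else asset_dir
        for name in names:
            source = Path(workspace) / name
            shutil.move(str(source), str(dest_dir / source.name))
    return asset_dir
```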

10. **Workspace Cleanup (`_cleanup_workspace`)**:
    *   Removes the temporary workspace directory and its contents using `shutil.rmtree()`. This is called within a `finally` block to ensure cleanup is attempted even if errors occur during processing.
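
The workspace lifecycle from steps 1 and 10 follows the usual `try`/`finally` shape. The wrapper below is a hypothetical sketch: `work` stands in for the pipeline steps in between, and the `asset_proc_` prefix is an arbitrary choice.

```python
import shutil
import tempfile
from pathlib import Path


def process_with_workspace(work):
    """Run `work(workspace)` in a fresh temp directory that is always removed."""
    workspace = Path(tempfile.mkdtemp(prefix="asset_proc_"))
    try:
        return work(workspace)
    finally:
        # ignore_errors keeps a cleanup failure from masking the original exception.
        shutil.rmtree(workspace, ignore_errors=True)
```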

11. **(Optional) Blender Script Execution**:
    *   This step is triggered via CLI arguments (`--nodegroup-blend`, `--materials-blend`) or GUI controls. The orchestrator (`main.py` or `gui/processing_handler.py`), not the engine, executes the corresponding Blender scripts (`blenderscripts/*.py`) using `subprocess.run` after the `ProcessingEngine.process()` call completes successfully for an asset batch. See `Developer Guide: Blender Integration Internals` for more details.
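
For orientation, a Blender script is typically launched headlessly with Blender's `--background` and `--python` options, with script arguments after `--`. The sketch below assumes that invocation style; the function names, the executable path, and the argument layout are hypothetical, and the orchestrator's real command may differ.

```python
import subprocess
from pathlib import Path


def build_blender_command(blender_exe: Path, blend_file: Path, script: Path,
                          script_args: list) -> list:
    """Build the argument list for a headless Blender run (assumed layout)."""
    return [
        str(blender_exe),
        "--background", str(blend_file),   # open the .blend without the UI
        "--python", str(script),           # run the given script
        "--", *map(str, script_args),      # everything after "--" goes to the script
    ]


def run_blender_script(blender_exe: Path, blend_file: Path, script: Path,
                       script_args: list) -> subprocess.CompletedProcess:
    """Execute the command; check=True raises CalledProcessError on failure."""
    command = build_blender_command(blender_exe, blend_file, script, script_args)
    return subprocess.run(command, check=True, capture_output=True, text=True)
```

Passing the command as a list rather than a shell string avoids quoting problems with paths that contain spaces.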

This pipeline, executed by the `ProcessingEngine`, provides a clear and explicit processing flow based on the complete rule set provided by the GUI or other interfaces.