2025-04-29 18:26:13 +02:00

Developer Guide: Processing Pipeline

This document details the step-by-step technical process executed by the AssetProcessor class (asset_processor.py) when processing a single asset.

The AssetProcessor.process() method orchestrates the following pipeline:

  1. Workspace Setup (_setup_workspace):

    • Creates a temporary directory using tempfile.mkdtemp() to isolate the processing of the current asset.
  2. Input Extraction (_extract_input):

    • If the input is a supported archive type (.zip, .rar, .7z), it's extracted into the temporary workspace using the appropriate library (zipfile, rarfile, or py7zr).
    • If the input is a directory, its contents are copied into the temporary workspace.
    • Includes basic error handling for invalid or password-protected archives.
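
The extraction step above can be sketched roughly as follows. This is a minimal illustration, not the actual _extract_input code: the function name extract_input and its error messages are hypothetical, and only the standard-library zipfile branch is shown (the .rar and .7z branches that use rarfile and py7zr are left out).

```python
import shutil
import zipfile
from pathlib import Path


def extract_input(input_path: Path, workspace: Path) -> None:
    """Hypothetical stand-in for _extract_input: fill `workspace` from `input_path`."""
    if input_path.is_dir():
        # Directory input: copy its contents into the already existing workspace.
        shutil.copytree(input_path, workspace, dirs_exist_ok=True)
    elif input_path.suffix.lower() == ".zip":
        try:
            with zipfile.ZipFile(input_path) as archive:
                archive.extractall(workspace)
        except zipfile.BadZipFile as exc:
            raise ValueError(f"Invalid ZIP archive: {input_path}") from exc
        except RuntimeError as exc:
            # zipfile raises RuntimeError for encrypted entries without a password.
            raise ValueError(f"Could not extract (password-protected?): {input_path}") from exc
    else:
        # .rar and .7z inputs would be dispatched to rarfile / py7zr here (not shown).
        raise ValueError(f"Unsupported input type: {input_path}")
```
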
  3. File Inventory and Classification (_inventory_and_classify_files):

    • Scans the contents of the temporary workspace.
    • Uses the pre-compiled regex patterns from the loaded Configuration object to classify each file.
    • Classification follows a multi-pass approach for priority:
      • Explicitly marked Extra/ files (using move_to_extra_patterns regex).
      • Model files (using model_patterns regex).
      • Potential Texture Maps (matching map_type_mapping keyword patterns).
      • Standalone 16-bit variants check (using bit_depth_variants patterns).
      • Prioritization of 16-bit variants over their 8-bit counterparts (marking the 8-bit version as Ignored).
      • Final classification of remaining potential maps.
      • Remaining files are classified as Unrecognised (and typically moved to Extra/ later).
    • Stores the classification results (including source path, determined map type, potential variant suffix, etc.) in self.classified_files.
    • Sorts potential map variants based on preset rule order, keyword order within the rule, and finally alphabetical path to determine suffix assignment priority (-1, -2, etc.).
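
The priority passes and the suffix ordering described above might look like the sketch below. All patterns, map type names, and function names here are illustrative assumptions (the real patterns come from the loaded Configuration), and the 16-bit variant passes are omitted for brevity.

```python
import re
from pathlib import PurePath

# Illustrative patterns only; the real ones are compiled from the Configuration.
EXTRA_PATTERNS = [re.compile(r"preview", re.IGNORECASE)]
MODEL_PATTERNS = [re.compile(r"\.(fbx|obj)$", re.IGNORECASE)]
# Keyword order within a rule is used as a sort key for variant suffixes.
MAP_KEYWORDS = [("COL", ["albedo", "color"]), ("NRM", ["normal"])]


def classify(paths):
    """Return (category, map_type, path) tuples; the first matching pass wins."""
    results = []
    for path in paths:
        name = PurePath(path).name
        lowered = name.lower()
        if any(p.search(name) for p in EXTRA_PATTERNS):
            results.append(("Extra", None, path))
        elif any(p.search(name) for p in MODEL_PATTERNS):
            results.append(("Model", None, path))
        else:
            map_type = next(
                (mt for mt, kws in MAP_KEYWORDS if any(k in lowered for k in kws)),
                None,
            )
            results.append(("Map" if map_type else "Unrecognised", map_type, path))
    return results


def assign_suffixes(map_paths, map_type):
    """Order variants by keyword order, then path, and number them -1, -2, ..."""
    keywords = dict(MAP_KEYWORDS)[map_type]

    def sort_key(path):
        lowered = PurePath(path).name.lower()
        keyword_index = next(i for i, k in enumerate(keywords) if k in lowered)
        return (keyword_index, path)

    ordered = sorted(map_paths, key=sort_key)
    return {path: f"{map_type}-{i}" for i, path in enumerate(ordered, start=1)}
```

Because each file falls through the passes in a fixed order, a preview image whose name also contains a map keyword would still end up in Extra in this sketch.
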
  4. Base Metadata Determination (_determine_base_metadata, _determine_single_asset_metadata):

    • Determines the base asset name using source_naming_convention rules from the Configuration (separators, indices), with fallbacks to common prefixes or the input name. Handles multiple distinct assets within a single input source.
    • Determines the asset category (Texture, Asset, Decal) based on the presence of model files or decal_keywords in the Configuration.
    • Determines the asset archetype (e.g., Wood, Metal) by matching keywords from archetype_rules (in Configuration) against file stems or the determined base name.
    • Stores this preliminary metadata.
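
As a rough illustration of the naming and category rules above, the sketch below splits a file stem on a separator and picks one index, with a fallback. The function names, default values, and the assumption that the presence of a model file maps to the "Asset" category are hypothetical; the actual rules live in the Configuration.

```python
def base_name_from_stem(stem, separator="_", index=0, fallback="asset"):
    """Pick the part of `stem` at `index` after splitting on `separator`."""
    parts = stem.split(separator)
    if 0 <= index < len(parts) and parts[index]:
        return parts[index]
    # Fall back (e.g. to the input name) when the convention does not apply.
    return fallback


def determine_category(has_model, filenames, decal_keywords):
    """Assumed mapping: model present -> Asset, decal keyword -> Decal, else Texture."""
    if has_model:
        return "Asset"
    if any(k.lower() in name.lower() for name in filenames for k in decal_keywords):
        return "Decal"
    return "Texture"
```
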
  5. Skip Check:

    • If the overwrite flag (passed during initialization) is False, the tool checks if the final output directory for the determined asset name already exists and contains a metadata.json file.
    • If both exist, processing for this specific asset is skipped, marked as "skipped", and the pipeline moves to the next asset (if processing multiple assets from one source) or finishes.
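
The skip condition can be expressed compactly. The helper name should_skip is hypothetical; the directory layout follows the <output_base_dir>/<supplier_name>/<asset_name>/ structure described in the Output Organization step.

```python
from pathlib import Path


def should_skip(output_base_dir, supplier_name, asset_name, overwrite):
    """Skip only when overwrite is off and a finished asset directory exists."""
    if overwrite:
        return False
    asset_dir = Path(output_base_dir) / supplier_name / asset_name
    # Both conditions must hold: the directory exists and holds metadata.json.
    return asset_dir.is_dir() and (asset_dir / "metadata.json").is_file()
```
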
  6. Map Processing (_process_maps):

    • Iterates through the files classified as texture maps for the current asset.
    • Loads the image using cv2.imread (with the grayscale or unchanged flag, as appropriate). Converts BGR to RGB internally for consistency; the data is converted back to BGR only when saving to non-EXR formats.
    • Handles Glossiness-to-Roughness inversion where needed: loads the gloss map, inverts it as 1.0 - (img / norm), and prefers the gloss source if both gloss and roughness maps exist.
    • Resizes the image to the target resolutions defined in IMAGE_RESOLUTIONS (from Configuration) using cv2.resize (INTER_LANCZOS4 for downscaling). Checks generally prevent upscaling.
    • Determines the output bit depth based on MAP_BIT_DEPTH_RULES (respect vs force_8bit).
    • Determines the output file format (.jpg, .png, .exr) based on a hierarchy of rules:
      • FORCE_LOSSLESS_MAP_TYPES list (overrides other logic).
      • RESOLUTION_THRESHOLD_FOR_JPG (forces JPG for large 8-bit maps).
      • Source format, target bit depth, and configured defaults (OUTPUT_FORMAT_16BIT_PRIMARY, OUTPUT_FORMAT_8BIT).
    • Converts the NumPy array data type appropriately before saving (e.g., float to uint8/uint16 with scaling).
    • Saves the processed map using cv2.imwrite (converting RGB back to BGR if saving to non-EXR formats). Includes fallback logic (e.g., attempting PNG if saving 16-bit EXR fails).
    • Calculates image statistics (Min/Max/Mean) using _calculate_image_stats on normalized float64 data at the resolution given by CALCULATE_STATS_RESOLUTION.
    • Determines the aspect ratio change string (e.g., "EVEN", "X150") using _normalize_aspect_ratio_change.
    • Stores details about each processed map (path, resolution, format, stats, etc.) in processed_maps_details_asset.
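
The bit depth rule and the format hierarchy above could be read as the decision functions below. This is one possible interpretation, not the actual implementation: the function names, parameter names, and default values are assumptions, and the last branch (keeping JPG for JPG sources) is a guess at how "source format" enters the decision.

```python
def choose_bit_depth(source_bit_depth, rule):
    """Apply a MAP_BIT_DEPTH_RULES-style rule: 'respect' or 'force_8bit'."""
    if rule == "force_8bit":
        return 8
    if rule == "respect":
        return 16 if source_bit_depth >= 16 else 8
    raise ValueError(f"Unknown bit depth rule: {rule}")


def choose_output_format(map_type, bit_depth, largest_side, source_ext, *,
                         force_lossless_types=("NRM",), jpg_threshold=4096,
                         format_16bit="png", format_8bit="png"):
    """Walk the rule hierarchy from highest to lowest priority."""
    # 1. Forced-lossless map types override every other rule.
    if map_type in force_lossless_types:
        return format_16bit if bit_depth == 16 else "png"
    # 2. Large 8-bit maps are forced to JPG.
    if bit_depth == 8 and largest_side >= jpg_threshold:
        return "jpg"
    # 3. Otherwise fall back to bit depth, source format, and configured defaults.
    if bit_depth == 16:
        return format_16bit
    return "jpg" if source_ext.lower() in (".jpg", ".jpeg") else format_8bit
```

For example, with these assumed defaults an 8-bit normal map at 8192 pixels stays PNG because the lossless override is checked before the JPG threshold.
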
  7. Map Merging (_merge_maps_from_source):

    • Iterates through the MAP_MERGE_RULES defined in the Configuration.
    • Identifies the required source map files needed as input for each merge rule based on the classified files.
    • Determines common resolutions available across the required input maps.
    • Loads the necessary source map channels for each common resolution (using a helper _load_and_transform_source which includes caching).
    • Converts inputs to normalized float32 (0-1).
    • Injects default channel values (from rule defaults) if an input channel is missing.
    • Merges channels using cv2.merge.
    • Determines output bit depth and format based on rules (similar logic to _process_maps, considering input properties). If the rules would combine JPG output with 16-bit data, the conflict is resolved by forcing 8-bit.
    • Saves the merged map using the _save_image helper (includes data type/color space conversions and fallback).
    • Stores details about each merged map in merged_maps_details_asset.
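
Two parts of the merge step, finding common resolutions and injecting defaults for missing channels, can be illustrated without any image library. In this sketch plain lists of floats stand in for the normalized float32 NumPy arrays, and zip() stands in for cv2.merge; the function names and the shape of the rule dictionaries are assumptions.

```python
def common_resolutions(available):
    """available maps each required map type to the set of resolutions it has."""
    sets = [set(v) for v in available.values()]
    return set.intersection(*sets) if sets else set()


def merge_channels(rule_inputs, defaults, loaded, pixel_count):
    """Build one plane per output channel, then interleave them per pixel.

    rule_inputs: output channel -> source map type, e.g. {"R": "AO"}
    defaults:    output channel -> constant used when the source is missing
    loaded:      source map type -> flat list of normalized values (0-1)
    """
    planes = []
    for channel, map_type in rule_inputs.items():
        if map_type in loaded:
            planes.append(list(loaded[map_type]))
        else:
            # Missing input: fill the whole plane with the rule's default value.
            planes.append([defaults[channel]] * pixel_count)
    return list(zip(*planes))
```
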
  8. Metadata File Generation (_generate_metadata_file):

    • Collects all determined information for the current asset: base metadata, details from processed_maps_details_asset and merged_maps_details_asset, list of ignored files, source preset used, etc.
    • Writes this collected data into the metadata.json file within the temporary workspace using json.dump.
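
A minimal sketch of the metadata step follows. The key names in the dictionary and the helper name write_metadata are hypothetical; only the use of json.dump and the metadata.json file name come from the description above.

```python
import json
from pathlib import Path


def write_metadata(workspace, base_metadata, processed_maps, merged_maps, ignored_files):
    """Collect per-asset information and write it to <workspace>/metadata.json."""
    metadata = {
        **base_metadata,
        "processed_maps": processed_maps,  # hypothetical key names
        "merged_maps": merged_maps,
        "ignored_files": ignored_files,
    }
    path = Path(workspace) / "metadata.json"
    with path.open("w", encoding="utf-8") as handle:
        json.dump(metadata, handle, indent=4)
    return path
```
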
  9. Output Organization (_organize_output_files):

    • Creates the final structured output directory: <output_base_dir>/<supplier_name>/<asset_name>/.
    • Creates subdirectories Extra/, Unrecognised/, and Ignored/ within the asset directory.
    • Moves the processed maps, merged maps, model files, metadata.json, and files classified as Extra, Unrecognised, or Ignored from the temporary workspace into their respective locations in the final output directory structure.
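
The output layout above can be sketched as follows. The function name organize_output and the shape of the `classified` mapping (workspace-relative path to category) are assumptions made for illustration.

```python
import shutil
from pathlib import Path

SUBDIRS = ("Extra", "Unrecognised", "Ignored")


def organize_output(workspace, output_base_dir, supplier_name, asset_name, classified):
    """Move files from the workspace into <output_base_dir>/<supplier>/<asset>/."""
    asset_dir = Path(output_base_dir) / supplier_name / asset_name
    asset_dir.mkdir(parents=True, exist_ok=True)
    for sub in SUBDIRS:
        (asset_dir / sub).mkdir(exist_ok=True)
    for relative_path, category in classified.items():
        # Extra/Unrecognised/Ignored files go to their subdirectory;
        # everything else (maps, models, metadata.json) goes to the asset root.
        target_dir = asset_dir / category if category in SUBDIRS else asset_dir
        source = Path(workspace) / relative_path
        shutil.move(str(source), str(target_dir / source.name))
    return asset_dir
```
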
  10. Workspace Cleanup (_cleanup_workspace):

    • Removes the temporary workspace directory and its contents using shutil.rmtree(). This is called within a finally block to ensure cleanup is attempted even if errors occur during processing.
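
The workspace lifecycle from steps 1 and 10 follows a common try/finally pattern, sketched below. The wrapper process_with_workspace and the directory prefix are hypothetical; tempfile.mkdtemp() and shutil.rmtree() are the calls named in this guide.

```python
import shutil
import tempfile
from pathlib import Path


def process_with_workspace(work):
    """Run `work(workspace)` in a fresh temp directory that is always removed."""
    workspace = Path(tempfile.mkdtemp(prefix="asset_"))
    try:
        return work(workspace)
    finally:
        # Runs on success and on error, so the workspace never lingers.
        shutil.rmtree(workspace, ignore_errors=True)
```
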
  11. (Optional) Blender Script Execution:

    • If triggered via CLI arguments (--nodegroup-blend, --materials-blend) or GUI controls, the orchestrator (main.py or gui/processing_handler.py) executes the corresponding Blender scripts (blenderscripts/*.py) using subprocess.run. This happens only after the AssetProcessor.process() call completes successfully for an asset batch. See Developer Guide: Blender Integration Internals for more details.
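
An orchestrator could launch a Blender script along the lines of the sketch below. The helper names, the example paths, and the arguments passed after "--" are hypothetical; the Blender command-line options --background and --python, and the "--" separator for script arguments, are standard Blender CLI features.

```python
import subprocess


def build_blender_command(blender_exe, blend_file, script_path, script_args=()):
    """Assemble a headless Blender invocation that runs one Python script."""
    # --background runs without the UI; --python executes the given script.
    # Everything after "--" is ignored by Blender and left for the script
    # to read from sys.argv.
    return [blender_exe, "--background", blend_file,
            "--python", script_path, "--", *script_args]


def run_blender_script(blender_exe, blend_file, script_path, script_args=()):
    """Run the script and return the CompletedProcess for the caller to inspect."""
    command = build_blender_command(blender_exe, blend_file, script_path, script_args)
    return subprocess.run(command, capture_output=True, text=True, check=False)
```

Keeping command construction separate from execution makes the argument list easy to log or check without Blender installed.
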

Note on Data Passing: As mentioned in the Architecture documentation, major changes to the data passing mechanisms between the GUI, Main (CLI orchestration), and AssetProcessor modules are currently being planned. The descriptions of how data is processed and transformed within this pipeline reflect the current state and will require review and updates once the plan for these changes is finalized.