Asset-Frameworker/ProjectNotes/FEAT-rar-7z-support-plan.md
2025-04-29 18:26:13 +02:00

Plan for Adding .rar and .7z Support

Goal: Extend the Asset Processor Tool to accept .rar and .7z files as input sources, in addition to the currently supported .zip files and folders.

Plan:

  1. Add Required Libraries:

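    • Update the requirements.txt file to include py7zr and rarfile as dependencies so that both are installed when setting up the project. Note that rarfile relies on an external extraction tool (such as unrar, unar, or bsdtar) being available on the system, which pip does not install.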
  2. Modify Input Extraction Logic:

    • Locate the _extract_input method within the AssetProcessor class in asset_processor.py.
    • Modify this method to check the file extension of the input source.
    • If the extension is .zip, retain the existing extraction logic using Python's built-in zipfile module.
    • If the extension is .rar, implement extraction using the rarfile library.
    • If the extension is .7z, implement extraction using the py7zr library.
    • Add error handling for archives that are corrupted, encrypted, or compressed with an unsupported method. Password support is out of scope at this stage, so encrypted archives should probably be skipped or logged as errors. Log an appropriate warning or error in each case.
    • If the input is a directory, retain the existing logic to copy its contents to the temporary workspace.
  3. Update CLI and Monitor Input Handling:

    • Review main.py (CLI entry point) and monitor.py (Directory Monitor).
    • Ensure that the argument parsing in main.py can accept .rar and .7z file paths as valid inputs.
    • In monitor.py, modify the ZipHandler (or create a new handler) to watch for .rar and .7z file creation events in the watched directory, in addition to .zip files. The logic for triggering processing via main.run_processing should then be extended to handle these new file types.
  4. Update Documentation:

    • Edit Documentation/00_Overview.md to explicitly mention .rar and .7z as supported input formats in the overview section.
    • Update Documentation/01_User_Guide/02_Features.md to list .rar and .7z alongside .zip and folders in the features list.
    • Modify Documentation/01_User_Guide/03_Installation.md to include instructions for installing the new py7zr and rarfile dependencies (likely via pip install -r requirements.txt).
    • Revise Documentation/02_Developer_Guide/05_Processing_Pipeline.md to accurately describe the updated _extract_input method, detailing how .zip, .rar, .7z, and directories are handled.
  5. Testing:

    • Prepare sample .rar and .7z files (including nested directories and various file types) to test the extraction logic thoroughly.
    • Test processing of these new archive types via both the CLI and the Directory Monitor.
    • Verify that the subsequent processing steps (classification, map processing, metadata generation, etc.) work correctly with files extracted from .rar and .7z archives.
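
For step 3, the extension check needed by both main.py and monitor.py could be factored into one small helper so the two entry points cannot drift apart. The sketch below is only an illustration: the names `SUPPORTED_ARCHIVE_SUFFIXES` and `is_supported_archive` are hypothetical and do not exist in the current code base.

```python
from pathlib import Path

# Hypothetical constant: the archive extensions this plan wants to accept.
SUPPORTED_ARCHIVE_SUFFIXES = frozenset({".zip", ".rar", ".7z"})


def is_supported_archive(path) -> bool:
    """Return True if the path ends in a supported archive extension.

    The comparison is case-insensitive, so "Pack.RAR" is accepted too.
    Only the final suffix is inspected; the file is not opened.
    """
    return Path(path).suffix.lower() in SUPPORTED_ARCHIVE_SUFFIXES
```

A handler in monitor.py could call this on each newly created file and ignore everything else, while main.py could use the same check when validating its arguments.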

Here is a simplified flow diagram illustrating the updated input handling:

```mermaid
graph TD
    A[Input Source] --> B{Is it a file or directory?};
    B -- Directory --> C[Copy Contents to Workspace];
    B -- File --> D{What is the file extension?};
    D -- .zip --> E[Extract using zipfile];
    D -- .rar --> F[Extract using rarfile];
    D -- .7z --> G[Extract using py7zr];
    E --> H[Temporary Workspace];
    F --> H;
    G --> H;
    C --> H;
    H --> I[Processing Pipeline Starts];
```
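
The flow above could translate into code roughly as follows. This is a minimal sketch and not the existing `_extract_input` method: the function name `extract_input`, the `ExtractionError` type, and the choice to wrap every failure in one exception are assumptions made for illustration. The third-party libraries are imported lazily, so .zip and directory inputs keep working even where py7zr or rarfile is not installed.

```python
import shutil
import zipfile
from pathlib import Path


class ExtractionError(Exception):
    """Hypothetical error type for inputs that cannot be unpacked."""


def extract_input(source: Path, workspace: Path) -> None:
    """Copy a directory, or unpack a .zip/.rar/.7z archive, into workspace."""
    workspace.mkdir(parents=True, exist_ok=True)

    # Directory branch of the diagram: copy the contents as-is.
    if source.is_dir():
        shutil.copytree(source, workspace, dirs_exist_ok=True)
        return

    suffix = source.suffix.lower()
    try:
        if suffix == ".zip":
            with zipfile.ZipFile(source) as archive:
                archive.extractall(workspace)
        elif suffix == ".rar":
            import rarfile  # third-party; also needs an external tool such as unrar

            with rarfile.RarFile(str(source)) as archive:
                if archive.needs_password():
                    raise ExtractionError(f"Encrypted archive skipped: {source.name}")
                archive.extractall(str(workspace))
        elif suffix == ".7z":
            import py7zr  # third-party

            with py7zr.SevenZipFile(str(source), mode="r") as archive:
                if archive.needs_password():
                    raise ExtractionError(f"Encrypted archive skipped: {source.name}")
                archive.extractall(path=str(workspace))
        else:
            raise ExtractionError(f"Unsupported input type: {source.name}")
    except ExtractionError:
        raise
    except Exception as exc:
        # Broad on purpose for this sketch: corrupted archives, missing
        # libraries, and unsupported compression all surface as one error.
        raise ExtractionError(f"Could not extract {source.name}: {exc}") from exc
```

A caller would catch `ExtractionError`, log it, and move on to the next input. Real code might prefer to catch each library's specific exceptions so that warnings and errors can be logged at different levels.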