{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# \"Doro\" Welcome to the Hugging Face File Uploader!\n", "\n", "# Welcome to the Hugging Face Backup & Image Zipper\n", "\n", "This notebook provides a suite of interactive widgets designed to streamline the entire process of preparing and uploading files to your Hugging Face repositories.\n", "\n", "Each step has been enhanced with \"smart\" features to provide clear feedback, prevent common errors, and accelerate your workflow.\n", "\n", "## Workflow at a Glance\n", "\n", "This notebook is organized into a simple, step-by-step process. Just run the cells in order.\n", "\n", "1. **⚙️ Setup & Validate Environment:** The first cell installs all necessary packages and then **validates** the environment, confirming that all tools and the `hf` CLI are ready to use.\n", "2. **🔑 Secure Authentication:** The login cell **checks your current login status** first. If you need to log in, it will then **validate your token** to ensure it has the correct `write` permissions required for uploading.\n", "3. **🗂️ (Optional) Zip Your Images:** Use the **Smart Image Zipper** to prepare your image datasets. It allows you to filter by file type and **analyze a folder** to see a preview of the archive size *before* zipping.\n", "4. **🚀 Upload to the Hub:** The **Smart Uploader** widget provides a powerful interface for uploading your files, supporting concurrent uploads, single-commit mode, and automatic repository creation.\n", "\n", "## Key Features Across the Toolkit\n", "- **Interactive Widgets:** Manage your entire workflow without writing complex scripts.\n", "- **Environment Validation:** Confidence that your setup is correct from the very beginning.\n", "- **Secure Login with Permission Checks:** Prevents upload failures due to incorrect token permissions.\n", "- **Pre-Zip Analysis:** Analyze image folders to know the size and file count before you zip.\n", "- **Advanced Upload Options:** Choose between single-commit mode for clean history or concurrent uploads for speed.\n", "- **Fast Uploads:** Automatically uses `hf_transfer` to speed up large file transfers.\n", "- **Live Progress Bars:** Monitor progress during both zipping and uploading.\n", "\n", "---\n", "\n", "**Community & Support:**\n", "\n", "* **GitHub:** [HuggingFace\\_Backup Repository on GitHub](https://github.com/Ktiseos-Nyx/HuggingFace_Backup) (for the latest version, updates, bug reports, and contributions)\n", "* **Discord:**\n", "    * [Ktiseos Nyx AI/ML Discord](https://discord.gg/HhBSvM9gBY)\n", "    * [Earth & Dusk Media](https://discord.gg/5t2kYxt7An)\n", "\n", "This toolkit is designed to simplify every step of getting your files onto the Hugging Face Hub. We hope you find it useful!"
] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# \"Doro\" Install Dependencies\n", "\n", "\n", "### ⚙️ Environment Setup & Validation\n", "\n", "This cell installs and verifies the required packages to ensure your environment is correctly configured.\n", "\n", "### Key Packages & Versions:\n", "* `huggingface_hub==1.3.0`: The pinned version of the library for interacting with the Hub, including the `hf` command-line interface (CLI).\n", "* `hf_transfer==0.1.9`: The pinned version of the library that **accelerates uploads** of large files.\n", "* `ipywidgets`: Powers the interactive uploader widget.\n", "\n", "After installation, the cell will **validate** the entire setup, checking package versions and confirming that the `hf` CLI is available and ready to use.\n", "\n", "---" ] }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Installing/updating required packages...\n", "Package installation/update process complete.\n" ] } ], "source": [ "# Cell 1: Environment Setup and Validation\n", "# -----------------------------------------------------------------------------\n", "import sys\n", "import os\n", "import subprocess\n", "from importlib import metadata  # standard-library replacement for the deprecated pkg_resources\n", "\n", "print(\"⚙️ Setting up the environment with specified package versions...\")\n", "\n", "# --- 1. Install exact versions for a reproducible environment ---\n", "# Using '==' ensures that this notebook will work consistently.\n", "# The '-U' is still useful to ensure that if an older version is present, it's replaced.\n", "!{sys.executable} -m pip install -U \"huggingface_hub==1.3.0\" \"ipywidgets>=8.0.0\" \"hf_transfer==0.1.9\" --no-color --disable-pip-version-check\n", "\n", "print(\"\\n✅ Installation complete.\")\n", "print(\"🔍 Validating environment and tools...\")\n", "\n", "# --- 2. Validate Python package imports and confirm versions ---\n", "try:\n", "    # Read the installed distribution metadata to be sure of the installed version\n", "    hf_hub_version = metadata.version(\"huggingface_hub\")\n", "    ipywidgets_version = metadata.version(\"ipywidgets\")\n", "    print(f\"  - ✔️ huggingface_hub version: {hf_hub_version}\")\n", "    print(f\"  - ✔️ ipywidgets version: {ipywidgets_version}\")\n", "\n", "    if hf_hub_version != \"1.3.0\":\n", "        print(f\"    ⚠️ Warning: Expected v1.3.0, but found v{hf_hub_version}. This might cause issues.\")\n", "\n", "except metadata.PackageNotFoundError as e:\n", "    print(f\"❌ Critical package failed to be validated: {e}. Please check the installation log above.\")\n", "\n", "# --- 3. Validate hf_transfer for accelerated uploads ---\n", "try:\n", "    hf_transfer_version = metadata.version(\"hf_transfer\")\n", "    print(f\"  - ✔️ hf-transfer version: {hf_transfer_version}. Uploads will be accelerated.\")\n", "    os.environ['HF_HUB_ENABLE_HF_TRANSFER'] = '1'\n", "except metadata.PackageNotFoundError:\n", "    print(\"  - ⚠️ hf-transfer is not installed. Uploads may be slow.\")\n", "\n", "# --- 4. Validate that the 'hf' Command-Line Interface (CLI) is working ---\n", "# This confirms the core tool you're interested in is available from the shell.\n", "try:\n", "    result = subprocess.run(['hf', 'version'], capture_output=True, text=True, check=True)\n", "    print(f\"  - ✔️ Hugging Face CLI is ready: ({result.stdout.strip()})\")\n", "except (subprocess.CalledProcessError, FileNotFoundError):\n", "    print(\"  - ❌ The 'hf' CLI command could not be found or failed to run.\")\n", "    print(\"    This might indicate an issue with your system's PATH or a broken installation.\")\n", "\n", "print(\"\\n💡 Tip: If the widgets in the uploader do not appear later, try restarting the Jupyter kernel (Kernel -> Restart).\")" ] }, { "cell_type": "markdown", "metadata": { "id": "Xs1mb1VKLuUW" }, "source": [ "# ✨ \"Doro\" Connecting to Hugging Face: Authentication\n", "## 🔑 How To Use\n", "\n", "This smart cell securely handles your login and validates your credentials. You will need a token with **write** permissions to upload files.\n", "\n", "### What this cell does:\n", "1. **Checks Your Status:** It first checks if you are already logged in.\n", "2. **Prompts if Needed:** If you aren't logged in, it will display a login box.\n", "3. **Validates Your Token:** After you enter a token, it immediately confirms that the token is valid and checks its permissions.\n", "\n", "### Instructions:\n", "1. **Create a Token:** Go to your [Hugging Face Tokens page](https://huggingface.co/settings/tokens), click \"New token\", and give it the **`write`** role.\n", "2. **Copy the Token:** Copy the newly generated token to your clipboard.\n", "3. **Run this Cell:** Execute the code cell below. 
If prompted, paste your token into the box and press `Login`.\n", "\n", "### After Running:\n", "* ✅ If successful, a confirmation message will appear with your username and token permissions.\n", "* ⚠️ If your token only has `read` access, a warning will be displayed.\n", "* ❌ If the login fails, an error message will help you diagnose the issue." ] }, { "cell_type": "code", "execution_count": 11, "metadata": {}, "outputs": [ { "data": { "text/html": [ "\n", "\n", "    ✅ Already logged in!\n", "    Welcome back, Duskfallcrew. Your token has WRITE permissions.\n", "\n", "If you need to switch accounts, please restart the kernel and run this cell again.\n", "    " ], "text/plain": [ "<IPython.core.display.HTML object>" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "# Cell 2: Hugging Face Authentication Setup (Smart Version)\n", "# -----------------------------------------------------------------------------\n", "# This cell securely logs you into Hugging Face.\n", "# It checks if you're already logged in, validates your token after entry,\n", "# and confirms the permissions (read/write) of your token.\n", "# -----------------------------------------------------------------------------\n", "from huggingface_hub import notebook_login, whoami\n", "from IPython.display import display, HTML, clear_output\n", "\n", "print(\"Checking Hugging Face authentication status...\")\n", "\n", "try:\n", "    # 1. Check if we're already logged in by making a simple API call.\n", "    user_info = whoami()\n", "    username = user_info.get(\"name\")\n", "    auth_scope = \"write\" if user_info.get(\"auth\", {}).get(\"accessToken\", {}).get(\"role\") == \"write\" else \"read\"\n", "\n", "    # If the call succeeds, we are already authenticated.\n", "    clear_output(wait=True)\n", "    display(HTML(f\"\"\"\n", "\n", "    ✅ Already logged in!\n", "    Welcome back, {username}. Your token has {auth_scope.upper()} permissions.\n", "\n", "If you need to switch accounts, please restart the kernel and run this cell again.\n", "    \"\"\"))\n", "\n", "except Exception:\n", "    # 2. If whoami() fails, it means we're not logged in.\n", "    clear_output(wait=True)\n", "    print(\"You are not logged in. Please proceed with authentication.\")\n", "    print(\"A login widget will appear below. Paste your Hugging Face token with 'write' permissions.\")\n", "\n", "    # Display the standard login widget.\n", "    notebook_login()\n", "\n", "    # 3. After the user submits, validate the new token immediately.\n", "    try:\n", "        clear_output(wait=True) # Remove the login widget for a clean output\n", "        print(\"Validating token...\")\n", "        user_info = whoami()\n", "        username = user_info.get(\"name\")\n", "        auth_scope = \"write\" if user_info.get(\"auth\", {}).get(\"accessToken\", {}).get(\"role\") == \"write\" else \"read\"\n", "\n", "        display(HTML(f\"\"\"\n", "\n", "        ✅ Login Successful!\n", "        Welcome, {username}. Your token has been saved with {auth_scope.upper()} permissions.\n", "\n", "        \"\"\"))\n", "\n", "        if auth_scope != 'write':\n", "            display(HTML(f\"\"\"\n", "\n", "            ⚠️ Warning: Your token only has 'read' permissions. You will not be able to upload files.\n", "            Please generate a new token with 'write' permissions on the Hugging Face website.\n", "\n", "            \"\"\"))\n", "\n", "    except Exception as e:\n", "        clear_output(wait=True)\n", "        display(HTML(f\"\"\"\n", "\n", "        ❌ Login Failed.\n", "        The token you provided could not be validated. Please check your token and try again.\n", "Error: {e}\n", "
\n", "        \"\"\"))\n", "\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# 🚀 \"Doro\" Using the Hugging Face File Uploader\n", "\n", "## Uploader Checklist\n", "Follow these steps to upload your files.\n", "\n", "1. Fill in Repository Details\n", "- Owner (your username/org)\n", "- Repo Name\n", "- Repo Type (model, dataset, etc.)\n", "\n", "2. Select Local Files\n", "- Set the Source Directory to your local folder path.\n", "- Click the 🔄 List Files button.\n", "- Select your desired files from the list that appears.\n", "\n", "3. Review Upload Settings (Optional)\n", "- Add a Commit Message to describe your changes.\n", "- Choose whether to Create a Pull Request for this upload.\n", "\n", "4. Start the Upload\n", "- Click the ⬆️ Upload Selected Files button and monitor the output." ] }, { "cell_type": "code", "execution_count": 5, "metadata": { "cellView": "form", "id": "J851eLx6Ii3h" }, "outputs": [], "source": [ "# --- Essential Imports for the Uploader ---\n", "import glob\n", "import os\n", "import time\n", "import traceback # --- NEW: Explicitly import traceback ---\n", "from pathlib import Path\n", "import math\n", "from concurrent.futures import ThreadPoolExecutor, as_completed # --- NEW: For concurrent uploads ---\n", "from typing import List, Tuple, Optional # --- NEW: For type hinting ---\n", "\n", "from huggingface_hub import HfApi, CommitOperationAdd\n", "from ipywidgets import (Text, Dropdown, Button, SelectMultiple, VBox, HBox,\n", "                        Output, Layout, Checkbox, HTML, Textarea, Label,\n", "                        FloatProgress)\n", "from IPython.display import display, clear_output\n", "\n", "# Attempt to enable hf_transfer.\n", "os.environ['HF_HUB_ENABLE_HF_TRANSFER'] = '1'\n", "\n", "class SmartHuggingFaceUploader:\n", "    \"\"\"\n", "    A \"smarter\" Jupyter widget-based tool to upload files to the Hugging Face Hub.\n", "    Enhancements:\n", "    - Fetches user/org repositories.\n", "    - Option for single commit for all files.\n", "    - Option to create the repo if it doesn't exist.\n", "    - Displays file sizes in the picker.\n", "    - Disables UI elements during long operations.\n", "    - Supports concurrent uploads for better performance.\n", "    \"\"\"\n", "\n", "    def __init__(self) -> None:\n", "        self.api = HfApi()\n", "        self.user_info = self.api.whoami()\n", "        self.file_types = [\n", "            # (label, extension) pairs shown in the File Type dropdown\n", "            ('SafeTensors', 'safetensors'), ('PyTorch Models', 'pt'), ('PyTorch Legacy', 'pth'),\n", "            ('ONNX Models', 'onnx'), ('TensorFlow Models', 'pb'), ('Keras Models', 'h5'),\n", "            ('Checkpoints', 'ckpt'), ('Binary Files', 'bin'),\n", "            ('JSON Files', 'json'), ('YAML Files', 'yaml'), ('YAML Alt', 'yml'),\n", "            ('Text Files', 'txt'), ('CSV Files', 'csv'), ('Pickle Files', 'pkl'),\n", "            ('PNG Images', 'png'), ('JPEG Images', 'jpg'), ('JPEG Alt', 'jpeg'),\n", "            ('WebP Images', 'webp'), ('GIF Images', 'gif'),\n", "            ('ZIP Archives', 'zip'), ('TAR Files', 'tar'), ('GZ Archives', 'gz')\n", "        ]\n", "        self.current_directory = os.getcwd()\n", "        self.hf_transfer_active = self._check_hf_transfer_availability()\n", "        self._create_widgets()\n", "        self._bind_events()\n", "        self._update_files(None) # Initial file list update\n", "\n", "    def _check_hf_transfer_availability(self) -> bool:\n", "        if os.environ.get(\"HF_HUB_ENABLE_HF_TRANSFER\") == \"1\":\n", "            try:\n", "                import hf_transfer\n", "                return True\n", "            except ImportError:\n", "                return False\n", "        return False\n", "\n", "    def _create_widgets(self) -> None:\n", "        # --- Repository Info ---\n", "        self.repo_info_html = HTML(value=\"📚 Repository Details\")\n", "\n", "        # --- NEW: Dynamic Repo Fetching ---\n", "        self.org_name_text = Text(value=self.user_info['name'], placeholder='Organization or Username', description='Owner:', style={'description_width': 'initial'})\n", "        self.repo_name_dropdown = Dropdown(options=[], description='Repo:', style={'description_width': 'initial'}, layout=Layout(flex='1'))\n", "        self.fetch_repos_btn = 
Button(description=\"Fetch Repos\", button_style='info', tooltip=\"Fetch this owner's repositories\", layout=Layout(width='auto'))\n", "\n", "        self.repo_type_dropdown = Dropdown(options=['model', 'dataset', 'space'], value='model', description='Repo Type:', style={'description_width': 'initial'})\n", "        self.repo_folder_text = Text(placeholder='Optional: e.g., models/v1', description='Remote Folder:', style={'description_width': 'initial', \"flex\": \"1 1 auto\"}, layout=Layout(width='auto'))\n", "\n", "        # --- File Selection ---\n", "        self.file_section_html = HTML(value=\"🗂️ File Selection & Source\")\n", "        self.file_type_dropdown = Dropdown(options=self.file_types, value='safetensors', description='File Type:', style={'description_width': 'initial'})\n", "        self.sort_by_dropdown = Dropdown(options=['name', 'date', 'size'], value='name', description='Sort By:', style={'description_width': 'initial'}) # --- NEW: Sort by size\n", "        self.recursive_search_checkbox = Checkbox(value=False, description='Search Subdirectories', indent=False)\n", "\n", "        self.directory_label = Label(value=\"Source Directory:\", layout=Layout(width='auto'))\n", "        self.directory_text = Text(value=self.current_directory, description=\"\", style={'description_width': '0px'}, layout=Layout(width=\"auto\", flex='1 1 auto'))\n", "        self.directory_update_btn = Button(description='🔄 List Files', button_style='info', tooltip='Change source directory and refresh file list', layout=Layout(width='auto'))\n", "\n", "        # --- Commit Details ---\n", "        self.commit_section_html = HTML(value=\"💭 Commit Details\")\n", "        self.commit_msg_textarea = Textarea(value=\"Upload files via SmartUploader\", placeholder='Enter your commit message', description='Message:', style={'description_width': 'initial'}, layout=Layout(width='98%', height='60px'))\n", "\n", "        # --- NEW: Single Commit & Repo Creation Options ---\n", "        self.single_commit_checkbox = Checkbox(value=True, description='Single Commit', indent=False, tooltip=\"Upload all files in one commit.\")\n", "        self.create_repo_checkbox = Checkbox(value=True, description='Create repo if not exists', indent=False)\n", "        self.private_repo_checkbox = Checkbox(value=False, description='Make repo private', indent=False)\n", "\n", "        # --- Upload Settings ---\n", "        self.upload_section_html = HTML(value=\"🚀 Upload Settings\")\n", "        self.create_pr_checkbox = Checkbox(value=False, description='Create Pull Request', indent=False)\n", "        self.clear_after_checkbox = Checkbox(value=True, description='Clear output after upload', indent=False)\n", "        # --- NEW: Concurrent Uploads ---\n", "        self.concurrent_uploads_checkbox = Checkbox(value=True, description='Enable Concurrent Uploads (faster)', indent=False)\n", "\n", "        # --- Action Buttons ---\n", "        self.upload_button = Button(description='⬆️ Upload Selected Files', button_style='success', tooltip='Start upload process', layout=Layout(width='auto', height='auto'))\n", "        self.clear_output_button = Button(description='🧹 Clear Output Log', button_style='warning', tooltip='Clear the output log area', layout=Layout(width='auto'))\n", "\n", "        # --- File Picker & Output ---\n", "        self.file_picker_selectmultiple = SelectMultiple(options=[], description='Files:', layout=Layout(width=\"98%\", height=\"200px\"), style={'description_width': 'initial'})\n", "        self.output_area = Output(layout=Layout(padding='10px', border='1px solid #ccc', margin_top='10px', width='98%', max_height='400px', overflow_y='auto'))\n", "\n", "        # --- Progress Display Area ---\n", "        self.current_file_label = Label(value=\"N/A\")\n", "        self.file_count_label = Label(value=\"File 0/0\")\n", "        self.progress_bar = FloatProgress(value=0, min=0, max=100, description='Overall:', bar_style='info', layout=Layout(width='85%'))\n", "        self.progress_percent_label = Label(value=\"0%\")\n", "\n", "        self.progress_display_box = VBox([\n", "            HBox([Label(\"Current File:\", layout=Layout(width='100px')), 
self.current_file_label]),\n", "            HBox([Label(\"File Count:\", layout=Layout(width='100px')), self.file_count_label]),\n", "            HBox([self.progress_bar, self.progress_percent_label], layout=Layout(align_items='center'))\n", "        ], layout=Layout(visibility='hidden', margin='10px 0', padding='10px', border='1px solid #ddd', width='98%'))\n", "\n", "    def _set_ui_busy_state(self, busy: bool) -> None:\n", "        \"\"\" --- NEW: Disables key UI elements during operations. --- \"\"\"\n", "        self.upload_button.disabled = busy\n", "        self.directory_update_btn.disabled = busy\n", "        self.fetch_repos_btn.disabled = busy\n", "        self.upload_button.icon = 'spinner' if busy else ''\n", "\n", "    def _bind_events(self) -> None:\n", "        self.directory_update_btn.on_click(self._update_directory_and_files)\n", "        self.fetch_repos_btn.on_click(self._fetch_user_repos) # --- NEW ---\n", "        self.upload_button.on_click(self._upload_files_handler)\n", "        self.clear_output_button.on_click(lambda _: self.output_area.clear_output(wait=True))\n", "        # Update file list when any relevant option changes\n", "        for widget in [self.file_type_dropdown, self.sort_by_dropdown, self.recursive_search_checkbox]:\n", "            widget.observe(self._update_files, names='value')\n", "\n", "    def _fetch_user_repos(self, _) -> None:\n", "        \"\"\" --- NEW: Fetches repositories for the specified owner. --- \"\"\"\n", "        owner = self.org_name_text.value.strip()\n", "        if not owner:\n", "            with self.output_area: print(\"❗ Please enter an owner (user/org) name.\")\n", "            return\n", "\n", "        with self.output_area:\n", "            clear_output(wait=True)\n", "            print(f\"Fetching repos for '{owner}'...\")\n", "            try:\n", "                self._set_ui_busy_state(True)\n", "                # HfApi provides one listing method per repo type, so pick the matching one.\n", "                list_fn = {'model': self.api.list_models, 'dataset': self.api.list_datasets, 'space': self.api.list_spaces}[self.repo_type_dropdown.value]\n", "                repos = list(list_fn(author=owner))\n", "                repo_names = sorted(repo.id.split('/')[-1] for repo in repos)\n", "                self.repo_name_dropdown.options = repo_names\n", "                if repo_names:\n", "                    self.repo_name_dropdown.value = repo_names[0]\n", "                print(f\"✅ Found {len(repo_names)} repositories.\")\n", "            except Exception as e:\n", "                print(f\"❌ Could not fetch repositories: {e}\")\n", "            finally:\n", "                self._set_ui_busy_state(False)\n", "\n", "    def _update_directory_and_files(self, _) -> None:\n", "        new_dir = self.directory_text.value.strip()\n", "        if not new_dir or not os.path.isdir(new_dir):\n", "            with self.output_area:\n", "                clear_output(wait=True)\n", "                print(f\"❌ Invalid or empty directory path: {new_dir}\")\n", "            return\n", "\n", "        self.current_directory = os.path.abspath(new_dir)\n", "        self.directory_text.value = self.current_directory\n", "        self._update_files(None)\n", "\n", "    def _update_files(self, _) -> None:\n", "        self._set_ui_busy_state(True)\n", "        file_extension = self.file_type_dropdown.value\n", "        self.output_area.clear_output(wait=True)\n", "        try:\n", "            source_path = Path(self.current_directory)\n", "            if not source_path.is_dir():\n", "                with self.output_area: print(f\"⚠️ Source directory '{self.current_directory}' is not valid.\")\n", "                self.file_picker_selectmultiple.options = []\n", "                return\n", "\n", "            # --- MODIFIED: Cleaner file search and sorting ---\n", "            pattern = f'**/*.{file_extension}' if self.recursive_search_checkbox.value else f'*.{file_extension}'\n", "            found_paths = list(source_path.glob(pattern))\n", "\n", "            # Use a dictionary to hold file info for 
easier sorting\n", "            files_info = {}\n", "            for p in found_paths:\n", "                if p.is_file(): # More robust check\n", "                    stat = p.stat()\n", "                    files_info[str(p)] = {'mtime': stat.st_mtime, 'size': stat.st_size, 'name': p.name.lower()}\n", "\n", "            sort_key = self.sort_by_dropdown.value\n", "            reverse_sort = sort_key in ['date', 'size']\n", "            # The 'date' option is stored under the 'mtime' key in files_info.\n", "            sort_field = 'mtime' if sort_key == 'date' else sort_key\n", "\n", "            sorted_paths = sorted(files_info.keys(), key=lambda p: files_info[p][sort_field], reverse=reverse_sort)\n", "\n", "            # --- MODIFIED: Display relative paths with file size ---\n", "            display_options = []\n", "            for abs_path_str in sorted_paths:\n", "                file_size = files_info[abs_path_str]['size']\n", "                display_name = f\"{os.path.relpath(abs_path_str, self.current_directory)} ({self._format_size(file_size)})\"\n", "                display_options.append((display_name, abs_path_str))\n", "\n", "            self.file_picker_selectmultiple.options = display_options\n", "\n", "            with self.output_area:\n", "                if not display_options:\n", "                    print(f\"🤷 No '.{file_extension}' files found in '{self.current_directory}'.\")\n", "                else:\n", "                    print(f\"✨ Found {len(display_options)} '.{file_extension}' files. Select files to upload.\")\n", "\n", "        except Exception as e:\n", "            with self.output_area:\n", "                clear_output(wait=True); print(f\"❌ Error listing files: {e}\"); traceback.print_exc()\n", "        finally:\n", "            self._set_ui_busy_state(False)\n", "\n", "    def _format_size(self, size_bytes: int) -> str:\n", "        if size_bytes < 0: return \"Invalid size\"\n", "        if size_bytes == 0: return \"0 B\"\n", "        units = (\"B\", \"KB\", \"MB\", \"GB\", \"TB\", \"PB\", \"EB\")\n", "        i = math.floor(math.log(size_bytes, 1024)) if size_bytes > 0 else 0\n", "        if i >= len(units): i = len(units) - 1\n", "        s = round(size_bytes / (1024 ** i), 2)\n", "        return f\"{s} {units[i]}\"\n", "\n", "    def _upload_files_handler(self, _) -> None:\n", "        org_or_user = self.org_name_text.value.strip()\n", "        # The dropdown value is None until a repo has been fetched and selected.\n", "        repo_name = (self.repo_name_dropdown.value or '').strip() # --- MODIFIED: Use dropdown value\n", "\n", "        if not org_or_user or not repo_name:\n", "            with self.output_area: clear_output(wait=True); print(\"❗ Please fill in 'Owner' and select a 'Repo'.\")\n", "            return\n", "\n", "        repo_id = f\"{org_or_user}/{repo_name}\"\n", "        selected_file_paths = list(self.file_picker_selectmultiple.value)\n", "\n", "        if not selected_file_paths:\n", "            with self.output_area: clear_output(wait=True); print(\"📝 Nothing selected for upload.\")\n", "            return\n", "\n", "        self._set_ui_busy_state(True)\n", "        self.output_area.clear_output(wait=True)\n", "\n", "        try:\n", "            # --- NEW: Automatic Repo Creation ---\n", "            if self.create_repo_checkbox.value:\n", "                with self.output_area: print(f\"Ensuring repo '{repo_id}' exists...\")\n", "                self.api.create_repo(\n", "                    repo_id=repo_id,\n", "                    repo_type=self.repo_type_dropdown.value,\n", "                    private=self.private_repo_checkbox.value,\n", "                    exist_ok=True\n", "                )\n", "\n", "            with self.output_area:\n", "                print(f\"🎯 Preparing to upload to: https://huggingface.co/{repo_id}\")\n", "                if self.hf_transfer_active: print(\"🚀 HF_TRANSFER is enabled.\")\n", "                else: print(\"ℹ️ For faster uploads, run `%pip install -q 
hf_transfer` and restart kernel.\")\n", "\n", "            # --- MODIFIED: Handle single vs. multi-commit ---\n", "            if self.single_commit_checkbox.value:\n", "                self._upload_as_single_commit(repo_id, selected_file_paths)\n", "            else:\n", "                self._upload_as_multiple_commits(repo_id, selected_file_paths)\n", "\n", "        except Exception as e:\n", "            with self.output_area:\n", "                print(f\"❌ An unexpected error occurred: {e}\")\n", "                traceback.print_exc()\n", "        finally:\n", "            self._set_ui_busy_state(False)\n", "            if self.clear_after_checkbox.value:\n", "                time.sleep(5)\n", "                self.output_area.clear_output(wait=True)\n", "                self.progress_display_box.layout.visibility = 'hidden'\n", "\n", "    def _upload_as_single_commit(self, repo_id: str, file_paths: List[str]) -> None:\n", "        \"\"\" --- NEW: Logic for uploading all files in a single commit. --- \"\"\"\n", "        self.progress_display_box.layout.visibility = 'visible'\n", "        self.progress_bar.value = 0\n", "        self.progress_percent_label.value = \"0%\"\n", "        self.current_file_label.value = \"Preparing operations...\"\n", "\n", "        repo_folder_prefix = self.repo_folder_text.value.strip().replace('\\\\', '/')\n", "\n", "        operations = []\n", "        for path_str in file_paths:\n", "            path_in_repo_base = os.path.relpath(path_str, self.current_directory).replace('\\\\', '/')\n", "            path_in_repo = f\"{repo_folder_prefix}/{path_in_repo_base}\" if repo_folder_prefix.strip('/') else path_in_repo_base\n", "            operations.append(CommitOperationAdd(path_in_repo=path_in_repo, path_or_fileobj=path_str))\n", "\n", "        commit_message = self.commit_msg_textarea.value or f\"Upload {len(operations)} files\"\n", "\n", "        with self.output_area:\n", "            print(f\"🚀 Starting upload of {len(operations)} files in a single commit...\")\n", "\n", "        start_time = time.time()\n", "\n", "        # Note: Progress bar for single commit is harder. Here we just show completion.\n", "        try:\n", "            commit_info = self.api.create_commit(\n", "                repo_id=repo_id,\n", "                operations=operations,\n", "                commit_message=commit_message,\n", "                repo_type=self.repo_type_dropdown.value,\n", "                create_pr=self.create_pr_checkbox.value\n", "            )\n", "            duration = time.time() - start_time\n", "            self.progress_bar.value = 100\n", "            self.progress_percent_label.value = \"100%\"\n", "            self.current_file_label.value = \"Completed.\"\n", "            with self.output_area:\n", "                print(f\"✅ Successfully committed {len(operations)} files in {duration:.1f}s.\")\n", "                print(f\"   View commit: {commit_info.commit_url}\")\n", "        except Exception as e:\n", "            with self.output_area:\n", "                print(f\"❌ Commit failed: {e}\")\n", "                traceback.print_exc()\n", "\n", "    def _upload_as_multiple_commits(self, repo_id: str, file_paths: List[str]) -> None:\n", "        \"\"\" --- MODIFIED: Original logic now in its own function, with concurrency. --- \"\"\"\n", "        self.progress_display_box.layout.visibility = 'visible'\n", "        self.progress_bar.value = 0\n", "        total_files = len(file_paths)\n", "\n", "        repo_type = self.repo_type_dropdown.value\n", "        repo_folder_prefix = self.repo_folder_text.value.strip().replace('\\\\', '/')\n", "        base_commit_msg = self.commit_msg_textarea.value or \"Upload file\"\n", "\n", "        success_count = 0\n", "        files_processed = 0\n", "\n", "        # --- NEW: Concurrent Upload Logic ---\n", "        use_concurrency = self.concurrent_uploads_checkbox.value\n", "        max_workers = 4 if use_concurrency else 1\n", "\n", "        with ThreadPoolExecutor(max_workers=max_workers) as executor:\n", "            future_to_path = {}\n", "            for path_str in file_paths:\n", "                path_in_repo_base = os.path.relpath(path_str, self.current_directory).replace('\\\\', '/')\n", "                path_in_repo = f\"{repo_folder_prefix}/{path_in_repo_base}\" if repo_folder_prefix.strip('/') else path_in_repo_base\n", "                commit_message_for_file = f\"{base_commit_msg} ({Path(path_str).name})\"\n", "\n", "                future = executor.submit(\n", "                    self.api.upload_file,\n", "                    path_or_fileobj=path_str,\n", "                    path_in_repo=path_in_repo,\n", "                    repo_id=repo_id,\n", "                    repo_type=repo_type,\n", "                    create_pr=self.create_pr_checkbox.value,\n", "                    commit_message=commit_message_for_file,\n", "                )\n", "                future_to_path[future] = path_str\n", "\n", "            for future in as_completed(future_to_path):\n", "                local_path_str = future_to_path[future]\n", "                file_name = Path(local_path_str).name\n", "                self.current_file_label.value = file_name\n", "\n", "                try:\n", "                    response_url = future.result()\n", "                    with self.output_area: print(f\"✅ Uploaded '{file_name}'\\n   View at: {response_url}\")\n", "                    success_count += 1\n", "                except Exception as e:\n", "                    with self.output_area:\n", "                        print(f\"❌ Error uploading {file_name}: {e}\")\n", "                        traceback.print_exc()\n", "                finally:\n", "                    files_processed += 1\n", "                    percentage = int((files_processed / total_files) * 100)\n", "                    self.progress_bar.value = percentage\n", "                    self.progress_percent_label.value = f\"{percentage}%\"\n", "                    self.file_count_label.value = f\"File {files_processed}/{total_files}\"\n", "\n", "        with self.output_area:\n", "            print(f\"\\n✨ Upload complete. {success_count}/{total_files} files uploaded successfully. 
✨\")\n", " # Final links logic (unchanged)\n", "\n", " def display(self) -> None:\n", " # --- MODIFIED: Layout updated for new widgets ---\n", " repo_select_box = HBox([self.org_name_text, self.repo_name_dropdown, self.fetch_repos_btn], layout=Layout(flex_flow='wrap', justify_content='space-between', align_items='center'))\n", " repo_opts_box = HBox([self.repo_type_dropdown, self.create_repo_checkbox, self.private_repo_checkbox], layout=Layout(flex_flow='wrap', justify_content='space-between', align_items='center', margin='5px 0'))\n", " \n", " dir_select_box = HBox([self.directory_label, self.directory_text, self.directory_update_btn], layout=Layout(width='100%', align_items='center'))\n", " file_opts_box = HBox([self.file_type_dropdown, self.sort_by_dropdown, self.recursive_search_checkbox], layout=Layout(flex_flow='wrap', justify_content='space-between', align_items='center'))\n", " \n", " commit_opts_box = HBox([self.single_commit_checkbox], layout=Layout(margin='5px 0'))\n", " \n", " upload_opts_box = HBox([self.create_pr_checkbox, self.clear_after_checkbox, self.concurrent_uploads_checkbox], layout=Layout(margin='5px 0', flex_flow='wrap'))\n", " action_buttons_box = HBox([self.upload_button, self.clear_output_button], layout=Layout(margin='10px 0 0 0', spacing='10px'))\n", "\n", " main_layout = VBox([\n", " self.repo_info_html, repo_select_box, repo_opts_box, self.repo_folder_text,\n", " HTML(\"
\"),\n", " self.file_section_html, file_opts_box, dir_select_box,\n", " self.file_picker_selectmultiple,\n", " HTML(\"
\"),\n", " self.commit_section_html, self.commit_msg_textarea, commit_opts_box,\n", " HTML(\"
\"),\n", " self.upload_section_html, upload_opts_box,\n", " action_buttons_box,\n", " self.progress_display_box,\n", " self.output_area\n", " ], layout=Layout(width='800px', padding='10px', border='1px solid lightgray'))\n", " \n", " display(main_layout)\n", "\n", "# How to use it:\n", "# uploader = SmartHuggingFaceUploader()\n", "# uploader.display()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# 🚀 \"Doro\" Uploader Widget! \n", "\n", "**Run the next cell to initiate the uploader widget!**\n" ] }, { "cell_type": "code", "execution_count": 6, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "🚀 Initializing the Smart Hugging Face Uploader...\n" ] }, { "data": { "application/vnd.jupyter.widget-view+json": { "model_id": "8d07f13a250d4d42a157b89c15148ae5", "version_major": 2, "version_minor": 0 }, "text/plain": [ "VBox(children=(HTML(value='📚 Repository Details'), HBox(children=(Text(value='Duskfallcrew', descriptio…" ] }, "metadata": {}, "output_type": "display_data" }, { "name": "stdout", "output_type": "stream", "text": [ "✅ Uploader interface is ready. You can now select files and upload.\n" ] } ], "source": [ "# --- Uploader Widget ---\n", "# This cell creates and displays the uploader interface.\n", "# Make sure you have run the cell containing the SmartHuggingFaceUploader class definition first!\n", "\n", "print(\"🚀 Initializing the Smart Hugging Face Uploader...\")\n", "\n", "# Use the new class name here\n", "uploader = SmartHuggingFaceUploader() \n", "uploader.display()\n", "\n", "print(\"✅ Uploader interface is ready. You can now select files and upload.\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## 🗂️ Smart Image Zipper\n", "\n", "This widget helps you create a zip archive of images from a specified folder. 
It's designed to give you more control and feedback than a simple zipping script.\n", "\n", "### Smart Features:\n", "* **Selective Zipping:** Choose which image file types (`.png`, `.jpg`, etc.) you want to include.\n", "* **Analyze Before Zipping:** Click \"Analyze Folder\" to get a preview of how many files will be included and their total size.\n", "* **Live Progress:** A progress bar shows the zipping process in real-time, which is essential for large datasets.\n", "* **Download Link:** Once complete, it provides a direct download link for your new archive." ] }, { "cell_type": "code", "execution_count": 13, "metadata": {}, "outputs": [], "source": [ "# Cell: Smart Image Zipper Widget\n", "# -----------------------------------------------------------------------------\n", "import ipywidgets as widgets\n", "from IPython.display import display, FileLink, HTML, clear_output\n", "import zipfile\n", "import os\n", "from pathlib import Path\n", "\n", "class SmartZipper:\n", " \"\"\"A widget to selectively zip image files with analysis and progress feedback.\"\"\"\n", " \n", " def __init__(self):\n", " self.files_to_zip = []\n", " self._create_widgets()\n", " self._bind_events()\n", "\n", " def _create_widgets(self):\n", " # --- 1. Folder & File Naming ---\n", " self.folder_path_text = widgets.Text(\n", " value=os.getcwd(),\n", " placeholder='Enter the path to the folder containing images',\n", " description='Source Folder:',\n", " style={'description_width': 'initial'},\n", " layout=widgets.Layout(width='98%')\n", " )\n", " self.zip_name_text = widgets.Text(\n", " value='image_archive',\n", " placeholder='Name for the final .zip file',\n", " description='Zip Name:',\n", " style={'description_width': 'initial'},\n", " layout=widgets.Layout(width='50%')\n", " )\n", "\n", " # --- 2. 
File Type Selection ---\n", " self.file_types_label = widgets.Label(value=\"Select image types to include:\")\n", " self.image_types_checkboxes = [\n", " widgets.Checkbox(description=ext, value=True, indent=False) \n", " for ext in ['.jpg', '.jpeg', '.png', '.webp', '.gif', '.bmp', '.tiff']\n", " ]\n", " self.image_types_box = widgets.HBox(self.image_types_checkboxes, layout=widgets.Layout(flex_flow='wrap'))\n", "\n", " # --- 3. Action Buttons & Progress ---\n", " self.analyze_button = widgets.Button(\n", " description=\"1. Analyze Folder\", \n", " button_style='info', \n", " icon='search',\n", " tooltip=\"Scan the folder and see what will be zipped.\"\n", " )\n", " self.zip_button = widgets.Button(\n", " description=\"2. Create Zip Archive\", \n", " button_style='success', \n", " icon='archive',\n", " tooltip=\"Start the zipping process.\",\n", " disabled=True # Disabled until analysis is complete\n", " )\n", " self.progress_bar = widgets.FloatProgress(\n", " value=0, min=0, max=1.0, description='Zipping:', \n", " bar_style='info', orientation='horizontal',\n", " layout=widgets.Layout(visibility='hidden', width='98%')\n", " )\n", " \n", " # --- 4. Output Area ---\n", " self.output_area = widgets.Output(layout=widgets.Layout(padding='10px', border='1px solid #ccc', margin_top='10px', width='98%'))\n", "\n", " # --- 5. Assemble the Layout ---\n", " self.layout = widgets.VBox([\n", " widgets.HTML(\"

1. Select Source and Name

\"),\n", " self.folder_path_text,\n", " self.zip_name_text,\n", " widgets.HTML(\"

2. Choose File Types

\"),\n", " self.image_types_box,\n", " widgets.HTML(\"

3. Execute

\"),\n", " widgets.HBox([self.analyze_button, self.zip_button]),\n", " self.progress_bar,\n", " self.output_area\n", " ], layout=widgets.Layout(width='700px', padding='10px', border='1px solid lightgray'))\n", "\n", " def _bind_events(self):\n", " self.analyze_button.on_click(self._analyze_folder)\n", " self.zip_button.on_click(self._create_zip_archive)\n", "\n", " def _set_busy_state(self, busy):\n", " \"\"\"Disable buttons during long operations; the zip button stays disabled until an analysis has found files.\"\"\"\n", " self.analyze_button.disabled = busy\n", " self.zip_button.disabled = busy or not self.files_to_zip\n", " self.analyze_button.icon = 'spinner' if busy else 'search'\n", "\n", " def _analyze_folder(self, b):\n", " self.zip_button.disabled = True\n", " self.files_to_zip.clear()\n", " self.output_area.clear_output()\n", " self._set_busy_state(True)\n", " \n", " source_folder = self.folder_path_text.value.strip()\n", " selected_extensions = [cb.description for cb in self.image_types_checkboxes if cb.value]\n", " \n", " with self.output_area:\n", " if not Path(source_folder).is_dir():\n", " display(HTML(\"Error: The specified source folder is not a valid directory.\"))\n", " self._set_busy_state(False)\n", " return\n", " \n", " print(f\"🔍 Scanning '{source_folder}' for {', '.join(selected_extensions)} files...\")\n", " \n", " total_size = 0\n", " for file_path in Path(source_folder).rglob('*'):\n", " if file_path.is_file() and file_path.suffix.lower() in selected_extensions:\n", " self.files_to_zip.append(file_path)\n", " total_size += file_path.stat().st_size\n", " \n", " # Convert the total size from bytes to megabytes\n", " size_mb = total_size / (1024 * 1024)\n", " \n", " if not self.files_to_zip:\n", " display(HTML(\"Warning: No matching image files were found.\"))\n", " self._set_busy_state(False)\n", " return\n", " \n", " display(HTML(f\"✅ Analysis Complete: Found {len(self.files_to_zip)} matching image files, with a total size of {size_mb:.2f} MB.\"))\n", " display(HTML(\"You can now proceed by clicking 'Create Zip Archive'.\"))\n", " 
self.zip_button.disabled = False\n", " \n", " self._set_busy_state(False)\n", "\n", " def _create_zip_archive(self, b):\n", " self._set_busy_state(True)\n", " self.progress_bar.value = 0\n", " self.progress_bar.layout.visibility = 'visible'\n", " self.output_area.clear_output()\n", " \n", " zip_name = self.zip_name_text.value.strip()\n", " final_zip_path = Path.cwd() / f\"{zip_name}.zip\"\n", " \n", " with self.output_area:\n", " if not zip_name:\n", " display(HTML(\"Error: Please provide a name for the zip file.\"))\n", " self.progress_bar.layout.visibility = 'hidden' # Nothing to zip, so hide the bar again\n", " self._set_busy_state(False)\n", " return\n", " \n", " print(f\"📦 Creating archive at: {final_zip_path}...\")\n", " \n", " try:\n", " with zipfile.ZipFile(final_zip_path, 'w', zipfile.ZIP_DEFLATED) as zipf:\n", " total_files = len(self.files_to_zip)\n", " for i, file_path in enumerate(self.files_to_zip):\n", " relative_path = file_path.relative_to(self.folder_path_text.value.strip())\n", " zipf.write(file_path, relative_path)\n", " self.progress_bar.value = (i + 1) / total_files\n", "\n", " display(HTML(f\"🎉 Successfully created '{final_zip_path.name}'!\"))\n", " display(HTML(\"You can now use this file in the uploader widget above, or download it using the link below.\"))\n", " display(FileLink(str(final_zip_path)))\n", "\n", " except Exception as e:\n", " display(HTML(f\"Error creating zip file: {e}\"))\n", " finally:\n", " self.progress_bar.layout.visibility = 'hidden'\n", " self._set_busy_state(False)\n", " self.zip_button.disabled = True # Force re-analysis for next run\n", "\n", " def display(self):\n", " \"\"\"Renders the widget in the notebook.\"\"\"\n", " display(self.layout)\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## 🗂️ Smart Image Zipper\n", "\n", "Run the next cell to display the Smart Image Zipper widget! 
" ] }, { "cell_type": "code", "execution_count": 14, "metadata": {}, "outputs": [ { "data": { "application/vnd.jupyter.widget-view+json": { "model_id": "e289eb19638e46c6a0448a807f0a8ac7", "version_major": 2, "version_minor": 0 }, "text/plain": [ "VBox(children=(HTML(value='

1. Select Source and Name

'), Text(value='/workspace/stable-diffusion-webui…" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "zipper = SmartZipper()\n", "zipper.display()" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [] } ], "metadata": { "colab": { "collapsed_sections": [ "IZ_JYwvBLrg-", "PNF2kdyeO3Dn" ], "private_outputs": true, "provenance": [] }, "kernelspec": { "display_name": "Python3 (ipykernel)", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.10.12" } }, "nbformat": 4, "nbformat_minor": 4 }