Master's Thesis: Synthetic Fence Imagery Using GANs

This repository contains implementations of several Generative Adversarial Network (GAN) architectures for unpaired image-to-image translation, focused on synthesizing fences into landscape images.

Key Features

  • Implementation of multiple GAN architectures:
    • Conditional GAN (CGAN)
    • CycleGAN
    • TurboCycleGAN (with Stable Diffusion Turbo integration)
  • PyTorch Lightning-based training framework
  • Support for different generator architectures (UNet, ResNet)
  • Support for different discriminator architectures
  • Data pipeline for unpaired image datasets
  • SLURM integration for HPC cluster training
  • Comprehensive logging with TensorBoard

Requirements

  • Python 3.8+
  • PyTorch
  • PyTorch Lightning (v2.5.0)
  • Torchvision
  • TensorBoard
  • LPIPS (Learned Perceptual Image Patch Similarity)
  • Transformers
  • Diffusers (v0.25.1)
  • PEFT (Parameter-Efficient Fine-Tuning)
  • Vision-aided loss
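
The exact pins live in requirements.txt; as a rough sketch grounded in the list above, it covers packages along these lines (only the two versions shown come from this README, and PyPI names such as vision-aided-loss are assumptions):

    torch
    torchvision
    pytorch-lightning==2.5.0
    tensorboard
    lpips
    transformers
    diffusers==0.25.1
    peft
    vision-aided-loss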

Installation

  1. Clone this repository:

    git clone https://github.com/leSullivan/unpaired_image_synthesis_with_gans.git
    cd unpaired_image_synthesis_with_gans
  2. Install the required packages:

    pip install -r requirements.txt
  3. Set up your environment variables:

    • Create a .env file with necessary paths
    • Set CHECKPOINT_PATH for model checkpoints
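
For example, a minimal .env might contain nothing more than the checkpoint path (the value is a placeholder):

    CHECKPOINT_PATH=/path/to/checkpoints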

Data Preparation

The project expects data to be organized in the following directory structure:

imgs/
├── backgrounds/       # Background landscape images
├── fences/            # Fence images
├── results/           # Generated results
└── training_data/     # Training dataset
    ├── trainBg/       # Training background images
    ├── trainFence/    # Training fence images
    ├── valBg/         # Validation background images
    └── valFence/      # Validation fence images
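
To make the expected layout concrete, here is a minimal sketch of an unpaired two-domain dataset over trainBg/ and trainFence/. It is illustrative only: the repository's actual loader lives in src/data_pipeline.py, and the class name and behavior below are assumptions.

    import os
    from PIL import Image
    from torch.utils.data import Dataset
    import torchvision.transforms as T

    class UnpairedImageDataset(Dataset):
        """Hypothetical loader for two unpaired image folders."""

        def __init__(self, bg_dir, fence_dir, img_h=512, img_w=768):
            self.bg_paths = sorted(os.path.join(bg_dir, f) for f in os.listdir(bg_dir))
            self.fence_paths = sorted(os.path.join(fence_dir, f) for f in os.listdir(fence_dir))
            self.transform = T.Compose([
                T.Resize((img_h, img_w)),
                T.ToTensor(),
                T.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5)),  # scale to [-1, 1]
            ])

        def __len__(self):
            # The domains are unpaired, so their sizes can differ.
            return max(len(self.bg_paths), len(self.fence_paths))

        def __getitem__(self, idx):
            # Wrap around the shorter domain; CycleGAN-style pipelines often
            # sample the second domain randomly instead.
            bg = Image.open(self.bg_paths[idx % len(self.bg_paths)]).convert("RGB")
            fence = Image.open(self.fence_paths[idx % len(self.fence_paths)]).convert("RGB")
            return self.transform(bg), self.transform(fence)

    # Usage with the directory tree above:
    # ds = UnpairedImageDataset("imgs/training_data/trainBg", "imgs/training_data/trainFence")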

Usage

Training a Model

To train a model, use the main.py script with appropriate arguments:

python main.py --model_name cyclegan --g_type unet_128 --d_type vagan --ngf 64 --img_h 512 --img_w 768 --num_epochs 400

Key parameters:

  • --model_name: Model architecture to use (cgan, cyclegan, turbo_cyclegan)
  • --g_type: Generator architecture (unet_128, resnet, etc.)
  • --d_type: Discriminator architecture
  • --ngf: Number of generator filters
  • --img_h/--img_w: Image dimensions
  • --lambda_perceptual/--lambda_cycle: Loss function weights (see the example invocation below)
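
For example, the loss weights can be overridden alongside the architecture flags (the weight values here are illustrative, not tuned recommendations):

python main.py --model_name cyclegan --g_type unet_128 --lambda_cycle 10.0 --lambda_perceptual 1.0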

SLURM Integration

For training on HPC clusters, use the provided SLURM scripts:

./scripts/queue_turbo_cycle_gan.sh

Or use the template scripts as a starting point:

./scripts/slurm_template.sh
./scripts/slurm_multigpu_template.sh

Monitoring Training

To monitor training progress with TensorBoard:

tensorboard --logdir=/path/to/logs

Alternatively, use the provided script to fetch logs from a remote server:

./scripts/get_tb_logs.sh

Project Structure

  • main.py: Main entry point for training
  • src/:
    • config.py: Configuration parameters
    • data_pipeline.py: Data loading and processing
    • frameworks/: GAN implementation frameworks
    • models/: Generator and discriminator model implementations
    • create_training_dataset.py: Dataset creation utilities
  • scripts/: Helper scripts for running on SLURM clusters
  • imgs/: Image directories
  • requirements.txt: Python dependencies

Configuration

The default configuration parameters are defined in src/config.py. These can be overridden via command-line arguments when running main.py.

Key configuration parameters:

  • Image dimensions and channels
  • Model hyperparameters
  • Training hyperparameters (learning rate, batch size, etc.)
  • Loss function weights
  • Data paths
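
As a sketch of how those defaults and overrides fit together, src/config.py plausibly builds an argparse parser along these lines (the flag names come from this README; the defaults marked as assumed and the exact file structure are assumptions):

    import argparse

    def get_config():
        """Hypothetical reconstruction of the config entry point."""
        parser = argparse.ArgumentParser()
        parser.add_argument("--model_name", default="cyclegan",
                            choices=["cgan", "cyclegan", "turbo_cyclegan"])
        parser.add_argument("--g_type", default="unet_128")
        parser.add_argument("--d_type", default="vagan")
        parser.add_argument("--ngf", type=int, default=64)      # generator filters
        parser.add_argument("--img_h", type=int, default=512)   # image height
        parser.add_argument("--img_w", type=int, default=768)   # image width
        parser.add_argument("--num_epochs", type=int, default=400)
        parser.add_argument("--lambda_cycle", type=float, default=10.0)      # assumed default
        parser.add_argument("--lambda_perceptual", type=float, default=1.0)  # assumed default
        return parser.parse_args()

Any flag defined this way can then be overridden on the command line, e.g. python main.py --ngf 128.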

Acknowledgements

This project builds upon several open-source implementations and research papers:

Papers

  • Zhu, J. Y., Park, T., Isola, P., & Efros, A. A. (2017). Unpaired image-to-image translation using cycle-consistent adversarial networks. IEEE International Conference on Computer Vision (ICCV).
  • Isola, P., Zhu, J. Y., Zhou, T., & Efros, A. A. (2017). Image-to-image translation with conditional adversarial networks. IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
  • Wang, T. C., Liu, M. Y., Zhu, J. Y., Tao, A., Kautz, J., & Catanzaro, B. (2018). High-resolution image synthesis and semantic manipulation with conditional GANs. IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
  • Parmar, G., Park, T., Narasimhan, S., & Zhu, J. Y. (2024). One-step image translation with text-to-image models. arXiv preprint arXiv:2403.12036.

Libraries

  • PyTorch Lightning - Framework for high-performance AI research
  • Diffusers - State-of-the-art diffusion models for image generation
  • PEFT - Parameter-Efficient Fine-Tuning techniques
  • LPIPS - Learned Perceptual Image Patch Similarity metric
