GANsformer: Generative Adversarial Transformers

Overview


GANsformer: Generative Adversarial Transformers

Drew A. Hudson* & C. Lawrence Zitnick

*I wish to thank Christopher D. Manning for the fruitful discussions and constructive feedback in developing the Bipartite Transformer, especially when explored within the language representation area, as well as for the kind financial support that allowed this work to happen!

This is an implementation of the GANsformer model, a novel and efficient type of transformer, explored for the task of image generation. The network employs a bipartite structure that enables long-range interactions across the image while maintaining linear computational efficiency, so it can readily scale to high-resolution synthesis. The model iteratively propagates information from a set of latent variables to the evolving visual features and vice versa, to support the refinement of each in light of the other and to encourage the emergence of compositional representations of objects and scenes. In contrast to the classic transformer architecture, it utilizes multiplicative integration that allows flexible region-based modulation, and can thus be seen as a generalization of the successful StyleGAN network.
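
As a rough illustration of the multiplicative integration, the sketch below (written in PyTorch; the module and variable names are illustrative and not taken from this codebase) shows how attention from the latent components to the feature grid can produce per-position scale and bias values that modulate the image features, generalizing StyleGAN's single global style vector:

    import torch
    import torch.nn as nn

    class RegionModulation(nn.Module):
        # Hypothetical sketch: latent components attend over feature positions
        # (simplex attention), and the attended result modulates each position
        # multiplicatively with its own scale and bias.
        def __init__(self, latent_dim, feat_dim):
            super().__init__()
            self.to_q = nn.Linear(feat_dim, feat_dim)      # queries from image features
            self.to_k = nn.Linear(latent_dim, feat_dim)    # keys from latent components
            self.to_v = nn.Linear(latent_dim, feat_dim)    # values from latent components
            self.to_gamma = nn.Linear(feat_dim, feat_dim)  # per-position scale
            self.to_beta = nn.Linear(feat_dim, feat_dim)   # per-position bias

        def forward(self, feats, latents):
            # feats:   [batch, positions, feat_dim]  (flattened H*W grid)
            # latents: [batch, k, latent_dim]        (k latent components)
            q, k, v = self.to_q(feats), self.to_k(latents), self.to_v(latents)
            attn = torch.softmax(q @ k.transpose(1, 2) / q.shape[-1] ** 0.5, dim=-1)
            update = attn @ v  # information flowing from the latents to the features
            # Multiplicative integration: each position gets its own scale and bias,
            # rather than one global style shared by all positions.
            return feats * (1 + self.to_gamma(update)) + self.to_beta(update)

In the actual model this kind of update is interleaved with the convolution and up-sampling layers of the synthesis network, as described below.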

Instructions for model training and data preparation, as well as pretrained models, will be available soon.
Note that the code is still going through some refactoring and clean-up. It will be ready to run by March 3. Stay tuned!
(Code clean-up by March 3, all instructions by March 7, pretrained networks by March 20)

Bibtex

@article{hudson2021gansformer,
  title={Generative Adversarial Transformers},
  author={Hudson, Drew A and Zitnick, C. Lawrence},
  journal={arXiv preprint},
  year={2021}
}

Architecture overview

The GANsformer consists of two networks:

  • Generator: produces the images (x) given randomly sampled latents (z). The latent z has shape [batch_size, component_num, latent_dim], where component_num = 1 by default (vanilla GAN, StyleGAN) but is > 1 for the GANsformer model. We obtain the latent components z_1,...,z_k by splitting z along the second dimension. The generator consists of two parts:

    • Mapping network: a series of feed-forward layers that converts latents sampled from a normal distribution (z) to the intermediate space (w). The k latent components are either mapped independently from the z space to the w space or interact with each other through self-attention (optional flag).
    • Synthesis network: the intermediate latents w guide the generation of new images. Image features begin as a small constant/sampled 4x4 grid, and then go through multiple layers of convolution and up-sampling until reaching the desired resolution (e.g. 256x256). After each convolution, the image features are modulated (meaning that their variance and bias are controlled) by the intermediate latent vectors w. While in the StyleGAN model there is one global w vector that controls all the features equally, the GANsformer uses attention so that the k latent components specialize to control different regions of the image and create it cooperatively, which is especially beneficial for generating images depicting multi-object scenes.
    • Attention can be used in several ways (see the sketch after this list):
      • Simplex Attention: attention is applied in one direction only, from the latents to the image features (top-down).
      • Duplex Attention: attention is applied in two directions: latents to image features (top-down) and then image features back to latents (bottom-up), so that each representation informs the other iteratively.
      • Self-Attention between latents: can also be used to enable direct interactions between the latents.
      • Self-Attention between image features (SAGAN model): prior approaches applied attention directly between the image features, but this does not scale well because its cost is quadratic in the number of features, which becomes very high at high resolutions.
  • Discriminator: receives an image and has to predict whether it is real or fake, i.e. whether it originates from the dataset or the generator. The model performs multiple layers of convolution and downsampling on the image, gradually reducing the representation's resolution until making a final prediction. Optionally, attention can be incorporated into the discriminator as well: it then has multiple (k) aggregator variables that use attention to adaptively collect information from the image as it is being processed. We observe small improvements in model performance when attention is used in the discriminator, although in our observations most of the gain from attention arises in the generator.
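
To make the difference between the attention variants concrete, here is a minimal sketch of the information flow (illustrative PyTorch, not the repository's implementation; for brevity it uses simple residual updates instead of the multiplicative modulation sketched earlier, and assumes the latents and features share the same channel dimension):

    import torch

    def attend(queries, keys, values):
        # Scaled dot-product attention: queries gather information from keys/values.
        scores = queries @ keys.transpose(1, 2) / keys.shape[-1] ** 0.5
        return torch.softmax(scores, dim=-1) @ values

    def simplex_step(feats, latents):
        # Simplex attention: one direction only, latents -> image features (top-down).
        return feats + attend(feats, latents, latents)

    def duplex_step(feats, latents):
        # Duplex attention: the latents first gather information from the features
        # (bottom-up), then the updated latents propagate information back to the
        # features (top-down), so each representation informs the other.
        latents = latents + attend(latents, feats, feats)
        feats = feats + attend(feats, latents, latents)
        return feats, latents

Duplex attention therefore lets the latent components summarize the current image features before redistributing that information back to the feature grid, which is what encourages them to specialize to different regions.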

Codebase

This codebase builds on top of and extends the great StyleGAN2 repository by Karras et al.
The GANsformer model can also be seen as a generalization of StyleGAN: while StyleGAN has one global latent vector that controls the style of all image features, the GANsformer has k latent vectors that cooperate through attention to control regions within the image, thereby better modeling images of multi-object and compositional scenes.
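
For intuition, here is a minimal sketch of the latent structure (the shapes and the stand-in mapping network below are illustrative, not the repository's defaults):

    import torch
    import torch.nn as nn

    batch_size, k, latent_dim = 4, 16, 32        # illustrative sizes

    # Sample k latent components per image; with k = 1 this reduces to the
    # single global latent of StyleGAN.
    z = torch.randn(batch_size, k, latent_dim)

    # A stand-in for the mapping network: the same MLP maps every component
    # from the z space to the intermediate w space independently.
    mapping = nn.Sequential(nn.Linear(latent_dim, latent_dim), nn.LeakyReLU(0.2),
                            nn.Linear(latent_dim, latent_dim))
    w = mapping(z)  # [batch_size, k, latent_dim], applied along the last dimension

    # Each of the k components of w then controls a different region of the image
    # through attention in the synthesis network (see the sketches above).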

More documentation and instructions will be coming soon!

Comments
  • Do you have any plans to export a pytorch version?

    Hi, I am not too familiar with tensorflow... If there are no such plans currently, do you have quick pointers to:

    1. the GANsformer model, especially where and how you deal with the latents (based on your paper, you split the latents?)
    2. what kind of optimizer are you using, and how is it implemented? Is it similar to what we do in NLP (warmup, etc.)?
    3. did you ever try using the standard feed-forward after your duplex attention layer instead of 3x3? Did it still work?

    Thanks again for your kind attention! Best,

    opened by MultiPath 12
  • Some Errors On Training

    Thank you for your great work. I appreciate it a lot.

    I just tried to train a model with your code; however, there are lots of undefined variables used. For example:

    https://github.com/dorarad/gansformer/blob/148f72964219f8ead2621204bc5cfa89200b6879/training/network.py#L795

    It throws an undefined-variable error for 'maps_in'. When I fix that with a constant, I get another error from

    https://github.com/dorarad/gansformer/blob/148f72964219f8ead2621204bc5cfa89200b6879/training/network.py#L811

    where again gen_mod and gen_cond are not defined. When I fix those with constants as well, I get another error which says:

    gansformer-main/gansformer-main/training/network.py", line 1127, in G_synthesis
        grid_poses = get_positional_embeddings(resolution_log2, pos_dim or dlatent_size, pos_type, pos_directions_num, init = pos_init, **_kwargs)
    TypeError: get_positional_embeddings() got an unexpected keyword argument 'label_size'

    Am I missing something, or is there a problem?

    opened by yilmazkorkmz 10
  • CLEVR pretrained model gives FID 22

    Hi, kudos for great work!

    I've just noticed that with the recommended preprocessing and evaluation, the metrics on gdrive:cityscapes work as expected (FID ~5.2), while for CLEVR exactly the same two lines:

    python prepare_data.py --clevr --max-images 100000
    python run_network.py --eval --gpus 0 --expname clevr-exp --dataset clevr --pretrained-pkl gdrive:clevr-snapshot.pkl
    

    give FID ~22, not 9.2. Can you please double-check that the provided snapshot is correct? Or am I missing something here?

    Thanks in advance!

    opened by JanRocketMan 8
  • kernel error in generate.py

    With Python 3.7, tensorflow-gpu 1.15.0, CUDA 10.0, and cuDNN 7.5, I get the error below in generate.py; it appears to require cuDNN 7.6.5, which in turn brings a different error (see the second part). Any advice?

    ... Could not load dynamic library 'libcudnn.so.7'; dlerror: libcudnn.so.7: cannot open shared object file

    ........... Total 35894608

    Generate images... 0%| | 0/8 [00:01<?, ?image (1 batches of 8 images)/s]
    Traceback (most recent call last):
      File "/vulcanscratch/yaser/miniconda3/envs/yygentransformer/lib/python3.7/site-packages/tensorflow_core/python/client/session.py", line 1365, in _do_call
        return fn(*args)
      File "/vulcanscratch/yaser/miniconda3/envs/yygentransformer/lib/python3.7/site-packages/tensorflow_core/python/client/session.py", line 1350, in _run_fn
        target_list, run_metadata)
      File "/vulcanscratch/yaser/miniconda3/envs/yygentransformer/lib/python3.7/site-packages/tensorflow_core/python/client/session.py", line 1443, in _call_tf_sessionrun
        run_metadata)
    tensorflow.python.framework.errors_impl.InvalidArgumentError: No OpKernel was registered to support Op 'FusedBiasAct' used by {{node Gs/_Run/Gs/G_mapping/AttLayer_0/FusedBiasAct}} with these attrs: [gain=1, T=DT_FLOAT, axis=1, alpha=0, grad=0, act=1]
    Registered devices: [CPU, XLA_CPU, XLA_GPU]
    Registered kernels:
      device='GPU'; T in [DT_HALF]
      device='GPU'; T in [DT_FLOAT]

         [[Gs/_Run/Gs/G_mapping/AttLayer_0/FusedBiasAct]]
    

    CUDNN 7.6.5 error:

    .... Total 35894608
    Generate images... 0%| | 0/8 [00:01<?, ?image (1 batches of 8 images)/s]
    Traceback (most recent call last):
      File "/vulcanscratch/yaser/miniconda3/envs/yygentransformer/lib/python3.7/site-packages/tensorflow_core/python/client/session.py", line 1365, in _do_call
        return fn(*args)
      File "/vulcanscratch/yaser/miniconda3/envs/yygentransformer/lib/python3.7/site-packages/tensorflow_core/python/client/session.py", line 1350, in _run_fn
        target_list, run_metadata)
      File "/vulcanscratch/yaser/miniconda3/envs/yygentransformer/lib/python3.7/site-packages/tensorflow_core/python/client/session.py", line 1443, in _call_tf_sessionrun
        run_metadata)
    tensorflow.python.framework.errors_impl.InternalError: 2 root error(s) found.
      (0) Internal: cudaErrorNoKernelImageForDevice
         [[{{node Gs/_Run/Gs/G_mapping/global/Dense0_0/FusedBiasAct}}]]
         [[Gs/_Run/Gs/maps_out/_3151]]
      (1) Internal: cudaErrorNoKernelImageForDevice
         [[{{node Gs/_Run/Gs/G_mapping/global/Dense0_0/FusedBiasAct}}]]
    0 successful operations. 0 derived errors ignored.

    opened by yaseryacoob 8
  • About the Duplex attention

    Hi, Thanks for sharing the code!

    I have a few questions about Section 3.1.2. Duplex attention.

    1. I am confused by the notation in this section. For example, "Y=(K^{P\times d}, V^{P\times d}), where the values store the content of the Y variables (e.g. the randomly sampled latents for the case of GAN)". Does it mean that V^{P\times d} is sampled from the original variable Y? How is the number P set in your code?

    2. "keys track the centroids of the attention-based assignments from X to Y, which can be computed as K=a_b(Y, X)": does it mean K is calculated by using the self-attention module but with (Y, X) as input? If so, how should one understand "the keys track the centroids of the attention-based assignments from X to Y"? BTW, how are the centroids obtained?

    3. For the update rule in duplex attention, what does the a() function mean? Does it denote a self-attention module like a_b() in Section 3.1.1, with X as queries, K as keys, and V as values? If so, since K is calculated from another self-attention module as mentioned in question 2, the output of a_b(Y, X) is treated as the keys, so the update rule contains two self-attention operations. Is that right? Is that what 'Duplex' attention means?

    4. But finally I suspect I may be wrong, based on the last paragraph of this section: "to support bidirectional interaction between elements, we can chain two reciprocal simplex attentions from X to Y and from Y to X, obtaining the duplex attention". So does it mean that we first update Y using a simplex attention module u^a(Y, X), and then use this updated Y as the input of u^d(X, Y) to update X? Does that mean the duplex attention module contains three self-attention operations?

    Thanks a lot! :)

    opened by AndrewChiyz 7
  • FID VQ-GAN

    Thank you for open-sourcing your code :)

    I was wondering about the generally very high FID values reported for VQGAN. In the VQGAN paper, they report an FID of 11.4 on, e.g., FFHQ 256x256, whereas you report 63.1... Any idea why they are so different?

    Thanks!

    opened by xl-sr 7
  • PyTorch implementation generates same image samples

    Hi, I'm getting the same output image samples (see below) when I train the PyTorch implementation on FFHQ from scratch. The only changes I made (due to some memory issues mentioned in #33) were adding --batch-gpu 1 and removing the attention-map saving functionality (commenting out pytorch_version/training/visualize.py lines 167-206).

    python run_network.py --train --gpus 0 --batch-gpu 1 --ganformer-default --expname ffhq-scratch --dataset ffhq

    (attached sample images 000120 and 000240 not included)

    opened by kwhuang88228 6
  • Metrics PR Error

    Dear authors,

    Thank you for your wonderful contribution!!!

    When I tried to get precision and recall values during training by adding the option --metric pr, I got the following error:


    \precision_recall.py", line 179, in _evaluate
        feats = self._gen_feats(Gs, inception, minibatch_size, num_gpus, Gs_kwargs)
    NameError: name 'inception' is not defined

    So I have changed the lines in precision_recall.py. After the modification, it seems to work. I would greatly appreciate it if you could kindly review my modification.


    def _evaluate(self, Gs, Gs_kwargs, num_gpus, num_imgs, paths = None, **kwargs):
        if paths is not None:
            # Extract features for local sample image files (paths)
            eval_features = self._paths_to_feats(paths, feat_func, minibatch_size, num_gpus, num_imgs)   # <-- modified
        else:
            # Extract features for newly generated fake imgs
            eval_features = self._gen_feats(Gs, feature_net, minibatch_size, num_imgs, num_gpus, Gs_kwargs)   # <-- modified

        # Compute precision and recall
        state = knn_precision_recall_features(ref_features = ref_features, eval_features = eval_features,
            feature_net = feature_net, nhood_sizes = [self.nhood_size], row_batch_size = self.row_batch_size,
            col_batch_size = self.row_batch_size, num_gpus = num_gpus, num_imgs = num_imgs)   # <-- modified
        self._report_result(state.knn_precision[0], suffix = "_precision")
        self._report_result(state.knn_recall[0], suffix = "_recall")
    
    opened by bwhwang 6
  • Memory issue when training 1024 resolution

    I'm trying to train on a 1024x1024 dataset on a V100 GPU. I tried both the TensorFlow version and the PyTorch version. Despite setting batch-gpu to 1, the TensorFlow version always runs out of system RAM (after the first tick; total system RAM is 51 GB), and the PyTorch version always runs out of CUDA memory (before the first tick).

    Here are my training settings:

    python run_network.py --train --metrics 'none' --gpus 0 --batch-gpu 1 --resolution 1024 \
     --ganformer-default --expname art1 --dataset 1024art
    

    Also, I always encounter the warning: tcmalloc: large alloc

    opened by BlueberryGin 5
  • Issues with docker

    Hi,

    I'm trying to dockerize using this image - tensorflow/tensorflow:1.14.0-gpu-py3.

    FROM tensorflow/tensorflow:1.14.0-gpu-py3
    
    ARG USER="test"
    ARG WORK_DIR="/home/$USER"
    
    WORKDIR $WORK_DIR
    
    RUN apt-get update && apt-get install build-essential
    
    RUN apt-get install ffmpeg libsm6 libxext6  -y
    
    RUN pip install --upgrade pip setuptools wheel
    
    COPY . ./
    
    RUN pip install -r requirements.txt
    
    RUN python generate.py --gpus 0 --model gdrive:bedrooms-snapshot.pkl --output-dir images --images-num 4
    

    However, I am getting this error:

    Downloading https://drive.google.com/uc?id=1-2L3iCBpP_cf6T2onf3zEQJFAAzxsQne .... done

    Setting up TensorFlow plugin 'upfirdn_2d.cu': Preprocessing... Compiling... Loading... bin_file: /home/test/dnnlib/tflib/_cudacache/upfirdn_2d_1.14_.so
    Failed!
    Traceback (most recent call last):
      File "generate.py", line 49, in <module>
        main()
      File "generate.py", line 46, in main
        run(**vars(args))
      File "generate.py", line 22, in run
        G, D, Gs = load_networks(model)    # Load pre-trained network
      File "/home/test/pretrained_networks.py", line 30, in load_networks
        G, D, Gs = pickle.load(stream, encoding = "latin1")[:3]
      File "/home/test/dnnlib/tflib/network.py", line 306, in __setstate__
        self._init_graph()
      File "/home/test/dnnlib/tflib/network.py", line 159, in _init_graph
        out_expr = self._build_func(*self.input_templates, **build_kwargs)
      File "<string>", line 2371, in G_synthesis_stylegan2
      File "/home/test/dnnlib/tflib/ops/upfirdn_2d.py", line 229, in downsample_2d
        return _simple_upfirdn_2d(x, k, down=factor, pad0=(p+1)//2, pad1=p//2, data_format=data_format, impl=impl)
      File "/home/test/dnnlib/tflib/ops/upfirdn_2d.py", line 358, in _simple_upfirdn_2d
        y = upfirdn_2d(y, k, upx=up, upy=up, downx=down, downy=down, padx0=pad0, padx1=pad1, pady0=pad0, pady1=pad1, impl=impl)
      File "/home/test/dnnlib/tflib/ops/upfirdn_2d.py", line 61, in upfirdn_2d
        return impl_dict[impl](x=x, k=k, upx=upx, upy=upy, downx=downx, downy=downy, padx0=padx0, padx1=padx1, pady0=pady0, pady1=pady1)
      File "/home/test/dnnlib/tflib/ops/upfirdn_2d.py", line 139, in _upfirdn_2d_cuda
        return func(x)
      File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/ops/custom_gradient.py", line 162, in decorated
        return _graph_mode_decorator(f, *args, **kwargs)
      File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/ops/custom_gradient.py", line 183, in _graph_mode_decorator
        result, grad_fn = f(*args)
      File "/home/test/dnnlib/tflib/ops/upfirdn_2d.py", line 131, in func
        y = _get_plugin().up_fir_dn2d(x=x, k=kc, upx=upx, upy=upy, downx=downx, downy=downy, padx0=padx0, padx1=padx1, pady0=pady0, pady1=pady1)
      File "/home/test/dnnlib/tflib/ops/upfirdn_2d.py", line 14, in _get_plugin
        return custom_ops.get_plugin(os.path.splitext(__file__)[0] + '.cu')
      File "/home/test/dnnlib/tflib/custom_ops.py", line 162, in get_plugin
        plugin = tf.load_op_library(bin_file)
      File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/framework/load_library.py", line 61, in load_op_library
        lib_handle = py_tf.TF_LoadLibrary(library_filename)
    tensorflow.python.framework.errors_impl.NotFoundError: /home/test/dnnlib/tflib/_cudacache/upfirdn_2d_1.14_.so: undefined symbol: _ZN10tensorflow12OpDefBuilder4AttrENSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEE

    error building image: error building stage: failed to execute command: waiting for process to exit: exit status 1

    Please help to check and advise. Thanks!

    opened by arsyad-ah 5
  • Cannot utilize multiple CPU cores

    Hi-

    Thank you for making such a fascinating project available here!

    I'm trying to run ganformer within a conda environment, but am having problems getting ganformer to utilize multiple CPU cores.

    Using Ubuntu 20.04. Here is the setup for the conda environment used:

    conda create --name cuda10 python=3.7
    conda activate cuda10
    conda install tensorflow-gpu=1.14
    conda install pillow h5py requests tqdm termcolor seaborn
    pip install opencv-python lmdb gdown easydict
    

    To run it

    python gansformer/run_network.py --train --pretrained-pkl None --gpus 0,1 --ganformer-default --expname myDS_256 --dataset myDS --data-dir /data/myDS_256_tf --keep-samples --metrics none --result-dir training_runs/256_c1/ --num-threads 24 --minibatch-size 16
    

    Everything seems to be running correctly; there are no errors or crashes. The only problem is slow training initialization and low GPU utilization during training. System Monitor shows that only one CPU core is used at a time, so I'm guessing this is the cause of both issues. Do you have any ideas about what might be causing the restriction to a single CPU core?

    I always try to avoid raising an issue when something obvious might be wrong on my end, but this is my first time using conda so it might be that I'm simply using it incorrectly, or that I'm using your program incorrectly. I appreciate your patience if that is the case.

    Thank you for your attention to this issue!

    opened by abstractdonut 4
  • question on duplex attention (k means) code

    First, thank you for this amazing work!

    I suspect that an indentation is missing at the following position in the code:

    https://github.com/dorarad/gansformer/blob/3a9efa4545be25604b70560b7f491ec3633c14a3/pytorch_version/training/networks.py#L784

    The reason for my suspicion is that, if the code is executed as is, the actual key values (to_tensor) are never involved in the computation of the attention scores when k-means is enabled. If I am mistaken, would you mind explaining why line 787 replaces the original attention scores with the values computed here (where the embedding "to_centroids" seems to be initialized as a mapping of the queries)?

    opened by nintendops 0
  • Training won't work, needs tensorflow.contrib which was removed in tf version 1.14

    When running:

    python3 run_network.py --train --ganformer-default --expname test --dataset plant --eval-images-num 10000

    the following error appears:

    I tensorflow/core/platform/cpu_feature_guard.cc:193] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations: AVX2 AVX512F AVX512_VNNI FMA
    To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
    2022-10-11 14:56:30.661744: I tensorflow/core/util/util.cc:169] oneDNN custom operations are on. You may see slightly different numerical results due to floating-point round-off errors from different computation orders. To turn them off, set the environment variable TF_ENABLE_ONEDNN_OPTS=0.
    2022-10-11 14:56:30.690985: E tensorflow/stream_executor/cuda/cuda_blas.cc:2981] Unable to register cuBLAS factory: Attempting to register factory for plugin cuBLAS when one has already been registered
    2022-10-11 14:56:31.202500: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libnvinfer.so.7'; dlerror: libnvinfer.so.7: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /usr/local/cuda-11.7/lib64
    2022-10-11 14:56:31.202557: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libnvinfer_plugin.so.7'; dlerror: libnvinfer_plugin.so.7: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /usr/local/cuda-11.7/lib64
    2022-10-11 14:56:31.202565: W tensorflow/compiler/tf2tensorrt/utils/py_utils.cc:38] TF-TRT Warning: Cannot dlopen some TensorRT libraries. If you would like to use Nvidia GPU with TensorRT, please make sure the missing libraries mentioned above are installed properly.
    Traceback (most recent call last):
      File "/home/ali/gansformer/run_network.py", line 15, in <module>
        import pretrained_networks
      File "/home/ali/gansformer/pretrained_networks.py", line 4, in <module>
        import dnnlib.tflib as tflib
      File "/home/ali/gansformer/dnnlib/tflib/__init__.py", line 1, in <module>
        from . import autosummary
      File "/home/ali/gansformer/dnnlib/tflib/autosummary.py", line 23, in <module>
        from . import tfutil
      File "/home/ali/gansformer/dnnlib/tflib/tfutil.py", line 9, in <module>
        import tensorflow.contrib # requires TensorFlow 1.x!
    ModuleNotFoundError: No module named 'tensorflow.contrib'

    opened by AliMezher18 0
  • Hosting models on Hugging Face

    Hello! Thank you for open-sourcing this work, this is amazing 😊 I was wondering if you'd be interested in mirroring the pretrained model weights over on the Hugging Face model hub. I'm sure our community would love to see your work, and (among other things) hosting checkpoints on the Hub helps a lot with discoverability. We've got a guide here on how to upload models, but I'm also happy to help out with it if you'd like!

    opened by NimaBoscarino 0
  • Ganformer2

    Thanks for your brilliant work on GANsformer and GANsformer2! May I ask whether there is a rough timeline for when the GANsformer2 model will be released? Thanks for your time!

    opened by yangkang98 0
Releases (v1.5.2)
  • v1.5.2 (Feb 2, 2022)

    Official implementation of the Generative Adversarial Transformers paper, in both PyTorch and TensorFlow, for image and compositional scene generation. The codebase supports training, evaluation, image sampling, and a variety of visualizations.

    Updates for version 1.5.2 (Feb 22, 2022): We updated the weight initialization of the PyTorch version to the intended scale, leading to a substantial improvement in the model's learning speed.

    Source code(tar.gz)
    Source code(zip)
  • v1.0 (Mar 17, 2021)

    Official implementation of the Generative Adversarial Transformers paper for image and compositional scene generation. The codebase supports training, evaluation, image sampling, and a variety of visualizations.

    Source code(tar.gz)
    Source code(zip)
Owner
Drew Arad Hudson