App.enfugue.ai Versions

ENFUGUE is an open-source web app for making studio-grade images and video using generative AI.

0.3.3

3 months ago

Installation and Running

A script is provided for Windows and Linux machines to install, update, and run ENFUGUE. Copy the relevant command below and answer the on-screen prompts to choose your installation type and install optional dependencies.

Windows

Access the command prompt from the Start menu by searching for "command." Alternatively, hold the Windows key and press X, then press R or click Run, then type cmd and press Enter or click OK.

curl https://raw.githubusercontent.com/painebenjamin/app.enfugue.ai/main/enfugue.bat -o enfugue.bat
.\enfugue.bat

Linux

Access a command shell using your preferred method and execute the following.

curl https://raw.githubusercontent.com/painebenjamin/app.enfugue.ai/main/enfugue.sh -o enfugue.sh
chmod u+x enfugue.sh
./enfugue.sh

Both of these commands accept the same flags.

USAGE: enfugue.(bat|sh) [OPTIONS]
Options:
 --help                   Display this help message.
 --conda / --portable     Automatically set installation type (do not prompt.)
 --update / --no-update   Automatically apply or skip updates (do not prompt.)
 --mmpose / --no-mmpose   Automatically install or skip installing MMPose (do not prompt.)
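
For example (a hedged illustration composed from the flags above), to run unattended with a conda installation, automatic updates, and without MMPose:

./enfugue.sh --conda --update --no-mmpose

On Windows, the same flags apply to the batch script, e.g. .\enfugue.bat --conda --update --no-mmpose.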

Windows/Linux Manual Installation

If you want to install without using the installation scripts, see this Wiki page.

MacOS

Automatic installers are coming! For now, please follow this manual installation method.

Download enfugue-server-0.3.3-macos-ventura-mps-x86_64.tar.gz, then double-click it to extract the package. When you run the application using the command below, your Mac will warn you about running downloaded packages, and you will have to perform an administrator override to allow it to run; you will be prompted to do this. To avoid this, you can run an included command like so:

./enfugue-server/unquarantine.sh

This command finds all the files in the installation and removes the com.apple.quarantine xattr from each of them. This does not require administrator privileges. After doing this (or if you choose to grant the override), run the server with:

./enfugue-server/enfugue.sh
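
As an aside, if you prefer not to run the bundled unquarantine.sh, the same effect can usually be achieved with macOS's built-in xattr tool. This is a hedged one-liner, assuming the package was extracted to ./enfugue-server:

xattr -dr com.apple.quarantine ./enfugue-server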

Note: while the MacOS packages are compiled on x86 machines, they are tested and designed for the new M1/M2 ARM machines thanks to Rosetta, Apple's machine code translation system.

New Features

1. DragNUWA


The interface for DragNUWA and a resulting animation.

DragNUWA is an exciting new way to control Stable Video Diffusion released by ProjectNUWA and Microsoft. It allows you to draw the direction and speed of motion over the course of an animation. An entirely new motion vector interface has been created to allow for easy input into this complicated system.

Review the video below for information on how to use DragNUWA, including controls for creating and modifying motions.

https://user-images.githubusercontent.com/57536852/297960767-ae28ac55-2eba-4315-9362-29dc41cdd8d4.mp4

2. SVD Integration


Using Stable Video Diffusion for Text-to-Image-to-Video

To go along with the above, Stable Video Diffusion has been removed from the "Extras" menu and added to the main sidebar. When you enable animation, you will now be able to select between SVD and AnimateDiff/HotshotXL.

At the moment, SVD is treated as a post-processing step. Because there is no text-to-video yet for SVD, it will be treated as if you are making an image, and the image-to-video portion will be executed afterwards.

3. Altered Directory Structuring, Model Standardization and Improved Initialization


The installation manager and the new initialization popup.

To better facilitate sharing between ENFUGUE and other Stable Diffusion web applications, a small handful of changes have been made.

  1. There are now more directories under the root directory to place different models. This matches more closely with other UIs.
  2. At initialization, you may now specify a different root directory. When doing so, the remaining directories will auto-configure; for example, if you point the root directory to the models folder in a stable-diffusion-webui installation, the checkpoint directory will configure itself to be the same as that application's Stable-diffusion directory.
  3. Whenever you attempt to load a model that can be downloaded, such as a default checkpoint, the entire root directory will be scanned to locate it prior to downloading it, for all known AI model formats. This helps reduce the need for file duplication in situations where files are not in the expected location.
  4. VAE and ControlNets were previously downloaded in diffusers format when they were not found. This has been changed and now all model resources are downloaded in runway/stability format, again to best enable cross-application compatibility.

NOTE! As a result of the new structure, all of the files in the /cache folder that begin with models-- may be deleted.

4. Improved Flexible Domain Routing

To help facilitate users running ENFUGUE on shared or on-demand server resources, networking has been improved for situations where you must communicate with ENFUGUE through a proxy. You should no longer need to explicitly configure a domain or paths for such situations; ENFUGUE should be able to determine from the request headers that you are using a proxy, and the UI will adjust paths accordingly, both with and without SSL.

If you were previously configuring server.domain or server.cms.path.root manually, you can set server.domain to null and remove server.cms.path.root to enable flexible domain routing. You should find that simply accessing the reported proxy URL works with no further configuration needed.

Full Changelog: https://github.com/painebenjamin/app.enfugue.ai/compare/0.3.2...0.3.3

What's Next

Planned for 0.3.4

These are all repeats from the previous release - DragNUWA was a surprise and diverted attention!

1. Images/Video on the Timeline

The Prompt Travel interface will be expanded to allow images and video to be manually placed on the timeline.

2. Audio

Audio will additionally be added to the timeline, and will be an input for audio-reactive diffusion.

3. Additional Model Support

  • IP Adapter + FaceID
  • PhotoMaker
  • SVD ControlNet
  • PIA (Personalized Image Adapter)

Planned for 0.4.0

1. 3D/4D Diffusion

Support Stable123 and other 3D model generators.

Thank you!

0.3.2

4 months ago

Installation and Running

A script is provided for Windows and Linux machines to install, update, and run ENFUGUE. Copy the relevant command below and answer the on-screen prompts to choose your installation type and install optional dependencies.

Windows

Access the command prompt from the Start menu by searching for "command." Alternatively, hold the Windows key and press X, then press R or click Run, then type cmd and press Enter or click OK.

curl https://raw.githubusercontent.com/painebenjamin/app.enfugue.ai/main/enfugue.bat -o enfugue.bat
.\enfugue.bat

Linux

Access a command shell using your preferred method and execute the following.

curl https://raw.githubusercontent.com/painebenjamin/app.enfugue.ai/main/enfugue.sh -o enfugue.sh
chmod u+x enfugue.sh
./enfugue.sh

Both of these commands accept the same flags.

USAGE: enfugue.(bat|sh) [OPTIONS]
Options:
 --help                   Display this help message.
 --conda / --portable     Automatically set installation type (do not prompt.)
 --update / --no-update   Automatically apply or skip updates (do not prompt.)
 --mmpose / --no-mmpose   Automatically install or skip installing MMPose (do not prompt.)

Windows/Linux Manual Installation

If you want to install without using the installation scripts, see this Wiki page.

MacOS

Automatic installers are coming! For now, please follow this manual installation method.

Download enfugue-server-0.3.2-macos-ventura-mps-x86_64.tar.gz, then double-click it to extract the package. When you run the application using the command below, your Mac will warn you about running downloaded packages, and you will have to perform an administrator override to allow it to run; you will be prompted to do this. To avoid this, you can run an included command like so:

./enfugue-server/unquarantine.sh

This command finds all the files in the installation and removes the com.apple.quarantine xattr from each of them. This does not require administrator privileges. After doing this (or if you choose to grant the override), run the server with:

./enfugue-server/enfugue.sh

Note: while the MacOS packages are compiled on x86 machines, they are tested and designed for the new M1/M2 ARM machines thanks to Rosetta, Apple's machine code translation system.

New Features

1. Multi-Server, Flexible Domain Routing

  • ENFUGUE now supports running multiple servers at once, listening on different ports and optionally different hosts and protocols.
  • By default, ENFUGUE runs two servers: one on port 45554 over HTTPS (as it has been) and a second on port 45555 over HTTP.
  • If accessing via https://app.enfugue.ai:45554 does not work for your networking setup, you can now connect to ENFUGUE using http://127.0.0.1:45555 or any other IP address/hostname that resolves to the machine running ENFUGUE.
  • Configuration syntax remains the same; however, the host, domain, port, secure, cert, and key keys can now accept lists/arrays.
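
As a quick sanity check of the two default listeners from the machine running ENFUGUE (a hedged example; adjust the ports if you have changed the defaults):

curl -I https://app.enfugue.ai:45554
curl -I http://127.0.0.1:45555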

2. SDXL Turbo + Reduced Overhead (~25% Faster Single-Image Generation)


Using SDXL Turbo, resulting in image generation in one second or less.

  • Fully integrated into all workflows; derivatives also supported (turbo fine-tuned models like Dreamshaper Turbo, etc.)
  • The round-trip time between the browser and Stable Diffusion has been reduced by about 90%.
  • This translates to significant speed gains for single-image generation, with diminishing returns for multiple/large image generations.

3. Prompt Weighting Syntax Change


Some examples of prompt weighting to finely control hair color.

  • The Compel library has been integrated, which provides much better translation of prompts to embeddings. This improves the control you can exert over your images using text alone.
  • See here for syntax documentation.
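
As a rough illustration of the kind of weighting this enables (hedged; consult the linked Compel documentation for the authoritative syntax): appending + or - to a term nudges its weight up or down, as in "a portrait of a woman with (red hair)++", and explicit numeric weights can be attached, as in "a portrait of a woman with (red hair)1.3".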

4. Stable Video Diffusion

https://github.com/painebenjamin/app.enfugue.ai/assets/57536852/78ba6bd8-af48-453c-b6ab-115ac3145cd4

5. AnimateDiff V3, Sparse ControlNet, Frame Start Input


Animation created using closed and open eye keyframes.

  • AnimateDiff version 3 has been released and is now the default motion module when creating animations using Stable Diffusion 1.5.
  • The authors of AnimateDiff additionally created Sparse ControlNet, a new kind of ControlNet that allows for placing images on keyframes, and letting the ControlNet help interpolate between the frames. Select Sparse RGB or Sparse Scribble as your ControlNet in a Control Unit to use it.
  • To control on which frame the image occurs, set the Starting Frame value in the Layer Options menu. The first value for this field is one (i.e. it is 1-indexed.) This will be changed in the next update (see what's next below.)

6. FreeInit/Iterative Denoising/Ablation

https://github.com/painebenjamin/app.enfugue.ai/assets/57536852/5d26fd0b-656c-4852-b87b-cfb6861e1bae

  • FreeInit has been implemented, allowing for repeated denoising of animations.
  • This can stabilize animations when using long windows or large multipliers that would otherwise result in incoherent videos.
  • This has only been observed to work with DDIM sampling. If you find another scheduler that works, please let me know!

7. SDXL Inpainting Improvements


Before (left) and after (right) fixing color grading issues.

  • Inpainting for SDXL has been improved, removing issues with color grading.
  • Additionally, outpainting for SDXL has been improved. Expect more coherent and higher quality results when extending images beyond their borders.

8. IP Adapter Full, Face Isolate

  • A new IP adapter model has been added for 1.5: Full Face.
  • Additionally, an option has been added alongside the IP adapter scale that allows you to isolate the image to the face. This uses a face detection model to remove all image data except the face prior to sending it to the IP adapter.
(The original release includes a comparison grid of images here. Columns: a source image plus three result scenarios and their prompts - Change Framing: "a woman wearing a black turtleneck and tan blazer, full-body fashion photography"; Change Clothing: "a woman wearing a t-shirt and jean jacket"; Transfer Face: "an android with a human face and robot body, cyberpunk city streets". Rows: each IP adapter model - Default, Plus, Plus Face, and Full Face - both with and without Face Isolate.)

9. New Interpolator Implementation

  • The previous interpolator, which was implemented in TensorFlow, has been replaced by an identical one implemented in PyTorch. This is very valuable, as it removes the dependency on TensorFlow entirely.
  • You should not notice any difference in behavior between the two implementations.

10. Caption Upsampler


Input (left) and output (right). Find the input under Extras in the menu.

  • Turns prompts into more descriptive ones using an open-source LLM.
  • Only H4 Zephyr 7B is available as the LLM at this time. The first time you use the upsampler, this model will be downloaded. It is approximately 10 GB in size and requires around 12 GB of VRAM to run.

11. Split/Unified Layout


Using a split layout allows you to zoom in to inpaint on one side while still seeing the whole output on the other.

  • Current behavior is now termed "Unified Layout," where the input canvas and output samples occupy the same space and you can swap between them.
  • Now you can also split the viewport in half, vertically or horizontally (adjustable); one side shows the input canvas, and the other shows the output samples.

12. Real-time

https://github.com/painebenjamin/app.enfugue.ai/assets/57536852/004d53ee-5c98-4947-97ae-bef8565f66be

  • To go along with the above, you can now also enable real-time image generation.
  • This will render an image any time a change is made to the global inputs or any layered input.
  • Intermediate images and progress reporting are disabled when real-time is enabled.
  • You can expect images in roughly one-second intervals when using Turbo or LCM at a reasonable size with no additional processing.
  • All other tools are enabled. Be aware that the same rules for processing speed apply here, so the more kinds of inputs you add (ControlNets, IP adapters) and output processing (face fixing, upscaling,) the higher the latency will be.

13. Control Improvements

  • The 'Space' key now functions to pan the canvas, similar to middle-mouse or ctrl-left-mouse.
  • Touch events are now properly bound on the canvas, enabling touchscreen use.
  • Scroll-based zooming with touchpads has been significantly slowed down to make it more controllable.

14. New Downloadable Models

Checkpoints

LoRA

15. Other Changes

  • "Use Tiled Diffusion/VAE" is now two separate inputs, "Use Tiled UNet" and "Use Tiled VAE." There are situations where you will hit CUDA out-of-memory errors during decoding, but not during inference. This will enable you to tile the decoding (just select 'Use Tiled VAE') without also having to tile the inference.
  • Classifier-free guidance (guidance scale <= 1.0) was broken for SDXL; this has been fixed.
  • Result menus were accidentally removed from the top menu bar; this has been fixed.
  • An issue with reordering layers has been fixed.
  • The prompt travel interface has been improved to perform better when frame counts are larger than 64.
  • Images and video have been given offset variables to allow you to finely position them within their frames.
  • You are no longer prompted to keep state when clicking 'Edit Image' or 'Edit Video,' it is now always kept.
  • You are now prompted in the front-end before you send an invocation with an image that does not have any assigned role.
  • The guidance scale minimum has been lowered from 1.0 back to 0.0.
  • The number of inference steps minimum has been reduced from 2 to 1.
  • The default number of diffusion steps when using upscaling with denoising has been reduced from 100 to 40, and the default guidance scale has been reduced from 12 to 10.

Full Changelog: https://github.com/painebenjamin/app.enfugue.ai/compare/0.3.1...0.3.2

What's Next

Planned for 0.3.3

1. Images/Video on the Timeline

The Prompt Travel interface will be expanded to allow images and video to be manually placed on the timeline.

2. Audio

Audio will additionally be added to the timeline, and will be an input for audio-reactive diffusion.

3. Additional Model Support

  • IP Adapter + FaceID
  • SVD ControlNet
  • PIA (Personalized Image Adapter)

Planned for 0.4.0

1. 3D/4D Diffusion

Support Stable123 and other 3D model generators.

Thank you!

0.3.1

5 months ago

New Linux Installation Method

To help ease the difficulties of downloading, installing, and updating ENFUGUE, a new installation and execution method has been developed. It is a one-and-done shell script that will prompt you for any options you need to set. Installation is as follows:

curl https://raw.githubusercontent.com/painebenjamin/app.enfugue.ai/main/enfugue.sh -o enfugue.sh
chmod u+x enfugue.sh
./enfugue.sh

You will be prompted when a new version of enfugue is available, and it will be automatically downloaded for you. Execute enfugue.sh -h to see command-line options. Open the file with a text editor to view configuration options and additional instructions.

New Features

1. LCM - Latent Consistency Models


An image and animation made with LCM, taking 1 and 14 seconds to generate respectively.

Latent Consistency Models are a method for performing inference in only a small handful of steps, with minimal reduction in quality.

To use LCM in Enfugue, take the following steps:

  1. In More Model Configuration, add the appropriate LoRA for your currently selected checkpoint. This is recommended to be set at exactly 1.0 weight.
  2. Change your scheduler to LCM Scheduler.
  3. Reduce your guidance scale to between 1.1 and 1.4 - 1.2 is a good start.
  4. Reduce your inference steps to between 3 and 8 - 4 is a good start.
  5. Disable tiled diffusion and tiled VAE; they perform poorly with the LCM scheduler.
  6. If you're using animation, disable frame attention slicing, or switch to a different scheduler like Euler Discrete - you can use other schedulers with LCM, too!

You may find LCM does not do well with fine structures like faces and hands. To help address this, you can either upscale as I have here, or use the next new feature.

2. Detailer


Left to right: base image, with face fix, with face fix and inpaint.

Enfugue now has a version of Automatic1111's ADetailer (After Detailer.) This allows you to configure a detailing pass after each image generation that can:

  1. Use face restoration to make large modifications to faces to make them appear more natural.
  2. In addition to (or instead of) the above, you can automatically perform an inpainting pass over faces on the image. This will give Stable Diffusion a chance to add detail back to faces and make them blend in better with the rest of the image style. This is best used in conjunction with the above.
  3. In addition to the above, you can also identify and inpaint hands. This can fix human hands that are broken or inaccurate.
  4. Finally, you can perform a final denoising pass over the whole image. This can help make the final fixed image more coherent.

This works very well when combined with LCM, which can perform the inpainting and final denoising passes in a single step, offsetting the difficulty that LCM sometimes has with these subjects.

3. Themes




The included themes.

Enfugue now has themes. These are always available from the menu.

Select from the original enfugue theme, five different colored themes, two monochrome themes, and the ability to set your own custom theme.

4. Opacity Slider, Simpler Visibility Options


Stacking two denoised images on top of one another, and the resulting animation.

An opacity slider has been added to the layer options menu. When used, this will make the image or video partially transparent in the UI. In addition, if the image is in the visible input layer, it will be made transparent when merged there, as well.

To make it more clear what images are and are not visible to Stable Diffusion, the "Denoising" image role has been replaced with a "Visibility" dropdown. This has three options:

  1. Invisible - The image is not visible to Stable Diffusion. It may still be used for IP Adapter and/or ControlNet.
  2. Visible - The image is visible to Stable Diffusion. The alpha channel of the image is not added to the painting mask.
  3. Denoised - The image is visible to Stable Diffusion. The alpha channel of the image is added to the painting mask.

5. Generic Model Downloader


The Download Model UI.

To help bridge the gap when it comes to external service integrations, there is now a generic "Download Models" menu in Enfugue. This will allow you to enter a URL to a model hosted anywhere on the internet, and have Enfugue download it to the right location for that model type.

6. Model Metadata Viewer


The metadata viewer showing a result from CivitAI.

When using any field that allows selecting from different AI models, there is now a magnifying glass icon. When clicked, this will present you with a window containing the CivitAI metadata for that model.

This does not require the metadata be saved prior to viewing. If the model does not exist in CivitAI's database, no metadata will be available.

7. More Scheduler Configuration


The more scheduler configuration UI.

Next to the scheduler selector is a small gear icon. When clicked, this will present you with a window allowing for advanced scheduler configuration.

These values should not need to be tweaked in general. However, some new animation modules are trained using different values for these configurations, so they have been exposed to allow using these models effectively in Enfugue.

Full Changelog: https://github.com/painebenjamin/app.enfugue.ai/compare/0.3.0...0.3.1

How-To Guide

If you're on Linux, it's recommended to use the new automated installer. See the top of this document for those instructions. For Windows users or anyone not using the automated installer, read below.

First decide how you'd like to install, either a portable distribution, or through conda.

  • Conda will install all ENFUGUE dependencies in an isolated environment. This is the recommended installation method, as it ensures the highest compatibility with your hardware and makes for easy and fast updates.
  • A portable distribution comes with all dependencies in one directory, with an executable binary.

Installing and Running: Portable Distributions

Summary

  • Windows (CUDA) - CUDA 11.8.0, Torch 2.1.0
    Files: enfugue-server-0.3.1-win-cuda-x86_64.zip.001, enfugue-server-0.3.1-win-cuda-x86_64.zip.002
  • Linux (CUDA) - CUDA 11.8.0, Torch 2.1.0
    Files: enfugue-server-0.3.1-manylinux-cuda-x86_64.tar.gz.0, enfugue-server-0.3.1-manylinux-cuda-x86_64.tar.gz.1, enfugue-server-0.3.1-manylinux-cuda-x86_64.tar.gz.2

Linux

Download the three files above that make up the entire archive, then extract them. To extract these files, you must concatenate them. Rather than taking up space in your file system, you can simply stream them together to tar. A console command to do that is:

cat enfugue-server-0.3.1* | tar -xvz

You are now ready to run the server with:

./enfugue-server/enfugue.sh

Press Ctrl+C to exit.

Windows

Download the win64 files here, and extract them using a program which allows extracting from multiple archives such as 7-Zip.

If you are using 7-Zip, you should not extract both files independently. If they are in the same directory when you unzip the first, 7-Zip will automatically unzip the second. The second file cannot be extracted on its own.

Locate the file enfugue-server.exe, and double-click it to run it. To exit, locate the icon in the bottom-right hand corner of your screen (the system tray) and right-click it, then select Quit.

Installing and Running: Conda

To install with the provided Conda environments, you need to install a version of Conda.

After installing Conda and configuring it so it is available to your shell or command-line, download one of the environment files depending on your platform and graphics API.

  1. First, choose windows-, linux- or macos- based on your platform.
  2. Then, choose your graphics API:
    • If you are on MacOS, you only have access to MPS.
    • If you have an Nvidia GPU or other CUDA-compatible device, select cuda.
    • Additional graphics APIs (rocm and directml) are being added and will be made available as they are developed. Please voice your desire for these to prioritize their development.

Finally, using the file you downloaded, create your Conda environment:

conda env create -f <downloaded_file.yml>

You've now installed Enfugue and all dependencies. To run it, activate the environment and then run the installed binary.

conda activate enfugue
python -m enfugue run

Optional: DWPose Support

To install DW Pose support (a better, faster pose and face detection model), after installing Enfugue, execute the following (MacOS, Linux or Windows):

mim install "mmcv>=2.0.1"
mim install "mmdet>=3.1.0"
mim install "mmpose>=1.1.0"

Optional: GPU-Accelerated Interpolation

To install dependencies for GPU-accelerated frame interpolation, execute the following command (Linux, Windows):

pip install tensorflow[and-cuda] --ignore-installed

Installing and Running: Self-Managed Environment

If you would like to manage dependencies yourself, or want to install Enfugue into an environment to share with another Stable Diffusion UI, you can install enfugue via pip. This is the only method available for AMD GPUs at present.

pip install enfugue

If you are on Linux and want TensorRT support, execute:

pip install enfugue[tensorrt]

If you are on Windows and want TensorRT support, follow the steps detailed here.

Thank you!

0.3.0

6 months ago

New Features

v030

Animation

ENFUGUE now supports animation. A huge array of changes has been made to accommodate this, including new backend pipelines, new downloadable model support, new interface elements, and a rethought execution planner.

Most importantly, all features available for images work for animation as well. This includes IP adapters, ControlNets, your custom models, LoRA, inversion, and anything else you can think of.

AnimateDiff

The premier animation toolkit for Stable Diffusion 1.5 is AnimateDiff. When using any Stable Diffusion 1.5 model and enabling animation, AnimateDiff is loaded in the backend.

Motion Modules


Selecting motion modules in the GUI.


Motion modules are AI models that are injected into the Stable Diffusion UNet to control how that model interprets motion over time. When using AnimateDiff, you will by default use mm_sd_15_v2.ckpt, the latest base checkpoint. However, fine-tuned checkpoints are already available from the community, and these are supported in pre-configured models and on-the-fly configuration.


Browsing motion modules in CivitAI.


In addition, these are downloadable through the CivitAI download browser.

Motion LoRA


Selecting motion LoRA in the GUI.


Motion LoRA are additional models available to steer AnimateDiff; these were trained on specific camera motions and can replicate those motions when applied.

These are always available in the UI; select them from the LoRA menu and they will be downloaded as needed.

The available motion LoRA are: Zoom In, Zoom Out, Zoom Pan Left, Zoom Pan Right, Tilt Up, Tilt Down, Rolling Anti-Clockwise, and Rolling Clockwise. (Example animations for each are included in the original release.)

HotshotXL

HotshotXL is a recently released animation toolkit for Stable Diffusion XL. When you use any Stable Diffusion XL model and enable animation, Hotshot will be loaded in the backend.

Frame Windows


Frame windows in the GUI.


AnimateDiff and Hotshot XL both have limitations on how long they can animate for before losing coherence. To mitigate this, we can only ever attempt to animate a certain number of frames at a time, and blend these frame windows into one another to produce longer coherent motions. Use the Frame Window Size parameter to determine how many frames are used at once, and Frame Window Stride to indicate how many frames to step for the next window.
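
For example (a hedged illustration of the mechanism described above), with 32 total frames, a Frame Window Size of 16 and a Frame Window Stride of 8 would process overlapping windows covering roughly frames 1-16, 9-24, and 17-32, blending the overlapping regions into one another.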


Left: 32 frames with no window. Notice how the motion loses coherence about halfway. Right: 32 frames with a size-16 window and size-8 stride. Notice the improved coherence of motion (as well as some distortions, which is part of the trade-off of using this technique.)


Position Encoder Slicing


Position encoder slicing in the GUI.


Both HotshotXL and AnimateDiff use 24-frame position encoding. If we cut that encoding short and interpolate the sliced encoding to a new length, we can effectively "slow down" motion. This is an experimental feature.


Left: no position encoder slicing, 16 frames, no window. Right: slicing positions at 16 frames, and scaling to 32 frames. The animation is 32 frames with no window. Notice the slower, smoother motion.


Motion Attention Scaling


Motion attention scaling in the GUI.


The application of motion during the inference process is a distinct step, and as a result of this we can apply a multiplier to how much effect that has on the final output. Using a small bit of math, we can determine at runtime the difference between the trained dimensions of the motion module and the current dimensions of your image, and use that to scale the motion. Enabling this in the UI also gives you access to a motion modifier which you can use to broadly control the "amount" of motion in a resulting video.


Left: No motion attention scaling. Right: Motion attention scaling enabled.


Prompt Travel


The prompt travel interface after enabling it.


Instead of merely offering one prompt during animation, we can interpolate between multiple prompts to change what is being animated at any given moment. Blend action words into one another to steer motion, or use entirely different prompts for morphing effects.


The resulting animation.


FILM - Frame Interpolation for Large Motion

Both HotshotXL and AnimateDiff were trained on 8-frames-per-second animations. In order to get higher framerates, we must create frames in between the AI-generated frames to smooth the motion out over more frames. Simply add a multiplication factor to create in-between frames - for example, a factor of 2 will double (less one) the total frame count by adding one frame between every pair of consecutive frames. Adding another factor will interpolate on the already-interpolated frames, so adding a second factor of 2 will double the count again (less one).

If you are upscaling and interpolating, the upscaling will be performed first.


Left: 16 frames with no interpolation (8 FPS.) Right: the same 16 frames interpolated twice with a factor of 2 (total 64 frames, 32 FPS.)


Looping and Reflecting

There are two options available to make an animation repeat seamlessly.

  • Reflect will play the animation in reverse after playing it forward. To alleviate the "bounce" that occurs at the inflection points, frame interpolation will be used to smoothly ease these changes.
  • Loop will create an animation that loops seamlessly. This is achieved through the same method as frame windows, only additionally wrapping the frame window around to the beginning. This will increase the total number of steps to make an animation. Note that this can also reduce the overall motion in the image, so it is recommended to combine this with other options such as motion attention scaling.

Left: A looping animation. Right: A reflected animation.

Tips

  • AnimateDiff is best used to make two-second videos at 16 frames per second. This means your frame window size should be 16 when trying to create longer animations with Stable Diffusion 1.5.
  • AnimateDiff performs best with Euler Discrete scheduling.
  • AnimateDiff version 1, as well as any motion modules derived from it (including motion LoRA,) may have visible watermarks due to the training data also having watermarks.
  • AnimateDiff can have artifacting around the corners of images that are larger than 512x512. To mitigate this, you can add around 8 pixels of extra space to trim off later, or use tiled diffusion.
  • HotShotXL is best used to make one-second videos at 8 frames per second. This means your frame window size should be 8 when trying to create longer animations with Stable Diffusion XL.
  • HotShotXL performs best with DDIM scheduling.

GUI Redesign

In order to accommodate animation, and as a refresher over the original design, the GUI has been entirely re-configured. The most significant changes are enumerated below.


ENFUGUE v0.3.0's GUI.


The original sidebar has been moved from the right to the left. As the sidebar represents global options, the left-hand side was deemed the better place for it, following the conventions of photo manipulation programs like GIMP and Photoshop.

Redesigned Sample Chooser

The chooser that allows you to switch between viewing results and viewing the canvas has been moved to its own dedicated bar.

This chooser takes two forms:


The sample chooser during animation.

The sample chooser during image generation.

Layers Menu


The new layers menu.


A layers menu has been added in the sidebar's place. This contains active options for your current layer.

Global Inpaint Options, Global Denoising Strength, Inverted Inpainting

As all invocations are now performed in a single inference step, there can only be one mask and one denoising strength. These have been moved to the global menu as a result. They will appear when there is any media on the canvas. Check the "Enable Inpainting" option to show the inpainting toolbar.


The UI when inpainting is active.


In addition, inpainting has been inverted from ENFUGUE's previous incarnation: black represents portions of the image left untouched, and white represents portions of the image denoised. This was changed to be more in line with how other UIs display inpainting masks and how they are used in the backend.

More Changes

  • To help alleviate confusion, numerous menus that were previously collapsed have now been made expanded by default.
  • Two quick options have been made available for adjusting the size of an image and the canvas in the toolbar. One will scale the element to the size of the canvas, and one will scale the canvas to the size of the image in the element.
  • The field previously called Engine Size is now called Tile Size.
  • The field previously called Chunking Size is now called Tile Stride.
  • The field previously called Chunking Mask is now called Tile Mask.
  • IP Adapter model type has been changed to a dropdown selection instead of appearing/disappearing checkboxes.

The new options for resizing canvas elements.


Tiling

Tiling has been added to ENFUGUE. Select between horizontally tiling, vertically tiling, or both. It even works with animation!

Select the "display tiled" icon in the sample chooser to see what the image looks like next to itself.


The tiling options in the UI.


A horizontally tiling image.


A vertically tiling image.


A horizontally and vertically tiling image.


Notes

Reporting Bugs, Troubleshooting

There are many, many changes in this release; it is likely that bugs will be encountered on different operating systems, browsers, GPUs, and workflows. Please see this Wiki page for the information requested when submitting bug reports, as well as for where logs can be located to do some self-diagnosing.

TensorRT Builds Suspended Indefinitely

TensorRT-specific builds will no longer be released. These have led to significant amounts of confusion over the months, with very few people being able to make use of TensorRT.

TensorRT will remain available for the workflows it previously supported, but you will need to install ENFUGUE using one of the provided conda environments, or into a different latent diffusion Python environment via pip - see below for full instructions.

Pending MacOS Build

The MacOS build of v0.3.0 is pending. There have been difficulties finding a set of compatible dependencies, but it will be done soon. I apologize for the delay. You are welcome to try installing using the provided conda environment - full instructions below.

Full Changelog: https://github.com/painebenjamin/app.enfugue.ai/compare/0.2.5...0.3.0

How-To Guide

Installing and Running: Portable Distributions

Select a portable distribution if you'd like to avoid having to install other programs, or want to have an isolated executable file that doesn't interfere with other environments on your system.

Summary

  • Windows (CUDA) - CUDA 11.8.0, Torch 2.1.0
    Files: enfugue-server-0.3.0-win-cuda-x86_64.zip.001, enfugue-server-0.3.0-win-cuda-x86_64.zip.002
  • Linux (CUDA) - CUDA 11.8.0, Torch 2.1.0
    Files: enfugue-server-0.3.0-manylinux-cuda-x86_64.tar.gz.0, enfugue-server-0.3.0-manylinux-cuda-x86_64.tar.gz.1, enfugue-server-0.3.0-manylinux-cuda-x86_64.tar.gz.2

Linux

To extract these files, you must concatenate them. Rather than taking up space in your file system, you can simply stream them together to tar. A console command to do that is:

cat enfugue-server-0.3.0* | tar -xvz

You are now ready to run the server with:

./enfugue-server/enfugue.sh

Press Ctrl+C to exit.

Windows

Download the win64 files here, and extract them using a program which allows extracting from multiple archives such as 7-Zip.

If you are using 7-Zip, you should not extract both files independently. If they are in the same directory when you unzip the first, 7-Zip will automatically unzip the second. The second file cannot be extracted on its own.

Locate the file enfugue-server.exe, and double-click it to run it. To exit, locate the icon in the bottom-right hand corner of your screen (the system tray) and right-click it, then select Quit.

Installing and Running: Conda

To install with the provided Conda environments, you need to install a version of Conda.

After installing Conda and configuring it so it is available to your shell or command-line, download one of the environment files depending on your platform and graphics API.

  1. First, choose windows-, linux- or macos- based on your platform.
  2. Then, choose your graphics API:
    • If you are on MacOS, you only have access to MPS.
    • If you have an Nvidia GPU or other CUDA-compatible device, select cuda.
    • Additional graphics APIs (rocm and directml) are being added and will be made available as they are developed. Please voice your desire for these to prioritize their development.

Finally, using the file you downloaded, create your Conda environment:

conda env create -f <downloaded_file.yml>

You've now installed Enfugue and all dependencies. To run it, activate the environment and then run the installed binary.

conda activate enfugue
python -m enfugue run

NOTE: The previously recommended command, enfugue run, has been observed to fail in certain environments. For this reason it is recommended to use the above more universally-compatible command.

Optional: DWPose Support

To install DW Pose support (a better, faster pose and face detection model), after installing Enfugue, execute the following (MacOS, Linux or Windows):

mim install "mmcv>=2.0.1"
mim install "mmdet>=3.1.0"
mim install "mmpose>=1.1.0"

Installing and Running: Self-Managed Environment

If you would like to manage dependencies yourself, or want to install Enfugue into an environment to share with another Stable Diffusion UI, you can install enfugue via pip. This is the only method available for AMD GPUs at present.

pip install enfugue

If you are on Linux and want TensorRT support, execute:

pip install enfugue[tensorrt]

If you are on Windows and want TensorRT support, follow the steps detailed here.

Thank you!

0.2.5

6 months ago

New Features

1. Fine-Tuned SDXL Inpainting


A source image courtesy of Michael James Beach via pexels.com

(Left) Outpainting the image to the left. (Right) Inpainting a refrigerator over the shelving.

  • Added support for sdxl-1.0-inpainting-0.1 and automatic XL inpainting checkpoint merging when enabled.
  • Simply use any Stable Diffusion XL checkpoint as your base model and use inpainting; ENFUGUE will merge the models at runtime as long as it is enabled (leave Create Inpainting Checkpoint when Available checked in the settings menu.)
  • The merge calculation takes some time due to the size of the model. If XL pipeline caching is enabled, a cache will be created after the first calculation, which will drastically reduce the time needed to use the inpainting model for all subsequent images.

2. FreeU


Adjusting the four FreeU factors individually (first four rows), and adjusting all four factors at once (last row.)

The same as above, using an anime model instead.

  • Added support for FreeU, developed by members of the S-Lab at Nanyang Technological University.
  • This provides a new set of tweaks for modifying a model's output without any training or added processing time.
  • Activate FreeU under the "Tweaks" menu for any pipeline.

3. Noise Offset


A number of noise methods, blend methods and offset amounts. Click image to view full-size.

Noise offset options under Tweaks in the sidebar.

  • Added noise offset options under Tweaks in the sidebar.
  • This allows you to inject additional noise before the denoising process using 12 different noise methods and 18 different blending methods for a total of 216 new image noising methods.
  • Some combinations may result in illegible or black images. It's recommended to start with a very small noise offset and increase gradually.
  • All noise options are deterministic, meaning that using the same seed will always generate the same noise.

Noise Examples

Example images are included in the original release for each noise method: Blue, Brownian Fractal, Crosshatch, Default (CPU Random), Green, Grey, Pink, Simplex, Velvet, Violet, and White.

4. Tiny Autoencoder (VAE) Preview

  • Added support for MadeByOllin's Tiny Autoencoder (https://github.com/madebyollin/taesd) for all pipelines.
  • This autoencoder provides a very small memory footprint.

5. CLIP Skip


Examples of CLIP Skip values using a Stable Diffusion 1.5 anime model. Click each image for full resolution.

  • Added support for CLIP Skip, an input parameter to any pipeline under "Tweaks."
  • This value represents the number of layers excluded from text encoding, starting from the end. In general, you should not use CLIP Skip; however, some models were trained specifically on a reduced layer count - in particular, many anime models perform significantly better with CLIP Skip 2.
  • Thanks to Neggles for the example images and history.

6. More Detailed Pipeline Preparation Status



Downloading checkpoints as files or cached model weights.

  • Added more messages during the "Preparing <Role> Pipeline" phase.
  • When any model needs to be downloaded (default checkpoints, ControlNets, VAE, and more,) the "Preparing Pipeline" message will be replaced with what it is downloading.

7. More Schedulers

Added the following schedulers:

  • DPM Discrete Scheduler (ADPM2) Karras
  • DPM-Solver++ 2M SDE (non-Karras) and DPM-Solver++ SDE Karras
    • Not to be confused with DPM-Solver++ 2M SDE Karras
  • DPM Ancestral Discrete Scheduler (KDPM2A) Karras
  • Linear Multi-Step Discrete Scheduler Karras

8. Relative Directories in Model Picker

  • Previously, only the file name of models (checkpoints, LoRA, Lycoris and Textual Inversion) would be shown, hiding any manual organization you may have done.
  • Now, if you choose to organize your models in subdirectories, the relative path will be visible in the model pickers in the user interface.
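
For example (a hedged illustration; the names are hypothetical), a checkpoint stored as sdxl/my-model.safetensors inside your configured checkpoint directory would now appear in the picker as sdxl/my-model.safetensors rather than simply my-model.safetensors.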

9. Improved Gaussian Chunking

  • Removed visible horizontal and vertical edges along the sides of images made using gaussian chunking.

Example Masks

gaussian, gaussian-bottom, gaussian-left, gaussian-left-bottom, gaussian-left-right, gaussian-left-right-bottom, gaussian-left-top, gaussian-left-top-bottom, gaussian-left-top-right, gaussian-right, gaussian-right-bottom, gaussian-top, gaussian-top-bottom, gaussian-top-right, gaussian-top-right-bottom

10. Improved Results Browser

  • Added more columns to the results browser to make it easier to copy and paste individual settings without needing to copy the entire JSON payload.

11. Improved CivitAI Browser

  • Added the ability to middle-click or right-click download links in the CivitAI browser window.
  • Also added links to view the model or the author on CivitAI.

12. Improved Error Messages

  • Added error messages when loading LoRA, Lycoris and Textual Inversion.
  • When an AttributeError or KeyError occurs, the user will be asked to ensure they are 1.5 adaptations with 1.5 models and XL adaptations with XL models.

13. Copy Any Tooltip to Clipboard

  • Previously, regular tooltips could not be copied to the clipboard; they could only be copied if they were in a table.
  • Now, all tooltips can be copied. If a tooltip is visible on the screen, holding Ctrl or Cmd and performing a right-click (context menu) will copy the tooltip to the clipboard.

Changes

  • Changed the default SDXL model from sd_xl_base_1.0.safetensors to sd_xl_base_1.0_fp16_vae.safetensors.
  • Changed the default value for "Inference Steps" from 40 to 20.
  • Changed the default value for "Use Chunking" from true to false.
  • Changed the image options form from being hidden by default to being visible by default.
  • Made it more difficult to accidentally merge nodes.

Bug Fixes

  • Fixed an issue where loading saved .json files would not work with some browsers.
  • Fixed an issue where the inpainting model would not change when the primary model was changed and there was no explicit inpainter set.
  • Fixed an issue where, during cropped inpainting, an image that was used for both inpainting and ControlNet would only get cropped for inference, and not for Control, resulting in mismatched image sizes.
  • Fixed an issue where a textual inversion input field would appear blank after reloading, even though it was properly sent to the backend.
  • Fixed an issue where the Denoising Strength slider would not appear when enabling inpainting.
  • Fixed an issue where GPU-accelerated filtering would not work in Firefox.
  • Fixed an issue where inpainter or refiner VAE would not be correctly set when overriding default.

Full Changelog: https://github.com/painebenjamin/app.enfugue.ai/compare/0.2.4...0.2.5

How-To Guide

Installing and Running: Portable Distributions

Select a portable distribution if you'd like to avoid having to install other programs, or want to have an isolated executable file that doesn't interfere with other environments on your system.

Summary

  • MacOS (MPS) - CUDA N/A, Torch 2.2.0.dev20230928
    File: enfugue-server-0.2.5-macos-ventura-x86_64.tar.gz
  • Windows (CUDA) - CUDA 12.1.1, Torch 2.2.0.dev20230928
    Files: enfugue-server-0.2.5-win-cuda-x86_64.zip.001, enfugue-server-0.2.5-win-cuda-x86_64.zip.002
  • Windows (CUDA+TensorRT) - CUDA 11.7.1, Torch 1.13.1
    Files: enfugue-server-0.2.5-win-tensorrt-x86_64.zip.001, enfugue-server-0.2.5-win-tensorrt-x86_64.zip.002
  • Linux (CUDA) - CUDA 12.1.1, Torch 2.2.0.dev20230928
    Files: enfugue-server-0.2.5-manylinux-cuda-x86_64.tar.gz.0, enfugue-server-0.2.5-manylinux-cuda-x86_64.tar.gz.1, enfugue-server-0.2.5-manylinux-cuda-x86_64.tar.gz.2
  • Linux (CUDA+TensorRT) - CUDA 11.7.1, Torch 1.13.1
    Files: enfugue-server-0.2.5-manylinux-tensorrt-x86_64.tar.gz.0, enfugue-server-0.2.5-manylinux-tensorrt-x86_64.tar.gz.1, enfugue-server-0.2.5-manylinux-tensorrt-x86_64.tar.gz.2

TensorRT or CUDA?

The primary differences between the TensorRT and CUDA packages are CUDA version (11.7 vs. 12.1) and Torch version (1.13.1 vs. 2.2.0).

For general operation, Torch 2 and CUDA 12 will outperform Torch 1 and CUDA 11 for almost all operations. However, a TensorRT engine compiled in CUDA 11.7 and Torch 1.13.1 will outperform Torch 2 inference by a factor of up to 100%.

In essence,

  1. If you plan to use one style very frequently (or exclusively), and have a powerful, modern Nvidia GPU, then choose TensorRT.
  2. In all other cases, choose CUDA.

Linux

After choosing TensorRT or CUDA, download the appropriate manylinux files here, concatenate them and extract them. A console command to do that is:

cat enfugue-server-0.2.5* | tar -xvz

You are now ready to run the server with:

./enfugue-server/enfugue.sh

Press Ctrl+C to exit.

Windows

Download the win64 files here, and extract them using a program which allows extracting from multiple archives such as 7-Zip.

If you are using 7-Zip, you should not extract both files independently. If they are in the same directory when you unzip the first, 7-Zip will automatically unzip the second. The second file cannot be extracted on its own.

If you are also choosing to use TensorRT, you must perform some additional steps on Windows. Follow the steps detailed here.

Locate the file enfugue-server.exe, and double-click it to run it. To exit, locate the icon in the bottom-right hand corner of your screen (the system tray) and right-click it, then select Quit.

MacOS

Download the macOS file here, then double-click it to extract the package. When you run the application using the command below, your Mac will warn you about running downloaded packages, and you will have to perform an administrator override to allow it to run; you will be prompted to do this. To avoid this, you can run an included command like so:

./enfugue-server/unquarantine.sh

This command finds all the files in the installation and removes the com.apple.quarantine xattr from each of them. This does not require administrator privileges. After doing this (or if you choose to grant the override), run the server with:

./enfugue-server/enfugue.sh

Note: while the MacOS packages are compiled on x86 machines, they are tested and designed for the new M1/M2 ARM machines thanks to Rosetta, Apple's machine code translation system.

Installing and Running: Conda

To install with the provided Conda environments, you need to install a version of Conda.

After installing Conda and configuring it so it is available to your shell or command-line, download one of the environment files depending on your platform and graphics API.

  1. First, choose windows-, linux- or macos- based on your platform.
  2. Then, choose your graphics API:
    • If you are on MacOS, you only have access to MPS.
    • If you have a powerful next-generation Nvidia GPU (3000 series and better with at least 12 GB of VRAM), use tensorrt for all of the capabilities of cuda with the added ability to compile TensorRT engines. If you do not plan on using TensorRT, select cuda for the most optimized build for this API.
    • If you have any other Nvidia GPU or other CUDA-compatible device, select cuda.
    • Additional graphics APIs (rocm and directml) are being added and will be available soon.

Finally, using the file you downloaded, create your Conda environment:

conda env create -f <downloaded_file.yml>

You've now installed Enfugue and all dependencies. To run it, activate the environment and then run the installed binary.

conda activate enfugue
enfugue run

Optional: DWPose Support

To install DW Pose support (a better, faster pose and face detection model), after installing Enfugue, execute the following:

mim install "mmcv>=2.0.1"
mim install "mmdet>=3.1.0"
mim install "mmpose>=1.1.0"

Installing and Running: Self-Managed Environment

If you would like to manage dependencies yourself, or want to install Enfugue into an environment to share with another Stable Diffusion UI, you can install enfugue via pip. This is the only method available for AMD GPUs at present.

pip install enfugue

If you are on Linux and want TensorRT support, execute:

pip install enfugue[tensorrt]

If you are on Windows and want TensorRT support, follow the steps detailed here.

Thank you!

0.2.4

7 months ago

New Platform



ENFUGUE has partnered with RunDiffusion to bring you a cloud-hosted solution for using ENFUGUE with just a web browser, no hardware requirements at all. Sign up for 30 minutes of free use, with further use costing as little as $0.50 an hour.

Enjoy a fine-tuned ENFUGUE experience, with numerous models pre-installed and even the lowest tier of machine more than capable of using SDXL and upscaling up to 4k.

New Features

1. IP Adapter Overhaul


Using multiple images for prompting is an exceptional way to create a personalized affectation without needing to train your own model.

The IP adapter integration has been overhauled with the following:

  1. Any number of IP Adapter Images can now be stacked on a node, in combination with any number of control images and a reference image. This provides an incredible way of creating a "mini-LoRA," extracting the features from numerous reference images and using them to modify your prompt.
  2. In addition, a total of five IP adapters are now available, selectable by using checkboxes in the interface.
    1. Stable Diffusion 1.5
    2. Stable Diffusion 1.5 with Fine-Grained Features (IP Adapter Plus)
    3. Stable Diffusion 1.5 with Fine-Grained Facial Features (IP Adapter Plus Face)
    4. Stable Diffusion XL
    5. Stable Diffusion XL with Fine-Grained Features (IP Adapter Plus XL)

2. QR Monster, ControlNet Conditioning Start/End


Use a strong scale and lengthy conditioning time for QR codes that are scannable.


Use a weaker scale and stop conditioning short to achieve a more subtle or "hidden-eye" effect.

  1. A new ControlNet, QR Code Monster, has been added to Enfugue. Simply select "QR" from the ControlNet dropdown to use it. There is no pre-processor for this ControlNet.
  2. In addition, sliders have been added in the UI for when to start and when to stop ControlNet conditioning. This is a per-control-image setting that tells Enfugue when to start following ControlNet's influence and when to stop, in proportion to the length of the denoising stage. For example, a conditioning start of "0.2" would tell Enfugue to start using ControlNet about 20% of the way through creating the image, which allows Enfugue to generate its own randomness prior to using ControlNet for more subtle effects. The same can be done for the end of the conditioning period as well.

3. Model Merger


The two merge modes and their options.

The backend model merger has been made available in the frontend to use as desired. Select Merge Models under the Models menu to get started.

There are two modes of operation:

  1. Add Difference - this takes three checkpoints as input, and the output will be the first model plus the difference between the latter models - i.e. the resulting model will be of the formula (a + (b - c)) for all weights common between them.
  2. Weighted Sum - this takes two checkpoints as input, and the output will be a weighted blend between the models based upon an alpha parameter from 0 to 1, where 0 would produce entirely the first checkpoint, 1 would produce entirely the second checkpoint, and 0.5 would produce the exact mean between the two.
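
As a concrete illustration of the two formulas (hedged, per-weight, for weights common to all inputs): Add Difference produces w = a + (b - c), while Weighted Sum with alpha = 0.25 produces w = 0.75 * a + 0.25 * b.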

4. More Flexible Model Loading

Finally, model loading has been made significantly more flexible, to better facilitate sharing of resources between Enfugue and other Stable Diffusion applications. To this end, Enfugue will now search the configured directories, to an arbitrary nesting depth, to find versions of models before attempting to download them itself. The known filenames for each scenario have been expanded as well; see the wiki for more details.

Full Changelog: https://github.com/painebenjamin/app.enfugue.ai/compare/0.2.3...0.2.4

How-To Guide

Installing and Running: Portable Distributions

Select a portable distribution if you'd like to avoid having to install other programs, or want to have an isolated executable file that doesn't interfere with other environments on your system.

Summary

  • MacOS (MPS) - CUDA N/A, Torch 2.2.0.dev20230928
    File: enfugue-server-0.2.4-macos-ventura-x86_64.tar.gz
  • Windows (CUDA) - CUDA 12.1.1, Torch 2.2.0.dev20230928
    Files: enfugue-server-0.2.4-win-cuda-x86_64.zip.001, enfugue-server-0.2.4-win-cuda-x86_64.zip.002
  • Windows (CUDA+TensorRT) - CUDA 11.7.1, Torch 1.13.1
    Files: enfugue-server-0.2.4-win-tensorrt-x86_64.zip.001, enfugue-server-0.2.4-win-tensorrt-x86_64.zip.002
  • Linux (CUDA) - CUDA 12.1.1, Torch 2.2.0.dev20230928
    Files: enfugue-server-0.2.4-manylinux-cuda-x86_64.tar.gz.0, enfugue-server-0.2.4-manylinux-cuda-x86_64.tar.gz.1, enfugue-server-0.2.4-manylinux-cuda-x86_64.tar.gz.2
  • Linux (CUDA+TensorRT) - CUDA 11.7.1, Torch 1.13.1
    Files: enfugue-server-0.2.4-manylinux-tensorrt-x86_64.tar.gz.0, enfugue-server-0.2.4-manylinux-tensorrt-x86_64.tar.gz.1, enfugue-server-0.2.4-manylinux-tensorrt-x86_64.tar.gz.2

Linux

First, decide which version you want - with or without TensorRT support. TensorRT requires a powerful, modern Nvidia GPU. Then, download the appropriate manylinux files here, concatenate them and extract them. A console command to do that is:

cat enfugue-server-0.2.4* | tar -xvz

You are now ready to run the server with:

./enfugue-server/enfugue.sh

Press Ctrl+C to exit.

Windows

Download the win64 files here, and extract them using a program which allows extracting from multiple archives such as 7-Zip.

If you are using 7-Zip, you should not extract both files independently. If they are in the same directory when you unzip the first, 7-Zip will automatically unzip the second. The second file cannot be extracted on its own.

Locate the file enfugue-server.exe, and double-click it to run it. To exit, locate the icon in the bottom-right hand corner of your screen (the system tray) and right-click it, then select Quit.

MacOS

Download the macos file here, then double-click it to extract the package. When you run the application using the command below, your Mac will warn you of running downloaded packages, and you will have to perform an administrator override to allow it to run - you will be prompted to do this. To avoid this, you can run an included command like so:

./enfugue-server/unquarantine.sh

This command finds all the files in the installation and removes the com.apple.quarantine xattr from the file. This does not require administrator privilege. After doing this (or if you will grant the override,) run the server with:

./enfugue-server/enfugue.sh

Note: while the MacOS packages are compiled on x86 machines, they are tested and designed for the new M1/M2 ARM machines thanks to Rosetta, Apple's machine code translation system.

Upgrading

To upgrade any distribution, download and extract the appropriate upgrade package on this release. Copy all files in the upgrade package into your Enfugue installation directory, overwriting any existing files.

Installing and Running: Conda

To install with the provided Conda environments, you need to install a version of Conda.

After installing Conda and configuring it so it is available to your shell or command-line, download one of the environment files depending on your platform and graphics API.

  1. First, choose windows-, linux- or macos- based on your platform.
  2. Then, choose your graphics API:
    • If you are on MacOS, you only have access to MPS.
    • If you have a powerful next-generation Nvidia GPU (3000 series and better with at least 12 GB of VRAM), use tensorrt for all of the capabilities of cuda with the added ability to compile TensorRT engines. If you do not plan on using TensorRT, select cuda for the most optimized build for this API.
    • If you have any other Nvidia GPU or other CUDA-compatible device, select cuda.
    • Additional graphics APIs (rocm and directml) are being added and will be available soon.

Finally, using the file you downloaded, create your Conda environment:

conda env create -f <downloaded_file.yml>

You've now installed Enfugue and all dependencies. To run it, activate the environment and then run the installed binary.

conda activate enfugue
enfugue run

Optional: DWPose Support

To install DW Pose support (a better, faster pose and face detection model), after installing Enfugue, execute the following:

mim install "mmcv>=2.0.1"
mim install "mmdet>=3.1.0"
mim install "mmpose>=1.1.0"

Installing and Running: Self-Managed Environment

If you would like to manage dependencies yourself, or want to install Enfugue into an environment to share with another Stable Diffusion UI, you can install enfugue via pip. This is the only method available for AMD GPUs at present.

pip install enfugue

If you are on Linux and want TensorRT support, execute:

pip install enfugue[tensorrt]

If you are on Windows and want TensorRT support, follow the steps detailed here.

Thank you!

0.2.3

8 months ago

New Features

1. Image Prompt Adapter


Image Prompting is another tool in your toolbox for using an image to create something new. Here we see the Girl with a Pearl Earring effortlessly transformed into something reminiscent of, but distinct from, Vermeer's classic.

Tencent's AI Lab has released Image Prompt (IP) Adapter, a new method for controlling Stable Diffusion with an input image that provides a huge amount of flexibility, with more consistency than standard image-based inference, and more freedom than ControlNet images. The best part about it - it works alongside all other control techniques, giving us dozens of new combinations of control methods users can employ.

2. DWPose*, Pose ControlNet XL

*DWPose is currently only available for users managing their own environments. Portable and docker users can still use OpenPose as before.


ControlNet Pose XL is used to match human poses when using SDXL models. Source image by Loris Boulingez via Unsplash.

IDEA Research has released DWPose, a new AI model for detecting human poses, including fingers and faces, faster and more accurately than ever before.

In addition, a community member named Thibaud Zamora has released OpenPose ControlNet for SDXL, which is now the third SDXL ControlNet after Canny Edge and Depth.

You only need to select the Pose ControlNet to use it. In order to use DWPose, users managing their own environments must execute the following:

#!/usr/bin/env sh
pip install -U openmim
mim install mmengine
mim install "mmcv>=2.0.1"
mim install "mmdet>=3.1.0"
mim install "mmpose>=1.1.0"

3. Multi ControlNet, Merged Image Nodes


Combine this feature with the new Image Prompt adapter for easy style transfers without needing to prompt anything. The leaf image is by Daria Drewnoska and the dark abstract image by Pawel Czerwinski.

You can now merge images together on the canvas, giving each one its own assignment(s) toward the overall diffusion plan. Click and drag one image onto another to merge them. You'll be presented with the option to drop the image when you bring their headers together (i.e. bring the top of the dragged image to the top of the target image.)

  • The number of images you can use this way is unlimited; however, there are some roles that can only be fulfilled by one image.
  • For example, you can only initialize inference from one image (i.e. check 'Use for Inference'), and only use one image for image prompting.
  • ControlNets can be mixed, matched and reused as desired. Be mindful that each new kind of ControlNet you add increases VRAM requirements significantly. Adding a different image for the same ControlNet only increases VRAM requirements marginally.

4. Multi-Diffusion Options & Speed Boost


The new options for chunking.

Multi-diffusion speed has been improved by as much as 5 iterations per second, thanks to better algorithms for merging chunks. With this comes new options for how these chunks are masked onto each other, blending edges together. The available options are constant, bilinear and gaussian, with the default being bilinear. These images were all generated in 40 steps with a chunking size of 64.


Constant masking can produce sharp details, but visible seams unless you use very low chunking sizes.


Bilinear produces a good mix of edge blending without much detail loss.


Gaussian masking can greatly alter an image, changing where detail is applied, without visible seams.
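
To illustrate the idea behind these blending masks (an illustrative sketch only, not Enfugue's exact algorithm), a bilinear mask weights each chunk's pixels by their distance from the chunk edges, so overlapping chunks fade smoothly into one another:

# Illustrative sketch of a bilinear blending mask for merging overlapping
# chunks (not Enfugue's exact implementation).
import numpy as np

def bilinear_mask(height, width):
    """Weights fall off linearly from the chunk center toward its edges."""
    y = 1.0 - np.abs(np.linspace(-1.0, 1.0, height))  # 1.0 at center, 0.0 at edges
    x = 1.0 - np.abs(np.linspace(-1.0, 1.0, width))
    mask = np.outer(y, x)
    return np.clip(mask, 1e-4, 1.0)  # keep a small epsilon so weights never vanish

# Overlapping chunks are then combined as a weighted average:
#   blended = sum(chunk_i * mask_i) / sum(mask_i)
mask = bilinear_mask(64, 64)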

5. SDXL Textual Inversion Support

With the rising popularity of UnaestheticXL, a negative textual inversion for SDXL by Aikimi, an implementation has been added to Enfugue for loading SDXL TIs. Add them just as you would any other Textual Inversion.

These are a little slow to load at the moment, as this is a temporary workaround pending official implementation into Diffusers.

6. Better Refining


The new refining default options.

Better options have been provided for the refining method. Use the slider at the top to control the step at which the configured refiner takes over denoising, providing a better end result than executing refining as a distinct step.
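
For context, the same hand-off exists in the diffusers library as denoising_end on the SDXL base pipeline and denoising_start on the refiner; the following is a minimal, illustrative sketch of that pattern, not Enfugue's own code (Enfugue manages this for you via the slider):

# Minimal sketch of a base-to-refiner hand-off with diffusers' SDXL pipelines
# (for context only; not Enfugue's implementation).
import torch
from diffusers import StableDiffusionXLPipeline, StableDiffusionXLImg2ImgPipeline

base = StableDiffusionXLPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0", torch_dtype=torch.float16
).to("cuda")
refiner = StableDiffusionXLImg2ImgPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-refiner-1.0", torch_dtype=torch.float16
).to("cuda")

prompt = "a lighthouse on a stormy coast, dramatic light"
takeover = 0.8  # the refiner takes over for the last 20% of denoising

latents = base(prompt=prompt, denoising_end=takeover, output_type="latent").images
image = refiner(prompt=prompt, image=latents, denoising_start=takeover).images[0]
image.save("refined.png")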

7. Better Upscaling


The new upscaling interface allows for any number of steps with any scale and method.

Upscaling has been made much more flexible by permitting you to select any number of steps, each with its own configuration, rather than only permitting a single upscaling step.

The upscaling amount has additionally been unconstrained, allowing you to use an upscaling algorithm to modify the dimensions of an image by anywhere between 0.5× and 16×.

Full Changelog: https://github.com/painebenjamin/app.enfugue.ai/compare/0.2.2...0.2.3

How-To Guide

Installing and Running: Portable Distributions

Select a portable distribution if you'd like to avoid having to install other programs, or want to have an isolated executable file that doesn't interfere with other environments on your system.

Summary

Platform | Graphics API | File(s) | CUDA Version | Torch Version
MacOS | MPS | enfugue-server-0.2.3-macos-ventura-x86_64.tar.gz | N/A | 2.2.0.dev20230910
Windows | CUDA | enfugue-server-0.2.3-win-cuda-x86_64.zip.001, enfugue-server-0.2.3-win-cuda-x86_64.zip.002 | 12.1.1 | 2.2.0.dev20230910
Windows | CUDA+TensorRT | enfugue-server-0.2.3-win-tensorrt-x86_64.zip.001, enfugue-server-0.2.3-win-tensorrt-x86_64.zip.002 | 11.7.1 | 1.13.1
Linux | CUDA | enfugue-server-0.2.3-manylinux-cuda-x86_64.tar.gz.0, enfugue-server-0.2.3-manylinux-cuda-x86_64.tar.gz.1, enfugue-server-0.2.3-manylinux-cuda-x86_64.tar.gz.2 | 12.1.1 | 2.2.0.dev20230910
Linux | CUDA+TensorRT | enfugue-server-0.2.3-manylinux-tensorrt-x86_64.tar.gz.0, enfugue-server-0.2.3-manylinux-tensorrt-x86_64.tar.gz.1, enfugue-server-0.2.3-manylinux-tensorrt-x86_64.tar.gz.2 | 11.7.1 | 1.13.1

Linux

First, decide which version you want - with or without TensorRT support. TensorRT requires a powerful, modern Nvidia GPU. Then, download the appropriate manylinux files here, concatenate them and extract them. A console command to do that is:

cat enfugue-server-0.2.3* | tar -xvz

You are now ready to run the server with:

./enfugue-server/enfugue.sh

Press Ctrl+C to exit.

Windows

Download the win64 files here, and extract them using a program which allows extracting from multiple archives such as 7-Zip.

If you are using 7-Zip, you should not extract both files independently. If they are in the same directory when you unzip the first, 7-Zip will automatically unzip the second. The second file cannot be extracted on its own.

Locate the file enfugue-server.exe, and double-click it to run it. To exit, locate the icon in the bottom-right hand corner of your screen (the system tray) and right-click it, then select Quit.

MacOS

Download the macos file here, then double-click it to extract the package. When you run the application using the command below, your Mac will warn you of running downloaded packages, and you will have to perform an administrator override to allow it to run - you will be prompted to do this. To avoid this, you can run an included command like so:

./enfugue-server/unquarantine.sh

This command finds all the files in the installation and removes the com.apple.quarantine xattr from the file. This does not require administrator privilege. After doing this (or if you will grant the override,) run the server with:

./enfugue-server/enfugue.sh

Note: while the MacOS packages are compiled on x86 machines, they are tested and designed for the new M1/M2 ARM machines thanks to Rosetta, Apple's machine code translation system.

Upgrading

To upgrade any distribution, download and extract the appropriate upgrade package on this release. Copy all files in the upgrade package into your Enfugue installation directory, overwriting any existing files.

Installing and Running: Conda

To install with the provided Conda environments, you need to install a version of Conda.

After installing Conda and configuring it so it is available to your shell or command-line, download one of the environment files depending on your platform and graphics API.

  1. First, choose windows-, linux- or macos- based on your platform.
  2. Then, choose your graphics API:
    • If you are on MacOS, you only have access to MPS.
    • If you have a powerful next-generation Nvidia GPU (3000 series and better with at least 12 GB of VRAM), use tensorrt for all of the capabilities of cuda with the added ability to compile TensorRT engines. If you do not plan on using TensorRT, select cuda for the most optimized build for this API.
    • If you have any other Nvidia GPU or other CUDA-compatible device, select cuda.
    • Additional graphics APIs (rocm and directml) are being added and will be available soon.

Finally, using the file you downloaded, create your Conda environment:

conda env create -f <downloaded_file.yml>

You've now installed Enfugue and all dependencies. To run it, activate the environment and then run the installed binary.

conda activate enfugue
enfugue run

Installing and Running: Self-Managed Environment

If you would like to manage dependencies yourself, or want to install Enfugue into an environment to share with another Stable Diffusion UI, you can install enfugue via pip. This is the only method available for AMD GPUs at present.

pip install enfugue

If you are on Linux and want TensorRT support, execute:

pip install enfugue[tensorrt]

If you are on Windows and want TensorRT support, follow the steps detailed here.

Thank you!

0.2.2

8 months ago


New Features

1. Depth ControlNet XL

ControlNet Depth XL has been added. Additionally, settings have been exposed that allow users to specify the path to HuggingFace repositories for all ControlNets.

  • This will allow you to use ControlNets released on HuggingFace the moment they're available, without needing to wait for an update.

The ControlNet settings in the UI include empty fields for the as-yet-unreleased XL ControlNets.

2. Task Transparency

Improved visibility into what the back-end is doing by adding the current task to the UI and API.


The current task will change many times during extended diffusion plans.

3. Context Menus and Keyboard Shortcuts

Contextual menus have been added for the currently active canvas node and/or the currently visible image sample.

  • These menus perform the same operations as the buttons in the header of the node or in the toolbar of the sample. This just provides another way to access these options that is always visible, without you needing to move the canvas to the buttons/toolbar.

Keyboard shortcuts have also been added for all menu items, including the contextual menus added above.

  • Hold down the shift key to highlight the keys to press in the menu bar. Press the key corresponding to a menu item to expand the sub-menu, and show the keys to press for those items as well.
  • When you aren't typing in a text input, pressing Shift+Enter will invoke the engine (i.e. click 'Enfugue.')
  • When you are typing in the global prompt fields, press Ctrl+Enter to invoke the engine.

Showing context menus and their corresponding keyboard shortcuts.

4. Iterations Parameter

Iterations has been added as an invocation parameter in the UI and API. This allows you to generate more images using the same settings, without needing the VRAM to generate multiple samples.

  • Upscaling will be performed on each sample separately.
  • When using multiple samples, you can multiply your sample count by the number of iterations to determine the total number of images you will generate (e.g. 2 samples over 3 iterations produces 6 images).

Generating 2 samples at a time over 3 iterations.

5. Secondary Prompts

Secondary prompts have been added to all prompt inputs. This allows you to enter two separate prompts into the primary and secondary text encoders of SDXL base models.

  • This can have many varying effects; experimentation is encouraged. A minimal sketch of the underlying idea follows the examples below.

The UI for secondary prompts.


The sample prompt with different dog breeds as secondary prompts produces visibly similar images with small modifications. Click the image for full-resolution (1024×1024)
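
For reference, diffusers exposes the same idea on SDXL pipelines as prompt_2 and negative_prompt_2 for the second text encoder; the following is a minimal, illustrative sketch, not Enfugue's own code (Enfugue wires this up through its UI):

# Minimal sketch of passing separate primary/secondary prompts to SDXL's two
# text encoders with diffusers (illustrative only).
import torch
from diffusers import StableDiffusionXLPipeline

pipe = StableDiffusionXLPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0", torch_dtype=torch.float16
).to("cuda")

image = pipe(
    prompt="studio photograph of a dog, 85mm, shallow depth of field",  # primary encoder
    prompt_2="golden retriever",                                        # secondary encoder
    negative_prompt="blurry, low quality",
    negative_prompt_2="cartoon",
).images[0]
image.save("dog.png")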

6. Inpainting Options

More inpainting options have been made available to better control how you want to inpaint.

  • Cropped inpainting is now optional, and you can increase or decrease feathering to better blend your inpaints.
  • You can now select any model as an inpainting model, not just SD 1.5 inpainting pipelines. This includes XL models.
  • An additional setting has been exposed to disable Enfugue's automated inpainting checkpoint creation. If you disable this, you will need to provide your own fine-tuned (9-channel) checkpoint for inpainting if you wish to use one.

Inpainting with cropping.


Inpainting without cropping.

7. Extended VAE Options

You can now select separate VAEs for the various pipelines, to further enable mixing SD 1.5 and XL. Additionally, an "other" option is provided to allow selecting any VAE hosted on HuggingFace.

  • Since this will download arbitrary resources, it is only enabled for administrators. Users with limited capabilities cannot select 'other.'

The VAE 'Other' option for the refiner pipeline.

8. Image Metadata

Metadata has been added to images generated by Enfugue. Drag and drop the image into Enfugue to load the same settings that generated that image into your UI.

  • The metadata includes the state of all of your UI elements as well as the state generated from the backend.
  • This does not include any details from your computer, your network, or anything else outside of what was needed to generate the image.
  • At the present moment, if you copy and paste the image using your operating system's clipboard, the metadata is lost on all tested platforms. This is a limitation currently placed on transferring PNGs to your clipboard and cannot be worked around, so be sure to drag-and-drop or save-and-upload if you wish to share your settings.
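
As an aside, embedded PNG text metadata can also be inspected outside of the UI with Pillow. The exact key names Enfugue writes are not listed here, so treat them as something to discover by printing all chunks; the filename below is a hypothetical example:

# Hedged sketch: inspecting PNG text metadata with Pillow.
# The exact key(s) Enfugue writes are an assumption; print them all to find out.
from PIL import Image

image = Image.open("enfugue_output.png")   # hypothetical generated image
for key, value in image.text.items():      # PNG tEXt/iTXt chunks
    print(key, ":", value[:200])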

9. Upscale Pipeline Options

Provided the ability to select the pipeline to use when upscaling.

  • The default behavior is the same as the previous behavior; it will use a refiner pipeline when one is available, otherwise it will use the base pipeline. You can now select to use the base pipeline regardless of whether or not you set a refiner pipeline.

The upscale pipeline option.

10. Intermediate Options

Added an option for how often to decode latents and generate intermediate images; a minimal sketch of the idea follows the notes below.

  • It will also take longer to generate an image with intermediates than it will without. However, you will have a chance to see how the image is coming along and stop generation prior to completion if you enable them.
  • The default setting is the recommended value - every 10 inference steps, an intermediate will be decoded.
  • If you set this to 0, intermediates will be disabled, and as a result it will be possible to unload the decoder during denoising. This can help reduce VRAM consumption if you are hitting out-of-memory errors.
  • This only applies during primary inference and refining. Intermediates are always disabled during upscaling.

The intermediate steps option.
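
For context, a similar effect can be reproduced directly with diffusers using its step callback to decode the latents every N steps. This is an illustrative sketch only, not Enfugue's implementation, and it uses the older callback/callback_steps parameters:

# Illustrative sketch of decoding intermediate latents every N steps with a
# diffusers step callback (not Enfugue's implementation).
import torch
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
).to("cuda")

DECODE_EVERY = 10  # decode an intermediate preview every 10 inference steps

def save_intermediate(step, timestep, latents):
    # Decoding costs time and keeps the VAE decoder loaded, which is why
    # disabling intermediates can lower VRAM use.
    with torch.no_grad():
        images = pipe.vae.decode(latents / pipe.vae.config.scaling_factor).sample
    images = (images / 2 + 0.5).clamp(0, 1)
    pipe.numpy_to_pil(images.float().cpu().permute(0, 2, 3, 1).numpy())[0].save(
        f"intermediate_{step:03d}.png"
    )

pipe(
    "a watercolor landscape",
    callback=save_intermediate,
    callback_steps=DECODE_EVERY,
).images[0].save("final.png")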

11. Log Pause

Added a button to pause the log view, enabling you to take your time and read the entries rather than having to chase them when something is writing logs.


The log pause button.

12. Other

Further improved memory management resulting in lower VRAM overhead and overall faster inference. ControlNets are now loaded to the GPU only when required, and VAE will be unloaded when no longer required. This means some users who have had issues with using the large XL ControlNets may find them working better in this release.

Fixed Issues

  1. The issue that previously limited multi-diffusion (i.e. chunked/sliced image generation) to a select number of schedulers has been remedied, and you can now use all* schedulers for multi-diffusion as well as single (unchunked) diffusion. For this reason, the setting for selecting different schedulers for single- and multi-diffusion has been removed.
    • There is one exception, and that is the DPM Solver SDE scheduler, which will fail when using chunking.
  2. Fixed an issue whereby typing into the model picker could sort the results in a seemingly nonsensical way when using both configured models and checkpoints.
  3. Fixed an issue whereby MacOS users on a trackpad could not move the canvas if it was entirely occupied by a scribble node or an image node with an inpaint mask.
    • Since there is no middle-mouse on MacOS trackpads, and Control+Click cannot be overridden due to MacOS, you must use the Option+Click method to pan the canvas in such scenarios.

Full Changelog: https://github.com/painebenjamin/app.enfugue.ai/compare/0.2.1...0.2.2

How-To Guide

Installing and Running: Portable Distributions

Select a portable distribution if you'd like to avoid having to install other programs, or want to have an isolated executable file that doesn't interfere with other environments on your system.

Summary

Platform | Graphics API | File(s) | CUDA Version | Torch Version
MacOS | MPS | enfugue-server-0.2.2-macos-ventura-x86_64.tar.gz | N/A | 2.1.0.dev20230720
Windows | CUDA | enfugue-server-0.2.2-win-cuda-x86_64.zip.001, enfugue-server-0.2.2-win-cuda-x86_64.zip.002 | 12.1.1 | 2.1.0.dev20230720
Windows | CUDA+TensorRT | enfugue-server-0.2.2-win-tensorrt-x86_64.zip.001, enfugue-server-0.2.2-win-tensorrt-x86_64.zip.002 | 11.7.1 | 1.13.1
Linux | CUDA | enfugue-server-0.2.2-manylinux-cuda-x86_64.tar.gz.0, enfugue-server-0.2.2-manylinux-cuda-x86_64.tar.gz.1, enfugue-server-0.2.2-manylinux-cuda-x86_64.tar.gz.2 | 12.1.1 | 2.1.0.dev20230720
Linux | CUDA+TensorRT | enfugue-server-0.2.2-manylinux-tensorrt-x86_64.tar.gz.0, enfugue-server-0.2.2-manylinux-tensorrt-x86_64.tar.gz.1, enfugue-server-0.2.2-manylinux-tensorrt-x86_64.tar.gz.2 | 11.7.1 | 1.13.1

Linux

First, decide which version you want - with or without TensorRT support. TensorRT requires a powerful, modern Nvidia GPU. Then, download the appropriate manylinux files here, concatenate them and extract them. A console command to do that is:

cat enfugue-server-0.2.2* | tar -xvz

You are now ready to run the server with:

./enfugue-server/enfugue.sh

Press Ctrl+C to exit.

Windows

Download the win64 files here, and extract them using a program which allows extracting from multiple archives such as 7-Zip.

If you are using 7-Zip, you should not extract both files independently. If they are in the same directory when you unzip the first, 7-Zip will automatically unzip the second. The second file cannot be extracted on its own.

Locate the file enfugue-server.exe, and double-click it to run it. To exit, locate the icon in the bottom-right hand corner of your screen (the system tray) and right-click it, then select Quit.

MacOS

Download the macos file here, then double-click it to extract the package. When you run the application using the command below, your Mac will warn you of running downloaded packages, and you will have to perform an administrator override to allow it to run - you will be prompted to do this. To avoid this, you can run an included command like so:

./enfugue-server/unquarantine.sh

This command finds all the files in the installation and removes the com.apple.quarantine xattr from the file. This does not require administrator privilege. After doing this (or if you will grant the override,) run the server with:

./enfugue-server/enfugue.sh

Note: while the MacOS packages are compiled on x86 machines, they are tested and designed for the new M1/M2 ARM machines thanks to Rosetta, Apple's machine code translation system.

Upgrading

To upgrade any distribution, download and extract the appropriate upgrade package on this release. Copy all files in the upgrade package into your Enfugue installation directory, overwriting any existing files.

Installing and Running: Conda

To install with the provided Conda environments, you need to install a version of Conda.

After installing Conda and configuring it so it is available to your shell or command-line, download one of the environment files depending on your platform and graphics API.

  1. First, choose windows-, linux- or macos- based on your platform.
  2. Then, choose your graphics API:
    • If you are on MacOS, you only have access to MPS.
    • If you have a powerful next-generation Nvidia GPU (3000 series and better with at least 12 GB of VRAM), use tensorrt for all of the capabilities of cuda with the added ability to compile TensorRT engines.
    • If you have any other Nvidia GPU or other CUDA-compatible device, select cuda.
    • Additional graphics APIs (rocm and directml) are being added and will be available soon.

Finally, using the file you downloaded, create your Conda environment:

conda env create -f <downloaded_file.yml>

You've now installed Enfugue and all dependencies. To run it, activate the environment and then run the installed binary.

conda activate enfugue
enfugue run

Installing and Running: Self-Managed Environment

If you would like to manage dependencies yourself, or want to install Enfugue into an environment to share with another Stable Diffusion UI, you can install enfugue via pip. This is the only method available for AMD GPUs at present.

pip install enfugue

If you are on Linux and want TensorRT support, execute:

pip install enfugue[tensorrt]

If you are on Windows and want TensorRT support, follow the steps detailed here.

Thank you!

0.2.1

9 months ago

Made with SDXL Canny Edge ControlNet

Release Notes

New Platform


ENFUGUE for Docker is Here!

Simply execute the following to pull and run:

docker pull ghcr.io/painebenjamin/app.enfugue.ai:latest
docker run --rm --gpus all --runtime nvidia -p 45554:45554 ghcr.io/painebenjamin/app.enfugue.ai:latest run

See here for more information. Unfortunately for the moment this is Linux-only.

New Features

  1. SDXL LoRA Support
    1. SDXL LoRA and SD 1.5 LoRA are not interchangeable.
    2. When downloading from CivitAI, the version of SD the LoRA is compatible with will be visible.
  2. SDXL ControlNet Support
    1. Only Canny Edge is available currently.
    2. There is no configuration to be done for this; when using an XL checkpoint, you simply can now use images with Canny Edge ControlNet selected.

SDXL Canny Edge ControlNet - Source Image from Unsplash
  3. GPU-Accelerated Frontend Image Alterations
    1. Click 'Adjust' for common image adjustments, or 'Filter' for a selection of image filters.

Adjustments are performed in real-time - GPU acceleration possible thanks to GPU.js

Changes

  1. Made the /invoke endpoint more flexible. See here for API documentation.
  2. MacOS now operates in half-precision by default.
  3. When using the Offload pipeline switch mode, there is now no case where the CPU will have two pipelines in memory at once. Pipelines are now swapped one model at a time in order to avoid high peak memory usage.
  4. When clicking on a result in the 'Results' window, it will now place itself on the canvas, instead of opening an individual image inspector.
  5. During initialization, Enfugue will now offer to create directories that do not exist, instead of simply producing an error.

Fixed Issues

  1. Fixed an issue where optimized inpainting would cut off some inpaint areas in the Y dimension.
  2. Fixed an issue where optimized inpainting would feather on the bottom or right edge of an image, resulting in an undesired vignette effect.
  3. Fixed a rare issue where text encoders would not be loaded to the GPU before they were used.

How-To Guide

Installing and Running: Portable Distributions

Select a portable distribution if you'd like to avoid having to install other programs, or want to have an isolated executable file that doesn't interfere with other environments on your system.

Summary

Platform | Graphics API | File(s) | CUDA Version | Torch Version
MacOS | MPS | enfugue-server-0.2.1-macos-ventura-x86_64.tar.gz | N/A | 2.1.0.dev20230720
Windows | CUDA | enfugue-server-0.2.1-win-cuda-x86_64.zip.001, enfugue-server-0.2.1-win-cuda-x86_64.zip.002 | 12.1.1 | 2.1.0.dev20230720
Windows | CUDA+TensorRT | enfugue-server-0.2.1-win-tensorrt-x86_64.zip.001, enfugue-server-0.2.1-win-tensorrt-x86_64.zip.002 | 11.7.1 | 1.13.1
Linux | CUDA | enfugue-server-0.2.1-manylinux-cuda-x86_64.tar.gz.0, enfugue-server-0.2.1-manylinux-cuda-x86_64.tar.gz.1, enfugue-server-0.2.1-manylinux-cuda-x86_64.tar.gz.2 | 12.1.1 | 2.1.0.dev20230720
Linux | CUDA+TensorRT | enfugue-server-0.2.1-manylinux-tensorrt-x86_64.tar.gz.0, enfugue-server-0.2.1-manylinux-tensorrt-x86_64.tar.gz.1, enfugue-server-0.2.1-manylinux-tensorrt-x86_64.tar.gz.2 | 11.7.1 | 1.13.1

Linux

First, decide which version you want - with or without TensorRT support. TensorRT requires a powerful, modern Nvidia GPU. Then, download the appropriate manylinux files here, concatenate them and extract them. A console command to do that is:

cat enfugue-server-0.2.1* | tar -xvz

You are now ready to run the server with:

./enfugue-server/enfugue.sh

Press Ctrl+C to exit.

Windows

Download the win64 files here, and extract them using a program which allows extracting from multiple archives such as 7-Zip.

If you are using 7-Zip, you should not extract both files independently. If they are in the same directory when you unzip the first, 7-Zip will automatically unzip the second. The second file cannot be extracted on its own.

Locate the file enfugue-server.exe, and double-click it to run it. To exit, locate the icon in the bottom-right hand corner of your screen (the system tray) and right-click it, then select Quit.

MacOS

Download the macos file here, then double-click it to extract the package. When you run the application using the command below, your Mac will warn you of running downloaded packages, and you will have to perform an administrator override to allow it to run - you will be prompted to do this. To avoid this, you can run an included command like so:

./enfugue-server/unquarantine.sh

This command finds all the files in the installation and removes the com.apple.quarantine xattr from the file. This does not require administrator privilege. After doing this (or if you will grant the override,) run the server with:

./enfugue-server/enfugue.sh

Note: while the MacOS packages are compiled on x86 machines, they are tested and designed for the new M1/M2 ARM machines thanks to Rosetta, Apple's machine code translation system.

Upgrading

To upgrade any distribution, download and extract the appropriate upgrade package on this release. Copy all files in the upgrade package into your Enfugue installation directory, overwriting any existing files.

Installing and Running: Conda

To install with the provided Conda environments, you need to install a version of Conda.

After installing Conda and configuring it so it is available to your shell or command-line, download one of the environment files depending on your platform and graphics API.

  1. First, choose windows-, linux- or macos- based on your platform.
  2. Then, choose your graphics API:
    • If you are on MacOS, you only have access to MPS.
    • If you have a powerful next-generation Nvidia GPU (3000 series and better with at least 12 GB of VRAM), use tensorrt for all of the capabilities of cuda with the added ability to compile TensorRT engines.
    • If you have any other Nvidia GPU or other CUDA-compatible device, select cuda.
    • Additional graphics APIs (rocm and directml) are being added and will be available soon.

Finally, using the file you downloaded, create your Conda environment:

conda env create -f <downloaded_file.yml>

You've now installed Enfugue and all dependencies. To run it, activate the environment and then run the installed binary.

conda activate enfugue
enfugue run

Installing and Running: Self-Managed Environment

If you would like to manage dependencies yourself, or want to install Enfugue into an environment to share with another Stable Diffusion UI, you can install enfugue via pip. This is the only method available for AMD GPUs at present.

pip install enfugue

If you are on Linux and want TensorRT support, execute:

pip install enfugue[tensorrt]

If you are on Windows and want TensorRT support, follow the steps detailed here.

Thank you!

0.2.0

9 months ago

ENFUGUE is entering beta!

New Platform


ENFUGUE for Apple Silicon is Here!

Installing and Running: Portable Distributions

Select a portable distribution if you'd like to avoid having to install other programs, or want to have an isolated executable file that doesn't interfere with other environments on your system.

Summary

Platform | Graphics API | File(s) | CUDA Version | Torch Version
MacOS | MPS | enfugue-server-0.2.0-macos-ventura-x86_64.tar.gz | N/A | 2.1.0.dev20230720
Windows | CUDA | enfugue-server-0.2.0-win-cuda-x86_64.zip.001, enfugue-server-0.2.0-win-cuda-x86_64.zip.002 | 12.1.1 | 2.1.0.dev20230720
Windows | CUDA+TensorRT | enfugue-server-0.2.0-win-tensorrt-x86_64.zip.001, enfugue-server-0.2.0-win-tensorrt-x86_64.zip.002 | 11.7.1 | 1.13.1
Linux | CUDA | enfugue-server-0.2.0-manylinux-cuda-x86_64.tar.gz.0, enfugue-server-0.2.0-manylinux-cuda-x86_64.tar.gz.1, enfugue-server-0.2.0-manylinux-cuda-x86_64.tar.gz.2 | 12.1.1 | 2.1.0.dev20230720
Linux | CUDA+TensorRT | enfugue-server-0.2.0-manylinux-tensorrt-x86_64.tar.gz.0, enfugue-server-0.2.0-manylinux-tensorrt-x86_64.tar.gz.1, enfugue-server-0.2.0-manylinux-tensorrt-x86_64.tar.gz.2 | 11.7.1 | 1.13.1

Linux

First, decide which version you want - with or without TensorRT support. TensorRT requires a powerful, modern Nvidia GPU. Then, download the appropriate manylinux files here, concatenate them and extract them. A console command to do that is:

cat enfugue-server-0.2.0* | tar -xvz

You are now ready to run the server with:

./enfugue-server/enfugue.sh

Press Ctrl+C to exit.

Windows

Download the win64 files here, and extract them using a program which allows extracting from multiple archives such as 7-Zip.

If you are using 7-Zip, you should not extract both files independently. If they are in the same directory when you unzip the first, 7-Zip will automatically unzip the second. The second file cannot be extracted on its own.

Locate the file enfugue-server.exe, and double-click it to run it. To exit, locate the icon in the bottom-right hand corner of your screen (the system tray) and right-click it, then select Quit.

MacOS

Download the macos file here, then double-click it to extract the package. When you run the application using the command below, your Mac will warn you of running downloaded packages, and you will have to perform an administrator override to allow it to run - you will be prompted to do this. To avoid this, you can run an included command like so:

./enfugue-server/unquarantine.sh

This command finds all the files in the installation and removes the com.apple.quarantine xattr from the file. This does not require administrator privilege. After doing this (or if you will grant the override,) run the server with:

./enfugue-server/enfugue.sh

Note: while the MacOS packages are compiled on x86 machines, they are tested and designed for the new M1/M2 ARM machines thanks to Rosetta, Apple's machine code translation system.

Installing and Running: Conda

To install with the provided Conda environments, you need to install a version of Conda.

After installing Conda and configuring it so it is available to your shell or command-line, download one of the environment files depending on your platform and graphics API.

  1. First, choose windows-, linux- or macos- based on your platform.
  2. Then, choose your graphics API:
    • If you are on MacOS, you only have access to MPS.
    • If you have a powerful next-generation Nvidia GPU (3000 series and better with at least 12 GB of VRAM), use tensorrt for all of the capabilities of cuda with the added ability to compile TensorRT engines.
    • If you have any other Nvidia GPU or other CUDA-compatible device, select cuda.
    • Additional graphics APIs (rocm and directml) are being added and will be available soon.

Finally, using the file you downloaded, create your Conda environment:

conda env create -f <downloaded_file.yml>

You've now installed Enfugue and all dependencies. To run it, activate the environment and then run the installed binary.

conda activate enfugue
enfugue run

Installing and Running: Self-Managed Environment

If you would like to manage dependencies yourself, or want to install Enfugue into an environment to share with another Stable Diffusion UI, you can install enfugue via pip. This is the only method available for AMD GPUs at present.

pip install enfugue

If you are on Linux and want TensorRT support, execute:

pip install enfugue[tensorrt]

If you are on Windows and want TensorRT support, follow the steps detailed here.

New Features

  1. Full SDXL Support
    1. Simply select SDXL from the model selector and it will be downloaded when you first invoke. You can also start the download from the popup screen that should appear when you first view the application.
    2. TensorRT is disabled for SDXL, and will remain so for the foreseeable future. TensorRT relies on being able to compress models to a size of 2 GB or less, and it will be very difficult to optimize SDXL's 5 GB Unet to the required size.
    3. A good number of features are unsupported by SDXL at large at the moment, including all ControlNets and Inpainting. You will receive an error indicating as much if you attempt to use a ControlNet on the canvas. Upscaling ControlNets will be ignored. If you try to use inpainting, either the configured inpainter or the SD 1.5 inpainting checkpoint will be used.

Rendering an image using SDXL base and refiner. Note some loading time is omitted.

  2. Refinement added as a part of the general workflow.
    1. Adds a refiner checkpoint in the model configuration manager.
    2. Adds a refiner checkpoint selector in the "Additional Models" section when not using preconfigured models.
    3. Adds a sidebar UI for refiner denoising strength, refiner guidance scale, and refiner positive/negative aesthetic scores. Note: aesthetic scores are only used when specifically using the SDXL refiner checkpoint, as that's the only model that understands them at the moment. If you choose a different checkpoint as a refiner, these will be ignored.
    4. When refining an image using the SDXL refiner that is smaller than 1024×1024, the image will be scaled up appropriately for diffusion, and scaled back down when returned to the user.
 
Configuring SDXL in the model picker (left) and making a preconfigured model for SDXL (right)

  3. Added the ability to specify inpainting checkpoints separately as a part of the same configuration as above.
  4. Added the ability to specify VAE checkpoint separately as a part of the same configuration as above.
  5. Added a large array of default values that can be specified in pre-configured models, including things like guidance scale, refinement strength, diffusion size, etc.
  6. Added the ability to specify schedulers as a part of the same configuration as above.
    1. All Karras schedulers are supported.
    2. A second scheduler input is provided for use when doing multi-diffusion, as not all Karras schedulers will work with this.
    3. The default schedulers are DDIM for SD 1.5 and Euler Discrete for SDXL, both for regular and multi-diffusion.
  7. Added support for LyCORIS as part of the same configuration as above, as well as added to the UI for the CivitAI browser.

Browsing CivitAI's LyCORIS Database

  8. Added smart inpaint for images with transparency and a hand-drawn mask; this is now performed in a single step.

Intelligent inpaint mask merging

  9. Added smart inpaint for large images with small inpaint masks; a minimal bounding box will now be located and blended into the final image, allowing for quick inpainting on very large images. An illustrative sketch of the bounding-box idea follows below.

Denoising only what was requested can reduce processing time by as much as 90%
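
As referenced above, here is an illustrative sketch of the bounding-box idea using Pillow; it is simplified, the file names are hypothetical, and it is not Enfugue's actual code:

# Illustrative sketch of inpainting only a minimal bounding box around the
# mask and pasting the result back (simplified; not Enfugue's actual code).
from PIL import Image

def crop_for_inpaint(image, mask, padding=32):
    """Return the padded bounding box of the non-zero (inpaint) area of the mask."""
    bbox = mask.getbbox()  # None if the mask is empty
    left, top, right, bottom = bbox
    left = max(left - padding, 0)
    top = max(top - padding, 0)
    right = min(right + padding, image.width)
    bottom = min(bottom + padding, image.height)
    return (left, top, right, bottom)

image = Image.open("large_image.png")           # hypothetical inputs
mask = Image.open("mask.png").convert("L")
box = crop_for_inpaint(image, mask)
region, region_mask = image.crop(box), mask.crop(box)
# ... run diffusion inpainting on (region, region_mask) only ...
inpainted_region = region                       # placeholder for the diffusion result
image.paste(inpainted_region, box[:2])
image.save("inpainted.png")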

  10. Improved region prompt performance with less processing time and less concept bleeding between regions.

Region prompting is now more predictable.

  11. Added the following ControlNets and their corresponding image processors:

    1. PIDI (Soft Edge)
    2. Line Art
    3. Anime Line Art
    4. Depth (MiDaS)
    5. Normal (Estimate)
    6. OpenPose
  12. Added a log glance view that is always visible when there are logs to be read to further improve transparency.


The log glance view (upper right) and the log window.

  13. Added a button to enable/disable animations in the front-end. This will disable all sliding gradients and spinners, but will keep the progress bar functioning.

Enabling and disabling animations.

  14. Consolidated to a single more obvious "Stop Engine" button that is always visible when the engine is running.

The 'Stop Engine' button in the upper-right-hand corner.

  15. Added the following configuration options:
    1. Pipeline Switch Mode: this controls how the backend changes between the normal pipeline, inpainter pipeline, and refiner pipeline. The default method is to offload them to the CPU; you can also unload them completely or keep them all in VRAM.
    2. Pipeline Cache Mode: This controls how checkpoints are cached into diffusers caches. These caches load much more quickly than checkpoints, but take up additional space. The default is to cache SDXL and TensorRT pipelines. You cannot disable caching TensorRT pipelines, but you can enable caching all pipelines.
    3. Precision Mode: This allows the user to force full-precision (FP32) for all models. The default options will use half-precision (FP16) when it is available. You should only change this option if you encounter issues; ENFUGUE will disable half-precision in situations where it cannot be used, such as when using HIP (AMD devices) or MPS (Macs.)

The new pipeline configuration options. More information is available in the UI; hover over the inputs for details.

Thank you!