FoldFlow: SE(3)-Stochastic Flow Matching for Protein Backbone Generation

Description

This is the official repository for the paper SE(3)-Stochastic Flow Matching for Protein Backbone Generation.

We propose a new family of Flow Matching methods, called FoldFlow, tailored for distributions on SE(3), with a focus on protein backbone generation. Our three proposed methods are:

  • The first is FoldFlow-base. Inspired by Riemannian Flow Matching, we develop a Flow Matching approach to generate data living on the SO(3) manifold.
  • The second is FoldFlow-OT, which generalizes FoldFlow-base by drawing samples from a minibatch optimal transport coupling, similarly to OT-CFM.
  • The third is FoldFlow-SFM, a stochastic version of FoldFlow-OT.
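
To give a feel for what Flow Matching on SO(3) involves, here is a minimal NumPy sketch of a geodesic interpolant between two rotations, the kind of path a Riemannian Flow Matching approach would regress against. It is an illustration only and not code from this repository: the helper names (`hat`, `so3_exp`, `so3_log`, `geodesic_interpolant`) are hypothetical, and the logarithm is not robust for rotation angles close to pi.

```python
import numpy as np


def hat(v):
    """Map a 3-vector to the corresponding skew-symmetric matrix in so(3)."""
    x, y, z = v
    return np.array([[0.0, -z, y], [z, 0.0, -x], [-y, x, 0.0]])


def so3_exp(v):
    """Rodrigues formula: rotation vector -> rotation matrix."""
    theta = np.linalg.norm(v)
    K = hat(v)
    if theta < 1e-8:
        # First-order approximation near the identity.
        return np.eye(3) + K
    return (
        np.eye(3)
        + np.sin(theta) / theta * K
        + (1.0 - np.cos(theta)) / theta**2 * (K @ K)
    )


def so3_log(R):
    """Rotation matrix -> rotation vector (assumes the angle is not near pi)."""
    cos_theta = np.clip((np.trace(R) - 1.0) / 2.0, -1.0, 1.0)
    theta = np.arccos(cos_theta)
    w = np.array([R[2, 1] - R[1, 2], R[0, 2] - R[2, 0], R[1, 0] - R[0, 1]])
    if theta < 1e-8:
        return 0.5 * w
    return theta / (2.0 * np.sin(theta)) * w


def geodesic_interpolant(r0, r1, t):
    """Point at time t on the geodesic from r0 (t=0) to r1 (t=1)."""
    return r0 @ so3_exp(t * so3_log(r0.T @ r1))


# Illustrative endpoints (arbitrary rotation vectors, not real data).
r0 = so3_exp(np.array([0.1, 0.2, 0.3]))
r1 = so3_exp(np.array([-0.4, 0.5, 0.2]))
r_half = geodesic_interpolant(r0, r1, 0.5)
```

The interpolant stays on the manifold for every `t`, which is the reason to interpolate through the exponential and logarithm maps rather than linearly between matrix entries.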

Our experiments include:

  • Generation of synthetic SO(3) data.
  • Protein backbone design.
  • Equilibrium conformation generation.

Note that our methods can be adapted for all applications where data live on the SO(3)/SE(3) manifold.

Installation

Install dependencies

# clone project
git clone https://github.com/DreamFold/FoldFlow.git
cd FoldFlow

# [OPTIONAL] create conda environment
conda create -n foldflow python=3.9
conda activate foldflow

# install requirements
pip install -r requirements.txt

To run our Jupyter notebooks, use the following commands after installing our package.

# install ipykernel
conda install -c anaconda ipykernel

# install conda env in jupyter notebook
python -m ipykernel install --user --name=foldflow

# launch our notebooks with the foldflow kernel

Current Code

The current repository only contains toy experiments for learning an SO(3) multimodal density using all three FoldFlow models.
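
As a rough picture of what such a toy setup can look like, the sketch below draws uniform noise rotations and samples from a small multimodal target on SO(3), then pairs them with a minibatch optimal transport assignment in the spirit of FoldFlow-OT. Everything here is an assumption made for illustration: the toy target, the function names, the squared geodesic cost, and the use of SciPy's `linear_sum_assignment` as the OT solver are not taken from this repository.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment
from scipy.spatial.transform import Rotation


def sample_uniform_so3(n, rng):
    """Uniform rotations: normalised Gaussian 4-vectors are uniform unit quaternions."""
    return Rotation.from_quat(rng.normal(size=(n, 4)))


def sample_toy_modes(n, rng, n_modes=3, spread=0.1):
    """Hypothetical multimodal target: small random rotations around a few modes."""
    modes = Rotation.from_rotvec(rng.uniform(-np.pi / 2, np.pi / 2, size=(n_modes, 3)))
    labels = rng.integers(n_modes, size=n)
    jitter = Rotation.from_rotvec(spread * rng.normal(size=(n, 3)))
    return modes[labels] * jitter


def geodesic_cost(r0, r1):
    """Pairwise squared geodesic distance (squared relative rotation angle)."""
    m0, m1 = r0.as_matrix(), r1.as_matrix()
    # trace(R0_i^T R1_j) for every pair (i, j)
    traces = np.einsum("iab,jab->ij", m0, m1)
    angles = np.arccos(np.clip((traces - 1.0) / 2.0, -1.0, 1.0))
    return angles**2


def ot_pairing(r0, r1):
    """For equal-size uniform minibatches, an optimal plan is a one-to-one assignment."""
    cost = geodesic_cost(r0, r1)
    rows, cols = linear_sum_assignment(cost)
    return cols, cost


rng = np.random.default_rng(0)
noise = sample_uniform_so3(8, rng)
data = sample_toy_modes(8, rng)
cols, cost = ot_pairing(noise, data)
paired_data = data[cols]  # paired_data[i] is matched to noise[i]
```

With the independent coupling, `noise[i]` would simply be paired with `data[i]`. The OT pairing reorders the data so that the total squared geodesic distance across the minibatch is as small as possible.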

Planned Updates

  • Inference code for protein experiments
  • Training code for protein experiments
  • Equilibrium conformation generation

Citations

If this codebase is useful for your research, please consider citing us.

@misc{bose2023se3stochastic,
      title={SE(3)-Stochastic Flow Matching for Protein Backbone Generation}, 
      author={Avishek Joey Bose and Tara Akhound-Sadegh and Kilian Fatras and Guillaume Huguet and Jarrid Rector-Brooks and Cheng-Hao Liu and Andrei Cristian Nica and Maksym Korablyov and Michael Bronstein and Alexander Tong},
      year={2023},
      eprint={2310.02391},
      archivePrefix={arXiv},
      primaryClass={cs.LG}
}

Contribute

We welcome issues, pull requests (especially bug fixes), and other contributions. We will do our best to improve readability and answer questions!

Licences

FoldFlow by Dreamfold is licensed under Creative Commons Attribution-NonCommercial 4.0 International (CC BY-NC 4.0).

Warning: the current code uses PyTorch 1.13 and torchdyn 1.0.6.
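
If you want to confirm that your environment matches these versions, one option is a small standard-library check like the sketch below. The helper `check_versions` is hypothetical and not part of this repository, and the check assumes the packages are installed under the distribution names `torch` and `torchdyn`.

```python
from importlib.metadata import PackageNotFoundError, version

# Versions mentioned in the warning above; keys are assumed pip distribution names.
EXPECTED = {"torch": "1.13", "torchdyn": "1.0.6"}


def check_versions(expected):
    """Return {name: (installed_version_or_None, matches_expected_prefix)}."""
    report = {}
    for name, wanted in expected.items():
        try:
            installed = version(name)
        except PackageNotFoundError:
            installed = None
        matches = installed is not None and installed.startswith(wanted)
        report[name] = (installed, matches)
    return report


for name, (installed, ok) in check_versions(EXPECTED).items():
    status = "ok" if ok else "mismatch or missing"
    print(f"{name}: installed={installed}, expected={EXPECTED[name]} ({status})")
```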

This code base is heavily inspired by the TorchCFM library! For Flow Matching on data living in Euclidean spaces, see https://github.com/atong01/conditional-flow-matching
