Commits · e2479cb5393f4458cce98ad20272b64aad6ae9cb · OpenDAS / OpenFold

06 May, 2024 1 commit

Add more efficient script to generate all-seqs FASTA · e2479cb5

Lukas Jarosch authored May 05, 2024

The previous data_dir_to_fasta.py script is very slow and requires fully reparsing mmCIF files. This new script is much faster and uses the sequence information from the alignment data instead. Note that this will not include chains for which alignments could not be generated, but we can't use those during training anyway.

e2479cb5
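
The idea in this commit message, taking each chain's sequence from its alignment data rather than reparsing mmCIF files, can be sketched roughly as follows. This is only an illustration and not the script added by the commit: it assumes each chain has an A3M alignment whose first record is the query sequence, and the helper names `query_from_a3m` and `to_fasta` are hypothetical.

```python
def query_from_a3m(a3m_text: str) -> str:
    """Return the first sequence in an A3M string (assumed to be the query)."""
    seq_lines = []
    seen_header = False
    for line in a3m_text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if seen_header:
                break  # a second record starts, so the query is complete
            seen_header = True
        elif seen_header:
            seq_lines.append(line)
    return "".join(seq_lines)


def to_fasta(alignments: dict[str, str]) -> str:
    """Build FASTA text from a mapping of chain ID -> A3M text (hypothetical input)."""
    records = []
    for chain_id, a3m_text in sorted(alignments.items()):
        seq = query_from_a3m(a3m_text)
        if seq:  # chains without a usable alignment are skipped
            records.append(f">{chain_id}\n{seq}\n")
    return "".join(records)


# Illustrative data only: one chain with an alignment, one without.
example = {
    "1abc_A": ">query\nMKV\nLA\n>hit1\nMKVIA\n",
    "2xyz_B": "",
}
print(to_fasta(example), end="")
```

As in the commit message, a chain with no alignment (`2xyz_B` here) simply does not appear in the output.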

20 Mar, 2024 2 commits

Add script for expanding the alignment dir with duplicates · 94819bf1

Lukas Jarosch authored Mar 19, 2024

This adds support for duplicate chain expansion for the alignment dir format. This script can be run on the flattened non-redundant RODA alignments to add explicit directories for all of the duplicate chains in the duplicate_chains file, symlinked to their representative chain alignment directory.

94819bf1
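
A minimal sketch of the duplicate-chain expansion described above, under stated assumptions: the duplicate_chains file is taken to hold one group of whitespace-separated chain IDs per line (the real format may differ), and the function name `expand_duplicates` is hypothetical. Within each group, the member that already has an alignment directory is treated as the representative, and every other member gets a symlink to it.

```python
import os
import tempfile
from pathlib import Path


def expand_duplicates(alignment_dir: Path, duplicate_lines: list[str]) -> int:
    """Symlink duplicate chains to their representative's alignment directory.

    Assumed input (hypothetical): one group per line, chain IDs separated by
    whitespace. Returns the number of symlinks created.
    """
    created = 0
    for line in duplicate_lines:
        chains = line.split()
        # Treat the first group member with an existing directory as representative.
        rep = next((c for c in chains if (alignment_dir / c).is_dir()), None)
        if rep is None:
            continue  # no alignments exist for this group
        for chain in chains:
            link = alignment_dir / chain
            if chain == rep or link.exists() or link.is_symlink():
                continue
            # A relative target keeps the directory tree relocatable.
            os.symlink(rep, link, target_is_directory=True)
            created += 1
    return created


# Illustrative run in a temporary directory with made-up chain IDs.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "1abc_A").mkdir()
    (root / "1abc_A" / "hits.a3m").write_text(">query\nMKV\n")
    print(expand_duplicates(root, ["1abc_A 1abc_B 2xyz_A"]))
    print(sorted(p.name for p in root.iterdir()))
```

Symlinks avoid copying alignment files, so the expanded directory costs almost no extra disk space.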

Add duplicate chain file support to alignment DB script · ee0c5dbe

Lukas Jarosch authored Mar 19, 2024

This makes it more straightforward to create an alignment database directly from the flattened RODA downloads.

ee0c5dbe

19 Mar, 2024 2 commits
- Add default shard number · e6780504
  Lukas Jarosch authored Mar 19, 2024
  
  e6780504
- Improve type hints and formatting · 77860bb7
  Lukas Jarosch authored Mar 19, 2024
  
  77860bb7
20 Feb, 2024 2 commits
- Add improved alignment-db creation script · 70918209
  Lukas Jarosch authored Oct 06, 2023
  
  - much faster due to the use of threading and multiprocessing
  - also supports sharding
  
  70918209
- Scripts from Lukas to be used in improved setup process · 50949b9a
  Christina Floristean authored Sep 08, 2023
  
  50949b9a
14 Feb, 2024 1 commit
- Add log statement to weight conversion script · e31e0378
  Jennifer authored Feb 13, 2024
  
  e31e0378
12 Feb, 2024 1 commit
- bugfixes and adds a section to convert optim files · 775f77dd
  Jennifer authored Feb 12, 2024
  
  775f77dd
09 Feb, 2024 1 commit
- Adjust weight conversion and add a script for converting checkpoints. · 260592e0
  Jennifer authored Feb 09, 2024
  
  260592e0
08 Feb, 2024 1 commit
- updates zero_to_fp32.py for new deepspeed version and import_weight bugfix · 1df591b0
  Jennifer authored Feb 08, 2024
  
  1df591b0
08 Dec, 2023 1 commit
- Remove env restart from setup script · 67a00a6c
  Christina Floristean authored Dec 08, 2023
  
  67a00a6c
29 Nov, 2023 1 commit
- Adds Soloseq parameter download script. · 4a50c9c4
  jnwei authored Nov 29, 2023
  
  4a50c9c4
13 Nov, 2023 1 commit
- Update setup script and refactor qkv prep · 7fb12cf5
  Christina Floristean authored Nov 13, 2023
  
  7fb12cf5
03 Nov, 2023 1 commit
- Fix for loading old OF weights into refactored model · f65b75fe
  Christina Floristean authored Nov 03, 2023
  
  f65b75fe
30 Oct, 2023 1 commit
- Added multimer inference to README · 15850092
  Christina Floristean authored Oct 30, 2023
  
  15850092
27 Oct, 2023 1 commit
- Adds KMP_AFFINITY to conda environment. · e3716118
  Jennifer Wei authored Oct 27, 2023
  
  e3716118
24 Oct, 2023 2 commits
- Separate out input parsing code in `EmbeddingGenerator` · 86b990d6
  Sachin Kadyan authored Oct 24, 2023
  
  Bugfix: Corrected paths for just-in-time embedding generation
  
  86b990d6
- Just-in-time embedding generation for the SoloSeq model · 8185c307
  Sachin Kadyan authored Oct 24, 2023
  
  8185c307
23 Oct, 2023 2 commits
- More cleaning of bulk embedding generation script · 92835fd5
  Sachin Kadyan authored Oct 23, 2023
  
  92835fd5
- Cleaned up `precompute_embeddings.py`. · 0026173e
  Sachin Kadyan authored Oct 22, 2023
  
  0026173e
21 Oct, 2023 1 commit
- New script for generating ESM embeddings in bulk · bcc6d97b
  Sachin Kadyan authored Oct 09, 2023
  
  bcc6d97b
20 Oct, 2023 2 commits
- Remove conda env config setting and update to README · d6ae9f58
  jnwei authored Oct 20, 2023
  
  d6ae9f58
- Moves python packages to conda installation instead of pip · fcba3358
  Jennifer Wei authored Oct 20, 2023
  
  - Adds helper line to automatically prepend conda library to $LD_LIBRARY_PATH
  
  fcba3358
17 Oct, 2023 2 commits
- Updating $LD_LIBRARY_PATH to include conda environment library. · 705c2677
  Jennifer Wei authored Oct 17, 2023
  
  705c2677
- update installation scripts. · 4fde713c
  Jennifer Wei authored Oct 17, 2023
  
  4fde713c
16 Oct, 2023 2 commits
- Refactoring multimer data pipeline and permutation alignment. · 0cf1541c
  Christina Floristean authored Oct 16, 2023
  
  0cf1541c
- Removes conda installation from installation script. · 7922bd57
  Jennifer Wei authored Oct 16, 2023
  
  7922bd57
06 Oct, 2023 1 commit
- Add improved alignment-db creation script · a3bb3c40
  Lukas Jarosch authored Oct 06, 2023
  
  - much faster due to the use of threading and multiprocessing
  - also supports sharding
  
  a3bb3c40
20 Sep, 2023 1 commit
- Preserves one copy of `tests/test_data/sample_feats.pickle.gz` for unit tests in test_data_transforms.py · 73ff40b6
  Jennifer Wei authored Sep 20, 2023
  
  73ff40b6
13 Sep, 2023 1 commit
- Clean up DS kernel integration and test, add cutlass to installation procedure · 2bf18520
  Christina Floristean authored Sep 13, 2023
  
  2bf18520
08 Sep, 2023 1 commit
- Scripts from Lukas to be used in improved setup process · afd91982
  Christina Floristean authored Sep 08, 2023
  
  afd91982
02 Aug, 2023 1 commit
- Bug fixes for multimer inference and monomer training · fdcb72e8
  Christina Floristean authored Aug 02, 2023
  
  fdcb72e8
02 Jun, 2023 1 commit
- Fixed bug in triangle multiplicative update and added early stop recycling. · c1129bef
  Christina Floristean authored Jun 02, 2023
  
  c1129bef
26 Apr, 2023 1 commit
- Advance python version to 3.9 to build docker in Ubuntu Lunar Lobster · 942f4d6e
  Vaclav Hanzl authored Apr 26, 2023
  
  942f4d6e
18 Apr, 2023 1 commit
- Added UniRef30 to data pipeline · 425bdb5e
  Christina Floristean authored Apr 18, 2023
  
  425bdb5e
17 Apr, 2023 1 commit
- Multimer v3 updates · 68828c49
  Christina Floristean authored Apr 17, 2023
  
  68828c49
14 Mar, 2023 1 commit

Fix check for max_seqlen. · bdbfef1d

Jonathan King authored Mar 14, 2023

Previously, long sequences were not excluded from the script.
This commit changes the comparison to exclude sequences with length
greater than args.max_seqlen.

bdbfef1d
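
The corrected check can be illustrated with a short sketch. The function name `filter_by_length` and the sample data are hypothetical; only the rule itself, dropping sequences longer than `max_seqlen`, comes from the commit message.

```python
def filter_by_length(seqs: dict[str, str], max_seqlen: int) -> dict[str, str]:
    """Keep only sequences whose length is at most max_seqlen."""
    return {name: seq for name, seq in seqs.items() if len(seq) <= max_seqlen}


# Illustrative data: lengths 3, 5 and 7 with a limit of 5.
seqs = {"short": "MKV", "exact": "MKVLA", "long": "MKVLAGG"}
print(filter_by_length(seqs, max_seqlen=5))
# prints {'short': 'MKV', 'exact': 'MKVLA'}
```

A sequence whose length equals the limit is kept; only strictly longer ones are excluded.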

07 Mar, 2023 1 commit

Add missing warmup_num_steps parameter. · 9b2a4f83

Jonathan King authored Mar 07, 2023

warmup_num_steps is a valid user argument but is not added to the config.
This commit adds this argument to the configuration output.

9b2a4f83

08 Oct, 2022 1 commit

Process *.tar and *.tar.gz files. · 67f23568

Jonathan King authored Oct 08, 2022

Because download_mm_seqs_dbs.sh downloads and gunzips its target file (uniref30_2103.tar.gz), this script mistakenly does not process the .tar file. This fix expands the glob to match *.tar*.

67f23568
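
The effect of widening the pattern can be shown with Python's shell-style matcher. This is a sketch only: the affected script is not shown here, and the narrower pattern `*.tar.gz` used for comparison is an assumption about what the glob looked like before the fix. The helper name `select_archives` is hypothetical.

```python
import fnmatch


def select_archives(names: list[str], pattern: str) -> list[str]:
    """Return the file names that match a shell-style pattern, sorted."""
    return sorted(fnmatch.filter(names, pattern))


names = ["uniref30_2103.tar", "uniref30_2103.tar.gz", "README.txt"]

# Assumed earlier pattern: misses the already-gunzipped .tar file.
print(select_archives(names, "*.tar.gz"))
# prints ['uniref30_2103.tar.gz']

# Pattern named in the commit message: matches both forms.
print(select_archives(names, "*.tar*"))
# prints ['uniref30_2103.tar', 'uniref30_2103.tar.gz']
```

Note that `*.tar*` is deliberately loose, so it would also match other names containing `.tar`, such as `.tar.bz2` archives.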