[```openfold/utils/loss.py```](https://github.com/dingquanyu/openfold/blob/permutation/openfold/utils/loss.py): the forward function of the
original ```AlphaFoldLoss``` class is modified;
a child class called ```AlphaFoldMultimerLoss``` is created that inherits all the loss calculations and also
contains the multi-chain permutation code;
some loss calculations had to be modified, e.g. the ```fape``` and ```tm``` loss calculations now include an extra check of whether the input tensor is in tensor_7 or tensor_4*4 format, for example: https://github.com/dingquanyu/openfold/blob/02b008dc4b8c2e9e680826444c605297eeb9ffb4/openfold/utils/loss.py#L190-L193
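
The format check mentioned above can be sketched roughly as follows. This is only an illustration, not the code at the linked lines: ```rigid_format``` is a hypothetical helper, and it assumes that a tensor_7 encoding has a trailing dimension of 7 (quaternion plus translation) while a tensor_4*4 encoding has trailing dimensions (4, 4) (homogeneous transform).

```python
import torch


def rigid_format(t: torch.Tensor) -> str:
    """Hypothetical helper: guess how a rigid-transform tensor is encoded.

    Assumption: [..., 4, 4] is a homogeneous transform ("tensor_4x4") and
    [..., 7] is quaternion + translation ("tensor_7").
    """
    # Check the two trailing dimensions first for the 4x4 layout
    if t.dim() >= 2 and tuple(t.shape[-2:]) == (4, 4):
        return "tensor_4x4"
    # Otherwise a trailing dimension of 7 indicates the tensor_7 layout
    if t.dim() >= 1 and t.shape[-1] == 7:
        return "tensor_7"
    raise ValueError(f"Unrecognised rigid tensor shape: {tuple(t.shape)}")


# Example: a batch of 2 structures with 9 residues each
print(rigid_format(torch.zeros(2, 9, 7)))
print(rigid_format(torch.zeros(2, 9, 4, 4)))
```

A loss function could branch on the returned label to decide how to build the rigid transforms before computing ```fape``` or ```tm```.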
Unlike monomer training, chain_cache_data is not required, but train_mmcifs_cache is. For this test, I selected the 9 mmcifs that are already in the existing test_data folder as the training set. ```./tests/test_data/train_mmcifs_cache.json``` in the command above records the information about these 9 structures and is needed to run the training code.
[```openfold/config.py```](https://github.com/dingquanyu/openfold/blob/permutation/openfold/config.py) has a couple of modifications as well: some names were wrong, and the previous script did not update config.loss with multimer_model_config_update.
## Issues
Testing the code on a CPU works fine, but running it on a GPU raises ```RuntimeError: CUDA error: device-side assert triggered``` at unexpected steps.
For example, this error was raised while calculating the best rotation matrix that aligns the selected anchors during the multi-chain permutation steps. As a workaround, I have to use
```torch.masked_select``` and ```torch.index_select``` in https://github.com/dingquanyu/openfold/blob/a1ef4c8fa99da5cff9501051de71be440ca3cedf/openfold/utils/loss.py#L2043 and https://github.com/dingquanyu/openfold/blob/a1ef4c8fa99da5cff9501051de71be440ca3cedf/openfold/utils/loss.py#L2060 instead of simply indexing the matrix like ```matrix[index]```.
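
The workaround can be illustrated with a small sketch. It is not the code at the linked lines: ```select_rows``` and ```select_masked``` are hypothetical helpers, and the explicit bounds check is an added assumption. Device-side asserts are often caused by out-of-range indices and are reported asynchronously, which would fit the "unexpected steps" above, but that has not been confirmed as the cause here.

```python
import torch


def select_rows(matrix: torch.Tensor, index: torch.Tensor) -> torch.Tensor:
    """Hypothetical helper: pick rows with torch.index_select.

    For a 1-D, non-negative integer index this matches matrix[index].
    """
    index = index.to(device=matrix.device, dtype=torch.long)
    # Fail early with a readable error instead of a device-side assert.
    # Note: unlike matrix[index], negative indices are rejected here.
    if index.numel() > 0 and (index.min() < 0 or index.max() >= matrix.shape[0]):
        raise IndexError(
            f"index out of range for dimension of size {matrix.shape[0]}"
        )
    return torch.index_select(matrix, 0, index)


def select_masked(values: torch.Tensor, mask: torch.Tensor) -> torch.Tensor:
    """Hypothetical helper: torch.masked_select returns a flattened 1-D
    tensor holding the entries of `values` where `mask` is True."""
    mask = mask.to(device=values.device, dtype=torch.bool)
    return torch.masked_select(values, mask)


matrix = torch.arange(12.0).reshape(4, 3)
rows = select_rows(matrix, torch.tensor([2, 0]))
kept = select_masked(torch.arange(4.0), torch.tensor([True, False, True, False]))
```

When debugging on a GPU, setting the environment variable ```CUDA_LAUNCH_BLOCKING=1``` makes kernel launches synchronous, which can help to locate the line that actually triggers the assert.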
These files are newly added:
[```tests/test_permutation.py```](https://github.com/dingquanyu/openfold/blob/permutation/tests/test_permutation.py): A unittest script
that tests permutation functions.
Later on, the same ```CUDA error: device-side assert triggered``` error was raised while adding dimensions to ```atom_pred_positions``` in https://github.com/dingquanyu/openfold/blob/a1ef4c8fa99da5cff9501051de71be440ca3cedf/openfold/utils/loss.py#L989.
```label_1.pkl``` and [```tests/test_data/label_2.pkl```](https://github.com/dingquanyu/openfold/blob/permutation/tests/test_data/label_2.pkl) are 2 fake ground truth structures.
```label_1.pkl``` has 9 residues and ```label_2.pkl``` has 13 residues.
### Notes
29/06/23: NaN values in the lddt scores are filled with the matrix mean for now, because the test data are randomly generated and somehow produce NaN in the lddt score.
**Delete** this step before merging into the Multimer branch.
I dumped the matrices to a pickle file and loaded them individually onto a GPU outside the programme; there, the indexing steps worked without the CUDA error.
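
The temporary NaN handling could look roughly like this. It is a minimal sketch and not the code in the branch: ```fill_nan_with_mean``` is a hypothetical name, and it assumes the lddt scores are held in a floating-point tensor.

```python
import torch


def fill_nan_with_mean(scores: torch.Tensor) -> torch.Tensor:
    """Hypothetical helper: replace NaN entries with the mean of the
    non-NaN entries. If every entry is NaN, the result stays NaN."""
    nan_mask = torch.isnan(scores)
    # Mean over the valid entries only
    mean = scores[~nan_mask].mean()
    # Keep valid entries, substitute the mean where the score is NaN
    return torch.where(nan_mask, mean, scores)


scores = torch.tensor([[1.0, float("nan")], [3.0, 5.0]])
filled = fill_nan_with_mean(scores)
```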
>tr|W2SNV0|W2SNV0_NECAM Uncharacterized protein OS=Necator americanus OX=51031 GN=NECAME_00897 PE=4 SV=1
ELRAHIQKEIEHHEKQVENHKAILERHRKRVKEIEESQK--
>tr|A0A183FHQ9|A0A183FHQ9_HELBK Uncharacterized protein OS=Heligmosomoides polygyrus bakeri OX=375939 PE=4 SV=1
DLKAQIEKEIKHHEEQVESHQAVLERHRQRVKELEEAAN--
>tr|A0A0N4XTJ8|A0A0N4XTJ8_NIPBR ATPase inhibitor mai-2, mitochondrial (inferred by orthology to a C. elegans protein) OS=Nippostrongylus brasiliensis OX=27835
#=GS UniRef90_A0A1A6FZ83/69-104 DE [subseq from] ATP synthase F1 subunit epsilon n=1 Tax=Neotoma lepida TaxID=56216 RepID=A0A1A6FZ83_NEOLE
#=GS UniRef90_A0A498M405/1709-1745 DE [subseq from] Brain-specific angiogenesis inhibitor 1-like protein n=2 Tax=Cypriniformes TaxID=7952 RepID=A0A498M405_LABRO
#=GS UniRef90_A0A8C7CS59/60-96 DE [subseq from] ATP synthase F1 subunit epsilon n=2 Tax=Salmoninae TaxID=504568 RepID=A0A8C7CS59_ONCKI
#=GS UniRef90_A0A3Q1FBX7/71-107 DE [subseq from] ATP synthase F1 subunit epsilon n=1 Tax=Acanthochromis polyacanthus TaxID=80966 RepID=A0A3Q1FBX7_9TELE
#=GS UniRef90_A0A8C5C6I4/71-107 DE [subseq from] ATP synthase F1 subunit epsilon n=1 Tax=Gadus morhua TaxID=8049 RepID=A0A8C5C6I4_GADMO