# BOMLIP-CSP

An open-source Python framework that integrates machine learning interatomic 
potentials (MLIPs) with a tailored batched optimization strategy, enabling rapid, 
unbiased structure prediction across the full density range.


## Install BOMLIP-CSP
```sh
git clone https://github.com/pic-ai-robotic-chemistry/BOMLIP-CSP.git --recursive
conda create -n BOMLIP_CSP python=3.10 -y
conda activate BOMLIP_CSP
cd BOMLIP-CSP/batchASE
# Run chmod -R 755 on your BOMLIP-CSP folder if you do not have execute permission
./reproduce/init_mace.sh
source util/env.sh
cd ..
```

## Perform a complete CSP process

If you have administrator privileges, start the exclusive mode to further accelerate the process.
```sh
sudo ./util/mps_start.sh
```

Run the main program of the CSP process.

**Batched geometry optimization requires a GPU.** Before running this command, confirm the GPU and worker settings in the shell script (--gpu_offset, --n_gpus, and --num_workers).
```sh
./csp.sh
```
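
Before running ./csp.sh, the GPU and worker flags named above can be listed with a small helper. This is a minimal sketch, not part of BOMLIP-CSP: the function name find_flag_values is hypothetical, and it assumes the flags appear in csp.sh as "--flag value" (or "--flag=value") pairs and that the script is run from the repository root.

```python
import re
from pathlib import Path

# Flags that this README asks you to confirm before running csp.sh.
FLAGS = ("gpu_offset", "n_gpus", "num_workers")


def find_flag_values(script_text, flags=FLAGS):
    """Return {flag: [values, ...]} for every `--flag value` pair in the text.

    Hypothetical helper for illustration; it only scans text and does not
    interpret what the flags mean.
    """
    found = {}
    for flag in flags:
        # Match `--flag value` or `--flag=value`; the value is one token
        # without whitespace (shell variables are returned unexpanded).
        pattern = re.compile(r"--" + re.escape(flag) + r"(?:\s+|=)(\S+)")
        found[flag] = pattern.findall(script_text)
    return found


if __name__ == "__main__":
    script = Path("csp.sh")  # assumption: current directory is the repo root
    if script.is_file():
        for flag, values in find_flag_values(script.read_text()).items():
            print(f"--{flag}: {', '.join(values) if values else 'not found'}")
    else:
        print("csp.sh not found in the current directory")
```

A flag can appear more than once in the script (for example in separate optimization commands), which is why each flag maps to a list of values.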

End the exclusive mode after the run finishes.
```sh
sudo ./util/mps_clean.sh
```
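
Because the exclusive mode should be ended even when the run fails or is interrupted, the three steps in this section can be wrapped so that the cleanup always runs. This is a sketch under assumptions, not part of BOMLIP-CSP: run_exclusive is a hypothetical helper, and it assumes it is called from the repository root in a session where util/env.sh has already been sourced (child processes inherit that environment).

```python
import subprocess


def run_exclusive(start_cmd, main_cmd, stop_cmd, runner=subprocess.run):
    """Run start_cmd, then main_cmd, and always run stop_cmd afterwards.

    Returns the exit code of main_cmd. `runner` defaults to subprocess.run
    and can be replaced by a stand-in for testing.
    """
    # If starting the exclusive mode fails, raise before launching anything else.
    runner(start_cmd, check=True)
    try:
        result = runner(main_cmd, check=False)
        return result.returncode
    finally:
        # Runs whether main_cmd succeeded, failed, or was interrupted.
        runner(stop_cmd, check=False)


# Example call, using the commands from this README (requires sudo):
# run_exclusive(
#     ["sudo", "./util/mps_start.sh"],
#     ["./csp.sh"],
#     ["sudo", "./util/mps_clean.sh"],
# )
```

The try/finally block is the point of the sketch: an interrupted or failed csp.sh run would otherwise leave the exclusive mode active.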

## Perform conformer search / structure generation / structure optimization separately
### Conformer search
In csp.sh, the --mode argument controls which jobs are run.

Use conformer_only to perform the conformer search only.
```sh
python "${TOP_DIR}/main.py" --path "${TAR_DIR}" --smiles "C1CC2=COC=C12" \
     --num_generation 100 --generate_conformers 10 --mode conformer_only > generate_conformer.log 2>&1
```
### Structure generation
Use structure_only to perform structure generation only.
In this mode, conformers (generated by this program or supplied by you as .xyz files from other methods) must be placed in the folder ${TAR_DIR}/molecule_${i}/conformers as conformer_${j}.xyz files, where i starts from 1 and j starts from 0.
```sh
python "${TOP_DIR}/main.py" --path "${TAR_DIR}" --molecule_num_in_cell 1 \
     --space_group_list 14,61 --add_name XULDUD --max_workers 16 --num_generation 100 \
     --use_conformers 4 --mode structure_only > generate_structure.log 2>&1
```
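
If the conformers come from another tool, they have to be copied into the folder layout described above before running structure_only. The following is a minimal sketch for a single molecule using plain file copies. The helper name stage_conformers is hypothetical and not part of BOMLIP-CSP, and the sketch does not validate the contents of the .xyz files.

```python
import shutil
from pathlib import Path


def stage_conformers(xyz_files, tar_dir, molecule_index=1):
    """Copy .xyz files to <tar_dir>/molecule_<i>/conformers/conformer_<j>.xyz.

    Molecule indices start from 1 and conformer indices from 0, as stated in
    this README. Returns the destination paths in input order.
    """
    if molecule_index < 1:
        raise ValueError("molecule_index starts from 1")
    dest_dir = Path(tar_dir) / f"molecule_{molecule_index}" / "conformers"
    dest_dir.mkdir(parents=True, exist_ok=True)
    staged = []
    for j, src in enumerate(xyz_files):  # j starts from 0
        dest = dest_dir / f"conformer_{j}.xyz"
        shutil.copyfile(src, dest)  # overwrites an existing conformer_<j>.xyz
        staged.append(dest)
    return staged


# Example call with placeholder file names:
# stage_conformers(["confA.xyz", "confB.xyz"], "path/to/TAR_DIR")
```

The order of the input list decides the conformer numbering, so sort it beforehand if a specific order matters.
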
Conformer search and structure generation can also be done from a Python script, which offers more flexibility (e.g. higher Z', higher-order co-crystals, or control over the number of trial structures for each space group); see structure_generate.py.

### Structure optimization
Structure optimization is done by a separate command:
```sh
python "${TOP_DIR}/batchASE/scripts/opt_batch.py" ...
```

Explanations for all arguments are provided in main.py and batchASE/scripts/opt_batch.py.


### Configure the 7net environment (optional)

```sh
#!/bin/bash
conda create -n 7net-cueq python=3.10 -y && conda activate 7net-cueq
cd BOMLIP-CSP/batchASE
./reproduce/init_7net.sh && source util/env.sh
```

The optimization command for 7net is given in csp.sh:
```sh
# Use a fixed batch size for structural optimization
python "${TOP_DIR}/batchASE/scripts/opt_batch.py" --target_folder "${TAR_DIR}/structures" \
    --molecule_single 13 --gpu_offset 0 --n_gpus 8 --num_workers 48 --batch_size 2 \
    --max_steps 3000 --filter1 UnitCellFilter --filter2 UnitCellFilter \
    --optimizer1 BFGSFusedLS --optimizer2 BFGS --num_threads 2 --cueq true \
    --use_ordered_files true --model sevennet > opt.log 2>&1
```

## Method
See the BOMLIP-CSP paper for more details.

## License

This project is licensed under the MIT License - see the [LICENSE](LICENSE) file for details.

### Third-party Dependencies

This project includes dependencies with various licenses:
- **MACE**: MIT License (compatible)
- **FairChem**: MIT License (compatible)
- **SevenNet**: GPL v3 License (Note: GPL is a copyleft license)

### License Compatibility Notice

**Important**: SevenNet is an optional dependency licensed under GPL v3, and this project runs completely without it.
If you use SevenNet functionality, be aware of the GPL licensing requirements.
For commercial use or to avoid GPL restrictions, consider using only the MACE calculator
functionality.

## Citation

If you use this code in your research, please cite:

```bibtex
@software{BOMLIP_CSP,
  author = {Chengxi Zhao and Zhaojia Ma and Dingrui Fan},
  title = {BOMLIP_CSP: Integrating machine learning interatomic potentials with batched optimization for crystal structure prediction},
  year = {2025},
  url = {https://github.com/pic-ai-robotic-chemistry/BOMLIP-CSP}
}
```