# BOMLIP-CSP

An open-source Python framework that integrates machine learning interatomic
potentials (MLIPs) with a tailored batched optimization strategy, enabling rapid,
unbiased crystal structure prediction (CSP) across the full density range.


## Install BOMLIP-CSP
```sh
git clone https://github.com/pic-ai-robotic-chemistry/BOMLIP-CSP.git --recursive
conda create -n BOMLIP_CSP python=3.10 -y
conda activate BOMLIP_CSP
cd BOMLIP-CSP/mace-bench
./reproduce/init_mace.sh
source util/env.sh
cd ..
```

## Perform a complete CSP process

If you have administrator privileges, start the exclusive mode first to further accelerate the process.
```sh
sudo ./util/mps_start.sh
```

Run the main program of the CSP process.

**Batched geometry optimization requires a GPU.** Before running this command, confirm the GPU and worker settings in the shell script, namely --gpu_offset, --n_gpus, and --num_workers.
```sh
./csp.sh
```
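
One way to review these settings is to print every value the script passes for the three flags. The sketch below is a small helper that is not part of BOMLIP-CSP: the name find_gpu_settings is hypothetical, and it assumes the flags are written in the "--flag value" (or "--flag=value") form used by the commands in this README. It also matches commented-out lines, so treat the output as a prompt for manual inspection rather than a validation.

```python
import re

# Flags that this README asks you to confirm before running csp.sh.
FLAGS = ("gpu_offset", "n_gpus", "num_workers")


def find_gpu_settings(script_text):
    """Map each flag name to the list of values found for it, in order."""
    settings = {}
    for flag in FLAGS:
        # Assumption: flags are written as "--flag value" or "--flag=value".
        settings[flag] = re.findall(rf"--{flag}[ =]+(\S+)", script_text)
    return settings


# Excerpt of the 7net optimization command quoted later in this README.
sample = "--molecule_single 13 --gpu_offset 0 --n_gpus 8 --num_workers 48 --batch_size 2"
for flag, values in find_gpu_settings(sample).items():
    print(f"--{flag}: {', '.join(values) if values else 'not found'}")
```

To inspect the real script, pass the contents of csp.sh (for example, read with pathlib.Path("csp.sh").read_text()) instead of the sample string.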

End the exclusive mode after running.
```sh
sudo ./util/mps_clean.sh
```
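
If csp.sh fails or is interrupted, the exclusive mode stays active until mps_clean.sh is run. The following sketch shows one possible way to chain the three commands from this section so that the clean-up step always runs. It is not shipped with BOMLIP-CSP; the function names and the runner parameter are hypothetical, and it assumes the commands are launched from the repository root.

```python
import subprocess

# The three commands from this section, to be run from the repository root.
MPS_START = ["sudo", "./util/mps_start.sh"]
CSP_MAIN = ["./csp.sh"]
MPS_CLEAN = ["sudo", "./util/mps_clean.sh"]


def run_checked(cmd):
    """Run a command and raise CalledProcessError if it exits non-zero."""
    subprocess.run(cmd, check=True)


def run_csp_with_cleanup(runner=run_checked):
    """Start exclusive mode, run csp.sh, then always end exclusive mode."""
    runner(MPS_START)
    try:
        runner(CSP_MAIN)
    finally:
        # Executed whether csp.sh succeeds, fails, or is interrupted with Ctrl-C.
        runner(MPS_CLEAN)
```

Calling run_csp_with_cleanup() performs the full sequence. The runner argument exists only so that the sequence can be dry-run with a stand-in function; if mps_start.sh itself fails, the clean-up step is skipped because the exclusive mode was never started.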

## Perform conformer search / structure generation / structure optimization separately
### Conformer search
In csp.sh, the --mode argument controls which tasks are run.

Use conformer_only to perform only the conformer search.
```sh
python "${TOP_DIR}/main.py" --path ${TAR_DIR} --smiles "C1CC2=COC=C12" \
     --num_generation 100 --generate_conformers 10 --mode conformer_only > generate_conformer.log 2>&1
```
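
When the same call is launched from Python, building the command as an argument list avoids shell quoting problems, since SMILES strings often contain characters such as parentheses or #. The sketch below is an illustration only: build_main_command is a hypothetical helper, the paths are placeholders, and it assumes TOP_DIR is the repository root containing main.py. Only flags shown in this README are used.

```python
import shlex
from pathlib import Path


def build_main_command(top_dir, tar_dir, mode, **options):
    """Return an argument list for main.py; options become "--name value" pairs."""
    cmd = ["python", str(Path(top_dir) / "main.py"), "--path", str(tar_dir)]
    for name, value in options.items():
        cmd += [f"--{name}", str(value)]
    return cmd + ["--mode", mode]


# Same flags and values as the conformer_only command above; paths are placeholders.
cmd = build_main_command(
    "/path/to/BOMLIP-CSP",
    "/path/to/target",
    "conformer_only",
    smiles="C1CC2=COC=C12",
    num_generation=100,
    generate_conformers=10,
)
print(shlex.join(cmd))  # shell-quoted form, for logging
```

The list can then be passed to subprocess.run(cmd, check=True, stdout=log_file, stderr=subprocess.STDOUT), where log_file is an open file, to mirror the "> generate_conformer.log 2>&1" redirection.
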
### Structure generation
Alternatively, use structure_only to perform only structure generation.
In this mode, conformers (either generated by this program or supplied by you as .xyz files from other methods) must be placed in the folder ${TAR_DIR}/molecule_${i}/conformers as conformer_${j}.xyz files, where i starts from 1 and j starts from 0.
```sh
python "${TOP_DIR}/main.py" --path ${TAR_DIR} --molecule_num_in_cell 1 \
     --space_group_list 14,61 --add_name XULDUD --max_workers 16 --num_generation 100 \
     --use_conformers 4 --mode structure_only > generate_structure.log 2>&1
```
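
Before running structure_only with your own conformers, it can help to verify that the files are where the program expects them. The sketch below encodes the layout described above (molecule folders numbered from 1, conformer files numbered from 0) and lists any expected file that is absent. The function names are hypothetical, the path and counts are placeholders, and it assumes every molecule has the same number of conformers.

```python
from pathlib import Path


def conformer_path(tar_dir, molecule_index, conformer_index):
    """Path of one conformer file: molecules count from 1, conformers from 0."""
    if molecule_index < 1 or conformer_index < 0:
        raise ValueError("molecule_index starts at 1 and conformer_index at 0")
    return (
        Path(tar_dir)
        / f"molecule_{molecule_index}"
        / "conformers"
        / f"conformer_{conformer_index}.xyz"
    )


def missing_conformers(tar_dir, n_molecules, n_conformers):
    """List the expected conformer files that do not exist under tar_dir."""
    missing = []
    for i in range(1, n_molecules + 1):
        for j in range(n_conformers):
            path = conformer_path(tar_dir, i, j)
            if not path.is_file():
                missing.append(path)
    return missing


# Example: one molecule with four conformers in a placeholder target folder.
for path in missing_conformers("/path/to/target", 1, 4):
    print("missing:", path)
```

An empty result means every expected file was found; set the two counts to match your own run.
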
Conformer search and structure generation can also be run from a Python script, which offers more flexibility (e.g. higher Z', higher-order co-crystals, or control over the number of trial structures for each space group); see structure_generate.py.

### Structure optimization
Structure optimization is done by a separate command:
```sh
python "${TOP_DIR}/mace-bench/scripts/opt_batch.py" ...
```

Explanations for all arguments are provided in main.py and mace-bench/scripts/opt_batch.py.


### Configure the 7net environment (optional)

```sh
#!/bin/bash
conda create -n 7net-cueq python=3.10 -y && conda activate 7net-cueq
cd BOMLIP-CSP/mace-bench
./reproduce/init_7net.sh && source util/env.sh
```

The optimization command for 7net is given in csp.sh:
```sh
# Use a fixed batch size for structural optimization
python "${TOP_DIR}/mace-bench/scripts/opt_batch.py" --target_folder "${TAR_DIR}/structures" \
    --molecule_single 13 --gpu_offset 0 --n_gpus 8 --num_workers 48 --batch_size 2 \
    --max_steps 3000 --filter1 UnitCellFilter --filter2 UnitCellFilter \
    --optimizer1 BFGSFusedLS --optimizer2 BFGS --num_threads 2 --cueq true \
    --use_ordered_files true --model sevennet > opt.log 2>&1
```

## License

This project is licensed under the MIT License - see the [LICENSE](LICENSE) file for details.

### Third-party Dependencies

This project includes dependencies with various licenses:
- **MACE**: MIT License (compatible)
- **FairChem**: MIT License (compatible)
- **SevenNet**: GPL v3 License (Note: GPL is a copyleft license)

### License Compatibility Notice

**Important**: This project can run completely without relying on SevenNet. 
This project includes SevenNet as an optional dependency, which is licensed under GPL v3.
If you use SevenNet functionality, you should be aware of the GPL licensing requirements.
For commercial use or to avoid GPL restrictions, consider using only the MACE calculator 
functionality.

## Citation

If you use this code in your research, please cite:

```bibtex
@software{BOMLIP_CSP,
  author = {Chengxi Zhao and Zhaojia Ma and Dingrui Fan},
  title = {BOMLIP_CSP: Integrating machine learning interatomic potentials with batched optimization for crystal structure prediction},
  year = {2025},
  url = {https://github.com/pic-ai-robotic-chemistry/BOMLIP-CSP}
}
```