Commit bdd87fae authored by zhangwenbo

Initial commit: FourCastNet source code only

data.tar.gz
data/
*.tar.gz
*.zip
The code was authored by the following people:
Jaideep Pathak - NVIDIA Corporation
Shashank Subramanian - NERSC, Lawrence Berkeley National Laboratory
Peter Harrington - NERSC, Lawrence Berkeley National Laboratory
Sanjeev Raja - NERSC, Lawrence Berkeley National Laboratory
Ashesh Chattopadhyay - Rice University
Morteza Mardani - NVIDIA Corporation
Thorsten Kurth - NVIDIA Corporation
David Hall - NVIDIA Corporation
Zongyi Li - California Institute of Technology, NVIDIA Corporation
Kamyar Azizzadenesheli - Purdue University
Pedram Hassanzadeh - Rice University
Karthik Kashinath - NVIDIA Corporation
Animashree Anandkumar - California Institute of Technology, NVIDIA Corporation
#BSD 3-Clause License
#
#Copyright (c) 2022, FourCastNet authors
#All rights reserved.
#
#Redistribution and use in source and binary forms, with or without
#modification, are permitted provided that the following conditions are met:
#
#1. Redistributions of source code must retain the above copyright notice, this
# list of conditions and the following disclaimer.
#
#2. Redistributions in binary form must reproduce the above copyright notice,
# this list of conditions and the following disclaimer in the documentation
# and/or other materials provided with the distribution.
#
#3. Neither the name of the copyright holder nor the names of its
# contributors may be used to endorse or promote products derived from
# this software without specific prior written permission.
#
#THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
#AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
#IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
#DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE
#FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
#DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR
#SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER
#CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY,
#OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
#OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
#
#The code was authored by the following people:
#
#Jaideep Pathak - NVIDIA Corporation
#Shashank Subramanian - NERSC, Lawrence Berkeley National Laboratory
#Peter Harrington - NERSC, Lawrence Berkeley National Laboratory
#Sanjeev Raja - NERSC, Lawrence Berkeley National Laboratory
#Ashesh Chattopadhyay - Rice University
#Morteza Mardani - NVIDIA Corporation
#Thorsten Kurth - NVIDIA Corporation
#David Hall - NVIDIA Corporation
#Zongyi Li - California Institute of Technology, NVIDIA Corporation
#Kamyar Azizzadenesheli - Purdue University
#Pedram Hassanzadeh - Rice University
#Karthik Kashinath - NVIDIA Corporation
#Animashree Anandkumar - California Institute of Technology, NVIDIA Corporation
# Fourcastnet_train
## Project Overview
---
## Environment Setup
### 1. Pull the image
```bash
docker pull harbor.sourcefind.cn:5443/dcu/admin/base/pytorch:2.5.1-ubuntu22.04-dtk25.04.4-1230-py3.10-20260115
```
### 2. Create the container
```bash
docker run -it \
--network=host \
--hostname=localhost \
--name=hunyuan \
-v /opt/hyhal:/opt/hyhal:ro \
-v $PWD:/workspace \
--ipc=host \
--device=/dev/kfd \
--device=/dev/mkfd \
--device=/dev/dri \
--shm-size=512G \
--privileged \
--group-add video \
--cap-add=SYS_PTRACE \
-u root \
--security-opt seccomp=unconfined \
harbor.sourcefind.cn:5443/dcu/admin/base/pytorch:2.5.1-ubuntu22.04-dtk25.04.4-1230-py3.10-20260115 \
/bin/bash
```
---
## Test Steps
### 1. Clone the code
```bash
git clone http://developer.sourcefind.cn/codes/bw-bestperf/hunyuanvideo-i2v.git
cd hunyuanvideo-i2v/
```
### 2. Install dependencies
```bash
pip install -r requirements.txt -i https://pypi.tuna.tsinghua.edu.cn/simple
pip install yunchang==0.6.0 xfuser==0.4.2
bash fix.sh # patch for xfuser compatibility
```
### 3. Download the models
Install ModelScope:
```bash
pip install modelscope
```
Download the required models:
```bash
mkdir ckpts
modelscope download --model Tencent-Hunyuan/HunyuanVideo --local_dir ./ckpts
modelscope download --model AI-ModelScope/HunyuanVideo-I2V --local_dir ./ckpts
modelscope download --model AI-ModelScope/clip-vit-large-patch14 --local_dir ckpts/text_encoder_2
modelscope download --model AI-ModelScope/llava-llama-3-8b-v1_1-transformers --local_dir ckpts/text_encoder_i2v
```
Run the fix script:
```bash
bash modified/fix.sh
```
Set the environment variable that disables the HIP caching allocator, to avoid OOM:
```bash
export PYTORCH_NO_HIP_MEMORY_CACHING=1
```
---
## Test Example (Multi-GPU Test on 4 Cards)
Export the device environment variables and disable the caching allocator:
```bash
export HIP_VISIBLE_DEVICES=4,5,6,7
export PYTORCH_NO_HIP_MEMORY_CACHING=1
```
Run image-to-video multi-GPU inference:
```bash
ALLOW_RESIZE_FOR_SP=1 torchrun --nproc_per_node=4 \
sample_image2video.py \
--model HYVideo-T/2 \
--prompt "An Asian man with short hair in black tactical uniform and white clothes waves a firework stick." \
--i2v-mode \
--i2v-image-path ./assets/demo/i2v/imgs/0.jpg \
--i2v-resolution 720p \
--i2v-stability \
--infer-steps 50 \
--video-length 129 \
--flow-reverse \
--flow-shift 17.0 \
--seed 42 \
--embedded-cfg-scale 6.0 \
--save-path ./results \
--ulysses-degree 4 \
--ring-degree 1 \
--num-videos 1 2>&1 | tee z_logs/bw_image2video_4ka.log
```
---
## Configuration Options
| Parameter | Description | Default / Example |
| --- | --- | --- |
| `--model` | Name of the model to use | `HYVideo-T/2` |
| `--prompt` | Text prompt used to generate the video | `"An Asian man with short hair..."` |
| `--i2v-mode` | Enable image-to-video mode | |
| `--i2v-image-path` | Path to the input image | `./assets/demo/i2v/imgs/0.jpg` |
| `--i2v-resolution` | Resolution of the output video | `720p` |
| `--i2v-stability` | Enable the stability-enhancement option | |
| `--infer-steps` | Number of inference steps (trades quality against speed) | `50` |
| `--video-length` | Length of the generated video in frames | `129` |
| `--flow-reverse` | Reverse the flow sampling direction | |
| `--flow-shift` | Flow-shift factor of the sampler | `17.0` |
| `--seed` | Random seed for reproducible results | `42` |
| `--embedded-cfg-scale` | Embedded classifier-free guidance scale | `6.0` |
| `--save-path` | Directory where generated results are saved | `./results` |
| `--ulysses-degree` | Ulysses sequence-parallel degree; `ulysses-degree × ring-degree` should equal the number of GPUs | `4` |
| `--ring-degree` | Ring-attention sequence-parallel degree | `1` |
| `--num-videos` | Number of videos to generate | `1` |
---
## Contributing
Contributions to the hunyuan-I2V project are welcome! Please follow these steps:
1. Fork this repository and create a new branch for feature development or bug fixes.
2. Write well-formed commit messages with clear descriptions.
3. Open a Pull Request that briefly explains what was changed and why.
4. Follow the project's coding conventions and testing standards.
5. Take part in code review and discuss proposed improvements actively.
---
## License
This project is released under the MIT License; see the [LICENSE](./LICENSE) file for details.
---
Thank you for your interest and support! If you have any questions, feel free to open an Issue or contact the maintainers.
### base config ###
full_field: &FULL_FIELD
loss: 'l2'
lr: 1E-3
scheduler: 'ReduceLROnPlateau'
num_data_workers: 2
dt: 1 # how many timesteps ahead the model will predict
n_history: 0 #how many previous timesteps to consider
prediction_type: 'iterative'
prediction_length: 2 #applicable only if prediction_type == 'iterative'
n_initial_conditions: 5 #applicable only if prediction_type == 'iterative'
ics_type: "default"
save_raw_forecasts: !!bool True
save_channel: !!bool False
masked_acc: !!bool False
maskpath: None
perturb: !!bool False
add_grid: !!bool False
N_grid_channels: 0
gridtype: 'sinusoidal' #options 'sinusoidal' or 'linear'
roll: !!bool False
max_epochs: 50
batch_size: 64
#afno hyperparams
num_blocks: 8
nettype: 'afno'
patch_size: 8
width: 56
modes: 32
#options default, residual
target: 'default'
in_channels: [0,1]
out_channels: [0,1] #must be same as in_channels if prediction_type == 'iterative'
normalization: 'zscore' #options zscore (minmax not supported)
train_data_path: '/pscratch/sd/j/jpathak/wind/train'
valid_data_path: '/pscratch/sd/j/jpathak/wind/test'
inf_data_path: '/pscratch/sd/j/jpathak/wind/out_of_sample' # test set path for inference
exp_dir: '/pscratch/sd/j/jpathak/ERA5_expts_gtc/wind'
time_means_path: '/pscratch/sd/j/jpathak/wind/time_means.npy'
global_means_path: '/pscratch/sd/j/jpathak/wind/global_means.npy'
global_stds_path: '/pscratch/sd/j/jpathak/wind/global_stds.npy'
orography: !!bool False
orography_path: None
log_to_screen: !!bool True
log_to_wandb: !!bool False
save_checkpoint: !!bool True
enable_nhwc: !!bool False
optimizer_type: 'FusedAdam'
crop_size_x: None
crop_size_y: None
two_step_training: !!bool False
plot_animations: !!bool False
add_noise: !!bool False
noise_std: 0
afno_backbone: &backbone
<<: *FULL_FIELD
log_to_wandb: !!bool False
lr: 5E-4
batch_size: 8
max_epochs: 50
scheduler: 'CosineAnnealingLR'
in_channels: [0, 1 ,2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19]
out_channels: [0, 1 ,2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19]
orography: !!bool False
orography_path: None
exp_dir: './exp'
train_data_path: './data/train'
valid_data_path: './data/valid'
inf_data_path: './data/valid'
time_means_path: './data/time_means.npy'
global_means_path: './data/global_means.npy'
global_stds_path: './data/global_stds.npy'
afno_backbone_orography: &backbone_orography
<<: *backbone
orography: !!bool True
orography_path: '/pscratch/sd/s/shas1693/data/era5/static/orography.h5'
afno_backbone_finetune:
<<: *backbone
lr: 1E-4
batch_size: 64
log_to_wandb: !!bool False
max_epochs: 50
pretrained: !!bool True
two_step_training: !!bool True
pretrained_ckpt_path: '/pscratch/sd/s/shas1693/results/era5_wind/afno_backbone/0/training_checkpoints/best_ckpt.tar'
perturbations:
<<: *backbone
lr: 1E-4
batch_size: 64
max_epochs: 50
pretrained: !!bool True
two_step_training: !!bool True
pretrained_ckpt_path: '/pscratch/sd/j/jpathak/ERA5_expts_gtc/wind/afno_20ch_bs_64_lr5em4_blk_8_patch_8_cosine_sched/1/training_checkpoints/best_ckpt.tar'
prediction_length: 24
ics_type: "datetime"
n_perturbations: 100
save_channel: !!bool True
save_idx: 4
save_raw_forecasts: !!bool False
date_strings: ["2018-01-01 00:00:00"]
inference_file_tag: " "
valid_data_path: "/pscratch/sd/j/jpathak/ "
perturb: !!bool True
n_level: 0.3
### PRECIP ###
precip: &precip
<<: *backbone
in_channels: [0, 1 ,2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19]
out_channels: [0]
nettype: 'afno'
nettype_wind: 'afno'
log_to_wandb: !!bool False
lr: 2.5E-4
batch_size: 64
max_epochs: 25
precip: '/pscratch/sd/p/pharring/ERA5/precip/total_precipitation'
time_means_path_tp: '/pscratch/sd/p/pharring/ERA5/precip/total_precipitation/time_means.npy'
model_wind_path: '/pscratch/sd/s/shas1693/results/era5_wind/afno_backbone_finetune/0/training_checkpoints/best_ckpt.tar'
precip_eps: !!float 1e-5
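The config above relies on YAML anchors (`&FULL_FIELD`, `&backbone`) and merge keys (`<<:`), so each experiment block inherits the base settings and only overrides what differs. Below is a minimal sketch of how those keys resolve, assuming the file is saved as `config/AFNO.yaml` (adjust the path to wherever this config actually lives) and read with `ruamel.yaml`, which the Dockerfile further down installs; the repository's own config loader may wrap this differently.

```python
# Minimal sketch: resolve the YAML anchors/merge keys with ruamel.yaml.
# Assumption: the config above is saved as config/AFNO.yaml.
from ruamel.yaml import YAML

yaml = YAML(typ="safe")
with open("config/AFNO.yaml") as f:
    cfg = yaml.load(f)

params = cfg["afno_backbone"]                      # full_field merged in via '<<', then overridden
print(params["loss"], params["patch_size"])        # inherited from full_field
print(params["scheduler"], params["batch_size"])   # overridden by afno_backbone
```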
### base config ###
full_field: &FULL_FIELD
loss: 'l2'
lr: 1E-3
scheduler: 'ReduceLROnPlateau'
num_data_workers: 4
dt: 1 # how many timesteps ahead the model will predict
n_history: 0 #how many previous timesteps to consider
prediction_type: 'iterative'
prediction_length: 41 #applicable only if prediction_type == 'iterative'
n_initial_conditions: 5 #applicable only if prediction_type == 'iterative'
ics_type: "default"
save_raw_forecasts: !!bool True
save_channel: !!bool False
masked_acc: !!bool False
maskpath: None
perturb: !!bool False
add_grid: !!bool False
N_grid_channels: 0
gridtype: 'sinusoidal' #options 'sinusoidal' or 'linear'
roll: !!bool False
max_epochs: 50
batch_size: 64
#afno hyperparams
num_blocks: 8
nettype: 'afno'
patch_size: 8
width: 56
modes: 32
#options default, residual
target: 'default'
in_channels: [0,1]
out_channels: [0,1] #must be same as in_channels if prediction_type == 'iterative'
normalization: 'zscore' #options zscore (minmax not supported)
train_data_path: '/pscratch/sd/j/jpathak/wind/train'
valid_data_path: '/pscratch/sd/j/jpathak/wind/test'
inf_data_path: '/pscratch/sd/j/jpathak/wind/out_of_sample' # test set path for inference
exp_dir: '/pscratch/sd/j/jpathak/ERA5_expts_gtc/wind'
time_means_path: '/pscratch/sd/j/jpathak/wind/time_means.npy'
global_means_path: '/pscratch/sd/j/jpathak/wind/global_means.npy'
global_stds_path: '/pscratch/sd/j/jpathak/wind/global_stds.npy'
orography: !!bool False
orography_path: None
log_to_screen: !!bool True
log_to_wandb: !!bool True
save_checkpoint: !!bool True
enable_nhwc: !!bool False
optimizer_type: 'FusedAdam'
crop_size_x: None
crop_size_y: None
two_step_training: !!bool False
plot_animations: !!bool False
add_noise: !!bool False
noise_std: 0
afno_backbone: &backbone
<<: *FULL_FIELD
log_to_wandb: !!bool True
lr: 5E-4
batch_size: 64
max_epochs: 150
scheduler: 'CosineAnnealingLR'
in_channels: [0, 1 ,2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19]
out_channels: [0, 1 ,2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19]
orography: !!bool False
orography_path: None
exp_dir: '/pscratch/sd/s/shas1693/results/era5_wind'
# train_data_path: '/pscratch/sd/s/shas1693/data/era5/train'
# valid_data_path: '/pscratch/sd/s/shas1693/data/era5/test'
# inf_data_path: '/pscratch/sd/s/shas1693/data/era5/out_of_sample'
# time_means_path: '/pscratch/sd/s/shas1693/data/era5/time_means.npy'
# global_means_path: '/pscratch/sd/s/shas1693/data/era5/global_means.npy'
# global_stds_path: '/pscratch/sd/s/shas1693/data/era5/global_stds.npy'
# ==== Data paths (important) ====
train_data_path: '/workspace/FourCastNet/data/train'
valid_data_path: '/workspace/FourCastNet/data/train' # reusing the training set works for a first run
inf_data_path: '/workspace/FourCastNet/data/train'
# ==== Statistics (if you don't have these yet, point here for now) ====
time_means_path: '/workspace/FourCastNet/data/time_means.npy'
global_means_path: '/workspace/FourCastNet/data/global_means.npy'
global_stds_path: '/workspace/FourCastNet/data/global_stds.npy'
afno_backbone_orography: &backbone_orography
<<: *backbone
orography: !!bool True
orography_path: '/pscratch/sd/s/shas1693/data/era5/static/orography.h5'
afno_backbone_finetune:
<<: *backbone
lr: 1E-4
batch_size: 64
log_to_wandb: !!bool True
max_epochs: 50
pretrained: !!bool True
two_step_training: !!bool True
pretrained_ckpt_path: '/pscratch/sd/s/shas1693/results/era5_wind/afno_backbone/0/training_checkpoints/best_ckpt.tar'
perturbations:
<<: *backbone
lr: 1E-4
batch_size: 64
max_epochs: 50
pretrained: !!bool True
two_step_training: !!bool True
pretrained_ckpt_path: '/pscratch/sd/j/jpathak/ERA5_expts_gtc/wind/afno_20ch_bs_64_lr5em4_blk_8_patch_8_cosine_sched/1/training_checkpoints/best_ckpt.tar'
prediction_length: 24
ics_type: "datetime"
n_perturbations: 100
save_channel: !!bool True
save_idx: 4
save_raw_forecasts: !!bool False
date_strings: ["2018-01-01 00:00:00"]
inference_file_tag: " "
valid_data_path: "/pscratch/sd/j/jpathak/ "
perturb: !!bool True
n_level: 0.3
### PRECIP ###
precip: &precip
<<: *backbone
in_channels: [0, 1 ,2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19]
out_channels: [0]
nettype: 'afno'
nettype_wind: 'afno'
log_to_wandb: !!bool True
lr: 2.5E-4
batch_size: 64
max_epochs: 25
precip: '/pscratch/sd/p/pharring/ERA5/precip/total_precipitation'
time_means_path_tp: '/pscratch/sd/p/pharring/ERA5/precip/total_precipitation/time_means.npy'
model_wind_path: '/pscratch/sd/s/shas1693/results/era5_wind/afno_backbone_finetune/0/training_checkpoints/best_ckpt.tar'
precip_eps: !!float 1e-5
import cdsapi
c = cdsapi.Client()
c.retrieve(
'reanalysis-era5-pressure-levels',
{
'product_type': 'reanalysis',
'format': 'netcdf',
'variable': [
'geopotential', 'relative_humidity', 'temperature',
'u_component_of_wind', 'v_component_of_wind',
],
'pressure_level': [
'50', '500', '850',
'1000',
],
'year': '2021',
'month': '10',
'day': [
'19', '20', '21',
'22', '23', '24',
'25', '26', '27',
'28', '29', '30',
'31',
],
'time': [
'00:00', '06:00', '12:00',
'18:00',
],
},
'/project/projectdirs/dasrepo/ERA5/oct_2021_19_31_pl.nc')
import cdsapi
c = cdsapi.Client()
c.retrieve(
'reanalysis-era5-single-levels',
{
'product_type': 'reanalysis',
'format': 'netcdf',
'variable': [
'10m_u_component_of_wind', '10m_v_component_of_wind', '2m_temperature',
'mean_sea_level_pressure', 'surface_pressure', 'total_column_water_vapour',
],
'year': '2021',
'month': '10',
'day': [
'19', '20', '21',
'22', '23', '24',
'25', '26', '27',
'28', '29', '30',
'31',
],
'time': [
'00:00', '06:00', '12:00',
'18:00',
],
},
'/project/projectdirs/dasrepo/ERA5/oct_2021_19_31_sfc.nc')
# '/project/projectdirs/dasrepo/ERA5/oct_2021_19_31_pl.nc')
import cdsapi
import numpy as np
import os
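# Bulk-download 6-hourly u, v and geopotential at the selected pressure level(s) from the
# CDS API, one NetCDF file per year; `usr` selects which block of years (see year_dict below)
# and which output subdirectory to write into.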
usr = 'j' # j,k,a
base_path = '/project/projectdirs/dasrepo/ERA5/wind_levels/6hr/' + usr
if not os.path.isdir(base_path):
os.makedirs(base_path)
year_dict = {'j': np.arange(1981, 1991), 'k': np.arange(1993,2006), 'a' : np.arange(2006, 2021)}
years = year_dict[usr]
#t1 = [str(jj).zfill(2) for jj in range(1,4)]
#t2 = [str(jj).zfill(2) for jj in range(4,7)]
#t3 = [str(jj).zfill(2) for jj in range(7,10)]
#t4 = [str(jj).zfill(2) for jj in range(10,13)]
#
#trimesters = [t1, t2, t3, t4]
#months = [str(jj).zfill(2) for jj in range(1,13)]
pressure_levels = [300]
c = cdsapi.Client()
for pressure_level in pressure_levels:
for year in years:
year_str = str(year)
pressure_str = str(pressure_level)
file_str = base_path + '/u_v_z_pressure_level_'+ pressure_str + '_' + year_str + '.nc'
print(year_str)
print(file_str)
c.retrieve(
'reanalysis-era5-pressure-levels',
{
'product_type': 'reanalysis',
'format': 'netcdf',
'pressure_level': pressure_str,
'variable': [
'u_component_of_wind', 'v_component_of_wind', 'geopotential',
],
'year': year_str,
'month': [
'01', '02', '03',
'04', '05', '06',
'07', '08', '09',
'10', '11', '12',
],
'day': [
'01', '02', '03',
'04', '05', '06',
'07', '08', '09',
'10', '11', '12',
'13', '14', '15',
'16', '17', '18',
'19', '20', '21',
'22', '23', '24',
'25', '26', '27',
'28', '29', '30',
'31',
],
'time': [
'00:00', '06:00', '12:00','18:00',
],
},
file_str)
#BSD 3-Clause License
#
#Copyright (c) 2022, FourCastNet authors
#All rights reserved.
#
#Redistribution and use in source and binary forms, with or without
#modification, are permitted provided that the following conditions are met:
#
#1. Redistributions of source code must retain the above copyright notice, this
# list of conditions and the following disclaimer.
#
#2. Redistributions in binary form must reproduce the above copyright notice,
# this list of conditions and the following disclaimer in the documentation
# and/or other materials provided with the distribution.
#
#3. Neither the name of the copyright holder nor the names of its
# contributors may be used to endorse or promote products derived from
# this software without specific prior written permission.
#
#THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
#AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
#IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
#DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE
#FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
#DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR
#SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER
#CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY,
#OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
#OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
#
#The code was authored by the following people:
#
#Jaideep Pathak - NVIDIA Corporation
#Shashank Subramanian - NERSC, Lawrence Berkeley National Laboratory
#Peter Harrington - NERSC, Lawrence Berkeley National Laboratory
#Sanjeev Raja - NERSC, Lawrence Berkeley National Laboratory
#Ashesh Chattopadhyay - Rice University
#Morteza Mardani - NVIDIA Corporation
#Thorsten Kurth - NVIDIA Corporation
#David Hall - NVIDIA Corporation
#Zongyi Li - California Institute of Technology, NVIDIA Corporation
#Kamyar Azizzadenesheli - Purdue University
#Pedram Hassanzadeh - Rice University
#Karthik Kashinath - NVIDIA Corporation
#Animashree Anandkumar - California Institute of Technology, NVIDIA Corporation
import torch
import numpy as np
import h5py
years = [1979, 1989, 1999, 2004, 2010]
global_means = np.zeros((1,21,1,1))
global_stds = np.zeros((1,21,1,1))
time_means = np.zeros((1,21,721, 1440))
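# Statistics are estimated from a random block of 500 consecutive samples per year.
# Note: time_means is only initialized to zeros here and never accumulated in the loop
# below, so the saved time_means.npy contains zeros.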
for ii, year in enumerate(years):
with h5py.File('/pscratch/sd/s/shas1693/data/era5/train/'+ str(year) + '.h5', 'r') as f:
rnd_idx = np.random.randint(0, 1460-500)
global_means += np.mean(f['fields'][rnd_idx:rnd_idx+500], keepdims=True, axis = (0,2,3))
global_stds += np.var(f['fields'][rnd_idx:rnd_idx+500], keepdims=True, axis = (0,2,3))
global_means = global_means/len(years)
global_stds = np.sqrt(global_stds/len(years))
time_means = time_means/len(years)
np.save('/pscratch/sd/s/shas1693/data/era5/global_means.npy', global_means)
np.save('/pscratch/sd/s/shas1693/data/era5/global_stds.npy', global_stds)
np.save('/pscratch/sd/s/shas1693/data/era5/time_means.npy', time_means)
print("means: ", global_means)
print("stds: ", global_stds)
#BSD 3-Clause License
#
#Copyright (c) 2022, FourCastNet authors
#All rights reserved.
#
#Redistribution and use in source and binary forms, with or without
#modification, are permitted provided that the following conditions are met:
#
#1. Redistributions of source code must retain the above copyright notice, this
# list of conditions and the following disclaimer.
#
#2. Redistributions in binary form must reproduce the above copyright notice,
# this list of conditions and the following disclaimer in the documentation
# and/or other materials provided with the distribution.
#
#3. Neither the name of the copyright holder nor the names of its
# contributors may be used to endorse or promote products derived from
# this software without specific prior written permission.
#
#THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
#AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
#IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
#DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE
#FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
#DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR
#SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER
#CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY,
#OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
#OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
#
#The code was authored by the following people:
#
#Jaideep Pathak - NVIDIA Corporation
#Shashank Subramanian - NERSC, Lawrence Berkeley National Laboratory
#Peter Harrington - NERSC, Lawrence Berkeley National Laboratory
#Sanjeev Raja - NERSC, Lawrence Berkeley National Laboratory
#Ashesh Chattopadhyay - Rice University
#Morteza Mardani - NVIDIA Corporation
#Thorsten Kurth - NVIDIA Corporation
#David Hall - NVIDIA Corporation
#Zongyi Li - California Institute of Technology, NVIDIA Corporation
#Kamyar Azizzadenesheli - Purdue University
#Pedram Hassanzadeh - Rice University
#Karthik Kashinath - NVIDIA Corporation
#Animashree Anandkumar - California Institute of Technology, NVIDIA Corporation
import h5py
import numpy as np
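# Standardize the static orography field in place (zero mean, unit variance) so it can be
# used as an extra input channel (the `orography` option in the configs above).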
with h5py.File('/pscratch/sd/s/shas1693/data/era5/static/orography.h5','a') as f:
orog = f['orog'][:]
omean = np.mean(orog)
print(omean)
ostd = np.std(orog)
print(ostd)
orog -= omean
orog /= ostd
f['orog'][...] = orog
f.flush()
#BSD 3-Clause License
#
#Copyright (c) 2022, FourCastNet authors
#All rights reserved.
#
#Redistribution and use in source and binary forms, with or without
#modification, are permitted provided that the following conditions are met:
#
#1. Redistributions of source code must retain the above copyright notice, this
# list of conditions and the following disclaimer.
#
#2. Redistributions in binary form must reproduce the above copyright notice,
# this list of conditions and the following disclaimer in the documentation
# and/or other materials provided with the distribution.
#
#3. Neither the name of the copyright holder nor the names of its
# contributors may be used to endorse or promote products derived from
# this software without specific prior written permission.
#
#THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
#AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
#IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
#DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE
#FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
#DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR
#SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER
#CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY,
#OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
#OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
#
#The code was authored by the following people:
#
#Jaideep Pathak - NVIDIA Corporation
#Shashank Subramanian - NERSC, Lawrence Berkeley National Laboratory
#Peter Harrington - NERSC, Lawrence Berkeley National Laboratory
#Sanjeev Raja - NERSC, Lawrence Berkeley National Laboratory
#Ashesh Chattopadhyay - Rice University
#Morteza Mardani - NVIDIA Corporation
#Thorsten Kurth - NVIDIA Corporation
#David Hall - NVIDIA Corporation
#Zongyi Li - California Institute of Technology, NVIDIA Corporation
#Kamyar Azizzadenesheli - Purdue University
#Pedram Hassanzadeh - Rice University
#Karthik Kashinath - NVIDIA Corporation
#Animashree Anandkumar - California Institute of Technology, NVIDIA Corporation
import h5py
from mpi4py import MPI
import numpy as np
import time
from netCDF4 import Dataset as DS
import os
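# writetofile copies one NetCDF variable at a time into channel `channel_idx` of the
# 'fields' dataset in the destination HDF5 file, with the 1460 time steps of a year
# split evenly across MPI ranks (parallel HDF5 via the mpio driver).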
def writetofile(src, dest, channel_idx, varslist):
if os.path.isfile(src):
batch = 2**6
rank = MPI.COMM_WORLD.rank
Nproc = MPI.COMM_WORLD.size
Nimgtot = 1460#src_shape[0]
Nimg = Nimgtot//Nproc
print("Nimgtot",Nimgtot)
print("Nproc",Nproc)
print("Nimg",Nimg)
base = rank*Nimg
end = (rank+1)*Nimg if rank<Nproc - 1 else Nimgtot
idx = base
for variable_name in varslist:
fsrc = DS(src, 'r', format="NETCDF4").variables[variable_name]
fdest = h5py.File(dest, 'a', driver='mpio', comm=MPI.COMM_WORLD)
start = time.time()
while idx<end:
if end - idx < batch:
ims = fsrc[idx:end]
print(ims.shape)
fdest['fields'][idx:end, channel_idx, :, :] = ims
break
else:
ims = fsrc[idx:idx+batch]
fdest['fields'][idx:idx+batch, channel_idx, :, :] = ims
idx+=batch
ttot = time.time() - start
eta = (end - base)/((idx - base)/ttot)
hrs = eta//3600
mins = (eta - 3600*hrs)//60
secs = (eta - 3600*hrs - 60*mins)
ttot = time.time() - start
hrs = ttot//3600
mins = (ttot - 3600*hrs)//60
secs = (ttot - 3600*hrs - 60*mins)
channel_idx += 1
#year_dict = {'j': np.arange(1979, 1993), 'k': np.arange(1993, 2006), 'a' : np.arange(2006, 2021)}
dir_dict = {}
for year in np.arange(1979, 1993):
dir_dict[year] = 'j'
for year in np.arange(1993, 2006):
dir_dict[year] = 'k'
for year in np.arange(2006, 2021):
dir_dict[year] = 'a'
print(dir_dict)
years = np.arange(1979, 2018)
for year in years:
print(year)
src = '/global/cscratch1/sd/jpathak/ERA5_u10v10t2m_netcdf/10m_u_v_2m_t_gp_lsm_toa_' + str(year) + '.nc'
dest = '/global/cscratch1/sd/jpathak/ERA5/wind/vlevels/' + str(year) + '.h5'
writetofile(src, dest, 0, ['u'])
writetofile(src, dest, 1, ['v'])
src = '/project/projectdirs/dasrepo/jpathak/ERA5_more_data/z_u_v_1000_'+str(year)+'.nc'
writetofile(src, dest, 2, ['u'])
writetofile(src, dest, 3, ['v'])
writetofile(src, dest, 4, ['z'])
usr = dir_dict[year]
src ='/project/projectdirs/dasrepo/ERA5/wind_levels/6hr/' + usr + '/u_v_z_pressure_level_850_' +str(year) + '.nc'
writetofile(src, dest, 5, ['u'])
writetofile(src, dest, 6, ['v'])
writetofile(src, dest, 7, ['z'])
usr = dir_dict[year]
src ='/project/projectdirs/dasrepo/ERA5/wind_levels/6hr/' + usr + '/u_v_z_pressure_level_500_' +str(year) + '.nc'
writetofile(src, dest, 8, ['u'])
writetofile(src, dest, 9, ['v'])
writetofile(src, dest, 10, ['z'])
src = '/project/projectdirs/dasrepo/jpathak/ERA5_more_data/z50_'+str(year)+'.nc'
writetofile(src, dest, 11, ['z'])
#BSD 3-Clause License
#
#Copyright (c) 2022, FourCastNet authors
#All rights reserved.
#
#Redistribution and use in source and binary forms, with or without
#modification, are permitted provided that the following conditions are met:
#
#1. Redistributions of source code must retain the above copyright notice, this
# list of conditions and the following disclaimer.
#
#2. Redistributions in binary form must reproduce the above copyright notice,
# this list of conditions and the following disclaimer in the documentation
# and/or other materials provided with the distribution.
#
#3. Neither the name of the copyright holder nor the names of its
# contributors may be used to endorse or promote products derived from
# this software without specific prior written permission.
#
#THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
#AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
#IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
#DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE
#FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
#DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR
#SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER
#CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY,
#OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
#OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
#
#The code was authored by the following people:
#
#Jaideep Pathak - NVIDIA Corporation
#Shashank Subramanian - NERSC, Lawrence Berkeley National Laboratory
#Peter Harrington - NERSC, Lawrence Berkeley National Laboratory
#Sanjeev Raja - NERSC, Lawrence Berkeley National Laboratory
#Ashesh Chattopadhyay - Rice University
#Morteza Mardani - NVIDIA Corporation
#Thorsten Kurth - NVIDIA Corporation
#David Hall - NVIDIA Corporation
#Zongyi Li - California Institute of Technology, NVIDIA Corporation
#Kamyar Azizzadenesheli - Purdue University
#Pedram Hassanzadeh - Rice University
#Karthik Kashinath - NVIDIA Corporation
#Animashree Anandkumar - California Institute of Technology, NVIDIA Corporation
# Instructions:
# Set Nimgtot correctly
import h5py
from mpi4py import MPI
import numpy as np
import time
from netCDF4 import Dataset as DS
import os
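# Same conversion pattern as the previous script, extended with `src_idx` (which pressure
# level to take from 4D NetCDF variables) and `frmt` ('nc' or 'h5') to select the source
# reader; Nimgtot is hard-coded (52 time steps here) and must be set to match the source file.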
def writetofile(src, dest, channel_idx, varslist, src_idx=0, frmt='nc'):
if os.path.isfile(src):
batch = 2**4
rank = MPI.COMM_WORLD.rank
Nproc = MPI.COMM_WORLD.size
Nimgtot = 52#src_shape[0]
Nimg = Nimgtot//Nproc
base = rank*Nimg
end = (rank+1)*Nimg if rank<Nproc - 1 else Nimgtot
idx = base
for variable_name in varslist:
if frmt == 'nc':
fsrc = DS(src, 'r', format="NETCDF4").variables[variable_name]
elif frmt == 'h5':
fsrc = h5py.File(src, 'r')[varslist[0]]
print("fsrc shape", fsrc.shape)
fdest = h5py.File(dest, 'a', driver='mpio', comm=MPI.COMM_WORLD)
start = time.time()
while idx<end:
if end - idx < batch:
if len(fsrc.shape) == 4:
ims = fsrc[idx:end,src_idx]
else:
ims = fsrc[idx:end]
print(ims.shape)
fdest['fields'][idx:end, channel_idx, :, :] = ims
break
else:
if len(fsrc.shape) == 4:
ims = fsrc[idx:idx+batch,src_idx]
else:
ims = fsrc[idx:idx+batch]
#ims = fsrc[idx:idx+batch]
print("ims shape", ims.shape)
fdest['fields'][idx:idx+batch, channel_idx, :, :] = ims
idx+=batch
ttot = time.time() - start
eta = (end - base)/((idx - base)/ttot)
hrs = eta//3600
mins = (eta - 3600*hrs)//60
secs = (eta - 3600*hrs - 60*mins)
ttot = time.time() - start
hrs = ttot//3600
mins = (ttot - 3600*hrs)//60
secs = (ttot - 3600*hrs - 60*mins)
channel_idx += 1
filestr = 'oct_2021_19_31'
dest = '/global/cscratch1/sd/jpathak/21var/oct_2021_19_21.h5'
src = '/project/projectdirs/dasrepo/ERA5/oct_2021_19_31_sfc.nc'
#u10 v10 t2m
writetofile(src, dest, 0, ['u10'])
writetofile(src, dest, 1, ['v10'])
writetofile(src, dest, 2, ['t2m'])
#sp mslp
src = '/project/projectdirs/dasrepo/ERA5/oct_2021_19_31_sfc.nc'
writetofile(src, dest, 3, ['sp'])
writetofile(src, dest, 4, ['msl'])
#t850
src = '/project/projectdirs/dasrepo/ERA5/oct_2021_19_31_pl.nc'
writetofile(src, dest, 5, ['t'], 2)
#uvz1000
src = '/project/projectdirs/dasrepo/ERA5/oct_2021_19_31_pl.nc'
writetofile(src, dest, 6, ['u'], 3)
writetofile(src, dest, 7, ['v'], 3)
writetofile(src, dest, 8, ['z'], 3)
#uvz850
src = '/project/projectdirs/dasrepo/ERA5/oct_2021_19_31_pl.nc'
writetofile(src, dest, 9, ['u'], 2)
writetofile(src, dest, 10, ['v'], 2)
writetofile(src, dest, 11, ['z'], 2)
#uvz 500
src = '/project/projectdirs/dasrepo/ERA5/oct_2021_19_31_pl.nc'
writetofile(src, dest, 12, ['u'], 1)
writetofile(src, dest, 13, ['v'], 1)
writetofile(src, dest, 14, ['z'], 1)
#t500
src = '/project/projectdirs/dasrepo/ERA5/oct_2021_19_31_pl.nc'
writetofile(src, dest, 15, ['t'], 1)
#z50
src = '/project/projectdirs/dasrepo/ERA5/oct_2021_19_31_pl.nc'
writetofile(src, dest, 16, ['z'], 0)
#r500
src = '/project/projectdirs/dasrepo/ERA5/oct_2021_19_31_pl.nc'
writetofile(src, dest, 17, ['r'], 1)
#r850
src = '/project/projectdirs/dasrepo/ERA5/oct_2021_19_31_pl.nc'
writetofile(src, dest, 18, ['r'], 2)
#tcwv
src = '/project/projectdirs/dasrepo/ERA5/oct_2021_19_31_sfc.nc'
writetofile(src, dest, 19, ['tcwv'])
#sst
#src = '/project/projectdirs/dasrepo/ERA5/oct_2021_19_31_sfc.nc'
#writetofile(src, dest, 20, ['sst'])
FROM nvcr.io/nvidia/pytorch:21.11-py3
# update repo info
RUN apt update -y
# install mpi4py
RUN pip install mpi4py
# h5py
RUN pip install h5py
# other python stuff
RUN pip install wandb && \
pip install ruamel.yaml && \
pip install --upgrade tqdm && \
pip install timm && \
pip install einops
# benchy
RUN pip install git+https://github.com/romerojosh/benchy.git
# set wandb to offline
#ENV WANDB_MODE offline
# copy source code
RUN mkdir -p /opt/ERA5_wind
COPY config /opt/ERA5_wind/config
COPY copernicus /opt/ERA5_wind/copernicus
COPY docker /opt/ERA5_wind/docker
COPY networks /opt/ERA5_wind/networks
COPY utils /opt/ERA5_wind/utils
COPY plotting /opt/ERA5_wind/plotting
COPY mpu /opt/ERA5_wind/mpu
COPY *.py /opt/ERA5_wind/
COPY *.sh /opt/ERA5_wind/
COPY perf_tests /opt/perf_tests
# create dummy git image
RUN cd /opt/ERA5_wind && git init
#!/bin/bash
repo=gitlab-master.nvidia.com:5005/tkurth/era5_wind
#tag=latest
#tag=debug
tag=jaideep_legacy_dataloader
cd ../
# build
docker build -t ${repo}:${tag} -f docker/Dockerfile .
# push
docker push ${repo}:${tag}
# retag and repush
#docker tag ${repo}:${tag} thorstenkurth/era5-wind:${tag}
#docker push thorstenkurth/era5-wind:${tag}
export RANK=$SLURM_PROCID
export WORLD_RANK=$SLURM_PROCID
export LOCAL_RANK=$SLURM_LOCALID
export WORLD_SIZE=$SLURM_NTASKS
export MASTER_PORT=29500 # default from torch launcher
export WANDB_START_METHOD="thread"
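# note: torch.distributed (env:// init) also expects MASTER_ADDR; this snippet assumes it is
# exported elsewhere in the job script (e.g. derived from the SLURM node list).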
## Inference scripts
To run inference from base dir:
* single member inference:
* inference.py for backbone variables: (see help for other options)
```bash
python inference/inference.py --config afno_backbone --run_num 0
```
* inference_precip.py for total precipitation: (see help for other options)
```bash
python inference/inference_precip.py --config precip --run_num 0
```
* ensemble inference:
* inference_ensemble.py for wind variables: (see help for other options); use submit_batch_ensemble.sh to submit parallelized ensembles across different initial conditions
* inference_ensemble_precip.py for total precipitation: (see help for other options); use submit_batch_ensemble.sh to submit parallelized ensembles across different initial conditions, changing the launch cmd to inference_ensemble_precip.py