Commit 3e22e549 authored by helloyongyang

update doc

parent 67eeafcf
# Feature Caching
## Cache Acceleration Algorithms
- In the inference process of diffusion models, cache reuse is an important acceleration algorithm.
- The core idea is to skip redundant computations at certain time steps by reusing historical cache results to improve inference efficiency.
- The key to the algorithm is deciding at which time steps to perform cache reuse, usually by dynamically judging model state changes against an error threshold.
- During inference, key content such as intermediate features, residuals, and attention outputs is cached. On entering a reusable time step, the cached content is used directly and the current output is reconstructed through an approximation such as a Taylor expansion, reducing repeated computation and enabling efficient inference. A minimal sketch of this loop follows the list.
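As a rough illustration of the pattern, the reuse loop can be sketched as below. This is a minimal sketch with hypothetical names (`model`, `should_reuse`), not the actual LightX2V implementation:

```python
def denoise_with_cache(model, latents, timesteps, should_reuse):
    """Generic cache-reuse loop (illustrative sketch): run the full model
    only when needed; otherwise replay the last cached residual."""
    cached_residual = None
    for i, t in enumerate(timesteps):
        if cached_residual is not None and should_reuse(i, latents):
            # Reusable step: skip the transformer and replay the cache.
            latents = latents + cached_residual
        else:
            # Full step: run the model and refresh the cached residual.
            out = model(latents, t)
            cached_residual = out - latents
            latents = out
    return latents
```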
### TeaCache
The core idea of `TeaCache` is to accumulate the **relative L1** distance between the inputs of adjacent time steps. Once the accumulated distance reaches a set threshold, the current time step is judged unsuitable for cache reuse and is recomputed in full; while the accumulated distance stays below the threshold, cache reuse accelerates inference.
- Specifically, the algorithm calculates the relative L1 distance between the current input and the previous step input at each inference step and accumulates it.
- When the accumulated distance does not exceed the threshold, the model state has not changed appreciably, so the most recently cached content is reused directly and the redundant computation is skipped. This significantly reduces the number of full forward passes and improves inference speed. A sketch of this decision rule follows the list.
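At each step, the relative L1 distance between the current and previous inputs is roughly

$$d_t = \frac{\lVert x_t - x_{t-1} \rVert_1}{\lVert x_{t-1} \rVert_1},$$

and these distances are accumulated until they cross the threshold. The sketch below is a simplified version of that rule; the class name and default threshold are hypothetical, and the paper additionally rescales the distance with a fitted polynomial before accumulating:

```python
import torch

class TeaCacheDecision:
    """Accumulate the relative L1 distance between successive inputs and
    allow cache reuse while the running total stays under a threshold
    (simplified sketch of the TeaCache decision rule)."""

    def __init__(self, threshold: float = 0.1):
        self.threshold = threshold
        self.prev_input = None
        self.accumulated = 0.0

    def reuse(self, x: torch.Tensor) -> bool:
        if self.prev_input is None:
            self.prev_input = x.detach().clone()
            return False  # first step always computes in full
        rel_l1 = ((x - self.prev_input).abs().mean()
                  / self.prev_input.abs().mean()).item()
        self.prev_input = x.detach().clone()
        self.accumulated += rel_l1
        if self.accumulated < self.threshold:
            return True  # change is small enough: reuse the cache
        self.accumulated = 0.0  # threshold reached: recompute and reset
        return False
```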
In practice, TeaCache achieves significant acceleration while preserving generation quality. On a single H200, the times and videos before and after acceleration compare as follows:
<table>
<tr>
<td align="center">
Before acceleration: 58s
</td>
<td align="center">
After acceleration: 17.9s
</td>
</tr>
<tr>
<td align="center">
<video src="https://github.com/user-attachments/assets/1781df9b-04df-4586-b22f-5d15f8e1bff6" width="100%"></video>
</td>
<td align="center">
<video src="https://github.com/user-attachments/assets/e93f91eb-3825-4866-90c2-351176263a2f" width="100%"></video>
</td>
</tr>
</table>
- Speedup: **3.24×**
- Config: [wan_t2v_1_3b_tea_480p.json](https://github.com/ModelTC/lightx2v/tree/main/configs/caching/teacache/wan_t2v_1_3b_tea_480p.json)
- Reference paper: [https://arxiv.org/abs/2411.19108](https://arxiv.org/abs/2411.19108)
### TaylorSeer Cache
The core of `TaylorSeer Cache` is to re-estimate the cached content with a Taylor expansion, used as residual compensation at cache-reuse time steps.
- Concretely, at a reuse step the historical cache is not simply replayed; the current output is approximately reconstructed through a Taylor expansion. This further improves output accuracy while still reducing computation.
- The Taylor expansion captures small changes in model state, so the error introduced by cache reuse is compensated, preserving generation quality while accelerating.
`TaylorSeer Cache` suits scenarios with high output-accuracy requirements and can further improve inference performance on top of plain cache reuse. A first-order sketch of the idea follows.
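The sketch below caches the output and a finite-difference derivative at full-compute steps, then extrapolates at reuse steps. It is a first-order illustration under assumed names (`TaylorSeerPredictor` is hypothetical); the paper also supports higher orders:

```python
import torch

class TaylorSeerPredictor:
    """First-order Taylor sketch: cache the latest output and a
    finite-difference derivative at full-compute steps, then
    extrapolate at reuse steps instead of replaying the cache as-is."""

    def __init__(self):
        self.last_output = None
        self.derivative = None
        self.last_step = None

    def update(self, output: torch.Tensor, step: int) -> None:
        # Full-compute step: refresh the cached value and its derivative.
        if self.last_output is not None and step != self.last_step:
            self.derivative = (output - self.last_output) / (step - self.last_step)
        self.last_output = output
        self.last_step = step

    def predict(self, step: int) -> torch.Tensor:
        # Reuse step: F(t + k) ≈ F(t) + F'(t) · k  (first-order Taylor).
        if self.derivative is None:
            return self.last_output
        return self.last_output + self.derivative * (step - self.last_step)
```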
<table>
<tr>
<td align="center">
Before acceleration: 57.7s
</td>
<td align="center">
After acceleration: 41.3s
</td>
</tr>
<tr>
<td align="center">
<video src="https://github.com/user-attachments/assets/2d04005c-853b-4752-884b-29f8ea5717d2" width="100%"></video>
</td>
<td align="center">
<video src="https://github.com/user-attachments/assets/270e3624-c904-468c-813e-0c65daf1594d" width="100%"></video>
</td>
</tr>
</table>
- Speedup: **1.39×**
- Config: [wan_t2v_taylorseer](https://github.com/ModelTC/lightx2v/tree/main/configs/caching/taylorseer/wan_t2v_taylorseer.json)
- Reference paper: [https://arxiv.org/abs/2503.06923](https://arxiv.org/abs/2503.06923)
### AdaCache
The core idea of `AdaCache` is to dynamically adjust the cache-reuse stride based on partial cached content within designated blocks.
- The algorithm analyzes feature differences between two adjacent time steps within specific blocks and adaptively determines the next cache reuse time step interval based on the difference magnitude.
- When model state changes are small, the step size automatically increases, reducing cache update frequency; when state changes are large, the step size decreases to ensure output quality.
This lets the caching strategy flex with the dynamics of the actual inference run, yielding more efficient acceleration and better generations. AdaCache suits applications with high demands on both inference speed and generation quality. A sketch of the stride rule follows.
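One way to picture the stride rule is the mapping below from feature change to reuse stride. The thresholds, strides, and function name are illustrative assumptions, not the values used by AdaCache:

```python
import torch

def next_stride(feat_prev: torch.Tensor, feat_curr: torch.Tensor,
                thresholds=(0.05, 0.1, 0.2), strides=(6, 4, 2)) -> int:
    """Map the relative feature change inside a monitored block to a
    cache-reuse stride: small change -> large stride (more reused steps),
    large change -> stride 1 (recompute every step). Illustrative values."""
    diff = ((feat_curr - feat_prev).abs().mean()
            / feat_prev.abs().mean()).item()
    for threshold, stride in zip(thresholds, strides):
        if diff < threshold:
            return stride
    return 1  # large change: recompute at every step
```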
<table>
<tr>
<td align="center">
Before acceleration: 227s
</td>
<td align="center">
After acceleration: 83s
</td>
</tr>
<tr>
<td align="center">
<video src="https://github.com/user-attachments/assets/33b2206d-17e6-4433-bed7-bfa890f9fa7d" width="100%"></video>
</td>
<td align="center">
<video src="https://github.com/user-attachments/assets/084dbe3d-6ff3-4afc-9a7c-453ec53b3672" width="100%"></video>
</td>
</tr>
</table>
- Speedup: **2.73×**
- Config: [wan_i2v_ada](https://github.com/ModelTC/lightx2v/tree/main/configs/caching/adacache/wan_i2v_ada.json)
- Reference paper: [https://arxiv.org/abs/2411.02397](https://arxiv.org/abs/2411.02397)
### CustomCache
`CustomCache` combines the advantages of `TeaCache` and `TaylorSeer Cache`.
- It adopts `TeaCache`'s responsive, well-grounded cache decisions, using a dynamic threshold to determine when to reuse the cache.
- At the same time, it applies `TaylorSeer`'s Taylor expansion to reconstruct outputs from the cached content.
This both times cache reuse efficiently and extracts the most from the cached content, improving output accuracy and generation quality. In testing, `CustomCache` produced better video quality than `TeaCache`, `TaylorSeer Cache`, or `AdaCache` alone across multiple generation tasks, making it one of the best-performing cache acceleration algorithms overall. A sketch combining the two pieces follows.
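The two sketches above can be composed as below. This is a hypothetical combination reusing the `TeaCacheDecision` and `TaylorSeerPredictor` classes from earlier on this page, not the actual `CustomCache` code:

```python
def custom_cache_step(model, x, t, step, decider, predictor):
    """One denoising step under a CustomCache-style policy (sketch):
    a TeaCache-style threshold decides WHEN to reuse, and a
    TaylorSeer-style extrapolation decides WHAT to return when reusing."""
    if predictor.last_output is not None and decider.reuse(x):
        return predictor.predict(step)  # reuse: Taylor-extrapolated output
    out = model(x, t)                   # recompute: full forward pass
    predictor.update(out, step)         # refresh cached value and derivative
    return out
```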
<table>
<tr>
<td align="center">
Before acceleration: 57.9s
</td>
<td align="center">
After acceleration: 16.6s
</td>
</tr>
<tr>
<td align="center">
<video src="https://github.com/user-attachments/assets/304ff1e8-ad1c-4013-bcf1-959ac140f67f" width="100%"></video>
</td>
<td align="center">
<video src="https://github.com/user-attachments/assets/d3fb474a-79af-4f33-b965-23d402d3cf16" width="100%"></video>
</td>
</tr>
</table>
- Speedup: **3.49×**
- Config: [wan_t2v_custom_1_3b](https://github.com/ModelTC/lightx2v/tree/main/configs/caching/custom/wan_t2v_custom_1_3b.json)
## Usage
The config files for feature caching are located [here](https://github.com/ModelTC/lightx2v/tree/main/configs/caching).
Point `--config_json` at a specific config file to test the different cache algorithms.
[Here](https://github.com/ModelTC/lightx2v/tree/main/scripts/cache) are some ready-to-use run scripts.