12 Commits

| Author | SHA1 | Message | Date |
|---|---|---|---|
| Christina Floristean | a67036064d | Convert back to AF2 scheduler/optimizer in deepspeed config | 2023-09-08 11:21:22 -04:00 |
| Christina Floristean | f0a320e0d2 | Integrated deepspeed attention kernel and added initial tests. | 2023-09-07 16:51:27 -04:00 |
| Gustaf Ahdritz | f362ebffa7 | Undo DeepSpeed config change | 2022-03-29 20:38:31 -04:00 |
| Gustaf Ahdritz | 47f1d66ad1 | Fix bug in DeepSpeed config | 2022-03-29 15:18:29 -04:00 |
| Gustaf Ahdritz | 3dcc01a76e | Tweak training script. Install new LR scheduler | 2022-03-20 17:46:19 -04:00 |
| Gustaf Ahdritz | 32c137e276 | Remove buggy DeepSpeed LR scheduler | 2022-02-06 17:28:07 -05:00 |
| Gustaf Ahdritz | 29be56e544 | Add bfloat16 training option | 2021-11-29 18:38:56 -05:00 |
| Gustaf Ahdritz | 4bd437511a | Add cache clearing to config | 2021-11-19 18:23:26 -05:00 |
| Gustaf Ahdritz | 263661a39f | Clear cuda cache between extra MSA blocks to alleviate fragmentation | 2021-11-19 17:22:17 -05:00 |
| Gustaf Ahdritz | 9df2af524c | Remove experimental features committed by mistake | 2021-10-29 00:18:08 -04:00 |
| Gustaf Ahdritz | d5480cbe69 | Fix horrible in-place operation bug | 2021-10-29 00:15:48 -04:00 |
| Gustaf Ahdritz | 8bbe9e5a94 | Add sample DeepSpeed config | 2021-10-12 13:43:57 -04:00 |