| Author | Commit | Message | Date |
|---|---|---|---|
| Christina Floristean | a67036064d | Convert back to AF2 scheduler/optimizer in deepspeed config | 2023-09-08 11:21:22 -04:00 |
| Christina Floristean | f0a320e0d2 | Integrated deepspeed attention kernel and added initial tests. | 2023-09-07 16:51:27 -04:00 |
| Gustaf Ahdritz | f362ebffa7 | Undo DeepSpeed config change | 2022-03-29 20:38:31 -04:00 |
| Gustaf Ahdritz | 47f1d66ad1 | Fix bug in DeepSpeed config | 2022-03-29 15:18:29 -04:00 |
| Gustaf Ahdritz | 3dcc01a76e | Tweak training script. Install new LR scheduler | 2022-03-20 17:46:19 -04:00 |
| Gustaf Ahdritz | 32c137e276 | Remove buggy DeepSpeed LR scheduler | 2022-02-06 17:28:07 -05:00 |
| Gustaf Ahdritz | 29be56e544 | Add bfloat16 training option | 2021-11-29 18:38:56 -05:00 |
| Gustaf Ahdritz | 4bd437511a | Add cache clearing to config | 2021-11-19 18:23:26 -05:00 |
| Gustaf Ahdritz | 263661a39f | Clear cuda cache between extra MSA blocks to alleviate fragmentation | 2021-11-19 17:22:17 -05:00 |
| Gustaf Ahdritz | 9df2af524c | Remove experimental features committed by mistake | 2021-10-29 00:18:08 -04:00 |
| Gustaf Ahdritz | d5480cbe69 | Fix horrible in-place operation bug | 2021-10-29 00:15:48 -04:00 |
| Gustaf Ahdritz | 8bbe9e5a94 | Add sample DeepSpeed config | 2021-10-12 13:43:57 -04:00 |