DeepSpeed (research) developer at @microsoft
-
Microsoft
- Seattle, WA
- https://rasley.io
- @jeffra45
Highlights
- Pro
Pinned
-
microsoft/DeepSpeed Public
DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective.
-
-
microsoft/Megatron-DeepSpeed Public
Forked from NVIDIA/Megatron-LM
Ongoing research training transformer language models at scale, including: BERT & GPT-2
1,299 contributions in the last year
Activity overview
Contributed to
microsoft/DeepSpeed,
microsoft/DeepSpeed-MII,
jeffra/sandbox
and 12 other repositories
Contribution activity
March 2023
Created 11 commits in 2 repositories
Created 1 repository
- jeffra/ColossalAI Python
Created a pull request in hpcaitech/ColossalAI that received 3 comments
prevent op_builder being installed in site-pkgs
+1 −0 • 3 comments
Opened 3 other pull requests in 1 repository
Reviewed 12 pull requests in 2 repositories
microsoft/DeepSpeed
11 pull requests
- Remove bf16 from inference config dtype enum
- Softmax Scheduling Cleanup
- fix check params
- fix return prev key and value, added strides to from_blob
- Assert mp_size is factor of model dimensions
- Fix Broken Links
- pre-commit check for torch.cuda in code
- deepspeed.init_distributed() support for TCP protocols
- Improve overflow handling
- Check for local CUDA graphs when enable_cuda_graph=True
- TP unsupported models and assertions
microsoft/DeepSpeed-MII
1 pull request
12 contributions in private repositories
Mar 1 – Mar 16