I'm an engineer at xAI focusing on multimodal models, video generation, and world models (Grok Imagine v0.9).
🤗Open Source Projects:
- Cosmos: state-of-the-art generative world models
- NeMo DFM: large-scale training and inference framework for diffusion models
- Megatron-LM MoE: scaling up mixture of experts
- NeMo: scalable training framework for transformer LLMs
- LongVILA: Long-Context VLM for long videos (ICLR'25)
- ActGPT: browser-use agent
- Channel Pruning: Accelerating Very Deep Neural Networks (ICCV'17)
- Epipolar Transformers: Accurate multi-camera pose understanding (CVPR'20)
- AMC: AutoML for model compression (ECCV'18)
- KL Loss: Accurate Object Detection (CVPR'19)
- FSAF: single-shot object detection (CVPR'19)
- 🤓Grok Heavy Tungsten Cube





