Week 14
Weekly Progress Update (April 14–15 & April 21–22)
Meeting Participants: Haocheng Lu, Yufeng Xu, Haochen Yang, Minjun Zhu
Meeting Mode: In Person & Online
Group Accomplishments
- Continued working on our individual assigned issues.
- Shared debugging strategies and provided feedback on each other’s open pull requests.
- Reviewed the progress on issues from previous meetings and identified the ones actively being worked on.
- Prepared presentation slides for upcoming meetings.
- Practiced delivering the presentation to ensure clarity and flow.
Individual Accomplishments
-
Haocheng Lu:
Added support for manually settinghead_dim
in Qwen2 MoE models. Updated the configuration, modified the attention module, and added corresponding tests to align with behavior seen in Llama and Mixtral models.
➔ PR #37643 -
Yufeng Xu:
Submitted a pull request to fix incorrect installation instructions, currently under review.
➔ PR #37640 -
Haochen Yang:
Investigated a state dictionary bug inIterableDataset
(datasets library) and started testing a proposed local fix. Documented findings in the related issue discussions. -
Minjun Zhu:
Submitted a PR for adding tests for the new Tensor Parallel integration.
➔ PR #37596
Submitted a PR to add resume checkpoint support for the ClearML callback, including a new test file.
➔ PR #37635
Provided the issue opener with base code for fine-tuning the SigLIP2 model.
➔ Issue #37627