Week11

Group Project Update & Reflection on Shivam’s Talk

Over the past few weeks, our group has made steady and meaningful progress on our open-source contribution project. During our in-person and online meetings, we reviewed everyone’s work during spring break and aligned on ways to keep up momentum. As a team—Haocheng Lu, Yufeng Xu, Haochen Yang, and Minjun Zhu—we’ve worked through issues related to model behavior, API inconsistencies, documentation bugs, and template standardization.

Each team member brought something valuable to the table. For example, Yufeng proposed fixes for issues involving KV-cache inconsistencies and lm_head parameters. Haochen Yang contributed to the datasets repo by improving documentation clarity and pointing out outdated issues. Minjun and I have both submitted PRs to improve readability and correctness in the Transformers codebase. Overall, our team has maintained a sustainable contribution pace while helping one another navigate technical hurdles.

For my part, I’ve made two main types of contributions:

  • Code-based Contribution: One of my more technical contributions was addressing Issue #36510, which involved incorrect position_ids updates in the model generation loop. I proposed a detailed explanation and potential fix, diving deep into how generation caching works in Transformer models. It pushed me to better understand indexing, tensor shapes, and the logic behind cache management in Hugging Face’s generation API.

  • Non-code Contribution: I also submitted PR #36839 to fix a broken documentation link. While this may seem minor, documentation is a crucial entry point for new users and contributors. Ensuring it’s clear and functional helps improve the accessibility of open source projects.

One challenge that keeps coming up is how deep and specialized the knowledge often needs to be. Issues may seem simple at first glance, but frequently involve complex model internals or subtle behaviors across different components. That’s why Shivam’s talk this week was especially impactful for me.


Reflection on Shivam Balikondwar’s Presentation

Before Shivam’s presentation, I often questioned whether I was “qualified” to contribute to open-source projects. Many of the issues I encountered felt too advanced, and I sometimes hesitated to engage deeply, fearing I might not fully understand the logic or end up making a mistake.

But hearing Shivam share how he began his open-source journey without a computer science degree—and learned everything along the way—was truly eye-opening. It made me realize that contributing isn’t about being perfect from the start. It’s about learning in public, asking questions, and making small, meaningful steps forward.

This helped me reframe how I view my own contributions. For example, I used to think fixing documentation or links was “too small” to count. But now I see that every contribution—whether it’s code or not—plays a role in supporting the open-source ecosystem. Even reviewing an issue or clarifying a bug can help others.

I’ve also started to feel less afraid of not knowing everything. Shivam emphasized that not knowing is part of the journey, and that mindset gave me the confidence to explore more technical issues and dive into areas I previously avoided.

Now, instead of seeing challenges as roadblocks, I’ve started to see them as learning checkpoints. Shivam’s talk didn’t just inspire me—it fundamentally changed the way I approach open source.

Written before or on April 6, 2025