
GitHub Usability Study
To investigate GitHub's "Pull Request" feature and evaluate the transition from the old interface to the new alpha version.

Role
UX Designer & Researcher
Usability Analyst
Played a key role in setting up the GitHub testing environment and in moderating and observing multiple study sessions. I analyzed the data to gather insights and led the design recommendations based on testing outcomes.

Tools



Figma
Notion
GitHub
Goal
Study
Assess the intuitiveness of the new UI and evaluate the transition from old to new UI
6 participants took part in remote, moderated within-subjects A/B testing, acting as both code authors and reviewers
Result
Overall, satisfactory transition with a steep learning curve.
Decrease in SUS score by 12%
Task completion rate of 66.6%

Team
5 UX Designers

Duration
10 weeks
Overview
GitHub redesigned its Pull Request feature with a focus on improving accessibility.
A feature for developers to propose, review, and refine code changes by collaborating with teammates before merging and deploying the updates.
Participants appreciated the modern, intuitive UI but noted a steep learning curve, reflected in a SUS drop from 71 to 62.5.
6
Participants
66.6%
Task success rate
-12%
SUS score from old to new UI
The Study
01
GOALS



Identify success measures for the new pull request interface compared to the previous UI, focusing on task completion rates.
Identify usability issues in locating and resolving comments and in merging pull requests.
Assess the intuitiveness of the new UI for existing users, including how clearly it communicates a pull request's mergeability.
02
PLAN
Pull requests enable code review, discussion, and collaboration. We explored how comments are added, viewed, and tracked to improve accessibility and user experience.
The study considered 2 key user roles:

Code Author
The developer requesting the pull request.

Code Reviewer
The team member responsible for reviewing and approving the changes.

Demo org we set up
Pull request conversation from the scene we set up

Comments on individual lines of code within "Files changed"
03
METHOD
We tested 6 professional developers with working knowledge of GitHub workflows, since they would be most affected by any usability issues. The study comprised 3 main components:
1
2
3
Usability Tasks
We asked participants to perform tasks in various scenarios on the GitHub website. We recorded screen activity to measure time on task and the number of clicks per task, helping us identify bottlenecks in the user flow.
SUS Questionnaire
After testing each environment, we followed up with a short SUS questionnaire using a Likert scale to capture immediate satisfaction and ease of use for both interfaces. We also had participants select adjectives to describe their impressions.
User Interview
Participants shared their overall personal thoughts on the experience and the new features, giving us rich attitudinal insights.
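The SUS questionnaires above follow the standard System Usability Scale scoring procedure, which can be sketched as follows (this is a generic illustration of SUS scoring, not code used in the study):

```python
# Standard SUS scoring (Brooke's 10-item scale): odd-numbered items
# contribute (response - 1), even-numbered items contribute (5 - response);
# the sum is multiplied by 2.5 to yield a 0-100 score.

def sus_score(responses):
    """responses: ten 1-5 Likert ratings, items 1-10 in order."""
    assert len(responses) == 10
    total = 0
    for i, r in enumerate(responses, start=1):
        total += (r - 1) if i % 2 == 1 else (5 - r)
    return total * 2.5

# A neutral response to every item (all 3s) yields the midpoint score.
print(sus_score([3] * 10))  # -> 50.0
```

Scores like the old UI's 75.5 and the new UI's 66.75 reported later are averages of per-participant scores computed this way.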



Study with the GitHub participant
Participant
Me as a moderator
Each user was assigned tasks as either a code author or a code reviewer. We ran a remote, moderated usability test using a within-subjects A/B design.
To account for the two UIs (old and new) and two user roles (Code Author and Code Reviewer), we created 4 testing conditions to mitigate potential biases, such as a preference for the first interface encountered. Each participant was assigned one of these conditions.
         Code Author   Code Reviewer
Old UI   A             B
New UI   C             D

Testing Conditions
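The four-condition assignment described above can be sketched as a simple round-robin over the UI × role cross (a hypothetical illustration; participant IDs and variable names are ours, not from the study):

```python
from itertools import cycle, product

# Cross the two UIs with the two roles to get the four conditions A-D,
# then assign participants round-robin so every condition is covered
# before any condition repeats.
conditions = list(product(["Old UI", "New UI"], ["Code Author", "Code Reviewer"]))
participants = [f"P{i}" for i in range(1, 7)]  # 6 participants

assignment = {p: c for p, c in zip(participants, cycle(conditions))}
for p, (ui, role) in assignment.items():
    print(p, ui, role)
```

With 6 participants and 4 conditions, two conditions are necessarily seen by two participants each, which is one reason per-task success denominators below vary.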
Code Author Tasks
1. Finding comments by reviewers
2. Resolving comments
3. Determining why a pull request couldn't be merged

Code Reviewer Tasks
1. Posting a comment
2. Replying to the comment by the Author
3. Suggesting code changes

04
DATA COLLECTION
We collected both quantitative and qualitative data to measure success and gather insights.
Qualitative
Pre & post Test Interviews, Observations, Adjectives association
Quantitative
Task completion rate, time on task, System Usability Scale (SUS) responses, and user ratings of perceived success and task difficulty
05
FINDINGS & RECOMMENDATIONS
Participants expressed satisfaction with the new Pull Request UI, though quantitative data highlighted areas of challenge.
Task success rate
66.6%
All participants attempted every task, but only 6 of the 9 tasks achieved a full success rate.
System Usability Scale (SUS) score
-12%
The old UI averaged 75.5, outperforming the new UI's 66.75, showing the new design hasn't fully met user expectations.
Some of the key findings and the recommendations are:
Key Finding
1. Users find the ‘resolve conversation’ icon hard to locate, which disrupts their workflow.
2 out of 4 participants completed the task successfully, and all participants voiced concerns about locating the resolve icon.

Severity
4- Critical
Task Success
50%
“I'm able to see the comments as an overlay on the code (in old UI), and was able to resolve it. I could see the resolve button there easily. So if I could have the same experience or a more improved experience that would be good.”
Design Recommendation
Resolve
Temporarily display the label "Resolve Conversation" alongside the icon to help users become familiar with its function and build habitual usage; the label can be removed at a later stage.
OR
Include a permanent "Resolve Conversation" CTA at the bottom of the overlay.
Resolve Conversation
This approach will facilitate a smoother transition from the old UI to the new one and reduce the learning curve associated with locating the icon and understanding its function.
Key Finding
2. Users have difficulty distinguishing between comment types, such as code comments and suggestions.
Severity
4- Critical
2 out of 6 participants completed the task successfully, with major difficulty finding code suggestions among the comments and suggestions.

Task Success
40%
“At first (old UI) I was able to see all the comments, but now I cannot easily differentiate that. Was this a comment actually or a suggestion or just a normal general comment?”
Suggestion comment

General code comment
Design Recommendation
Distinguish various types of comments by assigning them different markers.
For instance, suggestions could be highlighted in blue and general comments in green, or icons and symbols could denote the different comment types.

This approach will aid in identifying, prioritizing, and acting on comments more quickly.
06
LEARNINGS
Collaborating with GitHub’s research team was an enriching experience. Their feedback helped us iterate on our study design, ensuring that we were capturing meaningful data.



Deep Collaboration
Real-World Impact
Participant Screening
Collaborating with GitHub’s research team highlighted the value of learning from industry experts. Their attention to detail emphasized the need for refining methods and approaches to achieve more impactful outcomes.
Contributing to a feature that affects millions of developers underscored the importance of thoughtful design and the role of accessibility in enhancing user experience.
Screening for more active GitHub users would have improved the authenticity of the insights, ensuring we gathered data from developers who regularly interact with the platform.