Virginia Tech

Data associated with "Are We on the Same Page? Examining Developer Perception Alignment in Open Source Code Reviews"

The aim of this research is to identify the expectations of contributors (specifically, developers) and maintainers (or reviewers) regarding the code review process for effective bias mitigation in open source software development projects. Additionally, the research examines whether these expectations align with those established in the project guidelines. Furthermore, we intend to uncover common practices across various projects as outlined in their respective guidelines. This dataset includes the list of open source project repositories analyzed, the survey instruments, and the Jupyter notebooks used in the analysis. Because of limitations in the consent language, we are not able to share individual survey responses.

Related Materials

  1. arXiv - Is supplement to https://arxiv.org/abs/2504.18407

Publisher

University Libraries, Virginia Tech

Corresponding Author Name

Yoseph Berhanu Alebachew

Corresponding Author E-mail Address

yoseph@vt.edu

Files/Folders in Dataset and Description

Data Files Description

repos.csv – List of GitHub repositories analyzed in the study
• Rank: Repository rank based on popularity or selection criteria
• Repo Id: Unique identifier for the repository
• Repo Name: Repository name
• Full Name: Full name including the owner (e.g., owner/repo)
• fork: Whether the repo is a fork (true/false)
• url: GitHub URL of the repository
• homepage: Associated project homepage if provided
• size: Size of the repository in KB
• language: Primary programming language used
• topics: Repository topics/tags (comma-separated)
• open_issues: Number of open issues
• watchers: Number of watchers
• created_at: Creation date of the repository
• updated_at: Date of the last update
• Owner Id: Unique ID of the repository owner
• Owner Login: GitHub username of the repository owner
• Owner URL: URL of the owner profile
• Owner Type: Type of owner (User or Organization)
• description: Short description of the repository
• Stars: Number of stars
• Forks Count: Number of forks

survey_questions.csv – Contains metadata for all survey questions presented to contributors and maintainers
• Id: Unique identifier for the question
• Question: The text of the question
• Maintainer QID: Question code as shown in the maintainers’ version
• Contributor QID: Question code as shown in the contributors’ version
• Category: Thematic category of the question (e.g., collaboration, onboarding)
• Research Question: Associated research theme or inquiry
• Type: Data type or format of response (e.g., Likert, open-ended)

survey_response_keys.csv – Codebook mapping survey response keys to their descriptions
• Question: Corresponding question ID or text
• Key: Encoded value from the survey response (e.g., 1, A, Y)
• Desc: Description or label for the response option (e.g., Strongly Agree, Yes, Developer)

Jupyter Notebooks
• 1-analysis.ipynb – Main notebook for survey data analysis, visualization, and interpretation
• 2-repo_analysis.ipynb – Analysis of repository metadata such as stars, forks, issues, and contributor patterns

Survey Instruments
• contributors_survey_questions.md – Markdown file listing the questions shown to contributors
• maintainers_survey_questions.md – Markdown file listing the questions shown to maintainers
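
As a starting point for working with the CSV files described above, the following minimal sketch reads the codebook and the repository list using only the Python standard library. It assumes the CSV header names match the column labels listed in this description exactly (Question, Key, Desc, fork, language) and that the fork column holds the strings "true" or "false"; neither assumption has been checked against the files. The function names and the sample rows are hypothetical and are not taken from the dataset or its notebooks.

```python
import csv
import io
from collections import Counter


def load_response_keys(csv_file):
    """Build a {(question, key): description} lookup from codebook rows.

    Assumes header names Question, Key, Desc as listed in the description.
    """
    reader = csv.DictReader(csv_file)
    return {(row["Question"], row["Key"]): row["Desc"] for row in reader}


def count_repos_by_language(csv_file):
    """Count non-fork repositories per primary language.

    Assumes header names fork and language, with fork stored as text.
    """
    counts = Counter()
    for row in csv.DictReader(csv_file):
        if row["fork"].strip().lower() == "true":
            continue  # skip forks
        counts[row["language"] or "Unknown"] += 1
    return counts


# Made-up sample rows for illustration only; not taken from the dataset.
SAMPLE_KEYS = "Question,Key,Desc\nQ1,1,Strongly Agree\nQ1,2,Agree\n"
SAMPLE_REPOS = (
    "Full Name,fork,language\n"
    "owner/a,false,Python\n"
    "owner/b,true,Python\n"
    "owner/c,false,\n"
)

keys = load_response_keys(io.StringIO(SAMPLE_KEYS))
langs = count_repos_by_language(io.StringIO(SAMPLE_REPOS))
print(keys[("Q1", "1")])
print(dict(langs))
```

To run the same functions on the downloaded files, pass an open file handle instead of the in-memory samples, for example `open("repos.csv", newline="", encoding="utf-8")`. If the actual headers differ from the labels listed here, adjust the dictionary keys accordingly.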

Categories

Computer Science
