(dds_chapter)=
# Development Distribution Score

The distribution of responsibility and workload among contributors is essential to a stable community. A common way to estimate the risk resulting from information and capabilities not being shared among team members is the so-called [Bus Factor](https://en.wikipedia.org/wiki/Bus_factor). _The "bus factor" is the minimum number of team members that have to suddenly disappear from a project before the project stalls due to lack of knowledgeable or competent personnel._

**In this study, a proxy is developed to quantify the bus factor: the Development Distribution Score (DDS).** The DDS measures how development is distributed among project contributors by comparing the commits of the most active contributor with the commits of all contributors. The distribution of knowledge, work, and governance is critical to a project's long-term viability. When a project or organisation undergoes significant social or technological change (for example, when personnel leave a project or can no longer contribute), others need the knowledge and capacity to continue the initiative. The metric captures a project's reliance on a small number of contributors and, as a result, its resilience to change. Projects with a low DDS appear to be more vulnerable to decisions made by a single organisation or developer, which affect not only other developers and users but also the projects that depend on them.

**The commits of the strongest contributor are measured in relation to the total number of commits.** Although commits are not an absolute measure of an individual's performance within a project, they do reflect working relationships after a certain period of development. Furthermore, the score makes it possible to assess the status of a project without direct comparisons to other projects. The DDS value is calculated using the following formula:

```{figure} ../images/dds_calc.png
---
align: center
width: 80%
---
```
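
The calculation can be sketched in a few lines of Python. This is a minimal illustration, assuming the score is one minus the top contributor's share of all commits, which is consistent with the description above and with the worked example that follows. The function name and the commit list are hypothetical and not part of the study's code.

```python
from collections import Counter


def development_distribution_score(commit_authors):
    """Return 1 - (commits of the top contributor / total commits).

    Assumed formula, inferred from the surrounding text.
    """
    counts = Counter(commit_authors)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("at least one commit is required")
    top = max(counts.values())
    return 1 - top / total


# Hypothetical commit log: one author name per commit.
authors = ["alice"] * 9 + ["bob"]
dds = development_distribution_score(authors)
print(round(dds, 3))
```

Under this assumption, a project with a single contributor scores 0, and the score approaches 1 as commits are spread over more and more contributors.
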

**For instance, a DDS of 0.1 means that 90% of the commits come from a single developer.** Without the high engagement of that individual, it becomes challenging for the rest of the community to maintain and further develop the existing code base. The following table shows the median DDS for several groups of projects in the dataset.

In [1]:
import numpy as np
import pandas as pd
import plotly.io as pio
import plotly.graph_objects as go
import plotly.express as px
from opensustainTemplate import *

In [2]:
df_active = pd.read_csv("../csv/project_analysis.csv")
df_raw = pd.read_csv("../csv/projects.csv")

In [3]:
df_personal_projects = df_active[df_active["organization"].isna()]
df_organization_projects = df_active[df_active["organization"].notna()]
df_inactive = df_raw[(df_raw["project_active"] == False)]
df_top_stargazers = df_active[(df_active["stargazers_count"] > 100)]

fig = go.Figure(
    data=[
        go.Table(
            columnwidth=[100, 50],
            header=dict(
                values=["Group", "Median DDS"],
                line_color="#000000",
                fill_color="#ffffff",
                font_size=18,
            ),
            cells=dict(
                line_color="#ffffff",
                fill_color="#ffffff",
                font_size=16,
                height=30,
                values=[
                    [
                        "All projects",
                        "Active projects in personal namespace",
                        "Active organisation projects",
                        "Active projects",
                        "Inactive projects",
                        "Active projects with more than 100 stars",
                        "Projects with most contributors",
                    ],
                    [
                        round(df_raw["development_distribution_score"].median(), 3),
                        round(
                            df_personal_projects[
                                "development_distribution_score"
                            ].median(),
                            3,
                        ),
                        round(
                            df_organization_projects[
                                "development_distribution_score"
                            ].median(),
                            3,
                        ),
                        round(df_active["development_distribution_score"].median(), 3),
                        round(
                            df_inactive["development_distribution_score"].median(), 3
                        ),
                        round(
                            df_top_stargazers[
                                "development_distribution_score"
                            ].median(),
                            3,
                        ),
                        round(
                            df_active.nlargest(50, "contributors")[
                                "development_distribution_score"
                            ].median(),
                            3,
                        ),
                    ],
                ],
            ),
        )
    ]
)

fig.update_layout(height=255)
fig["layout"].update(margin=dict(l=20, r=20, b=0, t=20))
# Export the figure as SVG and let it resize with the page
config = {
    "toImageButtonOptions": {
        "format": "svg",  # one of png, svg, jpeg, webp
    },
    "responsive": True,
}
fig.show(config=config)

```{figure} data:image/gif;base64,R0lGODlhAQABAIAAAAAAAP///yH5BAEAAAAALAAAAAABAAEAAAIBRAA7
:figclass: caption-hack
:name: median-dds

\- Median Development Distribution Score within various groups of projects
```

**Across all active and inactive projects, the median DDS is 0.304. In other words, in half of the projects a single developer contributes roughly 70% or more of the commits.** For inactive projects, the median drops to 0.136, while active projects have a median DDS of 0.335. Projects within GitHub organisations score higher, with a median DDS of 0.405. Active projects with more than 100 stars have a median DDS of 0.415, and the 50 development communities with the most contributors have the highest median DDS of 0.688, **demonstrating that workload is more evenly distributed among individuals in a large development community**.
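
To make these medians easier to read, they can be converted into the share of commits made by the top contributor. The sketch below assumes the DDS is one minus that share, as described earlier in this chapter. The values are the medians quoted above, and the dictionary labels are shorthand chosen for this example.

```python
# Median DDS values quoted in the text above.
median_dds = {
    "All projects": 0.304,
    "Inactive projects": 0.136,
    "Active projects": 0.335,
    "Organisation projects": 0.405,
    "Most-starred projects": 0.415,
    "Projects with most contributors": 0.688,
}

# Assumed relation: top contributor share = 1 - DDS.
top_contributor_share = {
    group: round(1 - dds, 3) for group, dds in median_dds.items()
}

for group, share in top_contributor_share.items():
    print(f"{group}: {share:.1%} of commits from the top contributor")
```

Read this way, the median inactive project has a top contributor responsible for about 86% of commits, compared with about 31% in the communities with the most contributors.
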

In particular, the difference between inactive and active projects makes it clear that the DDS is an important indicator of the longevity of an open source project. However, a high DDS is not advantageous in every case. [Brooks' Law](https://en.wikipedia.org/wiki/Brooks's_law) is an observation about software project management according to which "adding manpower to a late software project makes it later". Especially in projects of high complexity, very large teams quickly run into overhead and communication problems. Under these conditions, distributing work among many contributors can become problematic. One solution is to split software projects into modular components that can be managed by smaller groups, an approach in line with the [Unix philosophy](https://en.wikipedia.org/wiki/Unix_philosophy) in software development.
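
One common way to illustrate this overhead is to count the possible pairwise communication channels in a team, which grow quadratically with team size. The sketch below is only an illustration of that arithmetic. The function name is hypothetical, and the pairwise model is a simplification, not a measurement from this study.

```python
def communication_channels(team_size):
    """Number of distinct pairs in a team: n * (n - 1) / 2."""
    if team_size < 0:
        raise ValueError("team_size must not be negative")
    return team_size * (team_size - 1) // 2


for n in (2, 5, 10, 50):
    print(f"{n} contributors: {communication_channels(n)} possible channels")
```

A team of 5 has 10 possible channels while a team of 50 has 1225, which is one reason why splitting a project into modules maintained by smaller groups can reduce coordination effort.
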

The following scatter diagram shows the distribution of the DDS across topics. Each circle represents a project, and the size of each circle is scaled relative to the project's size score.

In [4]:
fig = px.scatter(
    df_active,
    x="project_age_in_years",
    y="topic",
    size="size",
    color="development_distribution_score",
    color_continuous_scale=color_continuous_scale,
    custom_data=["project_name", "oneliner", "git_url"],
    size_max=10,
)

fig.update_layout(
    coloraxis_colorbar=dict(title='<a href="https://report.opensustain.tech/chapters/development-distribution-score.html" style="color: black">DDS</a>',
        orientation='h',
        y=-0.15,
    ),
    yaxis=dict(type="category", categoryorder="total ascending"),
    yaxis_title=None,
    xaxis_title="Project age in years",
    height=1100,
    # width=1210,
    title="Development Distribution Score within topics",
    hoverlabel=dict(
        bgcolor="white",
    ),
    dragmode=False,
)
fig.update_traces(
    hovertemplate="<br>".join(
        [
            "Project Name: <b>%{customdata[0]}</b>",
            "Project Info: <b>%{customdata[1]}</b>",
            "Git URL: <b>%{customdata[2]}</b>",
        ]
    )
)
fig.add_layout_image(
    dict(
        source=logo_img,
        xref="paper",
        yref="paper",
        x=1,
        y=1,
        sizex=0.05,
        sizey=0.05,
        xanchor="right",
        yanchor="top",
    )
)
fig["layout"].update(margin=dict(l=0, r=0, b=0, t=100))
fig["layout"]["xaxis"]["autorange"] = "reversed"

# Override the save-image button's options: export as SVG, resize with the page
config = {
    "toImageButtonOptions": {
        "format": "svg",  # one of png, svg, jpeg, webp
    },
    "responsive": True,
}
fig.show(config=config)

```{figure} data:image/gif;base64,R0lGODlhAQABAIAAAAAAAP///yH5BAEAAAAALAAAAAABAAEAAAIBRAA7
:figclass: caption-hack
:name: median-dds-overview

\- Development Distribution Score within topics
```
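
The per-topic view in the diagram can also be summarised numerically. The following sketch computes a median DDS per topic using only the standard library. The rows are made up for illustration, and only the column names `topic` and `development_distribution_score` come from the dataset used above.

```python
from collections import defaultdict
from statistics import median

# Made-up rows for illustration only; they are not taken from the dataset.
rows = [
    {"topic": "topic_a", "development_distribution_score": 0.10},
    {"topic": "topic_a", "development_distribution_score": 0.30},
    {"topic": "topic_b", "development_distribution_score": 0.50},
    {"topic": "topic_b", "development_distribution_score": 0.70},
    {"topic": "topic_b", "development_distribution_score": 0.60},
]

# Collect the scores of each topic.
scores_by_topic = defaultdict(list)
for row in rows:
    scores_by_topic[row["topic"]].append(row["development_distribution_score"])

# Median DDS per topic, highest first.
median_by_topic = sorted(
    ((topic, median(scores)) for topic, scores in scores_by_topic.items()),
    key=lambda item: item[1],
    reverse=True,
)

for topic, value in median_by_topic:
    print(f"{topic}: {value:.2f}")
```

With the DataFrame used in this chapter, a comparable summary could presumably be obtained with `df_active.groupby("topic")["development_distribution_score"].median()`.
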