# Ranking

Ranking all projects by a total score can provide a much deeper understanding of the ecosystem. While quantifying the state and health of a project remains challenging, using a multidimensional index creates a more comprehensive picture. A repository's total score is a composite index of three dimensions: **size**, **community** and **activity**. Each dimension contains several indicators, represented by an index used to rank projects relative to each other. Each dimension index is referred to as a score.

```{figure} ../images/ranking_calc.png
---
align: center
width: 100%
---
\- The relationship between the dimensions and indicators within the ranking. 
```

For example, the activity score ranks each project using an index composed of _Total Commits Last Year_, _Issues Closed Last Year_, _Days Until Last Issue Closed_, and _Last Release Date_. Each indicator is converted to a percentile rank and the ranks are averaged, so the score lies between 0 and 1. The three dimension scores (activity, community, and size) are each scaled by their maximum and then averaged with equal weight; the result is referred to as the "total score". The figure above shows the relationship between the dimensions and indicators within the ranking (for implementation details, see this [code cell](#dimensions-and-calculations)).
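The percentile ranking behind a dimension score can be illustrated with a small sketch. The project names and values below are invented, and only two of the four activity indicators are used to keep the example short; it is not the report's full calculation.

```python
import pandas as pd

# Toy data: project names and values are invented for illustration only.
toy = pd.DataFrame(
    {
        "project_name": ["alpha", "beta", "gamma", "delta"],
        "total_commits_last_year": [10, 250, 40, 900],
        "issues_closed_last_year": [2, 80, 80, 15],
    }
)

# rank(pct=True) converts raw values into percentile ranks in (0, 1];
# tied values share the average of the ranks they span.
commit_rank = toy["total_commits_last_year"].rank(pct=True)
issue_rank = toy["issues_closed_last_year"].rank(pct=True)

# Averaging the percentile ranks gives a score that also stays in (0, 1].
toy["activity_sketch"] = (commit_rank + issue_rank) / 2

print(toy.sort_values("activity_sketch", ascending=False))
```

In this toy data, `delta` has by far the most commits but `beta` ends up first, because a percentile rank rewards being ahead of other projects on every indicator rather than being far ahead on one.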

Unlike [stars](./popularity.ipynb), which provide insight into a project's overall popularity, ranking by total score reveals less popular but otherwise strong projects. For example, larger projects like [EnergyPlus](https://github.com/NREL/EnergyPlus) rise to the top. However, as with any index, there are limitations. In this case, monolithic software developments have a higher probability of achieving a high score, meaning that projects which rely more on a modular approach (i.e., projects distributed across multiple repositories) may be significantly underrepresented.
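One way to look for such projects is to compare each project's position in the two rankings. The sketch below uses invented values, and the column name `stars` is an assumption made for illustration; the actual dataset may name or store this field differently.

```python
import pandas as pd

# Invented values; "stars" is a hypothetical column name.
toy = pd.DataFrame(
    {
        "project_name": ["alpha", "beta", "gamma", "delta", "epsilon"],
        "stars": [5200, 150, 980, 60, 2300],
        "total_score": [0.71, 0.93, 0.64, 0.88, 0.90],
    }
)

# Position of each project in both rankings (1 = best).
toy["star_position"] = toy["stars"].rank(ascending=False).astype(int)
toy["score_position"] = toy["total_score"].rank(ascending=False).astype(int)

# Positive values mean the project places higher by total score than by stars.
toy["positions_gained"] = toy["star_position"] - toy["score_position"]

underrated = toy[toy["positions_gained"] > 0].sort_values(
    "positions_gained", ascending=False
)
print(underrated[["project_name", "star_position", "score_position"]])
```

Projects with a large positive `positions_gained` are the ones a star-based view would overlook.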

Ranking the projects by their [activity](#activity-score) score highlights young projects that are still developing rapidly. [DeepTreeAttention](https://github.com/weecology/DeepTreeAttention), which is mainly developed by a single person, stands out in particular. Other young projects such as [Ozon3](https://github.com/Ozon3Org/Ozon3), [cmip6-downscaling](https://github.com/carbonplan/cmip6-downscaling) or [PowerSimulations.jl](https://github.com/NREL-SIIP/PowerSimulations.jl) not only show very high activity; their Development Distribution Score (DDS) also indicates strong growth of their communities.

The real value of such health analytics comes into play when development and community data is combined with usage data. Unfortunately, this data is currently only available to a limited extent via Python dependencies. Further work is required to extend usage metrics to include other software package managers and survey methods.

`````{admonition} Tip
:class: tip
Click the project name and go directly to the repository.
`````

In [17]:
import numpy as np
import pandas as pd
import plotly.io as pio
import plotly.graph_objects as go
import plotly.express as px
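# Wildcard import from the report template; color_continuous_scale and logo_img used below come from it.
from opensustainTemplate import *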

In [18]:
df_active = pd.read_csv("../csv/project_analysis.csv")

In [19]:
df_total_score = df_active.nlargest(40, "total_score")

fig = px.bar(
    df_total_score,
    x=df_total_score["total_score"],
    y=df_total_score["project_name"],
    orientation="h",
    range_x=(0.85, 0.96),
    custom_data=["oneliner", "topic", "git_url"],
    color=df_total_score["development_distribution_score"],
    color_continuous_scale=color_continuous_scale,
)

fig.update_layout(
    height=1000,  # Added parameter
    #width=600,
    xaxis_title="Total Score",
    yaxis_title=None,
    title="Top 40 total score",
    coloraxis_colorbar=dict(
        title='<a href="https://report.opensustain.tech/chapters/development-distribution-score.html" style="color: black">DDS</a>',
        orientation='h',
        y=-0.15,
        x=0.4
    ),
    hoverlabel=dict(bgcolor="white"),
)
fig.update(layout_showlegend=False)
fig["layout"].update(margin=dict(l=200, r=0, b=0, t=40))

fig.add_layout_image(
    dict(
        source=logo_img,
        xref="paper",
        yref="paper",
        x=1,
        y=0,
        sizex=0.05,
        sizey=0.05,
        xanchor="right",
        yanchor="bottom",
    )
)

fig.update_traces(
    hovertemplate="<br>".join(
        [
            "Project Info: <b>%{customdata[0]}</b>",
            "Topic: <b>%{customdata[1]}</b>",
            "Git URL: <b>%{customdata[2]}</b>",
        ]
    )
)
fig["layout"]["yaxis"]["autorange"] = "reversed"
config = {
  'toImageButtonOptions': {
    'format': 'svg', # one of png, svg, jpeg, webp
  },
  'responsive': True
}
fig.show(config=config)

```{figure} data:image/gif;base64,R0lGODlhAQABAIAAAAAAAP///yH5BAEAAAAALAAAAAABAAEAAAIBRAA7
:figclass: caption-hack
:name: total-score
\- The 40 projects with the highest total score
```

In [20]:
df_activity_score = df_active.nlargest(40, "activity")

fig = px.bar(
    df_activity_score,
    x=df_activity_score["activity"],
    y=df_activity_score["project_name"],
    orientation="h",
    custom_data=["oneliner", "topic", "git_url"],
    color=df_activity_score["development_distribution_score"],
    color_continuous_scale=color_continuous_scale,
    range_x=(0.7, 0.9),
)

fig.update_layout(
    height=1000,  # Added parameter
    #width=600,
    xaxis_title="Activity score",
    yaxis_title=None,
    title="Top 40 activity score",
    coloraxis_colorbar=dict(
        title='<a href="https://report.opensustain.tech/chapters/development-distribution-score.html" style="color: black">DDS</a>',
        orientation='h',
        y=-0.15,
        x=0.4
    ),
    hoverlabel=dict(
        bgcolor="white",
    ),
    dragmode=False,
)
fig.update(layout_showlegend=False)
fig["layout"].update(margin=dict(l=200, r=0, b=0, t=40))

fig.add_layout_image(
    dict(
        source=logo_img,
        xref="paper",
        yref="paper",
        x=1,
        y=0,
        sizex=0.05,
        sizey=0.05,
        xanchor="right",
        yanchor="bottom",
    )
)

fig.update_traces(
    hovertemplate="<br>".join(
        [
            "Project Info: <b>%{customdata[0]}</b>",
            "Topic: <b>%{customdata[1]}</b>",
            "Git URL: <b>%{customdata[2]}</b>",
        ]
    )
)
fig["layout"]["yaxis"]["autorange"] = "reversed"
config = {
  'toImageButtonOptions': {
    'format': 'svg', # one of png, svg, jpeg, webp
  },
  'responsive': True
}
fig.show(config=config)

```{figure} data:image/gif;base64,R0lGODlhAQABAIAAAAAAAP///yH5BAEAAAAALAAAAAABAAEAAAIBRAA7
:figclass: caption-hack
:name: activity-score
\- The 40 projects with the highest activity score
```

In [21]:
df_community_score = df_active.nlargest(40, "community")

fig = px.bar(
    df_community_score,
    x=df_community_score["community"],
    y=df_community_score["project_name"],
    orientation="h",
    range_x=(0.85, 1),
    custom_data=["oneliner", "topic", "git_url"],
    color=df_community_score["development_distribution_score"],
    color_continuous_scale=color_continuous_scale,
)

fig.update_layout(
    height=1000,  # Added parameter
    #width=600,
    xaxis_title="Community score",
    yaxis_title=None,
    title="Top 40 community score",
    coloraxis_colorbar=dict(
        title='<a href="https://report.opensustain.tech/chapters/development-distribution-score.html" style="color: black">DDS</a>',
        orientation='h',
        y=-0.15,
        x=0.4
    ),
    hoverlabel=dict(
        bgcolor="white",
    ),
    dragmode=False,
)
fig.update(layout_showlegend=False)
fig["layout"].update(margin=dict(l=200, r=0, b=0, t=40))

fig.add_layout_image(
    dict(
        source=logo_img,
        xref="paper",
        yref="paper",
        x=1,
        y=0,
        sizex=0.05,
        sizey=0.05,
        xanchor="right",
        yanchor="bottom",
    )
)

fig.update_traces(
    hovertemplate="<br>".join(
        [
            "Project Info: <b>%{customdata[0]}</b>",
            "Topic: <b>%{customdata[1]}</b>",
            "Git URL: <b>%{customdata[2]}</b>",
        ]
    )
)
fig["layout"].update(margin=dict(l=30, r=0, b=0, t=40))
fig["layout"]["yaxis"]["autorange"] = "reversed"
config = {
  'toImageButtonOptions': {
    'format': 'svg', # one of png, svg, jpeg, webp
  },
  'responsive': True
}
fig.show(config=config)

```{figure} data:image/gif;base64,R0lGODlhAQABAIAAAAAAAP///yH5BAEAAAAALAAAAAABAAEAAAIBRAA7
:figclass: caption-hack
:name: community-score
\- The 40 projects with the highest community score
```

In [22]:
df_size_score = df_active.nlargest(40, "size")

fig = px.bar(
    df_size_score,
    x=df_size_score["size"],
    y=df_size_score["project_name"],
    orientation="h",
    range_x=(0.93, 1),
    custom_data=["oneliner", "topic", "git_url"],
    color=df_size_score["development_distribution_score"],
    color_continuous_scale=color_continuous_scale,
)

fig.update_layout(
    height=1000,  # Added parameter
    #width=600,
    xaxis_title="Size score",
    yaxis_title=None,
    title="Top 40 size score",
    coloraxis_colorbar=dict(
        title='<a href="https://report.opensustain.tech/chapters/development-distribution-score.html" style="color: black">DDS</a>',
        orientation='h',
        y=-0.15,
        x=0.4
    ),
    dragmode=False,
    hoverlabel=dict(bgcolor="white"),
)
fig.update(layout_showlegend=False)
fig.add_layout_image(
    dict(
        source=logo_img,
        xref="paper",
        yref="paper",
        x=1,
        y=0,
        sizex=0.05,
        sizey=0.05,
        xanchor="right",
        yanchor="bottom",
    )
)

fig.update_traces(
    hovertemplate="<br>".join(
        [
            "Project Info: <b>%{customdata[0]}</b>",
            "Topic: <b>%{customdata[1]}</b>",
            "Git URL: <b>%{customdata[2]}</b>",
        ]
    )
)
fig["layout"].update(margin=dict(l=30, r=0, b=0, t=40))
fig["layout"]["yaxis"]["autorange"] = "reversed"
config = {
  'toImageButtonOptions': {
    'format': 'svg', # one of png, svg, jpeg, webp
  },
  'responsive': True
}
fig.show(config=config)

```{figure} data:image/gif;base64,R0lGODlhAQABAIAAAAAAAP///yH5BAEAAAAALAAAAAABAAEAAAIBRAA7
:figclass: caption-hack
:name: size-score

\- The 40 projects with the highest size score
```
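The four charts above differ only in the score column, the titles and the x-axis range, so the shared steps could be collected in one function. The following is a minimal sketch under that assumption: `top_projects` and `score_bar_chart` are hypothetical names, and the colour scale, colour bar link, logo image and hover template from the cells above are left out for brevity.

```python
import pandas as pd


def top_projects(df: pd.DataFrame, column: str, n: int = 40) -> pd.DataFrame:
    """Return the n rows with the largest value in `column`, best first."""
    return df.nlargest(n, column)


def score_bar_chart(df, column, axis_title, chart_title, range_x, n=40):
    """Hypothetical helper: draw one horizontal 'top n' bar chart."""
    # Imported lazily so that top_projects can be used without Plotly installed.
    import plotly.express as px

    top = top_projects(df, column, n)
    fig = px.bar(
        top,
        x=column,
        y="project_name",
        orientation="h",
        range_x=range_x,
        custom_data=["oneliner", "topic", "git_url"],
        color="development_distribution_score",
    )
    fig.update_layout(
        height=1000,
        xaxis_title=axis_title,
        yaxis_title=None,
        title=chart_title,
        hoverlabel=dict(bgcolor="white"),
        dragmode=False,
        showlegend=False,
    )
    # Show the best project at the top of the chart.
    fig.update_yaxes(autorange="reversed")
    return fig
```

With such a helper, the activity chart would reduce to a call like `score_bar_chart(df_active, "activity", "Activity score", "Top 40 activity score", (0.7, 0.9))`, followed by `fig.show(config=config)` as in the cells above.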

## Dimensions and Calculations

```python
# Each project is ranked on several indicators within the dimensions of community, activity and size.
# Ranks are expressed as percentiles: 1 is the highest rank and values close to 0 the lowest.
# Within each dimension, the percentile ranks are averaged to give the dimension score.
df_active["activity"] = (
    df_active["total_commits_last_year"].rank(pct=True)
    + df_active["issues_closed_last_year"].rank(pct=True)
    + df_active["days_until_last_issue_closed"].rank(pct=True)
    + df_active["last_released_date"].rank(pct=True, na_option="top")
) / 4

df_active["community"] = (
    df_active["contributors"].rank(pct=True)
    + df_active["development_distribution_score"].rank(pct=True)
    + df_active["reviews_per_pr"].rank(pct=True)
) / 3

df_active["size"] = (
    df_active["total_number_of_commits"].rank(pct=True)
    + df_active["contributors"].rank(pct=True)
    + df_active["closed_issues"].rank(pct=True)
    + df_active["closed_pullrequests"].rank(pct=True)
) / 4

# Each dimension score is divided by its maximum and the three results are averaged,
# so a total score of 1 would mean leading all three dimensions.
df_active["total_score"] = (
    df_active["activity"] / df_active["activity"].max()
    + df_active["community"] / df_active["community"].max()
    + df_active["size"] / df_active["size"].max()
) / 3
```
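Two details of the calculation above are easy to miss: how `na_option="top"` treats projects without a release, and what dividing by the column maximum does to the total score. The sketch below illustrates both on invented values; it is not taken from the project data.

```python
import pandas as pd

# Invented release dates; None marks a project without any release.
dates = pd.Series(
    pd.to_datetime(["2023-05-01", None, "2021-01-15", "2022-11-30"]),
    index=["alpha", "beta", "gamma", "delta"],
)

# na_option="top" gives missing values the smallest rank, so a project
# without a release lands at the bottom of this indicator.
release_rank = dates.rank(pct=True, na_option="top")
print(release_rank)

# Invented dimension scores for three projects.
scores = pd.DataFrame(
    {
        "activity": [0.80, 0.60, 0.40],
        "community": [0.50, 0.90, 0.45],
        "size": [0.70, 0.70, 0.35],
    },
    index=["alpha", "beta", "gamma"],
)

# Dividing each column by its maximum puts the best project of every
# dimension at 1; averaging the columns then gives the total score.
scaled = scores / scores.max()
total = scaled.mean(axis=1)
print(total.sort_values(ascending=False))
```

In this toy data no project leads all three dimensions, so even the best total score stays below 1. This is consistent with the x-axis of the total score chart above, which ends at 0.96.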