377 changes: 377 additions & 0 deletions pep-0581.rst
@@ -0,0 +1,377 @@
PEP: 581
Title: Using GitHub Issues for CPython
Version: $Revision$
Last-Modified: $Date$
Author: Mariatta Wijaya <mariatta@python.org>
Discussions-To: ``#pep581`` stream in zulip
Status: Draft
Type: Process
Content-Type: text/x-rst
Created: 20-Jun-2018


Abstract
========

This PEP outlines the steps required to migrate Python's issue tracker
from Roundup to using GitHub Issues.


Rationale
=========

CPython's development moved to GitHub in February 2017. All other projects within
the PSF's organization are hosted on GitHub and use GitHub issues.
CPython is still using Roundup as the issue tracker on bugs.python.org (bpo) [1]_.

Why GitHub
----------

GitHub has a lot of nice features readily available out of the box, that are not
currently available in Roundup / bpo.

- APIs we can use to build integrations and automations. There are various existing

Contributor

Perhaps add something more specific about creating and deploying bots similar to miss-islington that can provide automation related to CPython-specific workflow.

integrations and applications available from GitHub Marketplace to help with
our workflow. New applications are easily installed and enabled. In addition,
we've had great success with building our own GitHub bots, like miss-islington [14]_,
bedevere [15]_, and the-knights-who-say-ni [16]_.

- Ability to embed/drag and drop screenshots, debug log files into GitHub pull
requests and issues.

- Administrators and core developers can edit issues, comments, and pull requests.

- Ability to reply to issues and pull request conversations via email.

- It supports two factor authentication.

- It supports markdown and emoji.

Member

There is also a Preview tab, very useful for non-trivial formatting.

It also supports votes, but I'm not sure that it's an advantage :-D

Member Author

I like these features too. Will add these in my next iteration :) Thanks.


- It has a preview tab, so we can see how our comment will be rendered, before
actually posting.

- It supports voting (reactions).

- It supports permalink [2]_, allowing us to quote and copy paste
source code easily.

Member

One can embed links in bpo messages, and they are clickable.

Member

The specific type of quoting being talked about here is this one #681 (comment) which bpo does not support. Again, you have to click away to get to the code.

Member

After seeing yours and Zach's proper examples, and trying it for myself, I agree that link + quote of complete lines (typical for code) is easier. But it is pretty easy on bpo. So I would agree with "copy paste source code lines more easily."

Copying partial or complete sentences from docs, with extraneous material before and after the desired quote removed, as is typical for doc issues, requires a separate copy paste on either tracker.


- Core developers don't have to maintain the issue infrastructure/site, giving
us more time to focus on the development of Python.

- We can automatically close issues when a PR has been merged [3]_.

- Lowers the barrier to contribution. With more than 28 million users, an open
source contributor is more likely to already have an account and be familiar
with the GitHub interface, making it easier to start contributing.

- Email notifications contain metadata [4]_, integrate with Gmail, and
allow you to systematically filter emails.

- Provides additional privacy such as offering the user a choice to hide an
email address yet still allow communication with the user through @-mentions.

Issues with bpo / Roundup
-------------------------

- Fewer than five people maintain bpo. Some of them are core developers.

- It is in Mercurial. Without any CI available, it puts a heavy burden on the
few existing maintainers in terms of reviewing, testing, and applying patches.

- In its current state, it is not equipped to accept lots of contributions from
people who aren't already familiar with the code base.

- The upstream Roundup is in Mercurial. There is an open discussion about
moving the source code of bpo to GitHub [5]_. If the source code of
bpo does move to GitHub, it will become difficult to update patches from
upstream. But as long as it is in Mercurial, it will be difficult to maintain
and onboard new contributors.

- The user interface needs an update and a redesign. It will require UX/UI research

Contributor

...current web standards, including accessibility.

to keep it up to date with current web standards, including accessibility.

- Email address is exposed. There is no choice to mask it.

- There is no REST API available. There is an open issue in Roundup for adding
REST API [6]_. Last activity was in 2016.

- It sends a number of unnecessary emails and notifications, and it is difficult,
if not impossible, to configure. An example is the nosy email, where we get
an email notification whenever someone adds themselves as "nosy".
An issue about this has been open in upstream Roundup since 2012 with
little traction [7]_.

- Creating an account has been a hassle. We've had reports of people having
trouble creating accounts or logging in.

Why not GitLab

Contributor

Perhaps rename section to "Why GitHub over other third party git services"

--------------

Had we migrated to GitLab instead of GitHub in 2017, this PEP would have been
titled "Using GitLab Issues for CPython".

Member

You should mention somewhere "commercial" or "vendor lock-in" and explain why you consider that it's not a blocker issue. I'm sure that the question will arise.

Contributor

The repo exports that Ernest (has set up? is setting up?) are definitely worth mentioning.

Contributor

Ah, they're already mentioned later on in a slightly different context. Still, that's the key point for addressing any lock-in concerns: there are PSF hosted data backups, and the only thing that gets potentially tricky is tracking identity following a future migration away from GitHub, and the CLA signing process can cover that.

Why not other issue tracker

Contributor

Perhaps combine with the prior section.

---------------------------

Using another issue tracker will require yet another learning curve, of having

Member

Using Github to track issues has its own learning curve, do you mean it's less steep than other bug trackers?

Member Author

I meant that since we (existing Python core devs, The PSF, and contributors to Python) have been using GitHub for everything else, there is less learning to do, than if we're to use a different issue tracker.

to learn and get used to a different interface. We'll also need to learn and
figure out how to build the integrations with GitHub.

By using GitHub issues, where the CPython source code is currently hosted and where
pull requests are taking place, we'll provide a consistent experience to
contributors and maintainers, who won't have to jump from one interface to another.

Why not focus on improving Roundup / bpo
----------------------------------------

GitHub has many features we like that are already available. We still need to
build out additional integrations and update our bots, but this is something
we already know how to do.

In order to really improve Roundup / bpo, it needs to first migrate to GitHub,

Member

I think this is really only a concern with the roundup fork, which might not even be necessary anymore. The python-dev repo which hosts our custom plugins for roundup should be able to migrate to git without issue.

add CI and bots. As I understand it, there is hesitation because upstream Roundup
is still in Mercurial. Someone more familiar with Roundup / bpo needs
to champion this effort. (I'm not volunteering, I'm sorry).

I believe the effort of creating and maintaining GitHub integrations and bots
is much less than the effort needed to get Roundup up to speed and then to continue
maintaining it.

Member

Since GitHub is popular, there is also a lot of existing tooling around GitHub. I don't think that it's the case for Roundup. GitHub also has builtin webhooks and a wide choice of "services". Enabling a service is just 2 clicks.


Migration Plan
==============

Backup of GitHub data
---------------------

This effort has been started and is being tracked as an issue in core-workflow
[8]_. We're using the GitHub Migrations API [9]_
to download GitHub data for CPython on a daily basis. The archives will be
dropped in an S3 bucket.
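
As a rough illustration of the backup step, the sketch below builds (without
sending) the HTTP request that would start an organization migration archive
through the GitHub Migrations API [9]_. The helper name
``build_migration_request``, the placeholder token, and the use of ``urllib``
are assumptions made for this example; the actual backup job tracked in
core-workflow may look quite different.

```python
import json
from urllib.request import Request

API_ROOT = "https://api.github.com"


def build_migration_request(org, repositories, token):
    """Build (but do not send) a request that starts an org migration.

    Hypothetical helper: the name and signature are illustrative only.
    """
    payload = {
        # Repositories are given as "owner/name" strings.
        "repositories": [f"{org}/{name}" for name in repositories],
        # Do not lock the repositories while the archive is generated.
        "lock_repositories": False,
    }
    return Request(
        url=f"{API_ROOT}/orgs/{org}/migrations",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"token {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


# "<token>" is a placeholder; a real job would read it from a secret store.
request = build_migration_request("python", ["cpython"], token="<token>")
print(request.get_method(), request.full_url)
```

A scheduled job could send such a request once a day, wait until the archive
is ready, and then upload it to the S3 bucket; those steps are omitted here.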

Thanks to Ernest W. Durbin III for working on this.

Contributor

❤️


Update the CLA host

Member

This is a separate issue and does not belong here. If possible, it should be done independently of issue migration, and discussion requires a different group of people.

-------------------

At the moment the CLA is hosted within bpo. It needs to be updated such that
signing the CLA does not require a bpo account, and it should be hosted outside
of bpo.

The current CLA process itself is not ideal. Currently contributors to
devguide, peps, and core-workflow need to sign a CLA, and it requires a bpo
account. A bpo account should not be required for those projects.

Currently the CLA process involves a human manually checking the records.
Question: Will it be possible to completely automate the CLA process, so

Member

This would be a great time to start a thread with PSF legal on this. On the technical side, there are already great projects to handle this like http://clabot.github.io/

Member Author

I was thinking that this itself might require a separate PEP 😅

it does not require human intervention?

Create labels for issue triage
------------------------------

In bpo, we currently have the following fields for each issues:

Member

Replace 'each issues' with 'issue' or 'each issue'. As I said above, losing these fields is a big loss.

Reviewing and changing the options is something we have done periodically in the past and could do again independently of this issue.


Type: behavior, crash, compile error, resource usage, security, performance, enhancement.

Member

This list was once much smaller. We could discuss on committers list whether the added options are really useful.

Components: 2to3, Argument Clinic, asyncio, Build, Cross-build, ctypes, ...

Member

IMHO components will be useful on pull requests as well. We already have "type-doc" label for example. It helps when someone wants to review all documentation PRs for example.

Contributor

But the opportunity can potentially be taken to trim the component list

Member Author

I'm for trimming it :) but perhaps we'll need more discussions with other core devs.

Member

Another thread for committers list. I want to keep Idle, Tkinter, Windows, MacOS. We can ask whether anyone still wants 2to3, Argument Clinic, asyncio, ctypes, and some others.

Contributor

By using custom labels on GitHub, priority, components, and other bpo fields can be set on issue creation/updating (i.e priority: high, components: asyncio, components: ctypes). For example, adopting a naming convention for labels is a common practice in other open source projects hosted on GitHub.

Priority: release blocker, deferred blocker, critical, high, normal, low

Member

Ditto: when a PR has no associated bug, priority might be useful on PRs as well. But when a PR has an associated bug, the priority would be redundant and may become inconsistent if only updated on one side :-/

Contributor

Honestly, I think "release blocker" and "deferred blocker" are the only actually useful priority levels we currently have, and those will always be associated with issues rather than PRs.


Member

We need a field to specify the Python version that should be fixed, like the Python Versions field on bpo.

Contributor

I think we can use the existing backport labels for that - just set them on the issue to indicate which labels the PR should get.

Member Author

Yeah I plan to use the existing "needs backport to:" labels. These can be attached to the issue, and later automatically applied to the PR.

Member

Those could be shortened to 'Backport to: x.y'

We will create the corresponding labels::

Member

Hmm, this is a lot of labels. Have you given thought or experimented to see how well the Github interface handles this many labels?

Member Author (@Mariatta, Jul 10, 2018)

I haven't seen any mention of a limit on labels on GitHub anywhere. These are the labels created, but in reality not all of them will be applied to the issue/PR.
In coala we've been working with 65 labels, and we also use bots to automatically categorize PRs.

Member

Aah perfect, you already have a project that has a lot of labels 👍


    type-behavior, type-crash, type-compile error, type-resource usage, ...

    components-2to3, components-argument clinic, components-asyncio, ...

    priority-release blocker, priority-deferred blocker, priority-critical, ...

In addition, we'll create a ``needs triage`` label.
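
To make the naming convention concrete, here is a minimal sketch that derives
label names from bpo field values. The ``FIELD_PREFIXES`` table and the
``label_for`` and ``labels_to_create`` helpers are hypothetical names for this
example, and the option lists are shortened to values quoted above.

```python
# Hypothetical mapping from bpo field names to label prefixes.
FIELD_PREFIXES = {
    "Type": "type",
    "Components": "components",
    "Priority": "priority",
}

# Abbreviated option lists; the full bpo lists are longer.
FIELD_OPTIONS = {
    "Type": ["behavior", "crash", "compile error", "resource usage"],
    "Components": ["2to3", "Argument Clinic", "asyncio"],
    "Priority": ["release blocker", "deferred blocker", "critical"],
}


def label_for(field, value):
    """Return the label name for one bpo field value, e.g. 'type-crash'."""
    return f"{FIELD_PREFIXES[field]}-{value.strip().lower()}"


def labels_to_create(options=FIELD_OPTIONS):
    """List every label the repository would need, plus 'needs triage'."""
    labels = [
        label_for(field, value)
        for field, values in options.items()
        for value in values
    ]
    labels.append("needs triage")
    return labels


print(labels_to_create())
```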

Member

Please elaborate. How do you qualify that a bug is triaged or not?

Member Author

See "create issue template" below.

If the issue does not have at least one of type-, component, priority-, needs backport to- labels, then bot will apply needs triage label.

Member

I think there is an other issue, currently, there are 2 lists, the contributors and the members. do you think you will add a new list for the 'triage' ? on bpo, my bpo account can change the associated component of an issue, could I do the same here?

Member

I think it could be helpful to utilize GitHub's multiple issue templates (more info here). I found this to be really helpful on other OSS projects to keep general usage bugs out of the issue tracker by redirecting them elsewhere.

IMO, Babel does this really well (and has great emoji usage too).


Create issue templates
----------------------

We will create an issue template and add the following headers::

    ---
    Type: behavior | crash | compile error | resource usage (choose one)
    Components: 2to3 | Argument Clinic | asyncio | Build | ... (can select more than one)
    Priority: release blocker | deferred blocker | critical | ...
    Needs backport to: 2.7 | 3.6 | 3.7
    ---

The idea is to allow the issue creator to help us triage the issue.
The above values are pre-filled in the template. The issue creator will remove
the text that does not apply to their issue.

Based on the above headers, bedevere-bot can apply the necessary labels to the
issue. If the issue creator does not supply the above headers, the bot will apply
the ``needs triage`` label. At that point it will require a core developer to

Member

a core developer to properly label the issue, and what happens for the bpo accounts with the 'triage' flag? as mine.

properly label the issue.
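
The header-to-label step can be sketched as follows. This is only an
illustration of the idea, not bedevere's actual code: the function name
``labels_from_issue_body``, the ``PREFIXES`` table, and the exact spelling of
the backport labels are assumptions.

```python
import re

# Hypothetical mapping from header names to label prefixes.
PREFIXES = {
    "type": "type-",
    "components": "components-",
    "priority": "priority-",
    "needs backport to": "needs backport to ",
}

# Matches a trailing template hint such as "(choose one)".
HINT = re.compile(r"\([^)]*\)\s*$")


def labels_from_issue_body(body):
    """Return the labels implied by the '---' delimited header block."""
    lines = [line.strip() for line in body.splitlines()]
    try:
        start = lines.index("---")
        end = lines.index("---", start + 1)
    except ValueError:
        # No header block at all: leave the issue for manual triage.
        return ["needs triage"]

    labels = []
    for line in lines[start + 1:end]:
        key, sep, raw = line.partition(":")
        prefix = PREFIXES.get(key.strip().lower())
        if not sep or prefix is None:
            continue  # Not a header line we recognize.
        for value in HINT.sub("", raw).split("|"):
            value = value.strip().lower()
            if value and value != "...":
                labels.append(prefix + value)
    return labels or ["needs triage"]


example = """---
Type: crash
Components: asyncio | ctypes
Priority: high
Needs backport to: 3.7
---
The interpreter crashes on startup.
"""
print(labels_from_issue_body(example))
```

An issue that arrives without a recognizable header yields only
``needs triage``, which matches the fallback described above. A real
implementation would also need to decide what to do when the creator leaves
the pre-filled template untouched.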

We can also take advantage of GitHub's multiple issue template feature.

Updates to bedevere
-------------------

Bedevere-bot will need to be updated to recognize the issue headers described above,
and apply the proper labels.

Bedevere-bot can also copy over the labels to pull requests that correspond to
the issue.
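
Copying labels from an issue to its pull request could look roughly like this.
The helper ``labels_to_copy`` and the set of prefixes it filters on are
assumptions for the example; fetching and updating labels through the GitHub
API is left out.

```python
# Hypothetical prefixes of the triage labels worth mirroring onto a PR.
COPIED_PREFIXES = ("type-", "components-", "priority-", "needs backport to ")


def labels_to_copy(issue_labels, pr_labels):
    """Return triage labels present on the issue but missing from the PR."""
    existing = set(pr_labels)
    return [
        label
        for label in issue_labels
        if label.startswith(COPIED_PREFIXES) and label not in existing
    ]


issue_labels = ["type-crash", "components-asyncio", "needs triage"]
pr_labels = ["type-crash", "awaiting review"]
print(labels_to_copy(issue_labels, pr_labels))
```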

Update the devguide
-------------------

Explain the new issue workflow and triage labels in the devguide.

Add a button in bpo to migrate the issue to GitHub
--------------------------------------------------

This will require an actual update to bpo, but I believe the effort needed
is much less than a complete overhaul.

We will create a button in bpo, that will copy over the issue description
and associated comments into a GitHub issue.
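
One possible shape for the data the button would send is sketched below.
``BpoMessage``, ``BpoIssue``, and ``github_issue_payload`` are hypothetical
names, and folding every bpo message into a single issue body is just one
option (posting each message as a separate GitHub comment is another). The
``title`` and ``body`` keys are fields that GitHub's issue creation endpoint
accepts.

```python
from dataclasses import dataclass, field


@dataclass
class BpoMessage:
    """One message on a bpo issue (hypothetical structure)."""
    author: str
    date: str
    text: str


@dataclass
class BpoIssue:
    """A bpo issue; the first message is treated as the description."""
    number: int
    title: str
    messages: list = field(default_factory=list)


def github_issue_payload(issue):
    """Build the JSON payload for creating the corresponding GitHub issue."""
    sections = [f"Migrated from https://bugs.python.org/issue{issue.number}"]
    for message in issue.messages:
        sections.append(
            f"**{message.author}** wrote on {message.date}:\n\n{message.text}"
        )
    return {"title": issue.title, "body": "\n\n---\n\n".join(sections)}


# Made-up issue data, for illustration only.
issue = BpoIssue(
    number=12345,
    title="Example crash report",
    messages=[
        BpoMessage("reporter", "2018-06-20", "It crashes on startup."),
        BpoMessage("triager", "2018-06-21", "Confirmed on 3.7."),
    ],
)
payload = github_issue_payload(issue)
print(payload["body"])
```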

Member

What about the existing files, which are typically patch files? Create a comment to embed them in? Or put a link to the original in?


We should not be moving all open issues to GitHub. Issues with little or no
activity should just be closed. Issues with no decision made for years should
just be closed.

Member

No, No, No.

Closing bpo issues is a separate topic from and irrelevant to creating a usable GH cpython issue tracker. It can and has been done without this PEP. More could be done, but not so blindly as you propose. Doing so is a separate discussion, which I won't continue here.

Implementing this PEP is unaffected by bpo issue closures. The main effect of closure is limiting bpo search results. Bpo issue closures do not belong on this PEP and should be removed. Please don't repeat Chris Angelico's mistake of piggybacking a separate proposal onto his main proposal of assignment expressions.

Member Author

I'm assuming your good intention, but I think your last sentence is unnecessary.


If a core developer is still interested in the issue, they can indicate so in
the bpo issue, and later use the button to migrate it over to GitHub.

Make bpo readonly

Contributor

Move bpo to be an archival, read only project

-----------------

This should be the final step. Once we start using GitHub issues, make bpo
readonly rather than shutting it down.

Member

It was not practical to use hg and git simultaneously. In my view, and some others, apparently, github was not fully ready to use when the switch was made. I would say not until Miss Islington started doing backports. (They still take more time than on hg, but the difference is tolerable.) The result was that many people stopped patching python for various periods.

It would be practical to have both issue trackers in use for a short time. So I think disabling bpo should wait until the new system has been used and approved by people other than the creators. Unlike with the repository, testing can be done with real issues for cpython PRs.

So I think approval of this PEP should be provisional, subject to final acceptance review, which would then signal 'stop using bpo'. At that point, shut off new registrations and issues (which could be discouraged earlier), and new comments. 'Readonly' should mean that search still works, and that issues can still be closed instead of transferred.

Do not accept new registrations. Do not allow comments or issues to be created.

Member

An alternative is to keep bpo: have bpo and GH.

Member Author

No. Because then we need to support two different workflows and further complicate our life.

Contributor

To minimize the burden of maintaining two issue reporting/commenting workflows, one for GitHub and one for bpo, the following constraints will be made to bpo when GitHub is used for issues:

  • Disable new user registration to bpo.
  • Disable new issue creation on bpo.
  • Lock comments on existing bpo issues. Additional comments would be done on GitHub with a link to the archival bpo issue.



TBD and additional concerns
===========================

Expert index
------------

At the moment, there is a mechanism that automatically adds people listed in
the expert index to the nosy list. We need to replicate this functionality.
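
One way to approximate the nosy list behaviour on GitHub is to @-mention the
relevant experts when a component label is applied. The sketch below assumes a
hypothetical ``EXPERTS`` table keyed by component label, with made-up logins;
it is not an existing bedevere feature.

```python
# Hypothetical expert index: component label -> GitHub logins (made up).
EXPERTS = {
    "components-asyncio": ["example-dev-a", "example-dev-b"],
    "components-ctypes": ["example-dev-b", "example-dev-c"],
}


def experts_to_notify(labels, author, experts=EXPERTS):
    """Return the experts for the given labels, without duplicates.

    The issue author is skipped because they are already subscribed.
    """
    logins = []
    for label in labels:
        for login in experts.get(label, []):
            if login != author and login not in logins:
                logins.append(login)
    return logins


def mention_comment(logins):
    """Format a comment that notifies the given logins, or '' if none."""
    if not logins:
        return ""
    return "cc " + " ".join(f"@{login}" for login in logins)


labels = ["type-crash", "components-asyncio", "components-ctypes"]
print(mention_comment(experts_to_notify(labels, author="example-dev-a")))
```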

A GitHub account should not be a requirement
--------------------------------------------

When moving the CPython codebase from Mercurial to GitHub was discussed
([10]_ and [11]_), it was brought up that
we still need to allow uploading patches in bpo, and that a GitHub account should
not be a requirement in order to contribute to Python.

If bpo is made readonly, we'll need to come up with a different solution to allow
folks to contribute when they don't own a GitHub account.

One solution is to create a new "python-issues" mailing list, similar to
docs@python.org [12]_ mailing list, to allow people to submit their issues
there.

Related to this, since the migration to GitHub in 2017, I recall one case
[13]_ where a contributor submitted a patch to Mercurial and
refused to create a GitHub account. Because of this, our bot was unable to detect
whether they had signed the CLA. Another person volunteered to upload the
patch to GitHub, but we still required both people to sign the CLA.

That particular situation was complicated. It took up the time of five core
developers to investigate and manually check the CLA, and it caused confusion.

Trim off the "Components" list
------------------------------

Does the current "components" list still make sense, and is it still relevant?
Can we shorten the list?

Priority list
-------------

Is the current "priority" list useful? Nick Coghlan noted that perhaps only
``release blocker`` and ``deferred blocker`` are useful.

Further questions and discussions
---------------------------------

There is a dedicated `#pep581 <https://python.zulipchat.com/#narrow/stream/130206-pep581>`_
stream in python.zulipchat.com.


Acknowledgements
================

Thanks to Guido van Rossum, Brett Cannon, and Nick Coghlan who were consulted
in the early stage and research of this PEP. Their feedback, concerns, input,
and ideas have been valuable.


References
==========

.. [1] bugs.python.org
(https://bugs.python.org/)

.. [2] Getting permanent links to files
(https://help.github.com/articles/getting-permanent-links-to-files/)

.. [3] Closing issues using keywords
(https://help.github.com/articles/closing-issues-using-keywords/)

.. [4] About GitHub email notifications
(https://help.github.com/articles/about-email-notifications/)

.. [5] Consider whether or not to migrate bugs.python.org source code
to GitHub repo
(https://github.com/python/bugs.python.org/issues/2)

.. [6] Roundup issue 2550734 Expose roundup via a RESTful interface
(http://issues.roundup-tracker.org/issue2550734)

.. [7] Roundup issue 2550742 Do not send email by default when adding
or removing oneself from the Nosy list
(http://issues.roundup-tracker.org/issue2550742)

.. [8] Backup GitHub information
(https://github.com/python/core-workflow/issues/20)

.. [9] GitHub Migrations API
(https://developer.github.com/v3/migrations/orgs/)

.. [10] Python-committers email
(https://mail.python.org/pipermail/python-committers/2015-December/003642.html)

.. [11] Python-committers email
(https://mail.python.org/pipermail/python-committers/2015-December/003645.html)

.. [12] docs mailing list
(https://mail.python.org/mailman/listinfo/docs)

.. [13] CPython GitHub Pull request 1505
(https://github.com/python/cpython/pull/1505)

.. [14] miss-islington
(https://github.com/python/miss-islington)

.. [15] bedevere
(https://github.com/python/bedevere)

.. [16] the-knights-who-say-ni
(https://github.com/python/the-knights-who-say-ni)


Copyright
=========

This document has been placed in the public domain.



..
Local Variables:
mode: indented-text
indent-tabs-mode: nil
sentence-end-double-space: t
fill-column: 70
coding: utf-8
End: