Cleaning Up GitHub (for Data Science) | by S. T. Lanier | Jan, 2021


GitHub, from its conception, has offered a fairly minimalistic structure. It lacks a real directory, or file, structure. For example, almost all of my 339 repositories are from Flatiron School’s Data Science Bootcamp, but I can’t just throw them all in a “Flatiron” folder because GitHub inherently lacks that structure. There are various options for mimicking that structure, though, with varying degrees of bastardization required of the tool’s original purpose.

For the purpose of cleaning up your repositories page, I think git subtree is the winner over git submodule, but I’m sure someone out there would have better use for git submodule. For a fairly detailed discussion of the difference between the two, read here, here, and here, but the main difference is this:

  • submodule leaves a pointer inside the outer repository that points to a specific commit in the inner repository (it doesn’t move the inner repo into the outer repo the way we think of a file moving into a folder) and has an explicit command in git, git submodule, making it easy to set up but hard to maintain thereafter;
  • subtree, on the other hand, actually moves the code of the inner repo into the outer repo, like moving a file into a folder, but it is not part of core git (git subtree ships as a contrib script), making it a little harder to set up but easier to maintain afterwards.
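
To make the contrast concrete, here is a minimal sketch in Python that only builds the two commands as argument lists; it does not run git. The repository URLs, folder names, and the "main" branch are hypothetical placeholders, and the helper names are my own, not part of git. Treat it as a dry-run planner under those assumptions.

```python
from typing import Dict, List


def submodule_add_cmd(repo_url: str, path: str) -> List[str]:
    """Build the command that records a pointer to the inner repo at `path`."""
    return ["git", "submodule", "add", repo_url, path]


def subtree_add_cmd(repo_url: str, prefix: str, ref: str,
                    squash: bool = True) -> List[str]:
    """Build the command that copies the inner repo's code into `prefix`."""
    cmd = ["git", "subtree", "add", "--prefix", prefix, repo_url, ref]
    if squash:
        # --squash collapses the inner repo's history into a single commit.
        cmd.append("--squash")
    return cmd


# Hypothetical inner repositories, keyed by the folder they should land in.
INNER_REPOS: Dict[str, str] = {
    "flatiron/lab-one": "https://github.com/example-user/lab-one.git",
    "flatiron/lab-two": "https://github.com/example-user/lab-two.git",
}


def plan_subtree_imports(repos: Dict[str, str],
                         ref: str = "main") -> List[List[str]]:
    """Return one `git subtree add` command per inner repository."""
    return [subtree_add_cmd(url, prefix, ref) for prefix, url in repos.items()]


if __name__ == "__main__":
    # Print the plan instead of executing it, so nothing is modified.
    for cmd in plan_subtree_imports(INNER_REPOS):
        print(" ".join(cmd))
```

Each list could be passed to subprocess.run from inside the outer repository once the plan looks right. Printing first is a deliberate choice, since a wrong prefix is tedious to undo across many repos.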

These two methods can quickly become time-consuming if you try to apply them across hundreds of repos, but you do get to keep your contributions graph.

Just from the name, GitHub Projects sounds promising. My instinct on hearing the name is something like “I get to group multiple repositories together under a single project,” but its functionality is much closer to a to-do list for keeping track of issues, pull requests, and notes. Still, I’ve heard of people using this feature as a way to organize repositories (up to five repos per project).

Organizations were the winner for me. You can create organizations to group repos under, and doing so 1) removes the repos from your repositories page and 2) lists the organization name instead. So in that important regard, this is arguably the closest you can come to mimicking a directory structure on GitHub. It looks like this:

New organization circled in red, bottom left corner. Image by author.
Hover effect for organizations. Image by author.

Hover your mouse over it, and it gives some information about the organization. In the image, it still says I have some 300 repositories, but that’s because I haven’t moved most of them over to the new organization yet. Transferring a repository into an organization, even one owned by you, removes the repository from the list of repositories on your profile page and removes any contributions made to those repositories from your contributions graph. For me, this was a small price to pay for a nice place to keep those 300 repositories tucked away together and out of sight, but I’m sure for some this is the worst of both worlds: the repos are still around and I lost the contributions graph.
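
Moving a few hundred repositories by hand through the settings page is slow, so here is a hedged sketch of how one transfer could be prepared against GitHub's REST API, which exposes a "transfer a repository" endpoint (POST /repos/{owner}/{repo}/transfer with a new_owner field). The owner, repository, organization, and token values are hypothetical placeholders, and build_transfer_request is my own helper name. The function only builds the request; nothing is sent until you pass it to urllib.request.urlopen.

```python
import json
import urllib.request

API_ROOT = "https://api.github.com"


def build_transfer_request(owner: str, repo: str, new_owner: str,
                           token: str) -> urllib.request.Request:
    """Prepare (but do not send) a request to transfer owner/repo to new_owner."""
    url = f"{API_ROOT}/repos/{owner}/{repo}/transfer"
    body = json.dumps({"new_owner": new_owner}).encode("utf-8")
    headers = {
        "Accept": "application/vnd.github+json",
        "Authorization": f"Bearer {token}",
        "Content-Type": "application/json",
    }
    return urllib.request.Request(url, data=body, headers=headers, method="POST")


if __name__ == "__main__":
    # Hypothetical values; replace them before sending anything.
    req = build_transfer_request("example-user", "lab-one",
                                 "example-flatiron-org", "YOUR_TOKEN")
    print(req.get_method(), req.full_url)
    print(req.data.decode("utf-8"))
```

The token would need permission to administer the repository and to create repositories in the target organization. Remember the trade-off described above before looping over everything: each transferred repo leaves your profile's repository list and its contributions drop off your graph.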

For an even better example, check out this article by Andrei Cioara.

