[Themaintainers] [EXTERNAL] Re: XKCD comic on maintenance

Don Goodman-Wilson don at maintainerati.org
Wed Aug 19 16:11:24 EDT 2020


Ed,

You’ll enjoy knowing that GH already parses dependency files for a wide range of languages and makes that data available over their Dependency Graph API. It’s got some curious bugs and caveats, and of course it only really documents what’s on GitHub, as you mentioned, but it’s still pretty useful. I’ve got some code for exploring the full dependency tree for a given project. It takes hours to pull down everything, but it is useful for discovering distant and sometimes surprising dependencies. It still doesn’t solve the main problem I raised earlier, though: uncovering even a subset of the packages that depend upon Package X would take ridiculous amounts of resources and a lot of manual intervention.

Still, if you’re interested I can invite you (or anyone else!) to the repo so you can try it out yourself.
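
For anyone who wants to see the shape of the inverse-lookup problem, here is a minimal sketch, not the code from the repo. It assumes the forward dependency data has already been pulled down into a plain dict; the package names and the helper names (invert, all_dependents) are made up for illustration.

```python
from collections import defaultdict, deque


def invert(depends_on):
    """Turn 'A hasDependency B' edges into 'B isDependencyOf A' edges."""
    dependents = defaultdict(set)
    for package, deps in depends_on.items():
        for dep in deps:
            dependents[dep].add(package)
    return dict(dependents)


def all_dependents(dependents, target):
    """Collect everything that depends on target, directly or transitively."""
    seen = set()
    queue = deque([target])
    while queue:
        current = queue.popleft()
        for parent in dependents.get(current, ()):
            if parent not in seen:
                seen.add(parent)
                queue.append(parent)
    return seen


# Hypothetical sample data, not real package metadata.
depends_on = {
    "app": {"thumbnailer", "web-framework"},
    "thumbnailer": {"imagemagick"},
    "web-framework": {"http-lib"},
}

print(sorted(all_dependents(invert(depends_on), "imagemagick")))
```

The inversion itself is cheap. The expensive part is collecting a forward graph complete enough that the inverted view can be trusted.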

Don GOODMAN-WILSON
Maintainerati Board
web: maintainerati.org
twitter: @DEGoodmanWilson
cal: calendly.com/degoodmanwilson/
On 19 Aug 2020, 20:06 +0200, Edward Summers <ehs at pobox.com>, wrote:
> Thanks Don, you make an excellent point.
>
> To know what software *depends on* a given piece of software (e.g. ImageMagick) requires that you build a network graph of all (or some relevant portion) of software dependencies, which then lets you infer the inverse relation:
>
> A hasDependency B ∴ B isDependencyOf A
>
> Building a complete view of the network of software dependencies would be very costly, and would be out of date the moment you "finished" it. But perhaps projects like GHTorrent [1] could help, since they have built a queryable database of GitHub repository metadata?
>
> I haven't used it for this purpose, but it might also be workable to infer some of the isDependencyOf relations using GitHub's search API [2], which lets you search the code in GitHub's repositories. But you would need custom logic for each type of build file you were interested in (Gemfile, package.json, requirements.txt, pom.xml, Cargo.toml, etc.), which could get tedious.
>
> The obvious caveat here is that as big as GitHub is, it does not completely represent the universe of software. Still, it could provide an interesting view?
>
> //Ed
>
> [1] https://ghtorrent.org/
> [2] https://docs.github.com/en/rest/reference/search