Tools

  • maven-miner Mines Maven Central and creates a global dependency graph.

  • source{d} Engine Powerful language-agnostic analysis of your source code and git history.

  • reaper Calculates the score of a repository based on best engineering practices.

Datasets

  • GH Archive Records the public GitHub timeline, archives it, and makes it easily accessible for further analysis.

  • The GHTorrent project An effort to create a scalable, queryable, offline mirror of data offered through the GitHub REST API.

  • jsDelivr is a Content Delivery Network (CDN) that can be used to download GitHub files without any rate limit. (Ask Zimin how to get the difference between two versions of the same file.)
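
As a rough illustration of the jsDelivr entry above, the sketch below builds a jsDelivr URL for a file in a GitHub repository and diffs two versions of that file. It assumes the public URL scheme https://cdn.jsdelivr.net/gh/<owner>/<repo>@<ref>/<path>. The helper names (jsdelivr_url, fetch_text, unified_diff, diff_versions) are made up for this example, and the diff uses Python's standard difflib module, which is not necessarily the approach Zimin uses.

```python
import difflib
from urllib.request import urlopen

# Assumed base URL of jsDelivr's GitHub endpoint.
JSDELIVR_GH = "https://cdn.jsdelivr.net/gh"


def jsdelivr_url(owner: str, repo: str, ref: str, path: str) -> str:
    """Build the jsDelivr URL of one file at a given tag, branch, or commit."""
    return f"{JSDELIVR_GH}/{owner}/{repo}@{ref}/{path.lstrip('/')}"


def fetch_text(url: str) -> str:
    """Download a file and decode it as UTF-8 (assumes a text file)."""
    with urlopen(url, timeout=30) as response:
        return response.read().decode("utf-8")


def unified_diff(old_text: str, new_text: str, old_label: str, new_label: str) -> str:
    """Return a unified diff between two versions of the same text."""
    lines = difflib.unified_diff(
        old_text.splitlines(keepends=True),
        new_text.splitlines(keepends=True),
        fromfile=old_label,
        tofile=new_label,
    )
    return "".join(lines)


def diff_versions(owner: str, repo: str, old_ref: str, new_ref: str, path: str) -> str:
    """Fetch one file at two refs through jsDelivr and diff the two versions."""
    old_url = jsdelivr_url(owner, repo, old_ref, path)
    new_url = jsdelivr_url(owner, repo, new_ref, path)
    return unified_diff(fetch_text(old_url), fetch_text(new_url), old_url, new_url)
```

For example, diff_versions("jquery", "jquery", "3.6.3", "3.6.4", "src/core.js") would request both versions of the file over the network and return their unified diff as a string. The repository, tags, and path here are only placeholders for whatever project is being studied.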

Influential papers