ghuser.io's database scripts

This repository provides scripts to update the database for the ghuser.io Reframe app. The database consists of JSON files. The production data is stored on AWS. The scripts expect the data at ~/data; set the GHUSER_DBDIR environment variable to override this location.
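As an illustration of the lookup described above, here is a minimal sketch of how a script might resolve the data directory. The helper name resolveDbDir is hypothetical and not taken from this repository; only the GHUSER_DBDIR variable and the ~/data default come from the text.

```javascript
const os = require('os');
const path = require('path');

// Hypothetical helper (not from the repository): returns GHUSER_DBDIR if it
// is set and non-empty, otherwise falls back to ~/data.
function resolveDbDir(env = process.env) {
  return env.GHUSER_DBDIR || path.join(os.homedir(), 'data');
}

console.log(resolveDbDir());
```

Passing the environment as a parameter keeps the helper easy to check without touching the real process environment.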

The fetchBot calls these scripts. It runs every few days on an EC2 instance.

Setup

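GitHub API keys (the client ID and client secret used under Usage) can be created in the developer settings of your GitHub account.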

$ npm install

Usage

Start tracking a user

$ ./addUser.js USER

Stop tracking a user

$ ./rmUser.js USER "you asked us to remove your profile in https://github.com/ghuser-io/ghuser.io/issues/666"

Refresh and clean data for all tracked users

$ export GITHUB_CLIENT_ID=0123456789abcdef0123
$ export GITHUB_CLIENT_SECRET=0123456789abcdef0123456789abcdef01234567
$ export GITHUB_USERNAME=AurelienLourot
$ export GITHUB_PASSWORD=********
$ ./fetchAndCalculateAll.sh
GitHub API key found.
GitHub credentials found.
...
/home/ubuntu/data/users
  2654 users
  largest: gdi2290.json (26 KB)
  total: 5846 KB
/home/ubuntu/data/contribs
  largest: orta.json (144 KB)
  total: 14 MB
/home/ubuntu/data/repos
  112924 repos
  65706 significant repos
  largest: jlord/patchwork.json (712 KB)
  total: 203 MB
/home/ubuntu/data/repoCommits
  largest: CocoaPods/Specs.json (3965 KB)
  total: 397 MB
/home/ubuntu/data/orgs
  11072 orgs
  largest: google-certified-mobile-web-specialists.json (445 B)
  total: 3520 KB
/home/ubuntu/data/nonOrgs.json: 252 KB
/home/ubuntu/data/meta.json: 49 B
total: 623 MB

=> 240 KB/user

real    449m19.774s
user    15m52.644s
sys     2m21.976s
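The "240 KB/user" figure in the sample output can be reproduced from the reported total and user count. The sketch below assumes 1 MB = 1024 KB; the exact rounding used by the statistics script is not shown here, and kbPerUser is a hypothetical name.

```javascript
// Figures copied from the sample output above.
const reportedTotalMB = 623;
const reportedUsers = 2654;

// Hypothetical helper: average database size per tracked user, in KB.
// Assumes 1 MB = 1024 KB.
function kbPerUser(totalMB, userCount) {
  return (totalMB * 1024) / userCount;
}

console.log(`${Math.round(kbPerUser(reportedTotalMB, reportedUsers))} KB/user`); // 240 KB/user
```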

Implementation

Several scripts form a pipeline for updating the database. Here is the data flow:

[ ./addUser.js myUser ]   [ ./rmUser.js myUser ]
                 │             │
                 v             v
              ┌───────────────────┐
              │ users/myuser.json │<───────────┐
              └────────────────┬──┘ │─┐        │
                └──────────────│────┘ │        │                    ╔════════╗
                  └────┬───────│──────┘        │                    ║ GitHub ║
                       │       │               │                    ╚════╤═══╝
                       │       v               │                         │
                       │   [ ./fetchUserDetailsAndContribs.js myUser ]<──┤
                       │                                                 │
                       ├────────────>[ ./fetchOrgs.js ]<─────────────────┤
                       │                   ^     ^                       │
                       │                   │     │                       │
                       │                   v     v                       │
                       │      ┌──────────────┐ ┌─────────────────┐       │
                       │      │ nonOrgs.json │ │ orgs/myOrg.json │─┐     │
                       │      └──────────────┘ └─────────────────┘ │─┐   │
                       │                         └─────────────────┘ │   │
                       │                           └──────────┬──────┘   │
                       │                                      │          │
                       ├──>[ ./fetchRepos.js ]<──────────────────────────┘
                       │             ^                        │
                       │             │                        │
                       │             v                        │
                       │  ┌───────────────────────────┐       │
                       │  │ repo*/myOwner/myRepo.json │─┐     │
                       │  └───────────────────────────┘ │─┐   │
                       │    └───────────────────────────┘ │   │
                       │      └────┬──────────────────────┘   │
                       │           │                          │
                       │           │          ┌───────────────┘
                       │           │          │
                       v           v          v
                   [ ./calculateContribsAndMeta.js ]
                           │               │
                           v               v
       ┌──────────────────────┐         ┌───────────┐
       │ contribs/myuser.json │─┐       │ meta.json │
       └──────────────────────┘ │─┐     └───────────┘
         └──────────────────────┘ │
           └──────────────────────┘

NOTES:

These scripts also delete unreferenced data.

Instead of calling each of these scripts directly, you can call ./fetchAndCalculateAll.sh which will orchestrate them.
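To make the ordering in the diagram concrete, here is a rough sketch that lists the pipeline commands in dependency order. It only builds command strings and runs nothing. The helper pipelineCommands is hypothetical, the relative order of fetchOrgs.js and fetchRepos.js is an assumption (the diagram shows both reading the user files independently), and the actual contents of fetchAndCalculateAll.sh may differ.

```javascript
// Hypothetical helper (not from the repository): returns the pipeline steps
// for the given users, in an order consistent with the data flow diagram.
function pipelineCommands(users) {
  return [
    // Each user's details and contributions are fetched first.
    ...users.map((user) => `./fetchUserDetailsAndContribs.js ${user}`),
    // Orgs and repos are derived from the user files (order assumed).
    './fetchOrgs.js',
    './fetchRepos.js',
    // The final step reads users, repos and orgs.
    './calculateContribsAndMeta.js',
  ];
}

pipelineCommands(['myUser']).forEach((cmd) => console.log(cmd));
```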

Production JSON files

The production JSON files are currently stored on S3 and exposed to the front end over HTTPS.

Every few days a backup named YYYY-MM-DD.tar.gz containing all the JSON files is created, e.g. 2018-10-07.tar.gz.
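The naming scheme can be sketched as follows. The helper backupName is hypothetical, and deriving the date in UTC is an assumption; the text only specifies the YYYY-MM-DD.tar.gz format.

```javascript
// Hypothetical helper (not from the repository): builds a backup file name
// in the YYYY-MM-DD.tar.gz format from the UTC date of the given Date.
function backupName(date = new Date()) {
  // toISOString() yields e.g. "2018-10-07T00:00:00.000Z"; keep the date part.
  return `${date.toISOString().slice(0, 10)}.tar.gz`;
}

console.log(backupName(new Date(Date.UTC(2018, 9, 7)))); // 2018-10-07.tar.gz
```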

Contributors

Thanks goes to these wonderful people (emoji key):

Aurelien Lourot 💬 💻 📖 👀
Charles 💻 📖 🤔
Romuald Brillout 🤔

This project follows the all-contributors specification. Contributions of any kind welcome!
