andygrunwald / gerrie
A data crawler for Google's code review tool 'Gerrit'
Requires
- php: >=5.4.0
- kriswallsmith/buzz: 0.13
- symfony/class-loader: 2.6.*
- symfony/console: 2.6.*
- symfony/yaml: 2.6.*
Requires (Dev)
- phpunit/phpunit: 4.3.*
This package is not auto-updated.
Last update: 2024-12-21 16:10:21 UTC
README
Gerrie is a data and information crawler for Gerrit, a code review system developed by Google.
Gerrie uses the SSH and REST APIs offered by Gerrit to import data from Gerrit into an RDBMS. Currently only MySQL is supported. After the import, the data can be used for simple queries or complex analyses. One use case is analyzing communities that use Gerrit, such as TYPO3, Wikimedia, Android, Qt, Eclipse and many more.
- Website: andygrunwald.github.io/Gerrie
- Source code: Gerrie @ GitHub
- Documentation: Gerrie @ Read the Docs
Gerrie is deprecated: watson will replace Gerrie. Watson benefits from what we learned while developing and maintaining Gerrie at a larger (crawling) scale. Check out #17 for more information. Nevertheless, we still merge and support contributions to Gerrie.
Features
- Full imports
- Incremental imports
- Full support of SSH API
- Command line interface
- MySQL as storage backend
- Debugging functionality
- Logging functionality
- Fully documented
Getting started
Download application and install dependencies:
$ git clone https://github.com/andygrunwald/Gerrie.git .
$ composer install
Copy config file and adjust configuration (Database, SSH, Gerrit):
$ cp Config.yml.dist Config.yml
$ vim Config.yml
A minimalistic configuration for the TYPO3 Gerrit instance with the user max.mustermann can look like:
Database:
  Host: 127.0.0.1
  Username: root
  Password:
  Port: 3306
  Name: gerrie
SSH:
  KeyFile: /Users/max/.ssh/id_rsa_gerrie
Gerrit:
  TYPO3:
    - ssh://max.mustermann@review.typo3.org:29418/
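Each entry in the Gerrit section is an SSH URL that carries the user, host and port of the instance. As a minimal sketch, the shell function below splits such a URL into its parts. parse_gerrit_url is a hypothetical helper for illustration and not part of Gerrie; the fallback to port 29418 assumes Gerrit's default SSH port.

```shell
#!/bin/sh
# Hypothetical helper (not part of Gerrie): split an ssh:// URL from the
# Gerrit section of Config.yml into user, host and port.
parse_gerrit_url() {
    url=$1
    rest=${url#ssh://}       # drop the scheme
    rest=${rest%%/*}         # drop the trailing path
    user=${rest%%@*}         # text before "@"
    hostport=${rest#*@}      # text after "@"
    case $hostport in
        *:*) host=${hostport%%:*}; port=${hostport##*:} ;;
        *)   host=$hostport; port=29418 ;;   # assumed default Gerrit SSH port
    esac
    printf '%s %s %s\n' "$user" "$host" "$port"
}

# prints: max.mustermann review.typo3.org 29418
parse_gerrit_url "ssh://max.mustermann@review.typo3.org:29418/"
```

The sketch expects the user@host form shown in the example configuration and does not handle URLs without a user name.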
Create a new database named gerrie on your MySQL server and set up the database schema:
$ mysql -u root -e "CREATE DATABASE gerrie;"
$ ./gerrie gerrie:setup-database --config-file="./Config.yml"
Create an account (e.g. max.mustermann) in the Gerrit instance you want to crawl (e.g. review.typo3.org:29418), add your SSH public key to the Gerrit instance and execute the gerrie:check command to check your environment:
$ ./gerrie gerrie:check --config-file="./Config.yml"
Important: If your SSH key is protected by a passphrase, this check will ask you to enter the passphrase so that the private key can be used for this connection. Gerrie does not save this passphrase or transfer it to any foreign server. The private key is only necessary to authenticate against the Gerrit instance.
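If the check fails, it can help to test the SSH connection by hand, independently of Gerrie. The sketch below only prints the ssh command for such a test and does not open a connection itself. gerrit_check_command is a hypothetical helper and not part of Gerrie; "gerrit version" is a standard Gerrit SSH command, and the key path, user, host and port are the values from the example configuration.

```shell
#!/bin/sh
# Hypothetical helper (not part of Gerrie): print the ssh command for a
# manual connectivity test against a Gerrit instance.
gerrit_check_command() {
    key=$1    # private key file (SSH: KeyFile in Config.yml)
    user=$2   # Gerrit account name
    host=$3   # Gerrit host
    port=$4   # Gerrit SSH port
    printf 'ssh -i %s -p %s %s@%s gerrit version\n' "$key" "$port" "$user" "$host"
}

# prints: ssh -i /Users/max/.ssh/id_rsa_gerrie -p 29418 max.mustermann@review.typo3.org gerrit version
gerrit_check_command /Users/max/.ssh/id_rsa_gerrie max.mustermann review.typo3.org 29418
```

Running the printed command should report the Gerrit version if the account and the public key are set up correctly.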
If everything is fine, start crawling:
$ ./gerrie gerrie:crawl --config-file="./Config.yml"
Now the crawler starts and does its job 🍺
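For repeated runs, the check and crawl steps above can be chained in a small wrapper. This is a sketch under one assumption that the README does not confirm: that gerrie:check exits with a non-zero status when the environment is not ready. check_then_crawl and the GERRIE and CONFIG variables are hypothetical names chosen for illustration.

```shell
#!/bin/sh
# Hypothetical wrapper (not part of Gerrie): crawl only if the check passes.
# Assumption: gerrie:check returns a non-zero exit status on failure.
GERRIE=${GERRIE:-./gerrie}        # path to the gerrie executable
CONFIG=${CONFIG:-./Config.yml}    # path to the configuration file

check_then_crawl() {
    if ! "$GERRIE" gerrie:check --config-file="$CONFIG"; then
        echo "gerrie:check failed, not crawling" >&2
        return 1
    fi
    "$GERRIE" gerrie:crawl --config-file="$CONFIG"
}
```

Call check_then_crawl from the directory that contains the gerrie executable, or set GERRIE and CONFIG to the matching paths first.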
You can continue reading in the documentation, in the chapters Installation, Configuration, Commands, Database or Contributing.
Documentation
The complete and detailed documentation can be found at Gerrie @ Read the Docs. It is written in reStructuredText and shipped with the source code in the docs/ folder.
Source code
The source code can be found at andygrunwald/Gerrie @ GitHub.
Contributing
Contributions are welcome at any time.
Contributions are not limited to source code. Documentation, issues (bugs, new features, nice improvements), talks at user groups or conferences and so on are welcome as well. Our documentation contains more detailed information about contributing.
See Gerrie: Contribution @ Read the Docs.
License
This project is released under the terms of the MIT license.
Support, contact or feedback
If you have questions or feedback, are going crazy setting up or using this project, or want to drink a 🍺 and talk about this project, just contact me.
Write me an email (see Andy @ GitHub) or tweet me (@andygrunwald). And of course, you can just open an issue in the Gerrie tracker.