keboola / storage-driver-bigquery
Keboola BigQuery driver
Requires
- php: ^8.1
- ext-json: *
- google/apiclient: ^2.12.1
- google/apiclient-services: 0.282.0
- google/cloud-bigquery: ^1.23
- google/cloud-bigquery-analyticshub: ^0.1.0
- google/cloud-billing: ^1.4
- google/cloud-resource-manager: ^0.3.5
- google/cloud-service-usage: ^0.2.7
- google/protobuf: ^3.21
- keboola/db-import-export: ^2.0
- keboola/php-file-storage-utils: ^0.2.5
- keboola/retry: ^0.5.1
- keboola/storage-driver-common: >=6.2
- keboola/table-backend-utils: >=2.2.1
- psr/log: ^1.1|^2.0|^3.0
- react/async: ^3.0
- symfony/polyfill-php80: ^1.26
Requires (Dev)
- brianium/paratest: ^6.10
- keboola/coding-standard: ^14.0
- keboola/phpunit-retry-annotations: ^0.4.0
- php-parallel-lint/php-parallel-lint: ^1.3
- phpstan/phpstan: ^1.8
- phpstan/phpstan-phpunit: ^1.1
- phpstan/phpstan-symfony: ^1.2
- phpunit/phpunit: ^9.5
- symfony/finder: ^5.4
- symfony/lock: ^6.3
This package is auto-updated.
Last update: 2023-11-27 11:13:35 UTC
README
Keboola high-level storage backend driver for BigQuery
Set up BigQuery
Install Google Cloud client (via Brew), initialize it and log in to generate default credentials.
To prepare the backend, use the Terraform template. Create a subfolder in the KBC Team Dev folder (id: 431160969986) and pass that subfolder's ID to the terraform command.
1. Get the missing pieces (organization_id and billing_id) from the Connection repository.
2. (Optional) Move bq-storage-backend-init.tf out of the project directory so that new files stay out of git.
3. Run
terraform init
4. Run
terraform apply -var folder_id=[folder_id] -var billing_account_id=[billing_id] -var backend_prefix=<your prefix, eg. js-driver-bq>
5. After terraform apply finishes, go to the service project in your folder.
   i. Open the newly created service project. Its project ID (service_project_id) is listed at the end of the terraform output; the URL is typically https://console.cloud.google.com/welcome?project=<service_project_id>
   ii. Click IAM & Admin.
   iii. In the left panel, choose Service Accounts.
   iv. Click the email of the service account (there is only one, something like js-bq-driver-main-service-acc@js-bq-driver-bq-driver.iam.gserviceaccount.com).
   v. At the top, choose Keys, then Add Key => Create new key.
   vi. Select key type JSON.
   vii. Click the Create button; the key file is downloaded automatically.
6. Open keyFile.json (the downloaded key), set the content of private_key as the variable BQ_SECRET in .env, and remove the whole private_key entry from the JSON file.
   - Note: cut and paste the value whole, including the quotes and new lines, so that your .env looks like
   BQ_SECRET="-----BEGIN PRIVATE KEY-----XXXXZQ==\n-----END PRIVATE KEY-----\n"
7. Remove the line breaks from the rest of the key file (without the private_key entry) and set the resulting string as the variable BQ_PRINCIPAL in .env.
   - You can convert the key to a single-line string with
   awk -v RS= '{$1=$1}1' <key_file>.json
8. Create a new key as described in 5.i - vii, remove its line breaks, and set it as BQ_KEY_FILE (this time including the private_key entry).
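The key-splitting steps above can also be scripted. The following is a minimal Python sketch, not part of this repository: the function names `split_key_file` and `single_line` are hypothetical, and it assumes the downloaded file is a standard JSON service account key containing a `private_key` entry. It relies on `json.dumps` writing the private key as a quoted string with `\n` escapes, which is the form shown for BQ_SECRET.

```python
import json
from typing import Tuple


def split_key_file(key_json: str) -> Tuple[str, str]:
    """Return (BQ_PRINCIPAL, BQ_SECRET) for one downloaded key file.

    Hypothetical helper: assumes key_json is the text of a JSON service
    account key that contains a private_key entry.
    """
    key = json.loads(key_json)
    # BQ_SECRET: the private_key value as a quoted JSON string, so the
    # line breaks inside the key are written as \n escapes.
    secret = json.dumps(key.pop("private_key"))
    # BQ_PRINCIPAL: everything that is left, as single-line JSON.
    principal = json.dumps(key)
    return principal, secret


def single_line(key_json: str) -> str:
    """Return the whole key (private_key included) on one line, for BQ_KEY_FILE."""
    return json.dumps(json.loads(key_json))
```

For example, `principal, secret = split_key_file(open("keyFile.json").read())` gives the two values for the first key, and `single_line(...)` applied to the second downloaded key gives the BQ_KEY_FILE value.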
At the end, your .env file should look like this:
# the id is printed by terraform at the end and it is just the numbers after `folders/`
BQ_PRINCIPAL=<the content of the downloaded json key file as single line without private_key entry>
BQ_SECRET=<private_key from downloaded json key file (taken from BQ_PRINCIPAL)>
BQ_FOLDER_ID=<TF output file_storage_bucket_id : the id of the created folder, just the number, without /folders prefix>
BQ_BUCKET_NAME=<TF output file_storage_bucket_id : bucket id created in main project>
# choose a different BQ_STACK_PREFIX than your Terraform prefix, otherwise the project created by Terraform will be deleted. e.g. local :)
BQ_STACK_PREFIX=local
BQ_KEY_FILE=<key file json owned by main service acc>
All done. Now you can try the composer loadGcs script and run the tests.
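Before running the tests, the .env values described above can be sanity-checked. This is a rough Python sketch under stated assumptions: the function `check_env` is hypothetical and not part of this package, and it only checks what the README itself states (all six variables set, BQ_FOLDER_ID being just a number, BQ_PRINCIPAL without `private_key`, BQ_KEY_FILE with it).

```python
import json
from typing import List, Mapping

# The six variables listed in the .env example.
REQUIRED = (
    "BQ_PRINCIPAL",
    "BQ_SECRET",
    "BQ_FOLDER_ID",
    "BQ_BUCKET_NAME",
    "BQ_STACK_PREFIX",
    "BQ_KEY_FILE",
)


def _has_private_key(raw: str) -> bool:
    """Parse a single-line JSON key and report whether it has private_key."""
    data = json.loads(raw)
    return isinstance(data, dict) and "private_key" in data


def check_env(env: Mapping[str, str]) -> List[str]:
    """Return a list of problems; an empty list means nothing was found."""
    problems = [f"{name} is missing or empty" for name in REQUIRED if not env.get(name)]
    if problems:
        return problems
    if not env["BQ_FOLDER_ID"].isdigit():
        problems.append("BQ_FOLDER_ID should be just the number, without the folders/ prefix")
    try:
        if _has_private_key(env["BQ_PRINCIPAL"]):
            problems.append("BQ_PRINCIPAL should not contain the private_key entry")
        if not _has_private_key(env["BQ_KEY_FILE"]):
            problems.append("BQ_KEY_FILE should contain the private_key entry")
    except json.JSONDecodeError as error:
        problems.append(f"key JSON could not be parsed: {error}")
    return problems
```

Inside the dev container it could be called as `check_env(os.environ)`, assuming the variables are passed through to the process environment unchanged.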
Build docker images
docker-compose build
Xdebug
To run with Xdebug, use the dev-xdebug container instead of dev.
Tests
Run the tests with the following commands.
# This will run all tests
docker-compose run --rm dev composer tests
# This will run all tests in parallel
docker-compose run --rm dev composer paratest
# This will run import tests in parallel
docker-compose run --rm dev composer paratest-import
# This will run export tests in parallel
docker-compose run --rm dev composer paratest-export
# This will run all tests in parallel excluding import and export
docker-compose run --rm dev composer paratest-other
To disable retries, copy phpunit-retry.xml.dist to phpunit-retry.xml:
cp phpunit-retry.xml.dist phpunit-retry.xml
Code quality check
# run all below but not tests
docker-compose run --rm dev composer check
# phplint
docker-compose run --rm dev composer phplint
# phpcs
docker-compose run --rm dev composer phpcs
# phpcbf
docker-compose run --rm dev composer phpcbf
# phpstan
docker-compose run --rm dev composer phpstan
Full CI workflow
This command runs all checks and the tests:
docker-compose run --rm dev composer ci
Using
Project ID: a globally unique identifier for your project. This library builds the project ID as a combination of stackPrefix and projectId from CreateProjectCommand.
A project ID is a unique string used to differentiate your project from all others in Google Cloud. You can use the Google Cloud console to generate a project ID, or you can choose your own. You can only modify the project ID when you're creating the project.
Project ID requirements:
- Must be 6 to 30 characters in length.
- Can only contain lowercase letters, numbers, and hyphens.
- Must start with a letter.
- Cannot end with a hyphen.
- Cannot be in use or previously used; this includes deleted projects.
- Cannot contain restricted strings, such as google and ssl.
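The locally checkable requirements above can be expressed as a small validator. This is a Python sketch, not code from this library: the function name `project_id_problems` is hypothetical, the restricted list contains only the two examples named above (the full list is not given here), and the "in use or previously used" rule is skipped because it can only be checked against Google Cloud itself.

```python
import re
from typing import List

# Only the two examples named in the requirements; the real list may be longer.
RESTRICTED = ("google", "ssl")


def project_id_problems(project_id: str) -> List[str]:
    """Return the requirements a candidate project ID violates."""
    problems = []
    if not 6 <= len(project_id) <= 30:
        problems.append("must be 6 to 30 characters in length")
    if not re.fullmatch(r"[a-z0-9-]*", project_id):
        problems.append("can only contain lowercase letters, numbers, and hyphens")
    if not re.match(r"[a-z]", project_id):
        problems.append("must start with a letter")
    if project_id.endswith("-"):
        problems.append("cannot end with a hyphen")
    for word in RESTRICTED:
        if word in project_id:
            problems.append(f"cannot contain the restricted string {word!r}")
    return problems
```

Because the project ID is built from stackPrefix and projectId, it may be worth running the combined string through such a check before issuing CreateProjectCommand, since a long prefix can push the result past the 30-character limit.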
License
MIT licensed, see LICENSE file.