keboola / staging-provider
Requires
- php: >=8.2
- ext-json: *
- keboola/input-mapping: *@dev
- keboola/key-generator: *@dev
- keboola/output-mapping: *@dev
- keboola/slicer: *@dev
- keboola/storage-api-client: ^18.1
- keboola/storage-api-php-client-branch-wrapper: ^6.0
Requires (Dev)
- keboola/coding-standard: >=14.0
- phpstan/phpstan: ^1.8
- phpstan/phpstan-phpunit: ^1.1
- phpunit/phpunit: ^9.5
- sempro/phpunit-pretty-print: ^1.4
- symfony/dotenv: ^5.2|^6.0|^7.0
This package is auto-updated.
Last update: 2025-05-07 14:47:56 UTC
README
Installation
composer require keboola/staging-provider
Usage
The staging provider package helps you properly configure the input/output staging factory for various environments.
A typical use case is setting up a Reader instance to access some data:
use Keboola\InputMapping\Reader;
use Keboola\InputMapping\Staging\StrategyFactory as InputStrategyFactory;
use Keboola\StagingProvider\InputProviderInitializer;
use Keboola\StagingProvider\Provider\ExistingWorkspaceProvider;
use Keboola\StorageApi\Client;
use Keboola\StorageApi\Workspaces;
use Keboola\StorageApiBranch\ClientWrapper;
use Psr\Log\NullLogger;

$storageApiClient = new Client(...);
$storageApiClientWrapper = new ClientWrapper($storageApiClient, ...);
$logger = new NullLogger();

$strategyFactory = new InputStrategyFactory($storageApiClientWrapper, $logger, 'json');

$tokenInfo = $storageApiClient->verifyToken();
$dataDir = '/data';

$workspaceProvider = new ExistingWorkspaceProvider(
    new Workspaces($storageApiClient),
    'my-workspace', // workspace ID
    new Credentials\ExistingCredentialsProvider(
        new Configuration\WorkspaceCredentials([
            'password' => 'abcd1234' // workspace password
        ]),
    ),
);

$providerInitializer = new InputProviderInitializer($strategyFactory, $workspaceProvider, $dataDir);
$providerInitializer->initializeProviders(
    InputStrategyFactory::WORKSPACE_SNOWFLAKE,
    $tokenInfo
);

// now the $strategyFactory is ready to be used
$reader = new Reader($strategyFactory);
We start by creating the StrategyFactory needed by the reader. The factory itself has no knowledge of which storage
should be used with each staging type. That is the job of the provider initializer: it configures the StrategyFactory
for a specific type of staging.
To create a provider initializer, we pass it:
- the StrategyFactory to initialize
- a workspace provider, used to access workspace information for workspace staging
  - ExistingWorkspaceProvider in case we want to re-use an existing workspace
  - NewWorkspaceProvider in case we want a new workspace to be created (based on a component configuration)
- a data directory path used for local staging
Then we call the initializeProviders method to configure the StrategyFactory for a specific staging type. It is up to
the caller to know which staging type to configure:
- when working with components, each component has staging type defined in its configuration
- sandbox has the type deduced from its workspace
- etc.
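The wiring described above can be sketched in a few lines. This is a toy illustration only: ToyStrategyFactory, ToyProviderInitializer, addProvider, resolve and the staging type strings are hypothetical names made up for this sketch and are not the library's API. It assumes the factory simply keeps one provider callable per staging type.

```php
<?php

declare(strict_types=1);

// Hypothetical stand-in for a strategy factory: it only records which
// provider callable backs each staging type.
final class ToyStrategyFactory
{
    /** @var array<string, callable(): string> */
    private array $providers = [];

    public function addProvider(string $stagingType, callable $provider): void
    {
        $this->providers[$stagingType] = $provider;
    }

    public function resolve(string $stagingType): string
    {
        if (!isset($this->providers[$stagingType])) {
            throw new InvalidArgumentException("Staging type '$stagingType' is not configured");
        }
        return ($this->providers[$stagingType])();
    }
}

// Hypothetical initializer: the caller decides the staging type, the
// initializer wires the matching provider into the factory.
final class ToyProviderInitializer
{
    public function __construct(
        private readonly ToyStrategyFactory $factory,
        private readonly string $workspaceId,
        private readonly string $dataDir,
    ) {
    }

    public function initializeProviders(string $stagingType): void
    {
        $provider = match ($stagingType) {
            'local' => fn(): string => $this->dataDir,
            'workspace-snowflake' => fn(): string => 'workspace:' . $this->workspaceId,
            default => throw new InvalidArgumentException("Unknown staging type '$stagingType'"),
        };
        $this->factory->addProvider($stagingType, $provider);
    }
}

$factory = new ToyStrategyFactory();
$initializer = new ToyProviderInitializer($factory, 'my-workspace', '/data');
$initializer->initializeProviders('workspace-snowflake');

echo $factory->resolve('workspace-snowflake'), PHP_EOL; // prints "workspace:my-workspace"
```

Only the staging type the caller asked for is configured; resolving any other type fails loudly, which mirrors the idea that the caller is responsible for choosing the type.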
The example above shows how to use InputProviderInitializer to configure the input mapping StrategyFactory for
a Reader. Similarly, we can use OutputProviderInitializer to configure the output mapping StrategyFactory for a Writer.
Internals
The main objective of the library is to configure the StrategyFactory so that it knows which staging provider to
use with each kind of storage.
Staging
Generally, there are two kinds of staging:
- local staging - used to store data locally on the filesystem, represented by the LocalStaging class
- workspace staging - used to store data in a workspace, represented by WorkspaceStagingInterface
Provider (staging provider)
The StrategyFactory does not use a staging directly but rather through a provider (ProviderInterface), so there is
a provider implementation for each kind:
- LocalStagingProvider - for local filesystem staging
- WorkspaceProviderInterface - for Connection workspace staging
The main reason the StrategyFactory does not use the staging directly is to achieve lazy initialization of the staging:
the provider instance is created during bootstrap, but the staging instance is only created when it is actually used.
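A minimal sketch of that lazy-initialization idea follows. DemoStaging and LazyStagingProvider are hypothetical classes written for this illustration, not the library's own; the sketch assumes a provider that holds a closure and builds the staging on first access.

```php
<?php

declare(strict_types=1);

// Hypothetical staging object; the static counter lets us observe when it is built.
final class DemoStaging
{
    public static int $created = 0;

    public function __construct(public readonly string $path)
    {
        self::$created++;
    }
}

// Hypothetical provider: cheap to construct, builds the staging on first use.
final class LazyStagingProvider
{
    private ?DemoStaging $staging = null;

    /** @param Closure(): DemoStaging $factory */
    public function __construct(private readonly Closure $factory)
    {
    }

    public function getStaging(): DemoStaging
    {
        // Create the staging once and reuse it afterwards.
        return $this->staging ??= ($this->factory)();
    }
}

// Bootstrap: only the provider exists, no staging has been created.
$provider = new LazyStagingProvider(fn(): DemoStaging => new DemoStaging('/data'));
$createdAtBootstrap = DemoStaging::$created;

// First real use creates the staging; the second call reuses it.
$first = $provider->getStaging();
$second = $provider->getStaging();
$createdAfterUse = DemoStaging::$created;
```

Here $createdAtBootstrap stays at zero and $createdAfterUse is one, so expensive work (such as creating a workspace) would be deferred until something actually asks for the staging.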
Workspace provider factory
Local staging is pretty simple. It contains just the path to the data directory, provided by the caller. On the other hand,
things get a bit more complicated with workspace staging, as the provider may represent an already existing workspace or
a configuration for creating a new workspace. To achieve this, the caller must provide a WorkspaceProviderInterface.
Currently, there are two implementations:
- NewWorkspaceProvider, which creates a new workspace based on a component configuration
- ExistingWorkspaceProvider, which works with an existing workspace
When using ExistingWorkspaceProvider, the developer is responsible for providing workspace credentials. Depending on the
situation, the following options are available:
- ExistingCredentialsProvider for situations when we know the exact workspace credentials. For example, when working with a workspace for which the end user provides credentials, like an SQL sandbox. Credentials are passed as a free-form array, and it is the caller's responsibility to provide the correct credential properties (password, private key, etc.).
- ResetCredentialsProvider for situations when nobody else accesses the workspace and we can safely generate new credentials. This is typical when working with a staging workspace that is accessed only through code (nobody has stored the credentials anywhere).
- NoCredentialsProvider for situations when we only need to work with the workspace indirectly (through the Connection API) and don't need credentials. When something tries to access the credentials, the provider throws an exception.
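The three options can be pictured as interchangeable implementations of one small contract. The sketch below is an assumption-laden illustration: the DemoCredentialsProvider interface, the Demo* classes, the getCredentials method and the caching in the reset variant are all hypothetical and are not taken from the library's source.

```php
<?php

declare(strict_types=1);

// Hypothetical contract; the library's real interfaces may differ.
interface DemoCredentialsProvider
{
    /** @return array<string, string> */
    public function getCredentials(): array;
}

// Credentials are known up front, e.g. supplied by the end user.
final class DemoExistingCredentials implements DemoCredentialsProvider
{
    /** @param array<string, string> $credentials */
    public function __construct(private readonly array $credentials)
    {
    }

    public function getCredentials(): array
    {
        return $this->credentials;
    }
}

// Nobody else uses the workspace, so fresh credentials are generated on demand.
final class DemoResetCredentials implements DemoCredentialsProvider
{
    /** @var array<string, string>|null */
    private ?array $credentials = null;

    /** @param Closure(): array<string, string> $reset hypothetical reset callback */
    public function __construct(private readonly Closure $reset)
    {
    }

    public function getCredentials(): array
    {
        // Reset once, then keep returning the same generated credentials.
        return $this->credentials ??= ($this->reset)();
    }
}

// The workspace is only used indirectly; asking for credentials is an error.
final class DemoNoCredentials implements DemoCredentialsProvider
{
    public function getCredentials(): array
    {
        throw new LogicException('Credentials are not available for this workspace');
    }
}

$existing = new DemoExistingCredentials(['password' => 'abcd1234']);
$reset = new DemoResetCredentials(
    fn(): array => ['password' => bin2hex(random_bytes(8))] // stand-in for an API reset call
);
$none = new DemoNoCredentials();
```

Because all three share one contract in this sketch, the code that consumes a workspace does not need to know where the credentials come from, and the no-credentials variant fails fast if anything tries to read them.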
Development
Start by creating a .env file from .env.dist.
cp .env.dist .env
# edit .env to set variable values
To run tests, there is a separate service for each PHP major version (5.6 to 7.4). For example, to run tests against PHP 5.6, run the following:
docker compose run --rm tests56
To develop locally, use the dev service. The following command installs Composer dependencies:
docker compose run --rm dev composer install
License
MIT licensed, see LICENSE file.