simpod/clickhouse-client

PHP ClickHouse Client

Fund package maintenance!
simPod

0.7.5 2024-01-19 13:37 UTC

README

Build Status Code Coverage Downloads Infection MSI

Motivation

The library is trying not to hide any ClickHouse HTTP interface specific details. That said everything is as much transparent as possible and so object-oriented API is provided without inventing own abstractions.
Naming used here is the same as in ClickHouse docs.

Contents

Setup

composer require simpod/clickhouse-client  
  1. Read about ClickHouse Http Interface. It's short and useful for concept understanding.
  2. Create a new instance of ClickHouse client and pass PSR factories.
    1. Symfony HttpClient is recommended (performance, less bugs, maintenance)
    2. The plot twist is there's no endpoint/credentials etc. config in this library, provide it via client
  3. See tests
<?php

use Http\Client\Curl\Client;
use Nyholm\Psr7\Factory\Psr17Factory;
use SimPod\ClickHouseClient\Client\PsrClickHouseClient;
use SimPod\ClickHouseClient\Client\Http\RequestFactory;

$psr17Factory = new Psr17Factory;

$clickHouseClient = new PsrClickHouseClient(
    new Client(),
    new RequestFactory(
        $psr17Factory,
        $psr17Factory
    ),
    [],
    new DateTimeZone('UTC')
);

Symfony HttpClient Example

Configure HTTP Client

As said in ClickHouse HTTP Interface spec, we use headers to auth and e.g. set default database via query.

framework:
    http_client:
        scoped_clients:
            click_house.client:
                base_uri: '%clickhouse.endpoint%'
                headers:
                    'X-ClickHouse-User': '%clickhouse.username%'
                    'X-ClickHouse-Key': '%clickhouse.password%'
                query:
                    database: '%clickhouse.database%'

Logging

SimPod\ClickHouseClient\Client\Http\LoggerPlugin is available to be used with HTTPlug PluginClient.

This is the

<?php

declare(strict_types=1);

namespace Cdn77\Mon\Core\Infrastructure\Symfony\Service\ClickHouse;

use Http\Client\Common\PluginClient;
use SimPod\ClickHouseClient\Client\Http\LoggerPlugin;
use SimPod\ClickHouseClient\Logger\SqlLogger;
use Symfony\Component\HttpClient\HttplugClient;
use Symfony\Contracts\HttpClient\HttpClientInterface;

final class HttpClientFactory
{
    public function __construct(private HttpClientInterface $clickHouseClient, private SqlLogger $sqlLogger)
    {
    }

    public function create() : PluginClient
    {
        return new PluginClient(
            new HttplugClient($this->clickHouseClient),
            [new LoggerPlugin($this->sqlLogger)]
        );
    }
}

Time Zones

ClickHouse does not have date times with timezones. Therefore you need to normalize DateTimes' timezones passed as parameters to ensure proper input format.

Following would be inserted as 2020-01-31 01:00:00 into ClickHouse.

new DateTimeImmutable('2020-01-31 01:00:00', new DateTimeZone('Europe/Prague'));

If your server uses UTC, the value is incorrect for you actually need to insert 2020-01-31 00:00:00.

Time zone normalization is enabled by passing DateTimeZone into PsrClickHouseClient constructor.

new PsrClickHouseClient(..., new DateTimeZone('UTC'));

PSR Factories who?

The library does not implement it's own HTTP. That has already been done via PSR-7, PSR-17 and PSR-18. This library respects it and allows you to plug your own implementation (eg. HTTPPlug or Guzzle).

Recommended are composer require nyholm/psr7 for PSR-17 and composer require php-http/curl-client for Curl PSR-18 implementation (used in example above).

Sync API

Select

ClickHouseClient::select()

Intended for SELECT and SHOW queries. Appends FORMAT to the query and returns response in selected output format:

<?php

use SimPod\ClickHouseClient\Client\ClickHouseClient;
use SimPod\ClickHouseClient\Format\JsonEachRow;
use SimPod\ClickHouseClient\Output;

/** @var ClickHouseClient $client */
/** @var Output\JsonEachRow $output */
$output = $client->select(
    'SELECT * FROM table',
    new JsonEachRow(),
    ['force_primary_key' => 1]
);

Select With Params

ClickHouseClient::selectWithParams()

Same as ClickHouseClient::select() except it also allows parameter binding.

<?php

use SimPod\ClickHouseClient\Client\ClickHouseClient;
use SimPod\ClickHouseClient\Format\JsonEachRow;
use SimPod\ClickHouseClient\Output;

/** @var ClickHouseClient $client */
/** @var Output\JsonEachRow $output */
$output = $client->selectWithParams(
    'SELECT * FROM :table',
    ['table' => 'table_name'],
    new JsonEachRow(),
    ['force_primary_key' => 1]
);

Insert

ClickHouseClient::insert()

<?php

use SimPod\ClickHouseClient\Client\ClickHouseClient;

/** @var ClickHouseClient $client */
$client->insert('table', $data, $columnNames);

If $columnNames is provided and is key->value array column names are generated based on it and values are passed as parameters:

$client->insert( 'table', [[1,2]], ['a' => 'Int8, 'b' => 'String'] ); generates INSERT INTO table (a,b) VALUES ({p1:Int8},{p2:String}) and values are passed along the query.

If $columnNames is provided column names are generated based on it:

$client->insert( 'table', [[1,2]], ['a', 'b'] ); generates INSERT INTO table (a,b) VALUES (1,2).

If $columnNames is omitted column names are read from $data:

$client->insert( 'table', [['a' => 1,'b' => 2]]); generates INSERT INTO table (a,b) VALUES (1,2).

Column names are read only from the first item:

$client->insert( 'table', [['a' => 1,'b' => 2], ['c' => 3,'d' => 4]]); generates INSERT INTO table (a,b) VALUES (1,2),(3,4).

If not provided they're not passed either:

$client->insert( 'table', [[1,2]]); generates INSERT INTO table VALUES (1,2).

Async API

Select

Parameters "binding"

<?php

use SimPod\ClickHouseClient\Sql\SqlFactory;
use SimPod\ClickHouseClient\Sql\ValueFormatter;

$sqlFactory = new SqlFactory(new ValueFormatter());

$sql = $sqlFactory->createWithParameters(
    'SELECT :param',
    ['param' => 'value']
);

This produces SELECT 'value' and it can be passed to ClickHouseClient::select().

Supported types are:

  • scalars
  • DateTimeImmutable (\DateTime is not supported because ValueFormatter might modify its timezone so it's not considered safe)
  • Expression
  • objects implementing __toString()

Native Query Parameters

Tip

Official docs

<?php

use SimPod\ClickHouseClient\Client\PsrClickHouseClient;

$client = new PsrClickHouseClient(...);

$output = $client->selectWithParams(
    'SELECT {p1:String}',
    ['param' => 'value']
);

All types are supported (except AggregateFunction, SimpleAggregateFunction and Nothing by design). You can also pass DateTimeInterface into Date* types or native array into Array, Tuple, Native and Geo types

Expression

To represent complex expressions there's SimPod\ClickHouseClient\Sql\Expression class. When passed to SqlFactory its value gets evaluated.

To pass eg. UUIDStringToNum('6d38d288-5b13-4714-b6e4-faa59ffd49d8') to SQL:

<?php

use SimPod\ClickHouseClient\Sql\Expression;

Expression::new("UUIDStringToNum('6d38d288-5b13-4714-b6e4-faa59ffd49d8')");
<?php

use SimPod\ClickHouseClient\Sql\ExpressionFactory;
use SimPod\ClickHouseClient\Sql\ValueFormatter;

$expressionFactory = new ExpressionFactory(new ValueFormatter());

$expression = $expressionFactory->templateAndValues(
    'UUIDStringToNum(%s)',
    '6d38d288-5b13-4714-b6e4-faa59ffd49d8'
);

Snippets

There are handy queries like getting database size, table list, current database etc.

To prevent Client API pollution, those are extracted into Snippets.

Example to obtain current database name:

<?php

use SimPod\ClickHouseClient\Snippet\CurrentDatabase;

$currentDatabaseName = CurrentDatabase::run($client);

List

  • CurrentDatabase
  • DatabaseSize
  • Parts
  • ShowCreateTable
  • ShowDatabases
  • TableSizes
  • Version