taeluf/json-to-mysql
Convert large JSON files into MySQL. Uses `fgets()` & parses the JSON line-by-line to minimize memory usage. Also works for smaller JSON files, and chunks the output SQL files to your desired size.
Example
JSON must be in this format:
[
{"name": "row1", "index": 0},
{"name": "row2", "index": 1}
]
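Since the file is read with `fgets()`, each row object presumably needs to sit on its own line, exactly as shown above. Here is a minimal sketch for writing a conforming file from existing PHP data (the one-object-per-line layout is an assumption based on the line-by-line parsing, not something the docs spell out):
<?php
// Write rows as JSON with one row object per line, wrapped in [ ... ].
// One-object-per-line is an assumed requirement of the fgets() parser.
$rows = [
    ['name' => 'row1', 'index' => 0],
    ['name' => 'row2', 'index' => 1],
];

$fh = fopen(__DIR__.'/test/input/main.json', 'w');
fwrite($fh, "[\n");
foreach ($rows as $i => $row) {
    $comma = ($i < count($rows) - 1) ? ',' : '';
    fwrite($fh, json_encode($row).$comma."\n");
}
fwrite($fh, "]\n");
fclose($fh);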
Executing it:
<?php
// variables you'll want to change
$source_file = __DIR__.'/test/input/main.json';
$out_prefix = __DIR__.'/test/input/main.out/test_sql';
$table_name = 'test_sql';
// max `strlen()` of the sql put into any one output file
$max_sql_len = 1024*1024;
// a pdo instance for running the create & insert
$pdo = new \PDO("sqlite::memory:");
// generate all the sql
$jtm = new \Tlf\Jtm();
$schema_info = $jtm->generate_schema_info($source_file);
$schema = $jtm->generate_schema($source_file, $schema_info);
$create_sql = $jtm->generate_sql_create($table_name, $schema);
$jtm->generate_sql_insert($source_file, $out_prefix, $table_name, $schema, $max_sql_len);
//execute the sql
$pdo->exec($create_sql);
$num_rows = $jtm->execute_insert_files($out_prefix, $pdo);
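The example above executes against an in-memory SQLite database just to keep the snippet self-contained. To load the generated files into an actual MySQL database, a minimal sketch (the host, database name, and credentials are placeholders to swap in):
<?php
// Connect to MySQL instead of SQLite; the DSN values are placeholders.
$pdo = new \PDO(
    'mysql:host=127.0.0.1;dbname=my_database;charset=utf8mb4',
    'my_user',
    'my_password',
    [\PDO::ATTR_ERRMODE => \PDO::ERRMODE_EXCEPTION]
);

// Same calls as above: create the table, then run every chunked insert file.
$pdo->exec($create_sql);
$num_rows = $jtm->execute_insert_files($out_prefix, $pdo);
echo "Inserted $num_rows rows\n";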
Install
composer require taeluf/json-to-mysql v2.0.x-dev
or in your composer.json
{"require":{ "taeluf/json-to-mysql": "v2.0.x-dev"}}
Versions (branches)
- v1.0: old version with a disappointing api
- v2.0: new version with a much cleaner api
Features
- Read small or large JSON source file and output mysql create statement & insert statements
- Chunk insert statements, limiting max length of output files
- Output column data from all rows of json file
- Convert nested arrays into JSON strings for INSERT
- TRIES to use correct column type (text, bool, varchar, int, float)
- Build table from generated schema
- Very small RAM footprint: reads the input file line-by-line & outputs the insert sql files line-by-line
Extra Notes
Warnings
- Set your max length to at least `1024*1024` (roughly one megabyte), or you might end up with several thousand files from large datasets (see the sketch below)
- Setting a length SMALLER than the length of a single insert statement... is probably really bad
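To sanity-check the chunking after generation, a small sketch that lists the generated insert files and their sizes (it assumes the chunked files are written with `$out_prefix` as a filename prefix, which the docs don't spell out):
<?php
// List the generated insert files and confirm each stays under the limit.
// Assumes chunked files share the $out_prefix filename prefix.
$out_prefix = __DIR__.'/test/input/main.out/test_sql';
$max_sql_len = 1024*1024;

foreach (glob($out_prefix.'*') as $file) {
    printf("%s: %d bytes (limit: %d)\n", basename($file), filesize($file), $max_sql_len);
}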
Column Data (`$jtm->generate_schema_info()`)
- Each `is_` entry counts the number of rows that matched that `is_`
- `sampleValue` is the longest found value for that column
<?php
$schema_info = [
'name' => [
'maxlen' => 4,
'minlen' => 4,
'count' => 2,
'is_string' => 2,
'is_numeric' => 0,
'is_int' => 0,
'is_float' => 0,
'is_bool' => 0,
'is_array' => 0,
'is_date' => 0,
'is_json' => 0,
'sampleValue' => 'row2',
],
'index'=>[
'maxlen' => 1,
'minlen' => 1,
'count' => 2,
'is_string' => 0,
'is_numeric' => 2,
'is_int' => 2,
'is_float' => 2,
'is_bool' => 0,
'is_array' => 0,
'is_date' => 0,
'is_json' => 0,
'sampleValue' => 1,
]
];
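A quick way to eyeball these statistics before generating any SQL, using only the keys shown above (a minimal sketch; nothing beyond the documented keys is assumed):
<?php
// Print a one-line summary per column from generate_schema_info().
$jtm = new \Tlf\Jtm();
$schema_info = $jtm->generate_schema_info(__DIR__.'/test/input/main.json');

foreach ($schema_info as $column => $stats) {
    printf(
        "%s: %d rows, string=%d int=%d float=%d, longest sample=%s\n",
        $column,
        $stats['count'],
        $stats['is_string'],
        $stats['is_int'],
        $stats['is_float'],
        var_export($stats['sampleValue'], true)
    );
}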
TODO (maybe)
- a function like `create_and_insert_from_json($json_file_path, $pdo)` ... to just have a one-liner that does it all (see the sketch after this list)
- Write a bash script to execute generation from the cli
- Write a bash script to compress, upload, & uncompress the sql files
- Write a bash script that will import all the sql files into the database
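For the first TODO item, here is a hypothetical sketch of what such a one-liner wrapper could look like, composed only from the documented v2.0 methods; the table name, temp-file location, and chunk size are arbitrary assumptions, and the eventual API may well differ:
<?php
// Hypothetical wrapper (NOT part of the package), composed from the
// documented Tlf\Jtm methods. Table name, output prefix, and chunk size
// are arbitrary choices for illustration.
function create_and_insert_from_json(string $json_file_path, \PDO $pdo): int
{
    $jtm = new \Tlf\Jtm();
    $table_name = 'imported_json';                     // assumed table name
    $out_prefix = sys_get_temp_dir().'/jtm_'.uniqid(); // assumed temp location
    $max_sql_len = 1024*1024;                          // assumed chunk size

    $schema_info = $jtm->generate_schema_info($json_file_path);
    $schema = $jtm->generate_schema($json_file_path, $schema_info);
    $create_sql = $jtm->generate_sql_create($table_name, $schema);
    $jtm->generate_sql_insert($json_file_path, $out_prefix, $table_name, $schema, $max_sql_len);

    $pdo->exec($create_sql);
    return $jtm->execute_insert_files($out_prefix, $pdo);
}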