touilelhadj / biostat-php
A pure-PHP biostatistics library implementing descriptive, bivariate and multivariate methods (logistic regression, VIF, Box-Tidwell, GLMM by PQL, GEE with Liang-Zeger sandwich variance, MICE multiple imputation, Rubin's pooling) for survey-based epidemiological studies.
Requires
- php: >=8.0
Requires (Dev)
- phpstan/phpstan: ^1.10
- phpunit/phpunit: ^9.6 || ^10.0
- squizlabs/php_codesniffer: ^3.7
README
A pure-PHP biostatistics library implementing descriptive, bivariate and multivariate methods for survey-based epidemiological studies, including logistic regression, VIF, Box-Tidwell, GLMM, GEE, MICE and Rubin's pooling.
Overview
biostat-php brings the analytic side of a cross-sectional epidemiological
study to PHP. It implements, in pure PHP and without external
dependencies, the family of statistical methods usually run in
R (stats, car, lme4, geepack, mice) or SPSS.
The library was extracted from a real research instrument: a trilingual data-collection platform used to study adolescent overweight in the Wilaya of Chlef, Algeria. There it tested the 48 pre-registered hypotheses of the underlying master thesis on a low-cost shared-hosting environment that did not allow R or Python.
Every method is cross-checked against R 4.x and IBM SPSS 25; the
quantitative comparison is documented in
docs/validation-tables.md and exercised by
the PHPUnit suite in tests/.
Installation
composer require touilelhadj/biostat-php
Requirements:
- PHP ≥ 8.0
- No external PHP extension beyond the defaults (`mbstring`, `json` only for the test fixtures)
- No system library, no Composer dependency at runtime
Quick start
```php
<?php
require_once __DIR__ . '/vendor/autoload.php';

use TouilElhadj\BiostatPhp\BiostatAnalysis;

$stats = new BiostatAnalysis();

// Two-by-two table: 45 exposed cases, 18 exposed non-cases,
// 30 unexposed cases, 22 unexposed non-cases.
$chi = $stats->chi2Test2x2(45, 18, 30, 22);
echo "chi² = {$chi['chi2']}, p = {$chi['p']}\n";
// chi² = 1.8027, p = 0.1794

$or = $stats->oddsRatio(45, 18, 30, 22);
echo "OR = {$or['or']} [{$or['ci_low']}, {$or['ci_high']}]\n";
// OR = 1.83 [0.84, 3.98]

// Welch's t-test
$t = $stats->tTest(
    [10, 12, 11, 13, 14, 12, 11],
    [15, 17, 16, 18, 16, 17, 15]
);
echo "t = {$t['t']}, df = {$t['df']}, p = {$t['p']}\n";

// Logistic regression (one continuous covariate)
$y = [0,0,0,1,1,1,1,1,0,1,1,0,1,1,1,0,0,1,1,1];
$x = [1,1,2,2,3,3,4,4,1,2,3,4,5,5,4,2,1,3,4,5];
$lr = $stats->logisticRegression($y, $x);
echo "OR = {$lr['or']}, p = {$lr['p']}\n";

// Benjamini-Hochberg FDR adjustment
$adj = BiostatAnalysis::benjaminiHochberg([0.001, 0.01, 0.04, 0.2, 0.5]);
// [0.005, 0.025, 0.067, 0.25, 0.5]
```
More examples, including GLMM and MICE, are in examples/.
A complete per-method walkthrough with R-equivalent commands is in
docs/usage-examples.md.
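The 2 × 2 figures in the quick start can be checked by hand. The sketch below is not the library's source; it is a minimal standalone version, assuming the Yates-corrected chi-square and a Woolf (log-scale Wald) 95 % interval for the odds ratio, which is consistent with the values printed in the quick-start comments. The function names `yatesChi2` and `woolfOddsRatio` are hypothetical and exist only for this illustration.

```php
<?php
declare(strict_types=1);

/**
 * Yates-corrected chi-square statistic for a 2 x 2 table
 * laid out as [[a, b], [c, d]]. Hypothetical helper, not the library API.
 */
function yatesChi2(int $a, int $b, int $c, int $d): float
{
    $n = $a + $b + $c + $d;
    // Continuity correction: shrink |ad - bc| by n/2, never below zero.
    $num = max(0.0, abs($a * $d - $b * $c) - $n / 2) ** 2 * $n;
    $den = ($a + $b) * ($c + $d) * ($a + $c) * ($b + $d);
    return $num / $den;
}

/**
 * Odds ratio with a Woolf (log-scale Wald) confidence interval.
 * Assumes no zero cell, so no Haldane-Anscombe correction is applied here.
 */
function woolfOddsRatio(int $a, int $b, int $c, int $d, float $z = 1.959964): array
{
    $or = ($a * $d) / ($b * $c);
    $se = sqrt(1 / $a + 1 / $b + 1 / $c + 1 / $d); // SE of log(OR)
    return [
        'or'      => $or,
        'ci_low'  => exp(log($or) - $z * $se),
        'ci_high' => exp(log($or) + $z * $se),
    ];
}

$chi2 = yatesChi2(45, 18, 30, 22);
$or   = woolfOddsRatio(45, 18, 30, 22);
printf("chi2 = %.4f\n", $chi2);
printf("OR = %.2f [%.2f, %.2f]\n", $or['or'], $or['ci_low'], $or['ci_high']);
// chi2 = 1.8027
// OR = 1.83 [0.84, 3.98]
```

The p-value is left out because it needs a chi-square CDF, which this sketch does not implement.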
Method catalogue
| Family | Methods |
|---|---|
| Descriptive | mean, std, median, quantile |
| 2 × 2 tables | chi2Test2x2 (with Yates), oddsRatio (with Haldane-Anscombe) |
| Means | tTest (Welch), anova |
| Correlation | pearson, spearman |
| Binomial | binomialTest |
| Logistic regression | logisticRegression, logisticRegressionMulti, hosmerLemeshow |
| Multiple testing | benjaminiHochberg (FDR) |
| Multicollinearity | vif |
| Logit linearity | boxTidwell |
| Mixed / clustered | glmmLogistic (PQL), geeLogistic (Liang-Zeger sandwich) |
| Missing data | mice (PMM + chained Gibbs), rubinPool (Rubin's rules) |
Mathematical formulations and bibliographic references for every method
are in docs/statistical-methods.md.
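As one concrete instance from the catalogue, the Benjamini-Hochberg adjustment can be sketched in a few lines. This is a standalone illustration of the standard step-up procedure (each sorted p-value is multiplied by m / rank, then a running minimum from the largest rank downwards keeps the result monotone), not the library's own code; the name `bhAdjust` is hypothetical.

```php
<?php
declare(strict_types=1);

/**
 * Benjamini-Hochberg adjusted p-values, returned in the input order.
 * Hypothetical helper for illustration, not the library API.
 *
 * @param  float[] $p raw p-values
 * @return float[]    adjusted p-values
 */
function bhAdjust(array $p): array
{
    $m = count($p);
    $order = array_keys($p);
    // Sort indices by descending p-value so we walk from the largest rank down.
    usort($order, fn ($i, $j) => $p[$j] <=> $p[$i]);

    $adjusted = [];
    $running = 1.0; // running minimum enforces monotonicity and caps at 1
    foreach ($order as $pos => $idx) {
        $rank = $m - $pos; // rank of this p-value in ascending order
        $running = min($running, $p[$idx] * $m / $rank);
        $adjusted[$idx] = $running;
    }
    ksort($adjusted); // restore the original order
    return $adjusted;
}

$adj = bhAdjust([0.001, 0.01, 0.04, 0.2, 0.5]);
echo implode(', ', array_map(fn ($v) => number_format($v, 3), $adj)), "\n";
// 0.005, 0.025, 0.067, 0.250, 0.500
```

The output agrees with the adjusted values quoted in the quick start for the same five p-values.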
Verification
Reference values for every method were computed independently in R 4.3.0 and IBM SPSS 25. The tolerances used in the test suite are:
| Family | Tolerance |
|---|---|
| p-values | ± 0.01 |
| Odds ratios | ± 0.01 |
| Correlation coefficients | ± 0.001 |
| Regression coefficients | ± 0.05 |
| Variance components (GLMM) | ± 0.05 |
A complete numerical comparison table (R command, R value, PHP value,
|Δ|) lives in docs/validation-tables.md.
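A tolerance check of this kind can be expressed as a small helper. The sketch below is an assumption about how such a comparison might look, not the actual test code in tests/: the constant `TOLERANCES`, its keys, and the function `withinTolerance` are hypothetical, and the values compared at the end are illustrative inputs rather than entries from the validation tables.

```php
<?php
declare(strict_types=1);

// Absolute tolerances per family, copied from the table above.
// The key names are made up for this illustration.
const TOLERANCES = [
    'p_value'            => 0.01,
    'odds_ratio'         => 0.01,
    'correlation'        => 0.001,
    'regression_coef'    => 0.05,
    'variance_component' => 0.05,
];

/** True when the PHP value lies within the family's tolerance of the reference. */
function withinTolerance(string $family, float $phpValue, float $reference): bool
{
    if (!array_key_exists($family, TOLERANCES)) {
        throw new InvalidArgumentException("Unknown family: {$family}");
    }
    return abs($phpValue - $reference) <= TOLERANCES[$family];
}

// Illustrative inputs only.
var_dump(withinTolerance('odds_ratio', 1.83, 1.8333));  // bool(true)
var_dump(withinTolerance('correlation', 0.512, 0.510)); // bool(false)
```

Inside a PHPUnit test the same comparison is usually written with `assertEqualsWithDelta($expected, $actual, $delta)`.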
Running the tests
```
composer install
composer test            # run PHPUnit
composer test:coverage   # with HTML coverage report
composer analyse         # PHPStan level 6
composer cs              # PSR-12 check
```
Repository layout
```
biostat-php/
├── src/
│   ├── BiostatAnalysis.php       main class (≈ 1 460 LOC)
│   ├── LinearAlgebra.php         matMul / transpose / OLS (trait)
│   ├── Distributions.php         normal / χ² / t / F CDFs (trait)
│   └── Exceptions/
│       └── ConvergenceException.php
├── tests/                        PHPUnit tests against R reference values
├── docs/
│   ├── statistical-methods.md    formal mathematical specification
│   ├── validation-tables.md      R vs PHP numerical comparison
│   └── usage-examples.md         worked example per public method
├── examples/                     stand-alone runnable scripts
├── composer.json                 PSR-4 autoload + dev tools
├── phpunit.xml                   PHPUnit configuration
├── paper.md                      JOSS paper (≈ 1 000 words)
├── paper.bib                     BibTeX references
├── CITATION.cff                  machine-readable citation
├── LICENSE                       MIT
└── README.md
```
Citation
If you use this library in a research publication, please cite it. A
machine-readable citation file is provided in
CITATION.cff.
Recommended citation
TOUIL, E. (2026). biostat-php: a pure-PHP biostatistics library for survey-based epidemiological studies. Version 1.0.0. https://github.com/Touil-Elhadj/biostat-php
BibTeX
```bibtex
@software{touil_biostat_php_2026,
  author  = {TOUIL, Elhadj},
  title   = {biostat-php: a pure-PHP biostatistics library for survey-based epidemiological studies},
  year    = {2026},
  version = {1.0.0},
  url     = {https://github.com/Touil-Elhadj/biostat-php}
}
```
Real-world use
biostat-php is the analytic engine of the
chlef-touilelhadj
platform, a trilingual web instrument used to enrol 1 220 adolescents
(14–19 years) in the Wilaya of Chlef, Algeria, during the 2025–2026
academic year. All statistical results of the underlying master thesis
were produced by this library and independently verified against R 4.x
and SPSS 25.
Contributing
Contributions are welcome; see CONTRIBUTING.md for
the development workflow and the statistical-contribution checklist
(every new method must ship with a closed-form reference value from R
or SPSS plus a PHPUnit assertion).
Security
To report a vulnerability please follow the procedure in
SECURITY.md.
License
Released under the MIT License. © 2026 TOUIL Elhadj.