andreinocenti / file-type-detector-php
A PHP Lib to detect file type (category, MIME, extension) from local paths, URLs or data URIs using MIME, extensions, magic numbers and HTTP HEAD.
Requires
- php: >=8.2
Requires (Dev)
- pestphp/pest: ^3.0
Last update: 2025-10-01 19:44:06 UTC
README
A PHP package to accurately identify a file type from a local path, URL (HEAD), or Data URI, combining multiple strategies: finfo (real MIME), binary signatures (magic numbers), extension lookups, and HTTP HEAD Content-Type (with SSRF/security controls).
Perfect for upload validation, download gatekeeping, conditional processing (e.g., images vs. docs), and indexing.
PHP MIME detection, detect file type by extension and magic number, validate uploads in Laravel/PHP, detect docx/xlsx/pptx, detect mp4/m4a/heic/avif, tell zip vs docx, anti-SSRF HTTP HEAD.
✨ Highlights
- Multi-strategy: finfo → smart refinements → magic → extension (fallback).
- Family-level refinements:
  - ZIP: distinguishes plain .zip from docx/xlsx/pptx, EPUB, ODF (odt/ods/odp), JAR, APK, 3MF, KMZ by peeking internal entries.
  - EBML: distinguishes WebM vs Matroska (MKV).
  - ISO-BMFF: identifies MP4/M4A/HEIC/AVIF based on ftyp brand.
- Security (URLs): protections against SSRF, unsafe redirects, limited protocols, timeouts, allow-/block-lists, private network blocking (optional).
- Configurable overrides: inject your own ext→mime and mime→category maps without touching the core.
- Simple, typed API (PHP 8.2+), PSR-12 compliant.
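To make the ISO-BMFF and EBML refinements listed above concrete, here is a minimal standalone sketch of brand-based sniffing. It is not the package's internal code: the function name `sniffContainerMime` and the brand table are assumptions chosen for illustration, and real files carry many more brands than the few handled here.

```php
<?php
declare(strict_types=1);

/**
 * Illustrative sketch only (hypothetical helper, not the package's code).
 * Guesses a MIME type from the leading bytes of an ISO-BMFF or EBML file.
 */
function sniffContainerMime(string $head): ?string
{
    // ISO-BMFF: bytes 4..7 spell "ftyp", bytes 8..11 hold the major brand.
    if (strlen($head) >= 12 && substr($head, 4, 4) === 'ftyp') {
        $brand = strtolower(rtrim(substr($head, 8, 4)));
        return match (true) {
            in_array($brand, ['avif', 'avis'], true) => 'image/avif',
            in_array($brand, ['heic', 'heix'], true) => 'image/heic',
            $brand === 'm4a' => 'audio/mp4',
            $brand === 'qt' => 'video/quicktime',
            in_array($brand, ['isom', 'iso2', 'mp41', 'mp42', 'avc1'], true) => 'video/mp4',
            default => null, // unknown brand: let another strategy decide
        };
    }

    // EBML: magic bytes 1A 45 DF A3, with the DocType string near the start.
    if (str_starts_with($head, "\x1A\x45\xDF\xA3")) {
        $header = substr($head, 0, 64);
        if (str_contains($header, 'webm')) {
            return 'video/webm';
        }
        if (str_contains($header, 'matroska')) {
            return 'video/x-matroska';
        }
    }

    return null;
}
```

In practice you would read only the first few dozen bytes of the file (for example with `fread`) and pass them in, falling back to other strategies when the function returns null.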
🧩 Supported types & extensions
Below is a snapshot (non-exhaustive; you can expand via overrides).
Images
Extensions | Primary MIME | Category |
---|---|---|
jpg, jpeg | image/jpeg | image |
png | image/png | image |
gif | image/gif | image |
webp | image/webp | image |
avif | image/avif | image |
heic, heif | image/heic, image/heif | image |
tiff, tif | image/tiff | image |
bmp | image/bmp | image |
ico | image/vnd.microsoft.icon | image |
psd | image/vnd.adobe.photoshop | image |
svg | image/svg+xml | image |
cr2, nef, arw, dng | image/x-* (raw) | raw-image |
Video
Extensions | Primary MIME | Category |
---|---|---|
mp4, m4v | video/mp4 | video |
webm | video/webm | video |
mkv | video/x-matroska | video |
mov | video/quicktime | video |
3gp | video/3gpp | video |
Audio
Extensions | Primary MIME | Category |
---|---|---|
mp3 | audio/mpeg | audio |
wav | audio/wav | audio |
flac | audio/flac | audio |
ogg, oga | audio/ogg | audio |
opus | audio/opus | audio |
aac | audio/aac | audio |
m4a | audio/mp4 | audio |
mid, midi | audio/midi | audio |
caf | audio/x-caf | audio |
Documents / Text / Code
Extensions | Primary MIME | Category |
---|---|---|
pdf | application/pdf | pdf |
doc | application/msword | document |
docx | application/vnd.openxmlformats-officedocument.wordprocessingml.document | document |
xls | application/vnd.ms-excel | spreadsheet |
xlsx | application/vnd.openxmlformats-officedocument.spreadsheetml.sheet | spreadsheet |
ppt | application/vnd.ms-powerpoint | presentation |
pptx | application/vnd.openxmlformats-officedocument.presentationml.presentation | presentation |
rtf | application/rtf | document |
odt, ods, odp | application/vnd.oasis.opendocument.* | document/spreadsheet/presentation |
epub | application/epub+zip | ebook |
txt | text/plain | text |
csv | text/csv | spreadsheet |
md | text/markdown | text |
html, htm | text/html | code |
json | application/json | code |
xml | application/xml | code |
yaml, yml | text/yaml | code |
php, js, css | text/x-php, application/javascript, text/css | code |
Compressed / Packaged
Extensions | Primary MIME | Category |
---|---|---|
zip, cbz | application/zip | archive |
7z | application/x-7z-compressed | archive |
rar, cbr | application/x-rar-compressed | archive |
tar | application/x-tar | archive |
gz | application/x-gzip | archive |
bz2 | application/x-bzip2 | archive |
xz | application/x-xz | archive |
zst | application/zstd | archive |
jar | application/java-archive | code |
apk | application/vnd.android.package-archive | code |
kmz | application/vnd.google-earth.kmz | gis |
3mf | model/3mf | 3d-model |
Executables / Fonts / Others
Extensions | Primary MIME | Category |
---|---|---|
exe | application/x-dosexec | executable |
elf | application/x-executable | executable |
mach-o | application/x-mach-binary | executable |
ttf, otf, woff, woff2 | font/* | font |
iso | application/x-iso9660-image | disk-image |
torrent | application/x-bittorrent | torrent |
m3u, m3u8, pls | audio/x-mpegurl, application/vnd.apple.mpegurl, application/pls+xml | playlist |
srt, vtt, ass, ssa | application/x-subrip, text/vtt, text/x-ssa | subtitle |
pem, der, p7m, p7s | x509/pkcs7 | certificate |
ics | text/calendar | calendar |
vcf | text/vcard | contact |
eml | message/rfc822 | |
kml | application/vnd.google-earth.kml+xml | gis |
gpx | application/gpx+xml | gis |
obj, stl, gltf | model/* | 3d-model |
step, stp, iges, igs, dxf, dwg | model/* or image/vnd.* | cad |
You can extend/modify via overrides; see below.
📦 Installation
composer require andreinocenti/file-type-detector-php
Requirements:
- PHP 8.2+
- fileinfo extension enabled (for finfo)
- (Optional) cURL for more robust HEAD on URLs
🚀 Basic Usage
use AndreInocenti\FileTypeDetector\FileTypeDetector;

$detector = new FileTypeDetector();

// 1) Local path
$result = $detector->detect('/path/to/file.png');
/* $result->toArray():
[
    'category' => 'image',
    'mime' => 'image/png',
    'extension' => 'png',
    'confidence' => 0.95,
    'source' => 'mime' // mime | magic | extension | http-head | data-uri
]
*/

// 2) URL (HEAD only, safe)
$result = $detector->detect('https://cdn.example.com/file');
// Uses Content-Type and, when needed, "Content-Disposition: filename=..." and/or URL path extension.

// 3) Data URI
$result = $detector->detect('');
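For the Data URI case, the declared media type sits between `data:` and the first comma. The sketch below shows one way to read it, following the RFC 2397 layout. `dataUriMime` is a hypothetical helper written for this example, not a function exposed by the package.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper (not part of the package API): extracts the declared
 * media type from a data: URI laid out as data:[<mediatype>][;base64],<data>
 */
function dataUriMime(string $uri): ?string
{
    if (strncasecmp($uri, 'data:', 5) !== 0) {
        return null; // not a data URI
    }

    $comma = strpos($uri, ',');
    if ($comma === false) {
        return null; // malformed: the payload separator is missing
    }

    // Everything between "data:" and the first comma is metadata.
    $meta = substr($uri, 5, $comma - 5);
    $mime = strtolower(trim(explode(';', $meta)[0]));

    // RFC 2397: an omitted media type defaults to text/plain.
    return $mime === '' ? 'text/plain' : $mime;
}
```

The declared type is only a claim made by whoever built the URI, so it is reasonable to treat it as a hint and still sniff the decoded payload bytes.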
Detection order
- finfo (MIME) → family refinements:
  - ZIP (docx/xlsx/pptx/epub/odf/jar/apk/3mf/kmz),
  - EBML (webm/mkv),
  - ISO-BMFF (mp4/m4a/heic/avif).

  If MIME is weak (application/octet-stream, text/plain, inode/x-empty, etc.), we prefer extension if reliable.
- Magic numbers → same refinements.
- Extension (fallback).
- URL: HEAD (Content-Type) + Content-Disposition filename + path extension.
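The weak-MIME rule in the order above can be sketched as a small pure function. This is a simplified illustration under stated assumptions: the `WEAK_MIMES` list, the tiny `EXT_TO_MIME` table, and the `resolveMime` name are all invented for the example and do not mirror the package's internals.

```php
<?php
declare(strict_types=1);

// Illustrative only: both tables are assumptions made for this sketch.
const WEAK_MIMES = ['application/octet-stream', 'text/plain', 'inode/x-empty'];
const EXT_TO_MIME = ['csv' => 'text/csv', 'md' => 'text/markdown', 'png' => 'image/png'];

/**
 * Picks between the finfo result and the extension lookup.
 *
 * @return array{mime: string, source: string}
 */
function resolveMime(string $finfoMime, string $filename): array
{
    $ext = strtolower(pathinfo($filename, PATHINFO_EXTENSION));
    $byExt = EXT_TO_MIME[$ext] ?? null;

    // A specific MIME from finfo wins outright.
    if (!in_array($finfoMime, WEAK_MIMES, true)) {
        return ['mime' => $finfoMime, 'source' => 'mime'];
    }

    // A weak MIME yields to a known extension.
    if ($byExt !== null) {
        return ['mime' => $byExt, 'source' => 'extension'];
    }

    // Nothing better is available: keep the weak MIME.
    return ['mime' => $finfoMime, 'source' => 'mime'];
}
```

For example, `resolveMime('text/plain', 'data.csv')` prefers `text/csv` from the extension, while `resolveMime('image/png', 'photo.jpg')` keeps the MIME reported from the file contents.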
🧩 Enums
FileCategory: IMAGE, VIDEO, AUDIO, DOCUMENT, SPREADSHEET, PRESENTATION, PDF, ARCHIVE, EXECUTABLE, FONT, DISK_IMAGE, TORRENT, CODE, TEXT, EBOOK, CONTACT, CALENDAR, SUBTITLE, CERTIFICATE, GIS, 3D_MODEL, CAD, RAW_IMAGE, UNKNOWN.
Example:
use AndreInocenti\FileTypeDetector\Enums\FileCategory;

if ($result->category === FileCategory::IMAGE->value) {
    // process image
}
🔧 Overrides (external config)
Inject your own mappings without touching the core:
use AndreInocenti\FileTypeDetector\FileTypeDetector;
use AndreInocenti\FileTypeDetector\Enums\FileCategory;

$extOverrides = [
    'abc' => 'application/x-custom', // map ".abc" to x-custom
];

$mimeCatOverrides = [
    'application/x-custom' => FileCategory::DOCUMENT, // category for your custom MIME
];

$detector = new FileTypeDetector(
    http: null, // default HTTP client
    extToMimeOverrides: $extOverrides, // your mappings
    mimeToCategoryOverrides: $mimeCatOverrides
);

$res = $detector->detect('/files/report.abc');
Override tips
- Use overrides for proprietary types, internal extensions, or to normalize categories in your domain.
- Overrides take precedence over package defaults.
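One plausible way to get "overrides win over defaults" is a keyed array merge. The sketch below is an assumption about how such precedence could be implemented, not a description of the package's code. `mergeExtMap`, the sample tables, and the key normalization step are all hypothetical.

```php
<?php
declare(strict_types=1);

// Assumed sample tables for illustration; real default tables would be larger.
$defaults = ['png' => 'image/png', 'txt' => 'text/plain'];
$overrides = ['.ABC' => 'application/x-custom', 'txt' => 'text/x-log'];

/**
 * Merges user overrides into a default extension-to-MIME map.
 *
 * @param array<string, string> $defaults
 * @param array<string, string> $overrides
 * @return array<string, string>
 */
function mergeExtMap(array $defaults, array $overrides): array
{
    // Normalize keys so ".ABC" and "abc" address the same entry.
    $normalized = [];
    foreach ($overrides as $ext => $mime) {
        $normalized[strtolower(ltrim((string) $ext, '.'))] = $mime;
    }

    // array_replace keeps the defaults and lets later arrays win on key clashes.
    return array_replace($defaults, $normalized);
}

$map = mergeExtMap($defaults, $overrides);
```

With the sample tables, `$map` keeps `png` from the defaults, replaces `txt` with the override, and gains the new `abc` entry.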
🛡️ Security hardening (URLs)
The package ships with NativeHttpClient + SecurityOptions:
use AndreInocenti\FileTypeDetector\FileTypeDetector;
use AndreInocenti\FileTypeDetector\Config\SecurityOptions;
use AndreInocenti\FileTypeDetector\Http\NativeHttpClient;

$security = new SecurityOptions(
    allowPrivateNetworks: false, // blocks 10.0.0.0/8, 192.168.0.0/16, 127.0.0.1, ::1, etc.
    allowedSchemes: ['https'], // HTTPS only
    allowedHosts: ['cdn.yoursite.com'], // optional allow-list
    blockedHosts: ['example-insecure.com'], // optional block-list
    timeout: 8, // seconds
    maxRedirects: 3, // manual redirects
    userAgent: 'Yoursite-FileTypeDetector/1.0'
);

$http = new NativeHttpClient($security);
$detector = new FileTypeDetector($http, [], [], $security);

$res = $detector->detect('https://cdn.yoursite.com/file');
Protected by default:
- SSRF: resolves DNS → validates public IPs only (when allowPrivateNetworks=false).
- Redirects: handled manually and re-validated at every hop (scheme/host/IP).
- Protocols: http/https only by default.
- Timeouts and User-Agent configurable.
- Allow-/block-lists by host (optional).
If you must reach internal hosts (intranet), set allowPrivateNetworks=true only in trusted environments.
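The "public IPs only" check described above can be approximated with PHP's built-in IP filter flags. This is a minimal sketch, assuming a resolve-then-validate approach. `isPublicIp` and `hostIsSafe` are hypothetical names, and the package's own validation may differ.

```php
<?php
declare(strict_types=1);

/**
 * Sketch of a public-IP check (hypothetical helper, not the package's API).
 * Rejects private ranges such as 10.0.0.0/8 and reserved ranges such as
 * 127.0.0.0/8 and 169.254.0.0/16.
 */
function isPublicIp(string $ip): bool
{
    return filter_var(
        $ip,
        FILTER_VALIDATE_IP,
        FILTER_FLAG_NO_PRIV_RANGE | FILTER_FLAG_NO_RES_RANGE
    ) !== false;
}

/**
 * True when the host resolves and every resolved address is public.
 * gethostbynamel() returns IPv4 addresses only, so this sketch does not
 * cover hosts that publish IPv6 records.
 */
function hostIsSafe(string $host): bool
{
    // IP literals are checked directly; hostnames are resolved first.
    $ips = filter_var($host, FILTER_VALIDATE_IP) !== false
        ? [$host]
        : (gethostbynamel($host) ?: []);

    if ($ips === []) {
        return false; // unresolvable hosts are rejected
    }

    foreach ($ips as $ip) {
        if (!isPublicIp($ip)) {
            return false;
        }
    }

    return true;
}
```

A check like this is only as good as its timing: if the HTTP client resolves the hostname again when connecting, the answer may have changed, so connecting to the already validated IP is the safer design.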
🧪 Tests
Uses Pest + PHPUnit CodeCoverage.
composer install
vendor/bin/pest
# Coverage
XDEBUG_MODE=coverage vendor/bin/pest --coverage
📚 API (quick reference)
FileTypeDetector::__construct(
?HttpClientInterface $http = null,
array $extToMimeOverrides = [],
array $mimeToCategoryOverrides = [],
?SecurityOptions $security = null
)
- $http: HTTP client for HEAD; defaults to NativeHttpClient with SecurityOptions.
- $extToMimeOverrides: override extension→MIME.
- $mimeToCategoryOverrides: override MIME→Category.
- $security: HTTP security policy.
FileTypeDetector::detect(string $input): FileTypeResult
- $input: local path, URL (http/https), or data: URI.
- Returns: category, mime, extension, confidence, source.
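Because `detect()` accepts three kinds of input through one string parameter, some dispatch step has to tell them apart. The sketch below shows one plausible way to do that. `classifyInput` and its return labels are assumptions made for this example, not the package's API.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical classifier (not the package's code).
 * Returns 'data-uri', 'url', 'unsupported', or 'path'.
 */
function classifyInput(string $input): string
{
    if (strncasecmp($input, 'data:', 5) === 0) {
        return 'data-uri';
    }

    $scheme = parse_url($input, PHP_URL_SCHEME);
    if (is_string($scheme) && in_array(strtolower($scheme), ['http', 'https'], true)) {
        return 'url';
    }

    // Any other scheme://... form is refused rather than treated as a file,
    // so stream wrappers such as ftp:// or php:// are never opened.
    if (str_contains($input, '://')) {
        return 'unsupported';
    }

    return 'path';
}
```

Refusing unknown schemes up front keeps the local-path branch from handing attacker-controlled strings to filesystem functions that understand stream wrappers.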
Confidence Level
ConfidenceLevel: HIGH (0.95), MEDIUM (0.75), LOW (0.5), VERY_LOW (0.25), NONE (0.0).
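A typical use of these levels is gating: accept a result only when its confidence reaches a chosen floor. The sketch below uses a stand-in enum named `Confidence` with the scores listed above. It is an assumption for illustration and the package's real ConfidenceLevel type may be shaped differently.

```php
<?php
declare(strict_types=1);

// Stand-in enum: PHP backed enums accept only int or string values, so this
// sketch exposes the float scores through a method instead.
enum Confidence
{
    case HIGH;
    case MEDIUM;
    case LOW;
    case VERY_LOW;
    case NONE;

    public function score(): float
    {
        return match ($this) {
            self::HIGH => 0.95,
            self::MEDIUM => 0.75,
            self::LOW => 0.5,
            self::VERY_LOW => 0.25,
            self::NONE => 0.0,
        };
    }

    /** Maps a raw score to the highest level it reaches. */
    public static function fromScore(float $score): self
    {
        // cases() returns the levels in declaration order, highest first.
        foreach (self::cases() as $level) {
            if ($score >= $level->score()) {
                return $level;
            }
        }

        return self::NONE;
    }
}

/** True when a detection's confidence meets the required minimum level. */
function isTrusted(float $confidence, Confidence $min = Confidence::MEDIUM): bool
{
    return $confidence >= $min->score();
}
```

An upload validator might call `isTrusted($result->confidence)` and reject or re-check anything that only reached the extension fallback.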
📄 License
MIT — free for commercial and open-source projects.
🙋 Support & Contributions
- Issues and PRs are welcome.
- Share real-world “tricky files” to help expand coverage and heuristics.