rjt / title-case
Smart capitalization for names, titles, and addresses.
Requires
- php: >=8.1
Requires (Dev)
- phpstan/phpstan: ^1.11
- phpunit/phpunit: ^10
Suggests
- ext-mbstring: For best Unicode casing behavior (recommended).
This package is auto-updated.
Last update: 2026-03-27 17:23:12 UTC
README
Smart capitalization for names, titles, and addresses — with sensible heuristics and a tiny API surface.
- Vendor / Packagist: rjt
- Package: rjt/title-case
- Namespace root: RJT\TitleCase
- Public API: RJT\TitleCase\titleCase() / RJT\TitleCase\nameCase()
- License: MIT
Installation
composer require rjt/title-case
Usage
Titles / addresses (default)
<?php
declare(strict_types=1);

use function RJT\TitleCase\titleCase;

// Customer support / email subject lines
echo titleCase("Re: assistance with account settings options");
// "Re: Assistance with Account Settings Options"

// Addresses
echo titleCase("123 nw 5th st, apt 255");
// "123 NW 5th St, Apt 255"
Names
<?php
declare(strict_types=1);

use function RJT\TitleCase\nameCase;

echo nameCase("JOHN VAN DER WAALS");
// "John van der Waals"

echo nameCase("o'connor");
// "O'Connor"
If you prefer a single entrypoint, titleCase($input, true) applies name rules
(same as nameCase()), but nameCase() is recommended for readability.
Background and guidance
This project started as a small helper to normalize user-entered names in a CRM, where input is often inconsistent or messy. It was later expanded to handle support ticket / email subject titles and common address capitalization (e.g., PO Box, NW, RR) in a practical, “do the right thing” way.
Good use cases:
- Display-layer cleanup: showing consistent names/subjects in a UI (CRM contact lists, ticket queues, inboxes).
- Import/cleanup workflows: normalizing CSV imports or form input before review.
- Derived fields: generating a “display name” / “display subject” value while keeping the original raw input.
Usually not a good idea:
- Overwriting canonical stored user data without keeping the original (people/brands often have intentional casing).
- Treating this as postal address standardization/validation (it’s formatting, not verification).
- Assuming it’s locale-perfect for every language and naming convention.
If you need organization- or customer-specific casing (product names, acronyms, preferred spellings), use Overrides to enforce those consistently.
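The "derived fields" guidance above can be sketched as follows. This is a minimal illustration, not part of the package: withDisplayValue() is a hypothetical helper, and the stand-in formatter only exists so the snippet runs without the package installed. In real code you would likely pass a closure wrapping nameCase() or titleCase().

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper (not part of rjt/title-case): keeps the raw input
 * untouched and stores the formatted value in a separate field.
 *
 * @param callable(string): string $formatter e.g. a closure wrapping nameCase()
 * @return array{raw: string, display: string}
 */
function withDisplayValue(string $raw, callable $formatter): array
{
    return [
        'raw'     => $raw,                   // original value, never overwritten
        'display' => $formatter(trim($raw)), // derived value, safe to regenerate
    ];
}

// Stand-in formatter (assumption) so the sketch is self-contained.
$standIn = static fn (string $s): string => ucwords(strtolower($s));

$contact = withDisplayValue('  JANE DOE ', $standIn);
echo $contact['display'], PHP_EOL; // "Jane Doe"
```

Because the raw value is stored alongside the display value, the display field can be regenerated later if the casing rules or overrides change.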
Overrides
Use overrides to force specific words or phrases to a preferred final capitalization.
Overrides file example:
[common]
NASA
U.S.A.

[title]
bell hooks
Hooks
GitHub.com

[names]
bell hooks
MacVicar
Notes:
- Each non-empty line is the preferred final capitalization for that word or phrase.
- Matching is case-insensitive; output uses exactly the capitalization written in the file.
- Leading/trailing whitespace is ignored; blank lines and comment lines (# or ;) are allowed.
- Phrase overrides take precedence over word overrides (e.g., bell hooks wins over Hooks).
- [common] entries apply to both title and name modes; [title] / [names] override [common] on conflicts.
- = and => are treated as literal characters, not mappings.
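To make the file rules above concrete, here is a rough sketch of how such a file could be read. It is not the library's Overrides class: parseOverridesSketch() is a hypothetical function, it assumes entries before any section header belong to [common], and it does not implement phrase-over-word precedence (that would happen at matching time).

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical parser sketch (not the library's Overrides class).
 * Returns lowercase lookup key => preferred capitalization for one mode.
 *
 * @return array<string, string>
 */
function parseOverridesSketch(string $text, string $mode): array
{
    $sections = ['common' => [], 'title' => [], 'names' => []];
    if ($mode !== 'title' && $mode !== 'names') {
        throw new \InvalidArgumentException('Mode must be "title" or "names".');
    }
    $current = 'common'; // assumption: headerless entries count as common

    foreach (preg_split('/\R/', $text) ?: [] as $line) {
        $line = trim($line); // leading/trailing whitespace is ignored
        if ($line === '' || $line[0] === '#' || $line[0] === ';') {
            continue; // blank line or comment line
        }
        if (preg_match('/^\[(common|title|names)\]$/i', $line, $m) === 1) {
            $current = strtolower($m[1]);
            continue;
        }
        // "=" and "=>" are not special: the whole line is the entry.
        $sections[$current][strtolower($line)] = $line;
    }

    // Array union keeps the left-hand value, so the mode section wins.
    return $sections[$mode] + $sections['common'];
}

$text  = "[common]\nNASA\ngithub\n\n; a comment\n[title]\nGitHub\n";
$title = parseOverridesSketch($text, 'title');
echo $title['github'], PHP_EOL; // "GitHub" ([title] wins over [common])
echo $title['nasa'], PHP_EOL;   // "NASA"
```

Keying the map by the lowercased entry is what gives case-insensitive matching while the stored value keeps the exact capitalization from the file.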
Usage:
<?php
declare(strict_types=1);

use RJT\TitleCase\Overrides;
use function RJT\TitleCase\nameCase;
use function RJT\TitleCase\titleCase;

$overrides = Overrides::fromFile(__DIR__ . '/overrides.ini');
// or: Overrides::fromString($iniString);

echo titleCase("meet hooks today", overrides: $overrides);
// "Meet Hooks Today"

echo nameCase("bell hooks", "UTF-8", $overrides);
// "bell hooks"
Legacy API:
Overrides::fromIniFile() / Overrides::fromIniString() are retained for backward compatibility. Prefer Overrides::fromFile() / Overrides::fromString() going forward.
Limitations:
- Overrides are intended only for capitalization (case) control.
- They do not support remapping spelling/punctuation to a different string. For example, you cannot map usa to U.S.A. unless your canonical entry is exactly U.S.A. (which will match u.s.a. but not usa).
Highlights
This library is opinionated and practical: it aims to “do the right thing” for common real-world input without requiring configuration or a large API.
Token-aware parsing
Input is split into meaningful token types so casing rules can be applied safely:
- Words with internal apostrophes/hyphens (including common Unicode hyphens).
- Dotted initialisms like u.s.a. → U.S.A. (with a small canonical exception list like ph.d. → Ph.D.).
- Dot-words like node.js / react.tsx → Node.js / React.tsx (left side cased, suffix preserved via a small allowlist).
- Compounds using &, /, + like r&d, input/output, api+sdk, c++:
  - & / +: acronym-like short segments are uppercased (R&D, API+SDK), otherwise segments are title-cased (Rock+Roll).
  - /: segments are title-cased (Input/Output).
  - Some special cases are preserved as-is (e.g., c/o, and/or).
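The compound rules above can be approximated in a few lines. This is a sketch only, not the library's tokenizer: caseCompoundSketch() is a hypothetical function, it is ASCII-only, and it assumes "acronym-like short" means three letters or fewer, which the package may define differently.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical sketch of compound casing for tokens joined by &, / or +.
 * Assumption: a segment of three letters or fewer is "acronym-like".
 */
function caseCompoundSketch(string $token): string
{
    $lower = strtolower($token);
    if (in_array($lower, ['c/o', 'and/or'], true)) {
        return $token; // special cases are preserved as typed
    }

    // Split on the separators but keep them, so parts alternate
    // segment, separator, segment, ...
    $parts = preg_split('/([&\/+])/', $lower, -1, PREG_SPLIT_DELIM_CAPTURE) ?: [];
    $usesSlash = str_contains($lower, '/');

    $out = '';
    foreach ($parts as $i => $part) {
        if ($i % 2 === 1) {
            $out .= $part; // separator, copied through unchanged
            continue;
        }
        $acronymLike = !$usesSlash && strlen($part) <= 3;
        $out .= $acronymLike ? strtoupper($part) : ucfirst($part);
    }

    return $out;
}

echo caseCompoundSketch('r&d'), PHP_EOL;          // "R&D"
echo caseCompoundSketch('rock+roll'), PHP_EOL;    // "Rock+Roll"
echo caseCompoundSketch('input/output'), PHP_EOL; // "Input/Output"
```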
Titles & addresses ($isName === false)
- Minor words (e.g., of, to, and, vs) are lowercased only when they’re interior.
- Segment-aware capitalization: minor words are not forced lowercase at the start of a new segment (after :, —, ?, !, etc.) or after a parenthesis restart, so you get results like:
  - War: out of the box → War: Out of the Box
  - (in brief) → (In Brief)
- Preserves acronyms in ALL CAPS (e.g., NASA, ESA) when they aren’t minor words.
- Uppercases address tokens: PO, RR, NE, NW, SE, SW.
- Email-friendly: keeps at lowercase when it introduces an email address (e.g., Email me at jane@example.com).
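As an illustration of the interior-minor-word and segment-restart rules above, here is a simplified sketch. It is not the package's implementation: titleCaseSketch() and its minor-word list are assumptions, it is ASCII-only, and it ignores acronyms, address tokens, and emails entirely.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical sketch of segment-aware minor-word handling.
 * Not the library's implementation; ASCII-only, no acronym preservation.
 */
function titleCaseSketch(string $input): string
{
    // Assumed minor-word list; the package's real list is not shown here.
    $minor = ['a', 'an', 'and', 'in', 'of', 'the', 'to', 'vs', 'with'];

    $words = preg_split('/\s+/', trim($input)) ?: [];
    $last  = count($words) - 1;
    $startsSegment = true; // the first word always starts a segment
    $out = [];

    foreach ($words as $i => $word) {
        $lower   = strtolower($word);
        $core    = trim($lower, "()\"',.:;?!"); // word without edge punctuation
        $restart = $startsSegment || str_starts_with($word, '(');

        if (!$restart && $i !== $last && in_array($core, $minor, true)) {
            $out[] = $lower; // interior minor word stays lowercase
        } else {
            // Uppercase the first letter, skipping a leading "(" or quote.
            $out[] = preg_replace_callback(
                '/[a-z]/',
                static fn (array $m): string => strtoupper($m[0]),
                $lower,
                1
            ) ?? $lower;
        }

        // A trailing ":", "?", "!" or U+2014 dash opens a new segment.
        $startsSegment = preg_match('/(?:[:?!]|\x{2014})$/u', $word) === 1;
    }

    return implode(' ', $out);
}

echo titleCaseSketch('war: out of the box'), PHP_EOL; // "War: Out of the Box"
echo titleCaseSketch('(in brief)'), PHP_EOL;          // "(In Brief)"
```

Tracking a single "starts a segment" flag is enough to keep a minor word capitalized after a colon while still lowercasing it mid-phrase.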
Names ($isName === true)
- Lowercases particles/articles anywhere (including multi-word phrases), e.g.:
  - van, von, de, del, der, al, bin, ibn, and phrases like de la, van der, von dem, etc.
- Surname prefix bi-capitalization for common cases (in addition to Mc…), with conservative lists for prefixes like:
  - Mac…, De…, Di…, Du…, La…, Le…, Van… (e.g., DiCaprio, DeMarco, MacIntyre when applicable).
- Apostrophe prefixes for names are handled narrowly:
  - o'connor → O'Connor, d'artagnan → D'Artagnan (without turning contractions like y'all into Y'All).
- Does not preserve all-caps acronyms by default (many names are entered as ALL CAPS).
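A stripped-down sketch of the name rules above might look like this. It is illustrative only: nameCaseSketch() is a hypothetical function, it is ASCII-only, it uses just the single-word particles listed above, and it covers Mc… and the O'/D' prefixes while leaving out the conservative Mac…/De…/Di… lists.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical sketch of name casing: lowercase particles, Mc prefix,
 * and narrowly matched O'/D' apostrophe prefixes. ASCII-only.
 */
function nameCaseSketch(string $input): string
{
    $particles = ['van', 'von', 'de', 'del', 'der', 'al', 'bin', 'ibn'];
    $out = [];

    foreach (preg_split('/\s+/', trim($input)) ?: [] as $word) {
        $lower = strtolower($word); // ALL CAPS input is not preserved

        if (in_array($lower, $particles, true)) {
            $out[] = $lower; // particles stay lowercase anywhere
        } elseif (preg_match("/^([od])'([a-z].*)$/", $lower, $m) === 1) {
            // Only O' and D' count as name prefixes, so y'all is untouched.
            $out[] = strtoupper($m[1]) . "'" . ucfirst($m[2]);
        } elseif (preg_match('/^mc([a-z].*)$/', $lower, $m) === 1) {
            $out[] = 'Mc' . ucfirst($m[1]);
        } else {
            $out[] = ucfirst($lower);
        }
    }

    return implode(' ', $out);
}

echo nameCaseSketch('JOHN VAN DER WAALS'), PHP_EOL; // "John van der Waals"
echo nameCaseSketch("o'connor"), PHP_EOL;           // "O'Connor"
```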
In both modes
- Fixes possessive endings: "It'S" / "IT’S" → "It's" / "It’s".
- Preserves “intentional” mixed-case tokens like iPhone / eBay.
- Uppercases Roman numerals when appropriate (mode-aware).
- Uses Unicode-aware casing when ext-mbstring is available (recommended); otherwise falls back to basic casing behavior.
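The possessive fix above can be sketched with a single regular expression. This is an assumption about one way to do it, not the package's code: fixPossessiveSketch() is a hypothetical function that only lowercases a capital S following a straight or curly apostrophe, and leaves the rest of the word alone.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical sketch: lowercase a trailing possessive "S" that follows
 * a letter and a straight (') or curly (U+2019) apostrophe.
 */
function fixPossessiveSketch(string $text): string
{
    // Lookbehind requires a letter; \b keeps words like "It'Self" untouched.
    return preg_replace('/(?<=\p{L})([\'\x{2019}])S\b/u', '$1s', $text) ?? $text;
}

echo fixPossessiveSketch("It'S"), PHP_EOL;   // "It's"
echo fixPossessiveSketch("NASA'S"), PHP_EOL; // "NASA's"
```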
Behavior Notes
- All-caps input normalization: when the entire input is ALL CAPS, words are title-cased instead of preserved as acronyms, except for dotted initialisms, symbol/digit compounds, and a small whitelist of common acronyms (e.g., NASA, API, HTTP).
- Acronym possessives: all-caps possessives keep the acronym uppercased while lowercasing the trailing s (NASA's, NASA’S).
- URLs/domains/emails: preserved exactly as typed, but still count as words for position logic (so Go to example.com, not Go To example.com).
- Address-like strings: USPS state/territory abbreviations are uppercased when followed by a ZIP code or after a city comma (e.g., Portland, OR 97201).
Non-goals
This package intentionally does not try to be:
- A full linguistic titlecasing engine for every language and locale.
- A dictionary-based proper-noun corrector (it won’t “know” that iPad should always be iPad unless you typed it that way).
- A normalization system for punctuation, quotes, or escape sequences (no stripslashes).
Development
composer install
composer test
composer analyse
License
MIT © RJT Media Group LLC