rjt/title-case

Smart capitalization for names, titles, and addresses.

Package info

github.com/rjt-labs/title-case

pkg:composer/rjt/title-case

dev-main 2026-01-27 13:00 UTC

Last update: 2026-03-27 17:23:12 UTC


README

Smart capitalization for names, titles, and addresses — with sensible heuristics and a tiny API surface.

  • Vendor / Packagist: rjt
  • Package: rjt/title-case
  • Namespace root: RJT\TitleCase
  • Public API: RJT\TitleCase\titleCase() / RJT\TitleCase\nameCase()
  • License: MIT

Installation

composer require rjt/title-case

Usage

Titles / addresses (default)

<?php

declare(strict_types=1);

use function RJT\TitleCase\titleCase;

// Customer support / email subject lines
echo titleCase("Re: assistance with account settings options");
// "Re: Assistance with Account Settings Options"

// Addresses
echo titleCase("123 nw 5th st, apt 255");
// "123 NW 5th St, Apt 255"

Names

<?php

declare(strict_types=1);

use function RJT\TitleCase\nameCase;

echo nameCase("JOHN VAN DER WAALS");
// "John van der Waals"

echo nameCase("o'connor");
// "O'Connor"

If you prefer a single entrypoint, titleCase($input, true) applies name rules (same as nameCase()), but nameCase() is recommended for readability.

Background and guidance

This project started as a small helper to normalize user-entered names in a CRM, where input is often inconsistent or messy. It was later expanded to handle support ticket / email subject titles and common address capitalization (e.g., PO Box, NW, RR) in a practical, “do the right thing” way.

Good use cases:

  • Display-layer cleanup: showing consistent names/subjects in a UI (CRM contact lists, ticket queues, inboxes).
  • Import/cleanup workflows: normalizing CSV imports or form input before review.
  • Derived fields: generating a “display name” / “display subject” value while keeping the original raw input.
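One way to follow the derived-field pattern is to store the formatted value next to the raw one instead of replacing it. The sketch below is illustrative only: withDisplayName() and the name / display_name keys are hypothetical, and the formatter is passed in as a callable so the package's nameCase() function (or any stand-in) can be supplied.

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical helper: adds a derived display value and leaves the raw input untouched.
 */
function withDisplayName(array $contact, callable $formatter): array
{
    $contact['display_name'] = $formatter($contact['name']);

    return $contact;
}

// Stand-in formatter so the sketch runs without the package installed;
// in a real project you would pass the package's nameCase() function instead.
$standIn = static fn (string $raw): string => ucwords(strtolower($raw));

$contact = withDisplayName(['name' => 'JANE DOE'], $standIn);
// $contact['name'] is still "JANE DOE"; $contact['display_name'] is "Jane Doe".
```

Keeping both values means a later change to the casing rules (or an override) can be re-applied without losing what the user originally typed.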

Usually not a good idea:

  • Overwriting canonical stored user data without keeping the original (people/brands often have intentional casing).
  • Treating this as postal address standardization/validation (it’s formatting, not verification).
  • Assuming it’s locale-perfect for every language and naming convention.

If you need organization- or customer-specific casing (product names, acronyms, preferred spellings), use Overrides to enforce those consistently.

Overrides

Use overrides to force specific words or phrases to a preferred final capitalization.

Overrides file example:

[common]
NASA
U.S.A.

[title]
bell hooks
Hooks
GitHub.com

[names]
bell hooks
MacVicar

Notes:

  • Each non-empty line is the preferred final capitalization for that word or phrase.
  • Matching is case-insensitive; output uses exactly the capitalization written in the file.
  • Leading/trailing whitespace is ignored; blank lines and comment lines (# or ;) are allowed.
  • Phrase overrides take precedence over word overrides (e.g., bell hooks wins over Hooks).
  • [common] entries apply to both title and name modes; [title]/[names] override [common] on conflicts.
  • = and => are treated as literal characters, not mappings.
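To make the file format rules above concrete, here is a minimal sketch of a parser that follows them. It is not the package's implementation: parseOverrides() is a hypothetical name, and ignoring entries that appear before the first section header is an assumption, since the notes above do not say what happens in that case.

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical parser for the overrides format described above.
 * Returns [section => [lowercased entry => entry as written]].
 */
function parseOverrides(string $text): array
{
    $sections = [];
    $current = null;

    foreach (preg_split('/\R/', $text) as $line) {
        $line = trim($line); // leading/trailing whitespace is ignored

        // Blank lines and comment lines (# or ;) are skipped.
        if ($line === '' || $line[0] === '#' || $line[0] === ';') {
            continue;
        }

        // Section header such as [common], [title], or [names].
        if (preg_match('/^\[(\w+)\]$/', $line, $m) === 1) {
            $current = strtolower($m[1]);
            $sections[$current] ??= [];
            continue;
        }

        // Assumption: entries before the first section header are ignored.
        if ($current === null) {
            continue;
        }

        // The whole line is the entry; "=" and "=>" are kept as literal characters.
        // The lowercased key supports case-insensitive lookup.
        $sections[$current][strtolower($line)] = $line;
    }

    return $sections;
}

$parsed = parseOverrides("[common]\nNASA\n# a comment\n\n[title]\n  GitHub.com  \n");
// $parsed['common'] is ['nasa' => 'NASA'];
// $parsed['title'] is ['github.com' => 'GitHub.com'].
```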

Usage:

<?php

declare(strict_types=1);

use RJT\TitleCase\Overrides;
use function RJT\TitleCase\nameCase;
use function RJT\TitleCase\titleCase;

$overrides = Overrides::fromFile(__DIR__ . '/overrides.ini');
// or: Overrides::fromString($iniString);

echo titleCase("meet hooks today", overrides: $overrides);
// "Meet Hooks Today"

echo nameCase("bell hooks", "UTF-8", $overrides);
// "bell hooks"

Legacy API:

  • Overrides::fromIniFile() / Overrides::fromIniString() are retained for backward compatibility. Prefer Overrides::fromFile() / Overrides::fromString() going forward.

Limitations:

  • Overrides are intended only for capitalization (case) control.
  • They do not support remapping spelling/punctuation to a different string. For example, you cannot map usa to U.S.A. unless your canonical entry is exactly U.S.A. (which will match u.s.a. but not usa).
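The matching rules (case-insensitive, phrases before words, no spelling remaps) can be illustrated with a small standalone sketch. applyOverrides() below is a hypothetical function, not part of the package's API, and it works as a simple post-pass over a string; the library may apply overrides differently.

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical post-pass: replaces case-insensitive matches of each entry
 * with the entry exactly as written. Longer entries are tried first, so a
 * phrase wins over a single word it contains.
 */
function applyOverrides(string $input, array $entries): string
{
    if ($entries === []) {
        return $input;
    }

    // Longest first, so "bell hooks" is tried before "Hooks".
    usort($entries, static fn (string $a, string $b): int => strlen($b) <=> strlen($a));

    $canonical = [];
    $alternatives = [];
    foreach ($entries as $entry) {
        $canonical[strtolower($entry)] = $entry;
        $alternatives[] = preg_quote($entry, '/');
    }

    // Entries must not be glued to other word characters on either side.
    $pattern = '/(?<!\w)(?:' . implode('|', $alternatives) . ')(?!\w)/i';

    return preg_replace_callback(
        $pattern,
        static fn (array $m): string => $canonical[strtolower($m[0])],
        $input
    ) ?? $input;
}

echo applyOverrides('Meet Bell Hooks Today', ['Hooks', 'bell hooks']), "\n";
// "Meet bell hooks Today"

echo applyOverrides('the u.s.a. and usa', ['U.S.A.']), "\n";
// "the U.S.A. and usa" (usa is not a case variant of U.S.A., so it is left alone)
```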

Highlights

This library is opinionated and practical: it aims to “do the right thing” for common real-world input without requiring configuration or a large API.

Token-aware parsing

Input is split into meaningful token types so casing rules can be applied safely:

  • Words with internal apostrophes/hyphens (including common Unicode hyphens).

  • Dotted initialisms like u.s.a. → U.S.A. (with a small canonical exception list like ph.d. → Ph.D.).

  • Dot-words like node.js / react.tsx → Node.js / React.tsx (left side cased, suffix preserved via a small allowlist).

  • Compounds using &, /, + like r&d, input/output, api+sdk, c++:

    • & / +: acronym-like short segments are uppercased (R&D, API+SDK), otherwise segments are title-cased (Rock+Roll).
    • /: segments are title-cased (Input/Output).
    • Some special cases are preserved as-is (e.g., c/o, and/or).
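The compound rules above can be sketched in a few lines. This is an illustration under stated assumptions, not the library's code: caseCompound() is a hypothetical name, "acronym-like short" is assumed to mean three letters or fewer, and the sketch is ASCII-only.

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical sketch of compound casing for tokens joined by &, / or +.
 * Assumption: a segment counts as acronym-like when it has at most
 * $maxAcronymLength letters (the README does not state the threshold).
 */
function caseCompound(string $token, int $maxAcronymLength = 3): string
{
    // Special cases are preserved as-is.
    if (in_array(strtolower($token), ['c/o', 'and/or'], true)) {
        return $token;
    }

    $usesAmpOrPlus = strpbrk($token, '&+') !== false;

    // With DELIM_CAPTURE, even indexes are segments and odd indexes are separators.
    $parts = preg_split('~([&/+])~', strtolower($token), -1, PREG_SPLIT_DELIM_CAPTURE);

    $out = '';
    foreach ($parts as $i => $part) {
        if ($i % 2 === 1) {
            $out .= $part; // separator, copied through unchanged
            continue;
        }

        $out .= ($usesAmpOrPlus && strlen($part) <= $maxAcronymLength)
            ? strtoupper($part)  // short segment in an & / + compound
            : ucfirst($part);    // everything else is title-cased
    }

    return $out;
}

echo caseCompound('r&d'), "\n";          // "R&D"
echo caseCompound('api+sdk'), "\n";      // "API+SDK"
echo caseCompound('rock+roll'), "\n";    // "Rock+Roll"
echo caseCompound('input/output'), "\n"; // "Input/Output"
echo caseCompound('c/o'), "\n";          // "c/o"
```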

Titles & addresses ($isName === false)

  • Minor words (e.g., of, to, and, vs) are lowercased only when they’re interior.

  • Segment-aware capitalization: minor words are not forced lowercase at the start of a new segment (after :, ?, !, etc.) or after a parenthesis restart, so you get results like:

    • War: out of the box → War: Out of the Box
    • (in brief) → (In Brief)
  • Preserves acronyms in ALL CAPS (e.g., NASA, ESA) when they aren’t minor words.

  • Uppercases address tokens: PO, RR, NE, NW, SE, SW.

  • Email-friendly: keeps at lowercase when it introduces an email address (e.g., Email me at jane@example.com).
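As a rough illustration of the interior-only rule for minor words and the segment restart, here is a deliberately simplified sketch. sketchTitleCase() and its minor-word list are assumptions made for this example; it lowercases everything first, so it does not cover acronym preservation, address tokens, or the email rule described above.

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical, simplified title-casing pass: minor words are lowercased
 * only when interior, and a word after ":", "?" or "!" restarts a segment.
 */
function sketchTitleCase(string $input): string
{
    // Assumed minor-word list; the library's actual list may differ.
    $minor = ['a', 'an', 'and', 'in', 'of', 'out', 'the', 'to', 'vs', 'with'];

    $words = preg_split('/\s+/', trim($input), -1, PREG_SPLIT_NO_EMPTY);
    $last = count($words) - 1;
    $startsSegment = true;
    $out = [];

    foreach ($words as $i => $word) {
        $lower = strtolower($word);
        $bare = trim($lower, "()\"'.,:;?!"); // the word without surrounding punctuation
        $isInterior = !$startsSegment && $i !== $last;

        if ($isInterior && in_array($bare, $minor, true)) {
            $out[] = $lower;
        } else {
            // Uppercase the first letter, skipping any leading "(" or quote.
            $out[] = preg_replace_callback(
                '/[a-z]/',
                static fn (array $m): string => strtoupper($m[0]),
                $lower,
                1
            );
        }

        // The next word starts a new segment if this one ends with :, ? or !.
        $startsSegment = preg_match('/[:?!]$/', $word) === 1;
    }

    return implode(' ', $out);
}

echo sketchTitleCase('war: out of the box'), "\n";
// "War: Out of the Box"

echo sketchTitleCase('(in brief)'), "\n";
// "(In Brief)"
```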

Names ($isName === true)

  • Lowercases particles/articles anywhere (including multi-word phrases), e.g.:

    • van, von, de, del, der, al, bin, ibn, and phrases like de la, van der, von dem, etc.
  • Surname prefix bi-capitalization for common cases (in addition to Mc…), with conservative lists for prefixes like:

    • Mac…, De…, Di…, Du…, La…, Le…, Van… (e.g., DiCaprio, DeMarco, MacIntyre when applicable).
  • Apostrophe prefixes for names are handled narrowly:

    • o'connor → O'Connor, d'artagnan → D'Artagnan (without turning contractions like y'all into Y'All).
  • Does not preserve all-caps acronyms by default (many names are entered as ALL CAPS).
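A few of the name rules can be sketched as follows. This is a simplified, ASCII-only illustration: sketchNameCase() is a hypothetical function, the particle list is abbreviated, and surname prefix bi-capitalization (Mc…, Mac…, Di…) is left out entirely.

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical, simplified name-casing pass: particles are lowercased
 * wherever they appear, and only O' / D' prefixes get a second capital.
 */
function sketchNameCase(string $input): string
{
    // Abbreviated particle list, assumed for this example.
    $particles = ['al', 'bin', 'de', 'del', 'dem', 'der', 'ibn', 'la', 'van', 'von'];

    $words = preg_split('/\s+/', trim($input), -1, PREG_SPLIT_NO_EMPTY);
    $out = [];

    foreach ($words as $word) {
        $lower = strtolower($word);

        if (in_array($lower, $particles, true)) {
            $out[] = $lower;
            continue;
        }

        // Narrow apostrophe rule: only o' and d' prefixes, so y'all is not affected.
        if (preg_match("/^([od])'([a-z].*)$/", $lower, $m) === 1) {
            $out[] = strtoupper($m[1]) . "'" . ucfirst($m[2]);
            continue;
        }

        $out[] = ucfirst($lower);
    }

    return implode(' ', $out);
}

echo sketchNameCase('JOHN VAN DER WAALS'), "\n";
// "John van der Waals"

echo sketchNameCase("o'connor"), "\n";
// "O'Connor"
```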

In both modes

  • Fixes possessive endings: "It'S" / "IT’S" → "It's" / "It’s".
  • Preserves “intentional” mixed-case tokens like iPhone / eBay.
  • Uppercases Roman numerals when appropriate (mode-aware).
  • Uses Unicode-aware casing when ext-mbstring is available (recommended); otherwise falls back to basic casing behavior.
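The ext-mbstring fallback mentioned above typically looks something like the sketch below. upperFirst() is a hypothetical helper, not a function exported by this package; it only shows the general pattern of checking for the extension at runtime.

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical helper: capitalizes the first character and lowercases the
 * rest, using mbstring when it is loaded and byte-based functions otherwise.
 */
function upperFirst(string $word): string
{
    if (function_exists('mb_strtoupper')) {
        // Unicode-aware path: handles multibyte first letters such as "é".
        return mb_strtoupper(mb_substr($word, 0, 1, 'UTF-8'), 'UTF-8')
            . mb_strtolower(mb_substr($word, 1, null, 'UTF-8'), 'UTF-8');
    }

    // Basic fallback: correct for ASCII, but non-ASCII letters are not reliably cased.
    return ucfirst(strtolower($word));
}

echo upperFirst('hELLO'), "\n";
// "Hello"
```

With mbstring loaded, input such as éCOLE is cased as a whole character sequence; without it, only ASCII letters are reliably handled, which is why the extension is recommended.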

Behavior Notes

  • All-caps input normalization: when the entire input is ALL CAPS, words are title-cased instead of preserved as acronyms, except for dotted initialisms, symbol/digit compounds, and a small whitelist of common acronyms (e.g., NASA, API, HTTP).
  • Acronym possessives: all-caps possessives keep the acronym uppercased while lowercasing the trailing s (NASA's, NASA’s).
  • URLs/domains/emails: preserved exactly as typed, but still count as words for position logic (so Go to example.com, not Go To example.com).
  • Address-like strings: USPS state/territory abbreviations are uppercased when followed by a ZIP code or after a city comma (e.g., Portland, OR 97201).
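The state-before-ZIP rule can be approximated with a lookahead, as in the sketch below. uppercaseStateBeforeZip() is a hypothetical function with a deliberately partial state list, and it covers only the "followed by a ZIP code" case, not the "after a city comma" case.

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical sketch: uppercases a two-letter state abbreviation when it is
 * immediately followed by a 5-digit ZIP code (optionally ZIP+4).
 */
function uppercaseStateBeforeZip(string $input): string
{
    // Partial list for illustration; a real list would cover all USPS abbreviations.
    $states = ['CA', 'NY', 'OR', 'WA'];

    return preg_replace_callback(
        '/\b([A-Za-z]{2})(?=\s+\d{5}(?:-\d{4})?\b)/',
        static function (array $m) use ($states): string {
            $upper = strtoupper($m[1]);

            // Leave ordinary two-letter words (e.g. "to") untouched.
            return in_array($upper, $states, true) ? $upper : $m[1];
        },
        $input
    ) ?? $input;
}

echo uppercaseStateBeforeZip('Portland, or 97201'), "\n";
// "Portland, OR 97201"
```

Requiring the ZIP context is what keeps ordinary words such as "or" from being uppercased elsewhere in a sentence.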

Non-goals

This package intentionally does not try to be:

  • A full linguistic titlecasing engine for every language and locale.
  • A dictionary-based proper-noun corrector (it won’t “know” that iPad should always be iPad unless you typed it that way).
  • A normalization system for punctuation, quotes, or escape sequences (no stripslashes).

Development

composer install
composer test
composer analyse

License

MIT © RJT Media Group LLC