prebetafinal/epitypes

Epistemological file type classification for content management systems

Package info

github.com/prebetafinal/epitypes

pkg:composer/prebetafinal/epitypes

v1.3.0 2026-02-04 07:11 UTC

This package is not auto-updated.

Last update: 2026-03-26 08:46:13 UTC


README

Semantic file classification for content management systems.

Problem 🤖

File systems organize by what files are (.jpg, .pdf, .mp3). Humans organize by what files do (images to view, documents to read, data to process). Epitypes bridges this gap with semantic classification.

Solution

Epitypes (epistemological types) classify every file along a three-level hierarchy:

Nature - fundamental behavior

  • Folders - native file system directories
  • Pages - files you edit (markdown, code, data, config)
  • Assets - files you consume (media, documents, archives, materials)
  • Ignored - system files to skip (.DS_Store, .git, etc.)

Type - semantic grouping (raw, markup, media, documents)

Format - specific implementation (html, image, pdf)

Hierarchy

Epitypes
├── 📁 folders (native directories)
├── 📄 pages (editable)
│   ├── raw (txt, log)
│   ├── markup (html, md)
│   ├── code (js, py)
│   ├── data (json, yaml)
│   └── config (ini, env)
├── 🗃️ assets (consumable)
│   ├── media (jpg, mp3, mp4)
│   ├── documents (pdf, docx)
│   ├── archives (zip, tar)
│   └── materials (psd, 3ds)
└── 🚫 ignored (system files)

Structure (abbreviated for clarity)

{
  "pages": {
    "description": "Editable content files that can be opened in text editors",
    "types": {
      "raw": ["txt", "text", "log"],
      "markup": ["html", "md", "xml"],
      "code": ["js", "py", "php"],
      "data": ["json", "yaml", "csv"],
      "config": ["ini", "env", "conf"]
    }
  },
  "assets": {
    "description": "Files for consumption, reference, or use as materials",
    "types": {
      "media": ["jpg", "mp3", "mp4"],
      "documents": ["pdf", "docx"],
      "archives": ["zip", "tar", "7z"],
      "materials": ["psd", "3ds", "midi"]
    }
  },
  "ignored": {
    "description": "Items to ignore during file scanning",
    "items": [".git", ".DS_Store", "Thumbs.db", ".svn", ".hg", ".epitome", ".gitignore", "node_modules", ".vscode", ".idea"]
  }
}
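
As a rough illustration of how a consumer might use this structure, the sketch below inverts it into an extension lookup and classifies a path into a (nature, type) pair. It is written in Python for brevity and is not the package's own API: `build_index`, `classify`, and the handling of dotfiles such as `.env` are assumptions made for this example.

```python
from pathlib import PurePosixPath

# Abbreviated structure copied from the README; descriptions omitted.
EPITYPES = {
    "pages": {
        "types": {
            "raw": ["txt", "text", "log"],
            "markup": ["html", "md", "xml"],
            "code": ["js", "py", "php"],
            "data": ["json", "yaml", "csv"],
            "config": ["ini", "env", "conf"],
        }
    },
    "assets": {
        "types": {
            "media": ["jpg", "mp3", "mp4"],
            "documents": ["pdf", "docx"],
            "archives": ["zip", "tar", "7z"],
            "materials": ["psd", "3ds", "midi"],
        }
    },
    "ignored": {
        "items": [".git", ".DS_Store", "Thumbs.db", ".svn", ".hg", ".epitome",
                  ".gitignore", "node_modules", ".vscode", ".idea"]
    },
}


def build_index(epitypes):
    """Invert the structure into {extension: (nature, type)}."""
    index = {}
    for nature, spec in epitypes.items():
        # "ignored" has no "types" key, so it contributes nothing here.
        for type_name, extensions in spec.get("types", {}).items():
            for ext in extensions:
                index[ext] = (nature, type_name)
    return index


INDEX = build_index(EPITYPES)
IGNORED = set(EPITYPES["ignored"]["items"])


def classify(path):
    """Return (nature, type), or None when the extension is not listed."""
    name = PurePosixPath(path).name
    if name in IGNORED:
        return ("ignored", None)
    suffix = PurePosixPath(name).suffix
    if not suffix and name.startswith("."):
        # Assumption: a bare dotfile such as ".env" is matched by its name.
        suffix = name
    return INDEX.get(suffix.lstrip(".").lower())
```

With this sketch, `classify("notes/todo.md")` yields `("pages", "markup")` and `classify("photo.JPG")` yields `("assets", "media")`. Matching is case-insensitive only because the example lowercases the extension.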

Epistemic Foundation 🤓

Layer 1 (Nature/Ontological): pages vs assets

  • Pure human conceptual distinction: "What I work on" vs "What I consume"
  • Domain-Driven Design level - reflects user mental models
  • Never changes because it's fundamental to human cognition

Layer 2 (Type/Categorical): raw, markup, code, data, config (pages); media, documents, archives, materials (assets)

  • Human-computer bridge layer - how humans categorize information processing
  • Reflects both human logic (raw text vs structured) and computational needs
  • Stable because these are fundamental information categories

Layer 3 (Format/Technical): markdown, javascript, json, etc.

  • Pure technical implementation details
  • Machine-readable yet human-understandable
  • Most volatile layer - formats come and go

The name of each node is chosen to match its plain-language meaning. For the page types:

  • Raw = unstructured human thought
  • Markup = structured human expression
  • Code = human instructions for machines
  • Data = structured information
  • Config = system parameters

This creates an epistemic hierarchy: pure human cognition → hybrid human-machine categories → technical specifications.

Why Not MIME Types?

MIME types solve technical delivery (image/jpeg), not human organization:

  • application/json doesn't tell you whether a file is data or config
  • text/plain could be a note, log, or code snippet
  • MIME has no concept of "ignorable" vs "important" files

Epitypes adds semantic layers that MIME types lack.
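
To make the contrast concrete, the following sketch compares what Python's standard `mimetypes` module reports with the epitype taken from the abbreviated structure above. The `EPITYPE_BY_EXT` table and the `compare` helper are hypothetical names used only for this example.

```python
import mimetypes

# A fresh MimeTypes instance is populated from the module's built-in table.
mime_db = mimetypes.MimeTypes()

# (nature, type) pairs taken from the abbreviated structure in this README.
EPITYPE_BY_EXT = {
    "json": ("pages", "data"),
    "pdf": ("assets", "documents"),
    "zip": ("assets", "archives"),
}


def compare(filename):
    """Return (MIME type, epitype) for a file name."""
    mime, _encoding = mime_db.guess_type(filename)
    ext = filename.rsplit(".", 1)[-1].lower()
    return mime, EPITYPE_BY_EXT.get(ext)


rows = {name: compare(name) for name in ("report.json", "manual.pdf", "backup.zip")}
for name, (mime, epitype) in rows.items():
    print(f"{name:12} {mime:18} {epitype}")
```

All three files share the MIME top-level type `application`, yet they fall under two different natures and three different types in the epitypes hierarchy. MIME groups them by how they are delivered, epitypes by how they are used.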

Notes 📜

The term 'page' works as long as the file is presented as a page in the UI, regardless of its underlying format; this usage is widely accepted in content management. 'Asset' reads naturally as the counterpart of 'page' in this context and needs no further explanation.

This classification is not final; input and feedback are welcome.

Unknown Files

Files not listed in epitypes automatically fall into an "other" group. This covers executables (.exe), system binaries, and other formats that are irrelevant to content management. Code that consumes epitypes should handle these cases explicitly.
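
One possible way to handle the fallback during a scan is sketched below. The `KNOWN` map is a small excerpt for illustration. `group_paths`, and the decision to skip a path when any of its components is on the ignore list, are assumptions of this example rather than documented behavior of the package.

```python
from collections import defaultdict
from pathlib import PurePosixPath

# Small excerpt of the extension map and ignore list, for illustration only.
KNOWN = {"md": "pages", "json": "pages", "jpg": "assets", "pdf": "assets"}
IGNORED = {".git", ".DS_Store", "Thumbs.db", "node_modules"}


def group_paths(paths):
    """Group paths by nature; unlisted extensions go to "other"."""
    groups = defaultdict(list)
    for raw in paths:
        path = PurePosixPath(raw)
        # Assumption: skip the path if any component is on the ignore list,
        # e.g. everything below a ".git" directory.
        if any(part in IGNORED for part in path.parts):
            continue
        ext = path.suffix.lstrip(".").lower()
        groups[KNOWN.get(ext, "other")].append(raw)
    return dict(groups)
```

Called with `["docs/intro.md", "bin/tool.exe", ".git/config"]`, this returns `{"pages": ["docs/intro.md"], "other": ["bin/tool.exe"]}`. The caller can then decide whether to hide, list, or reject the "other" group.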

License

MIT