nibra/fix-copyright

A one-shot project created to convert the copyright statements in the Joomla project to a standardised format. It might be useful for other projects as well.

1.0.5 2022-08-13 13:34 UTC

Last update: 2024-04-13 17:41:25 UTC


README

This tool is a one-shot project created to convert the copyright statements in the Joomla project to a standardised format. It might be useful for other projects as well.

Why bother

Until 2020, Joomla used this format:

Copyright (C) 2005 - 2020 Open Source Matters. All rights reserved.

The ending year of the range had to be updated every year in every file. Although a script performed the update, it polluted the history of each file once a year without providing any benefit.

Thus, the Production Department leadership of Joomla decided in mid-2020 to follow the advice in this excellent article about how and why to properly write copyright statements.

Some people consider this change pointless, but, as Michael Babker put it:

This might come across as a pointless change, but the other pointless change is modifying every file in every Joomla owned repository in January to amend the ending date of the copyright claim. Additionally, a copyright claim is being made on every file in most every Joomla owned repository of a copyright dating back to 2005, which is clearly not factual.

How it works

The original approach to determine the creation date for a file was

YEAR=$(git log --follow --date=format:%Y --pretty=format:"%cd" --diff-filter=A --find-renames=40% "${FILE}" | tail -n 1)

However, the results were disappointing: Git detects renames by content similarity, which led to unexpected results.

The Git > Show History function of PhpStorm, on the other hand, gave very plausible results for the first commit, and some research revealed the implementation in IntelliJ (which is the base for PhpStorm). The people at JetBrains found that

git log --follow does detect renames, but it has a bug - merge commits aren't handled properly: they just disappear from the history. See http://kerneltrap.org/mailarchive/git/2009/1/30/4861054 and the whole thread about that: --follow is buggy, but maybe it won't be fixed.

The solution, which is re-implemented here, is to

  1. Get the first commit of the file under its current name.
  2. Get the status (Added, Copied or Renamed) of the file in that commit.
  3. If the status is Added or Copied, stop: this really is the first commit.
  4. Otherwise the status is Renamed, so get the first commit of the file under its previous name, looking only at commits before the current one.
  5. Continue with step 2.
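
The steps above can be sketched as a small shell function. This is only an illustration under a few assumptions, not the code shipped in this package: the function name first_commit_year is made up, it must be called from the repository root with a path relative to that root, and it assumes file names without tabs or other characters that Git would quote.

```shell
# Hypothetical helper (not part of fix-copyright.sh): print the year of the
# commit that really introduced a file, following renames one step at a time
# instead of relying on `git log --follow`.
first_commit_year() {
  local file="$1" before="HEAD" commit status

  while :; do
    # Step 1: oldest commit reachable from $before that touches this path name.
    commit=$(git log --format=%H "${before}" -- "${file}" | tail -n 1)
    [ -n "${commit}" ] || return 1

    # Step 2: status line of the path in that commit. With -M, a rename is
    # reported as "R<score><TAB><old name><TAB><new name>".
    status=$(git diff-tree --root --no-commit-id -r -M --name-status "${commit}" |
      awk -F'\t' -v f="${file}" '$NF == f { print; exit }')

    # Step 3: anything other than a rename means this is the first commit.
    case "${status}" in
      R*) ;;
      *) break ;;
    esac

    # Step 4: continue with the previous name, before the rename commit.
    file=$(printf '%s\n' "${status}" | cut -f2)
    before="${commit}^"
  done

  # Same year format as the original one-liner (committer date).
  git show -s --date=format:%Y --format=%cd "${commit}"
}
```

It could be used as YEAR=$(first_commit_year "${FILE}"). Because only -M is passed, a copied file is reported as Added here, which ends the loop just the same.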

How to adapt the scripts for your environment

In fix-copyright.sh, change lines 4-7 to suit your settings:

GREP_PATTERN="(Copyright )?\(C\) .* Open Source Matters.*All rights reserved\.?"
SED_PATTERN="\(Copyright \)\?(C) .* Open Source Matters.*All rights reserved\.\?"
OWNER="Open Source Matters, Inc."
CONTACT="https://www.joomla.org"

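Be aware that grep and sed need different escaping: the grep pattern is written as an extended regular expression (unescaped parentheses group, escaped ones are literal), whereas the sed pattern uses basic regular expression syntax (escaped parentheses group, unescaped ones are literal).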
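
To check adjusted patterns before running the script on a whole repository, they can be tried on a single line first. The following is a minimal sketch, assuming that the grep pattern is used with grep -E and the sed pattern in a plain s command; the function name rewrite_line and the | delimiter are chosen for this example only. Note that \? in a basic regular expression is a GNU sed extension.

```shell
GREP_PATTERN="(Copyright )?\(C\) .* Open Source Matters.*All rights reserved\.?"
SED_PATTERN="\(Copyright \)\?(C) .* Open Source Matters.*All rights reserved\.\?"

# Hypothetical helper: rewrite one line if it carries the old statement,
# otherwise print it unchanged.
rewrite_line() {
  local line="$1"
  local replacement="(C) 2005 Open Source Matters, Inc. <https://www.joomla.org>"

  if printf '%s\n' "${line}" | grep -qE "${GREP_PATTERN}"; then
    # "|" as delimiter, so the slashes in the URL need no escaping.
    printf '%s\n' "${line}" | sed "s|${SED_PATTERN}|${replacement}|"
  else
    printf '%s\n' "${line}"
  fi
}

rewrite_line ' * @copyright   Copyright (C) 2005 - 2020 Open Source Matters. All rights reserved.'
```

If both patterns describe the same text, the call prints the line with the old statement replaced; a line that is printed unchanged means the grep pattern did not match.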

You might want to adjust the default year in lines 18 and 20 (here the default year is 2005):

    if [[ ${FILE} == *.xml ]]; then
      REPLACEMENT="(C) ${YEAR:-2005} ${OWNER}"
    else
      REPLACEMENT="(C) ${YEAR:-2005} ${OWNER} <${CONTACT}>"
    fi
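
The fallback relies on the ${YEAR:-2005} parameter expansion, which yields 2005 whenever YEAR is unset or empty. A small sketch of that behaviour, wrapped in a function named build_replacement that exists only for this example:

```shell
OWNER="Open Source Matters, Inc."
CONTACT="https://www.joomla.org"

# Hypothetical wrapper around the snippet above: print the replacement text
# for a given file name and (possibly empty) year.
build_replacement() {
  local FILE="$1" YEAR="$2"

  case "${FILE}" in
    *.xml) printf '%s\n' "(C) ${YEAR:-2005} ${OWNER}" ;;
    *)     printf '%s\n' "(C) ${YEAR:-2005} ${OWNER} <${CONTACT}>" ;;
  esac
}

build_replacement "index.php" ""        # no year found: falls back to 2005
build_replacement "manifest.xml" "2012" # year found: used as is
```

The first call prints the statement with the default year and the contact URL, the second one the XML variant with the year 2012 and no URL.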

ToDo

  • Move functionality from fix-copyright.sh to fix-copyright.php
  • Provide PATTERN, OWNER and CONTACT as command line parameters
  • Escape pattern internally