The project I'm working on has inconsistency issues when it comes to coding style. Even though it's a Drupal project, it doesn't follow Drupal's coding standards and on top of that the .php files are saved using various character encoding and line endings. This forced me to develop a workflow that I use before changing legacy files which helped me to keep changes history readable.
Committing together a file encoding change, white-space conversion and a bug-fix will be a source of regret if you ever have to look at that diff again. Most likely you will pull your hair out trying to figure out what is a code change and what's only a code style fix. That's why I separate the coding standard fixes in a single commit before doing anything else. This commits will be easy to ignore when checking history and can provide a cleaner diff for the changes that follow.
The steps I follow are:
First - make sure the file uses UTF-8 encoding. In phpStorm (the IDE I use) there is a small toggle in the bottom right. I used that to change to UTF-8 and then proceed scanning the whole file for errors (improperly displayed or wrong characters after conversion). This issues arise when the file contains multi-byte characters but was saved using one byte encoding. These are usually found in the messages displayed to the client and can be easily fixed by retyping those characters by hand. Characters with accents or umlauts (from German and French) are the main suspects. The command specified at the step four can be used to make sure everything is correct after the change.
Second - convert indentation to spaces and the line-endings to \n. So in phpStorm I use Edit > Convert Indents > To spaces and the line separator drop-down (bottom right bar).
Third - re-format the code by going through all the lines and make sure that:
- indentation level is correct (you might get wrong indented lines after that conversion)
- all the dead code is removed (code that was commented, old comments that are not accurate anymore etc)
- curly brackets and parentheses are consistent (position and white-space around them).
- there is not trailing white-space. phpStorm will remove trailing white-space for most of the lines when converting to \n, but it will not fix trailing white-space inside strings (messages, HTML chunks etc).
Fourth - double-check changes using git diff -w -- <filepath>. This will display all the changes except white-spaces changes. I do this to make sure I didn't break the functionality and code will behave the same way it did before I did the cleanup.
Fifth - commit changes with a standard messages like "Coding standards fix" and keep this message consistent through the project
After I'm done with this I will continue with the bug-fix or changes of that file using regular process/work-flow. Whenever I need to check the history log I can easily skip the commit where I cleaned-up the file because it should not introduce any behavior changes to the code (step four). Also, this is something I only do once per file, so in the future most of the files will be fixed and have a easy to read history.
I'm curious how others are doing this. I've read "Add only non-whitespace changes" question on Stack Overflow while developing this workflow and I can't see the reason why don't you want to fix coding standard issues whenever you work on a file. In the end you are already changing that file so chances to introduce a bug are already there.
Comments