Syntax highlighting is backwards

Most code editors color different pieces of your program in different ways. For instance, they’ll make keywords like if bold and bright so that you notice when you’ve misspelled them. They’ll make non-executable parts like comments and documentation fainter so that you know that the computer isn’t seeing that part of the program. Take this example in PyCharm colors:

def frobnicate(swizzle):
    """Frobnicates the given swizzle."""
    pass

But isn’t this exactly backwards?

Take as an example the following function definition from the Zulip codebase (source):

def user_avatar_path(user_profile: UserProfile) -> str:
    # WARNING: If this method is changed, you may need to do
    # a migration similar to
    # zerver/migrations/0060_move_avatars_to_be_uid_based.py .
    return user_avatar_path_from_ids(
        user_profile.id, user_profile.realm_id)

def is the least important part of this snippet—I know it’s a function. The comment is way more important: if I update the code without reading it, I’ll probably ship a bug.

So what if we flipped these two styles?

def user_avatar_path(user_profile: UserProfile) -> str:
    # WARNING: If this method is changed, you may need to do
    # a migration similar to
    # zerver/migrations/0060_move_avatars_to_be_uid_based.py .
    return user_avatar_path_from_ids(
        user_profile.id, user_profile.realm_id)

Seems better! The keywords fade into the background and I definitely won’t forget to write a migration. I’m not convinced about bolding every comment—it’s a little obtrusive—but I don’t have any better ideas and I’d rather read too many comments than not enough.
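For readers who want to try the flipped styles outside an editor, here is a minimal sketch of the idea using only Python's standard library. It is not how PyCharm or any real theme works: the function name flip_highlight and the choice of ANSI "dim" for keywords and "bold" for comments are assumptions made for illustration. Docstrings are string tokens rather than comments, so this sketch leaves them unstyled.

```python
import io
import keyword
import tokenize

# ANSI escape sequences understood by most terminals.
BOLD = "\033[1m"
DIM = "\033[2m"
RESET = "\033[0m"


def flip_highlight(source: str) -> str:
    """Return `source` with comments bold and keywords dim (hypothetical helper)."""
    lines = source.splitlines(keepends=True)
    # Collect (row, start_col, end_col, style) for every token we restyle.
    # Comments and keywords never span multiple lines, so one row is enough.
    spans = []
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        if tok.type == tokenize.COMMENT:
            style = BOLD
        elif tok.type == tokenize.NAME and keyword.iskeyword(tok.string):
            style = DIM
        else:
            continue
        spans.append((tok.start[0] - 1, tok.start[1], tok.end[1], style))
    # Apply right to left so earlier column offsets stay valid.
    for row, start, end, style in sorted(spans, reverse=True):
        line = lines[row]
        lines[row] = line[:start] + style + line[start:end] + RESET + line[end:]
    return "".join(lines)


if __name__ == "__main__":
    print(flip_highlight("def f(x):\n    # WARNING: read me\n    return x\n"))
```

Running the module prints the small sample with def and return dimmed and the warning comment in bold, which is roughly the effect described above.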

I’ve been using this color scheme for a few weeks now and it’s been fine so far! The most noticeable effect is that my source code is way less noisy and distracting. I think I’m paying attention to comments more (and letting them go stale less often), but that’s a lot harder to tell so I can’t say for sure.

Random notes:

Comments

maxkwallace

“I tried to search around for any systematic studies…”

Me too. And I also didn’t find anything. The closest I found was:

https://www.cs.cmu.edu/~ckaestne/pdf/ese12.pdf - about background colors for #ifdef statements

https://dl.acm.org/citation.cfm?doid=2846680.2846685 (free PDF available if you search the name of the article) - which discusses syntax highlighting but doesn’t seem to present any actual research.

I interviewed the HCI researcher Ben Shneiderman once in college (for a technical writing assignment). I was only interested in his early work on program comprehensibility– IIRC he did systematic studies showing that (1) indentation helps and (2) at a certain point too many levels of control structure nesting (if, for) within functions is bad (better to have separate methods)– but he was much more keen to talk about his later research on HCI, touchscreens, etc.

The general impression I got was that source code comprehensibility is not a career-making field for an academic (I have no idea why). And there seems to be little active research in this area despite its practical importance.

My own $0.02:

For programmers, keywords like “def”, “return”, etc. are strongly chunked and act as a sort of signpost– to the effect that we don’t read these words as words themselves, but rather structural information that modifies the content they’re presented with. They “feel” a certain way based on their logical function in the language.

I agree that at the level of an individual method or function they are unimportant. But there are different “tasks” that we do when programming, and reading an individual method is only one of them. Two other “tasks” are scanning a file and figuring out the relationship between a set of different methods that are used together. And I think highlighting “def” is justified for these tasks because it provides important structural information at a more macro level in the overall program.

My guess is that when syntax highlighting was first designed, comments were an easy target for deemphasis because they are less “regular” (highlighting them doesn’t help with detecting syntax errors) and, more importantly, they are less relevant for macro-level scanning and tend to be read at the level of individual methods. If it’s a comment about the whole method I think we tend to chunk it into the name of the method itself rather than reading it many times.

Deemphasizing comments always bothered me philosophically but the above are my conclusions as to why it didn’t seem to cause problems for me in practice. I tend to read comments in a different way than other parts of programs, and when I do, I am focused on that comment alone, so it’s not a big deal if it’s greyed out. But everyone is used to this style now so this could be partly post-rationalization? I feel like the optimal thing would be adaptive highlighting that changes based on what task you’re doing, or setting up a keybinding to toggle highlighting for comments, but I haven’t tried this yet.

Ben

I’m interested in your comments about how syntax highlighting helps with macro-level scanning. While writing this comment I actually realized that I almost never do that! When I’m trying to get a high-level overview of code, I instead use a mix of tools like:

  • grep (or its various smarter alternatives)

  • find-as-you-type

  • jump to definition

  • “find occurrences”

  • the editor’s “file structure” sidebar

  • code folding

  • drawing pictures

I also find that indentation is usually fine for figuring out structure, although this is definitely at least in part due to language–the codebase I’ve worked in most is a Python server, so it has (a) significant whitespace, (b) minimal syntactic noise, and (c) almost nothing (<5% of lines) is indented more than 3 levels. Plus, we have a fairly strong convention of including a “high level overview” in each module docstring, which eliminates some of the need for code navigation at all. I could see this working differently in a different language or a codebase with different conventions.

Max Wallace

I agree that it’s codebase-dependent. After I read your reply I realized that I did more scanning at my first job than at my second, because the codebase at my first job was not as well structured, and it wasn’t always clear what the relationship between certain methods was. I agree that indentation makes a big difference. I also use all the tools you do, with the exception of the “file structure” sidebar and code folding.

I guess what I was trying to say is that I think syntax highlighting (i.e. a consistent, unique color) for keywords like “def” and “return” helps with letting us process them more like unique symbols rather than words or letters, since it differentiates them from other types of text present in code. But this is all just conjecture :)

Michael Toomim

Don’t forget the great research by Baecker and Marcus from the 1980s on source code formatting:

http://www.cs.kent.edu/~jmaletic/softvis/papers/Baecker1983.pdf

matt lawless

syntax highlighting, meh. turn it off

paul wisehart

you’re right about everything. Syntax highlighting is a lie. It’s the man telling you what to think. Throw off the shackles.

gdewilde@gmail.com

I went with this highlighting scheme. It was for research but turned into a bit of a joke. http://opml.go-here.nl/the-internet-view-source.php It really bothered me that I couldn’t read the comments anymore.

co-dh

You may want to take a look at semantic highlighting, or colorForth

Goblin

A somewhat orthogonal alternative on shortcomings of today’s syntax highlighting. https://stackoverflow.com/questions/13882241/is-crockford-style-context-coloring-implemented-in-any-code-editor

Alex

Here’s an idea re: bolding comments. Comments could get bolder the more sequential lines there are.

One line of comments would be a little bold.

Three lines of comments would be more bold.

Seven lines of comments would be very bold.

This idea being the more lines of information the previous developer is trying to convey, the more important it may be.

eMBee

i’d do just the opposite. make one line comments full bold, and reduce as lines are added. large comment blocks already stand out by their size. single line comments, hopefully concise and to the point are easier to miss.
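Both proposals in this thread (bolder for longer runs, or bolder for shorter ones) can be sketched as a small function from run length to weight. This is only an illustration: the names comment_run_lengths and boldness, the 0 to 1 weight scale, and the cap of seven lines are assumptions, and the comment detection is deliberately naive (any line whose first non-space character is # counts).

```python
def comment_run_lengths(lines):
    """For each line, give the length of the run of consecutive
    comment lines it belongs to, or 0 if it is not a comment line."""
    lengths = [0] * len(lines)
    i = 0
    while i < len(lines):
        if lines[i].lstrip().startswith("#"):
            # Find the end of this run of comment lines.
            j = i
            while j < len(lines) and lines[j].lstrip().startswith("#"):
                j += 1
            for k in range(i, j):
                lengths[k] = j - i
            i = j
        else:
            i += 1
    return lengths


def boldness(run_length, longer_is_bolder=True, max_run=7):
    """Map a run length to a weight in [0, 1]; non-comment lines get 0."""
    if run_length == 0:
        return 0.0
    clamped = min(run_length, max_run)
    if longer_is_bolder:
        # One line is a little bold, seven or more lines are fully bold.
        return clamped / max_run
    # The opposite scheme: one line is fully bold, long blocks fade.
    return (max_run - clamped + 1) / max_run
```

With longer_is_bolder=True a seven-line block gets full weight and a one-line comment gets the least; passing longer_is_bolder=False reverses that, so a lone comment gets full weight.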

John Fitzpatrick

The bolding of keywords (as opposed to identifiers) may have started prior to syntax highlighters. I remember texts for Pascal that showed keywords in bold typeface and the rest of the program in plain text.

For learning a language, highlighting the keywords makes sense. Perhaps we carried this idea to on-line syntax highlighting.

rileyjshaw

You may be interested in Literate Theme, a CLI tool I wrote to update .tmTheme files. It emphasizes comments and mutes everything else.

cemery50

I fully agree with the need for color as a method to separate areas of concern. It seems it should be easy to create varying styles based upon focus and switch among them.

Bob

You want https://books.google.com/books/about/Human_Factors_and_Typography_for_More_Re.html?id=QstWAAAAMAAJ

Desi

I ended up implementing something like this in a colourblind-friendly theme for RStudio: https://github.com/DesiQuintans/epergoes

Like you, I find that I leave comments stale less often.
