Naming Things
There are famously two hard things in computer science: cache invalidation, naming things, and off-by-one errors.
Arguably, naming things is the hardest because there's an element of subjectivity. Cache invalidation is hard because of complexity, but at least incoherence is objectively measurable. Off-by-one errors are as easy to fix as they are to make, only hard to reliably avoid. However, there are infinite tensions in naming.
On the one hand, you can make a program work with nearly any choice of name. On the other, no great program consists of entirely arbitrary names.
Every name is eventually regrettable. But take heart, because those who practice naming do get better at postponing that regret. And despair, for those who don't practice naming choose a lot of regrettable names.
Here are some things I know about naming. First, don't settle for a name that works. Let the problem bother you for a bit.
The Three Laws of Naming in Computer Science #
The Three Laws of Naming in Computer Science are,
- The name must describe the thing.
- The name will ideally describe no other thing.
- The name is among the shortest of all concise names, and no shorter.
- The name is the funniest of short, concise names.
Corollary: Abbreviations are inevitably ambiguous. Don’t make me guess whether or how you abbreviated or contracted a name, or if you elided every other vowel. Do not make me guess.
These laws, like the robot laws, are in order of precedence. A hat trick in naming things is a cause for celebration.
Also, names do not appear in isolation. The best names participate in systems of names.
The Laws of Systems of Names in Computer Science #
The Laws of Systems of Names in Computer Science are,
- Every public name establishes a precedent and can thereafter never change. Choose wisely.
- A system of names must not contain any synonyms. Choose one.
- If the name of a thing has an antonym or dual, a thing with the other name should exist and have the implied relationship. Choose coherently.
That is, don't name something kiki unless you know what would be bouba.
If something is named up, there should be a down.
If clockwise is deosil, then counter-clockwise is widdershins.
If you need one dimension, left and right will do; whereas if you need
two dimensions, north, east, south, and west are there for you.
If you have a bunch of shapes like hexagon, heptagon, and octagon,
so help me, the next one better be an
enneagon.
Nonagon does not belong in the company of Greeks.
Also, triangles are trigons.
Don’t get me started.
Your language or its standard library has chosen many names already and you are obligated to choose names that are coherent with the body of established precedent.
Do not mix metaphors.
The inverse of install is uninstall.
The inverse of add is remove or delete, and that precedent has almost
certainly been set.
I will personally haunt you if you cross begin with finish.
The opposite of begin is end.
The opposite of start is finish.
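As a rough sketch of what a coherent pair looks like in an interface, here is a hypothetical `PluginHost` class (the class and its methods are invented for illustration). Its verbs borrow `install` and `uninstall` as a pair, and lean on the `add` and `delete` precedent that JavaScript's `Set` already established.

```typescript
// Hypothetical plugin host: each verb is paired with its conventional dual.
class PluginHost {
  private plugins = new Set<string>();

  // install and uninstall come from the same metaphor.
  install(name: string): void {
    this.plugins.add(name);
  }

  // Returns true if the plugin was present, mirroring Set's delete.
  uninstall(name: string): boolean {
    return this.plugins.delete(name);
  }

  has(name: string): boolean {
    return this.plugins.has(name);
  }
}

const host = new PluginHost();
host.install("linter");
const removed = host.uninstall("linter");
```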
Kay's Aphorism #
Similar things should be the same or different. — Alan Kay
(I have had the pleasure of working with Mark S. Miller who in turn had the pleasure of working with Alan Kay so occasionally remembers things people have said that may never have been written down.)
You might be tempted to use begin and end for nouns and start and
finish for verbs.
These are not different enough to be distinct.
I recommend choosing one of these pairs for nouns and using an entirely
different group for verbs.
The tape deck metaphor would give you replay and skip and provide
verbs like play, pause, and seek if you ever need them.
There are precedents across language boundaries for using the words promise,
future, and deferred to variously refer to similar devices.
Use the prevailing convention, but if there isn’t a precedent, pick one.
But, if you need another similar thing, pick a different word entirely
that captures the distinction, like signal or observable.
It is sometimes okay to beg a fine distinction between groups of words or to make arbitrary choices about how these words are grouped. There are cases where we need more words than there are distinct meanings.
So, if you find you need two flavors of ends…
On Deque #
Two awful programming languages adopted push, pop, shift, and unshift
for the deque methods.
That makes it a Schelling Point and no other choices are valid going forward
unto forever.
If you make a new language with deque-like protocols and don't choose these
exact names, you have failed in your duty to coddle your fledgling base.
But, hear me out. What if there were a reliable mnemonic you could use to distinguish whether the method operated on the I (input) side or the O (output) side? What if you could do this without straying very far from an English dictionary? What if you could do this with pithy monosyllables?
As it happens, tip and top both mean end and you could argue that their
middle letter indicates which end value they will report.
Likewise, pip and pop are very ordinary words, and while pip might
require a stretch, pop is already accepted as the correct name in both Perl
and JavaScript for removing a value from the output end of a queue.
If you have ever heard the idiom “pish posh”, you can easily remember
which end these methods will push onto.
And, if you ever need to rotate your deque, you might consider shift and
shoft.
In / out:
- tip / top (peek at the value at an end, without mutation)
- pip / pop (remove a value at an end and return it)
- pish / posh (add a value to an end)
- shift / shoft (rotate a value from one end to the other, or move a cursor around a circular linked list forward or backward)
But, seriously, go with push, pop, shift, and unshift.
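For the serious option, a minimal sketch of a deque that keeps the four conventional names might look like the following. The `Deque` class is hypothetical and array-backed for brevity, so it makes no claim about performance.

```typescript
// A minimal array-backed deque sketch using the conventional method names.
class Deque<T> {
  private items: T[] = [];

  // Add a value to the back.
  push(value: T): void {
    this.items.push(value);
  }

  // Remove and return the value at the back.
  pop(): T | undefined {
    return this.items.pop();
  }

  // Add a value to the front.
  unshift(value: T): void {
    this.items.unshift(value);
  }

  // Remove and return the value at the front.
  shift(): T | undefined {
    return this.items.shift();
  }

  get length(): number {
    return this.items.length;
  }
}

const deque = new Deque<number>();
deque.push(1);
deque.push(2);
deque.unshift(0);
// The deque now holds 0, 1, 2 from front to back.
const front = deque.shift();
const back = deque.pop();
```

Anyone who has used a JavaScript array can guess every method here without reading the source, which is the whole argument for the Schelling point.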
Of Trees and Tries #
I don't want to talk about this. Mistakes were made. Skip to prefix trees and forget about it.
But, if you're going to be mean about it, I’m going to continue pronouncing the G in GIF as if I were Dutch and the Ph in JPEG like “photograph”. Bake some ambiguity into your name's pronunciation and I'll find the third interpretation and socialize it at your favorite conference.
Map and Set #
The names map and set are cursed.
They occur in the same group of words both as nouns and as verbs.
Map is an interface with methods like get and set.
Set is an interface with methods like add and delete.
Lists often have a method like map.
Map is not map, and Set is not set.
So, keep in mind that the case of a word is information and that names
cannot be trivially moved from one case convention to another.
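A short illustration of the collision, using only the standard `Map`, `Set`, and array methods (the variable names are made up):

```typescript
// Map, the noun: a collection. set, the verb: one of its methods.
const ages = new Map<string, number>();
ages.set("ada", 36);

// Set, the noun: a different collection, with add rather than set.
const seen = new Set<string>();
seen.add("ada");

// map, the verb: a method of arrays that has nothing to do with Map.
const doubled = [1, 2, 3].map((n) => n * 2);
```

Lowercasing `Map` or capitalizing `set` would change which of these things you mean.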
The monospace gag #
You will note that Map methods tend to have three letters.
If “funny” were again more important than the preceding laws of naming things,
you might conceive of a convention for maps that used strictly three-letter
names: get, set, and has, of course.
We can use put as a variant on set that asserts that the key is not
present.
We can use cut as an analog to pop for removing and returning the value for
a key.
We often wring our hands for a good name for a method that gets the value for a
key, but also sets the value to a default if it was not already present.
My muse is the idiom, “on your mark, ready, set, go!”, which can be
captured in three letters as initials, oym and rsg.
But that would be crazy.
JavaScript has run with the verbose but absolutely appropriate and righteous
getOrInsert and getOrInsertComputed.
Less crazy than the precedent set for naming operators in shell scripts.
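To make the behavior behind those two names concrete, here is a sketch written as free-standing helper functions. These are hypothetical stand-ins with assumed signatures, not the built-in methods, and they only illustrate "get the value, inserting a default first if the key is absent".

```typescript
// Hypothetical helper: returns the value for `key`, inserting `fallback`
// first if the key is absent.
function getOrInsert<K, V>(map: Map<K, V>, key: K, fallback: V): V {
  if (!map.has(key)) {
    map.set(key, fallback);
  }
  return map.get(key) as V;
}

// Hypothetical helper: like getOrInsert, but only computes the default
// when the key is absent.
function getOrInsertComputed<K, V>(
  map: Map<K, V>,
  key: K,
  compute: (key: K) => V,
): V {
  if (!map.has(key)) {
    map.set(key, compute(key));
  }
  return map.get(key) as V;
}

// Counting occurrences without checking for presence by hand.
const counts = new Map<string, number>();
counts.set("a", getOrInsert(counts, "a", 0) + 1);
counts.set("a", getOrInsert(counts, "a", 0) + 1);

// Grouping values, allocating each bucket only once.
const groups = new Map<string, string[]>();
getOrInsertComputed(groups, "fruit", () => []).push("apple");
getOrInsertComputed(groups, "fruit", () => []).push("pear");
```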
Someday, the JavaScript community will have to establish a protocol for
comparison methods.
I am aware that they will be obligated to follow Java and name the methods like
equals and lessThan.
Down that way lies lessThanOrEqual, which will be a disappointment, when we
could have had eq, ne, lt, gt, le and ge.
Oh, well.
At least we won’t have to settle for aberrations that serve no-one like lte.
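For the sake of argument, the two-letter family could be sketched like this. The `Version` class and its `compare` method are hypothetical, and no such protocol exists in JavaScript today. The point is only that all six names fall out of one comparison.

```typescript
// Hypothetical value type with the six two-letter comparison methods.
class Version {
  major: number;
  minor: number;

  constructor(major: number, minor: number) {
    this.major = major;
    this.minor = minor;
  }

  // Negative if this sorts before other, zero if equal, positive if after.
  compare(other: Version): number {
    return this.major - other.major || this.minor - other.minor;
  }

  eq(other: Version): boolean { return this.compare(other) === 0; }
  ne(other: Version): boolean { return this.compare(other) !== 0; }
  lt(other: Version): boolean { return this.compare(other) < 0; }
  gt(other: Version): boolean { return this.compare(other) > 0; }
  le(other: Version): boolean { return this.compare(other) <= 0; }
  ge(other: Version): boolean { return this.compare(other) >= 0; }
}

const older = new Version(1, 2);
const newer = new Version(1, 3);
```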
Brand Names #
Brand names do not follow the laws of naming things in computer science. You will probably have to resort to ten indecent tricks to find a name that is short, available, and memorable. You may name your project with a brand name. Your project must have exactly one brand name. You must not create a system of brand names inside your project. The brand name must not invade your project's interfaces, protocols, or properties.
So, for example, if your project is named fx, you might even consider
naming its main class Effect, but do so at your peril if it does not
correspond to anything that might follow from a cause.
For example, suppose you are choosing the conventional name for the directory
where a tool might install vendored dependencies, regardless of whether people
are using the tool with your very specific brand name.
That directory name is obviously dependencies.
A bad name for this would be bun_modules.
Pick One Case Convention #
Nobody wants to guess which case convention you used. Choose one throughout your project.
Your convention may differentiate files, languages, public versus private, code names, and classes versus instances. But, if a single directory has file names with different conventions: shame. If a single file name uses more than one delimiter: exile.
These are all available options:
- kebab-case
- snake_case
- SCREAMING_CASE
- dromedaryCase
- BactrianCase
Camel case is ambiguous. Be specific.
Avoid runoncase.
You will, in the fullness of time, encounter real combinations of terms with
meaningful semantic differences depending on the place of the invisible
delimiter.
The examples of code-sign versus co-design and beans-owing versus
bean-sowing are very real and believe in you.
Notably, Golang sets a strong precedent for runoncase for package names.
The precedent is stronger than this rule.
But, you can trivially avoid names with undelimited terms, either by using
single words, or vestigial intermediate directories.
This is not arguable: if a file name might appear in an HTTP request, that name
is in kebab-case.
That includes any JavaScript module specifier.
I am looking at you node:child_process.
node:popen was right there and you know it.
Many languages have a prevailing convention. Just follow it.
A name may consist of one or more terms.
Those terms may themselves be initialisms and acronyms.
Case conventions like dromedaryCase and BactrianCase
use capitalization to indicate the boundary between terms.
You must only ever capitalize the first letter of an initialism
in either of these case conventions.
And, in fact, you must never vary capitalization in any
of the other case conventions.
Mark my words, XMLHttpRequest was a mistake we must never
repeat.
- xml-http-request
- xml_http_request
- XML_HTTP_REQUEST
- xmlHttpRequest
- XmlHttpRequest
Accept no alternatives.
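One way to see why the initialism rule matters is to treat a name as a list of lowercase terms and derive every convention mechanically. The helper names below are invented for this sketch. Because an initialism like "xml" is just another term, no convention needs a special case.

```typescript
// A name is a list of lowercase terms; initialisms are ordinary terms.
type Terms = string[];

const capitalize = (term: string): string =>
  term.charAt(0).toUpperCase() + term.slice(1);

const kebabCase = (terms: Terms): string => terms.join("-");
const snakeCase = (terms: Terms): string => terms.join("_");
const screamingCase = (terms: Terms): string => terms.join("_").toUpperCase();
const bactrianCase = (terms: Terms): string => terms.map(capitalize).join("");
const dromedaryCase = (terms: Terms): string =>
  terms
    .map((term, index) => (index === 0 ? term : capitalize(term)))
    .join("");

const terms: Terms = ["xml", "http", "request"];
console.log(kebabCase(terms)); // xml-http-request
console.log(snakeCase(terms)); // xml_http_request
console.log(screamingCase(terms)); // XML_HTTP_REQUEST
console.log(dromedaryCase(terms)); // xmlHttpRequest
console.log(bactrianCase(terms)); // XmlHttpRequest
```

A name like XMLHttpRequest cannot be produced by any of these functions, and cannot be split back into terms without a dictionary.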
Some terms may be numbers.
You have my express permission to use an _ to separate any pair of numbers
that happen to run into each other.
For example, v1_2_3.
Identifier #
If you still believe that the initialism ID is somehow an abbreviation of “identifier”, you may wish to skip ahead and preserve your frail innocence. The initialism ID stands for Identity Document and has no business in your table’s column names or your language’s grammar. The dual of Id is Ego. Identifier is spelled “identifier”.
Get #
The verb get implies that a function is read-only and has no observable,
semantic side effects.
Do not break this expectation.
The get methods of splay trees may cause internal mutation that affects
the timing of subsequent operations, but these side effects change no
result a caller can read, so they fall under the observable or semantic
exceptions to the rule.
But, if the get qualifier does nothing to change the meaning of a compound
name, and if you are not writing in a system with a strong prevailing convention
for get methods like Java, sometimes get just makes a name longer than
necessary.
Prefer length over getLength, especially if you have the option
of making it a computed property.
The dual of get is set.
Accept no substitutes.
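A minimal sketch of the computed-property option, using a hypothetical `Vector2` class. Reading `length` has no side effects, and the name carries no redundant `get`.

```typescript
// Hypothetical two-dimensional vector with a computed, read-only length.
class Vector2 {
  x: number;
  y: number;

  constructor(x: number, y: number) {
    this.x = x;
    this.y = y;
  }

  // Reads like a field; computes the Euclidean length on demand.
  get length(): number {
    return Math.hypot(this.x, this.y);
  }
}

const vector = new Vector2(3, 4);
console.log(vector.length); // 5
```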
Size #
There are many words for size, and most of them correspond to a dimension or axis.
| number | name | name of size |
|---|---|---|
| 0 | x | width |
| 1 | y | length |
| 2 | z | height |
| 3 | t | duration |
In many languages, arrays, vectors, lists, and other ordered collections have a “length”, which erroneously suggests that they grow along the y axis. This is at least coherent with the arbitrary conceit that the heap grows up and the stack grows down, but I assume most folks imagine arrays growing left to right, and two-dimensional arrays reimagining the first axis as vertical and the secondary axis as horizontal. Truth be told, there is no rhyme or reason to the names, and we only really need one name most of the time: the dimensionless “size”. If a collection is indexed by multiple axes, each with its own size, then the most sensible recourse is for the size to itself be a vector and to refer to each dimension by its number, not its name.
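As a sketch of size-as-a-vector, consider a hypothetical dense `Grid` whose `size` has one entry per axis. The class, its row-major layout, and the `offset` method are assumptions made for illustration.

```typescript
// Hypothetical dense grid: `size` is a vector with one entry per axis,
// and axes are referred to by number rather than by name.
class Grid {
  size: number[];
  values: number[];

  constructor(size: number[]) {
    this.size = size;
    const count = size.reduce((product, extent) => product * extent, 1);
    this.values = new Array<number>(count).fill(0);
  }

  // Row-major offset of a coordinate; axis 0 varies slowest.
  offset(coordinate: number[]): number {
    return coordinate.reduce(
      (offset, index, axis) => offset * this.size[axis] + index,
      0,
    );
  }
}

const grid = new Grid([2, 3]);
const last = grid.offset([1, 2]);
```

Nothing here needs to decide whether axis 0 is a width, a length, or a height.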
JavaScript makes a distinction between length and size.
That is, arrays have a length and other collections have a size.
This makes sense only in JavaScript because length implies that an object
participates in a protocol where individual values can be retrieved by
index, using protocol[index] notation, whereas having a size property
is another duck entirely.
The collection protocol addresses values using methods like get.
If you are writing JavaScript, use the name that is coherent with the
precedents established by the language.
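The two protocols side by side, using only standard JavaScript collections (the variable names are made up):

```typescript
// Arrays participate in the index protocol: length plus [index].
const letters = ["a", "b", "c"];
const lastLetter = letters[letters.length - 1];

// Maps participate in the collection protocol: size plus get.
const scores = new Map<string, number>([
  ["ada", 1],
  ["grace", 2],
]);
const adaScore = scores.get("ada");

console.log(letters.length, scores.size); // 3 2
```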
In general, you are probably using a library or language that has established
its own precedent.
In Python, it is len.
In Math, it is “cardinality”, which fails most naming rules I mention and some
that I have not.
Mercifully, in Golang, the conventional name is Size.
In the spirit of using a naming system that is coherent and free of synonyms,
use the prevailing convention for your language or libraries.
But, if you have the liberty to establish a precedent, I recommend “size” and to accept no substitutes.
And so on #
The purpose of these rules is to replicate your system of naming through osmosis so that your fellow readers and writers can infer what a name means and what name they can use to extend your system in a way you will immediately and intuitively understand. This is about paving the way for working fluidly with your team or community.
Choosing good names is a skill much like kindness. They are both ultimately thoughtfulness, and can be obtained with little more than caring enough to think a little more, and that investment compounds non-linearly with practice.
Consider this a living document. Come again soon.