The Empty Collection Pattern

Published on 6 May 2025 • 2 min read

When picking a type for a field, function parameter, return value, variable or database column, developers sometimes allow the value to be unset. Programming languages use different terms for an unset value, and some have more than one (I’m looking at you, JavaScript!). null, nil, None and undefined are probably the most common ones. Care must be taken not to access attributes or invoke methods on such a value before first ensuring that it has been set. Failure to do so results in an error, which occurs either at compile time or at run time, depending on the language.

The remedy is to add code that ensures that the value has been set before accessing attributes or invoking methods on it. JavaScript and C# have a special operator, ?. (called optional chaining in JavaScript and the null-conditional operator in C#), that makes such checks quite terse. Python lacks such an operator, although one has been proposed (PEP 505). As a result, guarding against unset values in Python tends to add quite a bit of code.

if foo is not None:
    ...  # do something with foo
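
To give a feel for how quickly these guards pile up, here is a minimal sketch of what a chained lookup looks like without an optional chaining operator. The User and Address classes and the city_of function are hypothetical names invented for this illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Address:
    # Hypothetical type: the city may be unset.
    city: Optional[str] = None


@dataclass
class User:
    # Hypothetical type: the address may be unset.
    address: Optional[Address] = None


def city_of(user: Optional[User]) -> Optional[str]:
    # Roughly what `user?.address?.city` expresses in JavaScript or C#.
    if user is None:
        return None
    if user.address is None:
        return None
    return user.address.city
```

Every nullable link in the chain costs one more guard, which is where the extra code comes from.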

Hence, there is something to be gained if such checks can be avoided. For collections and strings (which can be considered ordered collections of characters), the empty collection is usually sufficient to convey that something is “not set” or “unspecified”. Therefore, favor empty collections over nulls.

More concretely:

For arrays, use [] instead of null.
For strings, use "" instead of null.
For dictionaries, use an empty dictionary instead of null.
For sets, use the empty set instead of null.
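
The rules above can be sketched in Python as follows. The function names are made up for this example; the point is only that the non-nullable version needs no guard, because the empty list already means “nothing here”.

```python
from typing import Optional


def count_tags_nullable(tags: Optional[list[str]]) -> int:
    # With a nullable parameter, the guard is unavoidable.
    if tags is not None:
        return len(tags)
    return 0


def count_tags(tags: list[str]) -> int:
    # With "empty means unset", one code path covers both cases.
    return len(tags)


def shout(name: str) -> str:
    # The same holds for strings: "" is safe to call methods on.
    return name.upper()
```

Calling count_tags([]) and shout("") works without any special casing, whereas passing None to either would raise an error at run time.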

This is especially pertinent when writing schema definitions, whether they be for persistence (databases, files, objects) or for APIs. Consumers of schemas that adhere to this rule will be spared a non-zero amount of pain.
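
As a rough sketch of what such a schema could look like in Python, here is a dataclass whose collection fields default to empty rather than None. The Article class and its fields are hypothetical. Note that dataclasses reject a bare list, dict or set as a default value, so field(default_factory=...) is used to give every instance its own empty collection.

```python
from dataclasses import dataclass, field


@dataclass
class Article:
    # Hypothetical schema: every field has an empty, non-null default.
    title: str = ""
    tags: list[str] = field(default_factory=list)
    metadata: dict[str, str] = field(default_factory=dict)
    related_ids: set[int] = field(default_factory=set)


article = Article(title="The Empty Collection Pattern")

# Consumers can iterate and call methods without checking for None first.
tag_count = sum(1 for _ in article.tags)
summary = f"{article.title.upper()} ({tag_count} tags)"
```

A consumer of Article never has to ask whether tags was set; it only has to handle the case where there are none, which the loop already does.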