Why are gradual static types so great?

Over the last couple of years, every popular dynamic language has gained optional static types. Why is this such a convergent feature of language design?

The obvious answer:

Static types speed up large projects, but slow down small projects. The most popular dynamic languages have many “small projects” that became very large, and so their “enterprise users” want to retrofit static types onto them.

This may have been the impetus for many gradual typing projects, but I don’t think it can explain the explosive growth in interest. Writing Python today, I want to add types to anything larger than a tiny script. The biggest points of friction are having to write from typing import and having to run mypy manually.

Instead, my theory of the success of gradual static types is that they’ve actually ended up as the optimal point in language design space, for a few underappreciated reasons.

Low syntactic noise

All the gradual type systems I cited were designed to easily allow adding annotations to existing code. That forces the system to have great type inference, because users won’t accept the extra labor of adding redundant annotations.

Most gradual type systems seem to have converged on a happy medium for inference: require annotations for function signatures, but (almost) nothing else. The type-signature requirement was probably originally chosen to make inference computationally tractable. But it’s also the only place where type annotations are ergonomically important, because it’s the place where it’s hardest for the reader to figure out the types from context.
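As a rough sketch of that convention in Python (assuming Python 3.9 or later and a checker such as mypy; the function name is made up for illustration), only the signature carries annotations and the locals are left for the checker to infer:

```python
def mean_word_length(words: list[str]) -> float:
    # No annotations in the body: from the annotated parameter, a checker
    # can infer that `lengths` is a list of ints and `total` is an int.
    lengths = [len(word) for word in words]
    total = sum(lengths)
    return total / len(lengths) if lengths else 0.0


print(mean_word_length(["gradual", "static", "types"]))
```

A reader who sees only the signature already knows what goes in and what comes out, which is the part that is hard to recover from context.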

High expressiveness

Another thing constraining these type systems is that existing code wasn’t designed with static types in mind—for instance, using heterogeneous dictionaries where a statically typed language might use classes. That forces the type systems to add expressive features like typed dicts, structural typing, unions, intersections, and literal types. They’ll even analyze your control flow to constrain your types further.
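Here is a small Python sketch of several of those features working together: typed dicts, literal types, a union, and narrowing by control flow. The Circle and Rect shapes are hypothetical examples, and the narrowing described in the comments is what I would expect from a checker such as mypy rather than something enforced at runtime.

```python
import math
from typing import Literal, TypedDict, Union


class Circle(TypedDict):
    kind: Literal["circle"]
    radius: float


class Rect(TypedDict):
    kind: Literal["rect"]
    width: float
    height: float


# At runtime these are plain dicts; the union exists only for the checker.
Shape = Union[Circle, Rect]


def area(shape: Shape) -> float:
    # Comparing the literal "kind" field narrows the union: inside this
    # branch the checker can treat `shape` as a Circle.
    if shape["kind"] == "circle":
        return math.pi * shape["radius"] ** 2
    # The only remaining possibility here is Rect.
    return shape["width"] * shape["height"]


print(area({"kind": "rect", "width": 2.0, "height": 3.0}))
```

The calling code still passes ordinary heterogeneous dictionaries, so existing code would not need to be restructured into classes to be checked.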

As language features, the expressiveness and clean syntax of gradual types are great, but not exactly an innovation—they’re shared with many static-only languages (except for maybe the control flow analysis). But gradual type systems also have great features that no static-only type system I know of has achieved:

Seamless dynamic code

If you have some code that you don’t want to type, a gradual type system makes it easy for you to do that. This doesn’t sound so great (we should type everything, shouldn’t we?) but it’s really useful in a couple situations.

Metaprogramming: Code that dynamically creates classes or functions (for instance, an object-relational mapper) is often very hard to type soundly. In a statically typed language, that means you have to resort to code generation, which is much more laborious to write.¹ It’s probably not an accident that Java’s Hibernate is 10x the code of Python’s SQLAlchemy.²
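To give a flavor of the kind of code in question, here is a toy sketch that builds a record class at runtime from a list of column names. It is not how SQLAlchemy works internally; make_record_class and the User example are invented for illustration.

```python
def make_record_class(name, columns):
    """Build a class with one attribute per column name at runtime."""
    columns = tuple(columns)

    def __init__(self, **values):
        # Columns that are not passed in default to None.
        for column in columns:
            setattr(self, column, values.get(column))

    def __repr__(self):
        fields = ", ".join(f"{c}={getattr(self, c)!r}" for c in columns)
        return f"{name}({fields})"

    # type(name, bases, namespace) creates a brand-new class object.
    namespace = {"__init__": __init__, "__repr__": __repr__, "columns": columns}
    return type(name, (object,), namespace)


User = make_record_class("User", ["id", "email"])
user = User(id=1, email="a@example.com")
print(user)
```

Because the attribute names only exist once the function has run, a static checker has little to work with here. A gradual system lets this one function stay untyped while the rest of the codebase is checked.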

Testing: Test doubles like mocks and spies are way more verbose without the escape hatch of dynamic code. Some languages have frameworks like Mockito to make them less painful, but the contortions they have to go through to make the types work are often horrifying. By contrast, a fully general mock is easy to implement with Python’s __getattr__, Ruby’s method_missing, etc.
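For a sense of how little code that takes, here is a minimal spy built on Python’s __getattr__ hook. The Spy class and the fetch_user call are hypothetical; real code would more likely use unittest.mock.

```python
class Spy:
    """Accepts any method call, records it, and returns a canned value."""

    def __init__(self, return_value=None):
        self.calls = []
        self._return_value = return_value

    def __getattr__(self, name):
        # Only invoked when normal attribute lookup fails, so `calls` and
        # `_return_value` are still found the usual way.
        def method(*args, **kwargs):
            self.calls.append((name, args, kwargs))
            return self._return_value

        return method


spy = Spy(return_value=42)
result = spy.fetch_user(7, verbose=True)
print(result, spy.calls)
```

The spy never declares fetch_user, so there is no interface to extract or keep in sync with the real dependency.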

Unsoundness

A related benefit that you can’t get from static-only type systems is the ability to lie about types.

Why is this useful? Take the example of Python’s unittest.mock.MagicMock. It doesn’t inherit from the class it stands in for, but it’s a drop-in replacement for nearly anything. That would be forbidden in a type system like Java’s, but most gradual type systems will let you lie that the MagicMock is actually an AnnoyingToInstantiateDependency or whatever.³
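One way to state that lie explicitly in Python is typing.cast, sketched below. The AnnoyingToInstantiateDependency class body, its lookup method, and the greeting function are all invented for illustration.

```python
from typing import cast
from unittest.mock import MagicMock


class AnnoyingToInstantiateDependency:
    def __init__(self, host: str, port: int, credentials: bytes) -> None:
        raise RuntimeError("needs a real server to construct")

    def lookup(self, key: str) -> str:
        raise NotImplementedError


def greeting(dep: AnnoyingToInstantiateDependency, user_id: str) -> str:
    return "Hello, " + dep.lookup(user_id)


mock = MagicMock()
mock.lookup.return_value = "Ada"

# cast() does nothing at runtime; it only tells the checker to treat the
# mock as the real class. This is the "lie" about the type.
fake_dep = cast(AnnoyingToInstantiateDependency, mock)
message = greeting(fake_dep, "user-1")
print(message)
```

The function under test keeps its precise signature, and the test never has to construct the real dependency.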

This comes up in practice more than you might think. For instance, the mypy codebase annotates types that are lies with Bogus so that the mypyc static compiler doesn’t make wrong assumptions about their type; I count about 35 Boguses in mypy v0.660.

Five years ago, it seemed to me like static languages were obviously better for large codebases. They gave you two huge productivity boosts:

You could catch many trivial bugs without actually running your code.
You could have reliably good code completion.

I was confused why anyone would choose a dynamic language for something they expected to be a big project. But since then, dynamic languages have leapfrogged static ones. They captured the same productivity boosts—and they extended the idea of a type system in ways that let them stay cleaner and easier to write.

¹ If your statically typed language supports macros, you can also use macros. But macros are notorious for being hard to manage both for language designers and for users. I think macro systems have made progress recently, but I don’t have enough experience using them to compare them to metaprogramming in a dynamic language.
² As counted by git clone sqlalchemy/sqlalchemy; git clone hibernate/hibernate-orm; sloccount sqlalchemy hibernate-orm. This comparison is somewhat unfair, since Hibernate seems to be more popular, and Java is exceptionally verbose and has a culture of crappy architecture. It’s also possible Hibernate has more features or something. Still, it’s hard for me to imagine those making a 10x difference on their own.
³ Of course, in this case you could extract an interface and make the client take an IAnnoyingToInstantiateDependency, or use Mockito or something. But extracting interfaces everywhere can lead to huge boilerplate problems (as anyone who’s tried to read a complicated Java library can tell you), and Mockito is a whole complicated library built to solve the single use case of lying about mocks.

Comments

Jim

July 2020

There’s a (quoted?) claim here: “Static types speed up large projects, but slow down small projects.” That’s a commonly expressed viewpoint, BUT I haven’t seen any good data to support it. FWIW, my experience, in a statically typed language that I’m fluent in, is that types don’t significantly slow me down. Well, other than the tiny bit of time it takes to enter a few extra keystrokes, which I consider not even a rounding error in total coding time.

Another–arguably extraordinary–claim: “But since then, dynamic languages have leapfrogged static ones.” I’ll just say: citation needed.

Interestingly, I spent several years programming in a language (AS3) that is (~was) the opposite of the gradual typing discipline promoted here. It’s a statically (and nominally) typed language by default. But a class can be made dynamically typed, if desired, with trivial effort (one extra keyword on the class definition, if I remember correctly). At least for the types of applications that I was developing, I only very rarely made use of the dynamic type feature (the only cases that I can recall are objects for dynamically configured data tables/columns).

Candidly, I have no ’love’ for meta-programming, and I don’t mind using interfaces for test mocks, so those two viewpoints probably contribute towards my skepticism of the value of dynamic–or even gradual–typing.
