Types: Inference

You may have noticed that not everything is annotated (e.g., local variables). However, the typechecker is still able to reason about type mismatches. It fills in the annotation gaps through type inference.

Type inference is the deduction of what the type of a variable should be, based on what is already known. The typechecker infers types from the annotations it does see and from the current flow of the program.
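As a small illustration of inference from annotations, the sketch below is written in the same style as the other examples on this page. The namespace and the functions get_count() and describe() are hypothetical names made up for this illustration. The only annotations are on the function signatures; the types of the local variables are deduced from them.

```hack
<?hh

namespace Hack\UserDocumentation\Types\Inference\Examples\Annotations;

// Hypothetical helper. Its return annotation is one of the "knowns".
function get_count(): int {
  return 3;
}

function describe(): string {
  // $n has no annotation; it is inferred to be an int from get_count()
  $n = get_count();
  // int + int is an int, so $next is inferred to be an int as well
  $next = $n + 1;
  // Concatenation produces a string, which matches the return annotation
  return "next: " . $next;
}

function run(): void {
  var_dump(describe());
}

run();
```

If describe() were annotated to return an int instead, the typechecker would be expected to flag the return statement, since the inferred type of the expression is a string.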

Local Variables

Local variables are not type annotated. Their types are inferred based on the flow of the program. In fact, you can assign values of different types to the same local variable. The type of a local variable only matters when it is returned from a function or method, or when it is compared or otherwise used against a variable of a known type.

<?hh

namespace Hack\UserDocumentation\Types\Inference\Examples\LocalVariables;

function foo(): int {
  $a = str_shuffle("ABCDEF"); // $a is a string
  if (strpos($a, "A") === false) {
    $a = 4; // $a is an int
  } else {
    $a = 2; // $a is an int
  }
  // Based on the flow of the program, $a is guaranteed to be an int at this
  // point, so it is safe to return as an int.
  return $a;
}

function run(): void {
  var_dump(foo());
}

run();
Output
int(2)

Unresolved Types

The above example showed a case where a variable was assigned an int in both branches of the if/else. This makes it easy for the typechecker to determine that the variable can only be an int when it encounters the return.

However, what happens if, instead of assigning a value of the same type in both branches of a conditional, you assign a value of a different type in each branch?

<?hh

namespace Hack\UserDocumentation\Types\Inference\Examples\Unresolved;

function foo(): arraykey {
  $a = str_shuffle("ABCDEF"); // $a is a string
  if (strpos($a, "A") === false) {
    $a = 4; // $a is an int
  } else {
    $a = "Hello"; // $a is string
  }
  // Based on the flow of the program, at this point $a is either an int or
  // string. You have an unresolved type; or, to look at it another way, you
  // have the union of an int and string. So you can only perform operations
  // that can be performed on both of those types.

  var_dump($a + 20); // Nope. This isn't good for a string

  $arr = array();
  $arr[$a] = 4; // Fine. Since an array key can be an int or string

  // arraykey is fine since it is either an int or string
  return $a;
}

var_dump(foo());
Output
int(20)
string(5) "Hello"

In the conditional, we assign the same local variable a value of one of two types, depending on the branch. This makes the local variable unresolved, meaning that the typechecker knows that the variable can be one of the two types, but doesn't know which. So at this point, only operations that can be performed on both types are permitted.

Class Properties

Normally class properties are annotated, so the typechecker initially knows their expected type. But sometimes the typechecker has to make assumptions that make inferring further use of a property more complicated than it is for local variables.

<?hh

namespace Hack\UserDocumentation\Types\Inference\Examples\Props;

class A {
  protected ?int $x;

  public function __construct() {
    $this->x = 3;
  }

  public function setPropToNull(): void {
    $this->x = null;
  }

  public function checkPropBad(): void {
    // Typechecker knows $x isn't null after this validates
    if ($this->x !== null) {
      // We know that this doesn't call A::setPropToNull(), but the typechecker
      // does not, since inference is local to the function.
      does_not_set_to_null();
      // We know that $x is still not null, but the typechecker doesn't
      take_an_int($this->x);
    }
  }

  public function checkPropGood(): void {
    // Typechecker knows $x isn't null after this validates
    if ($this->x !== null) {
      // We know that this doesn't call A::setPropToNull(), but the typechecker
      // does not, since inference is local to the function.
      does_not_set_to_null();
      // Use this invariant to tell the typechecker what's happening.
      invariant($this->x !== null, "We know it is not null");
      // We know that $x is still not null, and now the typechecker does too
      // Could also have used a local variable here saying:
      //    $local = $this->x;
      //    take_an_int($local);
      take_an_int($this->x);
    }
  }
}

function does_not_set_to_null(): void {
  echo "I don't set A::x to null" . PHP_EOL;
}

function take_an_int(int $x): void {
  var_dump($x);
}

function run(): void {
  $a = new A();
  $a->checkPropBad();
  $a->checkPropGood();
}

run();
Output
I don't set A::x to null
int(3)
I don't set A::x to null
int(3)

The typechecker's inference is local to a function. It makes no assumptions about what might happen outside the function, for example when the function calls another function. That's why the typechecker reports an error in checkPropBad() even though we can see by reading the code that the property cannot be null at that point.

This issue is solved by copying the property value into a local variable and using that, or by using invariant().
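The local variable approach appears only as a comment in the example above, so here is a minimal sketch of it as a complete example. The namespace, the class B, and the method checkPropWithLocal() are hypothetical names chosen for this illustration; the helper functions mirror the ones used earlier.

```hack
<?hh

namespace Hack\UserDocumentation\Types\Inference\Examples\PropsLocal;

class B {
  protected ?int $x;

  public function __construct() {
    $this->x = 3;
  }

  public function checkPropWithLocal(): void {
    // Copy the property into a local variable before checking it
    $local = $this->x;
    if ($local !== null) {
      // As far as the typechecker knows, this call could change $this->x,
      // but it cannot change $local.
      does_not_set_to_null();
      // $local is still known to be an int here
      take_an_int($local);
    }
  }
}

function does_not_set_to_null(): void {
  echo "I don't set B::x to null" . PHP_EOL;
}

function take_an_int(int $x): void {
  var_dump($x);
}

function run(): void {
  $b = new B();
  $b->checkPropWithLocal();
}

run();
```

The design difference is that the local variable works from a snapshot of the property taken before the call, while invariant() re-checks the property itself at runtime after the call. Which one fits better depends on whether you need the current property value after the call.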