Skip to content

Instantly share code, notes, and snippets.

@smarter
Last active September 6, 2024 17:18
Show Gist options
  • Select an option

  • Save smarter/2e1c564c83bae58c65b4f3f041bfb15f to your computer and use it in GitHub Desktop.

Select an option

Save smarter/2e1c564c83bae58c65b4f3f041bfb15f to your computer and use it in GitHub Desktop.

Revisions

  1. smarter revised this gist Aug 7, 2016. 1 changed file with 2 additions and 2 deletions.
    4 changes: 2 additions & 2 deletions gadt.md
    Original file line number Diff line number Diff line change
    @@ -5,7 +5,7 @@
    Here's an ADT which is not a GADT, in Haskell:

    ```haskell
    date Expr = IntExpr Int | BoolExpr Bool
    data Expr = IntExpr Int | BoolExpr Bool
    ```

    And in Scala:
    @@ -21,7 +21,7 @@ And here's a GADT:
    ``` haskell
    data Expr a where
    IntExpr :: Int -> Expr Int
    BoolExpr :: Boolean -> Expr Bool
    BoolExpr :: Bool -> Expr Bool
    ```

    ``` scala
  2. smarter revised this gist Aug 7, 2016. 1 changed file with 1 addition and 1 deletion.
    2 changes: 1 addition & 1 deletion gadt.md
    Original file line number Diff line number Diff line change
    @@ -49,7 +49,7 @@ know more information about `T`: since `IntExpr` is a subtype of `Expr[Int]`,
    normally always keep the same meaning, this is why GADTs need to be treated specially
    by the compiler.

    *TODO*: expand this section with some details on how GADTs are implemented in Dotty
    **TODO**: expand this section with some details on how GADTs are implemented in Dotty

    ## A type soudness bug in pattern matching

  3. smarter revised this gist Aug 7, 2016. No changes.
  4. smarter revised this gist Aug 7, 2016. 1 changed file with 18 additions and 14 deletions.
    32 changes: 18 additions & 14 deletions gadt.md
    Original file line number Diff line number Diff line change
    @@ -31,7 +31,7 @@ case class BoolExpr(b: Boolean) extends Expr[Boolean]
    ```

    At first, it may look like GADTs should not require any special handling from
    the compiler in Scala since they're defined the same way as regular ADTs, but consider this example:
    the compiler in Scala since they're defined like regular ADTs, but consider this example:

    ``` scala
    object Test {
    @@ -46,10 +46,12 @@ object Test {
    `Boolean`, how does this typecheck? The answer is that inside each `case` we
    know more information about `T`: since `IntExpr` is a subtype of `Expr[Int]`,
    `T` must be equal to `Int`, etc. This is very unusual for types in Scala, they
    normally always keep the same meaning, this is why GADTs are treated specially
    in the compiler.
    normally always keep the same meaning, this is why GADTs need to be treated specially
    by the compiler.

    ## A type soudness buf in pattern matching
    *TODO*: expand this section with some details on how GADTs are implemented in Dotty

    ## A type soudness bug in pattern matching

    Consider the following code:
    ``` scala
    @@ -100,8 +102,8 @@ object Test {

    The issue is that when we typecheck `case d @ D(x) =>`, the compiler needs to
    infer the type parameter `S` of `D` under the constraint `D[S] <:< C[Object]`
    (`<:<` means `is a subtype of`). This constraint does uniquely determine `S`, at
    this point we only know that `S <:< Object`, but the compiler still decides to
    (`<:<` means `is a subtype of`). This constraint does not uniquely determine `S`, at
    this point we only know that `S <:< Object`, but the compiler still chooses to
    infer `S = Object` which is incorrect since `D[String]` is not a subtype of
    `D[Object]`.

    @@ -115,7 +117,7 @@ Non-variant type variable S cannot be uniquely instantiated.
    ^
    ```
    This warning becomes an error if `-strict` is passed to the compiler, this is not
    currently the default but is supposed to be in the future.
    currently the default but is supposed to become so in the future.
    What the warning says is that since we don't know
    exactly what `S` is (we only know `S <:< Object`) we should give up and fail to
    compile the pattern match.
    @@ -142,7 +144,7 @@ This code is perfectly safe but fails to compile under `-strict` because the
    type parameters `A` and `B` of `Pair(a, b)` are not constrained (we only know that `T =
    (A, B)`).

    #### Possible project: An alternative fix
    #### A possible fix

    I believe that it is possible to fix the soudness issue without preventing the
    previous example from compiling: instead of always assigning a specific type to
    @@ -159,7 +161,8 @@ we only know that `S` is some unknown type such that `S <:< Object`, this means
    that `d.s = new Integer(1)` won't compile, because we don't know if `Integer` is
    a subtype of `S`, on the other hand, the GADT example above will compile because
    even though we don't know anything about `A` and `B` themselves we, do know that
    `T = (A, B)`.
    `T = (A, B)`. Implementing this should not be too complicated because a lot of
    the required machinery is already present.


    ## GADTs and variance
    @@ -169,13 +172,14 @@ Combining GADTs with variance lead to some interesting issues, see

    ## GADTs and exhaustiveness checking

    The exhaustiveness checker is a component of the compiler that warns you when
    The exhaustiveness checker is the component of the compiler that warns you when
    you forget some cases in a pattern match, a new implementation for Dotty is
    currently being worked on at https://github.com/lampepfl/dotty/pull/1364,
    however GADTs are especially challenging to analyze, GADT exhaustiveness
    checking was recently improved in Haskell as described in
    currently being worked on at https://github.com/lampepfl/dotty/pull/1364 and
    already covers more cases than the existing Scala 2 checker.
    GADTs are especially challenging to analyze, a recent paper describes improvement
    in this area done for Haskell:
    [GADTs meet their match](http://research.microsoft.com/en-us/um/people/simonpj/papers/pattern-matching/gadtpm.pdf),
    it might be possible to adapt their algorithm to Scala.
    it might be possible to adapt this to Scala.

    ## Special syntax for GADTs ?

  5. smarter created this gist Aug 7, 2016.
    184 changes: 184 additions & 0 deletions gadt.md
    Original file line number Diff line number Diff line change
    @@ -0,0 +1,184 @@
    # Generalized Algebraic Data Types in Scala

    ## Basic GADTs

    Here's an ADT which is not a GADT, in Haskell:

    ```haskell
    date Expr = IntExpr Int | BoolExpr Bool
    ```

    And in Scala:
    ``` scala
    sealed trait Expr
    case class IntExpr(i: Int) extends Expr
    case class BoolExpr(b: Boolean) extends Expr

    ```

    And here's a GADT:

    ``` haskell
    data Expr a where
    IntExpr :: Int -> Expr Int
    BoolExpr :: Boolean -> Expr Bool
    ```

    ``` scala
    sealed trait Expr[T]
    case class IntExpr(i: Int) extends Expr[Int]
    case class BoolExpr(b: Boolean) extends Expr[Boolean]
    ```

    At first, it may look like GADTs should not require any special handling from
    the compiler in Scala since they're defined the same way as regular ADTs, but consider this example:

    ``` scala
    object Test {
    def eval[T](e: Expr[T]): T = e match {
    case IntExpr(i) => i
    case BoolExpr(b) => b
    }
    }
    ```

    `eval` must return a value of type `T`, but `i` has type `Int` and `b` has type
    `Boolean`, how does this typecheck? The answer is that inside each `case` we
    know more information about `T`: since `IntExpr` is a subtype of `Expr[Int]`,
    `T` must be equal to `Int`, etc. This is very unusual for types in Scala, they
    normally always keep the same meaning, this is why GADTs are treated specially
    in the compiler.

    ## A type soudness buf in pattern matching

    Consider the following code:
    ``` scala
    case class A[S](_s: S) {
    var s: S = _s
    }

    ```

    The parameter `S` of `A` is invariant, which means that `A[String]` is not a
    subtype of `A[Object]`, if we replace `A[S]` by `A[+S]` to make `S` covariant,
    the compiler will not allow us to compile `A` until we remove the `var`, this is
    important for type soundness, otherwise we would be able to write the following:

    ``` scala
    val x = new A[String]("abc")
    val y: A[Object] = x // Does not compile because A[String] is not a subtype of A[Object]
    y.s = new Integer(1)
    val z: String = x.s // If this compiled, we would get a crash here since x.s is
    // now an Integer
    ```

    However, by combining pattern matching with variance, we can get around this
    restriction in Scala 2, the following code compiles without any warning or error
    but crashes at runtime with a `ClassCastException`:

    ``` scala
    class C[+T]

    case class D[S](_s: S) extends C[S] {
    var s: S = _s
    }

    object Test {
    def main(args: Array[String]): Unit = {
    val x = new D[String]("abc")
    val y: C[Object] = x

    y match {
    case d @ D(x) =>
    d.s = new Integer(1)
    }

    val z: String = x.s // ClassCastException: Integer cannot be cast into String
    }
    }
    ```

    The issue is that when we typecheck `case d @ D(x) =>`, the compiler needs to
    infer the type parameter `S` of `D` under the constraint `D[S] <:< C[Object]`
    (`<:<` means `is a subtype of`). This constraint does uniquely determine `S`, at
    this point we only know that `S <:< Object`, but the compiler still decides to
    infer `S = Object` which is incorrect since `D[String]` is not a subtype of
    `D[Object]`.

    In Dotty this code still compiles by default but you get a warning:
    ``` scala
    warning: There is no best instantiation of pattern type D[S]
    that makes it a subtype of selector type C[Object].
    Non-variant type variable S cannot be uniquely instantiated.
    (This would be an error under strict mode)
    case d @ D(x) =>
    ^
    ```
    This warning becomes an error if `-strict` is passed to the compiler, this is not
    currently the default but is supposed to be in the future.
    What the warning says is that since we don't know
    exactly what `S` is (we only know `S <:< Object`) we should give up and fail to
    compile the pattern match.

    ### The strict mode is too restrictive

    So far, what I've described is not specific to GADTs, however using `-strict`
    currently prevents many useful GADTs from compiling:

    ``` scala
    trait Exp[T]
    case class Lit(value: Int) extends Exp[Int]
    case class Pair[A, B](fst: Exp[A], snd: Exp[B]) extends Exp[(A, B)]

    object Test {
    def eval[T](e: Exp[T]): T = e match {
    case Lit(x) => x
    case Pair(a, b) => (eval(a), eval(a))
    }
    }
    ```

    This code is perfectly safe but fails to compile under `-strict` because the
    type parameters `A` and `B` of `Pair(a, b)` are not constrained (we only know that `T =
    (A, B)`).

    #### Possible project: An alternative fix

    I believe that it is possible to fix the soudness issue without preventing the
    previous example from compiling: instead of always assigning a specific type to
    type variables of patterns, we should treat them as abstract bounded types, so
    in:

    ``` scala
    y match {
    case d @ D(x) =>
    d.s = new Integer(1)
    }
    ```
    we only know that `S` is some unknown type such that `S <:< Object`, this means
    that `d.s = new Integer(1)` won't compile, because we don't know if `Integer` is
    a subtype of `S`, on the other hand, the GADT example above will compile because
    even though we don't know anything about `A` and `B` themselves we, do know that
    `T = (A, B)`.


    ## GADTs and variance

    Combining GADTs with variance lead to some interesting issues, see
    [Open GADTs and Declaration-site Variance: A Problem Statement](http://lampwww.epfl.ch/~hmiller/scala2013/resources/pdfs/paper5.pdf).

    ## GADTs and exhaustiveness checking

    The exhaustiveness checker is a component of the compiler that warns you when
    you forget some cases in a pattern match, a new implementation for Dotty is
    currently being worked on at https://github.com/lampepfl/dotty/pull/1364,
    however GADTs are especially challenging to analyze, GADT exhaustiveness
    checking was recently improved in Haskell as described in
    [GADTs meet their match](http://research.microsoft.com/en-us/um/people/simonpj/papers/pattern-matching/gadtpm.pdf),
    it might be possible to adapt their algorithm to Scala.

    ## Special syntax for GADTs ?

    For Dotty we are considering adding some syntactic sugar to make ADTs less
    verbose to write, it would be interesting to figure out if and how this syntax could
    allow defining GADTs.