Saturday, February 27, 2016

Auto-generate a command line interface from a data type

I'm releasing the optparse-generic library, which uses Haskell's support for generic programming to auto-generate command-line interfaces for a wide variety of types.

For example, suppose that you define a record with two fields:

data Example = Example { foo :: Int, bar :: Double }

You can auto-generate a command-line interface tailored to that record like this:

{-# LANGUAGE DeriveGeneric     #-}
{-# LANGUAGE OverloadedStrings #-}

import Options.Generic

data Example = Example { foo :: Int, bar :: Double }
    deriving (Generic, Show)

instance ParseRecord Example

main = do
    x <- getRecord "Test program"
    print (x :: Example)

This generates the following command-line interface:

$ stack runghc Example.hs -- --help
Test program

Usage: Example.hs --foo INT --bar DOUBLE

Available options:
  -h,--help                Show this help text

... and we can verify that the interface works by supplying the appropriate arguments:

$ stack runghc Example.hs -- --foo 1 --bar 2.5
Example {foo = 1, bar = 2.5}

You can also compile the program into a native executable binary:

$ stack ghc Example.hs
[1 of 1] Compiling Main             ( Example.hs, Example.o )
Linking Example ...
$ ./Example --foo 1 --bar 2.5
Example {foo = 1, bar = 2.5}

Features

The auto-generated interface tries to be as intelligent as possible. For example, if you omit the record labels:

data Example = Example Int Double

... then the fields will become positional arguments:

$ ./Example --help
Test program

Usage: Example INT DOUBLE

Available options:
  -h,--help                Show this help text

$ ./Example 1 2.5
Example 1 2.5

If you wrap a field in Maybe:

data Example = Example { foo :: Maybe Int }

... then the corresponding command-line flag/argument becomes optional:

$ ./Example --help
Test program

Usage: Example [--foo INT]

Available options:
  -h,--help                Show this help text

$ ./Example
Example {foo = Nothing}

$ ./Example --foo 2
Example {foo = Just 2}

If a field is a list of values:

data Example = Example { foo :: [Int] }

... then the corresponding command-line flag/argument can be repeated:

$ ./Example --foo 1 --foo 2
Example {foo = [1,2]}

$ ./Example
Example {foo = []}
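
The optional and repeated cases above can be combined in one record. The following is a minimal sketch, assuming a version of optparse-generic that exports getRecordPure (a pure variant of getRecord that takes the argument list explicitly; it is not used elsewhere in this post). The helper name parse is made up for illustration, and the record is given an extra bar field so both behaviours show up side by side.

```haskell
{-# LANGUAGE DeriveGeneric     #-}
{-# LANGUAGE OverloadedStrings #-}

import Data.Text (Text)
import Options.Generic

-- One optional flag and one repeatable flag in the same record.
data Example = Example { foo :: Maybe Int, bar :: [Int] }
    deriving (Eq, Generic, Show)

instance ParseRecord Example

-- Hypothetical helper: parse an explicit argument list instead of the
-- real command line (assumes getRecordPure is available).
parse :: [Text] -> Maybe Example
parse = getRecordPure

main :: IO ()
main = do
    -- No arguments: the Maybe field is absent and the list is empty.
    print (parse [])
    -- The optional flag given once, the list flag repeated.
    print (parse ["--foo", "2", "--bar", "1", "--bar", "2"])
    -- A value that cannot be read as an Int makes the whole parse fail.
    print (parse ["--foo", "not-a-number"])
```

If the flags behave as described above, the first two lines should print Just (Example {foo = Nothing, bar = []}) and Just (Example {foo = Just 2, bar = [1,2]}), and the malformed input should yield Nothing.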

If you wrap a value in First or Last:

data Example = Example { foo :: First Int, bar :: Last Int }

... then you will get the first or last match, respectively:

$ ./Example --foo 1 --foo 2 --bar 1 --bar 2
Example {foo = First {getFirst = Just 1}, bar = Last {getLast = Just 2}}

$ ./Example
Example {foo = First {getFirst = Nothing}, bar = Last {getLast = Nothing}}

You can even do fancier things like ask for the Sum or Product of all matching fields:

data Example = Example { foo :: Sum Int, bar :: Product Int }

... and it will do the "right thing":

$ ./Example --foo 1 --foo 2 --bar 1 --bar 2
Example {foo = Sum {getSum = 3}, bar = Product {getProduct = 2}}

$ ./Example
Example {foo = Sum {getSum = 0}, bar = Product {getProduct = 1}}
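
One way to read the results above is through the Monoid instances of these wrappers: each occurrence of a flag is wrapped, and all occurrences are combined, with the monoid's identity element used when the flag is absent. The sketch below uses only the base library and is a model of the observable behaviour, not a description of how optparse-generic is implemented internally. The helper combineWith is a made-up name.

```haskell
import Data.Monoid (First (..), Last (..), Product (..), Sum (..))

-- Hypothetical helper: wrap every value, then combine the wrapped values
-- with the wrapper's Monoid instance.
combineWith :: Monoid m => (a -> m) -> [a] -> m
combineWith wrap = mconcat . map wrap

main :: IO ()
main = do
    print (combineWith Sum [1, 2 :: Int])             -- Sum {getSum = 3}
    print (combineWith Product [1, 2 :: Int])         -- Product {getProduct = 2}
    print (combineWith (First . Just) [1, 2 :: Int])  -- First {getFirst = Just 1}
    print (combineWith (Last . Just) [1, 2 :: Int])   -- Last {getLast = Just 2}
    -- No occurrences at all: the result is the identity element.
    print (combineWith Sum ([] :: [Int]))             -- Sum {getSum = 0}
```

These values line up with the command-line sessions shown above, including the empty cases.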

If a data type has multiple constructors:

data Example
    = Create { name :: Text, duration :: Maybe Int }
    | Kill   { name :: Text }

... then that translates to subcommands named after each constructor:

$ ./Example --help
Test program

Usage: Example (create | kill)

Available options:
  -h,--help                Show this help text

Available commands:
  create
  kill

$ ./Example create --help
Usage: Example create --name TEXT [--duration INT]

Available options:
  -h,--help                Show this help text

$ ./Example kill --help
Usage: Example kill --name TEXT

Available options:
  -h,--help                Show this help text

$ ./Example create --name foo --duration 60
Create {name = "foo", duration = Just 60}

$ ./Example kill --name foo
Kill {name = "foo"}
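
For completeness, a full program for the subcommand example might look like the following sketch. It reuses the data type from above and the same getRecord call as the first example; the describe function is a hypothetical handler added only to show how a program could dispatch on the parsed constructor.

```haskell
{-# LANGUAGE DeriveGeneric     #-}
{-# LANGUAGE OverloadedStrings #-}

import Data.Text (Text)
import qualified Data.Text as Text
import Options.Generic

data Example
    = Create { name :: Text, duration :: Maybe Int }
    | Kill   { name :: Text }
    deriving (Generic, Show)

instance ParseRecord Example

-- Hypothetical handler: pattern match on the constructor that the
-- chosen subcommand produced.
describe :: Example -> String
describe (Create n (Just d)) = "create " ++ Text.unpack n ++ " for " ++ show d ++ "s"
describe (Create n Nothing)  = "create " ++ Text.unpack n
describe (Kill n)            = "kill " ++ Text.unpack n

main :: IO ()
main = do
    command <- getRecord "Test program"
    putStrLn (describe command)
```

Because the subcommand is just a constructor of an ordinary data type, the usual exhaustiveness warnings apply: adding a third constructor without extending describe is something the compiler can flag.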

This library also supports many existing Haskell data types out of the box. For example, if you only need a Double and an Int from the command line, you could write:

{-# LANGUAGE DeriveGeneric     #-}
{-# LANGUAGE OverloadedStrings #-}

import Options.Generic

main = do
    x <- getRecord "Test program"
    print (x :: (Double, Int))

... and that will parse two positional arguments:

$ ./Example --help
Test program

Usage: Example DOUBLE INT

Available options:
  -h,--help                Show this help text

$ ./Example 1.1 2
(1.1,2)

Compile-time safety

Haskell's support for generic programming works entirely at compile time. This means that if you ask for something that cannot be sensibly converted into a command-line interface, your program will fail to compile.

For example, if you ask for a list of lists:

data Example = Example { foo :: [[Int]] }

... then the compiler will fail with the following error message, since you can't (idiomatically) model "repeated (repeated Ints)" on the command line:

    No instance for (ParseField [Int])
      arising from a use of ‘Options.Generic.$gdmparseRecord’
    In the expression: Options.Generic.$gdmparseRecord
    In an equation for ‘parseRecord’:
        parseRecord = Options.Generic.$gdmparseRecord
    In the instance declaration for ‘ParseRecord Example’
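
The shape of that error can be reproduced with a toy model that uses only the base library. The classes ToyField and ToyFields below are invented stand-ins, loosely mirroring the split between a single field and a group of fields that the error message hints at; they are not the library's real classes, and the parsing is deliberately simplistic.

```haskell
import Text.Read (readMaybe)

-- Toy stand-in: types that can be read from exactly ONE argument.
class ToyField a where
    readField :: String -> Maybe a

instance ToyField Int where
    readField = readMaybe

-- Toy stand-in: types that can be built from ZERO OR MORE arguments.
class ToyFields a where
    readFields :: [String] -> Maybe a

-- A list is accepted only when its element type is a single field.
instance ToyField a => ToyFields [a] where
    readFields = traverse readField

main :: IO ()
main = do
    print (readFields ["1", "2"] :: Maybe [Int])  -- Just [1,2]
    print (readFields ["1", "x"] :: Maybe [Int])  -- Nothing
    -- Uncommenting the next line is rejected at compile time, because
    -- there is no ToyField instance for [Int]:
    -- print (readFields ["1"] :: Maybe [[Int]])
```

The list instance asks for a single-field instance of its element type, and no such instance exists for a list, so the nested case is ruled out before the program ever runs.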

Conclusion

If you would like to use this package or learn more you can find this package:

I also plan to re-export this package's functionality from turtle to further simplify command-line programming.

17 comments:

  1. This library is similar, even though it has fewer features and seems incomplete: https://github.com/soenkehahn/getopt-generics

    1. Yeah, that does look pretty similar. After studying it a bit, I think I can summarize the main differences as:

      * `optparse-generic` produces `Parser`s compatible with the `optparse-applicative` library
      * `optparse-generic` supports sum types
      * `optparse-generic` has instances for more types (particularly `Char`, which is tricky to do without conflicting with `String`)
      * `getopt-generics` lets you combine records using tuples
      * `getopt-generics` lets you add help information and short flags with the `Modifiers` feature

  2. I love you Gabriel! You know what I need. Unlike Edward Kmett or Michael Snoyman.

    But this approach has a lot of restrictions: no short keys, no options, no descriptions for the keys and commands. I think it is possible to fix this with the help of type-level string literals. It also creates insoluble problems with internationalization.

    1. Wtf are Kmett or Snoyman doing in that rant? :(

    2. Yeah, this is mostly intended as a way to quickly bind Haskell code to a command-line interface. However, I'm looking into ways to add help text.

  3. This is cool! I did something similar (just a week ago!) but with TH, and fewer features. I was hoping to re-implement with Generics but was stuck so I will look to your work for hints.

    My idea was also more general: to build lots of things from a data type. My original plan was an HTML form for an FRP library. But then I realized that as long as the thing you use to build each type is applicative plus a method to handle sum-types, you can make a fairly general builder.

    My idea so far is here: https://github.com/adamConnerSax/dataBuilder

    Have you thought about making the infrastructure more general? It seems like it would have a few applications--any kind of user input really: html forms, command line, etc.

    It also starts to seem sort of lensish though I can't make that analogy precise.

    Anyway, cool work!

    Adam

    1. Note that an `Applicative` with a method to handle sum types is Haskell's `Alternative` class. A typical idiom for using `Alternative` to combine multiple values is:

      data Example = C1 T1 | C2 T2 | C3 T3

      instance Alternative F

      v1 :: F T1
      v2 :: F T2
      v3 :: F T3

      v :: F Example
      v = fmap C1 v1 <|> fmap C2 v2 <|> fmap C3 v3

      I agree that once you have some type constructor `F` that implements `Alternative` you should be able to derive some way to mix that with a `Generic` instance in a uniform way.

      One way you might provide a uniform interface is to do something like this:

      class Uniform a where
          auto :: Alternative f => f a

      instance (Uniform a, Uniform b) => Uniform (a, b) where
          auto = liftA2 (,) auto auto

      instance (Uniform a, Uniform b) => Uniform (Either a b) where
          auto = fmap Left auto <|> fmap Right auto

      instance Uniform Void where
          auto = empty

      instance Uniform () where
          auto = pure ()

      ... and then given any type that implements `Generic` you can automatically derive a `Uniform` instance for the same type. Then you just instantiate the `f` to your specific `Alternative` type and you are done.

      Would you mind if I release something like the above as a new package?

    2. Yeah. I thought about alternative but I wasn't sure I could embed the metadata (and maybe configuration info) I wanted to pass around into it.

      But I'll look again!

      And of course release what you think is useful! I look forward to seeing the design more fully.

      Adam

    3. I remember now why `Alternative` didn't work for me. I am interested in cases where I need all the alternatives before I can choose (e.g., turning the choice among constructors of a sum type into a dropdown in an HTML form). But to use `Alternative` I need to combine them pairwise, right? I couldn't see how that would work for me.

      Adam

    4. One more thing, on the Parser specific case: you might want to do something specific for the Enum (or Enum and Ord and Bounded) case. You likely don't want a command there so much as a set of single flag options. But YMMV.

    5. So in the case of `Enum` the user can always opt in to the version that you describe by deferring to the `ParseFields` instance like this:

      instance ParseRecord MyEnum where
          parseRecord = fmap getOnly parseFields

  5. Love the simplicity of this! Nice work as always.

    Can this be used with stack? When I run the binary built from a `stack install` I just get a "bad value of -N" error and then a list of opts for the GHC runtime...

    1. Could you open up an issue for this here: https://github.com/Gabriel439/Haskell-Optparse-Generic-Library/issues

  6. Amazing!

    Any plans on adding support for records whose fields are other records?

    1. I didn't have any plans to add this, but if you think there is a reasonable way to do this then you can open up an issue to discuss it in more detail.
