A consistent error format for Gargantext (#267) · Issues · gargantext / haskell-gargantext

A consistent error format for Gargantext

This ticket stems from purescript-gargantext#596.

Consistent errors

In Gargantext, the way we handle error is a bit all over the place. For historical reasons, we don't have a monomorphic error type in the handlers; what I mean is that all our server handlers and functions are typically polymorphic over a generic error err, which only at the top-level becomes a concrete GargError. This is not a bad design per se, because it allows us to swap the errors when testing a mock implementation. However, this also naturally brings us in a place where multiple errors might be defined in any style, without paying too much attention to consistency between them. Furthermore, we are not always disciplined and consistent when it comes to throwing errors; for example in some occasions we simply panic, returning junk to the frontend. Last but not least we are not consistent in how we transform those errors into a suitable JSON for the frontend to consume. To give you an example, here is how a GargNodeError is translated to JSON:

{"error":"GargNodeError No Root   found"}

In the same fashion, here is how a GargTreeError:

{"error":"GargTreeError Too many root nodes"}

And finally, here is a GargServerError:

{"error":"GargServerError (ServerError {errHTTPCode = 404, errReasonPhrase = "Not Found", errBody = "", errHeaders = []})"}

I hope I won't rub anybody the wrong way, but these are horrible. They are horrible for the following reasons:

They offers zero clue about what went wrong -- what the frontend should try to report to the user if we get a "Too many root nodes"?
They offers zero clue about the precise HTTP error code they need to be translated into. A REST server should (and perhaps MUST) harness a very rich spectrum of HTTP error codes to report to the frontend what is the nature of the error, spanning from unauthorised errors (403) to bad requests (400) to internal errors (500) and so on. In Gargantext we are doing none of that, we simply call showAsServantJSONErr and mush everything into either a 404 or a 500, which is just a lie;
Worst of all, they are not structured and/or consistent. In all three examples, the frontend has to parse a string in order to at least understand the nature of the error, but even so this string might contain sometimes just a text for the user, in the last case, for example directly the shown Haskell data types, something surely the frontend shouldn't have seen (not to mention that it cannot do anything with it).

Proposal and solution

Let's overhaul all of this in favour of a simple but consistent error type format, that the frontend can trust and rely upon. A possible sketch:

{ "error": {
    "diagnostic": "A human-friendly diagnostic of what went wrong"
  , "type": <an enum-style of the error>
  , "data": <any error-specific payload>
  }
}

In pseudo Haskell (this is just a sketch):

data FrontendError a
  = FrontendError
    { fe_diagnostic :: T.Text
    , fe_type :: BackendErrorType
    , fe_data :: Maybe a
    }

data BackendErrorType
  = BE_phylo_corpus_not_ready
  | BE_not_good_enough_ratio
  | BE_node_not_found
  | ... you get the idea

This will produce (if implemented correctly!) consistent errors, for example:

{ "error": {
    "diagnostic": "Your corpus and map list may not be ready for building a phylomemy."
  , "type": "BE_phylo_corpus_not_ready"
  , "data": null -- no extra info to add here, but we could
  }
}

Now we could easily write a function like BackendErrorType -> HTTP.Status, and return an HTTP response to the frontend with a correct error. The frontend, in turn, will know that if they get an object with a non-null error field, they will always find a diagnostic field to display directly to the user, any extra data if they need to do further processing and a precise type tag that tells them what the error was, in case they want to show more sophisticated things to the users.

A notable mention is that we might not need to throw FrontendErrors directly from the guts of Gargantext, but we could convert a GargServerError into a FrontendError directly inside what is now showAsServantJSONErr, but in order to do so we might need to extend existing backend errors to carry around enough information in order to perform such conversion.

Good error messages for the user

Then there is the concern of how to expose good error messages to our users; I think that it's the backend which should be in charge of producing such textual message, which I'm referring to as diagnostic from now on. Why? Because the backend will know exactly what has been causing a given error in the first place, so it can remove the "guesswork". In the Purescript example, one proposed error was:

"Oops something went wrong ! Your corpus and map list may not be ready for building a phylomemy. Have you annotated the entire time period of your corpus ? Have you a good ratio between specific and generic terms in your map list ?"

I think this is too broad and generic, we should try to tell the user something more specific, and we can do this only if we are principled about the way we issue errors in the backend, and know exactly the reason we are issuing an error in the first place.

Localising text

However, we shouldn't assume our audience all speak English, but rather we should plan ahead in a way that Gargantext can be translated in the future in other languages; this is what is known as i18n. There are various approaches which can be taken here, but a popular one is that we store the user's locale in their account settings, and this information is made available in the backend when we handle a request. Then, we use a Haskell library like this one to convert the diagnostic from English into the user's locale, and that's what we return to the frontend.

Mind, we don't have to start supporting anything different from English immediately, but we should design our code in a way that it's easy to add more locales down the line.

This ticket stems from https://gitlab.iscpif.fr/gargantext/purescript-gargantext/issues/596.

cc @cgenie and @anoe

## Consistent errors

In Gargantext, the way we handle error is a bit all over the place. For historical reasons, we don't have a _monomorphic_ error type in the handlers; what I mean is that all our server handlers and functions are typically polymorphic over a generic error `err`, which only at the top-level becomes a concrete `GargError`. This is not a bad design per se, because it allows us to swap the errors when testing a mock implementation. However, this also naturally brings us in a place where multiple errors might be defined in any style, without paying too much attention to consistency between them. Furthermore, we are not always disciplined and consistent when it comes to throwing errors; for example in some occasions we simply `panic`, returning junk to the frontend. Last but not least we are not **consistent** in how we transform those errors into a suitable JSON for the frontend to consume. To give you an example, here is how a `GargNodeError` is translated to JSON:

```
{"error":"GargNodeError No Root   found"}
```

In the same fashion, here is how a `GargTreeError`:

```
{"error":"GargTreeError Too many root nodes"}
```

And finally, here is a `GargServerError`:

```
{"error":"GargServerError (ServerError {errHTTPCode = 404, errReasonPhrase = "Not Found", errBody = "", errHeaders = []})"}
```

I hope I won't rub anybody the wrong way, but these are **horrible**. They are horrible for the following reasons:

1. They offers _zero_ clue about what went wrong -- what the frontend should try to report to the user if we get a "Too many root nodes"?

2. They offers zero clue about the precise _HTTP error code_ they need to be translated into. A REST server should (and perhaps MUST) harness a very rich spectrum of HTTP error codes to report to the frontend what is the nature of the error, spanning from unauthorised errors (403) to bad requests (400) to internal errors (500) and so on. In Gargantext we are doing none of that, we simply call `showAsServantJSONErr` and mush everything into either a 404 or a 500, which is just a lie;

3. Worst of all, they are not _structured_ and/or consistent. In all three examples, the frontend has to **parse a string** in order to at least understand the nature of the error, but even so this string might contain sometimes just a text for the user, in the last case, for example _directly the shown Haskell data types_, something surely the frontend shouldn't have seen (not to mention that it cannot do anything with it).

### Proposal and solution

Let's overhaul all of this in favour of a simple but consistent error type format, that the frontend can trust and rely upon. A possible sketch:

```
{ "error": {
    "diagnostic": "A human-friendly diagnostic of what went wrong"
  , "type": <an enum-style of the error>
  , "data": <any error-specific payload>
  }
}
```

In pseudo Haskell (this is just a sketch):

```hs
data FrontendError a
  = FrontendError
    { fe_diagnostic :: T.Text
    , fe_type :: BackendErrorType
    , fe_data :: Maybe a
    }

data BackendErrorType
  = BE_phylo_corpus_not_ready
  | BE_not_good_enough_ratio
  | BE_node_not_found
  | ... you get the idea
```

This will produce (if implemented correctly!) consistent errors, for example:

```
{ "error": {
    "diagnostic": "Your corpus and map list may not be ready for building a phylomemy."
  , "type": "BE_phylo_corpus_not_ready"
  , "data": null -- no extra info to add here, but we could
  }
}
```

Now we could easily write a function like `BackendErrorType -> HTTP.Status`, and return an HTTP response to the frontend with a correct error. The frontend, in turn, will know that if they get an object with a non-null `error` field, they will always find a `diagnostic` field to display directly to the user, any extra `data` if they need to do further processing _and_ a precise type tag that tells them what the error was, in case they want to show more sophisticated things to the users.

A notable mention is that we might not need to throw `FrontendError`s directly from the guts of Gargantext, but we could convert a `GargServerError` into a `FrontendError` directly inside what is now `showAsServantJSONErr`, but in order to do so we might need to extend existing backend errors to carry around enough information in order to perform such conversion.

## Good error messages for the user

Then there is the concern of how to expose good error messages to our users; I think that it's the _backend_ which should be in charge of producing such textual message, which I'm referring to as `diagnostic` from now on. Why? Because the backend will know exactly what has been causing a given error in the first place, so it can remove the "guesswork". In the Purescript example, one proposed error was:

> "Oops something went wrong ! Your corpus and map list may not be ready for building a phylomemy. Have you annotated the entire time period of your corpus ? Have you a good ratio between specific and generic terms in your map list ?"

## Localising text

However, we shouldn't assume our audience all speak English, but rather we should plan ahead in a way that Gargantext can be translated in the future in other languages; this is what is known as [i18n](https://en.wikipedia.org/wiki/Internationalization_and_localization). There are various approaches which can be taken here, but a popular one is that we store the user's _locale_ in their account settings, and this information is made available in the backend when we handle a request. Then, we use a Haskell library like [this one](https://hackage.haskell.org/package/i18n) to convert the `diagnostic` from English into the user's locale, and that's what we return to the frontend.

Mind, we don't have to start supporting anything different from English immediately, but we should design our code in a way that it's easy to add more locales down the line.