From 41c53b3f1efc7dbc713bc74917385db510d1f468 Mon Sep 17 00:00:00 2001
From: Matthijs Blom <19817960+MatthijsBlom@users.noreply.github.com>
Date: Tue, 14 Mar 2023 15:41:22 +0100
Subject: [PATCH 1/2] Add introduction

---
 .../.approaches/introduction.md | 204 ++++++++++++++++++
 1 file changed, 204 insertions(+)
 create mode 100644 exercises/practice/rna-transcription/.approaches/introduction.md

diff --git a/exercises/practice/rna-transcription/.approaches/introduction.md b/exercises/practice/rna-transcription/.approaches/introduction.md
new file mode 100644
index 000000000..680ee702a
--- /dev/null
+++ b/exercises/practice/rna-transcription/.approaches/introduction.md
@@ -0,0 +1,204 @@
+# Introduction
+
+This problem requires both
+
+- validating that all input characters denote DNA nucleobases, and
+- producing these DNA nucleobases' corresponding RNA nucleobases.
+
+The first approach listed below performs these tasks separately.
+The others combine them in a single pass, in progressively more succinct ways.
+
+
+## Approach: validate first, then transcribe
+
+```haskell
+toRNA :: String -> Either Char String
+toRNA dna =
+  case find (`notElem` "GCTA") dna of
+    Nothing -> Right (map transcribe dna)
+    Just c -> Left c
+  where
+    transcribe = \case
+      'G' -> 'C'
+      'C' -> 'G'
+      'T' -> 'A'
+      'A' -> 'U'
+```
+
+Start by searching for the first invalid nucleobase.
+If you find one, return it.
+If all are valid, transcribe the entire strand in one go using `map`.
+
+This approach walks the input twice.
+Other approaches solve this problem in one pass.
+
+This solution also deals with nucleobases twice: first when validating, and again when transcribing.
+Ideally, nucleobases are dealt with in only one place in the code.
+
+[Read more about this approach][validate-first].
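+
+The snippet above leaves out its pragma and import.
+A minimal self-contained sketch, assuming `find` is taken from `Data.List` and the `LambdaCase` extension is enabled (the comments are illustrative additions), could look like this:
+
+```haskell
+{-# LANGUAGE LambdaCase #-}
+
+import Data.List (find)
+
+toRNA :: String -> Either Char String
+toRNA dna =
+  -- First pass: look for the first character that is not a DNA nucleobase.
+  case find (`notElem` "GCTA") dna of
+    -- Second pass: every character is valid, so transcribe them all.
+    Nothing -> Right (map transcribe dna)
+    Just c -> Left c
+  where
+    -- Only ever applied to characters that passed validation.
+    transcribe = \case
+      'G' -> 'C'
+      'C' -> 'G'
+      'T' -> 'A'
+      'A' -> 'U'
+```
+
+Loaded into GHCi, `toRNA "GCTA"` evaluates to `Right "CGAU"` and `toRNA "GCXA"` to `Left 'X'`.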
+
+
+## Approach: a single pass using only elementary operations
+
+```haskell
+toRNA :: String -> Either Char String
+toRNA [] = Right []
+toRNA (n : dna) = case transcribe n of
+  Nothing -> Left n
+  Just n' -> case toRNA dna of
+    Left c -> Left c
+    Right rna -> Right (n' : rna)
+
+transcribe :: Char -> Maybe Char
+transcribe = \case
+  'G' -> Just 'C'
+  'C' -> Just 'G'
+  'T' -> Just 'A'
+  'A' -> Just 'U'
+  _ -> Nothing
+```
+
+This solution combines validation and transcription in a single list traversal.
+It is _elementary_ in the sense that it employs no abstractions: it uses only constructors (`[]`, `(:)`, `Nothing`, `Just`, `Left`, `Right`) and pattern matching, and no predefined functions at all.
+
+Some of the code patterns used in this solution are very common, and have therefore been abstracted into standard library functions.
+The approaches listed below show how much these functions can help to express this approach's logic concisely.
+
+[Read more about this approach][elementary].
+
+
+## Approach: use `do`-notation
+
+```haskell
+toRNA :: String -> Either Char String
+toRNA [] = pure []
+toRNA (n : dna) = do
+  n' <- transcribe n
+  rna <- toRNA dna
+  pure (n' : rna)
+
+transcribe :: Char -> Either Char Char
+transcribe = \case
+  'G' -> Right 'C'
+  'C' -> Right 'G'
+  'T' -> Right 'A'
+  'A' -> Right 'U'
+  c -> Left c
+```
+
+The [elementary solution][elementary] displays a recurring pattern that can equivalently be expressed using the monadic `>>=` combinator and its `do`-notation [syntactic sugar][wikipedia-syntactic-sugar].
+
+[Read more about this approach][do-notation].
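+
+To make the relation between `do`-notation and `>>=` concrete, here is a sketch of the same solution with the `do` block written out by hand.
+It is an illustration of the desugaring, not a separate approach; the layout of the lambdas is a matter of taste.
+
+```haskell
+{-# LANGUAGE LambdaCase #-}
+
+toRNA :: String -> Either Char String
+toRNA [] = pure []
+toRNA (n : dna) =
+  -- Each `x <- action` line of the do block becomes `action >>= \x -> ...`.
+  transcribe n >>= \n' ->
+    toRNA dna >>= \rna ->
+      pure (n' : rna)
+
+transcribe :: Char -> Either Char Char
+transcribe = \case
+  'G' -> Right 'C'
+  'C' -> Right 'G'
+  'T' -> Right 'A'
+  'A' -> Right 'U'
+  c -> Left c
+```
+
+For `Either`, `>>=` stops at the first `Left`, so `toRNA "GXTA"` evaluates to `Left 'X'` without transcribing the rest of the strand.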
+
+
+## Approach: use `Functor`/`Applicative` combinators
+
+```haskell
+toRNA :: String -> Either Char String
+toRNA [] = pure []
+toRNA (n : dna) = (:) <$> transcribe n <*> toRNA dna
+
+transcribe :: Char -> Either Char Char
+transcribe = \case
+  'G' -> Right 'C'
+  'C' -> Right 'G'
+  'T' -> Right 'A'
+  'A' -> Right 'U'
+  c -> Left c
+```
+
+The [elementary solution][elementary] displays a number of common patterns.
+As demonstrated by the [`do`-notation solution][do-notation], these can be expressed with the `>>=` operator.
+However, the full power of `Monad` is not required.
+The same logic can also be expressed using common functorial combinators such as `fmap`/`<$>` and `<*>`.
+
+[Read more about this approach][functorial-combinators].
+
+
+## Approach: use `traverse`
+
+```haskell
+toRNA :: String -> Either Char String
+toRNA = traverse $ \case
+  'G' -> Right 'C'
+  'C' -> Right 'G'
+  'T' -> Right 'A'
+  'A' -> Right 'U'
+  n -> Left n
+```
+
+As it turns out, the [solution that uses functorial combinators][functorial-combinators] closely resembles the definition of `traverse` for lists.
+In fact, through a series of rewritings it can be shown to be equivalent.
+
+[Read more about this approach][traverse].
+
+
+## General guidance
+
+### Language extensions
+
+For various reasons, some of GHC's features are locked behind switches known as _language extensions_.
+You can enable these by putting so-called _language pragmas_ at the top of your file:
+
+```haskell
+-- This 👇 is a language pragma
+{-# LANGUAGE LambdaCase #-}
+
+module DNA (toRNA) where
+
+{-
+  The rest of your code here
+-}
+```
+
+
+#### `LambdaCase`
+
+Consider the following possible definition of `map`.
+
+```haskell
+map f xs = case xs of
+  [] -> []
+  x : xs' -> f x : map f xs'
+```
+
+Here, a parameter `xs` is introduced only to be immediately pattern matched against, after which it is never used again.
+
+Coming up with good names for such throwaway variables can be tedious and hard.
+The `LambdaCase` extension lets us avoid having to name them by providing an extra bit of [syntactic sugar][wikipedia-syntactic-sugar]:
+
+```haskell
+f = \case { }
+-- is syntactic sugar for / an abbreviation of
+f = \x -> case x of { }
+```
+
+The above definition of `map` can equivalently be written as
+
+```haskell
+map f = \case
+  [] -> []
+  x : xs -> f x : map f xs
+```
+
+
+[do-notation]:
+  https://exercism.org/tracks/haskell/exercises/rna-transcription/approaches/do-notation
+  "Approach: use do-notation"
+[elementary]:
+  https://exercism.org/tracks/haskell/exercises/rna-transcription/approaches/elementary
+  "Approach: a single pass using only elementary operations"
+[functorial-combinators]:
+  https://exercism.org/tracks/haskell/exercises/rna-transcription/approaches/functorial-combinators
+  "Approach: use Functor/Applicative combinators"
+[traverse]:
+  https://exercism.org/tracks/haskell/exercises/rna-transcription/approaches/traverse
+  "Approach: use traverse"
+[validate-first]:
+  https://exercism.org/tracks/haskell/exercises/rna-transcription/approaches/validate-first
+  "Approach: validate first"
+
+
+[wikipedia-syntactic-sugar]:
+  https://en.wikipedia.org/wiki/Syntactic_sugar
+  "Wikipedia: Syntactic sugar"

From eaf801128e0ee84fb7a6eccfa3d0decce052bbc5 Mon Sep 17 00:00:00 2001
From: Matthijs Blom <19817960+MatthijsBlom@users.noreply.github.com>
Date: Tue, 14 Mar 2023 22:05:19 +0100
Subject: [PATCH 2/2] Add approach: validate first, then transcribe

---
 .../rna-transcription/.approaches/config.json | 18 +++++++
 .../.approaches/validate-first/content.md | 54 +++++++++++++++++++
 .../.approaches/validate-first/snippet.txt | 8 +++
 3 files changed, 80 insertions(+)
 create mode 100644 exercises/practice/rna-transcription/.approaches/config.json
 create mode 100644 exercises/practice/rna-transcription/.approaches/validate-first/content.md
 create mode 100644 exercises/practice/rna-transcription/.approaches/validate-first/snippet.txt

diff --git a/exercises/practice/rna-transcription/.approaches/config.json b/exercises/practice/rna-transcription/.approaches/config.json
new file mode 100644
index 000000000..b202d7112
--- /dev/null
+++ b/exercises/practice/rna-transcription/.approaches/config.json
@@ -0,0 +1,18 @@
+{
+  "introduction": {
+    "authors": [
+      "MatthijsBlom"
+    ]
+  },
+  "approaches": [
+    {
+      "uuid": "209cd027-6f98-47ac-a77f-8a083e0cd100",
+      "slug": "validate-first",
+      "title": "Validate first, then transcribe",
+      "blurb": "First, find out whether there are invalid characters in the input. Then, if there aren't, transcribe the strand in one go.",
+      "authors": [
+        "MatthijsBlom"
+      ]
+    }
+  ]
+}
diff --git a/exercises/practice/rna-transcription/.approaches/validate-first/content.md b/exercises/practice/rna-transcription/.approaches/validate-first/content.md
new file mode 100644
index 000000000..7bed275a5
--- /dev/null
+++ b/exercises/practice/rna-transcription/.approaches/validate-first/content.md
@@ -0,0 +1,54 @@
+# Validate first, then transcribe
+
+```haskell
+toRNA :: String -> Either Char String
+toRNA dna =
+  case find (`notElem` "GCTA") dna of
+    Nothing -> Right (map transcribe dna)
+    Just c -> Left c
+  where
+    transcribe = \case
+      'G' -> 'C'
+      'C' -> 'G'
+      'T' -> 'A'
+      'A' -> 'U'
+```
+
+One approach to solving this problem is to
+
+- first check that all input characters are valid,
+- return one of the invalid characters if there are any, and otherwise to
+- convert all the DNA nucleotides into RNA nucleotides.
+
+Some submitted solutions retrieve the invalid character (if present) in two steps:
+
+- first check that there are _some_ invalid characters, for example using `any`, and
+- then find the first one, for example using `filter` and `head`.
+
+The solution highlighted here combines these steps into one.
+As used here, `find` returns `Nothing` if there are no invalid characters, and otherwise `Just` the first one.
+Pattern matching on `find`'s result then determines how to proceed.
+
+For transcribing DNA nucleobases into RNA nucleobases a locally defined function `transcribe` is used.
+It is a [partial function][wiki-partial-functions]: when given any character other than `'G'`, `'C'`, `'T'`, or `'A'` it will crash.
+
+Partial functions display behavior (e.g. crashing) that is not documented in their types.
+This tends to make reasoning about code that uses them more difficult.
+For this reason, partial functions are generally to be avoided.
+
+Partiality is less objectionable in local functions than in global ones, because in local contexts it is easier to make sure that functions are never applied to problematic arguments.
+Indeed, in the solution highlighted above it is clear that `transcribe` will never be applied to a problematic character: if there were any such characters in `dna`, then `find` would have returned `Just _` and not `Nothing`.
+
+Still, it would be nice if it weren't necessary to check that `transcribe` is never applied to invalid characters.
+`transcribe` is forced by its `Char -> Char` type to either be partial or else to return bogus values for some inputs – which would be similarly undesirable.
+But another type, such as `Char -> Maybe Char`, would allow `transcribe` to be total.
+The other approaches use such a variant.
+
+This approach walks the input twice (or thrice).
+It is possible to solve this problem by walking the input only once.
+The other approaches illustrate how.
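+
+As a rough sketch of what a total variant might look like, the following gives `transcribe` the type `Char -> Maybe Char` and walks the input once.
+The use of `traverse` and `maybe` here is one possible choice made for illustration; the other approaches may word this differently.
+
+```haskell
+{-# LANGUAGE LambdaCase #-}
+
+-- Total: every character yields a result, invalid ones yield Nothing.
+transcribe :: Char -> Maybe Char
+transcribe = \case
+  'G' -> Just 'C'
+  'C' -> Just 'G'
+  'T' -> Just 'A'
+  'A' -> Just 'U'
+  _ -> Nothing
+
+-- Single pass: stop at the first character that cannot be transcribed.
+toRNA :: String -> Either Char String
+toRNA = traverse (\c -> maybe (Left c) Right (transcribe c))
+```
+
+Because `transcribe` is total, no separate argument is needed to show that it is never applied to an invalid character.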
+
+
+[wiki-partial-functions]:
+  https://wiki.haskell.org/Partial_functions
+  "Haskell Wiki: Partial functions"
diff --git a/exercises/practice/rna-transcription/.approaches/validate-first/snippet.txt b/exercises/practice/rna-transcription/.approaches/validate-first/snippet.txt
new file mode 100644
index 000000000..fb2f86ca6
--- /dev/null
+++ b/exercises/practice/rna-transcription/.approaches/validate-first/snippet.txt
@@ -0,0 +1,8 @@
+toRNA :: String -> Either Char String
+toRNA dna =
+  case find (`notElem` "GCTA") dna of
+    Nothing -> Right (map transcribe dna)
+    Just c -> Left c
+  where
+    transcribe = \case
+      'G' -> 'C'