Marco

A Markov-chains based text generator (currently a work in progress).

Currently at version v0.2.1.0. Click here to read the changelog.

Build

We recommend to get and use stack in order to build, run and install the project.

With stack

stack build

You can run the program with stack run, or install it into your setup with stack install and then run it with marco.

With GHC

In case you don't have stack (note that this is not supported and might not work).

cd src
ghc Main.hs -o marco

You will find the marco executable in src/.

Usage

Assuming the program has been built properly and the executable is called marco (replace marco with stack run if necessary):

Training the algorithm

marco train <foldername> <context_length>

Where

foldername is the (absolute or relative) path to a folder with the files to be used as the base texts
context_length is the amount of words to be considered as a "unit" in order to find the next suffix

For training to be successful, you need to save one or several .txt files within the specified folder (we suggest using the data folder in the repository).

Generating text

marco generate <max_length>

Where

max_length is an upper bound for the generated text length (i.e. an approximate, not necessarily exact, max length). In practice, the text's length may vary from 0 (which would be a really bad scenario that might imply that your base texts are not adecuate for the task) to (max_length + context_length).

To-do list

What follows is a non-exhaustive list of possible features to be added to marco:

Improved performance, particularly in training
An easy-to-use API to integrate text generation into other programs (e.g. bots)
Better normalization algorithms
The ability to define the first token to be used in text generation
Code refactoring
Exception handling
Edge-case testing
Sample texts: inputs and outputs
Improved README: add description, motivation, etc!

Name		Name	Last commit message	Last commit date
Latest commit History 45 Commits
.github/ISSUE_TEMPLATE		.github/ISSUE_TEMPLATE
data		data
src		src
trained-data		trained-data
.gitignore		.gitignore
CHANGELOG.md		CHANGELOG.md
LICENSE		LICENSE
README.md		README.md
Setup.hs		Setup.hs
marco.cabal		marco.cabal
stack.yaml		stack.yaml
stack.yaml.lock		stack.yaml.lock

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Marco

Build

With stack

With GHC

Usage

Training the algorithm

Generating text

To-do list

About

Releases 4

Packages

Languages

License

aitorres/marco

Folders and files

Latest commit

History

Repository files navigation

Marco

Build

With stack

With GHC

Usage

Training the algorithm

Generating text

To-do list

About

Topics

Resources

License

Stars

Watchers

Forks

Releases 4

Packages 0

Languages

Packages