A malloc-free, wchar_t based regular expression matching library

hxrain/aov-rx

aov-rx README
=============

aov-rx is a malloc-free [1], wide-char based regular expression matching
library. This piece of code is designed to be part of MPDM [2] in the
future, but can also be used separately.

Please see the TODO file for the completion status of this project.

Released under the public domain.

Angel Ortega <[email protected]>

 [1] Free as in 'without', not as in 'free()'
 [2] http://triptico.com/software/mpdm.html (Minimum Profit Data Manager)

Goals
-----

 - Not using dynamic memory,
 - using wide chars (wchar_t) natively,
 - returning the matching substring as offset + size,
 - keeping the source code readable, and
 - being completely reentrant and thread-safe.

Features
--------

 - Start (^) and end of string ($) anchors
 - . (match any character)
 - Full quantifier support:
   - Curly bracket with optional minimum and maximum:
     - Exactly n matches ({n})
     - At least n matches ({n,})
     - At most m matches ({,m})
     - At least n and at most m matches ({n,m})
   - ? (match 0 or 1 occurrences, same as {0,1})
   - * (match 0 or more occurrences, same as {0,})
   - + (match 1 or more occurrences, same as {1,})
 - Alternate sets (str1|str2|str3)
 - Sub-regexes (between parentheses, also optionally
   quantifiable)
 - Bracket-wrapped character sets, positive and
   negative ([0-9], [^0-9])

Features that will (probably) be, but not soon
----------------------------------------------

 - Optional multiline matching (i.e. ^ and $ matching lines inside
   the string instead of the start and the end of the string)

Features that (probably) will never be
--------------------------------------

 - Backreferences

Usage
-----

aov-rx consists of a single function:

 wchar_t *aov_match(wchar_t *rx, wchar_t *tx, size_t *size);

And its usage is pretty straightforward:

 #include <stdio.h>
 #include <wchar.h>
 #include "aov-rx.h"
 
 int main(int argc, char *argv[])
 {
    size_t s;
    wchar_t *res;
 
    res = aov_match(L"[0-9]+", L"The number is 123, indeed", &s);
 
    if (s) {
        /* res points to 123, and s contains 3; the match is not
           NUL-terminated, so the precision limits the output */
        printf("%.*ls\n", (int) s, res);
    }
 
    return 0;
 }
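
Since the match comes back as a pointer into the text plus a size, it
can be consumed without any dynamic memory. What follows is only a
minimal sketch of that idea: match_offset() and copy_match() are
hypothetical helper names, not part of aov-rx, and the code assumes the
returned pointer always points inside the searched text.

```c
#include <stddef.h>
#include <wchar.h>

/* Hypothetical helper (not part of aov-rx): offset of a match inside
   the searched text. `match` must point into `tx`. */
size_t match_offset(const wchar_t *tx, const wchar_t *match)
{
    return (size_t) (match - tx);
}

/* Hypothetical helper (not part of aov-rx): copy `size` wide chars
   starting at `match` into the caller-supplied buffer `out`, whose
   capacity `out_len` includes the terminator. The copy is truncated
   if the buffer is too small. Returns the number of wide chars
   copied, not counting the terminator. */
size_t copy_match(const wchar_t *match, size_t size,
                  wchar_t *out, size_t out_len)
{
    if (out_len == 0)
        return 0;

    if (size > out_len - 1)
        size = out_len - 1;

    wmemcpy(out, match, size);
    out[size] = L'\0';

    return size;
}
```

With the values from the example above (res pointing to 123 and s
containing 3), match_offset(tx, res) would give the position of the
match inside the text, and copy_match(res, s, buf, 8) would leave a
NUL-terminated copy of the match in a fixed-size wchar_t buf[8].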

---
Angel Ortega
http://triptico.com
