Enable use of multiple patterns at the same time #3

Open
mahdix opened this issue Sep 22, 2017 · 6 comments

Comments

@mahdix

mahdix commented Sep 22, 2017

Compiled patterns are kept in a static variable, which means they are overwritten as soon as another pattern is compiled.
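
To make the problem concrete, here is a minimal sketch of what goes wrong when a compile function keeps its result in a single static slot. It does not use the library's real types or functions: `toy_pattern`, `toy_compile` and `toy_first_pattern_was_overwritten` are hypothetical names invented for illustration, and "compiling" is reduced to copying the pattern string.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical stand-in for a compiled pattern; not the library's real type. */
typedef struct {
    char source[32];
} toy_pattern;

/* Mimics a compile function that stores its result in one static slot. */
toy_pattern *toy_compile(const char *source)
{
    static toy_pattern compiled; /* shared by every call */

    assert(source != NULL);
    strncpy(compiled.source, source, sizeof compiled.source - 1);
    compiled.source[sizeof compiled.source - 1] = '\0';
    return &compiled;
}

/* Returns 1 if compiling a second pattern clobbered the first one. */
int toy_first_pattern_was_overwritten(void)
{
    toy_pattern *digits  = toy_compile("\\d+");
    toy_pattern *letters = toy_compile("[a-z]+");

    /* Both pointers refer to the same static object, so "digits"
       now holds the contents of the second pattern. */
    return digits == letters && strcmp(digits->source, "[a-z]+") == 0;
}
```

In this sketch the first handle silently starts describing the second pattern, which is why two patterns cannot be used at the same time.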

@kokke
Owner

kokke commented Sep 29, 2017

Hi @mahdix - thanks for taking an interest :)

There is another issue detailing this flaw - I agree with you both that the compiled pattern should be passed as a pointer to the compile function.

I will see if I can find the time to implement this feature.
You are also welcome to suggest a change via a Pull Request :)

Thank you for your feedback

@ghost

ghost commented Nov 11, 2017

Hi, there is a fork of this library that I ported to run in the FreeBSD kernel: https://github.com/komdivkote/tiny-regex-c
The function re_compile(const char* pattern, uint32_t * o_reg_cnt) returns a pointer to a newly allocated array, struct re_regex * re_compiled, which lives on the heap because the kernel has a small stack. It reallocates the array when it runs out of free REGEXP_OBJECTS. This is slower than the original approach because malloc and realloc are both "expensive" operations.
The logic of some routines was changed slightly, and macros were introduced (such as FOR_EACH, which automates the creation of enums and their names at compile time for different CPU architectures, i.e. Harvard/Von Neumann) because the library works in both kernel and userland. If this is a problem, you can remove the kernel code, which is guarded by "#ifdef _KERNEL".

There are also some tuning options available in re.h:
`
#define RE_REGEX_INSTANCE_REALLOCATE YES //YES - reallocate on overuse, NO - exit

#define RE_TRUNC_ARRAY_IF_POSSIBLE YES //YES - truncate array to reduce size, NO - do not

#define RE_CHAR_CLASS_LENGTH 20 //per struct regex_t instance

#define MIN_REGEXP_OBJECTS 30 //min number of regex symbols in expression.

#define RE_BUILDWITH_DEBUG YES //compile with print and trace functions
`

I will try to put together a pull request without the kernel code.
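
For readers unfamiliar with the "reallocate on overuse" idea described above, here is a rough sketch of a heap-backed token array that grows with realloc when it fills up. It is not taken from the fork: `toy_token`, `toy_token_array`, `TOY_MIN_TOKENS` and the `toy_array_*` functions are hypothetical, and the doubling growth policy is an assumption, since the comment does not say how much the fork grows by.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical token type standing in for one compiled regex symbol. */
typedef struct {
    unsigned char type;
    unsigned char ch;
} toy_token;

typedef struct {
    toy_token *items;
    size_t count;    /* slots in use */
    size_t capacity; /* slots allocated */
} toy_token_array;

#define TOY_MIN_TOKENS 4 /* assumed starting size, for illustration only */

/* Allocates the initial array. Returns 0 on success, -1 on failure. */
int toy_array_init(toy_token_array *arr)
{
    assert(arr != NULL);
    arr->items = malloc(TOY_MIN_TOKENS * sizeof *arr->items);
    arr->count = 0;
    arr->capacity = arr->items ? TOY_MIN_TOKENS : 0;
    return arr->items ? 0 : -1;
}

/* Appends a token, doubling the array when it is full (assumed policy). */
int toy_array_push(toy_token_array *arr, toy_token tok)
{
    assert(arr != NULL);
    if (arr->count == arr->capacity) {
        size_t new_capacity = arr->capacity ? arr->capacity * 2 : TOY_MIN_TOKENS;
        toy_token *grown = realloc(arr->items, new_capacity * sizeof *arr->items);
        if (grown == NULL) {
            return -1; /* the old block is still valid and still owned by arr */
        }
        arr->items = grown;
        arr->capacity = new_capacity;
    }
    arr->items[arr->count++] = tok;
    return 0;
}

/* Releases the array and resets the bookkeeping fields. */
void toy_array_free(toy_token_array *arr)
{
    assert(arr != NULL);
    free(arr->items);
    arr->items = NULL;
    arr->count = 0;
    arr->capacity = 0;
}
```

Each compiled pattern owns its own array here, so several patterns can coexist, at the cost of the allocator calls mentioned above.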

@kokke
Owner

kokke commented Nov 14, 2017

Hi @komdivkote - wow you've really done something with the code :D A pull request would be awesome :)

I really find it interesting to see how other people use my code.
I'm in the embedded business for a living, which is what inspires me to keep the resource utilization frugal. Kernel code has some of the same flavor, so I get your motivation for a small regex-project.

Anecdote:
I haven't run FreeBSD since 8.0-RELEASE, but I've always been fond of how the BSDs work and organize things. I also like NetBSD for its minimalism.

@Nable80

Nable80 commented Sep 17, 2020

Oh, it looks like people have asked for this feature (and provided implementations of some kind) multiple times in the past.
Here's one of the latest attempts: #45

@kokke, what do you think about this issue? IMHO there should be a final decision: either merge one of the implementations as an optional feature, or mark it as "wontfix" and clearly mention in the documentation that it doesn't fit well here (and point to some recommended alternatives for those who really need this feature). What do you think?

A kind of perfectionism makes me worry about old stale issues, especially in small, well-written projects where it looks entirely possible to achieve and maintain a "zero known/open issues" state.

@qgymib

qgymib commented Jan 20, 2021

Hi,

There might be a simple solution for this (without dynamic memory allocation, to keep the original design):

  1. Define struct regex as follows:
struct regex
{
    size_t size;      /* number of regex2_t slots available in data[] */
    regex2_t data[];  /* flexible array member, stored in the caller's buffer */
};

Where regex2_t is the original regex structure in your source code.

  2. Add a new interface:
re_t re_compile2(const char* pattern, void* buffer, size_t size);

This function initializes struct regex in buffer, and looks roughly like this:

re_t re_compile2(const char* pattern, void* buffer, size_t size)
{
    /* Need room for the header plus at least one regex2_t slot */
    if (buffer == NULL || size < sizeof(struct regex) + sizeof(regex2_t))
    {
        return NULL; /* buffer is too small */
    }

    struct regex* ret = (struct regex*)buffer;
    /* Only the space after the header is available for data[] */
    ret->size = (size - sizeof(struct regex)) / sizeof(regex2_t);

    /* Now compile pattern into ret->data just like #re_compile() */

    return ret;
}
  3. No more steps.

If the user wants to know why compilation failed, another parameter err could be added, like:

re_t re_compile2(const char* pattern, void* buffer, size_t size, int* err);
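
As a self-contained illustration of the proposal above, here is a sketch of the caller-supplied-buffer idea with the err parameter included. It is not the library's code: `toy_token`, `toy_regex`, `toy_buffer`, `toy_compile2` and the `TOY_*` error codes are hypothetical, and "compiling" is reduced to storing each pattern character as a literal token, purely to keep the example short.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical token standing in for the library's per-symbol structure. */
typedef struct {
    unsigned char ch;
} toy_token;

/* Header followed by tokens, all living inside the caller's buffer. */
typedef struct {
    size_t capacity; /* slots available in data[] */
    size_t count;    /* slots actually used */
    toy_token data[];
} toy_regex;

/* Convenience buffer type; the size_t member provides suitable alignment. */
typedef union {
    size_t align;
    unsigned char bytes[64];
} toy_buffer;

enum { TOY_OK = 0, TOY_ERR_BUFFER_TOO_SMALL = 1 };

/* "Compiles" pattern into buffer, which must be suitably aligned for
   toy_regex. Returns NULL and sets *err if the buffer cannot hold it. */
toy_regex *toy_compile2(const char *pattern, void *buffer, size_t size, int *err)
{
    assert(pattern != NULL && err != NULL);

    size_t needed = strlen(pattern); /* one token per character (toy rule) */
    if (buffer == NULL || size < sizeof(toy_regex) + needed * sizeof(toy_token)) {
        *err = TOY_ERR_BUFFER_TOO_SMALL;
        return NULL;
    }

    toy_regex *re = buffer;
    re->capacity = (size - sizeof(toy_regex)) / sizeof(toy_token);
    re->count = needed;
    for (size_t i = 0; i < needed; i++) {
        re->data[i].ch = (unsigned char)pattern[i];
    }
    *err = TOY_OK;
    return re;
}
```

With this shape, a caller could declare two `toy_buffer` objects and compile a different pattern into each; neither overwrites the other, and no heap allocation is involved.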

@marler8997

Here's my attempt: #58
