-
Notifications
You must be signed in to change notification settings - Fork 1
Stream processing tool
License
zherczeg/pcresp
Folders and files
Name | Name | Last commit message | Last commit date | |
---|---|---|---|---|
Repository files navigation
Pcresp is a stream processing tool which can be used to update stream data using PCRE2 regular expressions and custom scripts. Some examples: echo ABCD | pcresp '[BC]+' By default pcresp prints the matched strings. The unmatched characters are discarded by default. Result: BC echo ABCD | pcresp -s '' -p '[BC]+' An empty program prints nothing. The unmatched characters are printed because of -p. Result: AD echo ABCD | pcresp -p '[BC]+' -s '/bin/echo -n {#0}' The echo program prints its arguments. There is no newline because of the -n parameter. Result: A{BC}D echo A7 BB29 | pcresp '([A-Z]+)(\d+)' -s "*print -#1- :#2:" The script arguments can be printed by the *print flag. This is considerably faster than using the echo program. Result: -A- :7: -BB- :29: echo 29 30 31 32 33 | pcresp '(\d+)(*SKIP)(?C^ *null /usr/bin/expr #1 % 3 = 0 ^)' This command prints those decimal numbers whose are divisible by 3. This condition is checked by 'expr number % 3 = 0'. The return value of expr influence the matching: zero - continue, non-zero fail. Result: 30 33 SRC=' INC=`expr $2 + 1` echo $1 : $INC ' echo A7 BB29 | pcresp -d prog "$SRC" -s "/bin/bash -c #[prog] arg0 #1 #2" '([A-Z]+)(\d+)' This example shows an easy way to embed strings (usually source codes). The source referenced #[0] is not processed by pcresp. Result: A : 8 BB : 30 SRC=' INC=`expr $2 + 1` echo $1 : $INC ' echo A7 BB29 | pcresp --def-string prog "$SRC" --shell "/bin/bash -c ? arg0" \ -s "#[prog] #1 #2" '([A-Z]+)(\d+)' Same as above export PCRESP_SHELL="/bin/bash -c ? arg0" SRC=' INC=`expr $2 + 1` echo $1 : $INC ' echo A7 BB29 | pcresp -d prog "$SRC" -s "#[prog] #1 #2" '([A-Z]+)(\d+)' Same as above Usage pcresp [options] pcre_pattern [file list] -h, --help Prints help and exit -s, --script script Executing this script after each successul match -p, --print Prints characters between matched strings -d, --def-string name string Define a constant string (see #[name]) --shell default-shell Specify the default shell for each script [--pattern] pcre2_pattern Specify the pattern. The --pattern can be omitted if the pattern does not start with dash --limit n Stop after n successful match (0 - unlimited) -i Enable caseless matching -m Both ^ and $ match newlines -x Ignore white space and # comments (extended regex) -u, --utf Enable UTF-8 --dot-all Dot matches to any character --newline-[type] [type] can be: lf (linefeed), cr (carriage return) crlf (CR followed by LF), anycrlf (any combination of CR and LF), unicode (any Unicode newline sequence) --bsr-[type] [type] can be: anycrlf (any combination of CR and LF), unicode (any Unicode newline sequence) --verbose Display executed commands (useful for debugging) --end End of parameter list. If the pattern has not specified yet it must be the next argument Reads from stdin if file list is empty Return value: 0 - pattern matched at least once 1 - pattern does not match 2 - error occured Pattern format: Patterns must follow PCRE2 regular expression syntax Scripts can be executed during matching using (?C) callouts Script format: A list of arguments separated by white space(s) The first argument must be an executable file Recognized # (hash mark) sequences: Sequences marked by [*] are not accepted by --shell #idx - string value of capture block idx (0-65535) [*] #{idx} - string value of capture block idx (0-65535) [*] #[name] - insert constant string by name [*] #{idx,name} - same as #{idx} if capture block is not empty same as #[name] otherwise [*] #M - current MARK value [*] ## - # (hash mark) #< - less-than sign character #> - greater-than sign character #n - newline (\n) character Script arguments can be preceeded by control flags: *print - print arguments *!nl - no newline after arguments are printed *null - discard output *!sh - default-shell (--shell) is not used Arguments enclosed in <> brackets: Arguments can be enclosed in <> brackets. These enclosed arguments are never recognised as special arguments such as control flags but # sequences are still recognized. Examples: <argument with spaces> <!null> <?> a '/bin/echo <##>' script prints a # sign Setting the default shell: The default shell can be set by the --shell option or by the PCRESP_SHELL environment variable. Example: --shell '/bin/bash -c ? my_script' The arguments defined by the default shell are added before the arguments provided by the currently executed script. If a question mark argument is present in the default script it will be replaced by the first argument of the script and this first argument is not appended after the default shell.
About
Stream processing tool
Resources
License
Stars
Watchers
Forks
Releases
No releases published
Packages 0
No packages published