Commit

add %- regex variable
fglock committed Oct 21, 2024
1 parent 78a0187 commit 0384791
Showing 5 changed files with 30 additions and 16 deletions.
6 changes: 4 additions & 2 deletions FEATURE_MATRIX.md
@@ -136,8 +136,8 @@
- ✔️ **lvalue `pos`**: lvalue `pos` operator is implemented.
- ✔️ **`m?pat?`** one-time match is implemented.
- ✔️ **`reset`** resetting one-time match is implemented
- ✔️ **`@-`, `@+`, `%+` variables** `@-`, `@+`, `%+` special variables are implemented
- ✔️ **`$&` variables** `` $` ``, `$&`, `$'` special variables are implemented
- ✔️ **`@-`, `@+`, `%+`, `%-` variables**: regex special variables are implemented
- ✔️ **`$&` variables**: `` $` ``, `$&`, `$'` special variables are implemented
- ❌ **Perl-specific Regex Features**: Some features like `/xx` `/ee` are missing.
- ❌ **Dynamically-scoped regex variables**: Regex variables are not dynamically-scoped.
- ❌ **Code blocks**: `(?{ code })` in regex is not implemented.
@@ -257,6 +257,8 @@
- ❌ **Fetching network info**: endprotoent, endservent, gethostbyaddr, gethostbyname, gethostent, getnetbyaddr, getnetbyname, getnetent, getprotobyname, getprotobynumber, getprotoent, getservbyname, getservbyport, getservent, sethostent, setnetent, setprotoent, setservent
- ❌ **Keywords related to the control flow of the Perl program**: `dump` operator.
- ❌ **Tail calls**: `goto` going to a different subroutine as a tail call is not supported.
- ❌ **Regex differences**:
- Java's regular expression engine does not support duplicate named capture groups. In Java, each named capturing group must have a unique name within a regular expression.

## Language Differences and Workarounds

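Note on the "Regex differences" entry added above: the limitation can be reproduced with plain `java.util.regex`. The following is a minimal sketch, not code from this commit; the class and method names (`DuplicateNamedGroupDemo`, `compiles`) are made up for illustration. It compiles one pattern with unique group names and one that reuses the name `foo` across alternation branches, the same shape as the test case this commit removes from `regex_named_capture.pl`.

```java
import java.util.regex.Pattern;
import java.util.regex.PatternSyntaxException;

// Hypothetical demo class, not part of the commit.
public class DuplicateNamedGroupDemo {

    // Returns true if java.util.regex accepts the pattern, false if it rejects it.
    static boolean compiles(String regex) {
        try {
            Pattern.compile(regex);
            return true;
        } catch (PatternSyntaxException e) {
            // Java rejects a group name that is defined more than once.
            return false;
        }
    }

    public static void main(String[] args) {
        // Unique names: accepted.
        System.out.println(compiles("(?<foo>foo)(?<bar>bar)"));
        // "foo" defined in both branches: valid in Perl, rejected by Java.
        System.out.println(compiles("(?<foo>foo)(?<bar>bar)|(?<foo>foobar)"));
    }
}
```

Since every group name is unique on the Java side, each `%-` entry can hold at most one capture per match, which would be consistent with the single-element array returned for `%-` in `HashSpecialVariable.java`.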
2 changes: 1 addition & 1 deletion README.md
@@ -331,7 +331,7 @@ file.
- Emulate Perl behaviour with unsigned integers in bitwise operators.
- Regex `m?pat?` match-once and the `reset()` operator are implemented.
- Regex `\G` and the `pos` operator are implemented.
- Regex `@-`, `@+`, `%+` special variables are implemented.
- Regex `@-`, `@+`, `%+`, `%-` special variables are implemented.
- Regex `` $` ``, `$&`, `$'` special variables are implemented.
- Regex performance comparable to Perl; optimized regex variables.
- Added `__SUB__` keyword; `readpipe`.
3 changes: 2 additions & 1 deletion src/main/java/org/perlonjava/runtime/GlobalContext.java
@@ -64,7 +64,8 @@ public static void initializeGlobals(ArgumentParser.CompilerOptions compilerOpti
getGlobalArray("main::-").elements = new ArraySpecialVariable(ArraySpecialVariable.Id.LAST_MATCH_START); // regex @-

// Initialize hashes
getGlobalHash("main::+").elements = new HashSpecialVariable();
getGlobalHash("main::+").elements = new HashSpecialVariable(HashSpecialVariable.Id.CAPTURE); // regex %+
getGlobalHash("main::-").elements = new HashSpecialVariable(HashSpecialVariable.Id.CAPTURE_ALL); // regex %-

// Initialize %ENV
Map<String, RuntimeScalar> env = getGlobalHash("main::ENV").elements;
21 changes: 19 additions & 2 deletions src/main/java/org/perlonjava/runtime/HashSpecialVariable.java
@@ -15,10 +15,14 @@
*/
public class HashSpecialVariable extends AbstractMap<String, RuntimeScalar> {

// Mode of operation for this special variable
private final HashSpecialVariable.Id mode;

/**
* Constructs a HashSpecialVariable for the given Matcher.
*/
public HashSpecialVariable() {
public HashSpecialVariable(HashSpecialVariable.Id mode) {
this.mode = mode;
}

@Override
@@ -43,9 +47,22 @@ public RuntimeScalar get(Object key) {
if (matcher != null && key instanceof String name) {
String matchedValue = matcher.group(name);
if (matchedValue != null) {
return new RuntimeScalar(matchedValue);
if (this.mode == Id.CAPTURE_ALL) {
return new RuntimeArray(new RuntimeScalar(matchedValue)).createReference();
} else if (this.mode == Id.CAPTURE) {
return new RuntimeScalar(matchedValue);
}
return scalarUndef;
}
}
return scalarUndef;
}

/**
* Enum to represent the mode of operation for HashSpecialVariable.
*/
public enum Id {
CAPTURE_ALL, // Perl %-
CAPTURE // Perl %+
}
}
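Note on the `get` change above: the mode switch can be tried in isolation with plain `java.util.regex`. This is a rough sketch under stated assumptions, not the project's implementation: `NamedCaptureModes`, `Mode`, and `lookup` are hypothetical names, a `String` stands in for `RuntimeScalar`, and a `List<String>` stands in for the array reference that `%-` returns. One detail of the real API matters here: `Matcher.group(String)` throws `IllegalArgumentException` when the pattern has no group with that name. The sketch catches it and returns `null` (standing in for `scalarUndef`); whether the commit's code handles that case is not visible in this hunk.

```java
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical stand-in for the mode-dependent lookup; not the project's class.
public class NamedCaptureModes {

    enum Mode {
        CAPTURE,     // Perl %+ : the captured string itself
        CAPTURE_ALL  // Perl %- : a list holding the captured string
    }

    // Returns a String (CAPTURE), a List<String> (CAPTURE_ALL), or null when
    // the group is unknown or did not participate in the match.
    // The matcher must already hold a successful match.
    static Object lookup(Matcher matcher, String name, Mode mode) {
        String value;
        try {
            value = matcher.group(name);
        } catch (IllegalArgumentException e) {
            return null; // no group with this name in the pattern
        }
        if (value == null) {
            return null; // group exists but did not participate in the match
        }
        return mode == Mode.CAPTURE_ALL ? List.of(value) : value;
    }

    public static void main(String[] args) {
        Matcher m = Pattern.compile("(?<bar>bar)(?<baz>baz)").matcher("barbaz");
        if (m.find()) {
            System.out.println(lookup(m, "bar", Mode.CAPTURE));      // like $+{bar}
            System.out.println(lookup(m, "bar", Mode.CAPTURE_ALL));  // like $-{bar}
            System.out.println(lookup(m, "nope", Mode.CAPTURE));     // unknown name
        }
    }
}
```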
14 changes: 4 additions & 10 deletions src/test/resources/regex_named_capture.pl
@@ -1,12 +1,13 @@
#!/usr/bin/perl
use strict;
use warnings;
use Test::More tests => 4;
use Test::More tests => 6;

# Test case 1: Simple named capture
my $string1 = 'foo';
if ($string1 =~ /(?<foo>foo)/) {
is($+{foo}, 'foo', 'Test case 1: Named capture for "foo"');
is($-{foo}[0], 'foo', 'Test case 1: All named captures for "foo"');
} else {
fail('Test case 1: Pattern did not match');
}
@@ -16,17 +17,10 @@
if ($string2 =~ /(?<bar>bar)(?<baz>baz)/) {
is($+{bar}, 'bar', 'Test case 2: Named capture for "bar"');
is($+{baz}, 'baz', 'Test case 2: Named capture for "baz"');
is($-{bar}[0], 'bar', 'Test case 2: All named captures for "bar"');
is($-{baz}[0], 'baz', 'Test case 2: All named captures for "baz"');
} else {
fail('Test case 2: Pattern did not match');
}

## # Test case 3: Overlapping named captures
## my $string3 = 'foobar';
## if ($string3 =~ /(?<foo>foo)(?<bar>bar)|(?<foo>foobar)/) {
## is($+{foo}, 'foo', 'Test case 3: Overlapping named capture for "foo"');
## } else {
## fail('Test case 3: Pattern did not match');
## }

done_testing();
