word unicode boundary match #85

jomart-1985 · 2020-08-31T18:40:43Z

Hi, I need a way to ignore groups in a regex. For example, in modern browsers that support positive lookbehind,
I can simply use this find pattern with word boundaries:

pattern = XRegExp("(?<=[\s\p{P}\p{S}]|^)("+words+")(?=[\s\p{P}\p{S}]|$)", "g");

findAndReplaceDOMText($($jQueryObject2).find('.pagebody').get(0), {
  find: pattern,
  replace: function(portion, match) {
    var el = document.createElement('mark');
    el.classList.add("mark");
    el.innerHTML = portion.text;
    return el;
  }
});

The issue is with this leading boundary group:
(?<=[\s\p{P}\p{S}]|^)
It's not supported in old browsers, so I use an alternative like this:
pattern = XRegExp("(^|[\s\p{P}\p{S}])("+term+")(?=[\s\p{P}\p{S}]|$)", "g");

But that also captures the first group, which I don't want. How can I ignore the first group in this pattern:
(^|[\s\p{P}\p{S}])("+term+")(?=[\s\p{P}\p{S}]|$)

hftf · 2020-08-31T20:37:11Z

Did you try using a non-capturing group, like (?:foo)?

Since this is an off-topic question about help with regular expressions, not an issue related to findAndReplaceDOMText, I believe it should be closed.

jomart-1985 · 2020-08-31T21:30:04Z

> Did you try using a non-capturing group, like (?:foo)?
>
> Since this is an off-topic question about help with regular expressions, not an issue related to findAndReplaceDOMText, I believe it should be closed.

Yes, I tried that; nothing changed.

jomart-1985 · 2020-08-31T21:32:29Z

I believe the issue is related to findAndReplaceDOMText, since there should be an option to target specific groups.

hftf · 2020-09-01T11:30:59Z

I apologize. I read too fast, and realize now that my advice to use a non-capturing group was wrong. However, it's quite hard to fully understand your question from the title and description as there is no MWE (minimal working example). It seemed like you just needed help with regular expressions.

I suggest you design a custom replace function that returns a document fragment on the first portion of each match, and a <mark> element on other portions. The fragment would have two nodes: a text node containing the prefix matched by the first capturing group, and a <mark> element containing the rest of the first portion. I wrote an example replace function here: https://jsfiddle.net/ta13xb9y/2/

var replaceFunction = function(portion, match) {
  var mark = document.createElement('mark');

  // Later portions of a match contain none of the prefix, so mark them whole.
  if (portion.index !== 0) {
    mark.textContent = portion.text;
    return mark;
  }

  // First portion: keep the prefix (capturing group 1) outside the <mark>.
  // Ideally this wrapper would be document.createDocumentFragment().
  var fragment = document.createElement('fragment');
  var prefix = match[1];
  fragment.appendChild(document.createTextNode(prefix));
  mark.textContent = portion.text.slice(prefix.length);
  fragment.appendChild(mark);
  return fragment;
};

However, findAndReplaceDOMText doesn't handle DocumentFragment well (that's why I just filed PR #86 to fix it). You can work around it by using full-fledged wrapper elements like <fragment> temporarily, and then removing them after finishing the replacement using whatever methods are available in the browsers you care about.
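
The cleanup step could look roughly like the sketch below. It is only an illustration: `unwrapAll` is a hypothetical helper name, not part of findAndReplaceDOMText, and it assumes the temporary wrappers were created with `document.createElement('fragment')` as in the replace function above.

```javascript
// Hypothetical helper (not part of findAndReplaceDOMText): replace every
// <tagName> element under `root` with its own children.
function unwrapAll(root, tagName) {
  // Copy the collection first, because it is live and changes as we remove nodes.
  var wrappers = Array.from(root.getElementsByTagName(tagName));

  wrappers.forEach(function(wrapper) {
    var parent = wrapper.parentNode;
    // Move each child in front of the wrapper, preserving order.
    while (wrapper.firstChild) {
      parent.insertBefore(wrapper.firstChild, wrapper);
    }
    parent.removeChild(wrapper);
  });
}
```

After the replacement has finished, something like `unwrapAll(container, 'fragment')` would remove the wrappers, where `container` is whatever element was passed to findAndReplaceDOMText. Calling `container.normalize()` afterwards merges the adjacent text nodes that unwrapping leaves behind.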

I don't know much about XRegExp, but aren't you ostensibly using it exactly because it rolls its own regex implementation in order to support new features on old browsers? Also, I'm pretty sure you need to escape metacharacters with backslashes in the XRegExp constructor, for example XRegExp("\\s") not XRegExp("\s").
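
To illustrate the escaping point, here is a small sketch. It uses the native `RegExp` constructor with the `u` flag as a stand-in for XRegExp (an assumption made so the snippet runs without the library, and one that does not hold on the old browsers in question); `buildPattern` is a hypothetical helper name, and `term` is assumed to contain no regex metacharacters.

```javascript
// In a JavaScript string literal, "\s" is not a recognized escape,
// so the backslash is dropped and the regex engine never sees it.
const unescaped = "[\s]";  // the 3-character string [s]
const escaped = "[\\s]";   // the 4-character string [\s]

// Hypothetical helper: build the prefix-capturing pattern from the question
// with every backslash doubled. Native RegExp + "u" stands in for XRegExp.
function buildPattern(term) {
  const boundary = "[\\s\\p{P}\\p{S}]";
  return new RegExp("(^|" + boundary + ")(" + term + ")(?=" + boundary + "|$)", "gu");
}

const match = buildPattern("foo").exec("a foo, b");
// match[1] holds the boundary prefix, match[2] holds the term itself.
```

With the backslashes doubled, group 1 carries the single boundary character (or the empty string at the start of the text), which is the part a replace function has to keep outside the `<mark>`.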

But again, this isn't really a venue to help you design algorithms or work around bad regular expression support in ancient browsers.
