word unicode boundary match #85

Open
jomart-1985 opened this issue Aug 31, 2020 · 5 comments

Comments

@jomart-1985

Hi, I need a way to ignore a group in a regex match. For example, in modern browsers that support positive lookbehind,
I can simply use this find pattern with a word boundary:

pattern = XRegExp("(?<=[\s\p{P}\p{S}]|^)("+words+")(?=[\s\p{P}\p{S}]|$)", "g");

findAndReplaceDOMText($($jQueryObject2).find('.pagebody').get(0), {
  find: pattern,
  replace: function(portion, match) {
    var el = document.createElement('mark');
    el.classList.add("mark");
    el.innerHTML = portion.text;
    return el;
  }
});

The problem is with this leading boundary group:
(?<=[\s\p{P}\p{S}]|^)
It is not supported in older browsers, so I use an alternative like this:
pattern = XRegExp("(^|[\s\p{P}\p{S}])("+term+")(?=[\s\p{P}\p{S}]|$)", "g");

But that also captures the first group, which I don't want. How can I ignore the first group in this pattern:
(^|[\s\p{P}\p{S}])("+term+")(?=[\s\p{P}\p{S}]|$)

@hftf

hftf commented Aug 31, 2020

Did you try using a non-capturing group, like (?:foo)?

Since this is an off-topic question about help with regular expressions, not an issue related to findAndReplaceDOMText, I believe it should be closed.

@jomart-1985
Author

> Did you try using a non-capturing group, like (?:foo)?
>
> Since this is an off-topic question about help with regular expressions, not an issue related to findAndReplaceDOMText, I believe it should be closed.

Yes, I tried that, and nothing changed.

@jomart-1985
Author

I believe the issue is related to findAndReplaceDOMText, as there should be an option to target specific capture groups.

@hftf

hftf commented Sep 1, 2020

I apologize. I read too fast, and realize now that my advice to use a non-capturing group was wrong. However, it's quite hard to fully understand your question from the title and description as there is no MWE (minimal working example). It seemed like you just needed help with regular expressions.

I suggest you design a custom replace function that returns a document fragment on the first portion of each match, and a <mark> element on other portions. The fragment would have two nodes: a text node containing the prefix matched by the first capturing group, and a <mark> element containing the rest of the first portion. I wrote an example replace function here: https://jsfiddle.net/ta13xb9y/2/

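var replaceFunction = function(portion, match) {
  // Ideally this would be document.createDocumentFragment(); a <fragment>
  // wrapper element is used as a stand-in.
  var fragment = document.createElement('fragment');
  var mark = document.createElement('mark');

  if (portion.index === 0) {
    // First portion of the match: keep the prefix captured by group 1
    // outside the <mark> and highlight only the rest.
    fragment.appendChild(document.createTextNode(match[1]));
    // textContent (not innerHTML) so that "<" or "&" in the text stays literal.
    mark.textContent = portion.text.slice(match[1].length);
    fragment.appendChild(mark);
  } else {
    // Later portions contain no prefix, so mark them whole.
    mark.textContent = portion.text;
    fragment = mark;
  }

  return fragment;
};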

However, findAndReplaceDOMText doesn't handle DocumentFragment well (that's why I just filed PR #86 to fix it). You can work around it by using full-fledged wrapper elements like <fragment> temporarily, and then removing them after finishing the replacement using whatever methods are available in the browsers you care about.
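
The cleanup step of that workaround could look roughly like the sketch below. It is only an illustration: `unwrapElements` is a hypothetical helper name, not part of findAndReplaceDOMText, and it assumes the temporary wrappers can be found with `querySelectorAll` by tag name. It is written in ES5 style on the assumption that older browsers matter here.

```javascript
// Hypothetical helper (not part of findAndReplaceDOMText): replaces every
// <tagName> wrapper under root with the wrapper's own child nodes.
function unwrapElements(root, tagName) {
  // Copy the result into a plain array first so that removing wrappers
  // cannot affect the iteration.
  var wrappers = Array.prototype.slice.call(root.querySelectorAll(tagName));

  for (var i = 0; i < wrappers.length; i++) {
    var wrapper = wrappers[i];
    var parent = wrapper.parentNode;

    // Move each child in front of the wrapper, preserving order.
    while (wrapper.firstChild) {
      parent.insertBefore(wrapper.firstChild, wrapper);
    }
    // The wrapper is now empty and can be dropped.
    parent.removeChild(wrapper);
  }
}
```

After the findAndReplaceDOMText call has finished, calling something like `unwrapElements(container, 'fragment')` on the searched element should leave only the text nodes and `<mark>` elements behind.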

I don't know much about XRegExp, but aren't you ostensibly using it exactly because it rolls its own regex implementation in order to support new features on old browsers? Also, I'm pretty sure you need to escape metacharacters with backslashes in the XRegExp constructor, for example XRegExp("\\s") not XRegExp("\s").
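
To make the escaping point concrete, here is a small sketch. It uses the native `RegExp` constructor with the `u` flag as a stand-in for XRegExp (an assumption, so that it runs without any library), and `term` is a made-up search word. It also shows that, without lookbehind, the boundary character ends up in the first capturing group.

```javascript
// In a JavaScript string literal, "\s" is just the letter "s": the backslash
// is consumed by the string parser before any regex engine sees it.
var unescaped = "\s";
var escaped = "\\s";

// Hypothetical search term, for illustration only.
var term = "café";

// Lookbehind-free pattern with every backslash doubled inside the string.
// Native RegExp needs the "u" flag for \p{...}; XRegExp is not used here.
var pattern = new RegExp(
  "(^|[\\s\\p{P}\\p{S}])(" + term + ")(?=[\\s\\p{P}\\p{S}]|$)",
  "gu"
);

var match = pattern.exec("un café, s'il vous plaît");
// match[1] holds the boundary prefix and match[2] the term itself, which is
// why the replace function has to put match[1] back outside the <mark>.
```

With a single backslash, the character class would instead contain the literal letters `s`, `p`, `P` and `S` plus braces, so the pattern would silently match the wrong boundaries rather than throw.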

But again, this isn't really a venue to help you design algorithms or work around bad regular expression support in ancient browsers.

See also: #2 (comment)

@jomart-1985
Author

OK, thank you for the information. I will try your method, thanks.
