What I ultimately want to obtain is the content excluding the tag.
(What I want to do is remove something like <rPh sb="0" eb="2"><t>キンガク</t></rPh>.)
Is there a processing method available? I would very much appreciate your help.
After import and content extraction, all tags are removed. If you want to exclude specific content based on these tags, however, you must work at the 'preParseHandlers' level under "Importer," where all the tags are still preserved before extraction. You can find more information about this configuration in the documentation at https://opensource.norconex.com/importer/v2/configuration#tbl-transformer. You can achieve this using the 'ReduceConsecutivesTransformer' or by implementing a custom script using the 'ScriptTransformer'.
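To make the tag-removal idea concrete, here is a minimal standalone Python sketch of the kind of pattern a pre-parse step could apply to the raw sharedStrings.xml text. It is not Norconex configuration and does not use any Norconex API: the function name `strip_phonetic_runs`, the constant `RPH_PATTERN`, and the sample `<si>` element (including the base text 金額) are all assumptions made up for illustration. It also assumes `<rPh>` elements are never nested and always have an explicit closing tag.

```python
import re

# Matches one phonetic run element, e.g. <rPh sb="0" eb="2"><t>キンガク</t></rPh>.
# The match is non-greedy so adjacent <rPh> elements are removed one at a time
# instead of swallowing everything between the first open and the last close.
RPH_PATTERN = re.compile(r"<rPh\b[^>]*>.*?</rPh>", re.DOTALL)


def strip_phonetic_runs(xml_text: str) -> str:
    """Return xml_text with every <rPh>...</rPh> element removed."""
    return RPH_PATTERN.sub("", xml_text)


if __name__ == "__main__":
    # Hypothetical sample entry; the real file contents are not shown in the thread.
    sample = '<si><t>金額</t><rPh sb="0" eb="2"><t>キンガク</t></rPh></si>'
    print(strip_phonetic_runs(sample))  # prints <si><t>金額</t></si>
```

The same regular expression could presumably be carried over to a script or regex-based pre-parse handler, since the tags are still present at that stage; how to wire that up is covered by the linked documentation and is not shown here.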
The contents of the sharedStrings.xml file in the target xlsx file for crawling are as follows.