Stashdb Matching
Some background: the primary purpose of matching scenes in XBVR to StashDB is to match Actor records. StashDB has the best Actor metadata by far. However, matching Actors between two different systems is error-prone: many actors share the same name, names are misspelled, and so on. For that reason, an Actor in XBVR is only matched to StashDB when the same name appears in a scene that has already been matched.
The default matching of scenes in XBVR with StashDB follows fairly strict matching rules using scene URLs. This minimizes the number of incorrect matches (which can still occur), at the expense of fewer matches overall; looser rules would yield more matches but (probably) also more incorrect ones. The strict approach reduces the effort required to undo incorrect matches. Unfortunately, it does not always work well. For example, a custom scraper for VRConk on an aggregator site, compared with VRConk on StashDB, won't yield many matches, because the scene URLs will usually differ.
However, as of version 0.4.33 users can apply more relaxed rules on a site-by-site basis at their discretion. An alternative to using the relaxed rules (with a possible increase in mismatches) is to stick with the stricter rules and utilize manual matching with StashDB, which has also been introduced in version 0.4.33. The new matching options available are Studio Code, Title/Release Date, and Title. Multiple rules can be specified, and these are evaluated in the order they appear.
Studio Codes are not used on a lot of sites, and even when Studio Codes exist they are often not added to StashDB, although this is becoming more common. Where Studio Codes are available, they usually provide reliable matches.
Because the new rules are evaluated in sequence, specify the strictest rule first. It is not logical to specify Title followed by Title/Release Date, since the looser Title rule is tried first and the stricter rule after it adds nothing.
Rules are defined in the actor_scraper_custom_config.json file located in the XBVR app directory. Use the StashSceneMatching section. Note that the second section, GenericActorScrapingConfig, is used to define how to extract actor details and is not relevant to matching to StashDB.
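To illustrate the strictest-first ordering, the fragment below sketches a single site entry that tries Studio Code, then Title/Release Date, then Title. The site ID `example-site` and the StashId value are placeholders rather than real identifiers, and the Comment key is only there for annotation. The entry would sit inside the StashSceneMatching section.

```json
"example-site": [
  {
    "Comment": "Illustrative only: placeholder site id and StashId, rules ordered strictest to loosest",
    "StashId": "xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx",
    "Rules": [
      {
        "StashRule": "studio_code"
      },
      {
        "StashRule": "title/date"
      },
      {
        "StashRule": "title"
      }
    ]
  }
]
```

Reversing the order would put the loosest rule first, which is the arrangement the paragraph above advises against.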
JSON Structure
"StashSceneMatching": {
  "siteid": [
    {
      "StashId": "Stash guid of the Studio, used when names don't match exactly",
      "ParentId": "Stash guid of parent, if tag filtering used (used by NAVR)",
      "TagIdFilter": "Stash guid of tag, if tag filtering used (used by NAVR)",
      "Rules": [
        {
          "XbvrField": "Enter xbvr field you are matching to, scene_url or scene_id",
          "XbvrMatch": "Enter regex expression to extract value from field to match on",
          "XbvrMatchResultPosition": 0,
          "StashRule": "Enter rule name, i.e. title, title/date, studio_code, or a regex expression to extract the value to match from the stash url",
          "StashMatchResultPosition": 0
        }
      ]
    }
  ]
},
siteid: Matches the ID from the site table.
StashId: Matches the ID in StashDB, e.g., VR Conk is b038d55c-1e94-41ff-938a-e6aafb0b1759.
ParentId: Allows searching scenes across the parent site, useful when scenes are published across multiple sites (though uncommon for VR sites).
TagIdFilter: Should not be needed. It exists specifically to handle the default Naughty America matching, where VR scenes are posted to multiple sites in StashDB and a tag is used to identify the VR ones. It is the built-in default for NA and has not been tested on other sites.
Rules:
- In the StashRule field, specify "studio_code", "title/date", or "title". The values are case-sensitive, so enter them exactly as shown.
- Other fields (XbvrField, XbvrMatch, XbvrMatchResultPosition, StashRule, StashMatchResultPosition) allow matching scene URLs or IDs between XBVR and StashDB, using regular expressions to extract the components to compare. Refer to the actor_scraper_config_examples.json file for examples of rules used by XBVR.
While these regex rule configurations are user-editable, they can become complex. Only advanced users who are comfortable debugging them should attempt modifications.
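As a rough sketch of what a regex rule might look like, the fragment below compares the run of digits at the end of the XBVR scene URL with the run of digits at the end of the StashDB URL. Everything specific here is hypothetical: the site ID `example-site`, the placeholder StashId, the assumption that both URLs end in the same numeric ID, and the assumption that a result position of 0 selects the whole regex match. The rules in actor_scraper_config_examples.json remain the authoritative reference for how XBVR interprets these fields.

```json
"example-site": [
  {
    "Comment": "Hypothetical rule: assumes both URLs end in the same numeric id and that position 0 means the whole match",
    "StashId": "xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx",
    "Rules": [
      {
        "XbvrField": "scene_url",
        "XbvrMatch": "\\d+$",
        "XbvrMatchResultPosition": 0,
        "StashRule": "\\d+$",
        "StashMatchResultPosition": 0
      }
    ]
  }
]
```

Note that backslashes must be doubled inside JSON strings, so the regex `\d+$` is written as `"\\d+$"`.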
You can also configure XBVR to search multiple StashDB sites for one XBVR site, each with separate rules. This is mainly for future-proofing, as scenes are increasingly published across multiple sites or moved between sites (e.g., VRConk scenes were moved to VRBangers a few years ago).
Example JSON for searching two StashDB sites with multiple rules
"site-slr": [
  {
    "Comment": "The comment isn't used by XBVR, I just put it in to note which StashDB site this is",
    "StashId": "xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx",
    "Rules": [
      {
        "StashRule": "title/date"
      },
      {
        "StashRule": "title"
      }
    ]
  },
  {
    "Comment": "The second StashDB site to match with the same XBVR site site-slr, using different rules",
    "StashId": "zzzzzzzz-zzzz-zzzz-zzzz-zzzzzzzzzzzz",
    "Rules": [
      {
        "StashRule": "studio_code"
      },
      {
        "StashRule": "title"
      }
    ]
  }
]
AS WITH ALL CASES OF EDITING JSON CONFIG FILES, MAKE A BACKUP OF THE FILE FIRST