PaulVrugt was asking this question, but never got a response to it:
The FieldCache GetStrings method was replaced by GetTerms, but GetTerms requires an AtomicReader. We used to be able to pass an IndexReader into this method, and it used to return a string array containing the values. How do I get the same kind of behavior from the GetTerms method?
Is there no way to have the same behavior that GetStrings did in version 3.0.3?
The field values returned when sorting by SortField.STRING are now BytesRef. You can call value.Utf8ToString() to convert back to
string, if necessary.
In FieldCache, GetStrings (returning string[]) has been replaced
with GetTerms (returning a BinaryDocValues instance). BinaryDocValues provides a Get method, taking a docID and a BytesRef
to fill (which must not be null), and it fills it in with the
reference to the bytes for that term.
Note however that it can be costly to convert to String, so it's better to work directly with the BytesRef.
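As a minimal sketch of working directly with the BytesRef, the following counts the documents in one segment whose value equals a given term, without converting any value to a string. The class and method names (FieldCacheExamples, CountMatches) are hypothetical, and the sketch assumes the Lucene.NET 4.8 GetTerms overload that takes a setDocsWithField flag. It does not skip deleted documents.

```csharp
using Lucene.Net.Index;
using Lucene.Net.Search;
using Lucene.Net.Util;

public static class FieldCacheExamples // hypothetical name, for illustration only
{
    // Counts the documents in a single segment whose field value equals `target`,
    // comparing raw bytes instead of converting each value to a string.
    public static int CountMatches(AtomicReader reader, string field, string target)
    {
        // Assumption: the three-argument overload from Lucene.NET 4.8.
        BinaryDocValues terms = FieldCache.DEFAULT.GetTerms(reader, field, false);

        BytesRef wanted = new BytesRef(target); // encoded to UTF-8 once
        BytesRef scratch = new BytesRef();      // reused for every document

        int count = 0;
        for (int docID = 0; docID < reader.MaxDoc; docID++)
        {
            terms.Get(docID, scratch);          // fills scratch with this document's bytes
            if (scratch.BytesEquals(wanted))
            {
                count++;
            }
        }
        return count;
    }
}
```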
Similarly, in FieldCache, GetStringIndex (returning a StringIndex
instance, with direct arrays int[] order and String[] lookup) has
been replaced with GetTermsIndex (returning a SortedDocValues instance). SortedDocValues provides the GetOrd(int docID) method to look up the int order for a document, LookupOrd(int ord, BytesRef result) to look up the term from a given
order, and the sugar method Get(int docID, BytesRef result)
which internally calls GetOrd and then LookupOrd.
If you had code like this before:
StringIndex idx = FieldCache.DEFAULT.GetStringIndex(reader, field);
...
int ord = idx.order[docID];
String aValue = idx.lookup[ord];
you can do this instead:
SortedDocValues idx = FieldCache.DEFAULT.GetTermsIndex(reader, field);
...
int ord = idx.GetOrd(docID);
BytesRef term = new BytesRef();
idx.LookupOrd(ord, term);
string aValue = term.Utf8ToString();
Note however that it can be costly to convert to String, so it's better to work directly with the BytesRef.
SortedDocValues also has a GetTermsEnum() method, which returns an iterator (TermsEnum) over the term values in the index (i.e., it iterates ord = 0..ValueCount-1).
Furthermore, if you drill down into the issue LUCENE-2380, there is an explanation for the change: it was done primarily for performance reasons. The field cache no longer stores a string[]; the underlying data is now a byte[], so extra steps are required to get a UTF-8 string.
Do note that you are meant to reuse the BytesRef instance that is passed in to get better performance.
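To approximate the old GetStrings behavior with a top-level IndexReader, one option is to visit each segment through reader.Leaves and offset the segment-local docID by the segment's DocBase. The sketch below is not part of Lucene.NET; FieldCacheCompat and its GetStrings method are hypothetical names, and the code assumes the Lucene.NET 4.8 API. It reuses a single BytesRef and pays the string conversion cost once per document, so it is best called once and the result kept.

```csharp
using Lucene.Net.Index;
using Lucene.Net.Search;
using Lucene.Net.Util;

public static class FieldCacheCompat // hypothetical helper, not a Lucene.NET type
{
    // Rebuilds a string[] indexed by top-level docID, similar in shape to what
    // GetStrings returned in 3.0.3.
    public static string[] GetStrings(IndexReader reader, string field)
    {
        string[] values = new string[reader.MaxDoc];
        BytesRef scratch = new BytesRef(); // one instance reused for all documents

        foreach (AtomicReaderContext context in reader.Leaves)
        {
            AtomicReader leaf = context.AtomicReader;
            // Assumption: the three-argument overload from Lucene.NET 4.8.
            BinaryDocValues terms = FieldCache.DEFAULT.GetTerms(leaf, field, false);

            for (int docID = 0; docID < leaf.MaxDoc; docID++)
            {
                terms.Get(docID, scratch);
                // DocBase maps the segment-local docID to the top-level docID.
                values[context.DocBase + docID] = scratch.Utf8ToString();
            }
        }
        return values;
    }
}
```

One difference to be aware of: a document with no value for the field should come back here as an empty string, since the BytesRef is filled with zero-length bytes, whereas the old array could hold null. Wrapping the reader with SlowCompositeReaderWrapper.Wrap(reader) is another way to obtain an AtomicReader, at the cost of losing per-segment caching.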
Additional context
Here is the link to that thread:
#398