Document required encoding of query parameters of search #2515

Commits on Jun 7, 2017

  1. Document required encoding of query parameters of search

    ## Solr
    
    A note in the documented changes of Solr 4.1.0 regarding portability of Solr across Web containers points out that ["Query strings passed in via the URL need to be properly-%-escaped, UTF-8 encoded bytes, otherwise Solr refuses to handle the request"](https://github.com/apache/lucene-solr/blob/releases/lucene-solr/4.10.4/solr/CHANGES.txt#L3376-L3381).
    A note in the documented changes of Solr 4.5.0 mentions that the encoding of query parameters can be selected with the [`ie` parameter](https://github.com/apache/lucene-solr/blob/releases/lucene-solr/4.10.4/solr/CHANGES.txt#L1995-L1997) (e.g. [`ie=iso-8859-1`](https://github.com/apache/lucene-solr/blob/releases/lucene-solr/4.10.4/solr/core/src/test/org/apache/solr/servlet/SolrRequestParserTest.java#L249)), that the encoding of a POST request body can be selected with the [`Content-Type` header](https://github.com/apache/lucene-solr/blob/releases/lucene-solr/4.10.4/solr/CHANGES.txt#L1997-L1998) (e.g. [`application/x-www-form-urlencoded; charset=iso-8859-1`](https://github.com/apache/lucene-solr/blob/releases/lucene-solr/4.10.4/solr/core/src/test/org/apache/solr/servlet/SolrRequestParserTest.java#L251)), and that [UTF-8 is the default encoding](https://github.com/apache/lucene-solr/blob/releases/lucene-solr/4.10.4/solr/CHANGES.txt#L1997).
    As of Solr 4.10.4, UTF-8 is still the [default](https://github.com/apache/lucene-solr/blob/releases/lucene-solr/4.10.4/solr/core/src/java/org/apache/solr/servlet/SolrRequestParsers.java#L345-L348) encoding for both [query parameters](https://github.com/apache/lucene-solr/blob/releases/lucene-solr/4.10.4/solr/core/src/java/org/apache/solr/servlet/SolrRequestParsers.java#L248) and the [POST request body](https://github.com/apache/lucene-solr/blob/releases/lucene-solr/4.10.4/solr/core/src/java/org/apache/solr/servlet/SolrRequestParsers.java#L602-L606).
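
As a rough illustration of the two encodings described above, the following Python sketch percent-escapes the same query value first as UTF-8 bytes (the default Solr expects) and then as ISO-8859-1 bytes declared through `ie`. The field name `name` and the search term are hypothetical; only the `ie` parameter and the charset names come from the Solr notes.

```python
from urllib.parse import urlencode

# Hypothetical query value containing a non-ASCII character.
query = "name:café"

# Default case: the value is encoded to UTF-8 bytes, then percent-escaped.
utf8_query_string = urlencode({"q": query}, encoding="utf-8")

# Alternative case: ISO-8859-1 bytes, with the charset announced via `ie`.
latin1_query_string = urlencode(
    {"q": query, "ie": "iso-8859-1"}, encoding="iso-8859-1"
)

print(utf8_query_string)    # q=name%3Acaf%C3%A9
print(latin1_query_string)  # q=name%3Acaf%E9&ie=iso-8859-1
```

The same character becomes two escaped bytes (`%C3%A9`) in UTF-8 and one (`%E9`) in ISO-8859-1, which is why the receiver has to know which encoding was used.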
    
    ## Riak Search
    
    [Riak KV 2.2.3 ships yokozuna 2.1.10](https://github.com/basho/riak/blob/riak-2.2.3/rebar.config#L24),
    [which integrates Solr 4.10.4](https://github.com/basho/yokozuna/blob/2.1.10/tools/grab-solr.sh#L21)
    (see also basho/yokozuna@7f0d464).
    The Solr 4.10 reference guide is available [online](https://archive.apache.org/dist/lucene/solr/ref-guide/apache-solr-ref-guide-4.10.pdf).
    
    [Yokozuna 2.1.10 depends on riak_kv 2.1.7](https://github.com/basho/yokozuna/blob/2.1.10/rebar.config#L14),
    which, [via](https://github.com/basho/riak_kv/blob/2.1.7/rebar.config#L38) [riak_api 2.1.6, depends on basho/webmachine 1.10.8-basho1](https://github.com/basho/riak_api/blob/2.1.6/rebar.config#L6).
    Webmachine contains, among others, the module [`wrq`](https://github.com/basho/webmachine/blob/1.10.8-basho1/src/wrq.erl)
    and [depends on mochiweb v2.9.0p2](https://github.com/basho/webmachine/blob/1.10.8-basho1/rebar.config#L9),
    which contains, among others, the module [`mochiweb_util`](https://github.com/basho/mochiweb/blob/v2.9.0p2/src/mochiweb_util.erl).
    
    When receiving a [search request](https://docs.basho.com/riak/kv/2.2.3/developing/api/http/search-query/#request),
    yokozuna [calls the `search` function](https://github.com/basho/yokozuna/blob/2.1.10/src/yz_wm_search.erl#L58),
    which [extracts](https://github.com/basho/yokozuna/blob/2.1.10/src/yz_wm_search.erl#L125) [the](https://github.com/basho/webmachine/blob/1.10.8-basho1/src/wrq.erl#L111) [query](https://github.com/basho/webmachine/blob/1.10.8-basho1/src/wrq.erl#L68-L70) parameters ([percent-decoded but not decoded any further, e.g. as Unicode](https://github.com/basho/mochiweb/blob/v2.9.0p2/src/mochiweb_util.erl#L202-L203)),
    then [appends some parameters related to distributed search](https://github.com/basho/yokozuna/blob/2.1.10/src/yz_solr.erl#L323),
    then [percent-encodes the parameters (again without any further encoding, e.g. of Unicode)](https://github.com/basho/yokozuna/blob/2.1.10/src/yz_solr.erl#L330),
    and [contacts Solr via a POST request](https://github.com/basho/yokozuna/blob/2.1.10/src/yz_solr.erl#L334),
    [setting the `Content-Type` header to `application/x-www-form-urlencoded`](https://github.com/basho/yokozuna/blob/2.1.10/src/yz_solr.erl#L332).
    
    Because this `Content-Type` header specifies no charset, Solr interprets the POST body as UTF-8.
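
The consequence for clients can be sketched in Python. This is only a model of the flow described above, not yokozuna's or Solr's actual code: `relay` and `read_as_utf8` are hypothetical helpers that mimic, respectively, a byte-level percent-decode followed by a percent-encode, and a receiver that decodes the body as UTF-8. The replacement character is used here just to make the mismatch visible; a real Solr instance may instead refuse the request.

```python
from urllib.parse import quote_from_bytes, unquote_to_bytes


def relay(raw_value: str) -> str:
    """Percent-decode to raw bytes, then percent-encode the same bytes again.

    No character decoding happens in between, so the bytes pass through as-is.
    """
    return quote_from_bytes(unquote_to_bytes(raw_value), safe="")


def read_as_utf8(body_value: str) -> str:
    """Interpret a percent-encoded value as UTF-8 (no charset was declared)."""
    return unquote_to_bytes(body_value).decode("utf-8", errors="replace")


utf8_request = "caf%C3%A9"   # "café" sent as percent-escaped UTF-8 bytes
latin1_request = "caf%E9"    # "café" sent as percent-escaped ISO-8859-1 bytes

print(read_as_utf8(relay(utf8_request)))    # café
print(read_as_utf8(relay(latin1_request)))  # caf followed by U+FFFD
```

Under these assumptions, only query parameters sent as percent-escaped UTF-8 bytes survive the round trip unchanged.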
    Luca Favatella committed Jun 7, 2017
    Commit 4e251ea