Getting Error with Basic Authentication and Solr Committer #21

Open
MRC-westat opened this issue Apr 28, 2021 · 8 comments

@MRC-westat

I am using Basic Authentication on a standalone (non-cloud) Solr 8.8.2 installation on Windows.
The crawl succeeds, and the error is thrown in the committer. SSL is turned off (for the moment).
The username and password are in clear text, the same ones used for the Basic Auth login to the Solr admin UI.
The core is called sops.

My committer configuration is:

  <committer class="com.norconex.committer.solr.SolrCommitter">
    <solrURL>http://mysolr:8983/solr/sops</solrURL>
    <username>solr_admin</username>
    <password>mypassword</password>
    <queueDir>${workDir}/committer-queue</queueDir>
  </committer>

From the Norconex log file:

[non-job]: 2021-04-27 19:52:54 INFO - Starting execution.
[non-job]: 2021-04-27 19:52:54 INFO - Version: Norconex HTTP Collector 2.9.0 (Norconex Inc.)
[non-job]: 2021-04-27 19:52:54 INFO - Version: Norconex Collector Core 1.10.0 (Norconex Inc.)
[non-job]: 2021-04-27 19:52:54 INFO - Version: Norconex Importer 2.10.0 (Norconex Inc.)
[non-job]: 2021-04-27 19:52:54 INFO - Version: Norconex JEF 4.1.2 (Norconex Inc.)
[non-job]: 2021-04-27 19:52:54 INFO - Version: Norconex Committer Core 2.1.3 (Norconex Inc.)
[non-job]: 2021-04-27 19:52:54 INFO - Version: Norconex Committer Solr 2.4.0 (Norconex Inc.)
...
xyz crawler: 2021-04-27 19:53:49 INFO - Committing 56 files
xyz crawler: 2021-04-27 19:53:50 INFO - Sending 56 documents to Solr for update/deletion.
xyz crawler: 2021-04-27 19:53:50 INFO - xyz crawler: Crawler executed in 56 seconds.
xyz crawler: 2021-04-27 19:53:50 INFO - xyz crawler: Closing sitemap store...
xyz crawler: 2021-04-27 19:53:50 ERROR - Execution failed for job: xyz crawler
com.norconex.committer.core.CommitterException: Cannot index document batch to Solr.
at com.norconex.committer.solr.SolrCommitter.commitBatch(SolrCommitter.java:400)
at com.norconex.committer.core.AbstractBatchCommitter.commitAndCleanBatch(AbstractBatchCommitter.java:179)
at com.norconex.committer.core.AbstractBatchCommitter.commitComplete(AbstractBatchCommitter.java:159)
at com.norconex.committer.core.AbstractFileQueueCommitter.commit(AbstractFileQueueCommitter.java:233)
at com.norconex.collector.core.crawler.AbstractCrawler.execute(AbstractCrawler.java:274)
at com.norconex.collector.core.crawler.AbstractCrawler.doExecute(AbstractCrawler.java:228)
at com.norconex.collector.core.crawler.AbstractCrawler.startExecution(AbstractCrawler.java:184)
at com.norconex.jef4.job.AbstractResumableJob.execute(AbstractResumableJob.java:49)
at com.norconex.jef4.suite.JobSuite.runJob(JobSuite.java:354)
at com.norconex.jef4.suite.JobSuite.doExecute(JobSuite.java:293)
at com.norconex.jef4.suite.JobSuite.execute(JobSuite.java:166)
at com.norconex.collector.core.AbstractCollector.start(AbstractCollector.java:150)
at com.norconex.collector.core.AbstractCollectorLauncher.launch(AbstractCollectorLauncher.java:95)
at com.norconex.collector.http.HttpCollector.main(HttpCollector.java:74)
Caused by: org.apache.solr.client.solrj.impl.HttpSolrClient$RemoteSolrException: Error from server at http://mysolr:8983/solr/sops: Expected mime type application/octet-stream but got text/html.

<title>Error 401 require authentication</title>

HTTP ERROR 401 require authentication

URI:/solr/sops/update
STATUS:401
MESSAGE:require authentication
SERVLET:default
at org.apache.solr.client.solrj.impl.HttpSolrClient.executeMethod(HttpSolrClient.java:629)
at org.apache.solr.client.solrj.impl.HttpSolrClient.request(HttpSolrClient.java:265)
at org.apache.solr.client.solrj.impl.HttpSolrClient.request(HttpSolrClient.java:248)
at org.apache.solr.client.solrj.SolrRequest.process(SolrRequest.java:211)
at org.apache.solr.client.solrj.SolrClient.commit(SolrClient.java:504)
at org.apache.solr.client.solrj.SolrClient.commit(SolrClient.java:479)
at com.norconex.committer.solr.SolrCommitter.commitBatch(SolrCommitter.java:397)
@essiembre
Contributor

Which version of the Solr Committer are you using?

To establish whether the problem is specific to the Committer, can you confirm if you are able to successfully make queries and updates to Solr directly from the command line? Can you do so from the same host where the HTTP Collector runs? If not, the problem is likely misconfiguration on the Solr side.

If it works from the command line, it is harder to help out without a way to reproduce. Do you have more information about the error in your Solr logs? Check also for any kind of errors, especially around startup. You can also try upgrading the SolrJ library installed with the Committer to match your version of Solr.

@MRC-westat
Author

Hi .. thanks for the assistance!

the version of the Solr committer is 2.4.0
Solr version is 8.8.2
I am using the latest Java 11 version from AdoptOpenJDK

I can run a query successfully using Postman (passing in the username and password),
and I can also do it successfully on the command line using curl:
curl --user solr_admin:password "http://wessolrtest1:8983/solr/sops/select?q=test"
The collector/committer is running on the same machine as the standalone Solr server.
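
The curl check above only exercises /select, while the failing request in the stack trace goes to /solr/sops/update. Depending on how security.json is set up, reads and updates can be governed by different permissions, so it may be worth testing the update handler with the same credentials. A minimal sketch, assuming curl is available; the function names are hypothetical helpers, not part of Solr or Norconex, and the host, core, and credentials are the ones quoted in this thread:

```shell
# Hypothetical helpers for checking Basic Auth against the Solr update handler.

# Build the update URL for a core; strips one trailing slash from the core URL.
solr_update_url() {
  printf '%s/update?commit=true' "${1%/}"
}

# Send an explicit commit with Basic Auth and print only the HTTP status code.
# $1 = core URL, $2 = "user:password"
solr_commit_status() {
  curl --silent --output /dev/null --write-out '%{http_code}' \
       --user "$2" "$(solr_update_url "$1")"
}

# Turn the status code into a short hint.
explain_status() {
  case "$1" in
    200) echo "OK: commit accepted with these credentials" ;;
    401) echo "401: credentials missing or rejected" ;;
    403) echo "403: authenticated, but not authorized for updates" ;;
    *)   echo "$1: unexpected status, check the Solr log" ;;
  esac
}
```

For example: `explain_status "$(solr_commit_status http://wessolrtest1:8983/solr/sops solr_admin:password)"`. If this reports OK while the Committer still gets a 401, the Solr side is probably fine and the credentials are likely not reaching Solr from the client.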

not much information on Solr startup
D:\Solr\bin>solr start -h wessolrtest1
"java version info is 11.0.10"
"Extracted major version is 11"
OpenJDK 64-Bit Server VM warning: JVM cannot use large page memory because it does not have enough privilege to lock pages in memory.
Waiting up to 30 to see Solr running on port 8983

I am not seeing any errors in the Solr log file, just INFO messages.

The log output from the committer is:
xyz crawler: 2021-05-04 19:03:40 INFO - xyz crawler: Crawler finishing: committing documents.
xyz crawler: 2021-05-04 19:03:40 INFO - Committing 92 files
xyz crawler: 2021-05-04 19:03:40 INFO - Sending 92 documents to Solr for update/deletion.
xyz crawler: 2021-05-04 19:03:41 INFO - xyz crawler: Crawler executed in 57 seconds.
xyz crawler: 2021-05-04 19:03:41 INFO - xyz crawler: Closing sitemap store...
xyz crawler: 2021-05-04 19:03:41 ERROR - Execution failed for job: xyz crawler
com.norconex.committer.core.CommitterException: Cannot index document batch to Solr.
at com.norconex.committer.solr.SolrCommitter.commitBatch(SolrCommitter.java:400)
at com.norconex.committer.core.AbstractBatchCommitter.commitAndCleanBatch(AbstractBatchCommitter.java:179)
at com.norconex.committer.core.AbstractBatchCommitter.commitComplete(AbstractBatchCommitter.java:159)
at com.norconex.committer.core.AbstractFileQueueCommitter.commit(AbstractFileQueueCommitter.java:233)
at com.norconex.collector.core.crawler.AbstractCrawler.execute(AbstractCrawler.java:274)
at com.norconex.collector.core.crawler.AbstractCrawler.doExecute(AbstractCrawler.java:228)
at com.norconex.collector.core.crawler.AbstractCrawler.startExecution(AbstractCrawler.java:184)
at com.norconex.jef4.job.AbstractResumableJob.execute(AbstractResumableJob.java:49)
at com.norconex.jef4.suite.JobSuite.runJob(JobSuite.java:354)
at com.norconex.jef4.suite.JobSuite.doExecute(JobSuite.java:293)
at com.norconex.jef4.suite.JobSuite.execute(JobSuite.java:166)
at com.norconex.collector.core.AbstractCollector.start(AbstractCollector.java:150)
at com.norconex.collector.core.AbstractCollectorLauncher.launch(AbstractCollectorLauncher.java:95)
at com.norconex.collector.http.HttpCollector.main(HttpCollector.java:74)
Caused by: org.apache.solr.client.solrj.impl.HttpSolrClient$RemoteSolrException: Error from server at http://wessolrtest1:8983/solr/sops: Expected mime type application/octet-stream but got text/html.

<title>Error 401 require authentication</title>

HTTP ERROR 401 require authentication

URI:/solr/sops/update
STATUS:401
MESSAGE:require authentication
SERVLET:default

@essiembre
Contributor

Can you share the key elements of your Solr security config in an attempt to reproduce the issue?

@MRC-westat
Author

Thanks! I am posting the security.json file along with the solr.in.cmd file and the Norconex-related files.
Solr 8.8.2 is installed as a standalone server on Windows Server 2019.
Let me know if you need anything else.
SOLR_debugging.zip

essiembre added the bug label May 17, 2021
@essiembre
Contributor

I could reproduce. It turns out the credentials are currently not applied when commit is invoked. Until a fix is provided, you can add the following to your Solr Committer configuration block:

<solrCommitDisabled>true</solrCommitDisabled>

Solr will rely on its auto-commit configuration to commit the data.
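
With this workaround in place, documents become searchable only when Solr's own auto-commit opens a new searcher (a soft commit, or a hard commit with openSearcher enabled), so it can help to check what the core is configured to do. A rough sketch, assuming the Solr Config API is reachable with the same credentials; the function names are hypothetical, and the host and core are the ones from this thread:

```shell
# Hypothetical helpers for inspecting a core's auto-commit settings.

# URL of the Config API section that holds autoCommit / autoSoftCommit.
solr_update_handler_config_url() {
  printf '%s/config/updateHandler' "${1%/}"
}

# Print the effective updateHandler settings as JSON.
# $1 = core URL, $2 = "user:password"
show_autocommit_settings() {
  curl --silent --show-error --user "$2" \
       "$(solr_update_handler_config_url "$1")"
}
```

For example: `show_autocommit_settings http://wessolrtest1:8983/solr/sops solr_admin:password`. In the JSON that comes back, look at the maxTime values under autoCommit and autoSoftCommit; a value of -1 means that kind of commit is disabled.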

@essiembre
Contributor

FYI, a new snapshot release of the Solr Committer was made with a proper fix. You no longer have to apply the workaround (disabling Solr commits).

Please confirm.

@MRC-westat
Author

I downloaded the Committer 2.4.1 snapshot and ran the install script, but I am still getting the error, and my log file still reports version 2.4.0.
Do I need to do something special to overwrite the old Committer?

thanks
Michael

@essiembre
Contributor

Yes, look in your lib folder and you will likely see duplicate JARs. If you see two files starting with norconex-committer-solr-...., delete or back up the older one(s).

If it still fails, you may want to reinstall the Collector files and the Solr Committer files to make sure you have no other duplicates.

The easiest way to install a Committer is to run the install script found in the extracted Committer Zip file. It takes care of eliminating possible duplicates.
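
To spot leftover copies quickly, something like the following can list the Committer JARs in the lib folder. This is only a sketch for a POSIX shell (Git Bash or WSL on Windows); the function names are hypothetical, and the only thing taken from the advice above is the norconex-committer-solr- file name prefix:

```shell
# Hypothetical helpers for finding duplicate Solr Committer JARs.

# List every Solr Committer JAR found directly in a lib folder, one per line.
# $1 = path to the Collector's lib folder
list_committer_jars() {
  find "$1" -maxdepth 1 -type f -name 'norconex-committer-solr-*.jar' | sort
}

# Succeeds (exit status 0) when more than one Committer JAR is present.
has_duplicate_committer_jars() {
  [ "$(list_committer_jars "$1" | grep -c .)" -gt 1 ]
}
```

For example: `has_duplicate_committer_jars ./lib && list_committer_jars ./lib`. In a plain Windows command prompt, `dir lib\norconex-committer-solr-*` gives the same listing.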
