Spacy-RU integration with Rasa Open source #30

Open
EugenSmith opened this issue Sep 6, 2020 · 0 comments

EugenSmith commented Sep 6, 2020

Hello.

Installation steps and the package versions used:
apt update && apt install -y python3-venv python3-dev python3-pip

python3 -m venv ./venv
source ./venv/bin/activate

pip install -U pip
pip install rasa --use-feature=2020-resolver

pip install pymorphy2==0.8
pip install spacy==2.1.9

git clone -b v2.1 https://github.com/buriy/spacy-ru.git
cp -r ./spacy-ru/ru2/. ./ru2/

python -V
Python 3.6.9

pip -V
pip 20.2.2 from /home/rasa/venv/lib/python3.6/site-packages/pip (python 3.6)

pip show tensorflow
Version: 2.1.1

pip show tensorflow_addons
Version: 0.7.1

pip show pymorphy2
Version: 0.8

pip show spacy
Version: 2.1.9

rasa --version
Rasa 1.10.12

cat ./config.yml

# Configuration for Rasa NLU.
# https://rasa.com/docs/rasa/nlu/components/
language: ru
pipeline:
  - name: "SpacyNLP"
    model: ru2
  - name: "SpacyTokenizer"
  - name: "SpacyFeaturizer"
  - name: "RegexFeaturizer"
  - name: "CRFEntityExtractor"
  - name: "EntitySynonymMapper"
  - name: "SklearnIntentClassifier"
    analyzer: "char_wb"
    min_ngram: 1
    max_ngram: 4
  - name: DIETClassifier
    epochs: 100
  - name: ResponseSelector
    epochs: 100
# Configuration for Rasa Core.
# https://rasa.com/docs/rasa/core/policies/
policies:
  - name: MemoizationPolicy
  - name: TEDPolicy
    max_history: 5
    epochs: 100

After running the command:
rasa train

Training Core model...
Processed Story Blocks: 100%|███████████████| 5/5 [00:00<00:00, 3274.24it/s, # trackers=1]
Processed Story Blocks: 100%|███████████████| 5/5 [00:00<00:00, 1573.97it/s, # trackers=5]
Processed Story Blocks: 100%|███████████████| 5/5 [00:00<00:00, 405.48it/s, # trackers=20]
Processed Story Blocks: 100%|███████████████| 5/5 [00:00<00:00, 301.93it/s, # trackers=24]
Processed trackers: 100%|███████████████████| 5/5 [00:00<00:00, 1970.45it/s, # actions=16]
Processed actions: 16it [00:00, 10648.82it/s, # examples=16]
Processed trackers: 100%|███████████████| 231/231 [00:00<00:00, 822.90it/s, # actions=126]
Epochs: 100%|██████| 100/100 [00:26<00:00,  3.71it/s, t_loss=0.084, loss=0.011, acc=1.000]
2020-09-06 16:12:45 INFO     rasa.utils.tensorflow.models  - Finished training.
2020-09-06 16:12:45 INFO     rasa.core.agent  - Persisted model to '/tmp/tmpwnqa2h6f/core'
Core model training completed.
Training NLU model...
2020-09-06 16:12:45 INFO     rasa.nlu.utils.spacy_utils  - Trying to load spacy model with name 'ru2'
2020-09-06 16:12:45 INFO     pymorphy2.opencorpora_dict.wrapper  - Loading dictionaries from /home/rasa/venv/lib/python3.6/site-packages/pymorphy2_dicts/data
2020-09-06 16:12:45 INFO     pymorphy2.opencorpora_dict.wrapper  - format: 2.4, revision: 393442, updated: 2015-01-17T16:03:56.586168
2020-09-06 16:12:51 INFO     rasa.nlu.components  - Added 'SpacyNLP' to component cache. Key 'SpacyNLP-ru2'.
2020-09-06 16:12:51 INFO     rasa.nlu.training_data.training_data  - Training data stats:
2020-09-06 16:12:51 INFO     rasa.nlu.training_data.training_data  - Number of intent examples: 33 (7 distinct intents)
2020-09-06 16:12:51 INFO     rasa.nlu.training_data.training_data  -   Found intents: 'mood_unhappy', 'bot_challenge', 'deny', 'mood_great', 'goodbye', 'greet', 'affirm'
2020-09-06 16:12:51 INFO     rasa.nlu.training_data.training_data  - Number of response examples: 0 (0 distinct responses)
2020-09-06 16:12:51 INFO     rasa.nlu.training_data.training_data  - Number of entity examples: 0 (0 distinct entities)
2020-09-06 16:12:51 INFO     rasa.nlu.model  - Starting to train component SpacyNLP
2020-09-06 16:12:51 INFO     rasa.nlu.model  - Finished training component.
2020-09-06 16:12:51 INFO     rasa.nlu.model  - Starting to train component SpacyTokenizer
2020-09-06 16:12:51 INFO     rasa.nlu.model  - Finished training component.
2020-09-06 16:12:51 INFO     rasa.nlu.model  - Starting to train component SpacyFeaturizer
2020-09-06 16:12:51 INFO     rasa.nlu.model  - Finished training component.
2020-09-06 16:12:51 INFO     rasa.nlu.model  - Starting to train component RegexFeaturizer
2020-09-06 16:12:51 INFO     rasa.nlu.model  - Finished training component.
2020-09-06 16:12:51 INFO     rasa.nlu.model  - Starting to train component CRFEntityExtractor
2020-09-06 16:12:51 INFO     rasa.nlu.model  - Finished training component.
2020-09-06 16:12:51 INFO     rasa.nlu.model  - Starting to train component EntitySynonymMapper
2020-09-06 16:12:51 INFO     rasa.nlu.model  - Finished training component.
2020-09-06 16:12:51 INFO     rasa.nlu.model  - Starting to train component SklearnIntentClassifier
Fitting 2 folds for each of 6 candidates, totalling 12 fits
[Parallel(n_jobs=1)]: Using backend SequentialBackend with 1 concurrent workers.
[Parallel(n_jobs=1)]: Done  12 out of  12 | elapsed:    0.0s finished
Traceback (most recent call last):
  File "/home/rasa/venv/bin/rasa", line 8, in <module>
    sys.exit(main())
  File "/home/rasa/venv/lib/python3.6/site-packages/rasa/__main__.py", line 92, in main
    cmdline_arguments.func(cmdline_arguments)
  File "/home/rasa/venv/lib/python3.6/site-packages/rasa/cli/train.py", line 76, in train
    additional_arguments=extract_additional_arguments(args),
  File "/home/rasa/venv/lib/python3.6/site-packages/rasa/train.py", line 50, in train
    additional_arguments=additional_arguments,
  File "uvloop/loop.pyx", line 1456, in uvloop.loop.Loop.run_until_complete
  File "/home/rasa/venv/lib/python3.6/site-packages/rasa/train.py", line 101, in train_async
    additional_arguments,
  File "/home/rasa/venv/lib/python3.6/site-packages/rasa/train.py", line 188, in _train_async_internal
    additional_arguments=additional_arguments,
  File "/home/rasa/venv/lib/python3.6/site-packages/rasa/train.py", line 245, in _do_training
    persist_nlu_training_data=persist_nlu_training_data,
  File "/home/rasa/venv/lib/python3.6/site-packages/rasa/train.py", line 482, in _train_nlu_with_validated_data
    persist_nlu_training_data=persist_nlu_training_data,
  File "/home/rasa/venv/lib/python3.6/site-packages/rasa/nlu/train.py", line 90, in train
    interpreter = trainer.train(training_data, **kwargs)
  File "/home/rasa/venv/lib/python3.6/site-packages/rasa/nlu/model.py", line 191, in train
    updates = component.train(working_data, self.config, **context)
  File "/home/rasa/venv/lib/python3.6/site-packages/rasa/nlu/classifiers/sklearn_intent_classifier.py", line 125, in train
    self.clf.fit(X, y)
  File "/home/rasa/venv/lib/python3.6/site-packages/sklearn/model_selection/_search.py", line 739, in fit
    self.best_estimator_.fit(X, y, **fit_params)
  File "/home/rasa/venv/lib/python3.6/site-packages/sklearn/svm/_base.py", line 148, in fit
    accept_large_sparse=False)
  File "/home/rasa/venv/lib/python3.6/site-packages/sklearn/utils/validation.py", line 755, in check_X_y
    estimator=estimator)
  File "/home/rasa/venv/lib/python3.6/site-packages/sklearn/utils/validation.py", line 578, in check_array
    allow_nan=force_all_finite == 'allow-nan')
  File "/home/rasa/venv/lib/python3.6/site-packages/sklearn/utils/validation.py", line 60, in _assert_all_finite
    msg_dtype if msg_dtype is not None else X.dtype)
ValueError: Input contains NaN, infinity or a value too large for dtype('float64').
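
The ValueError above is raised by scikit-learn's finiteness check (_assert_all_finite in the traceback), so at least one feature value reaching SklearnIntentClassifier appears to be NaN or infinite. A minimal sketch of how the offending training examples could be located is given here. It assumes the per-example feature vectors can be dumped as plain lists of floats; the helper name find_non_finite_rows and the sample data are hypothetical and not part of Rasa, spaCy or scikit-learn. Whether the ru2 vectors are actually the source of the bad values is an assumption that the log does not confirm.

```python
import math
from typing import Dict, List, Sequence


def find_non_finite_rows(features: Sequence[Sequence[float]]) -> Dict[int, List[int]]:
    """Map each row index to the column indices that hold NaN or +/-inf.

    Rows that contain only finite values are left out of the result.
    """
    bad: Dict[int, List[int]] = {}
    for row_idx, row in enumerate(features):
        cols = [
            col_idx
            for col_idx, value in enumerate(row)
            if not math.isfinite(value)
        ]
        if cols:
            bad[row_idx] = cols
    return bad


if __name__ == "__main__":
    # Toy stand-in for the sentence vectors of three training examples.
    # Real values would come from the featurizer output, which this
    # report does not show.
    sample = [
        [0.1, 0.2, 0.3],
        [float("nan"), 0.0, 1.0],
        [0.5, float("inf"), -0.5],
    ]
    print(find_non_finite_rows(sample))
```

If the returned mapping is non-empty for vectors produced with ru2 but empty for the en model, that would point at the language model rather than at the classifier settings.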

If the en language model is used instead, training runs without errors.
I would appreciate it if anyone who has managed to use Rasa with Russian could share their experience.
