
not able to get data from csv file to train network in "train-theano.py" #14

Open
totial opened this issue Feb 3, 2017 · 3 comments

totial commented Feb 3, 2017

Hey, I'm having trouble getting the data to train the RNN, specifically on this line:
sentences = itertools.chain(*[nltk.sent_tokenize(x[0].decode('utf-8').lower()) for x in reader])
If I open the file with 'rb' I get the error:

_csv.Error: iterator should return strings, not bytes (did you open the file in text mode?)

and if I open it with 'r' I get:

sentences = itertools.chain(*[nltk.sent_tokenize(x[0].decode('utf-8').lower()) for x in reader])

AttributeError: 'str' object has no attribute 'decode'

I'm not sure whether the basic idea is to train the NN on strings or on binary data (I'm guessing binary data).
Thanks for your time!
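
For reference, both errors point at the same Python 3 behaviour: csv.reader expects a text-mode file that yields str rows, so there is nothing left to .decode(). A minimal sketch that reproduces both messages, using the CSV path from the snippets below:

import csv

# Binary mode: csv.reader refuses bytes in Python 3
with open('data/reddit-comments-2015-08.csv', 'rb') as f:
    reader = csv.reader(f)
    next(reader)  # _csv.Error: iterator should return strings, not bytes

# Text mode: rows are already str, so .decode() no longer exists
with open('data/reddit-comments-2015-08.csv', 'r') as f:
    reader = csv.reader(f)
    row = next(reader)
    row[0].decode('utf-8')  # AttributeError: 'str' object has no attribute 'decode'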

@GoingMyWay

Maybe your Python version is 3.x; the code below runs without error under Python 2.7:

with open('data/reddit-comments-2015-08.csv', 'rb') as f:
    reader = csv.reader(f, skipinitialspace=True)
    reader.next()
    # Split full comments into sentences
    sentences = itertools.chain(*[nltk.sent_tokenize(x[0].decode('utf-8').lower()) for x in reader])
    # Append SENTENCE_START and SENTENCE_END
    sentences = ["%s %s %s" % (sentence_start_token, x, sentence_end_token) for x in sentences]

@chrischang80

You can remove ".decode('utf-8')" and try again.

Pavonlo commented Feb 7, 2019

> You can remove ".decode('utf-8')" and try again.

Yes, you must remove this, but a couple of other changes are also required, so the entire line becomes:
with open('data/reddit-comments-2015-08.csv', 'rt', encoding="utf8") as f:
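
Putting the two suggestions together, a sketch of the whole block adapted to Python 3. Note that reader.next() also has to become next(reader) in Python 3; the token values are assumed to be the literal SENTENCE_START / SENTENCE_END strings the original comments refer to.

import csv
import itertools
import nltk

# nltk.download('punkt') may be needed once for sent_tokenize
sentence_start_token = "SENTENCE_START"
sentence_end_token = "SENTENCE_END"

with open('data/reddit-comments-2015-08.csv', 'rt', encoding='utf8') as f:
    reader = csv.reader(f, skipinitialspace=True)
    next(reader)  # skip the header row; reader.next() is Python 2 only
    # Split full comments into sentences; rows are already str, so no .decode('utf-8')
    sentences = itertools.chain(*[nltk.sent_tokenize(x[0].lower()) for x in reader])
    # Append SENTENCE_START and SENTENCE_END
    sentences = ["%s %s %s" % (sentence_start_token, x, sentence_end_token) for x in sentences]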
