This repository has been archived by the owner on Apr 18, 2018. It is now read-only.

bolt died because the read_tuple() has TypeError: 'int' object has no attribute '__getitem__' #109

Open
n1epan opened this issue Mar 31, 2015 · 3 comments

Comments

@n1epan

n1epan commented Mar 31, 2015

I have a bolt A that continuously emits a namedtuple, Fields = namedtuple("Fields", "table action ids"), where ids is a list that can contain thousands of items.

Bolt B, downstream of bolt A, fails with the error below. My guess is that the namedtuple is overflowing sys.stdin. Is this possible? What should I do in this case?

topology:
    - spout:
        name: spoutA
        module: folder.spoutA

    - bolt:
        name: boltA
        module: folder.boltA
        groupings:
            - shuffle_grouping:
                component: spoutA

    - bolt:
        name: boltB
        module: folder.boltB
        groupings:
            - shuffle_grouping:
                component: boltA

I further checked what read_tuple() assigns to cmd. It is the integer 56:
03/31/2015 03:55:13 PM - pyleus.storm.component - read_tuple - INFO: 56
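
For context on how a decoder can hand back a bare integer: in the MessagePack format, any leading byte from 0x00 to 0x7f is a complete one-byte positive integer, so a decoder that begins reading at the wrong offset can return a value such as 56 instead of a map. The sketch below only classifies leading bytes according to the format specification. It is not pyleus code, `first_byte_kind` is a hypothetical helper, and nothing in this thread establishes that a misaligned read is what actually happened here.

```python
def first_byte_kind(byte):
    """Classify a MessagePack leading byte (subset of the format spec).

    Hypothetical helper for illustration; not part of pyleus or msgpack.
    """
    if byte <= 0x7f:
        return "positive fixint"   # the byte itself is the integer value
    if 0x80 <= byte <= 0x8f:
        return "fixmap"            # what a Storm command message would start with
    if 0x90 <= byte <= 0x9f:
        return "fixarray"
    if 0xa0 <= byte <= 0xbf:
        return "fixstr"
    if byte >= 0xe0:
        return "negative fixint"
    return "other (nil, bool, or a type with an explicit length/size prefix)"


print(first_byte_kind(56))    # 56 == 0x38, inside the fixint range
print(first_byte_kind(0x85))  # a map with five entries
```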

03/31/2015 02:56:29 PM - pyleus.storm.component - run - ERROR: Exception in bolt.run
Traceback (most recent call last):
  File "/usr/lib/python2.7/site-packages/pyleus/storm/component.py", line 233, in run
    self.run_component()
  File "/usr/lib/python2.7/site-packages/pyleus/storm/bolt.py", line 45, in run_component
    tup = self.read_tuple()
  File "/usr/lib/python2.7/site-packages/pyleus/storm/component.py", line 291, in read_tuple
    cmd['id'], cmd['comp'], cmd['stream'], cmd['task'], cmd['tuple'])
TypeError: 'int' object has no attribute '__getitem__'
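
The traceback boils down to subscripting an integer: cmd is 56, so cmd['id'] raises TypeError (on Python 3 the message reads "'int' object is not subscriptable"). A minimal sketch of a guard that would turn this into a clearer error follows. `unpack_command` is a hypothetical helper and not the pyleus implementation; only the five keys are taken from the traceback above.

```python
def unpack_command(cmd):
    """Validate a decoded Storm message before indexing it.

    Hypothetical helper for illustration; not part of pyleus.
    """
    if not isinstance(cmd, dict):
        raise ValueError(
            "expected a dict-like Storm message, got %s: %r"
            % (type(cmd).__name__, cmd)
        )
    # Same keys that read_tuple() accesses in the traceback.
    return cmd['id'], cmd['comp'], cmd['stream'], cmd['task'], cmd['tuple']


try:
    unpack_command(56)
except ValueError as exc:
    print(exc)
```

A check like this does not fix the underlying decoding problem, but it makes the log point at the malformed message rather than at an indexing error.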
@poros
Contributor

poros commented Apr 7, 2015

Can you try to run it locally and debug the issue? Some time ago I ran a topology with tuples on the order of megabytes and it seemed to work fine (with some Storm tweaks).

@n1epan
Author

n1epan commented Apr 7, 2015

The issue is that the messages are too long for msgpack to handle. After I switched to JSON, it worked.
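
To get a rough idea of how large one such tuple is, and to confirm that it survives a JSON round trip outside Storm, a standalone sketch like the following can help. The field values and the count of 5000 ids are made up for illustration; only the Fields definition comes from this thread, and the snippet does not use pyleus at all.

```python
import json
from collections import namedtuple

Fields = namedtuple("Fields", "table action ids")

# Illustrative values only; the real table and action names are not given here.
tup = Fields(table="example_table", action="update", ids=list(range(5000)))

encoded = json.dumps(tup)      # a namedtuple is emitted as a JSON array
decoded = json.loads(encoded)  # comes back as a plain list

print(len(encoded))                              # payload size in characters
print(decoded[0], decoded[1], len(decoded[2]))   # fields are intact
```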

@poros poros added the bug label Apr 7, 2015
@poros
Contributor

poros commented Apr 7, 2015

Ah, so this is actually a bug/limitation of our msgpack serializer. Good to know, thanks.


2 participants