fitting the model using generators #16

Open
naarkhoo opened this issue Jan 18, 2021 · 2 comments

naarkhoo commented Jan 18, 2021

I am working with a huge file that cannot be fully loaded into memory, so I have to use generators.

Here is the iris example, where I am trying to read the data from the CSV file in batches:

!wget https://gist.githubusercontent.com/netj/8836201/raw/6f9306ad21398ea43cba4f7d537619d0e07d5ae3/iris.csv
    
import numpy as np
import pandas as pd
import tensorflow as tf
import tabnet

def generate_data_from_file(params):
    # Read the CSV lazily in chunks so the full file never has to sit in memory.
    data_out = pd.read_csv('iris.csv',
                           skiprows=range(1, params['skiprows']),
                           index_col=0,
                           nrows=params['train_upto_row_num'],
                           chunksize=params['num_observation'])

    for item_df in data_out:
        
        target = item_df['variety']
        item_df = item_df[["sepal.length","sepal.width","petal.length","petal.width"]]
        
        yield np.array(item_df), np.array(target)

types = (tf.float32, tf.int16)

training_params = {'skiprows': 0, 
                   'train_upto_row_num': 40, 
                   'num_observation': 5}

valid_params = {'skiprows': 40,
                'train_upto_row_num': 20,
                'num_observation': 5}

training_dataset = tf.data.Dataset.from_generator(lambda: generate_data_from_file(training_params),
                                                  output_types=types
                                                  #, output_shapes=shapes
                                                  ).repeat(1)

validation_dataset = tf.data.Dataset.from_generator(lambda: generate_data_from_file(valid_params),
                                                    output_types=types
                                                    #, output_shapes=shapes
                                                    ).repeat(1)

col_names = ['sepal_length', 'sepal_width', 'petal_length', 'petal_width']

feature_columns = []
for col_name in col_names:
    feature_columns.append(tf.feature_column.numeric_column(col_name))

    
model = tabnet.TabNetClassifier(feature_columns, num_classes=3,
                                feature_dim=8, output_dim=4,
                                num_decision_steps=4, relaxation_factor=1.0,
                                sparsity_coefficient=1e-5, batch_momentum=0.98,
                                virtual_batch_size=None, norm_type='group',
                                num_groups=1)

lr = tf.keras.optimizers.schedules.ExponentialDecay(0.01, decay_steps=100, decay_rate=0.9, staircase=False)
optimizer = tf.keras.optimizers.Adam(lr)
model.compile(optimizer, loss='categorical_crossentropy', metrics=['accuracy'])

model.fit(training_dataset, 
          epochs=100, 
          validation_data=validation_dataset, 
          verbose=2)

model.summary()

However, it gives me the following error:

ValueError: in user code:

    /opt/conda/lib/python3.8/site-packages/tensorflow/python/keras/engine/training.py:805 train_function  *
        return step_function(self, iterator)
    /opt/conda/lib/python3.8/site-packages/tabnet/tabnet.py:421 call  *
        self.activations = self.tabnet(inputs, training=training)
    /opt/conda/lib/python3.8/site-packages/tabnet/tabnet.py:213 call  *
        features = self.input_features(inputs)
    /opt/conda/lib/python3.8/site-packages/tensorflow/python/keras/engine/base_layer.py:1012 __call__  **
        outputs = call_fn(inputs, *args, **kwargs)
    /opt/conda/lib/python3.8/site-packages/tensorflow/python/keras/feature_column/dense_features.py:158 call  **
        raise ValueError('We expected a dictionary here. Instead we got: ',

    ValueError: ('We expected a dictionary here. Instead we got: ', <tf.Tensor 'IteratorGetNext:0' shape=<unknown> dtype=float32>)
    

I wonder if you can help me with formatting the data. Thanks!
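The traceback points at the tf.keras DenseFeatures layer that is built from feature_columns, and that layer expects a dictionary mapping column names to tensors rather than a single feature matrix. Below is a minimal sketch of one way to satisfy it while keeping the feature columns: have the generator yield a dict of per-column arrays whose keys match the feature-column names. The dot-to-underscore renaming and the label encoding are assumptions made here to line the CSV headers and the string labels up with the column names and integer classes used above.

# Hedged sketch, not a documented recipe: yield a dict keyed by the
# feature-column names so DenseFeatures receives the mapping it expects.
def generate_dict_data_from_file(params):
    data_out = pd.read_csv('iris.csv',
                           skiprows=range(1, params['skiprows']),
                           nrows=params['train_upto_row_num'],
                           chunksize=params['num_observation'])  # index_col dropped so sepal.length stays a column
    label_map = {'Setosa': 0, 'Versicolor': 1, 'Virginica': 2}   # assumed label encoding
    for item_df in data_out:
        item_df = item_df.rename(columns=lambda c: c.replace('.', '_'))  # "sepal.length" -> "sepal_length"
        target = item_df['variety'].map(label_map).to_numpy(dtype=np.int32)
        features = {name: item_df[name].to_numpy(dtype=np.float32) for name in col_names}
        yield features, target

dict_types = ({name: tf.float32 for name in col_names}, tf.int32)
dict_shapes = ({name: tf.TensorShape([None]) for name in col_names}, tf.TensorShape([None]))

training_dataset = tf.data.Dataset.from_generator(
    lambda: generate_dict_data_from_file(training_params),
    output_types=dict_types, output_shapes=dict_shapes)

With integer class labels like these, the loss would also need to be 'sparse_categorical_crossentropy' rather than 'categorical_crossentropy'.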


gfade commented Feb 25, 2021

I changed feature_columns to None and it started working for me, but I'm not sure why.
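For reference, a minimal sketch of that workaround, under the assumption (suggested by the traceback above) that with feature_columns=None the model skips its internal DenseFeatures layer and consumes the raw (batch, 4) float matrix the original generator already yields. The num_features=4 argument is also an assumption: the tabnet package may require it when no feature columns are given.

model = tabnet.TabNetClassifier(feature_columns=None, num_classes=3,
                                num_features=4,  # assumed to be needed when feature_columns is None
                                feature_dim=8, output_dim=4,
                                num_decision_steps=4, relaxation_factor=1.0,
                                sparsity_coefficient=1e-5, batch_momentum=0.98,
                                virtual_batch_size=None, norm_type='group',
                                num_groups=1)

That would also explain the error message: DenseFeatures only accepts a dictionary of named tensors, while the generator yields a plain tensor, so removing the feature columns removes the layer that complains.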

@mainguyenanhvu

Have you found the reason why it works when feature_columns is changed to None?
