Can I use this model as a layer of a larger model? #35

Open
Benjamim-EP opened this issue Jul 20, 2021 · 3 comments


@Benjamim-EP

I would like to know how I can use this model as in the example below:

```python
import tensorflow as tf
import tensorflow_hub as hub
from tensorflow.keras import layers


class DCNNBERTEmbedding(tf.keras.Model):

    def __init__(self,
                 nb_filters=50,
                 FFN_units=512,
                 nb_classes=2,
                 dropout_rate=0.1,
                 name="dcnn"):
        super(DCNNBERTEmbedding, self).__init__(name=name)

        # BERT embedding layer
        self.bert_layer = hub.KerasLayer(
            "https://tfhub.dev/tensorflow/bert_en_uncased_L-12_H-768_A-12/1",
            name="bert",
            trainable=False)

        self.bigram = layers.Conv1D(filters=nb_filters,
                                    kernel_size=2,
                                    padding="valid",
                                    activation="relu")
        self.trigram = layers.Conv1D(filters=nb_filters,
                                     kernel_size=3,
                                     padding="valid",
                                     activation="relu")
        self.fourgram = layers.Conv1D(filters=nb_filters,
                                      kernel_size=4,
                                      padding="valid",
                                      activation="relu")
        self.pool = layers.GlobalMaxPool1D()
        self.dense_1 = layers.Dense(units=FFN_units, activation="relu")
        self.dropout = layers.Dropout(rate=dropout_rate)
        if nb_classes == 2:
            self.last_dense = layers.Dense(units=1,
                                           activation="sigmoid")
        else:
            self.last_dense = layers.Dense(units=nb_classes,
                                           activation="softmax")

    # Embed the tokens with BERT
    def embed_with_bert(self, all_tokens):
        # bert_layer returns two outputs: the first is the pooled output for
        # the whole sentence, the second is the per-token embeddings; we only
        # want the second one here.
        _, embs = self.bert_layer([all_tokens[:, 0, :],   # input ids
                                   all_tokens[:, 1, :],   # input mask
                                   all_tokens[:, 2, :]])  # segment ids
        return embs

    # Forward pass: embed with BERT, then apply the convolutional heads
    def call(self, inputs, training):
        x = self.embed_with_bert(inputs)

        x_1 = self.bigram(x)
        x_1 = self.pool(x_1)
        x_2 = self.trigram(x)
        x_2 = self.pool(x_2)
        x_3 = self.fourgram(x)
        x_3 = self.pool(x_3)

        merged = tf.concat([x_1, x_2, x_3], axis=-1)  # (batch_size, 3 * nb_filters)
        merged = self.dense_1(merged)
        merged = self.dropout(merged, training=training)
        output = self.last_dense(merged)

        return output
```

@fabiocapsouza
Contributor

Hi @Benjamim-EP,

I am not a TensorFlow user, so unfortunately I can't give you directions. But it should be possible to adapt a working example for English BERT (or another language) using the BERTimbau TensorFlow checkpoint (weights) and config file.
I'll leave this issue open so others may help you. Please share your experience with us if you find a solution :)
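
In PyTorch, which I am more familiar with, a rough (untested) sketch of using BERTimbau as a submodule of a larger model would look something like this, assuming the transformers library and the neuralmind/bert-base-portuguese-cased checkpoint from the Hugging Face hub:

```python
import torch
from transformers import BertModel


class BertClassifier(torch.nn.Module):
    # Hypothetical wrapper (sketch only): BERTimbau as the encoder
    # layer of a larger classification model.
    def __init__(self, n_classes):
        super().__init__()
        # BERTimbau as published on the Hugging Face hub
        self.bert = BertModel.from_pretrained(
            "neuralmind/bert-base-portuguese-cased")
        self.classifier = torch.nn.Linear(self.bert.config.hidden_size,
                                          n_classes)

    def forward(self, input_ids, attention_mask=None, token_type_ids=None):
        outputs = self.bert(input_ids=input_ids,
                            attention_mask=attention_mask,
                            token_type_ids=token_type_ids)
        # pooler_output is the transformed [CLS] representation;
        # the head returns raw logits.
        return self.classifier(outputs.pooler_output)
```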

@dimitreOliveira

Hi @Benjamim-EP and @fabiocapsouza, using this model as part of another model should be straightforward. Here is a minimal example that uses BERT as the base and adds a classifier head on top:

```python
import tensorflow as tf
from transformers import TFBertModel

n_classes = 2  # number of target classes

# Load BERT with the HF API
encoder = TFBertModel.from_pretrained('path/to/bert_dir/', from_pt=True)

# Build a model composed with BERT
# Input layers (matching the tokenizer outputs)
input_ids = tf.keras.layers.Input(shape=(None,), dtype=tf.int32, name='input_ids')
token_type_ids = tf.keras.layers.Input(shape=(None,), dtype=tf.int32, name='token_type_ids')
attention_mask = tf.keras.layers.Input(shape=(None,), dtype=tf.int32, name='attention_mask')

# BERT encoder
encoded = encoder({"input_ids": input_ids,
                   "token_type_ids": token_type_ids,
                   "attention_mask": attention_mask})['pooler_output']
# Classifier head
outputs = tf.keras.layers.Dense(n_classes, activation='softmax', name='classifier')(encoded)

# Build the model
model = tf.keras.models.Model(inputs=[input_ids, token_type_ids, attention_mask], outputs=outputs)
```

The only issue I faced is that TFBertModel is not able to load the TensorFlow checkpoint files directly, so you need to load from the PyTorch weights and pass from_pt=True.
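
To feed the model, the inputs can come straight from the tokenizer; a quick sketch, assuming the tokenizer files live in the same 'path/to/bert_dir/' directory:

```python
from transformers import BertTokenizer

tokenizer = BertTokenizer.from_pretrained('path/to/bert_dir/')

# Tokenize a small batch and return TensorFlow tensors
batch = tokenizer(["primeira frase de exemplo", "segunda frase"],
                  padding=True, return_tensors='tf')

# Inputs in the same order as the model's input layers
predictions = model.predict([batch['input_ids'],
                             batch['token_type_ids'],
                             batch['attention_mask']])
```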

@jvanz

jvanz commented Mar 9, 2022

I believe this example should be in the README as an example of how to use the model with TensorFlow. ;)
