For anyone interested, here is a simple snippet to convert the TFRecord file to JSON format:
import base64
import json

import tensorflow as tf

file_path = "dev.tfrecord"


def parse_tfrecord(record):
    # Deserialize one raw record into a tf.train.Example proto.
    example = tf.train.Example()
    example.ParseFromString(record.numpy())
    return example


def read_tfrecord_file(file_path):
    raw_dataset = tf.data.TFRecordDataset(file_path)
    parsed_records = []
    for raw_record in raw_dataset:
        example = parse_tfrecord(raw_record)
        record = {}
        # Note: only the first value of each feature is kept.
        for key, value in example.features.feature.items():
            if value.bytes_list.value:
                try:
                    # Try to decode as UTF-8 string
                    record[key] = value.bytes_list.value[0].decode('utf-8')
                except UnicodeDecodeError:
                    # If decoding fails, store the bytes as a base64 string
                    record[key] = base64.b64encode(value.bytes_list.value[0]).decode('utf-8')
            elif value.float_list.value:
                record[key] = value.float_list.value[0]
            elif value.int64_list.value:
                record[key] = value.int64_list.value[0]
        parsed_records.append(record)
    return parsed_records


records = read_tfrecord_file(file_path)
json_records = json.dumps(records, indent=4)
with open('output.json', 'w') as json_file:
    json_file.write(json_records)
print("TFRecord has been converted to JSON and saved as output.json")
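One caveat, in case it matters for your use: the snippet above keeps only the first value of each feature (`value[0]`), so a feature that holds several values would be truncated. Below is a minimal sketch of a helper that keeps all of them. The names `decode_bytes` and `feature_values` are hypothetical (not part of TensorFlow or the dataset's tooling), and the choice to return a scalar for single-value features is just an assumption to keep the output similar to the original.

```python
import base64


def decode_bytes(raw: bytes) -> str:
    """Return UTF-8 text if possible, otherwise a base64 string."""
    try:
        return raw.decode("utf-8")
    except UnicodeDecodeError:
        return base64.b64encode(raw).decode("ascii")


def feature_values(bytes_values, float_values, int_values):
    """Convert one feature's value lists into a JSON-serializable value.

    Returns a scalar when the feature has exactly one value, a list when
    it has several, and None when all three lists are empty.
    """
    if bytes_values:
        values = [decode_bytes(v) for v in bytes_values]
    elif float_values:
        values = [float(v) for v in float_values]
    elif int_values:
        values = [int(v) for v in int_values]
    else:
        return None
    return values[0] if len(values) == 1 else values
```

Inside the loop over `example.features.feature.items()`, this could be used as `record[key] = feature_values(value.bytes_list.value, value.float_list.value, value.int64_list.value)`.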
Thanks for providing the code! We have added a simple script that shows how to retrieve the labels from the dataset: https://github.com/google-research/google-research/blob/master/richhf_18k/parse_tfrecord_file.py. It can be used together with this snippet to convert the dataset to JSON or other formats.
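As one example of an "other format": if a single indented JSON file turns out to be unwieldy, the parsed records could be written as JSON Lines instead (one object per line). This is only a sketch; `write_jsonl` is a hypothetical helper name, and it assumes each record is a plain dict of JSON-serializable values like the ones built in the snippet above.

```python
import json


def write_jsonl(records, path):
    """Write an iterable of dicts to `path`, one JSON object per line.

    Returns the number of records written.
    """
    count = 0
    with open(path, "w", encoding="utf-8") as f:
        for record in records:
            # ensure_ascii=False keeps non-ASCII text readable in the file.
            f.write(json.dumps(record, ensure_ascii=False) + "\n")
            count += 1
    return count
```

For instance, `write_jsonl(records, "output.jsonl")` would take the place of the `json.dumps` / `open` lines at the end of the snippet.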