A collection of text2cypher datasets, evaluations, and fine-tuning instructions
Updated Jun 13, 2024 - Jupyter Notebook
Repository for organizing datasets and papers used in Open LLM.
A data-centric AI package for ML/AI. Get the best high-quality data for the best results. Discord: https://discord.gg/t6ADqBKrdZ
A collection of LLM-related papers, theses, tools, datasets, courses, open-source models, and benchmarks
Efficiently fetch and perform sentiment analysis (Turkish Only) on eksisozluk.com entries using Rust
A collection of recent open-source math datasets for training and evaluating Math LLMs
A framework to analyze how AGI/ASI might emerge from decentralized, adaptive systems, rather than as the fruit of a single model deployment. It also aims to present orientation as a dynamic and self-evolving Magna Carta, helping to guide the emergence of such phenomena.
Convert multi-speaker audio files to structured chat data for LLMs
Source code from many well-known Python repositories, collected in this repo as plain local docs for training code AI
WikiText syntax dataset generation pipeline and open dataset for auto UI generation in TiddlyWiki. (WIP)
Synthetically Generating Intent-Aware Information-Seeking Dialogues! Useful for tasks such as training and evaluating User Intent Predictors, with the option of training and evaluating on real human dialogues. The backbone LLM of SOLID is Zephyr-7b-beta.
PARROT (Performance Assessment of Reasoning and Responses On Trivia) is a novel benchmarking framework designed to evaluate Large Language Models (LLMs) on real-world, complex, and ambiguous QA tasks.
A modified dataset consisting of English dialogs between a user and an assistant discussing movie preferences in natural language.
Collection of ETL scripts used to create a dataset of text in Spanish to train Large Language Models.