GitHub - thombashi/pytablereader: A Python library to load structured table data from files/strings/URLs in various data formats: CSV / Excel / Google-Sheets / HTML / JSON / LDJSON / LTSV / Markdown / SQLite / TSV.

pytablereader

Summary
- Features
Examples
Installation
- Install from PyPI
- Install from PPA (for Ubuntu)
Dependencies
- Optional Python packages
- Optional packages (other than Python packages)
Documentation
Related Project
Sponsors

Summary

pytablereader is a Python library to load structured table data from files/strings/URLs in various data formats: CSV / Excel / Google-Sheets / HTML / JSON / LDJSON / LTSV / Markdown / SQLite / TSV.

Features

Extract structured tabular data from various data formats:
- CSV / Tab separated values (TSV) / Space separated values (SSV)
- Microsoft Excel™ file
- Google Sheets
- HTML (table tags)
- JSON
- Labeled Tab-separated Values (LTSV)
- Line-delimited JSON(LDJSON) / NDJSON / JSON Lines
- Markdown
- MediaWiki
- SQLite database file
Supported data sources are:
- Files on a local file system
- Accessible URLs
- str instances
Loaded table data can be used as:
- pandas.DataFrame instance
- dict instance
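
For readers unfamiliar with two of the less common formats in the list above, the following standard-library sketch shows what LTSV and line-delimited JSON records look like and how each maps onto a header/rows table. It is not how pytablereader implements loading. The helper names `parse_ltsv`, `parse_ldjson`, and `to_table`, as well as the sample data, are hypothetical and exist only for illustration.

```python
import json


def parse_ltsv(text):
    """Parse LTSV: one record per line, tab-separated "label:value" fields."""
    records = []
    for line in text.splitlines():
        if not line.strip():
            continue
        # Split each field on the first colon only; values may contain colons.
        records.append(dict(field.split(":", 1) for field in line.split("\t")))
    return records


def parse_ldjson(text):
    """Parse line-delimited JSON: one JSON object per non-empty line."""
    return [json.loads(line) for line in text.splitlines() if line.strip()]


def to_table(records):
    """Collect headers in first-seen order and build rows (None for missing keys)."""
    headers = []
    for record in records:
        for key in record:
            if key not in headers:
                headers.append(key)
    rows = [[record.get(key) for key in headers] for record in records]
    return headers, rows


# Hypothetical sample data, used only for this illustration.
ltsv_text = "host:127.0.0.1\tstatus:200\ttime:08:30\nhost:10.0.0.5\tstatus:404\ttime:08:31"
ldjson_text = '{"attr_a": 1, "attr_b": 4.0}\n{"attr_a": 2, "attr_b": 2.1}'

ltsv_headers, ltsv_rows = to_table(parse_ltsv(ltsv_text))
ldjson_headers, ldjson_rows = to_table(parse_ldjson(ldjson_text))
print(ltsv_headers, ltsv_rows)
print(ldjson_headers, ldjson_rows)
```

Note that LTSV values stay as strings in this sketch, while JSON values keep the types given in the source text.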

Examples

Load a CSV table

Sample Code:

import pytablereader as ptr
import pytablewriter as ptw


# prepare data ---
file_path = "sample_data.csv"
csv_text = "\n".join([
    '"attr_a","attr_b","attr_c"',
    '1,4,"a"',
    '2,2.1,"bb"',
    '3,120.9,"ccc"',
])

with open(file_path, "w") as f:
    f.write(csv_text)

# load from a csv file ---
loader = ptr.CsvTableFileLoader(file_path)
for table_data in loader.load():
    print("\n".join([
        "load from file",
        "==============",
        "{:s}".format(ptw.dumps_tabledata(table_data)),
    ]))

# load from a csv text ---
loader = ptr.CsvTableTextLoader(csv_text)
for table_data in loader.load():
    print("\n".join([
        "load from text",
        "==============",
        "{:s}".format(ptw.dumps_tabledata(table_data)),
    ]))

Output:

load from file
==============
.. table:: sample_data

    ======  ======  ======
    attr_a  attr_b  attr_c
    ======  ======  ======
         1     4.0  a
         2     2.1  bb
         3   120.9  ccc
    ======  ======  ======

load from text
==============
.. table:: csv2

    ======  ======  ======
    attr_a  attr_b  attr_c
    ======  ======  ======
         1     4.0  a
         2     2.1  bb
         3   120.9  ccc
    ======  ======  ======
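
In the output above, the value 4 in the attr_b column is shown as 4.0, which is consistent with the whole column being treated as floating point. The following standard-library sketch illustrates this kind of per-column type inference on the same CSV text. It is an assumption-laden illustration, not pytablereader's actual implementation; the helper names `infer_column_type` and `load_typed_csv` are hypothetical.

```python
import csv
import io


def infer_column_type(values):
    """Return int if every value parses as int, else float if every value
    parses as float, else str."""
    for caster in (int, float):
        try:
            for value in values:
                caster(value)
        except ValueError:
            continue  # this type does not fit the column; try the next one
        return caster
    return str


def load_typed_csv(text):
    """Read CSV text and convert each column to its narrowest common type."""
    reader = csv.reader(io.StringIO(text))
    headers = next(reader)
    raw_rows = list(reader)
    columns = list(zip(*raw_rows))  # transpose rows into columns
    casters = [infer_column_type(column) for column in columns]
    rows = [[cast(value) for cast, value in zip(casters, row)] for row in raw_rows]
    return headers, rows


csv_text = "\n".join([
    '"attr_a","attr_b","attr_c"',
    '1,4,"a"',
    '2,2.1,"bb"',
    '3,120.9,"ccc"',
])

headers, rows = load_typed_csv(csv_text)
print(headers)
for row in rows:
    print(row)
```

Here attr_a becomes an integer column, attr_b a float column (so 4 becomes 4.0), and attr_c stays a string column.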

Get loaded table data as pandas.DataFrame instance

Sample Code:

import pytablereader as ptr

loader = ptr.CsvTableTextLoader(
    "\n".join([
        "a,b",
        "1,2",
        "3.3,4.4",
    ]))
for table_data in loader.load():
    print(table_data.as_dataframe())

Output:

     a    b
0    1    2
1  3.3  4.4

For more information

More examples are available at https://pytablereader.rtfd.io/en/latest/pages/examples/index.html

Installation

Install from PyPI

pip install pytablereader

Some of the formats require additional dependency packages. You can install them as follows:

Excel
- pip install pytablereader[excel]
Google Sheets
- pip install pytablereader[gs]
Markdown
- pip install pytablereader[md]
MediaWiki
- pip install pytablereader[mediawiki]
SQLite
- pip install pytablereader[sqlite]
Load from URLs
- pip install pytablereader[url]
All of the extra dependencies
- pip install pytablereader[all]

Install from PPA (for Ubuntu)

sudo add-apt-repository ppa:thombashi/ppa
sudo apt update
sudo apt install python3-pytablereader

Dependencies

Python 3.7+
Python package dependencies (automatically installed)

Optional Python packages

logging extras
- loguru: used for logging if the package is installed
excel extras
- excelrd
md extras
- Markdown
mediawiki extras
- pypandoc
sqlite extras
- SimpleSQLite
url extras
- retryrequests
pandas
- required to get table data as a pandas data frame
lxml
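
To see which of the optional packages above are available in the current environment, something like the following can be used. It relies only on the standard library. The import names in `OPTIONAL_MODULES` are assumptions derived from the package names listed above (for example, that the Markdown package is imported as `markdown` and SimpleSQLite as `simplesqlite`), and the helper `check_optional_modules` is hypothetical.

```python
from importlib.util import find_spec

# Label -> assumed top-level import name (assumptions, not taken from the README).
OPTIONAL_MODULES = {
    "logging": "loguru",
    "excel": "excelrd",
    "md": "markdown",
    "mediawiki": "pypandoc",
    "sqlite": "simplesqlite",
    "url": "retryrequests",
    "pandas": "pandas",
    "lxml": "lxml",
}


def check_optional_modules(modules=None):
    """Return {label: True/False} depending on whether each module can be found.

    find_spec() only locates a top-level module; it does not import it.
    """
    if modules is None:
        modules = OPTIONAL_MODULES
    return {label: find_spec(name) is not None for label, name in modules.items()}


for label, available in check_optional_modules().items():
    print("{:10s} {}".format(label, "installed" if available else "missing"))
```

A missing entry only means the corresponding optional feature may be unavailable; the core package does not depend on it.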

Optional packages (other than Python packages)

libxml2 (faster HTML conversion)
pandoc (required when loading MediaWiki files)

Documentation

https://pytablereader.rtfd.io/

Related Project

pytablewriter
- Tabular data loaded by pytablereader can be written to another tabular data format with pytablewriter.
