Parsel lets you extract data from XML/HTML documents using XPath or CSS selectors

scrapy/parsel

Parsel

Parsel is a BSD-licensed Python library to extract data from HTML, JSON, and XML documents.

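It supports CSS and XPath expressions for HTML and XML documents, JMESPath expressions for JSON documents, and regular expressions.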

Find the Parsel online documentation at https://parsel.readthedocs.org.

Example:

>>> from parsel import Selector
>>> text = """
        <html>
            <body>
                <h1>Hello, Parsel!</h1>
                <ul>
                    <li><a href="http://example.com">Link 1</a></li>
                    <li><a href="http://scrapy.org">Link 2</a></li>
                </ul>
                <script type="application/json">{"a": ["b", "c"]}</script>
            </body>
        </html>"""
>>> selector = Selector(text=text)
>>> selector.css('h1::text').get()
'Hello, Parsel!'
>>> selector.xpath('//h1/text()').re(r'\w+')
['Hello', 'Parsel']
>>> for li in selector.css('ul > li'):
...     print(li.xpath('.//@href').get())
http://example.com
http://scrapy.org
>>> selector.css('script::text').jmespath("a").get()
'b'
>>> selector.css('script::text').jmespath("a").getall()
['b', 'c']