First commit
@@ -0,0 +1,49 @@
Behold, mortal, the origins of Beautiful Soup...
================================================

Leonard Richardson is the primary maintainer.

Aaron DeVore and Isaac Muse have made significant contributions to the
code base.

Mark Pilgrim provided the encoding detection code that forms the base
of UnicodeDammit.

Thomas Kluyver and Ezio Melotti finished the work of getting Beautiful
Soup 4 working under Python 3.

Simon Willison wrote soupselect, which was used to make Beautiful Soup
support CSS selectors. Isaac Muse wrote SoupSieve, which made it
possible to _remove_ the CSS selector code from Beautiful Soup.

Sam Ruby helped with a lot of edge cases.

Jonathan Ellis was awarded the prestigious Beau Potage D'Or for his
work in solving the nestable tags conundrum.

An incomplete list of people have contributed patches to Beautiful
Soup:

Istvan Albert, Andrew Lin, Anthony Baxter, Oliver Beattie, Andrew
Boyko, Tony Chang, Francisco Canas, "Delong", Zephyr Fang, Fuzzy,
Roman Gaufman, Yoni Gilad, Richie Hindle, Toshihiro Kamiya, Peteris
Krumins, Kent Johnson, Marek Kapolka, Andreas Kostyrka, Roel Kramer,
Ben Last, Robert Leftwich, Stefaan Lippens, "liquider", Staffan
Malmgren, Ksenia Marasanova, JP Moins, Adam Monsen, John Nagle, "Jon",
Ed Oskiewicz, Martijn Peters, Greg Phillips, Giles Radford, Stefano
Revera, Arthur Rudolph, Marko Samastur, James Salter, Jouni Seppänen,
Alexander Schmolck, Tim Shirley, Geoffrey Sneddon, Ville Skyttä,
"Vikas", Jens Svalgaard, Andy Theyers, Eric Weiser, Glyn Webster, John
Wiseman, Paul Wright, Danny Yoo

An incomplete list of people who made suggestions or found bugs or
found ways to break Beautiful Soup:

Hanno Böck, Matteo Bertini, Chris Curvey, Simon Cusack, Bruce Eckel,
Matt Ernst, Michael Foord, Tom Harris, Bill de hOra, Donald Howes,
Matt Patterson, Scott Roberts, Steve Strassmann, Mike Williams,
warchild at redho dot com, Sami Kuisma, Carlos Rocha, Bob Hutchison,
Joren Mc, Michal Migurski, John Kleven, Tim Heaney, Tripp Lilley, Ed
Summers, Dennis Sutch, Chris Smith, Aaron Swartz, Stuart
Turner, Greg Edwards, Kevin J Kalupson, Nikos Kouremenos, Artur de
Sousa Rocha, Yichun Wei, Per Vognsen
@@ -0,0 +1 @@
pip
@@ -0,0 +1,31 @@
Beautiful Soup is made available under the MIT license:

Copyright (c) Leonard Richardson

Permission is hereby granted, free of charge, to any person obtaining
a copy of this software and associated documentation files (the
"Software"), to deal in the Software without restriction, including
without limitation the rights to use, copy, modify, merge, publish,
distribute, sublicense, and/or sell copies of the Software, and to
permit persons to whom the Software is furnished to do so, subject to
the following conditions:

The above copyright notice and this permission notice shall be
included in all copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS
BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN
ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.

Beautiful Soup incorporates code from the html5lib library, which is
also made available under the MIT license. Copyright (c) James Graham
and other contributors

Beautiful Soup has an optional dependency on the soupsieve library,
which is also made available under the MIT license. Copyright (c)
Isaac Muse
@@ -0,0 +1,117 @@
Metadata-Version: 2.1
Name: beautifulsoup4
Version: 4.11.2
Summary: Screen-scraping library
Home-page: https://www.crummy.com/software/BeautifulSoup/bs4/
Author: Leonard Richardson
Author-email: leonardr@segfault.org
License: MIT
Download-URL: https://www.crummy.com/software/BeautifulSoup/bs4/download/
Platform: UNKNOWN
Classifier: Development Status :: 5 - Production/Stable
Classifier: Intended Audience :: Developers
Classifier: License :: OSI Approved :: MIT License
Classifier: Programming Language :: Python
Classifier: Programming Language :: Python :: 3
Classifier: Topic :: Text Processing :: Markup :: HTML
Classifier: Topic :: Text Processing :: Markup :: XML
Classifier: Topic :: Text Processing :: Markup :: SGML
Classifier: Topic :: Software Development :: Libraries :: Python Modules
Requires-Python: >=3.6.0
Description-Content-Type: text/markdown
Requires-Dist: soupsieve (>1.2)
Provides-Extra: html5lib
Requires-Dist: html5lib ; extra == 'html5lib'
Provides-Extra: lxml
Requires-Dist: lxml ; extra == 'lxml'

Beautiful Soup is a library that makes it easy to scrape information
from web pages. It sits atop an HTML or XML parser, providing Pythonic
idioms for iterating, searching, and modifying the parse tree.

# Quick start

```
>>> from bs4 import BeautifulSoup
>>> soup = BeautifulSoup("<p>Some<b>bad<i>HTML")
>>> print(soup.prettify())
<html>
 <body>
  <p>
   Some
   <b>
    bad
    <i>
     HTML
    </i>
   </b>
  </p>
 </body>
</html>
>>> soup.find(text="bad")
'bad'
>>> soup.i
<i>HTML</i>
#
>>> soup = BeautifulSoup("<tag1>Some<tag2/>bad<tag3>XML", "xml")
#
>>> print(soup.prettify())
<?xml version="1.0" encoding="utf-8"?>
<tag1>
 Some
 <tag2/>
 bad
 <tag3>
  XML
 </tag3>
</tag1>
```

To go beyond the basics, [comprehensive documentation is available](https://www.crummy.com/software/BeautifulSoup/bs4/doc/).

# Links

* [Homepage](https://www.crummy.com/software/BeautifulSoup/bs4/)
* [Documentation](https://www.crummy.com/software/BeautifulSoup/bs4/doc/)
* [Discussion group](https://groups.google.com/group/beautifulsoup/)
* [Development](https://code.launchpad.net/beautifulsoup/)
* [Bug tracker](https://bugs.launchpad.net/beautifulsoup/)
* [Complete changelog](https://bazaar.launchpad.net/~leonardr/beautifulsoup/bs4/view/head:/CHANGELOG)

# Note on Python 2 sunsetting

Beautiful Soup's support for Python 2 was discontinued on December 31,
2020: one year after the sunset date for Python 2 itself. From this
point onward, new Beautiful Soup development will exclusively target
Python 3. The final release of Beautiful Soup 4 to support Python 2
was 4.9.3.

# Supporting the project

If you use Beautiful Soup as part of your professional work, please consider a
[Tidelift subscription](https://tidelift.com/subscription/pkg/pypi-beautifulsoup4?utm_source=pypi-beautifulsoup4&utm_medium=referral&utm_campaign=readme).
This will support many of the free software projects your organization
depends on, not just Beautiful Soup.

If you use Beautiful Soup for personal projects, the best way to say
thank you is to read
[Tool Safety](https://www.crummy.com/software/BeautifulSoup/zine/), a zine I
wrote about what Beautiful Soup has taught me about software
development.

# Building the documentation

The bs4/doc/ directory contains full documentation in Sphinx
format. Run `make html` in that directory to create HTML
documentation.

# Running the unit tests

Beautiful Soup supports unit test discovery using Pytest:

```
$ pytest
```



@@ -0,0 +1,55 @@
beautifulsoup4-4.11.2.dist-info/AUTHORS,sha256=uSIdbrBb1sobdXl7VrlUvuvim2dN9kF3MH4Edn0WKGE,2176
beautifulsoup4-4.11.2.dist-info/INSTALLER,sha256=zuuue4knoyJ-UwPPXg8fezS7VCrXJQrAP7zeNuwvFQg,4
beautifulsoup4-4.11.2.dist-info/LICENSE,sha256=VbTY1LHlvIbRDvrJG3TIe8t3UmsPW57a-LnNKtxzl7I,1441
beautifulsoup4-4.11.2.dist-info/METADATA,sha256=jVN3HgxwfZMiRGCvQ-M26DbYTna1JZHIhGoQpsF_dDY,3481
beautifulsoup4-4.11.2.dist-info/RECORD,,
beautifulsoup4-4.11.2.dist-info/WHEEL,sha256=g4nMs7d-Xl9-xC9XovUrsDHGXt-FT0E17Yqo92DEfvY,92
beautifulsoup4-4.11.2.dist-info/top_level.txt,sha256=gpUVJcTwW3q7-QGp6tAEomZsskknmgSqVe6xn1C0jJI,26
bs4/__init__.py,sha256=_8wGj0EDYCD1EKXJGc1c0cuwlt6WNrk_yf6BMrz1ATU,32908
bs4/__pycache__/__init__.cpython-39.pyc,,
bs4/__pycache__/dammit.cpython-39.pyc,,
bs4/__pycache__/diagnose.cpython-39.pyc,,
bs4/__pycache__/element.cpython-39.pyc,,
bs4/__pycache__/formatter.cpython-39.pyc,,
bs4/builder/__init__.py,sha256=KGBl_FgX1KV1wBIshW4EXlWjP3KLcRiF2opZ-zVcyAc,24393
bs4/builder/__pycache__/__init__.cpython-39.pyc,,
bs4/builder/__pycache__/_html5lib.cpython-39.pyc,,
bs4/builder/__pycache__/_htmlparser.cpython-39.pyc,,
bs4/builder/__pycache__/_lxml.cpython-39.pyc,,
bs4/builder/_html5lib.py,sha256=LnhimXrUdKujKoHHbmzwNk8OBb11YfTRFXUwhZjwqow,19078
bs4/builder/_htmlparser.py,sha256=K4wEtxzvg8Zxace9UyqNLJpy9ADZRvlpMzM8BGbPhLI,13736
bs4/builder/_lxml.py,sha256=ik6BFGnxAzV2-21S_Wc-7ZeA174muSA_ZhmpnAe3g0E,14904
bs4/dammit.py,sha256=G0cQfsEqfwJ-FIQMkXgCJwSHMn7t9vPepCrud6fZEKk,41158
bs4/diagnose.py,sha256=MRbN2bJSpa8VFt8HemqP8BK9hL5ronCxZmrfGRZYwBg,7911
bs4/element.py,sha256=NF3n9C9g8jFPNG3HhCKSIla2Kdv8o8kIJzEg_rcoqsU,87687
bs4/formatter.py,sha256=1LbCGDzW6k4FmxoP3QyLKLzzrP5Qm2eOwAawVB0nLLE,7192
bs4/tests/__init__.py,sha256=_bTVNKsMjaB0z7u_jk8WCBgfyp7cUGG-FNPl3AI-keY,49569
bs4/tests/__pycache__/__init__.cpython-39.pyc,,
bs4/tests/__pycache__/test_builder.cpython-39.pyc,,
bs4/tests/__pycache__/test_builder_registry.cpython-39.pyc,,
bs4/tests/__pycache__/test_dammit.cpython-39.pyc,,
bs4/tests/__pycache__/test_docs.cpython-39.pyc,,
bs4/tests/__pycache__/test_element.cpython-39.pyc,,
bs4/tests/__pycache__/test_formatter.cpython-39.pyc,,
bs4/tests/__pycache__/test_html5lib.cpython-39.pyc,,
bs4/tests/__pycache__/test_htmlparser.cpython-39.pyc,,
bs4/tests/__pycache__/test_lxml.cpython-39.pyc,,
bs4/tests/__pycache__/test_navigablestring.cpython-39.pyc,,
bs4/tests/__pycache__/test_pageelement.cpython-39.pyc,,
bs4/tests/__pycache__/test_soup.cpython-39.pyc,,
bs4/tests/__pycache__/test_tag.cpython-39.pyc,,
bs4/tests/__pycache__/test_tree.cpython-39.pyc,,
bs4/tests/test_builder.py,sha256=nc2JE5EMrEf-p24qhf2R8qAV5PpFiOuNpYCmtmCjlTI,1115
bs4/tests/test_builder_registry.py,sha256=7WLj2prjSHGphebnrjQuI6JYr03Uy_c9_CkaFSQ9HRo,5114
bs4/tests/test_dammit.py,sha256=MbSmRN6VEP0Rm56-w6Ja0TW8eC-8ZxOJ-wXWVf_hRi8,15451
bs4/tests/test_docs.py,sha256=xoAxnUfoQ7aRqGImwW_9BJDU8WNMZHIuvWqVepvWXt8,1127
bs4/tests/test_element.py,sha256=92oRSRoGk8gIXAbAGHErKzocx2MK32TqcQdUJ-dGQMo,2377
bs4/tests/test_formatter.py,sha256=0qV9H7mMDBcnFFH-dwNCrSm2zNi_40WMB2GMcV35PoY,4128
bs4/tests/test_html5lib.py,sha256=2-ipm-_MaPt37WTxEd5DodUTNhS4EbLFKPRaO6XSCW4,8322
bs4/tests/test_htmlparser.py,sha256=rcRtGJR-VOhc7-1c2CjlvMBMVncvmOA8JaWis-XUL4k,5119
bs4/tests/test_lxml.py,sha256=Pn__rhKDKd7vcvRLRRF28N0y8xxnfg6OG6ZQLxZ9_j4,7504
bs4/tests/test_navigablestring.py,sha256=RGSgziNf7cZnYdEPsoqL1B2I68TUJp1JmEQVxbh_ryA,5081
bs4/tests/test_pageelement.py,sha256=E5GaojisoP3IgAXzRmW3uZ_PuF3ztoIngHoED3aXQJg,28009
bs4/tests/test_soup.py,sha256=GVmaD6ngxs9yTtenuODPVFYGBabvm2qFiDMXqN0ORGU,18035
bs4/tests/test_tag.py,sha256=f19uie7QehvgvhIqNWfjDRR4TKa-ftm_RRoo6LXZyqk,9016
bs4/tests/test_tree.py,sha256=n9nTQOzJb3-ZnZ6AkmMdZQ5TYcTUPnqHoVgal0mYXfg,48129
@@ -0,0 +1,5 @@
Wheel-Version: 1.0
Generator: bdist_wheel (0.34.2)
Root-Is-Purelib: true
Tag: py3-none-any

@@ -0,0 +1,3 @@
bs4
bs4/builder
bs4/tests
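
The RECORD hunk in this commit lists each installed file as `path,sha256=<digest>,size`, with empty hash and size fields for RECORD itself and for the `.pyc` files. The following is a minimal sketch of how a single entry could be checked, assuming the digest is the URL-safe base64 encoding of the SHA-256 hash with the trailing `=` padding stripped, as the wheel specification describes. The helper names `record_digest` and `check_record_line` are hypothetical and belong to neither pip nor Beautiful Soup.

```python
import base64
import csv
import hashlib


def record_digest(data: bytes) -> str:
    """Return the RECORD-style hash field for the given file contents."""
    digest = hashlib.sha256(data).digest()
    # Assumption: URL-safe base64 with the '=' padding removed.
    encoded = base64.urlsafe_b64encode(digest).rstrip(b"=").decode("ascii")
    return "sha256=" + encoded


def check_record_line(line: str, data: bytes) -> bool:
    """Check one 'path,hash,size' RECORD line against file contents."""
    # RECORD is a CSV file, so let the csv module handle any quoting.
    _path, expected_hash, expected_size = next(csv.reader([line]))
    if not expected_hash and not expected_size:
        # Entries without hash and size (RECORD, .pyc files) cannot be checked.
        return True
    return expected_hash == record_digest(data) and int(expected_size) == len(data)


if __name__ == "__main__":
    installer_line = (
        "beautifulsoup4-4.11.2.dist-info/INSTALLER,"
        "sha256=zuuue4knoyJ-UwPPXg8fezS7VCrXJQrAP7zeNuwvFQg,4"
    )
    # The recorded size of 4 implies the file is "pip" plus a newline.
    print(check_record_line(installer_line, b"pip\n"))
```

In practice the bytes would be read from the installed file named in the first field. The INSTALLER entry is used here only because its contents are short enough to write inline.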