Python BeautifulSoup - Use BeautifulSoup's find() Method

How To Use BeautifulSoup's find() Method

BeautifulSoup's .find() method is a powerful tool for finding the first element in an HTML or XML page that matches your query criteria.

In this guide, we will look at the various ways you can use the .find() method to extract the data you need:

BeautifulSoup .find() Method
Find By Class And Ids
Find By Text
Find With Multiple Criteria
Find Using Regex
Find Using Custom Functions

If you would like to find all the elements that match your query criteria then use the find_all() method.
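As a quick illustration of the difference, here is a minimal sketch that runs both methods against the same markup. The HTML snippet and the variable names (first_link, all_links) are made up for this example.

```python
from bs4 import BeautifulSoup

html_doc = """
<ul>
    <li><a href="http://example.com">Link 1</a></li>
    <li><a href="http://scrapy.org">Link 2</a></li>
</ul>
"""

soup = BeautifulSoup(html_doc, "html.parser")

# .find() returns a single element: the first match
first_link = soup.find("a")

# .find_all() returns a list-like collection of every match
all_links = soup.find_all("a")

print(first_link.get_text())
# --> Link 1

print([link.get_text() for link in all_links])
# --> ['Link 1', 'Link 2']
```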

BeautifulSoup `.find()` Method

Use the .find() method when only one element matches your query criteria, or when you only need the first match.

The .find() method returns the first element that matches your query criteria, or None if nothing matches.

To use the .find() method, pass the name of the tag you want to find as an argument, for example .find('h1'). In this case, we want to find the first <h1> tag on an HTML page.

from bs4 import BeautifulSoup

html_doc = """
<html>
    <body>
        <h1>Hello, BeautifulSoup!</h1>
        <ul>
            <li><a href="http://example.com">Link 1</a></li>
            <li><a href="http://scrapy.org">Link 2</a></li>
        </ul>
    </body>
</html>
"""

soup = BeautifulSoup(html_doc, 'html.parser')

## Find first <h1> Tag
print(soup.find('h1'))
## --> <h1>Hello, BeautifulSoup!</h1>


## Get the text inside the first <h1> Tag
print(soup.find('h1').get_text())
## --> Hello, BeautifulSoup!

For more details, check out the find() section of the official BeautifulSoup documentation.
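One thing to watch for: when nothing matches, .find() returns None, so chaining .get_text() directly onto the result raises an AttributeError. The sketch below shows one way to guard against that. The first_text helper is a hypothetical convenience function written for this example, not part of BeautifulSoup.

```python
from bs4 import BeautifulSoup

html_doc = "<html><body><h1>Hello, BeautifulSoup!</h1></body></html>"
soup = BeautifulSoup(html_doc, "html.parser")

# There is no <h2> in this document, so .find() returns None
heading = soup.find("h2")
print(heading)
# --> None


def first_text(soup, tag_name, default=""):
    """Hypothetical helper: text of the first matching tag, or a default."""
    element = soup.find(tag_name)
    if element is None:
        return default
    return element.get_text(strip=True)


print(first_text(soup, "h1"))
# --> Hello, BeautifulSoup!

print(first_text(soup, "h2", default="n/a"))
# --> n/a
```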

Find By Class And Ids

The .find() method lets you find the first element on the page that matches a class name, an id, or any other attribute. Classes and ids have their own keyword arguments (class_ and id), while arbitrary attributes go in the attrs parameter.

For example, here is how to find the first <p> tag with a given class, id, or attribute:

## <p> Tag + Class Name
soup.find('p', class_='class_name')

## <p> Tag + Id
soup.find('p', id='id_name')

## <p> Tag + Any Attribute
soup.find('p', attrs={"aria-hidden": "true"})
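To see these three queries return something, here is a small self-contained sketch. The HTML and the class, id, and attribute values are invented for illustration. Note that aria-hidden has to go through attrs because a hyphenated name is not a valid Python keyword argument.

```python
from bs4 import BeautifulSoup

html_doc = """
<div>
    <p class="intro lead">First paragraph</p>
    <p id="summary">Second paragraph</p>
    <p aria-hidden="true">Third paragraph</p>
</div>
"""

soup = BeautifulSoup(html_doc, "html.parser")

# class_ matches if the tag has this class among its classes
by_class = soup.find("p", class_="intro")
print(by_class.get_text())
# --> First paragraph

# id is matched as a plain keyword argument
by_id = soup.find("p", id="summary")
print(by_id.get_text())
# --> Second paragraph

# Any other attribute can be matched through the attrs dictionary
by_attr = soup.find("p", attrs={"aria-hidden": "true"})
print(by_attr.get_text())
# --> Third paragraph
```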

Find By Text

The .find() method also lets you search by text using the string parameter. It returns the first string that exactly matches the value you pass.

## Strings that exactly match 'Link 1'
soup.find(string="Link 1")
## --> 'Link 1'

If you want to find the first string that contains your substring, use a regular expression:

import re

## Strings that contain 'Link'
soup.find(string=re.compile("Link"))
## --> 'Link 1'
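Searching with string alone returns the text node itself rather than the tag around it. If you need the tag, for instance to read an href, the sketch below shows two options: step up to the text node's parent, or combine a tag name with string. The HTML mirrors the earlier sample and the variable names are illustrative.

```python
import re

from bs4 import BeautifulSoup

html_doc = """
<ul>
    <li><a href="http://example.com">Link 1</a></li>
    <li><a href="http://scrapy.org">Link 2</a></li>
</ul>
"""

soup = BeautifulSoup(html_doc, "html.parser")

# string= on its own returns a NavigableString, not a Tag
text_node = soup.find(string="Link 2")
print(type(text_node).__name__)
# --> NavigableString

# Option 1: walk up to the enclosing tag
print(text_node.parent["href"])
# --> http://scrapy.org

# Option 2: pass a tag name as well, so .find() returns the <a> tag
link = soup.find("a", string=re.compile("2$"))
print(link["href"])
# --> http://scrapy.org
```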

Find With Multiple Criteria

If you need to match on several attributes at once, pass them all in the attrs parameter:

## <p> Tag + Class Name & Id
soup.find('p', attrs={"class": "class_name", "id": "id_name"})
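Here is a runnable sketch of a multi-attribute query, using invented markup in which only the last <p> satisfies both conditions. It also shows that the same query can be written with the class_ and id keyword arguments.

```python
from bs4 import BeautifulSoup

html_doc = """
<div>
    <p class="class_name">Only the class</p>
    <p class="class_name" id="other_id">Class but wrong id</p>
    <p class="class_name" id="id_name">Class and id</p>
</div>
"""

soup = BeautifulSoup(html_doc, "html.parser")

# Every key in attrs must match for a tag to be returned
match = soup.find("p", attrs={"class": "class_name", "id": "id_name"})
print(match.get_text())
# --> Class and id

# Equivalent query using keyword arguments
same_match = soup.find("p", class_="class_name", id="id_name")
print(same_match.get_text())
# --> Class and id
```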

Find Using Regex

The .find() method also supports regular expressions.

Simply pass a compiled regex to the .find() method in place of a string.

For example, here we use the .find() method with a regular expression to find the first tag whose name starts with the letter b:

import re

## Find First Tag Whose Name Starts With The Letter 'b'
soup.find(re.compile("^b"))
## --> <body>...</body>
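Compiled patterns can be used for attribute values as well as tag names. The following sketch, built on made-up markup, finds the first heading tag of any level and then the first link whose href starts with https.

```python
import re

from bs4 import BeautifulSoup

html_doc = """
<html>
    <body>
        <h1>Title</h1>
        <h2>Subtitle</h2>
        <a href="http://example.com">Link 1</a>
        <a href="https://scrapy.org">Link 2</a>
    </body>
</html>
"""

soup = BeautifulSoup(html_doc, "html.parser")

# Tag name regex: first tag named h1 through h6
heading = soup.find(re.compile(r"^h[1-6]$"))
print(heading.name, heading.get_text())
# --> h1 Title

# Attribute value regex: first <a> whose href starts with https://
secure_link = soup.find("a", href=re.compile(r"^https://"))
print(secure_link.get_text())
# --> Link 2
```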

Find Using Custom Functions

If you need to make very complex queries, you can also pass a function to the .find() method. The function receives each tag in turn and should return True for the tag you want:

def custom_selector(tag):
    # Match "span" tags that have "target_span" among their classes
    return tag.name == "span" and "target_span" in tag.get("class", [])

soup.find(custom_selector)
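To try the selector end to end, here is a self-contained sketch with sample markup invented for the purpose. It uses tag.get("class", []) so that spans without a class attribute are skipped rather than causing an error.

```python
from bs4 import BeautifulSoup

html_doc = """
<div>
    <span>No class at all</span>
    <span class="other_span">Wrong class</span>
    <span class="target_span highlighted">Target</span>
</div>
"""

soup = BeautifulSoup(html_doc, "html.parser")


def custom_selector(tag):
    # Called once per tag; return True for the tag we want
    return tag.name == "span" and "target_span" in tag.get("class", [])


result = soup.find(custom_selector)
print(result.get_text())
# --> Target
```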
