- Python Automation Cookbook
- Jaime Buelta
- 2021-06-30 14:52:59
Using Selenium for advanced interaction
Sometimes, nothing short of the real thing will work. Selenium is a project for automating web browsers. It was conceived as a tool for automated testing, but it can also be used to automate interactions with a site.
Selenium can control Safari, Chrome, Firefox, Internet Explorer, or Microsoft Edge, though each browser requires its own driver to be installed. We'll use Chrome.
Getting ready
We need to install the right driver for Chrome, called chromedriver. It is available for most platforms here: https://sites.google.com/a/chromium.org/chromedriver/. It also requires that you have Chrome installed: https://www.google.com/chrome/.
Add the selenium module to requirements.txt and install it:
$ echo "selenium==3.141.0" >> requirements.txt
$ pip install -r requirements.txt
How to do it...
- Import Selenium, start a browser and load the form page. A page will open reflecting the operations:
>>> from selenium import webdriver
>>> browser = webdriver.Chrome()
>>> browser.get('https://httpbin.org/forms/post')
Note the banner in Chrome showing it is being controlled by automated test software.
- Add a value in the Customer name field. Remember that it is called custname:
>>> custname = browser.find_element_by_name("custname")
>>> custname.clear()
>>> custname.send_keys("Sean O'Connell")
Figure 3.6: Form being filled automatically
- Set the pizza size to medium:
>>> for size_element in browser.find_elements_by_name("size"):
...     if size_element.get_attribute('value') == 'medium':
...         size_element.click()
...
This will set the Pizza Size radio button.
- Add bacon and cheese:
>>> for topping in browser.find_elements_by_name('topping'):
...     if topping.get_attribute('value') in ['bacon', 'cheese']:
...         topping.click()
...
Finally, the checkboxes will appear as marked:
Figure 3.7: Form with checked boxes
- Submit the form:
>>> browser.find_element_by_tag_name('form').submit()
The form will be submitted and the result from the server will be displayed:
Figure 3.8: Returned JSON information
- Close the browser:
>>> browser.quit()
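The six steps above can also be collected into a single script. The following is a minimal sketch, assuming the Selenium 3 API pinned in requirements.txt (the find_element_by_X methods are not available in later major releases). The names FORM_URL, fill_order_form, and run are hypothetical helpers introduced for illustration, not part of Selenium. The try/finally block is there so that the browser is closed even if one of the steps fails.

```python
FORM_URL = 'https://httpbin.org/forms/post'


def fill_order_form(browser, name, size, toppings):
    """Fill in and submit the pizza form using an already started browser."""
    # Step 1 (second half): load the form page
    browser.get(FORM_URL)

    # Step 2: type the customer name into the custname field
    custname = browser.find_element_by_name('custname')
    custname.clear()
    custname.send_keys(name)

    # Step 3: click the radio button whose value matches the size
    for size_element in browser.find_elements_by_name('size'):
        if size_element.get_attribute('value') == size:
            size_element.click()

    # Step 4: click every requested topping checkbox
    for topping in browser.find_elements_by_name('topping'):
        if topping.get_attribute('value') in toppings:
            topping.click()

    # Step 5: submit the form
    browser.find_element_by_tag_name('form').submit()


def run():
    # Imported here so that fill_order_form can be reused with any
    # browser object; needs selenium==3.141.0 and chromedriver installed.
    from selenium import webdriver

    browser = webdriver.Chrome()
    try:
        fill_order_form(browser, "Sean O'Connell", 'medium',
                        ['bacon', 'cheese'])
    finally:
        # Step 6: always close the browser
        browser.quit()
```

Calling run() should open Chrome, fill in the form, submit it, and close the browser again.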
How it works...
Step 1 in the How to do it… section shows how to start a browser controlled by Selenium and go to a particular URL.
Selenium works in a similar way to Beautiful Soup: you select an element and then manipulate it. The selectors are also similar, with the most common ones being find_element_by_id, find_element_by_class_name, find_element_by_name, find_element_by_tag_name, and find_element_by_css_selector.
There are equivalent find_elements_by_X actions (such as find_elements_by_tag_name, find_elements_by_name, and more) that return a list of all the matching elements instead of only the first one found. This is also useful when checking whether an element is there or not. If there are no elements, find_element will raise an error while find_elements will return an empty list.
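That difference makes an existence check straightforward. Here is a minimal sketch, assuming the Selenium 3 API used in this recipe; has_element is a hypothetical helper name, not a Selenium function:

```python
def has_element(browser, name):
    """Return True if at least one element has the given name attribute."""
    # find_elements_by_name returns an empty list when nothing matches,
    # so there is no exception to catch
    return len(browser.find_elements_by_name(name)) > 0
```

With the form page loaded, has_element(browser, 'custname') checks for the Customer name field without risking the error that find_element_by_name raises when nothing matches.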
Data on the elements can be obtained through .get_attribute() for HTML attributes (such as the values on the form elements) or .text for the text of the element.
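As an illustration of .get_attribute(), the values offered by a group of form elements can be collected in one pass. This is a minimal sketch, assuming the Selenium 3 API used in this recipe; element_values is a hypothetical helper name:

```python
def element_values(browser, name):
    """Return the value attribute of every element with the given name."""
    return [element.get_attribute('value')
            for element in browser.find_elements_by_name(name)]
```

With the form page loaded, element_values(browser, 'topping') lists the values that can be passed to the check in step 4.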
Elements can be manipulated by simulating keystrokes to input text with the method .send_keys(), sending clicks with .click(), or submitting the form with .submit(). Note that .click() will select/deselect in the same way that a click of the mouse will.
Finally, step 6 closes the browser.
There's more...
Here is the Python Selenium documentation: http://selenium-python.readthedocs.io/.
For each of the elements, there's extra information that can be extracted, such as .is_displayed() or .is_selected(). Links can be located by their text using .find_element_by_link_text() and .find_element_by_partial_link_text().
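Because .click() toggles a checkbox, clicking one that is already marked will unmark it. One possible use of .is_selected() is to click only when the state needs to change. This is a minimal sketch; set_checked is a hypothetical helper name, not part of Selenium:

```python
def set_checked(element, checked=True):
    """Click the element only if its state differs from the desired one."""
    if element.is_selected() != checked:
        element.click()
```

Calling set_checked(topping) twice on the same checkbox leaves it marked, while two plain .click() calls would leave it unmarked.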
Sometimes, opening a visible browser window can be inconvenient. An alternative is to start the browser in headless mode and manipulate it from there, like this:
>>> from selenium.webdriver.chrome.options import Options
>>> chrome_options = Options()
>>> chrome_options.add_argument("--headless")
>>> browser = webdriver.Chrome(options=chrome_options)
>>> browser.get('https://httpbin.org/forms/post')
The page won't be displayed, but a screenshot can be saved anyway with the following line:
>>> browser.save_screenshot('screenshot.png')
See also
- The Parsing HTML recipe, earlier in this chapter, to learn how to parse elements in HTML.
- The Interacting with forms recipe, earlier in this chapter, to see alternatives to dealing with forms.