14.2. Get news links from faculty webpages
Let’s say that you want to get the link to the first news article on each of your favorite UMSI faculty members’ webpages.

But clicking through to gather all those links would be a pain. Fortunately, we can do that task with BeautifulSoup!
Run the code below to see what it collects.
This code is made up of three plans. Click on each of the plans below to learn more about it.
Plan 3: Get a soup from multiple URLs

    # Load libraries for web scraping
    from bs4 import BeautifulSoup
    import requests

    # Get a soup from multiple URLs
    base_url = 'https://web.archive.org/web/20230128074139/https://www.si.umich.edu/people/'
    endings = ['barbara-ericson', 'steve-oney', 'paul-resnick']
    for ending in endings:
        url = base_url + ending
        r = requests.get(url)
        soup = BeautifulSoup(r.content, 'html.parser')
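Plan 4: Get info from a single tag

    # These lines are indented so that they run inside the for loop
    # from Plan 3, once for each page's soup.
        # Get first tag of a certain type from the soup
        tag = soup.find('a', class_='item-teaser--heading-link')
        # Get info from tag
        info = tag.get('href')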
Plan 9: Print info

    # Print the info
    print(info)
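One thing to watch for in Plan 4: `soup.find` returns `None` when no matching tag exists, so `tag.get('href')` would raise an `AttributeError` on a page that has no news teaser. Here is a minimal sketch of one way to guard against that. The helper name `first_news_link` and the two sample HTML strings are hypothetical, made up for illustration; only the class name `item-teaser--heading-link` comes from the code above.

```python
from bs4 import BeautifulSoup


def first_news_link(html):
    """Return the href of the first news teaser link, or None if there is none.

    Hypothetical helper: it wraps Plan 4 and adds a check for a missing tag.
    """
    soup = BeautifulSoup(html, 'html.parser')
    # Get first tag of a certain type from the soup
    tag = soup.find('a', class_='item-teaser--heading-link')
    if tag is None:
        # No matching link on this page
        return None
    # Get info from tag
    return tag.get('href')


# Made-up HTML snippets, standing in for real faculty pages
with_news = '<div><a class="item-teaser--heading-link" href="/news/example">Example</a></div>'
without_news = '<div><p>No news here</p></div>'

print(first_news_link(with_news))     # /news/example
print(first_news_link(without_news))  # None
```

Inside the loop from Plan 3, this could be called as `info = first_news_link(r.content)`, printing `info` only when it is not `None`.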