How to scrape data using Python?

Photo by Sigmund on Unsplash
  1. requests
  2. BeautifulSoup (if it is not installed, run pip install beautifulsoup4 in your environment)
import requests
from bs4 import BeautifulSoup

URL = 'https://www.redbus.com/info/aboutus'
response = requests.get(URL)
Soup = BeautifulSoup(response.content, 'html.parser')
# Printing the soup with prettify() shows the parsed HTML as an indented tree.
print(Soup.prettify())
# The output looks like this:
Fig01: Printing the soup object
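Before parsing, it can help to confirm that the request actually succeeded. The sketch below is one possible way to do that, not part of the original code: the helper names make_soup and fetch_soup and the 10-second timeout are assumptions chosen for illustration, and the inline sample markup stands in for the real page so the parsing step can be tried offline.

```python
import requests
from bs4 import BeautifulSoup


def make_soup(html):
    """Parse an HTML string or bytes with Python's built-in html.parser."""
    return BeautifulSoup(html, "html.parser")


def fetch_soup(url, timeout=10):
    """Download a page and return its soup (hypothetical helper)."""
    response = requests.get(url, timeout=timeout)
    # Raise requests.HTTPError on 4xx/5xx responses instead of parsing an error page.
    response.raise_for_status()
    return make_soup(response.content)


# Offline check with a tiny inline document (sample markup, not the real page).
sample = make_soup("<html><body><h3>About us</h3></body></html>")
print(sample.prettify())
```

With a network connection, fetch_soup(URL) would take the place of the requests.get and BeautifulSoup calls shown earlier.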
  1. About us Title
  2. Description of About us
  3. Management Team Title
  4. Names of the Management Team
  5. Description of Management Team
RedbusAbout = Soup.find_all(attrs={'class': 'Red XCN'})[0].text
Soup.find_all(attrs={'class': 'Red XCN'})
# the output will be:
# [<h3 class="Red XCN">About us</h3>,
#  <h3 class="Red XCN" id="BirdsEye">Management Team</h3>]
# Here [0] selects index 0, which extracts only the first matching tag.
Soup.find_all(attrs={'class': 'Red XCN'})[0]
# the output for this code is: <h3 class="Red XCN">About us</h3>
Soup.find_all(attrs={'class': 'Red XCN'})[0].text
# output: 'About us'
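The three steps above (find every match, pick one by index, keep only its text) can be tried offline on a small inline sample. This is a minimal sketch: the markup below is modelled on the two h3 tags shown in the output and is an assumption, not a copy of the real page.

```python
from bs4 import BeautifulSoup

# Sample markup modelled on the two <h3> tags shown in the output above.
html = """
<h3 class="Red XCN">About us</h3>
<h3 class="Red XCN" id="BirdsEye">Management Team</h3>
"""
soup = BeautifulSoup(html, "html.parser")

matches = soup.find_all(attrs={"class": "Red XCN"})  # list with both <h3> tags
first_tag = matches[0]                               # the first <h3> tag only
first_text = first_tag.text                          # just the text inside it

print(len(matches))  # 2
print(first_text)    # About us
```

Changing the index to 1 would pick the second tag, which is how the Management Team title is read later on.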
RedbusAbout = Soup.find_all(attrs={'class': 'Red XCN'})[0].text
AboutRedBus = Soup.find_all(attrs={'class': 'western'})[0].text
ManagementTitle = Soup.find_all(attrs={'class': 'Red XCN'})[1].text
Name1 = Soup.find_all(attrs={'class': 'Red TextBold XCN'})[0].text[0:14]
Name2 = Soup.find_all(attrs={'class': 'Red TextBold XCN'})[1].text[0:11]
Name1Desc = Soup.find_all(attrs={'class': 'western'})[1].text
Name2Desc = Soup.find_all(attrs={'class': 'western'})[2].text
lastline = Soup.find_all('div', attrs={'class': False, 'id': False})[3].text.replace('\n', '')
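Indexing straight into the result list raises an IndexError if the page layout changes and a match goes missing. The sketch below shows one hedged alternative: a small helper that returns None instead, with the results collected into a dictionary. The helper name text_at, the dictionary keys, and the sample markup are all assumptions for illustration. It also shows what the False values in the last filter do: they match only tags that do not have that attribute at all.

```python
from bs4 import BeautifulSoup

# Sample markup (an assumption for illustration, not the real page).
html = """
<div class="box"><h3 class="Red XCN">About us</h3></div>
<div id="intro"><p class="western">We sell bus tickets.</p></div>
<div>
Closing line
</div>
"""
soup = BeautifulSoup(html, "html.parser")


def text_at(soup, attrs, index, name=None):
    """Return the text of the index-th match, or None if there is no such match."""
    matches = soup.find_all(name, attrs=attrs)
    if index >= len(matches):
        return None
    # Drop newlines and surrounding whitespace, as the lastline example does.
    return matches[index].text.replace("\n", "").strip()


record = {
    "about_title": text_at(soup, {"class": "Red XCN"}, 0),
    "about_text": text_at(soup, {"class": "western"}, 0),
    # The sample has only one 'Red XCN' tag, so index 1 gives None.
    "management_title": text_at(soup, {"class": "Red XCN"}, 1),
    # False means "attribute absent": only the <div> with no class and no id matches.
    "last_line": text_at(soup, {"class": False, "id": False}, 0, name="div"),
}
print(record)
```

The fixed-length slices such as [0:14] in the original code depend on the exact length of each name, so they would need updating if the names on the page change.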


2.5 years, experienced ML Engineer & thriving analyst with the ability to apply ML techniques & Statistical approaches to solve real-world business problems.

Siva Santosh S
