How to Scrape Google Search Results Data Using MechanicalSoup


Web scraping is the process of extracting data from websites automatically. It is a powerful technique that allows you to gather large amounts of data quickly and efficiently. In this blog post, we’ll learn how to scrape Google Search results data using the MechanicalSoup library in Python.

Prerequisites

Before we start, you’ll need to have the following installed on your system:

  • Python 3.x
  • MechanicalSoup library
  • BeautifulSoup4 library
  • Requests library

You can install these libraries using pip (MechanicalSoup already pulls in BeautifulSoup and Requests as dependencies, so the extra installs are harmless but optional):

pip install mechanicalsoup
pip install beautifulsoup4
pip install requests

Step 1: Import the Required Libraries

import mechanicalsoup
import requests
from bs4 import BeautifulSoup
import csv
import json

Step 2: Connect to the Website

# Connect to the website through the QuickScraper proxy API
from urllib.parse import quote_plus

browser = mechanicalsoup.StatefulBrowser()
access_token = '6JrJjz0MVZ7EBN584a'  # Access token from app.quickscraper.co
target_url = "https://www.google.com/search?q=laptop&rlz=1C1CHBF_enIN979IN979&oq=laptop&gs_lcrp=EgZjaHJvbWUqDAgAEEUYOxixAxiABDIMCAAQRRg7GLEDGIAEMgYIARBFGEAyDQgCEAAYgwEYsQMYgAQyCggDEAAYsQMYgAQyDQgEEAAYgwEYsQMYgAQyBggFEEUYPTIGCAYQRRg8MgYIBxBFGDzSAQc5NTVqMGo3qAIAsAIA&sourceid=chrome&ie=UTF-8"
# URL-encode the target URL so its own query parameters are not
# mistaken for parameters of the QuickScraper API call itself
url = f"https://api.quickscraper.co/parse?access_token={access_token}&url={quote_plus(target_url)}"
page = browser.get(url)

Note: The code above routes the request through the api.quickscraper.co service to bypass Google’s anti-scraping measures. You’ll need to replace the access_token value with your own token from the service.
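Before parsing, it’s worth confirming that the request actually succeeded. MechanicalSoup’s `browser.get()` returns a standard Requests response, so the usual `status_code` attribute applies. A minimal sketch (the `raise_if_failed` helper name is our own):

```python
def raise_if_failed(response):
    """Raise a descriptive error when the scrape request did not succeed."""
    if response.status_code != 200:
        raise RuntimeError(
            f"Request failed with HTTP {response.status_code}: "
            f"{response.text[:200]}"
        )
    return response

# Usage: page = raise_if_failed(browser.get(url))
```

Failing fast here gives a much clearer error than a silent empty result list later on.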

Step 3: Parse the HTML

# Parse HTML
soup = BeautifulSoup(page.content, 'html.parser')
# Note: Google changes these class names frequently; if no results are
# found, inspect the page source and update this list
items = soup.find_all('div', class_=['g', 'Ww4FFb', 'vt6azd', 'asEBEc', 'tF2Cxc'])

Step 4: Extract the Search Results Data

def first_text(item, tag, classes):
    # Return the stripped text of the first matching element, or None
    element = item.find(tag, class_=classes)
    return element.text.strip() if element else None

google_search_items = []

for item in items:
    title = first_text(item, 'h3', ['LC20lb', 'MBeuO', 'DKV0Md'])
    description = first_text(item, 'div', ['VwiC3b', 'yXK7lf', 'lVm3ye', 'r025kc', 'hJNv6b', 'Hdw6tb'])
    url_element = item.find('a', {'jsname': 'UWckNb'})
    url = url_element.get('href') if url_element else None

    google_search_items.append({
        "title": title,
        "description": description,
        "url": url,
    })

Step 5: Save the Data to a JSON File

with open("google_search_items.json", "w", encoding="utf-8") as file:
    json.dump(google_search_items, file, indent=4, ensure_ascii=False)
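Since the script already imports the csv module, the same results can just as easily be written to a CSV file, which is handy for opening in a spreadsheet. A sketch using csv.DictWriter (the `save_to_csv` helper name is our own):

```python
import csv

def save_to_csv(items, path="google_search_items.csv"):
    """Write a list of result dicts to a CSV file with a header row."""
    if not items:
        return
    with open(path, "w", newline="", encoding="utf-8") as file:
        # Take the column names from the first result dict
        writer = csv.DictWriter(file, fieldnames=list(items[0]))
        writer.writeheader()
        writer.writerows(items)

# Usage: save_to_csv(google_search_items)
```

The `newline=""` argument is required when writing CSV files on Windows to avoid blank rows between records.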

Conclusion

Congratulations! You’ve learned how to scrape Google Search results data using the MechanicalSoup library in Python. This technique can be useful for various purposes, such as data analysis, market research, or content aggregation. However, it’s essential to respect website terms of service and use web scraping responsibly.

Remember to replace the access_token value with your own token from the app.quickscraper.co service, as using the provided token may result in errors or rate limiting.

Happy scraping!
