In this blog post, we will learn how to scrape stock prices from a financial website using Python. We will be using the requests and BeautifulSoup libraries to fetch the HTML content of a webpage and then parse it to extract the desired data.
1. Import necessary libraries
Python
import requests
from bs4 import BeautifulSoup
import json
- requests: This library is used to make HTTP requests to websites and retrieve their content.
- BeautifulSoup: This library is used to parse HTML content and extract data from it.
- json: This library is used to work with JSON data.
2. Obtain a Quick Scraper Access Token
To use the Quick Scraper API, you will need an access token. You can get a free token by signing up for an account on the Quick Scraper website.
3. Set up the API request
Python
access_token = 'YOUR_ACCESS_TOKEN' # Replace with your actual access token
url = f"https://api.quickscraper.co/parse?access_token={access_token}&url=https://seekingalpha.com/symbol/AAPL"
We are using the Quick Scraper API to simplify the process of fetching the HTML content of the target webpage. The API takes two arguments:
- access_token: Your Quick Scraper access token.
- url: The URL of the webpage you want to scrape.
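The target URL is itself a query parameter, so characters such as ":" and "/" are normally percent-encoded before they are embedded in another URL. Here is a minimal sketch of building the request URL that way with the standard library. The helper name build_api_url is hypothetical, and whether the Quick Scraper API requires the encoded form is an assumption; the endpoint and parameter names are taken from the URL above.

```python
from urllib.parse import urlencode

# Endpoint copied from the request URL used in this post.
API_ENDPOINT = "https://api.quickscraper.co/parse"


def build_api_url(access_token: str, target_url: str) -> str:
    """Return the API URL with both query parameters percent-encoded."""
    query = urlencode({"access_token": access_token, "url": target_url})
    return f"{API_ENDPOINT}?{query}"


url = build_api_url("YOUR_ACCESS_TOKEN", "https://seekingalpha.com/symbol/AAPL")
print(url)
```

The same result can be had by passing the dictionary to requests.get through its params argument, which encodes the values for you.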
4. Make the API request and parse the HTML content
Python
response = requests.get(url)
soup = BeautifulSoup(response.text, 'html.parser')
- The requests.get(url) method sends a GET request to the specified URL and retrieves the HTML content of the webpage.
- The BeautifulSoup library is used to parse the HTML content into a tree-like structure that can be easily navigated.
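The request above assumes the call always succeeds. A slightly more defensive version might look like the sketch below, which adds a timeout and stops on HTTP error responses before any parsing happens. The function name fetch_html and the 30-second default are illustrative choices, not part of the Quick Scraper API.

```python
import requests


def fetch_html(url: str, timeout: float = 30.0) -> str:
    """Fetch a page and return its body, raising on network or HTTP errors."""
    # Without a timeout, requests can wait indefinitely on a stalled server.
    response = requests.get(url, timeout=timeout)
    # raise_for_status() raises requests.HTTPError for 4xx and 5xx responses,
    # so an error page is never handed to the parser by mistake.
    response.raise_for_status()
    return response.text
```

The returned string can then be passed to BeautifulSoup exactly as before, for example BeautifulSoup(fetch_html(url), 'html.parser').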
5. Extract the stock data
Python
symbol_name = soup.select_one('div[data-test-id="symbol-name"] h1').text.strip()
symbol_price = soup.select_one('span[data-test-id="symbol-price"]').text.strip()
- We are using CSS selectors to target specific elements on the webpage.
- soup.select_one('div[data-test-id="symbol-name"] h1') selects the first <h1> element within a div element that has the attribute data-test-id set to "symbol-name". This element most likely contains the name of the stock.
- .text.strip() extracts the text content from the selected element and removes any leading or trailing whitespace characters.
- Similarly, we can select the element containing the stock price using soup.select_one('span[data-test-id="symbol-price"]').text.strip().
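One thing to watch for: select_one returns None when nothing matches, so calling .text on the result raises an AttributeError if the page layout has changed. A small helper can make that case explicit. This is only a sketch; extract_text is a hypothetical name, and the sample HTML is hand-written with placeholder values to mimic the attributes used above, not copied from the real page.

```python
from bs4 import BeautifulSoup


def extract_text(soup, selector):
    """Return the stripped text of the first match, or None if nothing matches."""
    element = soup.select_one(selector)
    if element is None:
        return None
    return element.get_text(strip=True)


# Hand-written sample with placeholder values (not real page markup).
sample_html = """
<div data-test-id="symbol-name"><h1> Example Corp </h1></div>
<span data-test-id="symbol-price"> 123.45 </span>
"""
soup = BeautifulSoup(sample_html, "html.parser")

symbol_name = extract_text(soup, 'div[data-test-id="symbol-name"] h1')
symbol_price = extract_text(soup, 'span[data-test-id="symbol-price"]')
missing = extract_text(soup, 'span[data-test-id="does-not-exist"]')
print(symbol_name, symbol_price, missing)
```

With this in place, the script can check for None and log a clear message instead of crashing halfway through a scheduled run.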
6. Save the data to a JSON file
Python
data = {
    "name": symbol_name,
    "price": symbol_price
}
with open('symbol_data.json', 'w') as f:
    json.dump(data, f)
print("Data saved to symbol_data.json")
- We create a dictionary named data to store the scraped stock data.
- The json.dump function is used to convert the Python dictionary to JSON and write it to a file named symbol_data.json.
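Note that the scraped price is still a string at this point. If you plan to compare prices across days, it may help to store a number and the date alongside it. The sketch below assumes the price text looks like "$1,234.56" (a dollar sign and thousands separators), which may not match what the page actually returns; parse_price and build_record are hypothetical helper names, and the values are placeholders.

```python
import json
from datetime import date


def parse_price(text: str) -> float:
    """Convert a price string such as '$1,234.56' to a float."""
    cleaned = text.replace("$", "").replace(",", "").strip()
    return float(cleaned)


def build_record(name: str, price_text: str, day: date) -> dict:
    """Bundle the scraped values with the date they were collected."""
    return {"date": day.isoformat(), "name": name, "price": parse_price(price_text)}


# Placeholder values for illustration only.
record = build_record("Example Corp", "$1,234.56", date(2024, 1, 2))
print(json.dumps(record, indent=2))
```

In the real script you would pass symbol_name, symbol_price, and date.today() instead of the placeholders.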
7. Scheduling the script to run daily
To scrape stock prices every day, you can use a task scheduler like cron on Linux/macOS or Task Scheduler on Windows. You can set the scheduler to run the Python script at a specific time each day.
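If you would rather not configure a system scheduler, a long-running Python process can do the waiting itself. This is a different approach from cron or Task Scheduler, shown here only as a rough sketch: seconds_until and run_daily are hypothetical names, the code uses naive local time, and it ignores daylight saving changes.

```python
import time
from datetime import datetime, timedelta


def seconds_until(hour: int, minute: int, now: datetime) -> float:
    """Seconds from `now` until the next occurrence of hour:minute."""
    target = now.replace(hour=hour, minute=minute, second=0, microsecond=0)
    if target <= now:
        # Today's slot has already passed, so aim for tomorrow.
        target += timedelta(days=1)
    return (target - now).total_seconds()


def run_daily(job, hour: int, minute: int) -> None:
    """Call job() once a day at hour:minute local time. Runs until interrupted."""
    while True:
        time.sleep(seconds_until(hour, minute, datetime.now()))
        job()
```

For example, run_daily(scrape, 18, 0) would call a scrape function of your own at 18:00 each day. A system scheduler is usually the sturdier choice, since it survives reboots and does not need a process kept alive.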
Additional Considerations
- This code example scrapes data from one specific website. If the site's HTML structure changes, you will need to update the CSS selectors to match.
- It is important to be respectful of the website’s robots.txt file and avoid overwhelming the server with too many requests.
- Consider using a more robust scraping library like Scrapy for more complex scraping tasks.
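To make the robots.txt point concrete, Python's standard library includes a parser for these files. The sketch below feeds it a made-up rule set so it runs without network access; the user agent name "my-scraper" and the example.com URLs are placeholders.

```python
from urllib.robotparser import RobotFileParser

# Made-up rules for illustration; a real check would use the site's own robots.txt.
sample_rules = [
    "User-agent: *",
    "Disallow: /private/",
]

parser = RobotFileParser()
parser.parse(sample_rules)

allowed = parser.can_fetch("my-scraper", "https://example.com/symbol/AAPL")
blocked = parser.can_fetch("my-scraper", "https://example.com/private/data")
print(allowed, blocked)
```

Against a live site, you would call parser.set_url() with the address of its robots.txt file and then parser.read() to download the rules, instead of passing them in by hand.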
I hope this blog post helps you get started with scraping stock prices using Python. By following these steps, you can create a Python script that automatically scrapes stock prices from a financial website and saves the data to a JSON file every day.