Building A Simple Proxy Server With Raspberry Pi
About the project
Discover the simplicity of setting up a proxy server on your Raspberry Pi using the user-friendly and open-source software known as Squid.
Platforms: Raspberry Pi
Estimated time: 1 hour
Items used in this project
Discover the simplicity of setting up a proxy server on your Raspberry Pi using the user-friendly and open-source software known as Squid. In this tutorial, we provide a step-by-step guide, demonstrating its application for web scraping. However, the advantages of establishing a proxy server extend beyond this, encompassing enhanced security, efficient caching, accelerated networking requests, and streamlined connection management. Unlock the potential of your Raspberry Pi with this comprehensive tutorial on Squid proxy server setup!
Before reading the remainder, be sure to subscribe and support the channel if you have not!
Hire me at UpWork to build your IoT projects:Part 1: Setting up the Raspberry Pi
Initial Setup:- Ensure your Raspberry Pi 4 is set up with Raspbian (or another compatible OS) and that it's connected to the internet.- Access your Raspberry Pi terminal through SSH or directly using a monitor and keyboard.
Update and Upgrade Packages:- Run
sudo apt-get update and
sudo apt-get upgrade to ensure all packages are up to date.
Install Squid:- Execute
sudo apt-get install squid.- Once installed, the Squid service should start automatically.
Configure Squid:- Backup the original configuration file:
sudo cp /etc/squid/squid.conf /etc/squid/squid.conf.backup.- Edit the configuration file:
sudo nano /etc/squid/squid.conf.
You can see the lines are uncommented:
acl localnet src 192.168.1.0/24
http_access allow localnet
This is needed to only allow devices on your local home network to connect to the proxy. It is a very simple setting; you can play with the config in this file to filter what IPs and what network security parameters you would like to configure. For the sake of this video, we keep it simple.
Restart Squid:Restart the Squid service to apply the changes:
sudo systemctl restart squid.
Verify Squid is Running:Check the status of Squid:
sudo systemctl status squid.
Get Raspberry Pi IP Address:
To get the Raspberry Pi IP address you can type in the command ifconfig in the terminal and look at the inet address. You will need this address to use on the Python script on your local computer that will send the scraping request to the Pi's IP address first as proxy!Part 2: Local Computer Code
Create a Python Script on your local computer and run the following script, make sure you substitute your Pi IP address.
This script is designed to scrape web content from a specified URL (in this case, a search page on eBay for laptops) using the Python
requests library for making HTTP requests and the
BeautifulSoup library from
bs4 for parsing HTML content. The script uses a proxy server to make the request. Here's a step-by-step explanation:
- Import Libraries:
import requests: Imports the
requestslibrary, which is used to send HTTP requests easily in Python.
from bs4 import BeautifulSoup: Imports the
BeautifulSoupclass from the
bs4(Beautiful Soup) library. Beautiful Soup is used for parsing HTML and XML documents, making it easier to navigate, search, and modify the parse tree.
- Set Up Proxies:The
proxiesdictionary is defined with keys
'https', each containing the URL of the proxy server (
http://192.168.1.111:3128). This tells the
requestslibrary to route HTTP and HTTPS requests through the proxy server at the specified IP address (
192.168.1.111) and port (
- Specify Target URL:The
urlvariable is set to the target URL, which in this case is an eBay search page for laptops.
- Send HTTP GET Request:
response = requests.get(url, proxies=proxies): Sends an HTTP GET request to the specified
proxiesparameter is used, meaning the request will be sent through the proxy server defined earlier.
- Check Response Status:The script checks if the HTTP request was successful by examining
response.status_code.HTTP status code 200 indicates success. If the request was successful (
response.status_code == 200), the script proceeds to parse the response content.
- Parse and Print HTML Content (if successful):
soup = BeautifulSoup(response.content, 'html.parser'): Parses the content of the HTTP response (
response.content) using Beautiful Soup, which creates a BeautifulSoup object (
'html.parser'argument specifies the parser to use.
print(soup): Prints the parsed HTML content to the console. In a more practical scraping scenario, you would use Beautiful Soup's functionality to extract specific data from
soup(like product names, prices, etc.) instead of printing the entire HTML content.
- Handle Failed Request (if not successful):If the HTTP request was not successful (status code is not 200), the script prints a message indicating that the webpage retrieval failed, along with the HTTP status code received.
In a real-world scenario, you'd likely want to add more functionality after parsing the HTML to extract and process the specific data you're interested in.Conclusion:
Hope you enjoyed the quick tutorial, if the Python script ran it means that the requests are going through your Pi's IP address which can help you webscrape, let me know if you have any questions.
If you enjoy the video, please subscribe to my channel Shilleh on Youtube in the video above, your support would be appreciated. Ping me for any questions, thanks everyone!
Mechanical and Software Engineering Background. University at Buffalo 2019 Texas A&M 2021 I make data pipelines for my day job. Outside of work, I participate in online communities regarding Full Stack Engineering, Microelectronics, and more. You can find more details about me on my Youtube Channel. https://www.youtube.com/@mmshilleh Feel free to reach out!