
PHP Web Scraping: Documentation with example

In this post, you will learn a simple way to scrape web pages using the PHP programming language.

Web scraping is the process of extracting information from websites. The extracted data can be content, URLs, titles, contact information, and so on, which we can store in a local file or a database. It can be done manually, or automatically by a program called a scraper, typically implemented as a bot or a web crawler. Some popular sites provide APIs to access their data in a structured way, but not all websites do. Web scraping is not always legal, and some sites disallow it in their 'robots.txt' file. Where scraping is permitted and no API exists, a web scraper lets us extract the data, mine it, and store it in a structured way.
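
Since a site can disallow scraping in its 'robots.txt' file, it helps to check a path against that file before fetching it. The following is a minimal sketch, not a complete robots.txt parser: the function name isPathAllowed is hypothetical, the sample rules are made up, and the sketch only reads the "User-agent: *" group with simple prefix matching (Allow rules and wildcards are ignored).

```php
<?php
// Hypothetical helper (not part of any library): returns true when $path
// is not matched by a Disallow rule in the "User-agent: *" group.
function isPathAllowed(string $robotsTxt, string $path): bool
{
    $appliesToUs = false;
    foreach (preg_split('/\r\n|\r|\n/', $robotsTxt) as $line) {
        // strip comments and surrounding whitespace
        $line = trim(preg_replace('/#.*/', '', $line));
        if ($line === '' || strpos($line, ':') === false) {
            continue;
        }
        list($field, $value) = array_map('trim', explode(':', $line, 2));
        $field = strtolower($field);
        if ($field === 'user-agent') {
            // simplification: only the wildcard group is considered
            $appliesToUs = ($value === '*');
        } elseif ($field === 'disallow' && $appliesToUs && $value !== '') {
            // simple prefix match; wildcards (* and $) are not handled
            if (strpos($path, $value) === 0) {
                return false;
            }
        }
    }
    return true;
}

// Made-up rules used only for this illustration.
$sample = "User-agent: *\nDisallow: /private/\n";
var_dump(isPathAllowed($sample, '/private/page.html')); // bool(false)
var_dump(isPathAllowed($sample, '/cities/delhi'));      // bool(true)
```

In a real script, the contents of the site's robots.txt would be passed in place of $sample.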





Web scraping has become common for several reasons. Information that you can view on the Internet is not necessarily available for download, yet you may need it in a different format. You then need a way to collect the information from many pages of a site, or from several sites. This is where web scraping comes in.

Web scraping in PHP is quite simple. If the owner of a website doesn't provide an API and we really need the data from a web page, web scraping is often the only option. A number of web scraping libraries are available for PHP. In this article, we walk through the process of scraping data from a web page in detail.





Simple HTML DOM

The Simple HTML DOM parser is a good choice, as it lets us access and work with HTML easily. It parses web pages into a DOM (Document Object Model) tree, a representation through which programs can reach the individual parts of a page. For example, an HTML or XML document is converted into a DOM. The DOM describes the structure of a document and how the document can be accessed. PHP also provides a built-in DOM extension.
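
To make the idea of a DOM tree concrete, here is a small sketch that uses PHP's built-in DOM extension (the DOMDocument class) instead of Simple HTML DOM, so it runs without any download. The sample markup is made up for illustration.

```php
<?php
// Sample markup used only for this illustration.
$html = '<html><head><title>Sample page</title></head>'
      . '<body><a href="/one">One</a><a href="/two">Two</a></body></html>';

$doc = new DOMDocument();
// collect parser warnings internally instead of printing them;
// real-world HTML is often imperfect
libxml_use_internal_errors(true);
$doc->loadHTML($html);
libxml_clear_errors();

// walk the DOM tree: read the title and every link target
$pageTitle = $doc->getElementsByTagName('title')->item(0)->textContent;
$links = array();
foreach ($doc->getElementsByTagName('a') as $anchor) {
    $links[] = $anchor->getAttribute('href');
}

echo $pageTitle, "\n";
print_r($links);
```

Once the page is parsed, the title and the links are reached by navigating the tree rather than by searching the raw HTML text.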

Download SimpleHTMLDom

Let's start by downloading Simple HTML DOM from the given link:
Download SimpleHTMLDom

Next, extract the downloaded archive. The extracted folder contains a file named 'simple_html_dom.php'.

 

Take a web page for scraping

We need a web page to scrape for data such as article titles, dates, images, and more. Here, we have taken a news page and inspected its elements to find the classes of the HTML tags, so that we can scrape the required information from the HTML using CSS selectors such as class, id, etc.

[Screenshot: PHP Web Scraping]

As we can see in the above screenshot, the CSS class "card__title" is applied to all div tags that contain titles, and the CSS class "card__posted-on" is applied to all div tags that contain dates. These classes let us separate the fields we want from the rest of the content in the response.
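
The class-based filtering described above can be sketched with PHP's built-in DOMXPath, which needs no extra library. This is an illustration only: the markup is made up (it merely reuses the class names from the article), and textByClass is a hypothetical helper, not part of Simple HTML DOM.

```php
<?php
// Made-up markup that reuses the class names described above.
$html = '<div class="card-content">'
      . '<div class="card__title">First headline</div>'
      . '<div class="card__posted-on"> 3 Jan 2023 </div>'
      . '</div>';

// Hypothetical helper: returns the trimmed text of every element
// whose class attribute contains the given class name.
function textByClass(DOMXPath $xpath, string $class): array
{
    $query = sprintf(
        '//*[contains(concat(" ", normalize-space(@class), " "), " %s ")]',
        $class
    );
    $texts = array();
    foreach ($xpath->query($query) as $node) {
        $texts[] = trim($node->textContent);
    }
    return $texts;
}

$doc = new DOMDocument();
libxml_use_internal_errors(true);
$doc->loadHTML($html);
libxml_clear_errors();
$xpath = new DOMXPath($doc);

print_r(textByClass($xpath, 'card__title'));
print_r(textByClass($xpath, 'card__posted-on'));
```

The concat and normalize-space expression matches a whole class name, so an element with several classes is still found while a class that merely starts with the same text is not.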





PHP Script to scrape web content

Here is the complete PHP code to scrape titles and posted dates from a news website.

<?php

// include the Simple HTML DOM library file
require_once 'simple_html_dom.php';

// fetch and parse the HTML content of the page;
// file_get_html() returns false if the page cannot be loaded
$dom = file_get_html('https://www.thestatesman.com/cities/delhi', false);

// gather all the articles
$article = array();
if (!empty($dom)) {
    // each article card is wrapped in an element with class "card-content"
    foreach ($dom->find(".card-content") as $card) {
        $item = array();
        // article title
        foreach ($card->find(".card__title") as $title) {
            $item['title'] = trim($title->plaintext);
        }
        // article posted date
        foreach ($card->find(".card__posted-on") as $post_date) {
            $item['date'] = trim($post_date->plaintext);
        }
        // skip cards that contain neither a title nor a date
        if (!empty($item)) {
            $article[] = $item;
        }
    }
}

// print the scraped data in a readable form
echo '<pre>';
print_r($article);
echo '</pre>';
?>
Output of the above code:

As you can see in the given screenshot, we could scrape the titles and posted dates of the news articles into an array. Similarly, we can scrape more information according to our requirements.

[Screenshot: PHP Web Scraping]
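
The introduction mentions storing the extracted data in a local file. One possible way to do that, sketched here with sample data in the same shape as the scraped array, is to write it out as JSON. The helper name saveArticles and the file name articles.json are assumptions made for this example.

```php
<?php
// Sample data in the same shape as the array built by the scraping script.
$article = array(
    array('title' => 'First headline', 'date' => '3 Jan 2023'),
    array('title' => 'Second headline', 'date' => '4 Jan 2023'),
);

// Hypothetical helper: writes the rows to $path as JSON
// and returns true on success.
function saveArticles(array $rows, string $path): bool
{
    $json = json_encode($rows, JSON_PRETTY_PRINT | JSON_UNESCAPED_UNICODE);
    if ($json === false) {
        return false;
    }
    return file_put_contents($path, $json) !== false;
}

// The file name is an assumption; any writable path will do.
$path = sys_get_temp_dir() . '/articles.json';
var_dump(saveArticles($article, $path));
```

JSON keeps the title and date keys, so the file can be read back later with json_decode() and used like the original array.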



eTutorialsPoint © Copyright 2016-2023. All Rights Reserved.