Using cURL to Read the Contents of a Web Page

16 Apr 2010

Recently I wrote about how to use the Yahoo! weather API with WordPress, and in the comments I was asked how to use it without relying on WordPress. The answer is cURL.

In this post I’m going to show you how to use cURL with PHP to read the contents of a front-end API, but it could just as easily be used for scraping a web page or accessing a REST API.

I’m going to assume that we are still using PHP since some programming is still needed. However, you can also use cURL from other languages such as Ruby, Python, and Perl, and also on the command line.

According to Wikipedia the name cURL comes from “Client for URLs”. It is essentially a web client that you drive from the command line or from code rather than from a browser. This means that you can fetch web content from a script on your site. Websites most often use it to access web APIs such as Twitter, Flickr, or, as in this case, the Yahoo! weather API.

Note: There are actually loads of different options and settings for cURL, but we are only interested in a few. If you want to check them all out, you can view the docs on php.net.

Below is the code we will be using:

$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, $file);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_REFERER, $_SERVER['REQUEST_URI']);
$result = curl_exec($ch);
curl_close($ch);

Broken down we have:

$ch = curl_init(); // initiate the cURL handle
curl_setopt($ch, CURLOPT_URL, $file); // specify the file or URL to load
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1); // return the response as a string instead of printing it
curl_setopt($ch, CURLOPT_REFERER, $_SERVER['REQUEST_URI']); // send a Referer header (here, the path of the current request)
$result = curl_exec($ch); // perform the cURL request
curl_close($ch); // close the connection
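
One thing the breakdown above doesn’t cover is what happens when the request fails: curl_exec() returns false rather than a string. Below is a minimal sketch of how the same calls might be wrapped with a timeout and an error check. The function name bm_fetchUrl and the 10 second default are my own assumptions for illustration; they are not part of the original code.

```php
<?php
// Hypothetical helper (the name bm_fetchUrl is an assumption for this sketch).
// Returns the response body as a string, or throws if the request fails.
function bm_fetchUrl ($url, $timeout = 10) {
	$ch = curl_init();
	curl_setopt($ch, CURLOPT_URL, $url);
	curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);  // return the body instead of printing it
	curl_setopt($ch, CURLOPT_FOLLOWLOCATION, 1);  // follow HTTP redirects
	curl_setopt($ch, CURLOPT_TIMEOUT, $timeout);  // give up after $timeout seconds (assumed value)

	$result = curl_exec($ch);

	// curl_exec() returns false on failure, so read the error before closing
	if ($result === false) {
		$error = curl_error($ch);
		curl_close($ch);
		throw new RuntimeException('cURL request failed: ' . $error);
	}

	curl_close($ch);
	return $result;
}
```

Throwing an exception is just one choice here; returning null or false and letting the caller decide would work equally well.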

A bit of rejigging from the original WordPress Yahoo! post and you end up with:

<?php
function bm_getWeather ($code = '', $temp = 'c') {
	$file = 'http://weather.yahooapis.com/forecastrss?p=' . $code . '&u=' . $temp;

	$ch = curl_init();
	curl_setopt($ch, CURLOPT_URL, $file);
	curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
	curl_setopt($ch, CURLOPT_REFERER, $_SERVER['REQUEST_URI']);
	$result = curl_exec($ch);
	curl_close($ch);

	$output = array (
		'temperature' => bm_getWeatherProperties('temp', $result),
		'weather' => bm_getWeatherProperties('text', $result),
		'weather_code' => bm_getWeatherProperties('code', $result),
		'class' => 'weatherIcon-' . bm_getWeatherProperties('code', $result),
	);

	return $output;
}

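function bm_getWeatherProperties ($needle, $data) {
	// Use explicit # delimiters rather than relying on < and > being read as a delimiter pair
	$regex = '#<yweather:condition.*' . $needle . '="(.*?)".*/>#';
	// Return null instead of raising a notice when the attribute is missing
	return preg_match($regex, $data, $matches) ? $matches[1] : null;
}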
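
To see what bm_getWeatherProperties is doing without needing the live feed, you can run the same kind of pattern against a hand-written string. This is only a sketch: the sample element below is made up and merely mimics the shape of the yweather:condition tag, and bm_demoProperty is a hypothetical variant that uses explicit # delimiters and returns null when nothing matches.

```php
<?php
// Made-up sample data shaped like the feed's condition element (not real API output).
$sample = '<yweather:condition text="Sunny" code="32" temp="21" />';

// Hypothetical variant of bm_getWeatherProperties for this sketch.
function bm_demoProperty ($needle, $data) {
	// preg_quote() escapes any regex characters in the attribute name
	$regex = '#<yweather:condition[^>]*\s' . preg_quote($needle, '#') . '="(.*?)"#';
	if (preg_match($regex, $data, $matches)) {
		return $matches[1];
	}
	return null; // attribute not found
}

$temperature = bm_demoProperty('temp', $sample);     // '21'
$weather     = bm_demoProperty('text', $sample);     // 'Sunny'
$missing     = bm_demoProperty('humidity', $sample); // null
```

Note that the values come back as strings, so cast them if you need to do arithmetic on the temperature.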

Note: The weather API I used here is no longer available but the cURL code is still perfectly valid.
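
Since the weather feed is gone, here is a rough sketch of how the same pattern might look against a JSON REST API, the other use mentioned at the start of the post. The function names bm_decodeJson and bm_getJson and the URL in the usage comment are placeholders I have made up; swap in whatever API you are actually calling.

```php
<?php
// Hypothetical helper: decode a JSON response body into an associative array.
function bm_decodeJson ($body) {
	$data = json_decode($body, true);
	if (json_last_error() !== JSON_ERROR_NONE) {
		return null; // body was not valid JSON
	}
	return $data;
}

// Hypothetical helper: fetch a URL with cURL and decode the JSON it returns.
function bm_getJson ($url) {
	$ch = curl_init();
	// curl_setopt_array() sets several options in one call
	curl_setopt_array($ch, array(
		CURLOPT_URL => $url,
		CURLOPT_RETURNTRANSFER => 1,
		CURLOPT_HTTPHEADER => array('Accept: application/json'),
	));
	$result = curl_exec($ch);
	curl_close($ch);

	// curl_exec() returns false if the request failed
	return ($result === false) ? null : bm_decodeJson($result);
}

// Usage (placeholder URL, not a real endpoint):
// $data = bm_getJson('https://api.example.com/items');
```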

