Using cookies with cURL in PHP

I wrote this up for Stack Overflow when they were trying out the concept of “Documentation” for all sorts of things. It didn’t work out, but I thought I’d post this somewhere.

cURL can keep cookies received in responses for use with subsequent requests. For simple session cookie handling in memory, this is achieved with a single line of code:

curl_setopt($ch, CURLOPT_COOKIEFILE, "");

In cases where you are required to keep cookies after the cURL handle is destroyed, you can specify the file to store them in:

curl_setopt($ch, CURLOPT_COOKIEJAR, "/tmp/cookies.txt");

Then, when you want to use them again, pass them as the cookie file:

curl_setopt($ch, CURLOPT_COOKIEFILE, "/tmp/cookies.txt");

Remember, though, that these two steps are not necessary unless you need to carry cookies between different cURL handles. For most use cases, setting CURLOPT_COOKIEFILE to the empty string is all you need.
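If you do need the cookies to outlive the handle, the two options can be combined. This is only a rough sketch: `fetch_with_saved_cookies()` is a name I made up for illustration, and it assumes the cookie file path is writable. libcurl writes the jar when the handle is released, which on PHP 8 and later happens when the handle object is freed rather than on `curl_close()`.

```php
<?php
// Hypothetical helper (not part of the cURL extension): fetch a URL,
// loading cookies from $cookiePath first and saving them back afterwards.
function fetch_with_saved_cookies($url, $cookiePath)
{
    $ch = curl_init($url);

    // read cookies saved by an earlier handle (this also enables cookie handling)
    curl_setopt($ch, CURLOPT_COOKIEFILE, $cookiePath);

    // write all known cookies to the same file when the handle is released
    curl_setopt($ch, CURLOPT_COOKIEJAR, $cookiePath);

    // return the response body instead of printing it
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);

    $body = curl_exec($ch);

    // releasing the handle is what triggers the write to the cookie jar
    unset($ch);

    return $body;
}
```

Calling this twice with the same `$cookiePath`, even from two separate script runs, should send the cookies from the first response along with the second request.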


Cookie handling can be used, for example, to retrieve resources from a web site that requires a login. This is typically a two-step procedure. First, POST to the login page.

<?php

# create a cURL handle
$ch  = curl_init();

# set the URL (this could also be passed to curl_init() if desired)
curl_setopt($ch, CURLOPT_URL, "https://www.example.com/login.php");

# set the HTTP method to POST
curl_setopt($ch, CURLOPT_POST, true);

# setting this option to an empty string enables cookie handling
# but does not load cookies from a file
curl_setopt($ch, CURLOPT_COOKIEFILE, "");

# set the values to be sent
curl_setopt($ch, CURLOPT_POSTFIELDS, [
    "username"=>"joe_bloggs",
    # single quotes, so PHP does not try to expand $up3r_ as a variable
    "password"=>'$up3r_$3cr3t',
]);

# return the response body
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);

# send the request
$result = curl_exec($ch);

The second step (after standard error checking is done) is usually a simple GET request. The important thing is to reuse the existing cURL handle for the second request. This ensures the cookies from the first response will be automatically included in the second request.

# we are not calling curl_init()

# simply change the URL
curl_setopt($ch, CURLOPT_URL, "https://www.example.com/show_me_the_foo.php");

# change the method back to GET
curl_setopt($ch, CURLOPT_HTTPGET, true);

# send the request
$result = curl_exec($ch);

# finished with cURL
curl_close($ch);

# do stuff with $result...

This is only intended as an example of cookie handling. In real life, things are usually more complicated. Often you must perform an initial GET of the login page to pull a login token that needs to be included in your POST. Other sites might block the cURL client based on its User-Agent string, requiring you to change it.
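As a rough sketch of that more complicated case, the code below does the initial GET, pulls a token out of a hidden form field, and sends it along with the login POST, all on one handle. The function names, the User-Agent value and the idea that the token lives in a hidden `<input>` are my own assumptions for illustration; every site does this differently.

```php
<?php
// Hypothetical helper: return the value of the <input> named $fieldName,
// or null if the page has no such field.
function extract_hidden_field($html, $fieldName)
{
    if ($html === "") {
        return null;
    }

    $doc = new DOMDocument();

    // real-world HTML is rarely valid, so collect parser warnings quietly
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    foreach ($doc->getElementsByTagName("input") as $input) {
        if ($input->getAttribute("name") === $fieldName) {
            return $input->getAttribute("value");
        }
    }

    return null;
}

// Hypothetical helper: GET the login page, then POST the credentials
// plus the token. Returns the response body, or false on failure.
function login_with_token($ch, $loginUrl, $username, $password, $tokenField)
{
    // some sites block the default client, so send a different User-Agent
    curl_setopt($ch, CURLOPT_USERAGENT, "Mozilla/5.0 (compatible; ExampleClient/1.0)");
    curl_setopt($ch, CURLOPT_COOKIEFILE, "");
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);

    // initial GET of the login page to obtain the token
    curl_setopt($ch, CURLOPT_URL, $loginUrl);
    curl_setopt($ch, CURLOPT_HTTPGET, true);
    $page = curl_exec($ch);
    if ($page === false) {
        return false;
    }

    $token = extract_hidden_field($page, $tokenField);
    if ($token === null) {
        return false;
    }

    // POST on the same handle so the cookies from the GET are included
    curl_setopt($ch, CURLOPT_POST, true);
    curl_setopt($ch, CURLOPT_POSTFIELDS, [
        "username"  => $username,
        "password"  => $password,
        $tokenField => $token,
    ]);

    return curl_exec($ch);
}
```

The field names `username` and `password` are carried over from the example above; the token field name has to be read from the real login form.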

Using cURL in PHP

Basic Usage (GET Requests)

cURL is a tool for transferring data with URL syntax. It supports HTTP, FTP, SCP and many others (when using curl >= 7.19.4). Remember, you need to install and enable the cURL extension to use it.

// a little script to check if the cURL extension is loaded or not
if(!extension_loaded("curl")) {
    die("cURL extension not loaded! Quit Now.");
}
 
// Actual script start
 
// create a new cURL resource
// $curl is the handle of the resource
$curl = curl_init();
 
// set the URL and other options
curl_setopt($curl, CURLOPT_URL, "http://www.example.com");
 
// execute and pass the result to browser
curl_exec($curl);
 
// close the cURL resource
curl_close($curl);
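
The script above sends the response straight to the browser and never checks whether the request worked. Here is a minimal sketch of capturing the body and checking for errors instead. `fetch_url()` is a made-up name, and the 10 second timeout and the choice to treat any status of 400 or above as a failure are just assumptions for the example.

```php
<?php
// Hypothetical helper: GET a URL and return the body, throwing on failure.
function fetch_url($url)
{
    $curl = curl_init($url);

    // return the response body from curl_exec() instead of printing it
    curl_setopt($curl, CURLOPT_RETURNTRANSFER, true);

    // follow redirects and give up after 10 seconds
    curl_setopt($curl, CURLOPT_FOLLOWLOCATION, true);
    curl_setopt($curl, CURLOPT_TIMEOUT, 10);

    $body = curl_exec($curl);

    // curl_exec() returns false on transport errors (DNS, connection, bad protocol...)
    if ($body === false) {
        throw new RuntimeException("cURL request failed: " . curl_error($curl));
    }

    // an HTTP error page still counts as a successful transfer, so check the status too
    $status = curl_getinfo($curl, CURLINFO_HTTP_CODE);
    if ($status >= 400) {
        throw new RuntimeException("Server returned HTTP status $status");
    }

    return $body;
}
```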

Using Cookies

cURL can keep cookies received in responses for use with subsequent requests. For simple session cookie handling in memory, this is achieved with a single line of code:

curl_setopt($ch, CURLOPT_COOKIEFILE, "");

In cases where you are required to keep cookies after the cURL handle is destroyed, you can specify the file to store them in:

curl_setopt($ch, CURLOPT_COOKIEJAR, "/tmp/cookies.txt");

Then, when you want to use them again, pass them as the cookie file:

curl_setopt($ch, CURLOPT_COOKIEFILE, "/tmp/cookies.txt");

Remember, though, that these two steps are not necessary unless you need to carry cookies between different cURL handles. For most use cases, setting CURLOPT_COOKIEFILE to the empty string is all you need.


Cookie handling can be used, for example, to retrieve resources from a web site that requires a login. This is typically a two-step procedure. First, POST to the login page.

<?php
 
# create a cURL handle
$ch = curl_init();
 
# set the URL (this could also be passed to curl_init() if desired)
curl_setopt($ch, CURLOPT_URL, "https://www.example.com/login.php");
 
# set the HTTP method to POST
curl_setopt($ch, CURLOPT_POST, true);
 
# setting this option to an empty string enables cookie handling
# but does not load cookies from a file
curl_setopt($ch, CURLOPT_COOKIEFILE, "");
 
# set the values to be sent
curl_setopt($ch, CURLOPT_POSTFIELDS, array(
    "username" => "joe_bloggs",
    # single quotes, so PHP does not try to expand $up3r_ as a variable
    "password" => '$up3r_$3cr3t',
));
 
# return the response body
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
 
# send the request
$result = curl_exec($ch);

The second step (after standard error checking is done) is usually a simple GET request. The important thing is to reuse the existing cURL handle for the second request. This ensures the cookies from the first response will be automatically included in the second request.

# we are in the same scope, and not calling curl_init()

# simply change the URL
curl_setopt($ch, CURLOPT_URL, "https://www.example.com/show_me_the_foo.php");
 
# change the method back to GET
curl_setopt($ch, CURLOPT_HTTPGET, true);
 
# send the request
$result = curl_exec($ch);
 
# finished with cURL
curl_close($ch);
 
# do stuff with $result...

This is only intended as an example of cookie handling. In real life, things are usually more complicated. Often you must perform an initial GET of the login page to pull a login token that needs to be included in your POST. Other sites might block the cURL client based on its User-Agent string, requiring you to change it.

This content is copied from Stack Overflow Documentation, a beta program which ended in 2017. All content was authored solely by myself. https://web.archive.org/web/20170816194237/https://stackoverflow.com/documentation/php/701/using-curl-in-php#t=201708161942371592047

Using rrdtool with PHP

The PHP interface to rrdtool hasn’t been updated in 5 years and appears to have been deprecated by the developer, who doesn’t provide any documentation for it. Fortunately, the extension has no functionality of its own, so it won’t go out of date as long as the rrdtool library on your system is up to date. I’ve managed to figure out the functions by looking at the source code and thought it might be helpful for someone.

Easy SVG grid

I needed a grid in the background while I was debugging an SVG image I was creating, something like Photoshop’s transparency grid. Here’s what I did.

<svg xmlns="http://www.w3.org/2000/svg" version="1.1" width="200" height="400">
  <defs> 
    <pattern id="grid" width="20" height="20" patternUnits="userSpaceOnUse">
      <rect fill="black" x="0" y="0" width="10" height="10" opacity="0.1"/>
      <rect fill="white" x="10" y="0" width="10" height="10"/>
      <rect fill="black" x="10" y="10" width="10" height="10" opacity="0.1"/>
      <rect fill="white" x="0" y="10" width="10" height="10"/>
    </pattern>
  </defs>
  <rect fill="url(#grid)" x="0" y="0" width="100%" height="100%"/>
</svg>

The New CBC Radio 3

I often listen to CBC Radio 3 at work. Recently they updated their website; while it’s mostly a change for the better (yay, the player doesn’t stop updating!) there were a couple of things bugging me about it. With the old design, you always had access to the player and the main navigation, but now they stay at the top of the page. Not helpful when you’re scrolling through comments and whatnot.

So I wrote a Greasemonkey script that keeps the player and the left navigation bar in place. It also clears out the CBC header at the top, as well as the CBC Radio header that sits below that, for a cleaner page.


Delete MediaWiki pages from the database

Deleting a page from the wiki doesn’t actually remove it, just hides it away. Here’s a procedure to permanently remove things from the database, and never ever see them again.

DROP PROCEDURE IF EXISTS delete_page;
DELIMITER //
 
CREATE PROCEDURE delete_page(IN page_id_var INT)
	LANGUAGE SQL
	NOT DETERMINISTIC
	MODIFIES SQL DATA
	SQL SECURITY INVOKER
	COMMENT 'permanently deletes pages from the database'
BEGIN
	DECLARE page_title_var VARCHAR(255);
	DECLARE page_namespace_var INT;
	SELECT page_title, page_namespace INTO page_title_var, page_namespace_var FROM page WHERE page_id = page_id_var;
	DELETE FROM redirect WHERE rd_from = page_id_var;
	DELETE FROM externallinks WHERE el_from = page_id_var;
	DELETE FROM langlinks WHERE ll_from = page_id_var;
	DELETE FROM searchindex WHERE si_page = page_id_var;
	DELETE FROM page_restrictions WHERE pr_page = page_id_var;
	DELETE FROM pagelinks WHERE pl_from = page_id_var;
	DELETE FROM categorylinks WHERE cl_from = page_id_var;
	DELETE FROM templatelinks WHERE tl_from = page_id_var;
	DELETE text.* FROM text LEFT JOIN revision ON (rev_text_id = old_id) WHERE rev_page = page_id_var;
	DELETE FROM revision WHERE rev_page = page_id_var;
	DELETE FROM imagelinks WHERE il_from = page_id_var;
	DELETE FROM recentchanges WHERE rc_namespace = page_namespace_var AND rc_title = page_title_var;
	DELETE text.* FROM text LEFT JOIN archive ON (ar_text_id = old_id) WHERE ar_namespace = page_namespace_var AND ar_title = page_title_var;
	DELETE FROM archive WHERE ar_namespace = page_namespace_var AND ar_title = page_title_var;
	DELETE FROM logging WHERE log_namespace = page_namespace_var AND log_title = page_title_var;
	DELETE FROM watchlist WHERE wl_namespace = page_namespace_var AND wl_title = page_title_var;
	DELETE FROM page WHERE page_id = page_id_var LIMIT 1;
END//
 
DELIMITER ;

Now you can look up your article ID, and then call the procedure with CALL delete_page(999);.

No route matches “…” with {:method=>:get}

I configured Ruby on Rails to run with Apache, because I’m not too worried about speed and didn’t want to mess with proxies. I also configured the app to run in a subdirectory, using Apache’s Alias directive to point to the app’s public directory. I’ll point out this is the first time I’ve ever looked at Ruby in my life, and my first time working with any MVC framework, although I’ve looked into them a bit.

I was getting the dreaded No route matches "/subdirectory/" with {:method=>:get} error and it seemed pretty clear what the problem was. The app didn’t know it was in a subdirectory; I’d probably need to edit the routes to tell it so. It seems this is the last thing the people in Google-land were needing to do, but I eventually figured it out. I’d need to do something like this with the routes:

map.connect 'subdirectory/:controller/:action/:id'

So I took a look at routes.rb and it was using resources, not traditional routes. So what do I do with that?

It took hours of searching before I found the answer to my problem, a testament to the quality of Rails’ documentation I suppose. The answer is path_prefix:

map.resources :groups, :path_prefix => 'subdirectory/'

You can use it for the root as well.

map.root :controller => 'start', :path_prefix => 'subdirectory/'

Now I just have to fix the fact that the author of the app hard coded all sorts of stuff with the assumption that the app wouldn’t be in a directory. Grrr.

Update: Turns out it’s even easier than that. I didn’t have to change routes.rb at all.

config.action_controller.relative_url_root = '/subdirectory'

This has the added advantage of fixing things like linked stylesheets and stuff as well.

WMI error 80041010 on performance counters

I was recently having problems with my WMI queries. Following some (bad) advice, I rebuilt the repository. It didn’t solve my problem, and afterwards all the performance counter classes had disappeared. Win32_PerfRawData_* and Win32_PerfFormattedData_* were gone, reporting error 0x80041010 [“Invalid class”]. (Instead of error 0x80041010, MS says you might get error 0x80041002 [“Object could not be found”] or error 0x80041006 [“Insufficient memory”] when trying to connect to a nonexistent class.) All the rebuilding, troubleshooting and searching of MOF files gave me nothing.

The answer? winmgmt /resyncperf rebuilds the performance counter classes in the repository. To be extra safe, winmgmt /clearadap clears the old data first.


ATT00000.txt files in Outlook

This information is for people who create email messages in programming languages, not Outlook users.

The separator (defined in the Content-Type header) is used to start a new part of the multipart MIME message. Standard practice (not sure if it’s RFC behaviour or not) is to place an instance of the separator at the end of the message. Outlook sees this as the start of a new attachment. Because it has no Content-Disposition information it names it automatically, and of course there’s no content so it’s an empty file. So by not placing the separator at the end of the message, you avoid the empty attachment.

Searching for the answer today, I saw loads of people asking about this online, but nobody had come up with an answer. Part of the problem is the frequency with which Outlook generates these ATT*.txt files; some people were seeing their attachments replaced with these empty text files, some were getting blank email bodies but with both their attachments and the empty text files, and so on.

This just came to me after spending the day trying to figure it out, and it works. The logic of the first paragraph is entirely guesswork on my part.

PHP fatal errors in imagepng()

I upgraded to PHP 5.2 from 4.3 recently and came across a couple of error messages: php[7028], PHP Warning: imagepng(): gd-png: fatal libpng error: zlib error in … followed by: php[7028], PHP Warning: imagepng(): gd-png error: setjmp returns error condition in …

Turns out the parameters for imagepng() changed in PHP 5.1.3. I’m not sure what the third argument used to be, but my call to imagepng($image, null, 100) died because the third argument (quality) is now supposed to be a compression level from 0 to 9.

I came across postings saying to replace DLL files and all this nonsense, but all I needed to do was change the 100 to a 9.
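
If you have a lot of old calls using a 0 to 100 value, something like the sketch below could convert them. `png_compression_level()` is a hypothetical helper of mine, and the linear scaling is just one assumption about how to map the old range onto the new one; it matches the 100 to 9 change I made by hand.

```php
<?php
// Hypothetical helper: scale a 0-100 value down to the 0-9 range
// that imagepng() accepts as its third argument.
function png_compression_level($quality)
{
    // clamp out-of-range input first
    $quality = max(0, min(100, (int) $quality));

    return (int) round($quality * 9 / 100);
}

// only run the demonstration if the GD extension is available
if (function_exists("imagecreatetruecolor")) {
    $image = imagecreatetruecolor(10, 10);
    $file = tempnam(sys_get_temp_dir(), "png");

    // the old value of 100 becomes 9, which imagepng() accepts
    imagepng($image, $file, png_compression_level(100));
}
```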

Using PHP to interface with WMI

Windows Management Instrumentation (WMI) is a Windows derivative of the WBEM standard allowing centralized management of a wide number of Windows functions. There is almost no mention of how to use it from PHP, although combined together they provide a powerful method of web-based management. This example shows how to connect to a remote server, update a single DNS record, then flush the DNS cache.

<?php
$host = 'www';
$ip = '192.168.1.1';
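$domain = 'example.com';
// $rpchost, $user and $pass are not set in this snippet; define the
// remote server's name and the credentials of an account allowed to manage it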
$query = "SELECT * FROM MicrosoftDNS_AType WHERE DomainName='$domain' AND OwnerName='$host.$domain'"; 
try {
//create the object
	$rpc = new COM('WbemScripting.SWbemLocator');
//update DNS
	$wmi = $rpc->ConnectServer($rpchost, 'Root/MicrosoftDNS', $user, $pass);
	$hosts = $wmi->ExecQuery($query);
	foreach($hosts as $host) {
		echo "Updating $host->OwnerName from $host->IPAddress to $ip.";
		ob_flush(); flush();
		$result = new Variant(null);
		$host->Modify(null, $ip, $result);
	}
//flush the DNS cache by restarting the dnscache service
	$query = "SELECT * FROM Win32_Service WHERE Name='Dnscache'";
	$wmi = $rpc->ConnectServer($rpchost, 'Root/cimv2', $user, $pass);
	$services = $wmi->ExecQuery($query);
	foreach ($services as $service) {
		$service->StopService();
		sleep(2);
		$service->StartService();
	}
}
catch(Exception $e) {
	echo $e;
	exit;
}?>

A couple of points to note:

  1. This is PHP 5 code; it will not work in version 4.
  2. This code uses the COM functions, only available in Windows-based PHP installs.
  3. Notice that even though I only pulled one record from the WMI server, I still have to use foreach to iterate through the result set. Like the query itself, the result set is treated the same as one from a database.
  4. I needed to reconnect after the DNS update to use a new namespace; while the DNS server management classes are in the Root/MicrosoftDNS namespace, the service management classes are in the default Root/cimv2 namespace.
  5. Microsoft’s WMI documentation is here. All the code samples are VBScript, but using the example above you should be able to figure things out.