Querying Google safe browsing API with PHP

Submitted by morf on Sat, 10/01/2011 - 5:50pm
php

Recently I had the opportunity to work on a project that involved querying the Google Safe Browsing API with PHP. It was simple and interesting, so I would like to share what I learned here.

Requirements:

  • Curl extension for PHP
  • Google account and safe browsing API key

I used curl, but if you know what you are doing, you can easily use other PHP functions to send the GET request (for example sockets, or even file_get_contents()). There is a lot of information about the protocol in the official developers guide and in the lookup guide. If you have a Google account, you can go straight to the key sign-up page and generate your Google Safe Browsing API key. That's it: now you can use my gsba_query() function to get information about any URL. Further details about the code and its usage are included in the source code below.
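To illustrate the file_get_contents() alternative mentioned above, here is a minimal sketch. The gsba_sketch_* function names are made up for this example, the client, version and URL pattern values simply mirror the constants in the script below, and it assumes allow_url_fopen is enabled and the openssl extension is loaded for https requests.

```php
<?php
# Hypothetical helpers for illustration, not part of the original script.

# Build the look-up url; the pattern mirrors GSBA_URL from the script.
function gsba_sketch_build_url($api_key, $url)
{
	$pattern = 'https://sb-ssl.google.com/safebrowsing/api/lookup?client=%s&appver=%s&pver=%s&apikey=%s&url=%s';

	return sprintf($pattern, 'gsba', '0.1', '3.0', urlencode($api_key), urlencode($url));
}

# Turn a response body into a result string: an empty body means the url is not listed.
function gsba_sketch_result($body)
{
	$body = trim($body);

	return ($body === '') ? 'ok' : $body;
}

# Send the GET request with file_get_contents() instead of curl.
# Returns boolean false on error, or a string with the result.
function gsba_sketch_query($api_key, $url)
{
	$context = stream_context_create(array(
		'http' => array('method' => 'GET', 'timeout' => 10)
	));

	# with default settings the http wrapper fails (returns false) on 4xx/5xx statuses;
	# @ silences the warning it raises in that case
	$body = @file_get_contents(gsba_sketch_build_url($api_key, $url), false, $context);

	return ($body === false) ? false : gsba_sketch_result($body);
}
```

Unlike the curl version, this sketch cannot tell the individual error statuses apart; it only distinguishes a failed request from an empty or non-empty body.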

Don't forget to set the GSBA_API_KEY constant to your API key. One additional note: this function isn't suitable for a large number of requests (roughly 20,000 according to Google). In that case you should maintain your own database of URLs and their statuses.

<?php
/**
 * Google Safe Browsing API usage example. Note that all constants use the
 * prefix GSBA_ and all functions use the prefix gsba_
 *
 * Requirements: curl, and google safe browsing API key
 *
 * Protocol Documentation:
 *   http://code.google.com/apis/safebrowsing/lookup_guide.html#HTTPGETRequest
 *   http://code.google.com/apis/safebrowsing/developers_guide_v2.html
 * 
 * API key sign-up:
 *   http://code.google.com/apis/safebrowsing/key_signup.html
 */


#
# CONSTANTS
#


# GSBA API key - see API key sign-up URL in script header
define('GSBA_API_KEY', '');

# GSBA client - this script name
define('GSBA_CLIENT', 'gsba');

# GSBA client version - this script version
define('GSBA_CLIENT_VERSION', '0.1');

# GSBA version - Safe Browsing API version - check documentation for current value
define('GSBA_PROTOCOL_VERSION', '3.0');

# GSBA url pattern - check documentation for current value
define('GSBA_URL', 'https://sb-ssl.google.com/safebrowsing/api/lookup?client=%s&appver=%s&pver=%s&apikey=%s&url=%s');

# GSBA return malware
define('GSBA_MALWARE', 'malware');

# GSBA return result ok - no malware or phishing
define('GSBA_OK', 'ok');

# GSBA return phishing
define('GSBA_PHISHING', 'phishing');


#
# FUNCTIONS
#


/**
 * Query google safe browsing API. Trigger error if GSBA_API_KEY is empty, or on unexpected result.
 * 
 * @param string $url tested url
 * 
 * @return mixed boolean false on error, or string with result
 * @see GSBA_API_KEY, GSBA_MALWARE, GSBA_OK, GSBA_PHISHING
 */
function gsba_query($url)
{
	$constant = constant('GSBA_API_KEY');

	if (empty($constant)) {
		trigger_error('gsba_query() failed: GSBA_API_KEY constant is not set.', E_USER_WARNING);
		return FALSE;
	}

	# build query url
	$query_url = sprintf(GSBA_URL, 
			GSBA_CLIENT,
			GSBA_CLIENT_VERSION,
			GSBA_PROTOCOL_VERSION,
			GSBA_API_KEY,
			urlencode($url)			# we have to encode the url
		);
	
	# prepare and configure curl
	$c = curl_init();

	curl_setopt($c, CURLOPT_URL, $query_url);
	curl_setopt($c, CURLOPT_RETURNTRANSFER, true);
	
	$body = curl_exec($c);
	$info = curl_getinfo($c);

	# curl_exec() returns false when the request itself failed
	if ($body === false) {
		trigger_error(sprintf('gsba_query() failed: %s', curl_error($c)), E_USER_WARNING);
		return FALSE;
	}

	$status = isset($info['http_code']) ? $info['http_code'] : null;

	# check http status code
	if (!in_array($status, array(200,204))) {
		trigger_error(sprintf('gsba_query() failed: Service returned unexpected status %d.', $status), E_USER_WARNING);
		return FALSE;
	}
	
	# an empty body means the url is not listed
	$body = trim($body);

	return ($body === '') ? GSBA_OK : $body;
}

#
# USAGE
#

$urls = array(
	'http://google.com/',	# should be ok
	'http://gumblar.cn/'	# should contain malware
);

$eol = "\r\n";		# end of line, change to <br /> for html output

foreach($urls as $url) {
	$result = gsba_query($url);

	switch($result) {
		case GSBA_MALWARE:
			printf('Domain %s contains malware.%s', $url, $eol);
			break;
		case GSBA_OK:
			printf('Domain %s contains neither malware nor phishing.%s', $url, $eol);
			break;
		case GSBA_PHISHING:
			printf('Domain %s contains phishing.%s', $url, $eol);
			break;
		default:
			printf('Request information about %s failed.%s', $url, $eol);
			break;
	}
}
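The switch above matches a single verdict only. Assuming the service can also report both threats in one body, separated by a comma (for example phishing,malware), such a result would fall through to the default branch. This is an assumption to verify against the current lookup guide. A minimal sketch of a helper that splits the result into a list follows; the name gsba_parse_result() is hypothetical and not part of the original script.

```php
<?php
# Hypothetical helper, not part of the original script.
# Assumption: a listed url may yield a comma-separated body such as "phishing,malware".
function gsba_parse_result($result)
{
	# gsba_query() returns boolean false on error
	if ($result === false) {
		return false;
	}

	# split on commas, trim whitespace, drop empty items and reindex
	$threats = array_values(array_filter(array_map('trim', explode(',', $result)), 'strlen'));

	# 'ok' (or an empty string) means nothing was found
	if ($threats === array() || $threats === array('ok')) {
		return array();
	}

	return $threats;
}
```

With this helper the usage loop could print implode(' and ', gsba_parse_result($result)) for a listed URL, and treat an empty array as a clean result.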