Using the Stack Exchange API with PHP (part 1)

Question Time

If you’re a developer, chances are that you’ve been faced (many times) with cryptic error messages that left you scratching your head, or pieces of code that stubbornly refused to work despite looking absolutely pristine. I know I have…and, more often than not, I’ve found the solution to my problem on the developer-oriented Q&A site Stack Overflow.

As a collaborative Web site for programmers, Stack Overflow has all the geek tools you’ll need. It’s easy to search for and find relevant posts, and in the unlikely event you’re the first to encounter a specific problem, it’s just as easy to post a new question, and watch the answers pour in. Users can vote on whether an answer is useful or useless, or add comments to clarify a question or its answers. And of course, the more you contribute, the greater your reputation…just the thing to casually drop into a conversation with an attractive member of the opposite sex.

The thing about Stack Overflow, though, is that it has a geeky secret of its own. Like many Web 2.0 applications, it exposes its data to the public via the Stack Exchange Web service API, making it possible to develop customized applications that run on top of the base service. This API allows access to a number of important functions, including searching for questions, retrieving answers and comments, accessing user profiles, and working with tags and badges. It’s also pretty easy to integrate this API into a PHP application – and this two-part article will show you how!

Start Me Up

Before diving into the code, a few notes and assumptions. I’ll assume throughout this article that you’re familiar with HTML and JSON, and that you have a working Apache/PHP development environment with cURL and JSON support. I’ll also assume that you know the basics of working with classes and objects in PHP, as the components used in this article are written using OOP principles.

The Stack Exchange API is accessed by sending a GET request to the appropriate service endpoint. The method name and arguments are encoded into the request URL and the method response is returned as a JSON document. The JSON document usually contains one or more objects. Here’s an example of one such API request, which returns a list of questions:
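As a rough illustration of how a method name and its arguments end up in the request URL, the sketch below assembles one with http_build_query(). The base URL is a placeholder and the parameter names are illustrative assumptions; substitute the versioned endpoint and parameters listed in the API documentation.

```php
<?php
// Placeholder base URL (an assumption): replace it with the versioned
// endpoint given in the API documentation.
const API_BASE = 'https://api.example.com/1.1';

/**
 * Build a GET request URL from a method path and its arguments.
 */
function buildRequestUrl(string $method, array $params = []): string
{
    $url = API_BASE . '/' . ltrim($method, '/');
    if ($params !== []) {
        // http_build_query() URL-encodes every key and value.
        $url .= '?' . http_build_query($params, '', '&');
    }
    return $url;
}

// Ask for a page of questions; the parameter names are illustrative.
echo buildRequestUrl('questions', ['pagesize' => 10, 'sort' => 'votes']), PHP_EOL;
```

Keeping URL construction in one small function means the encoding rules live in a single place, whichever API method you end up calling.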

Here’s an example of the JSON document returned by the API for the request above:

Methods and method parameters are fully documented in the online API documentation. This documentation is your roadmap to working with the API – it contains detailed information on available API calls, input parameters, error codes and return values. Keep it at hand, as you’ll be referring to it frequently throughout this tutorial.

It’s also a good idea to register for a Stack Exchange API key. Once you have an API key, you can make up to 10,000 requests per day to the API – a limit that should be suitable for most deployed applications. Note that an API key is not mandatory: you can still use the API even if you don’t have a key, but in this case you’re limited to 300 requests per day. Read more about this in the Stack Exchange FAQ.
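If you do register for a key, it typically travels with each request as one more query parameter. The following is a minimal sketch of that idea; the parameter name 'key' and the helper withApiKey() are assumptions made for illustration, so check the API documentation for the exact name.

```php
<?php
/**
 * Return the request parameters with an API key attached, if one is set.
 * Without a key the request is still valid, just subject to the lower quota.
 */
function withApiKey(array $params, ?string $apiKey): array
{
    if ($apiKey !== null && $apiKey !== '') {
        $params['key'] = $apiKey; // parameter name is an assumption
    }
    return $params;
}

// 'YOUR_API_KEY' is a placeholder, not a real key.
$params = withApiKey(['pagesize' => 10], 'YOUR_API_KEY');
echo http_build_query($params, '', '&'), PHP_EOL;
```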

It’s worth noting that the Stack Exchange API is currently read-only. This means that it’s possible to search and retrieve content through the API, but it isn’t (yet) possible to programmatically add questions, answers or comments through the API.

All clear? Let’s rock and roll.

Special Feature

Given that the Stack Exchange API returns a JSON document, and PHP offers a json_decode() function that can be used to read a JSON document into a PHP variable, it’s quite easy to build a simple PHP script that produces a list of featured questions. Here’s how:

This script begins by defining the API endpoint, the target method, and a list of parameters to be passed to the method. These values are then combined into a complete URL and passed to the file_get_contents() function, which requests the URL and returns the response as a single string. Because all API responses are compressed to save bandwidth, the next step is to decompress the response using the http_inflate() function (provided by the pecl_http extension). The resulting uncompressed JSON document is then passed to the json_decode() function and converted into a PHP object, which looks like this:
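The fetch, decompress and decode steps described above can be sketched roughly as follows. Note the swap: this version uses gzdecode() from PHP’s zlib extension in place of http_inflate(), and it assumes the response is gzip-compressed. The function names and the sample document are made up for illustration, and JSON_THROW_ON_ERROR requires PHP 7.3 or later.

```php
<?php
/**
 * Decompress (if needed) and decode a raw API response body.
 * gzdecode() from the zlib extension stands in for http_inflate() here.
 */
function decodeApiResponse(string $raw): object
{
    // Gzip streams start with the magic bytes 0x1f 0x8b.
    if (strncmp($raw, "\x1f\x8b", 2) === 0) {
        $inflated = gzdecode($raw);
        if ($inflated === false) {
            throw new RuntimeException('Could not decompress the response.');
        }
        $raw = $inflated;
    }
    // JSON_THROW_ON_ERROR turns malformed JSON into an exception.
    return json_decode($raw, false, 512, JSON_THROW_ON_ERROR);
}

/**
 * Fetch a URL and return the decoded document.
 */
function fetchDocument(string $url): object
{
    $raw = file_get_contents($url);
    if ($raw === false) {
        throw new RuntimeException("Request failed: $url");
    }
    return decodeApiResponse($raw);
}

// Offline demonstration using a hand-written stand-in for a compressed
// response; this is not real API output.
$fake = gzencode('{"questions":[{"title":"Example question"}]}');
$doc  = decodeApiResponse($fake);
echo $doc->questions[0]->title, PHP_EOL;
```

Separating decoding from fetching makes the decoding step easy to try out without touching the network.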

Individual elements of the document, such as the list of questions, answer counts and tags, can now be accessed as object properties and formatted for display as a Web page. Here’s a sample of what the output might look like:

Stacking The Odds

An easier way to accomplish the same result is by using the StackPHP package. This package, currently in the throes of a PEAR proposal, offers an object-oriented interface to the Stack Exchange API, internally handling tasks such as service URL formulation and JSON decoding. It is maintained by George Edison, released under an MIT license, and can be freely downloaded from the package proposal page.

Once you’ve got the package installed, try using it to retrieve a list of featured questions:

You’ll see that this script is significantly shorter and easier to read than the previous version. It begins by including the StackPHP class files, and then initializes a Post_Exchange object, which serves as a central service object for all requests to the API. This service object is initialized with a set of three arguments: the site name, an API key, and a cache duration. To turn caching off, set the cache duration to 0.

Once the service object has been initialized, the /questions endpoint can be requested by calling the object’s questions() method. Additional parameters to the method are specified as an associative array of key-value pairs, and passed to the method as an argument. The service object then formulates the service endpoint URL, connects to it using cURL, and translates the response into a PHP object. As before, it’s now possible to iterate over this object and create a Web page from the results.
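To make the service-object pattern concrete, here is a small stand-in class written for this article. It is not the StackPHP class and does not mirror its constructor or method signatures; the class name DemoService, the placeholder base URL and the injectable transport are all assumptions. It needs PHP 7.4 or later for typed properties.

```php
<?php
/**
 * Hypothetical stand-in for a service object; NOT the StackPHP class.
 */
final class DemoService
{
    private string $baseUrl;
    private ?string $apiKey;
    /** @var callable Takes a URL, returns the response body as a string. */
    private $transport;

    public function __construct(string $baseUrl, ?string $apiKey, callable $transport)
    {
        $this->baseUrl   = rtrim($baseUrl, '/');
        $this->apiKey    = $apiKey;
        $this->transport = $transport;
    }

    /** Request the /questions endpoint with optional parameters. */
    public function questions(array $params = []): object
    {
        return $this->call('/questions', $params);
    }

    /** Build the URL, hand it to the transport, and decode the reply. */
    private function call(string $path, array $params): object
    {
        if ($this->apiKey !== null) {
            $params['key'] = $this->apiKey;
        }
        $url = $this->baseUrl . $path;
        if ($params !== []) {
            $url .= '?' . http_build_query($params, '', '&');
        }
        $json = ($this->transport)($url);
        return json_decode($json, false, 512, JSON_THROW_ON_ERROR);
    }
}

// A fake transport that reports the requested URL back as JSON, so the
// sketch runs without network access. A real one would use cURL.
$fakeTransport = function (string $url): string {
    return json_encode(['requested' => $url, 'questions' => []]);
};

$service = new DemoService('https://api.example.com/1.1', null, $fakeTransport);
$result  = $service->questions(['pagesize' => 5]);
echo $result->requested, PHP_EOL;
```

Passing the transport in as a callable keeps URL formulation and decoding testable on their own, which is one plausible reason a wrapper library would hide those steps behind a single object.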

Digging Deeper

If you look at the previous examples, you’ll see that the return value of the /questions method only contains question titles and summary information. However, more often than not, you’ll also need the question body, and any answers submitted to that question. This is easily accomplished by adding two optional parameters to the request, specifying that both question and answer bodies should be returned. Here’s a revised example, which illustrates:

In this example, the response to the questions() method is a JSON document containing much more detail: not just the question titles, but also the question bodies, as well as a list of answers submitted for that question. Within the response, an outer array contains the questions with their bodies, owner information and statistics; each question has a sub-array containing answers, together with information on the answer owner, vote count, date and so on. To present all of this information, the script uses nested foreach() loops to process the outer and inner arrays and display the information contained therein as a Web page.
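The nested-loop approach might be sketched like this. The field names used here (questions, title, answers, score, body) are assumptions about the decoded response, and the sample document is hand-written rather than real API output.

```php
<?php
/**
 * Render questions and their answers as an HTML fragment.
 * Field names (questions, title, answers, score, body) are assumptions.
 */
function renderQuestions(object $response): string
{
    $html = '';
    // Outer loop: one heading per question.
    foreach ($response->questions ?? [] as $question) {
        $html .= '<h2>' . htmlspecialchars($question->title) . "</h2>\n";
        $html .= "<ul>\n";
        // Inner loop: one list item per answer to that question.
        foreach ($question->answers ?? [] as $answer) {
            // Bodies may contain HTML; strip tags to keep the sketch safe.
            $text  = htmlspecialchars(strip_tags($answer->body));
            $html .= '<li>[' . (int) $answer->score . '] ' . $text . "</li>\n";
        }
        $html .= "</ul>\n";
    }
    return $html;
}

// Hand-written sample data, not real API output.
$sample = json_decode('{"questions":[{"title":"How do I <b>x</b>?",
  "answers":[{"score":3,"body":"<p>Try y.</p>"}]}]}');
echo renderQuestions($sample);
```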

Here’s an example of what the output looks like:

Answering The Call

If you’re after a specific question and its answers, rather than an entire collection, you can instead use the question() and questionsAnswers() methods, which retrieve a specific question and its answers respectively, by calling the /questions/id and /questions/id/answers endpoints. You can also retrieve data from multiple questions, by specifying a semicolon-delimited list of question identifiers. Here’s an example, which illustrates the process:

This script creates a service object and then calls the question() and questionsAnswers() methods, passing both of them a unique question identifier. As before, it’s easy to iterate over the response objects and convert them into a Web page. Notice that it’s possible to restrict the list of answers by adding filters as additional parameters. This particular script filters the answer list to only include those answers posted within the last 24 hours.

Other possible filters include ‘todate’ (returns answers before a specific date), ‘min’ and ‘max’ (returns answers within a particular range), ‘sort’ (sorts answers by activity, views, votes or creation time), ‘page’ and ‘pagesize’ (return a specified page of answers) and ‘order’ (specifies how answers should be ordered).
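A filter set like the ones listed above is just an associative array. The sketch below builds the path and parameters for answers posted in the last 24 hours; the name 'fromdate', the value 'desc' and the helper recentAnswersRequest() are assumptions for illustration, and dates are assumed to be Unix timestamps.

```php
<?php
/**
 * Build the path and parameters for fetching recent answers to one or
 * more questions. $now is passed in to keep the function deterministic.
 */
function recentAnswersRequest(array $questionIds, int $now): array
{
    // Multiple identifiers are joined with semicolons.
    $ids = implode(';', array_map('intval', $questionIds));

    return [
        'path'   => "/questions/$ids/answers",
        'params' => [
            'fromdate' => $now - 86400, // Unix timestamp, 24 hours ago
            'sort'     => 'votes',
            'order'    => 'desc',
            'pagesize' => 10,
        ],
    ];
}

$request = recentAnswersRequest([101, 202], time());
echo $request['path'], PHP_EOL;
```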

Here’s what the output looks like:

If you need to retrieve a specific answer, you can use the answer() method, which accepts an answer identifier and returns an object containing associated data using the /answers/id API method. Here’s a cutaway of a script that illustrates it in action:

And here’s what the output looks like:

No Comment

Users can also post comments on a question or an answer, to clarify its meaning or add supplementary information. This information is made available via the /questions/id/comments and /answers/id/comments methods of the API, and reflected in the questionsComments() and answersComments() methods of the StackPHP package. These methods must be provided with an appropriate question or answer identifier as argument.

To illustrate, consider the next example, which uses the questionsComments() method to retrieve all the comments posted on a specific question:

This script is very similar to the previous one, except that instead of iterating over and printing the answers to a question, it iterates over and prints the comments on a question. Here’s what the output looks like:

There’s also a comment() method, which accepts a comment identifier and returns the associated data for that comment via the /comments/id API method. Here’s a script that illustrates it in action:

Here’s what the returned object looks like:

Tagged And Bagged

Stack Overflow also supports “tags”, which are short keywords that can be attached to a question. Tags are useful to quickly identify the key subject areas of a question; they can also be used to categorize questions or serve as filters for searches. The API offers a /tags method, which returns a complete list of available tags and usage counts for each. Here’s an example:

Here, the tags() method returns a list of available tags, together with a count of how often each tag is used. Here’s what the output looks like:
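Once the tag list has been decoded, sorting and formatting it takes only a few lines. This sketch assumes each tag object carries 'name' and 'count' properties inside a 'tags' array; those names and the sample data are illustrative, not confirmed API output. Arrow functions require PHP 7.4 or later.

```php
<?php
/**
 * Return "name (count)" lines for the most-used tags, busiest first.
 */
function topTags(object $response, int $limit = 5): array
{
    $tags = $response->tags ?? [];
    // Sort a local copy in descending order of usage count.
    usort($tags, fn ($a, $b) => $b->count <=> $a->count);

    $lines = [];
    foreach (array_slice($tags, 0, $limit) as $tag) {
        $lines[] = sprintf('%s (%d)', $tag->name, $tag->count);
    }
    return $lines;
}

// Hand-written sample data, not real usage counts.
$sample = json_decode('{"tags":[{"name":"php","count":2},{"name":"java","count":5}]}');
echo implode(PHP_EOL, topTags($sample)), PHP_EOL;
```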

Tags can be used to filter the output of any method that returns a list of questions, via the ‘tagged’ parameter. Consider, for example, the next listing, which returns all unanswered questions tagged with ‘php’:

Here’s the output:

Here’s another one, this one returning a list of questions tagged with ‘java’ and ‘mysql’:

Notice that you can specify multiple tags by separating them with a semi-colon. Only questions matching all specified tags will be returned. Here’s the output:
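Joining tags with a semicolon is easy to get subtly wrong when the tags come from user input, so a small helper can be handy. The function taggedParams() below is a hypothetical sketch, not part of any library, and needs PHP 7.4 or later.

```php
<?php
/**
 * Build request parameters that restrict a question list to the given tags.
 */
function taggedParams(array $tags, array $extra = []): array
{
    // Normalise case and whitespace, then drop empties and duplicates.
    $clean = array_map(fn ($t) => strtolower(trim($t)), $tags);
    $clean = array_unique(array_filter($clean, 'strlen'));

    // Multiple tags are joined with semicolons.
    return ['tagged' => implode(';', $clean)] + $extra;
}

$params = taggedParams(['Java', ' mysql ']);
// http_build_query() percent-encodes the semicolon as %3B.
echo http_build_query($params, '', '&'), PHP_EOL;
```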

Search Engine

The Stack Exchange API also provides a /search method, which makes it possible to search the question database by title keyword or tag, and retrieve a collection of matching questions. This method is reflected in the StackPHP library through the search() method, which filters results either by a title keyword (‘intitle’) or the presence or absence of tags (‘tagged’ and ‘nottagged’). Here’s a simple example, which illustrates:

This script creates a simple form for the user to enter one or more search terms. Once the form is submitted, the user input is passed to the search() method, which returns a list of questions matching the search term. The ‘intitle’ parameter is used to specify that the search term must apply in the question title, while the ‘sort’ parameter specifies that results should be sorted by votes.
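The step between the form and the search call, turning raw user input into search parameters, might look something like this. The helper searchParams() and the form field name 'q' are assumptions made for illustration.

```php
<?php
/**
 * Turn raw form input into parameters for a title search sorted by votes.
 * Returns null when the input is empty, so the caller can skip the request.
 */
function searchParams(string $input): ?array
{
    // Collapse runs of whitespace and trim both ends.
    $term = trim(preg_replace('/\s+/', ' ', $input));
    if ($term === '') {
        return null;
    }
    return ['intitle' => $term, 'sort' => 'votes'];
}

// In a form handler this would typically be: searchParams($_GET['q'] ?? '')
var_dump(searchParams('  php   arrays '));
```

Returning null for empty input lets the page simply redisplay the form instead of spending a request from the daily quota.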

It’s important to note that the search() method only returns questions, not associated answers. To obtain the related answers, it’s necessary to use the question() or questionsAnswers() methods and pass them a semicolon-delimited string of question identifiers, setting the ‘answers’ and ‘body’ parameters to ‘true’ to ensure that complete question and answer data is returned. Here’s a revision of the previous example that demonstrates this in action:
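The glue between the two calls is collecting the question identifiers from the search result. Here’s a minimal sketch, assuming each question object exposes a 'question_id' property (an assumption about the response format); the sample data is hand-written.

```php
<?php
/**
 * Collect question identifiers from a search result and join them with
 * semicolons, ready for a follow-up request.
 */
function questionIdList(object $searchResult, int $max = 20): string
{
    $ids = [];
    foreach ($searchResult->questions ?? [] as $question) {
        $ids[] = (int) $question->question_id;
    }
    // Cap the list so the follow-up URL stays a reasonable length.
    return implode(';', array_slice(array_unique($ids), 0, $max));
}

// Hand-written stand-in for a decoded search response.
$search = json_decode('{"questions":[{"question_id":11},{"question_id":42}]}');
$ids    = questionIdList($search);

// Path and parameters for the second call, which fetches bodies and answers.
echo "/questions/$ids?",
    http_build_query(['answers' => 'true', 'body' => 'true'], '', '&'), PHP_EOL;
```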

And now, when you try entering a search term, you should see not only a matching set of questions, but also their associated answers. Here’s what you might see:

Note that this script is actually performing two calls to the service API, and so will usually end up taking a little longer to execute. Since getting the data is an expensive operation, you might want to consider caching it to make subsequent searches for the same term(s) more efficient.
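One simple way to cache results is to store each response in a file named after the search term and reuse it while it’s still fresh. The sketch below is a bare-bones, assumption-laden illustration (the function cached() and the file layout are made up here), not a production cache.

```php
<?php
/**
 * Return a cached value if it is younger than $ttl seconds; otherwise call
 * $producer, store its result on disk, and return it.
 */
function cached(string $key, int $ttl, callable $producer, string $dir): string
{
    $file = $dir . '/' . md5($key) . '.cache';

    // Reuse the stored copy while it is still fresh.
    if (is_file($file) && (time() - filemtime($file)) < $ttl) {
        return file_get_contents($file);
    }

    $value = $producer();
    file_put_contents($file, $value, LOCK_EX);
    return $value;
}

// Demonstration: the producer runs once, the second call hits the cache.
$dir   = sys_get_temp_dir();
$key   = 'search:php arrays:' . uniqid();
$calls = 0;
$fetch = function () use (&$calls): string {
    $calls++;
    return '{"questions":[]}'; // stand-in for an expensive API response
};

cached($key, 300, $fetch, $dir);
cached($key, 300, $fetch, $dir);
echo "Producer calls: $calls", PHP_EOL;
```

In a search page, the key would be the normalised search term and the producer would be the function that performs the two API calls.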

As these examples illustrate, the Stack Exchange API offers a full-featured interface for searching for questions matching a particular keyword or tag, and for retrieving question, answer and comment data. However, this is just the tip of the iceberg: the API also includes methods to work with user profiles, badges and system statistics, and the second part of this article will look at these methods in detail. Make sure you come back for that!