Accessing Drupal.org API

I am going to keep today’s DrupalFest post simple and talk about the API for accessing content on drupal.org. The Drupal.org API is a public API that lets you access content such as projects (modules, themes, etc.), issues, pages, and more. It returns data as a simple JSON structure and has only limited support for filtering and for gathering nested data. In this post, I will describe the API and the various ways to access it. It has been a tough day and it is difficult for me to write long posts or go deep in my thoughts. Regardless, I hope this quick post still proves useful.

API basics

The base endpoint for the drupal.org API (henceforth, d.o API) is https://www.drupal.org/api-d7/. In fact, you can probably convert any canonical URL on drupal.org to its API equivalent: prefix the path with “api-d7/” and append “.json”. By canonical, I mean URLs that use Drupal’s internal path, such as node/2773581 or user/314031. These endpoints return JSON responses that are practically the same as Drupal’s internal representation. This means you will notice odd field names (almost all fields begin with “field_”) and nesting that you would not expect from other APIs. If you have programmed for Drupal, chances are you will feel right at home with the response data structure.
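The conversion rule above is mechanical enough to sketch in a few lines of Python. The function name `to_api_url` is made up for this illustration; only the base endpoint and the prefix/suffix rule come from the description above.

```python
from urllib.parse import urlsplit

API_BASE = "https://www.drupal.org/api-d7/"


def to_api_url(canonical: str) -> str:
    """Turn a canonical drupal.org path or URL into its api-d7 JSON URL.

    Hypothetical helper: it only applies the "prefix api-d7/, append .json"
    rule and does not check that the resulting endpoint exists.
    """
    # Accept either a bare path ("node/2773581") or a full URL.
    path = urlsplit(canonical).path if "://" in canonical else canonical
    path = path.strip("/")
    if not path:
        raise ValueError("expected a canonical path such as 'node/2773581'")
    return f"{API_BASE}{path}.json"


print(to_api_url("node/2773581"))
# https://www.drupal.org/api-d7/node/2773581.json
```

Path aliases (the friendly URLs you usually see in the browser) are not handled here; the sketch assumes you already have the internal path.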

The endpoints that return listings of entities are named simply: node.json, user.json, and so on. They accept a variety of filters and allow pagination through simple query parameters. Most field names can be used directly as query parameters to filter the list on that field’s value. For example, https://www.drupal.org/api-d7/node.json?type=project_issue lists all the issues on drupal.org, whereas https://www.drupal.org/api-d7/node.json?type=project_issue&field_project=3158507 lists only the issues in the preloader project (preloader’s node ID on drupal.org is 3158507).
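If you build these listing URLs in code, it helps to let a URL library do the encoding. Here is a minimal sketch; `listing_url` is a hypothetical helper, and the filter names in the example are the ones used in the URLs above. Any other parameter you pass (for pagination, say) should be checked against the documentation first.

```python
from urllib.parse import urlencode

API_BASE = "https://www.drupal.org/api-d7/"


def listing_url(resource: str, **filters) -> str:
    """Build a listing URL such as node.json?type=project_issue.

    Hypothetical helper: every keyword argument becomes a query parameter,
    and nothing here validates that the API accepts that parameter.
    """
    url = f"{API_BASE}{resource}.json"
    query = urlencode(filters)  # keeps the keyword order and escapes values
    return f"{url}?{query}" if query else url


print(listing_url("node", type="project_issue", field_project=3158507))
# https://www.drupal.org/api-d7/node.json?type=project_issue&field_project=3158507
```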

Read the documentation page for examples and more details such as field names and values.

Shortcomings in the API

As you might have surmised from the description above, this API is not designed for general consumption. The Drupal Association maintains it on a best-effort basis, and more sophisticated use cases (such as gathering nested data in a single request) are not supported. There is a plan to improve the API as a whole, but I don’t know when that might happen.

Practically speaking, this means that you have to make multiple API calls if you want to collect all the information about an entity. The first call fetches the main entity you are interested in. You then have to parse the response to gather all the referenced IDs and make an API call for each of them. If you want to build an efficient parser that deals with a lot of nodes, you will probably have to persist all of this information in your application (which is what I did with DruStats).
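The "parse it and gather the referenced IDs" step could look roughly like the sketch below. It rests on an assumption about the response shape: that each reference appears as a small object carrying a `resource` (the entity type) and an `id`. Inspect a real response before relying on that. The function name and the sample entity are invented for illustration.

```python
from typing import Any, Set, Tuple


def collect_references(entity: Any) -> Set[Tuple[str, str]]:
    """Walk a decoded JSON entity and return (resource, id) pairs.

    Assumption: a reference is a dict that has both "resource" and "id"
    keys. Anything else is treated as plain data and searched recursively.
    """
    refs: Set[Tuple[str, str]] = set()

    def walk(value: Any) -> None:
        if isinstance(value, dict):
            if "resource" in value and "id" in value:
                refs.add((str(value["resource"]), str(value["id"])))
            else:
                for child in value.values():
                    walk(child)
        elif isinstance(value, list):
            for item in value:
                walk(item)

    walk(entity)
    return refs


# Invented sample, shaped according to the assumption above.
issue = {
    "type": "project_issue",
    "author": {"id": "314031", "resource": "user"},
    "field_project": {"id": "3158507", "resource": "node"},
}
print(sorted(collect_references(issue)))
# [('node', '3158507'), ('user', '314031')]
```

Because the result is a set, an ID referenced from several fields is only fetched once, which matters when every reference costs a separate request.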

The problem is not limited to ordinary relationships such as terms and users; it also affects other reference fields, such as field collections. For example, if you want to find out a user’s organization, you have to read the “field_organizations” property and make a request to the “field_collection_item” endpoint with that ID. In a normal consumer-grade API, you would probably expect this information to be embedded right in the user response.

Using the API in your code

The API endpoints are straightforward and you can request data with a simple curl request or just the browser. However, if you are writing an application, you might get frustrated dealing with the filters and nested queries. This is where libraries come in.
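Before reaching for a library, it is worth seeing how little code a plain request takes. The sketch below uses only the Python standard library; the `fetch_json` name, the `opener` parameter (there so the function can be tested without the network), and the User-Agent string are all my own choices for this example, not anything the API prescribes.

```python
import json
from urllib.request import Request, urlopen


def fetch_json(url: str, opener=urlopen, timeout: float = 30):
    """Fetch one d.o API URL and decode the JSON body.

    Hypothetical helper: no retries, caching, or rate limiting, all of
    which a real application talking to drupal.org would want.
    """
    request = Request(
        url,
        headers={
            "Accept": "application/json",
            # Placeholder value; identify your own application here.
            "User-Agent": "do-api-example/0.1",
        },
    )
    with opener(request, timeout=timeout) as response:
        return json.load(response)
```

Calling `fetch_json("https://www.drupal.org/api-d7/node/2773581.json")` would return the decoded node as a Python dictionary. Everything beyond this one call (filters, following references, persisting results) is what the libraries take care of.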

The d.o API documentation page lists two such libraries at the end. The first is by Kris Vanderwater (EclipseGc) and it seems simple enough to use. The second was written by me when I was building DruStats: I needed something more sophisticated, so I decided to write my own API wrapper. Since then, I have also used this library for Contrib Tracker, a project that tracks contributions to drupal.org by multiple users. The library is reasonably documented on its GitHub page, but improvements are always welcome. You can also look at the examples in DruStats and Contrib Tracker. I am currently in the process of moving Contrib Tracker to a new home, so I am not linking to the current URL right now. I am also planning a post on Contrib Tracker soon.

Using the CLI tool

Matt Glaman has written a CLI tool to interact with drupal.org issues. If you only want to automate your Drupal contribution workflow, this tool might be what you need. It simplifies working with Drupal.org patches, creating interdiffs, and even watching CI jobs. As always, the documentation on GitHub can guide you through installing and using it.