Thursday, July 21, 2016

Goal-oriented API

In the previous article, production rule systems were used to define the behavior of a frontend UI. In this one I'll show how to use Hierarchical Task Networks (HTN) to define the dynamic behavior of an API (in other words, its workflows). You can find out more about HTN in the free Game AI Pro chapter Exploring HTN Planners through Example.
So what is wrong (or missing) in most API definitions nowadays, whether the contract is defined via a Java interface or a REST API described via Swagger? They define operations but not workflows, that is, how to combine the provided operations to achieve your goal. Below is one way this might be done.

There have been some attempts to solve this, though. REST proposes HATEOAS: a hypermedia-driven site provides information to navigate the site's REST interfaces dynamically by including hypermedia links with the responses (see What is HATEOAS and why is it important for my REST API? from restcookbook).

The idea is:
A REST API must not define fixed resource names or hierarchies (an obvious coupling of client and server). Servers must have the freedom to control their own namespace. Instead, allow servers to instruct clients on how to construct appropriate URIs, such as is done in HTML forms and URI templates, by defining those instructions within media types and link relations. [Failure here implies that clients are assuming a resource structure due to out-of band information, such as a domain-specific standard, which is the data-oriented equivalent to RPC’s functional coupling]. (From the famous rant by Roy T. Fielding, the author of the REST paradigm.)
This gained little adoption on the client side. Why? Consider the following example. User A needs to send a message to user B, but posting to the Message resource fails because a) user A is not authenticated (401 Unauthorized HTTP response), or b) user A must be in user B's contacts in order to send a message.
The preconditions for an action are not addressed by HATEOAS or any other part of REST. The client application must either hard-code reactions to specific error codes or hard-code the workflow: a) check that the user is logged in; b) check that the user is in the recipient's contacts and, if not, send a contact request; c) once all preconditions are met, send the message. More generally, if you want to change the state of Resource A to X, and Resource B must be in state Y for that to be possible, this information is not exposed by REST and must be known on the client side beforehand.
And this is coupling. Consider the following change request: a user can specify in their profile that anybody can send them a message. Now all clients must update the hardcoded workflow and add a check for that toggle before sending a message.
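To make the coupling concrete, here is a minimal sketch of such a hardcoded client workflow. The `FakeServer` class and all of its method names are hypothetical stand-ins invented for illustration, not a real API, and the contact request is assumed to be accepted immediately.

```python
class FakeServer:
    """Hypothetical in-memory stand-in for the messaging backend."""

    def __init__(self):
        self.logged_in = set()
        self.contacts = set()   # (sender, recipient) pairs
        self.sent = []

    def is_logged_in(self, user):
        return user in self.logged_in

    def login(self, user):
        self.logged_in.add(user)

    def in_contacts(self, sender, recipient):
        return (sender, recipient) in self.contacts

    def send_contact_request(self, sender, recipient):
        # Assumption: the request is accepted at once.
        self.contacts.add((sender, recipient))

    def send_message(self, sender, recipient, text):
        self.sent.append((sender, recipient, text))


def send(server, sender, recipient, text):
    """Workflow baked into the client.

    Any new server-side rule (for example an "anybody can message me"
    toggle) means editing and redeploying this function in every client.
    """
    if not server.is_logged_in(sender):
        server.login(sender)
    if not server.in_contacts(sender, recipient):
        server.send_contact_request(sender, recipient)
    server.send_message(sender, recipient, text)
```

Supporting the new profile toggle would require another branch inside `send`, repeated in each client that implements the workflow.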

Sidenote: REST can be awkward in other ways. If you are curious, see my presentation Frontend-backend communication evolution, given at IT Weekend Kharkiv.

So how can HTN help with this? Take a look at the "travel from home to the park" example for the pyhop HTN planner to understand what methods and operations (operators, in pyhop's terms) are. In our example there will be one method, send(fromUser, toUser), and four operations: login(username, password), sendContactRequest(fromUser, toUser), waitForRequestAccepted(user), and sendMessage(fromUser, toUser, message). The planner can decompose the send method into operations in the following ways (plans):
  1. login, sendContactRequest, waitForRequestAccepted, sendMessage
  2. sendContactRequest, waitForRequestAccepted, sendMessage - if the user is already logged in
  3. login, sendMessage - if the recipient specified in their profile that anybody can send them a message
  4. sendMessage - if the user is logged in and the recipient specified in their profile that anybody can send them a message
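The four plans above can be reproduced with a very small HTN-style planner. This is a sketch written for this example, not pyhop itself: the state layout, the helper tasks `authenticate` and `connect`, and the simplified operation arguments (no password or message text) are all assumptions made for illustration.

```python
from copy import deepcopy

# Operations (primitive steps): return the new state, or None if not applicable.
def login(state, user):
    state["logged_in"].add(user)
    return state

def send_contact_request(state, sender, recipient):
    if sender not in state["logged_in"]:
        return None
    state["pending"].add((sender, recipient))
    return state

def wait_for_request_accepted(state, sender, recipient):
    if (sender, recipient) not in state["pending"]:
        return None
    state["pending"].discard((sender, recipient))
    state["contacts"].add((sender, recipient))
    return state

def send_message(state, sender, recipient):
    allowed = (recipient in state["open_inbox"]
               or (sender, recipient) in state["contacts"])
    return state if sender in state["logged_in"] and allowed else None

OPERATIONS = {
    "login": login,
    "sendContactRequest": send_contact_request,
    "waitForRequestAccepted": wait_for_request_accepted,
    "sendMessage": send_message,
}

# Methods: decompose a compound task into subtasks, or return None.
def already_logged_in(state, user):
    return [] if user in state["logged_in"] else None

def already_allowed(state, a, b):
    ok = b in state["open_inbox"] or (a, b) in state["contacts"]
    return [] if ok else None

METHODS = {
    "send": [lambda s, a, b: [("authenticate", a), ("connect", a, b),
                              ("sendMessage", a, b)]],
    "authenticate": [already_logged_in, lambda s, u: [("login", u)]],
    "connect": [already_allowed,
                lambda s, a, b: [("sendContactRequest", a, b),
                                 ("waitForRequestAccepted", a, b)]],
}

def find_plan(state, tasks):
    """Depth-first decomposition; returns a list of operations or None."""
    if not tasks:
        return []
    (name, *args), rest = tasks[0], tasks[1:]
    if name in OPERATIONS:
        new_state = OPERATIONS[name](deepcopy(state), *args)
        if new_state is None:
            return None
        tail = find_plan(new_state, rest)
        return None if tail is None else [(name, *args)] + tail
    for method in METHODS[name]:
        subtasks = method(state, *args)
        if subtasks is not None:
            plan = find_plan(state, subtasks + rest)
            if plan is not None:
                return plan
    return None
```

Calling `find_plan(state, [("send", "a", "b")])` with different combinations of `logged_in` and `open_inbox` yields each of the four plans listed above; a change of rules only touches the methods and operations on the planner side.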
Thus clients operate in the following way: to achieve some goal represented by a method (for instance, sending a message from User A to User B), the client asks the server for a plan; the server returns a plan (or a set of possible plans), that is, a list of operations to perform; the client then performs those operations, which may require interaction with the user or a third-party service. Note also that a plan can change over time, so re-evaluate the plan upon error or periodically (for instance, in Killzone 2 plans are re-evaluated at 5 Hz).
In other words, the client becomes a specialized browser that can perform a set of functions (methods/goals) by implementing a set of operations and asking the server which operations to run for a given function (method/goal). In this way all workflow logic is kept in one place, the server, and does not need additional documentation or a separate implementation on the client side.
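The client side of this scheme might look like the sketch below. The names `run_goal`, `fetch_plan`, `handlers` and `OperationFailed` are hypothetical; `fetch_plan` stands in for whatever call asks the server for a plan, and re-planning happens only on error, not periodically.

```python
class OperationFailed(Exception):
    """Raised by an operation handler when its step cannot be completed."""


def run_goal(fetch_plan, handlers, goal, max_replans=3):
    """Ask for a plan, execute it step by step, and re-plan on failure.

    fetch_plan(goal) -> list of operation names (assumed server call)
    handlers         -> dict mapping operation name to a client function
    Returns True once a whole plan has been executed, False otherwise.
    """
    for _ in range(max_replans + 1):
        plan = fetch_plan(goal)          # the plan may differ on each attempt
        try:
            for operation in plan:
                handlers[operation](goal)
            return True
        except OperationFailed:
            continue                     # state may have changed: ask again
    return False
```

The client knows only how to perform individual operations; which ones to run, and in what order, is decided by the server each time `fetch_plan` is called.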