Using Web Services in RapidMiner

The Enrich Data by Webservice operator of the RapidMiner Web Mining Extension allows you to interact with web services in your RapidMiner process.

A web service can be invoked for each example of an example set. (Note that this may be time-consuming.) All strings of the form <%attribute%> in a request are automatically replaced with the corresponding attribute value of the current example. The operator provides several methods for parsing the response, including regular expressions and XPath location paths. By parsing the response, you can add new attributes to your example set.
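The placeholder substitution described above can be illustrated outside RapidMiner with a few lines of Python. This is only a sketch of the idea, not the operator's actual implementation: the function name fill_template, the example.com URL and the dictionary that stands in for an example are all assumptions made for illustration.

```python
import re

# Matches placeholders of the form <%AttributeName%>.
PLACEHOLDER = re.compile(r"<%(.+?)%>")


def fill_template(template, example):
    """Replace each <%name%> in `template` with the value of `name` in `example`.

    `example` is a plain dict standing in for one row of an example set
    (an assumption for this sketch).
    """
    def lookup(match):
        return str(example[match.group(1)])

    return PLACEHOLDER.sub(lookup, template)


# Hypothetical request template and example, for illustration only.
template = "http://example.com/lookup?lat=<%Latitude%>&lon=<%Longitude%>"
example = {"Latitude": 47.555214, "Longitude": 21.621423}

url = fill_template(template, example)
print(url)
```

A placeholder that names a missing attribute raises a KeyError here. How the operator itself treats that case is not covered by this sketch.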

For demonstration purposes we will use the Google Geocoding API. This web service also offers reverse geocoding functionality, i.e. provides a human-readable address for a geographical location. To see how it works, click on the following link: http://maps.googleapis.com/maps/api/geocode/xml?latlng=47.555214,21.621423&sensor=false. Notice that latitude and longitude values are passed to the service in the latlng query string parameter.
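For readers who prefer to see the request spelled out, the following Python sketch assembles the same URL from a latitude and a longitude. The helper name reverse_geocode_url is hypothetical. The sketch only builds the URL string and does not contact the service.

```python
from urllib.parse import urlencode

BASE_URL = "http://maps.googleapis.com/maps/api/geocode/xml"


def reverse_geocode_url(lat, lng):
    """Build a reverse geocoding request URL for the given coordinates."""
    params = {"latlng": f"{lat},{lng}", "sensor": "false"}
    # safe="," keeps the comma between latitude and longitude unescaped,
    # so the result matches the URL shown in the text.
    return f"{BASE_URL}?{urlencode(params, safe=',')}"


url = reverse_geocode_url(47.555214, 21.621423)
print(url)
```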

We will use this data file for our experiment. The file contains earthquake data that originates from the Earthquake Search service provided by the United States Geological Survey (USGS). Consider the following RapidMiner process that is available from here:

A RapidMiner process that uses the Enrich Data by Webservice operator to interact with a web service

First, the data file is read by the Read CSV operator. Then the Sort and Filter Example Range operators are used to select the 50 highest-magnitude earthquakes. Finally, the Enrich Data by Webservice operator invokes the web service to retrieve country names for the geographical locations of these 50 earthquakes. (Only a small subset of the data is used to avoid excessive network traffic.)
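The sort-then-take-a-range step can be sketched in plain Python as follows. This only mirrors the logic of the Sort and Filter Example Range operators and is not how RapidMiner implements them. The attribute name Magnitude and the sample values are assumptions made for illustration; they are not taken from the data file.

```python
def top_n_by(examples, attribute, n):
    """Return the n examples with the largest values of `attribute`.

    Sorting in descending order and slicing the first n rows corresponds
    to sorting the example set and keeping the example range 1..n.
    """
    ranked = sorted(examples, key=lambda e: e[attribute], reverse=True)
    return ranked[:n]


# Made-up rows, for illustration only.
quakes = [
    {"Latitude": 10.0, "Longitude": 20.0, "Magnitude": 5.2},
    {"Latitude": 11.0, "Longitude": 21.0, "Magnitude": 7.8},
    {"Latitude": 12.0, "Longitude": 22.0, "Magnitude": 6.1},
    {"Latitude": 13.0, "Longitude": 23.0, "Magnitude": 4.9},
]

strongest = top_n_by(quakes, "Magnitude", 2)
print([q["Magnitude"] for q in strongest])
```

In the process described here, n would be 50.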

The parameters of the Enrich Data by Webservice operator should be set as follows (see the figure below):

  • Set the value of the query type parameter to XPath
  • Set the value of the attribute type parameter to Nominal
  • Uncheck the checkbox of the assume html parameter
  • Set the value of the request method parameter to GET
  • Set the value of the url parameter to http://maps.googleapis.com/maps/api/geocode/xml?latlng=<%Latitude%>,<%Longitude%>&sensor=false

Parameters of the Enrich Data by Webservice operator

Finally, click the Edit List button next to the xpath queries parameter, which brings up the Edit Parameter List window. Enter the string Country into the attribute name field and the string //result[type = 'country']/formatted_address/text() into the query expression field.

Setting of the xpath queries parameter
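To see what the query expression selects, the same lookup can be tried in Python against a small hand-written XML document. The sample below is a simplified stand-in for a geocoding response, written for this illustration and not captured from the service. Python's xml.etree.ElementTree supports only a subset of XPath and has no text() step, so the sketch reads the element's text instead. The function name extract_country is hypothetical.

```python
import xml.etree.ElementTree as ET

# Simplified, hand-written sample (an assumption, not real service output).
SAMPLE_RESPONSE = """\
<GeocodeResponse>
  <status>OK</status>
  <result>
    <type>locality</type>
    <type>political</type>
    <formatted_address>Debrecen, Hungary</formatted_address>
  </result>
  <result>
    <type>country</type>
    <type>political</type>
    <formatted_address>Hungary</formatted_address>
  </result>
</GeocodeResponse>
"""


def extract_country(xml_text):
    """Return the formatted address of the result whose type is 'country'.

    Returns None when no such result is present.
    """
    root = ET.fromstring(xml_text)
    # ElementTree equivalent of //result[type = 'country']/formatted_address
    node = root.find(".//result[type='country']/formatted_address")
    return node.text if node is not None else None


country = extract_country(SAMPLE_RESPONSE)
print(country)
```

The predicate matches a result element if any of its type children has the text country, which is why the second result is selected even though it carries two type elements.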

That’s all! Unfortunately, running the process results in the following error:

Process Failed

This is a bug that I have already reported to the developers. (See the bug report here.) The following trick works around the problem: set the request method parameter of the Enrich Data by Webservice operator to POST, enter some arbitrary text into the service method parameter, then set the request method parameter back to GET.

The figure below shows the enhanced example set that contains country names provided by the web service (see the Country attribute).

Enhanced example set with country names
