Match Call (Similarity Search)
Service Call:
http://api.semantichacker.com/TOKEN/match/INDEXID?
Overview
The match service call matches the content provided in the call against the content held in one of our indexes; this is Similarity Search. The provided content is converted into a Semantic Signature® and compared against the Semantic Signatures® of all items in the index identified by the index ID. The service returns a list of the content items from the index that are most relevant to the provided content, ordered by "match score" from highest to lowest.
Each content item in the response match list includes a document ID, a match score, and the requested attributes for that content item. An optional fields parameter can be provided in the call to specify which attributes to return for each content item. If the fields parameter is not provided, then a default set of attributes will be used.
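As a minimal sketch of how such a request might be assembled, the Python helper below builds a match URL from the path layout and the uri, nMatches, fields, and format parameters that appear in the example requests on this page. The function name build_match_url and its argument names are illustrative assumptions, not part of the API.

```python
from urllib.parse import urlencode

BASE_URL = "http://api.semantichacker.com"


def build_match_url(token, index_id, uri, n_matches=None, fields=None,
                    response_format=None):
    """Build a GET URL for a match call (hypothetical helper)."""
    params = {"uri": uri}
    if n_matches is not None:
        params["nMatches"] = n_matches
    if fields:
        # The fields parameter is a comma-separated list of attribute names.
        params["fields"] = ",".join(fields)
    if response_format:
        params["format"] = response_format
    # safe="," keeps the commas in the fields list readable; everything
    # else that needs it is percent-encoded.
    return f"{BASE_URL}/{token}/match/{index_id}?{urlencode(params, safe=',')}"


url = build_match_url(
    "TOKEN", "youtube", "http://www.nfl.com",
    n_matches=3, fields=["title", "landingPageUrl", "enclosureUrl"],
)
print(url)
```

Omitting the fields argument leaves the parameter out of the URL, so the service falls back to its default attribute set.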
Available Content Indexes
The INDEXID node of the request path specifies the content index against which the provided content is matched. Following is a list of publicly accessible content indexes that we have made available. In addition to the public indexes, we can support custom indexes made available to you. Please contact us at development@semantichacker.com for more information.
- A licensed custom index that contains your own content has no default field set for a match call. Use the 'fields' parameter to specify which fields to return; if the 'fields' parameter is omitted for a custom index, all stored fields for each item are returned.
- See the facets section on the index service page for details about the SEMANTIC_CATEGORY facet and the facet data types that are available.
Index Item Input Parameters
In addition to the URI and content input mechanisms documented on the common request page, the match call supports using an item from the index as input for the match request. The item is retrieved by using either a content ID or an external ID and is then used as the input for the request.
The content ID (also known as document ID or ID) for an item is the value for the id attribute of a match element in a match call response. The value for the id attribute can be used as the value for the sourceContentId parameter in a match request using the index item input method. The external ID for an item is returned as the externalId element inside of a match element in a match call response. The value for the externalId element can be used as the value for the sourceExternalId parameter in a match request using the index item input method.
Examples
curl -F "sourceExternalId=B001P77X70" "http://api.semantichacker.com/TOKEN/match/amazon"
curl -F "sourceContentId=1048" "http://api.semantichacker.com/TOKEN/match/rssnews"
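The same index item input could be prepared from Python roughly as follows. This is a sketch under assumptions: the helper name is hypothetical, it sends the field as a URL-encoded POST body (the curl examples use multipart form data, and whether the service accepts both is not confirmed here), and it treats supplying both IDs or neither as a caller error. The request is only constructed, not sent.

```python
from urllib.parse import urlencode
from urllib.request import Request


def build_index_item_request(token, index_id, content_id=None, external_id=None):
    """Prepare a match request that uses an index item as its input."""
    # Assumption: exactly one of the two identifiers should be supplied.
    if (content_id is None) == (external_id is None):
        raise ValueError("provide exactly one of content_id or external_id")
    if content_id is not None:
        form = {"sourceContentId": str(content_id)}
    else:
        form = {"sourceExternalId": external_id}
    url = f"http://api.semantichacker.com/{token}/match/{index_id}"
    # Assumption: a URL-encoded form body is accepted in place of multipart.
    return Request(url, data=urlencode(form).encode("ascii"), method="POST")


req = build_index_item_request("TOKEN", "amazon", external_id="B001P77X70")
print(req.get_method(), req.full_url, req.data)
```

Passing the prepared request to urllib.request.urlopen would issue the call once a real token replaces TOKEN.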
Error Codes
There are several cases where error codes will be returned when using the index item input method.
Match Call Parameters
In addition to the common request parameters, the match call has the following optional parameters.
Match Call Facet Parameters
The match call supports several parameters, all optional, for using the facets that have been defined for the index. See the index service facets documentation for details about the facet types that are available and how to define facets for a custom index.
Facet Query Syntax
The syntax for a facet query string is given below. It resembles the "where" clause of SQL and supports parenthetical grouping.
- EXPR
- EXPR: '(' EXPR 'AND' EXPR ')'
- EXPR: '(' EXPR 'OR' EXPR ')'
- EXPR: ATTR OP VALUES
- ATTR: string
- OP: '=' | '!=' | '<' | '>' | '<=' | '>='
- VALUES: VALUE [DELIM VALUE]*
- VALUE: number | string
- DELIM: string
Examples:
color != blue,green
(prodColor = blue,green AND productGroup = shirts)
SEMANTIC_CATEGORY != 568,452,340
(price >= 599.99 OR popularityFunc > 88.5)
(prodColor = blue,green AND (price <= 50.00 AND popularityFunc > 95.0))
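One way to produce strings that follow this grammar is to compose them from small helpers rather than concatenating by hand. The sketch below is illustrative only: clause, both, and either are hypothetical names, and values are passed as strings so that formatting such as 50.00 is preserved.

```python
VALID_OPS = {"=", "!=", "<", ">", "<=", ">="}
RANGE_OPS = {"<", ">", "<=", ">="}


def clause(attr, op, *values, delim=","):
    """Render ATTR OP VALUES, joining multiple values with the delimiter."""
    if op not in VALID_OPS:
        raise ValueError(f"unsupported operator: {op!r}")
    if not values:
        raise ValueError("at least one value is required")
    if op in RANGE_OPS and len(values) != 1:
        # Range operators take a single value.
        raise ValueError(f"{op!r} takes exactly one value")
    return f"{attr} {op} {delim.join(str(v) for v in values)}"


def both(left, right):
    """Render '(' EXPR 'AND' EXPR ')'."""
    return f"({left} AND {right})"


def either(left, right):
    """Render '(' EXPR 'OR' EXPR ')'."""
    return f"({left} OR {right})"


query = both(
    clause("prodColor", "=", "blue", "green"),
    both(clause("price", "<=", "50.00"), clause("popularityFunc", ">", "95.0")),
)
print(query)
```

Because both and either always emit the surrounding parentheses, nested expressions stay well formed by construction.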
Notes:
- DELIM is the facet's multivalue delimiter string that was set in the facet's definition, or ',' (the default).
- Match calls with invalid facet query strings will be returned as illegal argument errors by the API.
- When constructing queries, keep in mind that queries using 'OR' clauses may execute more slowly than queries using 'AND' clauses.
- Multiple values for an '=' clause indicate an implicit OR condition, e.g. color = blue,green means color = blue OR color = green.
- Multiple values for a '!=' clause indicate an implicit AND condition, e.g. SEMANTIC_CATEGORY != 568,452 means SEMANTIC_CATEGORY != 568 AND SEMANTIC_CATEGORY != 452.
- Queries using '>', '<', '>=', or '<=' against numeric facets with multiple values per item will match an item if at least one of the item's values meets the query, regardless of whether the item's other values do.
- An argument string for '>', '<', '>=', or '<=' must be a single value. A multivalue argument string will be rejected.
- Queries using '!=' will match items that do not have a value for that facet.
- Queries using any operation other than '!=' can match an item only if that item has a value for that facet.
- If a match call uses a GET request, the facetQuery parameter value must be URL encoded.
- Expressions that work against strings are case sensitive.
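The matching rules in these notes can be modelled for a single clause with a few lines of Python. This is a local illustration of the described semantics, not the service's implementation. The function name clause_matches is hypothetical, and one point is an assumption: for '!=' on an item with several values, the sketch treats the clause as satisfied only when none of the item's values equals any excluded value.

```python
import operator

RANGE_OPS = {"<": operator.lt, ">": operator.gt,
             "<=": operator.le, ">=": operator.ge}


def clause_matches(item_values, op, query_values):
    """Return True if an item's facet values satisfy one ATTR OP VALUES clause.

    item_values is the list of values the item has for the facet
    (empty when the item has no value for that facet).
    """
    if op == "!=":
        # Implicit AND across the query values; an item with no value matches.
        return all(q not in item_values for q in query_values)
    if not item_values:
        # Every other operator requires the item to have a value.
        return False
    if op == "=":
        # Implicit OR across the query values; string comparison is case sensitive.
        return any(q in item_values for q in query_values)
    if op in RANGE_OPS:
        if len(query_values) != 1:
            raise ValueError("range operators take a single value")
        compare = RANGE_OPS[op]
        # One satisfying value is enough for a multivalue numeric facet.
        return any(compare(v, query_values[0]) for v in item_values)
    raise ValueError(f"unsupported operator: {op!r}")


print(clause_matches(["blue"], "=", ["blue", "green"]))
print(clause_matches([], "!=", [568, 452]))
print(clause_matches([10.0, 99.0], ">", [88.5]))
```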
Facet Only Queries
It is valid to submit a match call that contains only a facetQuery parameter, with no URI or text content for relevance matching. In this case, items that match the facet query are collected and sorted according to the matchRank parameter. Since the value of the matchRank parameter must be the name of a single-value facet, facet-only queries are not possible on indexes for which only multivalue facets are defined.
An example use case for submitting a facet only query would be to query an index of timestamped items for the items in a specified date range, ordered by their timestamp.
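A date-range request of this kind might be built as in the sketch below. The facet name timestamp, the epoch-second values, and the helper name are assumptions made for illustration; the actual facet name and value format depend on how the index's facets were defined. Since the query travels in a GET URL, the facetQuery value is percent-encoded.

```python
from urllib.parse import parse_qs, quote, urlencode, urlsplit


def build_facet_only_url(token, index_id, facet_query, match_rank):
    """Build a GET URL for a facet-only match call (hypothetical helper)."""
    params = {"facetQuery": facet_query, "matchRank": match_rank}
    # quote_via=quote encodes spaces as %20 rather than '+'.
    query_string = urlencode(params, quote_via=quote)
    return f"http://api.semantichacker.com/{token}/match/{index_id}?{query_string}"


# Hypothetical single-value facet "timestamp" holding epoch seconds.
facet_query = "(timestamp >= 1314662400 AND timestamp <= 1314748800)"
url = build_facet_only_url("TOKEN", "INDEXID", facet_query, "timestamp")
print(url)

# Decoding the URL recovers the original query string unchanged.
decoded = parse_qs(urlsplit(url).query)
print(decoded["facetQuery"][0])
```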
Example Match Calls With Facet Parameters
Return index items that are relevant to the http://www.linux.org site, returning facet value counts ordered by document count:
curl -F "includeFacetValueCounts=true" -F "facetValueOrderBy=documentCount" \
  "http://api.semantichacker.com/TOKEN/match/INDEXID?uri=http%3a%2f%2fwww.linux.org"
Return index items that have a price <= 25.00 and belong to the tshirts product group, and are relevant to the http://www.linux.org site:
curl -F "facetQuery=(price <= 25.00 AND productGroup = tshirts)" \
  "http://api.semantichacker.com/TOKEN/match/INDEXID?uri=http%3a%2f%2fwww.linux.org"
Match Call XML Response Example
http://api.semantichacker.com/TOKEN/match/youtube?uri=http%3a%2f%2fwww.nfl.com&nMatches=3&fields=title,landingPageUrl,enclosureUrl
<?xml version="1.0" encoding="UTF-8"?>
<response xmlns="http://www.semantichacker.com/api">
<about>
<requestId>994E0EF89781BB3B0A71F930240B4E5F</requestId>
<docId>12B9CC519FA32800883C988C03ED726A</docId>
<systemType>match</systemType>
<contentType>text/html</contentType>
<contentDigest>E38F117A6ACDEA42CADE983E5666EC92</contentDigest>
<requestDate>2011-08-30T20:04:34+00:00</requestDate>
<systemVersion>2.1</systemVersion>
<sourceUri>http://www.nfl.com</sourceUri>
</about>
<contentMatch>
<contentMatchResponse>
<matches>
<match id="539774" score="0.47137815" indexId="youtube" >
<externalId>http://gdata.youtube.com/feeds/api/videos/kTgVYrOo4DA</externalId>
<attribute name="title">NFL 2011 Chicago Bears Versus New York Giants: Giants 3rd Field Goal HD.</attribute>
<attribute name="landingPageUrl">http://www.youtube.com/watch?v=kTgVYrOo4DA&amp;feature=youtube_gdata</attribute>
<attribute name="enclosureUrl">http://www.youtube.com/v/kTgVYrOo4DA?f=videos&amp;app=youtube_gdata</attribute>
</match>
<match id="539775" score="0.43644464" indexId="youtube" >
<externalId>http://gdata.youtube.com/feeds/api/videos/IxE7hUgk31Q</externalId>
<attribute name="title">NFL 2011 Chicago Bears Versus New York Giants: Bears 2nd Field Goal HD.</attribute>
<attribute name="landingPageUrl">http://www.youtube.com/watch?v=IxE7hUgk31Q&amp;feature=youtube_gdata</attribute>
<attribute name="enclosureUrl">http://www.youtube.com/v/IxE7hUgk31Q?f=videos&amp;app=youtube_gdata</attribute>
</match>
<match id="930248" score="0.40170527" indexId="youtube" >
<externalId>http://gdata.youtube.com/feeds/api/videos/GsvzLrgIX04</externalId>
<attribute name="title">D NATIONAL 2010 SEASON</attribute>
<attribute name="landingPageUrl">http://www.youtube.com/watch?v=GsvzLrgIX04&amp;feature=youtube_gdata</attribute>
<attribute name="enclosureUrl">http://www.youtube.com/v/GsvzLrgIX04?f=videos&amp;app=youtube_gdata</attribute>
</match>
</matches>
</contentMatchResponse>
</contentMatch>
</response>
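A response of this shape can be read with Python's standard ElementTree module, as sketched below. The function name parse_matches and the "sh" namespace prefix are arbitrary choices; the embedded sample is trimmed from the response above and assumes that ampersands in URLs arrive escaped as &amp;amp;, as well-formed XML requires.

```python
import xml.etree.ElementTree as ET

# All elements live in the API's default namespace; "sh" is an arbitrary prefix.
NS = {"sh": "http://www.semantichacker.com/api"}

SAMPLE_XML = """<response xmlns="http://www.semantichacker.com/api">
  <contentMatch><contentMatchResponse><matches>
    <match id="539774" score="0.47137815" indexId="youtube">
      <externalId>http://gdata.youtube.com/feeds/api/videos/kTgVYrOo4DA</externalId>
      <attribute name="title">NFL 2011 Chicago Bears Versus New York Giants: Giants 3rd Field Goal HD.</attribute>
      <attribute name="landingPageUrl">http://www.youtube.com/watch?v=kTgVYrOo4DA&amp;feature=youtube_gdata</attribute>
    </match>
  </matches></contentMatchResponse></contentMatch>
</response>"""


def parse_matches(xml_text):
    """Return one dict per match element, in the order the service listed them."""
    root = ET.fromstring(xml_text)
    results = []
    for match in root.iterfind(".//sh:match", NS):
        attributes = {
            a.get("name"): a.text for a in match.findall("sh:attribute", NS)
        }
        results.append({
            "id": match.get("id"),
            "score": float(match.get("score")),
            "externalId": match.findtext("sh:externalId", namespaces=NS),
            "attributes": attributes,
        })
    return results


matches = parse_matches(SAMPLE_XML)
print(matches[0]["id"], matches[0]["score"])
print(matches[0]["attributes"]["landingPageUrl"])
```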
Match Call JSON Response Example
http://api.semantichacker.com/TOKEN/match/youtube?uri=http%3a%2f%2fwww.nfl.com&nMatches=3&fields=title,landingPageUrl,enclosureUrl&format=JSON
{
"about": {
"requestId": "9012BEE686DC750914E5A14CD82DE530",
"docId": "12B9CC519FA32800883C988C03ED726A",
"systemType": "match",
"contentType": "text/html",
"contentDigest": "E38F117A6ACDEA42CADE983E5666EC92",
"requestDate": "2011-08-30T20:05:56+00:00",
"systemVersion": "2.1",
"sourceUri": "http://www.nfl.com"
},
"contentMatch": {"contentMatchResponse": {"matches": [
{
"id": "539774",
"score": "0.47137815",
"indexId": "youtube",
"externalId": "http://gdata.youtube.com/feeds/api/videos/kTgVYrOo4DA",
"attributes": [
{
"name": "title",
"value": "NFL 2011 Chicago Bears Versus New York Giants: Giants 3rd Field Goal HD."
},
{
"name": "landingPageUrl",
"value": "http://www.youtube.com/watch?v=kTgVYrOo4DA&feature=youtube_gdata"
},
{
"name": "enclosureUrl",
"value": "http://www.youtube.com/v/kTgVYrOo4DA?f=videos&app=youtube_gdata"
}
]
},
{
"id": "539775",
"score": "0.43644464",
"indexId": "youtube",
"externalId": "http://gdata.youtube.com/feeds/api/videos/IxE7hUgk31Q",
"attributes": [
{
"name": "title",
"value": "NFL 2011 Chicago Bears Versus New York Giants: Bears 2nd Field Goal HD."
},
{
"name": "landingPageUrl",
"value": "http://www.youtube.com/watch?v=IxE7hUgk31Q&feature=youtube_gdata"
},
{
"name": "enclosureUrl",
"value": "http://www.youtube.com/v/IxE7hUgk31Q?f=videos&app=youtube_gdata"
}
]
},
{
"id": "930248",
"score": "0.40170527",
"indexId": "youtube",
"externalId": "http://gdata.youtube.com/feeds/api/videos/GsvzLrgIX04",
"attributes": [
{
"name": "title",
"value": "D NATIONAL 2010 SEASON"
},
{
"name": "landingPageUrl",
"value": "http://www.youtube.com/watch?v=GsvzLrgIX04&feature=youtube_gdata"
},
{
"name": "enclosureUrl",
"value": "http://www.youtube.com/v/GsvzLrgIX04?f=videos&app=youtube_gdata"
}
]
}
]}}
}
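The JSON form carries each item's attributes as a list of name/value pairs, which is often easier to work with once turned into a dictionary. The following sketch does that for a trimmed copy of the response above; the function name flatten_matches is a hypothetical choice.

```python
import json

SAMPLE_JSON = """
{
  "contentMatch": {"contentMatchResponse": {"matches": [
    {
      "id": "539774",
      "score": "0.47137815",
      "indexId": "youtube",
      "externalId": "http://gdata.youtube.com/feeds/api/videos/kTgVYrOo4DA",
      "attributes": [
        {"name": "title", "value": "NFL 2011 Chicago Bears Versus New York Giants: Giants 3rd Field Goal HD."},
        {"name": "landingPageUrl", "value": "http://www.youtube.com/watch?v=kTgVYrOo4DA&feature=youtube_gdata"}
      ]
    }
  ]}}
}
"""


def flatten_matches(payload):
    """Convert each match's name/value attribute list into a plain dict."""
    matches = payload["contentMatch"]["contentMatchResponse"]["matches"]
    return [
        {
            "id": m["id"],
            # Scores are delivered as strings in the JSON response.
            "score": float(m["score"]),
            "externalId": m.get("externalId"),
            "attributes": {a["name"]: a["value"] for a in m.get("attributes", [])},
        }
        for m in matches
    ]


items = flatten_matches(json.loads(SAMPLE_JSON))
print(items[0]["id"], items[0]["score"])
print(items[0]["attributes"]["title"])
```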