For example, suppose we have two CouchDB documents like this:
{
  "_id" : "1",
  "type" : "parent",
  "child_ids" : [ "2" ]
}
and
{
  "_id" : "2",
  "type" : "child",
  "parent_id" : [ "1" ]
}
The first step is to get Elasticsearch to recognise these as two different types of documents. This can be achieved using the script filter function in the Elasticsearch CouchDB river plugin like this:
{
  "type" : "couchdb",
  "couchdb" : {
    "host" : "localhost",
    "port" : 5984,
    "db" : "example",
    "script" : "ctx._type = ctx.doc.type"
  },
  "index" : {
    "index" : "example"
  }
}
This simple script takes the type field from the original CouchDB document and uses it to set the mapping type in Elasticsearch. To add the parent/child information, change the script to this:
"script" : "ctx._type = ctx.doc.type; if (ctx._type == 'child') ctx._parent = ctx.doc.parent_id"
Now Elasticsearch has all the information it needs to support multiple document types and parent/child mappings.
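To make the script's effect concrete, here is a minimal Python sketch of the same logic. It is only an illustration: the river runs the script in its own script engine, and the plain dictionary used here as a stand-in for ctx, along with the function name apply_river_script, are assumptions made for this example.

```python
def apply_river_script(ctx):
    """Mimic the river script on a dict standing in for ctx (an assumption)."""
    # Equivalent of: ctx._type = ctx.doc.type
    ctx["_type"] = ctx["doc"]["type"]
    # Equivalent of: if (ctx._type == 'child') ctx._parent = ctx.doc.parent_id
    if ctx["_type"] == "child":
        ctx["_parent"] = ctx["doc"]["parent_id"]
    return ctx


# The two example documents from above, wrapped the way the script sees them.
parent_ctx = apply_river_script(
    {"doc": {"_id": "1", "type": "parent", "child_ids": ["2"]}}
)
child_ctx = apply_river_script(
    {"doc": {"_id": "2", "type": "child", "parent_id": ["1"]}}
)
```

Only the child document ends up with a _parent entry. The parent document just gets its _type set.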
One downside of this approach is that the documents in CouchDB must always have type information available, because the script reads the type from every change it sees. This isn't the case if you just use HTTP DELETE to remove documents, as CouchDB will then retain nothing but the ID and revision. Instead, you must use the bulk operations API to mark documents as deleted while retaining the type information. To delete the above child document, you would do as follows:
POST /example/_bulk_docs HTTP/1.1
{
  "docs" : [{
    "_id" : "2",
    "_rev" : "rev",
    "_deleted" : true,
    "type" : "child"
  }]
}
This preserves the type information while the document still appears as deleted.
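If you issue such deletes from code, a small helper can build the request body. The following Python sketch only constructs the JSON. The function name build_typed_delete is hypothetical, "rev" is a placeholder for the document's real current revision, and sending the body to CouchDB is left to whichever HTTP client you use.

```python
import json


def build_typed_delete(doc_id, rev, doc_type):
    """Build a bulk-docs body that deletes a document but keeps its type."""
    return {
        "docs": [
            {
                "_id": doc_id,
                "_rev": rev,        # must be the document's current revision
                "_deleted": True,   # marks the document as deleted
                "type": doc_type,   # retained so the river script can still read it
            }
        ]
    }


# "rev" is a placeholder, as in the request above.
payload = build_typed_delete("2", "rev", "child")
body = json.dumps(payload)
```

The resulting body string is what would be POSTed to the database's bulk endpoint with a JSON content type.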