In a collection with the following general structure:
{_id: 'id1', clientId: 'cid1', clientName:'Jon', item: 'item1', dateOfPurchase: '...'},
{_id: 'id2', clientId: 'cid1', clientName:'Jon', item: 'item2', dateOfPurchase: '...'},
{_id: 'id3', clientId: 'cid2', clientName:'Doe', item: 'itemX', dateOfPurchase: '...'}
... etc
The objective is to group the documents by clientId in order to calculate some simple statistics, e.g. the total number of occurrences per clientId.
One way to achieve this with the Collection.group method of the Node.js MongoDB driver is:
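db.collection.group(
  ['clientId'],            // keys to group by
  {},                      // condition: match all documents
  { count: 0 },            // initial accumulator for each group
  function (obj, prev) {   // reducer, called once per document
    prev.count++;
  },
  true                     // run via the group command
);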
The output of this for the sample data above would be similar to:
{clientId: 'cid1', count: 2}
{clientId: 'cid2', count: 1}
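To make the grouping semantics concrete, here is a minimal sketch in plain JavaScript that mimics what the call above does, with no database involved. `emulateGroup` is a hypothetical helper written only for illustration and is not part of the driver; the sample documents omit dateOfPurchase because the counting reducer does not use it.

```javascript
// Hypothetical local emulation of the grouping semantics (NOT the driver API).
function emulateGroup(docs, keys, initial, reduce) {
  const groups = new Map();
  for (const doc of docs) {
    const groupId = JSON.stringify(keys.map((k) => doc[k]));
    if (!groups.has(groupId)) {
      // Each group starts from the key fields plus a fresh copy of `initial`.
      const acc = {};
      keys.forEach((k) => { acc[k] = doc[k]; });
      groups.set(groupId, Object.assign(acc, JSON.parse(JSON.stringify(initial))));
    }
    // The reducer mutates the accumulator of the group the document belongs to.
    reduce(doc, groups.get(groupId));
  }
  return Array.from(groups.values());
}

const sampleDocs = [
  { _id: 'id1', clientId: 'cid1', clientName: 'Jon', item: 'item1' },
  { _id: 'id2', clientId: 'cid1', clientName: 'Jon', item: 'item2' },
  { _id: 'id3', clientId: 'cid2', clientName: 'Doe', item: 'itemX' }
];

const counts = emulateGroup(sampleDocs, ['clientId'], { count: 0 }, function (obj, prev) {
  prev.count++;
});

console.log(counts);
// [ { clientId: 'cid1', count: 2 }, { clientId: 'cid2', count: 1 } ]
```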
Question 1: What is the best way to pass external values to the reducer function? For example, I may want to calculate separate counts for purchases made before and after a specific date, and pass that date in as a parameter. I know that with mapReduce I can use the scope option for this purpose. I'm wondering if there is a way to do the same with the group function. I could use the iterator object, but it feels hacky.
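For reference, the "hacky" option mentioned above could look roughly like the sketch below: the cutoff date travels inside the initial accumulator object, the reducer reads it from `prev`, and a finalize step strips it from the result. This runs against a hypothetical local emulation (`emulateGroup`), not the driver, and the dates and the `cutoff`, `before` and `after` field names are made up for the example.

```javascript
// Hypothetical local emulation of grouping with a finalize step (NOT the driver API).
function emulateGroup(docs, keys, initial, reduce, finalize) {
  const groups = new Map();
  for (const doc of docs) {
    const groupId = JSON.stringify(keys.map((k) => doc[k]));
    if (!groups.has(groupId)) {
      const acc = {};
      keys.forEach((k) => { acc[k] = doc[k]; });
      groups.set(groupId, Object.assign(acc, JSON.parse(JSON.stringify(initial))));
    }
    reduce(doc, groups.get(groupId));
  }
  const results = Array.from(groups.values());
  if (finalize) results.forEach(finalize);
  return results;
}

// Assumed sample data: ISO date strings, so plain string comparison orders them.
const purchases = [
  { _id: 'id1', clientId: 'cid1', item: 'item1', dateOfPurchase: '2015-01-05' },
  { _id: 'id2', clientId: 'cid1', item: 'item2', dateOfPurchase: '2015-02-20' },
  { _id: 'id3', clientId: 'cid2', item: 'itemX', dateOfPurchase: '2015-03-01' }
];

const byDate = emulateGroup(
  purchases,
  ['clientId'],
  { before: 0, after: 0, cutoff: '2015-02-01' }, // the external value rides along here
  function (obj, prev) {
    if (obj.dateOfPurchase < prev.cutoff) prev.before++;
    else prev.after++;
  },
  function (result) {
    delete result.cutoff; // remove the helper field from the output
  }
);

console.log(byDate);
// [ { clientId: 'cid1', before: 1, after: 1 }, { clientId: 'cid2', before: 0, after: 1 } ]
```

The drawback is visible here: the parameter is copied into every group's accumulator and has to be cleaned up afterwards, which is why a scope-like mechanism would be preferable.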
Question 2: Is there a way to access the original document from inside the finalize function in order to include some extra data in the results, i.e. to project extra fields from the original documents, such as clientName:
{clientId: 'cid1', count: 2, clientName: 'Jon'}
{clientId: 'cid2', count: 1, clientName: 'Doe'}
Clarifications for Question 2: a) I could add the extra field inside the reducer function, but it feels redundant to include code there that does not need to run on every iteration. b) I could use aggregation pipelines to achieve something like this, but I'm wondering if I can do it with Collection.group.
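As an illustration of option a), the reducer can copy the extra field only when the accumulator does not have it yet, so the assignment runs once per group even though the check runs on every document. This is a sketch against a hypothetical local emulation (`emulateGroup`), not the driver itself.

```javascript
// Hypothetical local emulation of the grouping semantics (NOT the driver API).
function emulateGroup(docs, keys, initial, reduce) {
  const groups = new Map();
  for (const doc of docs) {
    const groupId = JSON.stringify(keys.map((k) => doc[k]));
    if (!groups.has(groupId)) {
      const acc = {};
      keys.forEach((k) => { acc[k] = doc[k]; });
      groups.set(groupId, Object.assign(acc, JSON.parse(JSON.stringify(initial))));
    }
    reduce(doc, groups.get(groupId));
  }
  return Array.from(groups.values());
}

const docs = [
  { _id: 'id1', clientId: 'cid1', clientName: 'Jon', item: 'item1' },
  { _id: 'id2', clientId: 'cid1', clientName: 'Jon', item: 'item2' },
  { _id: 'id3', clientId: 'cid2', clientName: 'Doe', item: 'itemX' }
];

const withNames = emulateGroup(docs, ['clientId'], { count: 0 }, function (obj, prev) {
  prev.count++;
  // Copy the extra field from the first document seen in each group.
  if (prev.clientName === undefined) {
    prev.clientName = obj.clientName;
  }
});

console.log(withNames);
// [ { clientId: 'cid1', count: 2, clientName: 'Jon' },
//   { clientId: 'cid2', count: 1, clientName: 'Doe' } ]
```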
via gevou