Saturday, 13 May 2017

Mongo / Mongoose Aggregation - $redact and $cond issues

I was fortunate enough to get an awesome answer to another SO question Mongo / Mongoose - Aggregating by Date from @chridam which given a set of documents like:

{ "_id" : ObjectId("5907a5850b459d4fdcdf49ac"), "amount" : -33.3, "name" : "RINGGO", "method" : "VIS", "date" : ISODate("2017-04-26T23:00:00Z"), "importDate" : ISODate("2017-05-01T21:15:49.581Z"), "category" : "Not Set", "__v" : 0 }
{ "_id" : ObjectId("5907a5850b459d4fdcdf49ba"), "amount" : -61.3, "name" : "Amazon", "method" : "VIS", "date" : ISODate("2017-03-23T00:00:00Z"), "importDate" : ISODate("2017-05-01T21:15:49.592Z"), "category" : "Not Set", "__v" : 0 }
{ "_id" : ObjectId("5907a5850b459d4fdcdf49ce"), "amount" : -3.3, "name" : "Tesco", "method" : "VIS", "date" : ISODate("2017-03-15T00:00:00Z"), "importDate" : ISODate("2017-05-01T21:15:49.601Z"), "category" : "Not Set", "__v" : 0 }
{ "_id" : ObjectId("5907a5850b459d4fdcdf49cc"), "amount" : -26.3, "name" : "RINGGO", "method" : "VIS", "date" : ISODate("2017-03-16T00:00:00Z"), "importDate" : ISODate("2017-05-01T21:15:49.600Z"), "category" : "Not Set", "__v" : 0 }
{ "_id" : ObjectId("5907a5850b459d4fdcdf49f7"), "amount" : -63.3, "name" : "Sky", "method" : "VIS", "date" : ISODate("2017-03-02T00:00:00Z"), "importDate" : ISODate("2017-05-01T21:15:49.617Z"), "category" : "Not Set", "__v" : 0 }
{ "_id" : ObjectId("5907a5850b459d4fdcdf49be"), "amount" : -3.3, "name" : "RINGGO", "method" : "VIS", "date" : ISODate("2017-03-22T00:00:00Z"), "importDate" : ISODate("2017-05-01T21:15:49.593Z"), "category" : "Not Set", "__v" : 0 }

required a query that would aggregate the spend by vendor, year, month and week. The query is below and it almost works fantastically but as I hae used it in my application I have noticed a significant problem

db.statements.aggregate([
  { "$match": { "name": "RINGGO" } },
  {
  "$redact": {
      "$cond": [
          {
              "$and": [
                 { "$eq": [{ "$year": "$date" },  2017  ]}, // within my route this uses parseInt(req.params.year)
                 { "$eq": [{ "$month": "$date" }, 3 ]}, // within my route this uses parseInt(req.params.month)
                 { "$eq": [{ "$week": "$date" },  12  ]} // within my route this uses parseInt(req.params.week)
            ]
        },
        "$$KEEP",
        "$$PRUNE"
    ]
}
},{
    "$group": {
        "_id": {
            "name": "$name",
            "year": { "$year": "$date" },
            "month": { "$month": "$date" },
            "week": { "$week": "$date" }
        },
        "total": { "$sum": "$amount" }
    }
},
{
    "$group": {
        "_id": {
            "name": "$_id.name",
            "year": "$_id.year"
        },
        "YearlySpends": { "$push": "$total" },
        "totalYearlyAmount": { "$sum": "$total" },
        "data": { "$push": "$$ROOT" }
    }
},
{ "$unwind": "$data" },
{
    "$group": {
        "_id": {
            "name": "$_id.name",
            "month": "$data._id.month"
        },
        "YearlySpends": { "$first": "$YearlySpends" },
        "totalYearlyAmount": { "$first": "$totalYearlyAmount" },
        "MonthlySpends": { "$push": "$data.total" },
        "totalMonthlyAmount": { "$sum": "$data.total" },
        "data": { "$push": "$data" }
    }
},
{ "$unwind": "$data" },
{
    "$group": {
        "_id": {
            "name": "$_id.name",
            "week": "$data._id.week"
        },
        "YearlySpends": { "$first": "$YearlySpends" },
        "totalYearlyAmount": { "$first": "$totalYearlyAmount" },
        "MonthlySpends": { "$first": "$MonthlySpends" },
        "totalMonthlyAmount": { "$first": "$totalMonthlyAmount" },
        "WeeklySpends": { "$push": "$data.total" },
        "totalWeeklyAmount": { "$sum": "$data.total" },
        "data": { "$push": "$data" }
    }
},
{ "$unwind": "$data" },
{
    "$group": {
        "_id": "$data._id",
        "YearlySpends": { "$first": "$YearlySpends" },
        "totalYearlyAmount": { "$first": "$totalYearlyAmount" },
        "MonthlySpends": { "$first": "$MonthlySpends" },
        "totalMonthlyAmount": { "$first": "$totalMonthlyAmount" },
        "WeeklySpends": { "$first": "$WeeklySpends" },
        "totalWeeklyAmount": { "$first": "$totalWeeklyAmount" }
    }
}
])

Running this query returns

{ "_id" :
 { "name" : "RINGGO", 
   "year" : 2017, 
   "month" : 3, 
   "week" : 12 }, 
   "YearlySpends" : [ -9.6 ], 
   "totalYearlyAmount" : -9.6, 
   "MonthlySpends" : [ -9.6 ], 
   "totalMonthlyAmount" : -9.6, 
   "WeeklySpends" : [ -9.6 ], 
   "totalWeeklyAmount" : -9.6 
}

And when I change to wanting to see the month's spending

"$cond": [
          {
            "$and": [
                 { "$eq": [{ "$year": "$date" },  2017  ]},
                 { "$eq": [{ "$month": "$date" }, 3 ]}
            ]
          },
        "$$KEEP",
        "$$PRUNE"
      ]

I get

{ "_id" : { "name" : "RINGGO", "year" : 2017, "month" : 3, "week" : 12 }, "YearlySpends" : [ -3.3, -9.6, -9.6, -3.3 ], "totalYearlyAmount" : -25.799999999999997, "MonthlySpends" : [ -3.3, -9.6, -9.6, -3.3 ], "totalMonthlyAmount" : -25.799999999999997, "WeeklySpends" : [ -9.6 ], "totalWeeklyAmount" : -9.6 }
{ "_id" : { "name" : "RINGGO", "year" : 2017, "month" : 3, "week" : 9 }, "YearlySpends" : [ -3.3, -9.6, -9.6, -3.3 ], "totalYearlyAmount" : -25.799999999999997, "MonthlySpends" : [ -3.3, -9.6, -9.6, -3.3 ], "totalMonthlyAmount" : -25.799999999999997, "WeeklySpends" : [ -3.3 ], "totalWeeklyAmount" : -3.3 }
{ "_id" : { "name" : "RINGGO", "year" : 2017, "month" : 3, "week" : 11 }, "YearlySpends" : [ -3.3, -9.6, -9.6, -3.3 ], "totalYearlyAmount" : -25.799999999999997, "MonthlySpends" : [ -3.3, -9.6, -9.6, -3.3 ], "totalMonthlyAmount" : -25.799999999999997, "WeeklySpends" : [ -9.6 ], "totalWeeklyAmount" : -9.6 }
{ "_id" : { "name" : "RINGGO", "year" : 2017, "month" : 3, "week" : 13 }, "YearlySpends" : [ -3.3, -9.6, -9.6, -3.3 ], "totalYearlyAmount" : -25.799999999999997, "MonthlySpends" : [ -3.3, -9.6, -9.6, -3.3 ], "totalMonthlyAmount" : -25.799999999999997, "WeeklySpends" : [ -3.3 ], "totalWeeklyAmount" : -3.3 }

However when I run a simple db.statements.find({"name":"RINGGO"}) I get

{ "_id" : ObjectId("5907a5850b459d4fdcdf49ac"), "amount" : -3.3, "name" : "RINGGO", "method" : "VIS", "date" : ISODate("2017-03-26T23:00:00Z"), "importDate" : ISODate("2017-05-01T21:15:49.581Z"), "category" : "Not Set", "__v" : 0 }
{ "_id" : ObjectId("5907a5850b459d4fdcdf49ba"), "amount" : -6.3, "name" : "RINGGO", "method" : "VIS", "date" : ISODate("2017-03-23T00:00:00Z"), "importDate" : ISODate("2017-05-01T21:15:49.592Z"), "category" : "Not Set", "__v" : 0 }
{ "_id" : ObjectId("5907a5850b459d4fdcdf49ce"), "amount" : -3.3, "name" : "RINGGO", "method" : "VIS", "date" : ISODate("2017-03-15T00:00:00Z"), "importDate" : ISODate("2017-05-01T21:15:49.601Z"), "category" : "Not Set", "__v" : 0 }
{ "_id" : ObjectId("5907a5850b459d4fdcdf49cc"), "amount" : -6.3, "name" : "RINGGO", "method" : "VIS", "date" : ISODate("2017-03-16T00:00:00Z"), "importDate" : ISODate("2017-05-01T21:15:49.600Z"), "category" : "Not Set", "__v" : 0 }
{ "_id" : ObjectId("5907a5850b459d4fdcdf49f7"), "amount" : -3.3, "name" : "RINGGO", "method" : "VIS", "date" : ISODate("2017-03-02T00:00:00Z"), "importDate" : ISODate("2017-05-01T21:15:49.617Z"), "category" : "Not Set", "__v" : 0 }
{ "_id" : ObjectId("5907a5850b459d4fdcdf49be"), "amount" : -3.3, "name" : "RINGGO", "method" : "VIS", "date" : ISODate("2017-03-22T00:00:00Z"), "importDate" : ISODate("2017-05-01T21:15:49.593Z"), "category" : "Not Set", "__v" : 0 }

So you can see that there is a different number of items in MonthlySpends in previous output compared to that shown in the output from the find by name. Also you can see that some of the values are being summed together in MonthlySpends when they shouldn't be.

Ideally I'm looking to get to an output which: when I have $redact containing:

"$cond": [
        {
            "$and": [
                 { "$eq": [{ "$year": "$date" },  2017  ]}, 
                 { "$eq": [{ "$month": "$date" }, 3 ]}, 
                 { "$eq": [{ "$week": "$date" },  12  ]} 
            ]
        },
        "$$KEEP",
        "$$PRUNE"
    ]

returns

{ "_id" : { "name" : "RINGGO", "year" : 2017, "month" : 3, "week" : 12 }, "YearlySpends" : [ -3.3, -9.6, -9.6, -3.3 ], "totalYearlyAmount" : -25.799999999999997, "MonthlySpends" : [ -3.3, -9.6, -9.6, -3.3 ], "totalMonthlyAmount" : -25.799999999999997, "WeeklySpends" : [ -9.6 ], "totalWeeklyAmount" : -9.6 }

when I have $redact containing:

"$cond": [
        {
            "$and": [
                 { "$eq": [{ "$year": "$date" },  2017  ]}, 
                 { "$eq": [{ "$month": "$date" }, 3 ]},
            ]
        },
        "$$KEEP",
        "$$PRUNE"
        ]

returns

{ "_id" : { "name" : "RINGGO", "year" : 2017, "month" : 3 }, "YearlySpends" : [ -3.3, -9.6, -9.6, -3.3 ], "totalYearlyAmount" : -25.799999999999997, "MonthlySpends" : [ -3.3, -9.6, -9.6, -3.3 ], "totalMonthlyAmount" : -25.799999999999997 }

when I have $redact containing:

"$cond": [
        {
            "$and": [
                 { "$eq": [{ "$year": "$date" },  2017  ]}
            ]
        },
        "$$KEEP",
        "$$PRUNE"
        ]

returns

{ "_id" : { "name" : "RINGGO", "year" : 2017 }, "YearlySpends" : [ -3.3, -9.6, -9.6, -3.3 ], "totalYearlyAmount" : -25.799999999999997}

Any help in this much required. I've tried tinkering with the query but I'm afraid I just don't understand it enough to modify it correctly.



via Stuart Brown

No comments:

Post a Comment