#mongodb #mongo #duplicates #duplication

In MongoDB you can remove duplicate documents based on a specific field:

db.yourCollection.aggregate([
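  // Group documents by the duplicate key, collecting their _ids and a count.
  { "$group": {
    "_id": { "yourDuplicateKey": "$yourDuplicateKey" },
    "dups": { "$push": "$_id" },
    "count": { "$sum": 1 }
  }},
  // Keep only the groups that contain more than one document.
  { "$match": { "count": { "$gt": 1 } }}
]).forEach(function(doc) {
  // Drop the first _id from the list so that one document is kept.
  doc.dups.shift();
  // Delete the remaining duplicates (deleteMany replaces the deprecated remove).
  db.yourCollection.deleteMany({ "_id": { "$in": doc.dups } });
});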

It uses aggregation to group documents by the given key, pushing each document's `_id` into the `dups` array and storing the number of documents in the `count` field. The `$match` stage then keeps only the groups with a `count` of more than 1 (it filters groups rather than projecting fields). At the end, the loop goes over each group and removes all duplicate documents except the first one: `shift` drops the first `_id` from `dups`, so that document is not deleted.
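
To try the group, filter, and delete steps without touching a database, here is a minimal plain-JavaScript sketch that mirrors the same logic on an in-memory array. It is only an illustration, not the MongoDB implementation: the sample `docs`, the `email` key, and the helper names `findDuplicateGroups` and `removeDuplicates` are all made up for this example.

```javascript
// Hypothetical sample data: three documents share the same email.
const docs = [
  { _id: 1, email: "a@example.com" },
  { _id: 2, email: "b@example.com" },
  { _id: 3, email: "a@example.com" },
  { _id: 4, email: "a@example.com" },
];

// Mirrors $group + $match: collect _ids per key value,
// then keep only the groups with more than one document.
function findDuplicateGroups(collection, key) {
  const groups = new Map();
  for (const doc of collection) {
    const value = doc[key];
    if (!groups.has(value)) {
      groups.set(value, { _id: value, dups: [], count: 0 });
    }
    const group = groups.get(value);
    group.dups.push(doc._id);
    group.count += 1;
  }
  return [...groups.values()].filter((group) => group.count > 1);
}

// Mirrors the forEach loop: keep the first _id of each group, delete the rest.
function removeDuplicates(collection, key) {
  const toDelete = new Set();
  for (const group of findDuplicateGroups(collection, key)) {
    group.dups.shift(); // drop the first _id so that document survives
    group.dups.forEach((id) => toDelete.add(id));
  }
  return collection.filter((doc) => !toDelete.has(doc._id));
}

const deduped = removeDuplicates(docs, "email");
console.log(deduped.map((doc) => doc._id)); // [ 1, 2 ]
```

Here the first document seen for each key is the one that survives, because the array is walked in order. The shell version above makes no such promise unless the pipeline sorts the documents before `$group`.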