In this post, we’re going to look at how we can remove duplicates from a list on Google Sheets and also store those duplicate rows on a separate sheet.
Google Sheets has a remove duplicates feature but it only removes the duplicates and sometimes you also want to have a list of those. So, here we’re going to use a little bit of Apps Script to both produce a list of unique rows and also a list of the duplicate ones.
As an example, we have a list of students who have attended a number of courses on different dates. What I want is a list of the latest courses they attended, and a separate list with their previous courses.
To get the latest dates, I first sort the list by their name and then by the course date, with the latest one being first.
The sheet contains 3 sheets: ALL, UNIQUE, and DUPLICATES.
Code
Now let’s look at the script to do this.
1. function removeAndSeparateDuplicates() {
2. const ss = SpreadsheetApp.getActiveSpreadsheet();
3. const shALL = ss.getSheetByName('ALL');
4. const shUnique = ss.getSheetByName('UNIQUE');
5. const shDuplicates = ss.getSheetByName('DUPLICATES');
L1: Open the function.
L2: Get the active spreadsheet.
L3-5: Get the sheets, called ALL, UNIQUE, and DUPLICATES.
7. const data = shALL.getRange(2, 1, shALL.getLastRow() - 1, shALL.getLastColumn()).getValues();
8. let uniqueSs = [];
9. let uniqueRows = [];
10. let duplicateRows = [];
L7: Get the data on the ALL sheet, ignoring the header row.
L8: Set up 3 empty arrays, ready to add data to later. The uniqueSs array we’ll use to see if the student’s name is unique or not.
L9: The uniqueRows array we’ll use to store the latest rows for each of the students.
L10: The duplicateRows array we’ll store the rows which are duplicates and aren’t the latest ones.
12. data.forEach((row) => {
13. let studentName = row[0];
14. if (!uniqueSs.includes(studentName)) {
15. uniqueSs.push(studentName);
16. uniqueRows.push(row);
17. }
18. else {
19. duplicateRows.push(row);
20. }
21. });
L12: Loop through the data array.
L13: Assign the first value in the row to the variable studentName, just for clarity.
L14: Check to see if the array uniqueSs doesn’t include the student’s name.
L15: If it doesn’t, push the student’s name into the uniqueSs array.
L16: Also, push the whole row into the uniqueRows array.
L17: Close the if statement.
L18-20: If the student’s name is in the uniqueSs array, then it’s a duplicate, and we push the row into the duplicateRows array.
L21: Close the forEach loop.
Finally, we add the contents of the uniqueRows and duplicateRows arrays to the UNIQUE and DUPLICATES sheets respectively.
23. shUnique.getRange(2, 1, uniqueRows.length, uniqueRows[0].length).setValues(uniqueRows);
24. shDuplicates.getRange(2, 1, duplicateRows.length, duplicateRows[0].length).setValues(duplicateRows);
25.}
L23: Get the UNIQUE sheet and add the uniqueRows array to it.
L24: Get the DUPLICATES sheet and add the duplicateRows array to it.
As you can see we end up with two lists, one with the latest course the students attended.
And the other with the previous courses they had attended.
I used this to go though a list of over 30,000 students but within seconds it created the two lists I needed for the data migration we were doing. It’s easy to do and this script can be easily adapted depending on your needs.
Make a copy of the file which contains the code here.
If you don’t fancy using code to do this, check out my post on removing duplicates here and on filtering duplicates and unique values here.
Want to learn more about Google Workspace and Apps Script? The books below are available on Amazon. Just click on a book! (Affiliate links).
a