Skip to main content

Remove Duplicate Rows in Excel in C# and VB.NET

When an Excel file contains a huge amount of records, there might be duplicate records as well. In this blog, I am going to show you how to remove the duplicate rows in an Excel file programmatically in C# and VB.NET.

The library I used:

Free Spire.XLS for .NET

Free Spire.XLS for .NET is a feature-rich Excel API offered by E-iceblue. It can be easily integrated in your .NET (C#, VB.NET, ASP.NET, .NET Core) applications to create, read, edit, convert and print Excel files without using Microsoft Office.

Before coding, you need to get Free Spire.XLS for .NET by installing it via NuGet or downloading it via the official website.

C# Code

  1. using Spire.Xls;  
  2. using System.Linq;  
  3.   
  4. namespace RemoveDuplicateRows  
  5. {  
  6.     class Program  
  7.     {  
  8.         static void Main(string[] args)  
  9.         {  
  10.             //Create a Workbook instance  
  11.             Workbook workbook = new Workbook();  
  12.             //Load the Excel file  
  13.             workbook.LoadFromFile("TEST.xlsx");  
  14.   
  15.             //Get the first worksheet  
  16.             Worksheet sheet = workbook.Worksheets[0];  
  17.   
  18.             //Specify the range that you want to remove duplicate records from.  
  19.             var range = sheet.Range["A1:A" + sheet.LastRow];  
  20.             //Get the duplicated row numbers    
  21.             var duplicatedRows = range.Rows  
  22.                    .GroupBy(x => x.Columns[0].DisplayedText)  
  23.                    .Where(x => x.Count() > 1)  
  24.                    .SelectMany(x => x.Skip(1))  
  25.                    .Select(x => x.Columns[0].Row)  
  26.                    .ToList();  
  27.   
  28.             //Remove the duplicate rows & blank rows if any             
  29.             for (int i = 0; i < duplicatedRows.Count; i++)  
  30.             {  
  31.                 sheet.DeleteRow(duplicatedRows[i] - i);  
  32.             }  
  33.   
  34.             //Save the result file  
  35.             workbook.SaveToFile("output.xlsx", ExcelVersion.Version2013);  
  36.         }  
  37.     }  
  38. }  

VB.NET Code

  1. Imports Spire.Xls  
  2. Imports System.Linq  
  3.   
  4. Namespace RemoveDuplicateRows  
  5.     Class Program  
  6.         Private Shared Sub Main(ByVal args As String())  
  7.             Dim workbook As Workbook = New Workbook()  
  8.             workbook.LoadFromFile("TEST.xlsx")  
  9.             Dim sheet As Worksheet = workbook.Worksheets(0)  
  10.             Dim range = sheet.Range("A1:A" & sheet.LastRow)  
  11.             Dim duplicatedRows = range.Rows.GroupBy(Function(x) x.Columns(0).DisplayedText).Where(Function(x) x.Count() > 1).SelectMany(Function(x) x.Skip(1)).[Select](Function(x) x.Columns(0).Row).ToList()  
  12.   
  13.             For i As Integer = 0 To duplicatedRows.Count - 1  
  14.                 sheet.DeleteRow(duplicatedRows(i) - i)  
  15.             Next  
  16.   
  17.             workbook.SaveToFile("output.xlsx", ExcelVersion.Version2013)  
  18.         End Sub  
  19.     End Class  
  20. End Namespace 

The input Excel file:


 The output Excel file:


 


Comments

Popular posts from this blog

3 Ways to Generate Word Documents from Templates in Java

A template is a document with pre-applied formatting like styles, tabs, line spacing and so on. You can quickly generate a batch of documents with the same structure based on the template. In this article, I am going to show you the different ways to generate Word documents from templates programmatically in Java using Free Spire.Doc for Java library. Prerequisite First of all, you need to add needed dependencies for including Free Spire.Doc for Java into your Java project. There are two ways to do that. If you use maven, you need to add the following code to your project’s pom.xml file. <repositories>               <repository>                   <id>com.e-iceblue</id>                   <name>e-iceblue</name> ...

Insert and Extract OLE objects in Word in Java

You can use OLE (Object Linking and Embedding) to include content from other programs, such as another Word document, an Excel or PowerPoint document to an existing Word document. This article demonstrates how to insert and extract embedded OLE objects in a Word document in Java by using Free Spire.Doc for Java API.   Add dependencies First of all, you need to add needed dependencies for including Free Spire.Doc for Java into your Java project. There are two ways to do that. If you use maven, you need to add the following code to your project’s pom.xml file.     <repositories>               <repository>                   <id>com.e-iceblue</id>                   <name>e-iceblue</name>     ...