public static IFileInfo[][] FindDuplicates(DedupContext dup, Action<int, int, string> reportProgress = null)
{
    // Collect every candidate: files under the context's directories plus explicitly listed files.
    var files = new List<IFileInfo>();
    reportProgress?.Invoke(0, 100, "building list..");
    foreach (var d in dup.Dirs)
    {
        Stuff.GetAllFiles(d, files);
    }
    files.AddRange(dup.Files);

    // Skip missing and empty files.
    reportProgress?.Invoke(25, 100, "filtering");
    files = files.Where(z => z.Exist && z.Length > 0).ToList();

    // Pass 1: only files of equal length can be duplicates.
    reportProgress?.Invoke(50, 100, "grouping 1");
    var grp1 = files.GroupBy(z => z.Length).Where(z => z.Count() > 1).ToArray();

    // Pass 2: within each size group, group by a partial MD5.
    reportProgress?.Invoke(75, 100, "grouping 2");
    var groups = new List<IFileInfo[]>();
    foreach (var item in grp1)
    {
        // Keep only hash groups with at least two files; a single file is not a duplicate.
        groups.AddRange(item
            .GroupBy(z => Stuff.CalcPartMD5(z, 1024 * 1024))
            .Where(z => z.Count() > 1)
            .Select(z => z.ToArray()));
    }

    // todo: binary compare candidates
    return groups.ToArray();
}
public void SetGroups(DedupContext ctx, IFileInfo[][] groups)
{
    Context = ctx;
    listView1.Items.Clear();

    // Largest groups (file size times number of copies) first.
    foreach (var dupGroup in groups.OrderByDescending(z => z.First().Length * z.Length))
    {
        var first = dupGroup.First();
        listView1.Items.Add(new ListViewItem(new string[]
        {
            first.Name,
            Stuff.CalcPartMD5(first, 1024 * 1024),
            dupGroup.Length.ToString(),
            // Space taken by every copy except the first.
            Stuff.GetUserFriendlyFileSize(first.Length * (dupGroup.Length - 1))
        })
        { Tag = dupGroup });
    }

    label1.Text = "Total repeats groups: " + groups.Length;
    label2.Text = "Total memory overhead: " + Stuff.GetUserFriendlyFileSize(groups.Sum(z => z.First().Length * (z.Length - 1)));

    listView1.AutoResizeColumns(ColumnHeaderAutoResizeStyle.ColumnContent);
    listView1.AutoResizeColumns(ColumnHeaderAutoResizeStyle.HeaderSize);
}