public async Task ConsumeThread(Thread thread, string board)
{
    var dbCommands = GetPreparedStatements(board);

    var hashObject = new ThreadHashObject(board, thread.OriginalPost.PostNumber);

    if (!ThreadHashes.TryGetValue(hashObject, out var threadHashes))
    {
        // Rebuild hashes from database, if they exist
        var hashes = await dbCommands.WithAccess(connection => dbCommands.GetHashesOfThread(hashObject.ThreadId));

        threadHashes = new SortedList<ulong, int>();

        foreach (var hashPair in hashes)
        {
            threadHashes.Add(hashPair.Key, hashPair.Value);
        }
    }

    List<Post> postsToAdd = new List<Post>(thread.Posts.Length);

    foreach (var post in thread.Posts)
    {
        if (threadHashes.TryGetValue(post.PostNumber, out int existingHash))
        {
            // Compute the hash once and reuse it for both the comparison and the cache update
            int hash = post.GenerateAsagiHash();

            if (hash != existingHash)
            {
                // Post has changed since we last saved it to the database
                Program.Log($"[Asagi] Post /{board}/{post.PostNumber} has been modified");

                await dbCommands.WithAccess(() => dbCommands.UpdatePost(post, false));

                threadHashes[post.PostNumber] = hash;
            }

            // Otherwise the post has not changed and there is nothing to do
        }
        else
        {
            // Post has not yet been inserted into the database
            if (post.FileMd5 != null)
            {
                string timestampString = post.TimestampedFilename.ToString();
                string radixString = Path.Combine(timestampString.Substring(0, 4), timestampString.Substring(4, 2));

                if (Config.FullImagesEnabled)
                {
                    // Build the file path inside the directory that was just created;
                    // omitting the board segment would point at a directory that may not exist
                    string fullImageDirectory = Path.Combine(ImageDownloadLocation, board, radixString);
                    Directory.CreateDirectory(fullImageDirectory);

                    string fullImageFilename = Path.Combine(fullImageDirectory, post.TimestampedFilenameFull);
                    string fullImageUrl = $"https://i.4cdn.org/{board}/{post.TimestampedFilenameFull}";

                    await DownloadFile(fullImageUrl, fullImageFilename);
                }

                if (Config.ThumbnailsEnabled)
                {
                    // Same as above: keep the created directory and the file path in sync
                    string thumbDirectory = Path.Combine(ThumbDownloadLocation, board, radixString);
                    Directory.CreateDirectory(thumbDirectory);

                    string thumbFilename = Path.Combine(thumbDirectory, $"{post.TimestampedFilename}s.jpg");
                    string thumbUrl = $"https://i.4cdn.org/{board}/{post.TimestampedFilename}s.jpg";

                    await DownloadFile(thumbUrl, thumbFilename);
                }
            }

            postsToAdd.Add(post);
        }
    }

    if (threadHashes.Count == 0)
    {
        // We are inserting the thread for the first time.
        await dbCommands.WithAccess(() => dbCommands.InsertPosts(thread.Posts));
    }
    else if (postsToAdd.Count > 0)
    {
        await dbCommands.WithAccess(() => dbCommands.InsertPosts(postsToAdd));
    }

    foreach (var post in postsToAdd)
    {
        threadHashes[post.PostNumber] = post.GenerateAsagiHash();
    }

    Program.Log($"[Asagi] {postsToAdd.Count} posts have been inserted from thread /{board}/{thread.OriginalPost.PostNumber}");

    List<ulong> postNumbersToDelete = new List<ulong>(thread.Posts.Length);

    foreach (var postNumber in threadHashes.Keys)
    {
        if (thread.Posts.All(x => x.PostNumber != postNumber))
        {
            // Post has been deleted
            Program.Log($"[Asagi] Post /{board}/{postNumber} has been deleted");

            await dbCommands.WithAccess(() => dbCommands.DeletePostOrThread(postNumber));

            postNumbersToDelete.Add(postNumber);
        }
    }

    // workaround for not being able to remove from a collection while enumerating it
    foreach (var postNumber in postNumbersToDelete)
    {
        threadHashes.Remove(postNumber, out _);
    }

    if (thread.OriginalPost.Archived == true)
    {
        // We don't need the hashes if the thread is archived, since it will never change
        // If it does change, we can just grab a new set from the database
        ThreadHashes.TryRemove(hashObject, out _);
    }
}

/// <inheritdoc/>
public async Task<IList<QueuedImageDownload>> ConsumeThread(Thread thread, string board)
{
    var hashObject = new ThreadPointer(board, thread.OriginalPost.PostNumber);

    List<QueuedImageDownload> imageDownloads = new List<QueuedImageDownload>(thread.Posts.Length);

    async Task ProcessImages(Post post)
    {
        if (!Config.FullImagesEnabled && !Config.ThumbnailsEnabled)
        {
            return; // skip the DB check since we're not even bothering with images
        }

        if (post.FileMd5 != null)
        {
            MediaInfo mediaInfo = await GetMediaInfo(post.FileMd5, board);

            if (mediaInfo?.Banned == true)
            {
                Program.Log($"[Asagi] Post /{board}/{post.PostNumber} contains a banned image; skipping");
                return;
            }

            if (Config.FullImagesEnabled)
            {
                string fullImageName = mediaInfo?.MediaFilename ?? post.TimestampedFilenameFull;

                string radixString = Path.Combine(fullImageName.Substring(0, 4), fullImageName.Substring(4, 2));
                string radixDirectory = Path.Combine(ImageDownloadLocation, board, radixString);

                Directory.CreateDirectory(radixDirectory);

                string fullImageFilename = Path.Combine(radixDirectory, fullImageName);
                string fullImageUrl = $"https://i.4cdn.org/{board}/{post.TimestampedFilenameFull}";

                imageDownloads.Add(new QueuedImageDownload(new Uri(fullImageUrl), fullImageFilename));
            }

            if (Config.ThumbnailsEnabled)
            {
                string thumbImageName;

                if (post.ReplyPostNumber == 0) // is OP
                {
                    thumbImageName = mediaInfo?.PreviewOpFilename ?? $"{post.TimestampedFilename}s.jpg";
                }
                else
                {
                    thumbImageName = mediaInfo?.PreviewReplyFilename ?? $"{post.TimestampedFilename}s.jpg";
                }

                string radixString = Path.Combine(thumbImageName.Substring(0, 4), thumbImageName.Substring(4, 2));
                string radixDirectory = Path.Combine(ThumbDownloadLocation, board, radixString);

                Directory.CreateDirectory(radixDirectory);

                string thumbFilename = Path.Combine(radixDirectory, thumbImageName);
                string thumbUrl = $"https://i.4cdn.org/{board}/{post.TimestampedFilename}s.jpg";

                imageDownloads.Add(new QueuedImageDownload(new Uri(thumbUrl), thumbFilename));
            }
        }
    }

    if (!ThreadHashes.TryGetValue(hashObject, out var threadHashes))
    {
        // Rebuild hashes from database, if they exist
        var hashes = await GetHashesOfThread(hashObject.ThreadId, board);

        threadHashes = new SortedList<ulong, int>();

        foreach (var hashPair in hashes)
        {
            threadHashes.Add(hashPair.Key, hashPair.Value);

            var currentPost = thread.Posts.FirstOrDefault(post => post.PostNumber == hashPair.Key);

            if (currentPost != null)
            {
                await ProcessImages(currentPost);
            }
        }

        ThreadHashes.TryAdd(hashObject, threadHashes);
    }

    List<Post> postsToAdd = new List<Post>(thread.Posts.Length);

    foreach (var post in thread.Posts)
    {
        if (threadHashes.TryGetValue(post.PostNumber, out int existingHash))
        {
            int hash = CalculateAsagiHash(post, true);

            if (hash != existingHash)
            {
                // Post has changed since we last saved it to the database
                Program.Log($"[Asagi] Post /{board}/{post.PostNumber} has been modified");

                await UpdatePost(post, board, false);

                threadHashes[post.PostNumber] = hash;
            }
            else
            {
                // Post has not changed
                if (post.ReplyPostNumber == 0)
                {
                    // OP post
                    await UpdatePostExif(post, board);
                }
            }
        }
        else
        {
            // Post has not yet been inserted into the database
            postsToAdd.Add(post);

            await ProcessImages(post);
        }
    }

    if (threadHashes.Count == 0)
    {
        // We are inserting the thread for the first time.
        await InsertPosts(thread.Posts, board);
    }
    else if (postsToAdd.Count > 0)
    {
        await InsertPosts(postsToAdd, board);
    }

    foreach (var post in postsToAdd)
    {
        threadHashes[post.PostNumber] = CalculateAsagiHash(post, true);
    }

    Program.Log($"[Asagi] {postsToAdd.Count} posts have been inserted from thread /{board}/{thread.OriginalPost.PostNumber} ({imageDownloads.Count} media items enqueued)");

    List<ulong> postNumbersToDelete = new List<ulong>(thread.Posts.Length);

    foreach (var postNumber in threadHashes.Keys)
    {
        if (thread.Posts.All(x => x.PostNumber != postNumber))
        {
            // Post has been deleted
            Program.Log($"[Asagi] Post /{board}/{postNumber} has been deleted");

            await SetUntracked(postNumber, board, true);

            postNumbersToDelete.Add(postNumber);
        }
    }

    // workaround for not being able to remove from a collection while enumerating it
    foreach (var postNumber in postNumbersToDelete)
    {
        threadHashes.Remove(postNumber, out _);
    }

    threadHashes.TrimExcess();

    if (thread.OriginalPost.Archived == true)
    {
        // We don't need the hashes if the thread is archived, since it will never change
        // If it does change, we can just grab a new set from the database
        ThreadHashes.TryRemove(hashObject, out _);
    }

    return imageDownloads;
}