private static void HandleAKAs(JMoviesEntities entities, Production production, Production savedProduction)
{
    Movie movie = production as Movie;
    Movie savedMovie = savedProduction as Movie;
    if (movie != null && movie.AKAs != null)
    {
        foreach (AKA aka in movie.AKAs.ToArray())
        {
            EntityEntry entry = null;
            bool saved = false;
            if (savedMovie != null)
            {
                // Look up an already persisted AKA of the saved movie with the same name and description.
                // The production filter must test the stored row (e), not the incoming aka.
                AKA savedAKA = entities.AKA.FirstOrDefault(e => e.Name == aka.Name
                    && e.Description == aka.Description
                    && e.ProductionID == savedMovie.ID);
                if (savedAKA != null)
                {
                    aka.ID = savedAKA.ID;
                    entry = CommonDBHelper.MarkEntityAsUpdated(entities, aka);
                    saved = true;
                }
            }

            aka.ProductionID = production.ID;
            if (!saved)
            {
                // No matching row exists yet: assign a fresh ID and insert
                aka.ID = CommonDBHelper.GetNewID<AKA>(entities, e => e.ID);
                entry = entities.AKA.Add(aka);
            }

            entities.SaveChanges();
            CommonDBHelper.DetachAllEntries(entities);
        }
    }
}

/// <summary>
/// Method responsible for parsing the details section of the movie page
/// </summary>
/// <param name="movie">Movie instance to be populated</param>
/// <param name="detailsSection">HTML Node containing the Details section</param>
public static void ParseDetailsSection(Movie movie, HtmlNode detailsSection)
{
    foreach (HtmlNode detailBox in detailsSection.QuerySelectorAll(".txt-block"))
    {
        HtmlNode headerNode = detailBox.QuerySelector("h4");
        if (headerNode != null)
        {
            string headerContent = headerNode.InnerText.Prepare();
            if (IMDbConstants.OfficialSitesHeaderRegex.IsMatch(headerContent))
            {
                List<OfficialSite> officialSites = new List<OfficialSite>();
                // List<T> is not thread-safe, so additions from the parallel loop are serialized with a lock
                object officialSitesLock = new object();
                Parallel.ForEach(detailBox.QuerySelectorAll("a"), (HtmlNode officialSiteLink) =>
                {
                    try
                    {
                        string url = IMDbConstants.BaseURL + officialSiteLink.Attributes["href"].Value;
                        HttpWebRequest request = (HttpWebRequest)WebRequest.Create(url);
                        // Do not follow the redirect; the target URL is read from the Location header
                        request.AllowAutoRedirect = false;
                        using (HttpWebResponse response = (HttpWebResponse)request.GetResponse())
                        {
                            string redirectURL = response.Headers["Location"];
                            OfficialSite officialSite = new OfficialSite
                            {
                                Title = officialSiteLink.InnerText.Prepare(),
                                URL = redirectURL
                            };
                            lock (officialSitesLock)
                            {
                                officialSites.Add(officialSite);
                            }
                        }
                    }
                    catch
                    {
                        // Simply ignore official site errors
                    }
                });
                movie.OfficialSites = officialSites;
            }
            else if (IMDbConstants.CountriesHeaderRegex.IsMatch(headerContent))
            {
                List<ProductionCountry> countries = new List<ProductionCountry>();
                foreach (HtmlNode countryLink in detailBox.QuerySelectorAll("a"))
                {
                    Match countryMatch = IMDbConstants.CountryOfOriginRegex.Match(countryLink.OuterHtml);
                    if (countryMatch.Success)
                    {
                        Country country = new Country
                        {
                            Identifier = countryMatch.Groups[1].Value,
                            Name = countryLink.InnerText.Prepare()
                        };
                        countries.Add(new ProductionCountry { Country = country, Production = movie });
                    }
                }
                movie.Countries = countries;
            }
            else if (IMDbConstants.LanguagesHeaderRegex.IsMatch(headerContent))
            {
                List<ProductionLanguage> languages = new List<ProductionLanguage>();
                foreach (HtmlNode languageLink in detailBox.QuerySelectorAll("a"))
                {
                    Match languageMatch = IMDbConstants.PrimaryLanguageRegex.Match(languageLink.OuterHtml);
                    if (languageMatch.Success)
                    {
                        Language language = new Language
                        {
                            Identifier = languageMatch.Groups[1].Value,
                            Name = languageLink.InnerText.Prepare()
                        };
                        languages.Add(new ProductionLanguage { Language = language, Production = movie });
                    }
                }
                movie.Languages = languages;
            }
            else if (IMDbConstants.ReleaseDateHeaderRegex.IsMatch(headerContent))
            {
                // Release dates are fetched from the release info page separately
            }
            else if (IMDbConstants.AKAHeaderRegex.IsMatch(headerContent))
            {
                // The AKA title is the text node that directly follows the header
                AKA aka = new AKA { Name = headerNode.NextSibling.InnerText.Prepare() };
                movie.AKAs = new List<AKA> { aka };
            }
            else if (IMDbConstants.FilmingLocationsHeaderRegex.IsMatch(headerContent))
            {
                List<string> filmingLocations = new List<string>();
                foreach (HtmlNode locationLinkNode in headerNode.ParentNode.QuerySelectorAll("a"))
                {
                    Match locationLinkMatch = IMDbConstants.LocationsLinkRegex.Match(locationLinkNode.OuterHtml);
                    if (locationLinkMatch.Success)
                    {
                        filmingLocations.Add(locationLinkMatch.Groups[1].Value.Prepare());
                    }
                }
                movie.FilmingLocations = filmingLocations;
            }
            else if (IMDbConstants.BudgetHeaderRegex.IsMatch(headerContent))
            {
                movie.Budget = new Budget();
                movie.Budget.Amount = headerNode.NextSibling.InnerText.Prepare().ToAmount();
                movie.Budget.Description = string.Empty;
                // Join every parenthesized attribute into a single space-separated description
                foreach (HtmlNode attributeNode in headerNode.ParentNode.QuerySelectorAll(".attribute"))
                {
                    Match attributeMatch = GeneralRegexConstants.PharantesisRegex.Match(attributeNode.InnerText.Prepare());
                    if (attributeMatch.Success)
                    {
                        if (!string.IsNullOrEmpty(movie.Budget.Description))
                        {
                            movie.Budget.Description += " ";
                        }
                        movie.Budget.Description += attributeMatch.Groups[1].Value;
                    }
                }
            }
            else if (IMDbConstants.ProductionCompanyHeaderRegex.IsMatch(headerContent))
            {
                List<Company> productionCompanies = new List<Company>();
                foreach (HtmlNode productionCompanyNode in headerNode.ParentNode.QuerySelectorAll("a"))
                {
                    // GetAttributeValue avoids a NullReferenceException on anchors without an href
                    string companyLink = productionCompanyNode.GetAttributeValue("href", string.Empty);
                    Match companyIDMatch = IMDbConstants.ProductionCompanyLinkRegex.Match(companyLink);
                    if (companyIDMatch.Success)
                    {
                        Company productionCompany = new Company();
                        productionCompany.Name = productionCompanyNode.InnerText.Prepare();
                        productionCompany.ID = companyIDMatch.Groups[1].Value.ToLong();
                        productionCompanies.Add(productionCompany);
                    }
                }
                movie.ProductionCompanies = productionCompanies;
            }
            else if (IMDbConstants.RuntimeHeaderRegex.IsMatch(headerContent))
            {
                movie.Runtime = headerNode.ParentNode.QuerySelector("time").Attributes["datetime"].Value.ToHtmlTimeSpan();
            }
        }
    }
}