Posted By

denakitan on 11/05/12


Tagged

fetch parse html Net c# WebClient HtmlAgilityPack


.NET - C# - WebClient and HtmlAgilityPack - Fetching and Parsing HTML


Published in: C#

URL: http://htmlagilitypack.codeplex.com/

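Shows how to use the WebClient class to retrieve HTML from a URL and then parse it with HtmlAgilityPack. The example downloads Wikipedia's list of Pixar films, locates the films table, and prints the second column of each row alongside the film title.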

using System;
using System.Collections.Generic;
using System.Linq;
using System.Net;

using HtmlAgilityPack;

namespace PixarWebClient
{
    public class PixarWebClient
    {
        public static void Main(string[] args)
        {
            using (WebClient client = new WebClient())
            {
                // Assumption: Wikipedia may reject requests that lack a User-Agent
                // header; the value used here is only illustrative.
                client.Headers.Add("User-Agent", "PixarWebClient/1.0");

                // Fetch the page HTML as a string.
                string pixarHtml = client.DownloadString("http://en.wikipedia.org/wiki/List_of_Pixar_films");

                HtmlDocument document = new HtmlDocument();
                document.LoadHtml(pixarHtml);

                // Find the films table. GetAttributeValue returns the default ("")
                // for tables without a class attribute, instead of throwing.
                HtmlNode pixarTable = document.DocumentNode.Descendants("table")
                    .FirstOrDefault(t => t.GetAttributeValue("class", "") == "sortable wikitable");

                if (pixarTable == null)
                {
                    Console.WriteLine("Table not found.");
                    return;
                }

                // Skip the first row, which contains header information.
                IEnumerable<HtmlNode> pixarRows = pixarTable.Descendants("tr").Skip(1);

                foreach (HtmlNode row in pixarRows)
                {
                    // Only the first two cells are used: the title cell and the one after it.
                    List<HtmlNode> columns = row.Descendants("td").Take(2).ToList();
                    if (columns.Count < 2)
                        continue;

                    // The title is expected inside <i><a>...</a></i>; fall back to the cell text.
                    HtmlNode italic = columns[0].Element("i");
                    HtmlNode link = italic != null ? italic.Element("a") : null;
                    string title = link != null ? link.InnerText : columns[0].InnerText.Trim();

                    Console.WriteLine(columns[1].InnerText + " - " + title);
                }
            }
        }
    }
}
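
The snippet above picks the table by comparing its class attribute to one exact string, which only works when the table has that attribute. As a rough sketch of two alternatives, the example below uses HtmlAgilityPack's XPath methods (SelectSingleNode and SelectNodes) to match on part of the class value, or to pick a table by position when it has no class at all. The class name TableSelectionSketch and the inline HTML are made up for illustration, so the example does not depend on any live page.

```csharp
using System;
using HtmlAgilityPack;

public static class TableSelectionSketch
{
    public static void Main()
    {
        // Hypothetical inline HTML: one table without a class, one with a class.
        const string html =
            "<html><body>" +
            "<table><tr><th>Symbol</th><th>Price</th></tr>" +
            "<tr><td>ABC</td><td>10.5</td></tr></table>" +
            "<table class='sortable wikitable'><tr><th>Film</th></tr>" +
            "<tr><td>Toy Story</td></tr></table>" +
            "</body></html>";

        HtmlDocument document = new HtmlDocument();
        document.LoadHtml(html);

        // Match on part of the class attribute rather than the exact string.
        HtmlNode byClass = document.DocumentNode
            .SelectSingleNode("//table[contains(@class, 'wikitable')]");

        // A table without a class attribute can be picked by position instead.
        HtmlNode byPosition = document.DocumentNode
            .SelectSingleNode("(//table)[1]");

        PrintCells("Table matched by class:", byClass);
        PrintCells("Table matched by position:", byPosition);
    }

    private static void PrintCells(string label, HtmlNode table)
    {
        Console.WriteLine(label);

        // SelectSingleNode returns null when nothing matches.
        if (table == null)
        {
            Console.WriteLine("  (no table found)");
            return;
        }

        // SelectNodes also returns null, not an empty collection, on no match.
        HtmlNodeCollection cells = table.SelectNodes(".//td");
        if (cells == null)
            return;

        foreach (HtmlNode cell in cells)
            Console.WriteLine("  " + cell.InnerText);
    }
}
```

Selecting by position is brittle if the page layout changes, so where the markup offers an id, a caption, or distinctive header text, matching on that is usually the safer choice.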


Comments

Posted By: yousafzian7 on June 19, 2013

Hi denakitan, I registered here to ask you a question. I really need your help; I am stuck on my final-year project. In the code above you wrote [ where d.Name == "table" && d.Attributes["class"].Value == "sortable wikitable" ], but I want to get the data from http://www.ise.com.pk/TradeScreen.asp, and if you check its source code, the table there has no class attribute. Please help me work out how I can get that data using C# and store it in my database. Please reply to my email, [email protected]. I registered with this email but never received the confirmation email, so I logged in using Facebook.

