Parsing with HtmlAgilityPack

I am trying to parse the following data from an HTML document using HtmlAgilityPack:

<a href="">abilene</a> <br>
<a href=""><b>albany</b></a> <br>
<a href="">amarillo</a> <br>

I would like to parse out the URL and the name of each city into 2 separate files.




This is what I have so far:

        public void ParseHtml()
        {
            //Clear text box

            //Managed wrapper around the HTML Document Object Model (DOM).
            HtmlAgilityPack.HtmlDocument hDoc = new HtmlAgilityPack.HtmlDocument();

            //Load file

            try
            {
                //Execute the input XPath query from text box
                foreach (HtmlNode hNode in hDoc.DocumentNode.SelectNodes(xpathText.Text))
                {
                    textBox1.Text += hNode.InnerHtml + "\r\n";
                }
            }
            catch (NullReferenceException nre)
            {
                textBox1.Text += "Can't process XPath query, modify it and try again.";
            }
        }

Any help would be greatly appreciated! Thanks guys!

asked Mar 10 '12 at 08:03

I think this can be useful for you -

Perfect! Got all 500 URLs in 30 seconds... -

I still need to get the cities from the HTML. -

Does no node have the value of a? -

1 Answer

I take it that you want to parse them from
This is how I would do it.

using System;
using System.Collections.Generic;
using System.IO;
using System.Net;
using HtmlAgilityPack;

List<string> links = new List<string>();
List<string> names = new List<string>();
HtmlDocument doc = new HtmlDocument();
//Load the HTML
doc.Load(new WebClient().OpenRead(""));
//Get all links in the div with the ID 'list' that have an href attribute
HtmlNodeCollection linkNodes = doc.DocumentNode.SelectNodes("//div[@id='list']/a[@href]");
//or, if you only have the links already saved somewhere
//HtmlNodeCollection linkNodes = doc.DocumentNode.SelectNodes("//a[@href]");
if (linkNodes != null) //SelectNodes returns null when nothing matches
{
  foreach (HtmlNode link in linkNodes)
  {
    links.Add(link.GetAttributeValue("href", ""));
    names.Add(link.InnerText); //Get the InnerText so you don't get any HTML tags
  }
}
//Write both lists to a file, one entry per line
File.WriteAllText("urls.txt", string.Join(Environment.NewLine, links.ToArray()));
File.WriteAllText("cities.txt", string.Join(Environment.NewLine, names.ToArray()));
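
To try the same idea without downloading anything, here is a minimal sketch that runs against the three sample lines from the question. The class name LinkExtractor and the method Extract are hypothetical names chosen for illustration, and the sample keeps the empty href values exactly as posted. It assumes the HtmlAgilityPack package is referenced and that SelectNodes returns null when nothing matches, which is the behavior the null check above guards against.

```csharp
using System;
using System.Collections.Generic;
using HtmlAgilityPack;

public static class LinkExtractor
{
    // Hypothetical helper: returns (url, name) pairs for every <a href> in the markup.
    public static List<KeyValuePair<string, string>> Extract(string html)
    {
        var result = new List<KeyValuePair<string, string>>();
        var doc = new HtmlDocument();
        doc.LoadHtml(html); // parse from a string instead of a stream

        // Guard against a null result when the query matches nothing.
        HtmlNodeCollection anchors = doc.DocumentNode.SelectNodes("//a[@href]");
        if (anchors == null)
        {
            return result;
        }

        foreach (HtmlNode a in anchors)
        {
            string url = a.GetAttributeValue("href", "");
            // InnerText drops nested tags such as <b>, leaving only the city name.
            string name = a.InnerText.Trim();
            result.Add(new KeyValuePair<string, string>(url, name));
        }
        return result;
    }

    public static void Main()
    {
        // Sample markup copied from the question (href values are empty there).
        string sample =
            "<a href=\"\">abilene</a> <br>\n" +
            "<a href=\"\"><b>albany</b></a> <br>\n" +
            "<a href=\"\">amarillo</a> <br>";

        foreach (KeyValuePair<string, string> pair in Extract(sample))
        {
            Console.WriteLine(pair.Key + "\t" + pair.Value);
        }
    }
}
```

Returning the pairs together, rather than filling two separate lists, keeps each URL aligned with its city name. You can still split them into urls.txt and cities.txt at the end, as in the code above.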

answered Mar 11 '12 at 14:03

Wow, perfect! Thank you very much! - John
