I used this tutorial to fetch all the content of a webpage via C# code.

I now want to gather into an IEnumerable collection all the strings that appear inside the following text pattern (the MY-TEXT part):

data-address=" MY-TEXT "></

How can I do that? I tried using string.Split() but got too much "white noise".

Any ideas?

What webpage is that? Is it HTML (which doesn't have any data-address attribute AFAIK)? Or XML? -

A better solution is to use HtmlAgilityPack and let it handle the parsing/scraping for you. Here is an example:

var web = new HtmlWeb();
var doc = web.Load("http://www.stackoverflow.com");

var nodes = doc.DocumentNode.SelectNodes("//[@data-address]");

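foreach (var node in nodes)
{
    // Print the value of each element's data-address attribute
    Console.WriteLine(node.Attributes["data-address"].Value);
}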

This will fetch stackoverflow.com, find all elements that have a data-address attribute, and then print the value of that attribute.

A few questions: 1) I got the following error: "Expression must evaluate to a node-set". What went wrong? 2) How did you find this open-source DLL? Just so I know for next time. - Elad Benda

If the page is well formed I'd load the content into an XDocument and query over it with LINQ to XML.
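For markup that really is well-formed XML (XHTML, for instance), the LINQ to XML idea could look roughly like the sketch below. The sample markup and variable names are made up for illustration. XDocument.Parse throws an XmlException on input that is not well-formed XML, which is common for real-world HTML, so treat this as an option only when the page actually parses as XML.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

// Hypothetical sample markup; a real page would be downloaded first.
const string xhtml = @"<html><body>
  <div data-address="" 12 Main St ""></div>
  <span data-address=""34 Oak Ave""></span>
  <p>no attribute here</p>
</body></html>";

// Throws XmlException if the markup is not well-formed XML.
XDocument doc = XDocument.Parse(xhtml);

// Collect every data-address attribute in document order and trim the padding.
IEnumerable<string> addresses = doc
    .Descendants()
    .Attributes("data-address")
    .Select(a => a.Value.Trim())
    .ToList();

foreach (string address in addresses)
{
    Console.WriteLine(address);
}
```

Trimming the value takes care of the spaces around MY-TEXT in the pattern from the question.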

You (probably) can't load HTML into an XDocument, even if it is well-formed. - svick

@alexn is right. A small correction though:

  var nodes = doc.DocumentNode.SelectNodes("//*[@data-address]");

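I added the *. XPath needs a node test before the predicate, so //[@data-address] is not a valid expression, while //*[@data-address] matches any element that has the attribute.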
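Putting the corrected XPath together with the original goal (an IEnumerable of the attribute values), a minimal sketch might look like this. It assumes the HtmlAgilityPack NuGet package is referenced; the inline HTML and variable names are placeholders for illustration, and in practice the document would come from HtmlWeb.Load as in the answer above.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using HtmlAgilityPack; // assumes the "HtmlAgilityPack" NuGet package is installed

// Hypothetical sample markup standing in for the downloaded page.
const string html = @"<div data-address="" 12 Main St ""></div>
<span data-address=""34 Oak Ave""></span>
<p>no attribute here</p>";

var doc = new HtmlDocument();
doc.LoadHtml(html);

// SelectNodes returns null (not an empty collection) when nothing matches.
var nodes = doc.DocumentNode.SelectNodes("//*[@data-address]");

// Fall back to an empty sequence, otherwise read and trim each attribute value.
IEnumerable<string> addresses = nodes == null
    ? Enumerable.Empty<string>()
    : nodes.Select(n => n.GetAttributeValue("data-address", string.Empty).Trim()).ToList();

foreach (string address in addresses)
{
    Console.WriteLine(address);
}
```

The null check matters because a foreach directly over the result of SelectNodes throws a NullReferenceException on pages with no matching elements.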

