NSXMLDocument, search with nodesForXPath:

I need to search through an HTML document for two specific strings of text in Cocoa. I am creating an NSXMLDocument from the web page (Page Example). Then I am trying to search it for the app title and the URL of the icon. I am currently using this code to search for the specific strings:

NSString *xpathQueryStringTitle = @"//div[@id='desktopContentBlockId']/div[@id='content']/div[@class='padder']/div[@id='title' @class='intro has-gcbadge']/h1";
NSString *xpathQueryStringIcon = @"//div[@id='desktopContentBlockId']/div[@id='content']/div[@class='padder']/div[@id='left-stack']/div[@class='lockup product application']/a";
NSArray *titleItemsNodes = [document nodesForXPath:xpathQueryStringTitle error:&error];
if (error)
{
    [[NSAlert alertWithError:error] runModal];
    return;
}
error = nil;
NSArray *iconItemsNodes = [document nodesForXPath:xpathQueryStringIcon error:&error];
if (error)
{
    [[NSAlert alertWithError:error] runModal];
    return;
}

When I try to search for these strings I get the error: "XQueryError:3 - "invalid token (@) - ./*/div[@id='desktopContentBlockId']/div[@id='content']/div[@class='padder']/div[@id='title' @class='intro has-gcbadge']/h1" at line:1"

I am loosely following this tutorial.

I also tried this without all of the @ symbols in the XPath, and it also returns an error. My syntax is obviously wrong for the XPath. What would the basic syntax be for this path? I've seen plenty of examples with a basic XML tree, but not HTML.

asked Nov 8 '11 at 19:11

2 Answers

I suspect it's the part near the end, where you test for two attributes in a single predicate:

/div[@id='title' @class='intro has-gcbadge']/h1";

Try changing it to:

/div[@id='title'][@class='intro has-gcbadge']/h1";
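As a rough illustration of the chained-predicate form, here is a minimal sketch. The helper name NodesMatchingXPath and the sample markup are hypothetical, not from the question, and the code assumes ARC and well-formed input (a real web page would probably need to be parsed with NSXMLDocumentTidyHTML instead of NSXMLNodeOptionsNone).

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper (not from the question): parses `markup` and returns
// the nodes matching `xpath`, or nil if parsing or the query fails.
static NSArray *NodesMatchingXPath(NSString *markup, NSString *xpath)
{
    NSError *error = nil;
    // Assumption: the markup is well-formed XML, so no tidy option is used.
    NSXMLDocument *document =
        [[NSXMLDocument alloc] initWithXMLString:markup
                                         options:NSXMLNodeOptionsNone
                                           error:&error];
    if (document == nil) {
        NSLog(@"Could not parse markup: %@", error);
        return nil;
    }

    // Check the return value first; only then look at the error.
    NSArray *nodes = [document nodesForXPath:xpath error:&error];
    if (nodes == nil) {
        NSLog(@"XPath query failed: %@", error);
        return nil;
    }
    return nodes;
}
```

Calling it with @"//div[@id='title'][@class='intro has-gcbadge']/h1" on a document that contains such a div should return the matching h1 element nodes. Each bracketed predicate filters the result of the previous one, so both attribute tests must hold.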

answered Nov 08, 11:23

That fixed the problem! Thank you. This returns what I need, but I need to modify the returned strings. For the first string, I get "<h1>App Title</h1>"; what would I add to get just the text inside the <h1>? On the second string, I get the entire <img width="111" src="link">; how would I return the value of link from the src tag? - Brandon Mcq

OP's additional questions (from comments):

but I need to modify the returned strings. For the first string, I get "<h1>App Title</h1>"; what would I add to get just the text inside the <h1>?

Use:

/div[@id='title' and @class='intro has-gcbadge']/h1/text()

or use:

string(/div[@id='title' and @class='intro has-gcbadge']/h1)
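On the Cocoa side, nodesForXPath:error: hands back NSXMLNode objects, so another option is to leave the query ending at h1 (or h1/text()) and read each node's stringValue. A minimal sketch, assuming ARC and well-formed input; the helper name FirstTextForXPath and the sample markup are made up for illustration:

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: returns the text of the first node matching `xpath`
// in `markup`, or nil if nothing matches or an error occurs.
static NSString *FirstTextForXPath(NSString *markup, NSString *xpath)
{
    NSError *error = nil;
    // Assumption: well-formed XML input, so no tidy option is needed.
    NSXMLDocument *document =
        [[NSXMLDocument alloc] initWithXMLString:markup
                                         options:NSXMLNodeOptionsNone
                                           error:&error];
    if (document == nil) {
        NSLog(@"Could not parse markup: %@", error);
        return nil;
    }

    NSArray *nodes = [document nodesForXPath:xpath error:&error];
    if (nodes == nil) {
        NSLog(@"XPath query failed: %@", error);
        return nil;
    }
    if ([nodes count] == 0) {
        return nil;
    }

    // For an element node, stringValue is the text content of the element
    // and its descendants, without the surrounding tags.
    NSXMLNode *first = [nodes objectAtIndex:0];
    return [first stringValue];
}
```

With the sample markup <div id='title'><h1>App Title</h1></div> and the query //div[@id='title']/h1, this should give back the bare title text rather than the <h1> element markup.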

On the second string, I get the entire <img width="111" src="link">; how would I return the value of link from the src tag?

Use:

YourSecond-Not-Shown-Expression/@src

or use:

string(YourSecond-Not-Shown-Expression/@src)
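If you would rather keep the query pointing at the img element, the attribute can also be read in code through NSXMLElement's attributeForName:. This is only a sketch under assumptions: ARC, well-formed input, and a made-up helper name (AttributeValueForXPath) and sample markup, since the real expression is not shown in the thread.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: finds the first element matching `xpath` in `markup`
// and returns the value of its attribute `name`, or nil if unavailable.
static NSString *AttributeValueForXPath(NSString *markup,
                                        NSString *xpath,
                                        NSString *name)
{
    NSError *error = nil;
    // Assumption: well-formed XML input, so no tidy option is needed.
    NSXMLDocument *document =
        [[NSXMLDocument alloc] initWithXMLString:markup
                                         options:NSXMLNodeOptionsNone
                                           error:&error];
    if (document == nil) {
        NSLog(@"Could not parse markup: %@", error);
        return nil;
    }

    NSArray *nodes = [document nodesForXPath:xpath error:&error];
    if (nodes == nil) {
        NSLog(@"XPath query failed: %@", error);
        return nil;
    }
    if ([nodes count] == 0) {
        return nil;
    }

    // Only element nodes carry attributes, so check the kind before casting.
    NSXMLNode *node = [nodes objectAtIndex:0];
    if ([node kind] != NSXMLElementKind) {
        return nil;
    }
    NSXMLNode *attribute = [(NSXMLElement *)node attributeForName:name];
    return [attribute stringValue]; // nil if the attribute is missing
}
```

For example, with the markup <a href='#'><img width='111' src='link'/></a>, the query //a/img and the name src, the helper should return the src value only.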

answered Nov 09, 11:09
