May 09 2008

Not very much fun at all

I don't know if I've mentioned this before, but I work in the webby world, and a lot of what I do is move sites from one format to another. What I have before me at this moment is a very, very, very large site. Bonus: it's fairly well-structured, with named divs and such. Now, all I need from it is the stuff that's in a div of a particular class.

Easy, right? Just start at <div id="whatever"> and stop at </div>!

Well, no. Because it's possible, and is often the case, that there's something like:

<div id="whatever">
Stuff stuff stuff
<div id="something-else">
Stuff stuff stuff
</div>
More stuff
</div>

There can be, theoretically, any number of divs nested inside the div of interest. So I would need to keep track of them and make sure I didn't stop pulling text until I reached exactly the right </div>. I can imagine doing this, and I'm sure that with enough time I could actually accomplish the feat, but I know for a fact I would not come out of it without a headache. So back to Google I go, hoping that someone else has invented this particular wheel (and slapped it on a web page, described in a fashion that I can accurately search for).
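
For the record, the bookkeeping described above boils down to a depth counter: add one for every nested opening div, subtract one for every closing div, and stop when the count returns to zero. Here is a minimal sketch in Python. It is not the wheel I went looking for, just an illustration. The function name extract_div is made up for this example, and the code assumes the id attribute is written with double quotes and that no div tags hide inside comments or scripts.

```python
import re

# Matches an opening <div ...> or a closing </div>, case-insensitively.
DIV_TAG = re.compile(r"<(/?)div\b[^>]*>", re.IGNORECASE)


def extract_div(html, div_id):
    """Return the inner HTML of the div with the given id, or None.

    Hypothetical helper for illustration; assumes id="..." uses double quotes.
    """
    # Locate the opening tag of the div of interest.
    opening = re.search(
        r'<div\b[^>]*\sid="' + re.escape(div_id) + r'"[^>]*>',
        html,
        re.IGNORECASE,
    )
    if opening is None:
        return None

    depth = 1  # we are now inside the target div
    # Walk every div tag that follows and track the nesting depth.
    for tag in DIV_TAG.finditer(html, opening.end()):
        if tag.group(1):  # a closing </div>
            depth -= 1
        else:  # a nested opening <div>
            depth += 1
        if depth == 0:  # this is the </div> that closes the target
            return html[opening.end():tag.start()]
    return None  # input ended before the matching </div>


page = (
    '<div id="whatever">Stuff<div id="something-else">Nested</div>'
    'More stuff</div><div id="other">Ignore me</div>'
)
print(extract_div(page, "whatever"))
```

On a site with messier markup than mine, a real HTML parser (Python ships one in html.parser) would be a safer bet than regular expressions, since it copes with comments, odd quoting, and self-closing tags.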