Page object doesn't expose @body
Reported by Noah Gibbs | December 24th, 2009 @ 02:10 PM
I'd like to use Anemone to mirror chunks of multiple web sites by crawling them. I don't see any good way to get the actual page contents as text (rather than as a Nokogiri document), and I don't see any obvious way to recover the original text from a Nokogiri document.
If you added an attr_accessor for :body in the Page object, I think that would fix the problem.
Comments and changes to this ticket
-
Noah Gibbs December 24th, 2009 @ 03:34 PM
Actually, it should probably be attr_reader rather than attr_accessor. Changing the body makes questionable sense in the first place, and doing it without changing the Nokogiri document makes no sense at all.
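To make the suggestion concrete, here is a rough sketch of what I have in mind. SketchPage is a made-up stand-in, not Anemone's actual Page class, and the constructor arguments are assumptions for illustration only. The point is just that the raw text is kept next to the parsed document and exposed read-only.

```ruby
# Hypothetical stand-in for Anemone's Page class; names are illustrative only.
class SketchPage
  # Read-only access to the URL and the raw response text.
  # attr_reader defines no body= writer, so callers cannot reassign it.
  attr_reader :url, :body

  def initialize(url, body)
    @url = url
    @body = body
  end

  # Parsed view of the same text, built lazily from @body.
  # Nokogiri is only loaded when the parsed document is actually requested.
  def doc
    require 'nokogiri'
    @doc ||= Nokogiri::HTML(@body)
  end
end

page = SketchPage.new('http://example.com/', '<html><body><p>hi</p></body></html>')
puts page.body                    # the original text, suitable for mirroring to disk
puts page.respond_to?(:body=)     # whether a writer method exists
```

With something like this, a mirroring script could write page.body straight to a file without going through the Nokogiri document at all.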
-
chris (at chriskite) January 22nd, 2010 @ 08:13 PM
- State changed from new to open
- Assigned user set to chris (at chriskite)
-
chris (at chriskite) January 22nd, 2010 @ 09:40 PM
- State changed from open to resolved
Anemone is a Ruby library that makes it quick and painless to write programs that spider a website. It provides a simple DSL for performing actions on every page of a site, skipping certain URLs, and calculating the shortest path to a given page on a site.
Referenced by
- 10 following <img> tag I plan to add an attr_accessor for Page body as part of t...