#18 open
Kadvin

Support multiple encodings other than Latin

Reported by Kadvin | December 28th, 2009 @ 01:49 AM

Anemone should support multiple encodings. This involves:
1. URLs such as http://www.china.com/tag/中国 should be supported.
PS: URI(str_uri) raises an exception for such URLs.

2. Specify the page encoding according to the HTTP response, the page's meta directive, or a global setting when parsing the page body with Nokogiri.
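
A rough sketch of both points, using only the Ruby standard library. The helper names escape_non_ascii and detect_charset are hypothetical and not part of Anemone; the UTF-8 fallback stands in for the "global setting", and the regular expressions are a simplification of real Content-Type and meta parsing.

```ruby
require "uri"

# Hypothetical helper (not Anemone API): percent-encode every non-ASCII
# character as its UTF-8 bytes so that URI() can parse the result.
def escape_non_ascii(url)
  url.gsub(/[^[:ascii:]]/) do |ch|
    ch.bytes.map { |b| format("%%%02X", b) }.join
  end
end

# Hypothetical helper (not Anemone API): choose a charset name in the order
# the ticket proposes: HTTP response header, then meta directive, then a
# global default (assumed here to be UTF-8).
def detect_charset(content_type, body, default = "UTF-8")
  from_header = content_type.to_s[/charset=["']?([\w-]+)/i, 1]
  return from_header if from_header

  # Scan the raw bytes so a body in an unknown encoding cannot raise.
  from_meta = body.to_s.b[/<meta[^>]+charset=["']?([\w-]+)/i, 1]
  from_meta || default
end

escaped = escape_non_ascii("http://www.china.com/tag/中国")
# => "http://www.china.com/tag/%E4%B8%AD%E5%9B%BD"
uri = URI(escaped)

charset = detect_charset("text/html; charset=GBK", "<html></html>")
# => "GBK"
```

If Nokogiri is available, the detected name could then be passed as the encoding argument, for example Nokogiri::HTML(body, url, charset).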

Anemone is a Ruby library that makes it quick and painless to write programs that spider a website. It provides a simple DSL for performing actions on every page of a site, skipping certain URLs, and calculating the shortest path to a given page on a site.
