Detects the charset from the HTTP header and the HTML meta charset, then automatically converts the content to UTF-8.

v0.8.2 2015-07-19 15:53 UTC




Detects the charset from the HTTP header's charset and the HTML meta charset, handling several charsets with extra care (SJIS-win, TIS-620, and others).

This library is intended for use in web scraping.
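The detect-then-convert idea described above can be sketched as follows. This is only a minimal illustration, not the library's actual implementation: the function names `sketchDetectCharset` and `sketchToUtf8` are hypothetical, and the header-before-meta precedence, the Shift_JIS to SJIS-win mapping, and the iconv fallback for charsets that mbstring does not list are all assumptions made for the example.

```php
<?php
// Hypothetical helpers for illustration only; they are NOT part of Diggin\Http\Charset.

/**
 * Guess the declared charset: Content-Type header first, then the HTML meta tag.
 * (The precedence order is an assumption of this sketch.)
 */
function sketchDetectCharset($contentTypeHeader, $body)
{
    // e.g. "text/html; charset=Shift_JIS"
    if ($contentTypeHeader !== null
        && preg_match('/charset\s*=\s*"?([\w.:-]+)/i', $contentTypeHeader, $m)) {
        return $m[1];
    }
    // <meta charset="..."> or <meta http-equiv="Content-Type" content="...; charset=...">
    if (preg_match('/<meta[^>]+charset\s*=\s*["\']?([\w.:-]+)/i', $body, $m)) {
        return $m[1];
    }
    return null;
}

/**
 * Convert the body to UTF-8 using the declared charset, if one was found.
 */
function sketchToUtf8($contentTypeHeader, $body)
{
    $charset = sketchDetectCharset($contentTypeHeader, $body);
    if ($charset === null) {
        return $body; // nothing declared: leave the body untouched in this sketch
    }

    // Assumption: pages labelled Shift_JIS often contain Windows-specific
    // characters, so decode them as SJIS-win.
    $aliases = array('shift_jis' => 'SJIS-win', 'sjis' => 'SJIS-win', 'x-sjis' => 'SJIS-win');
    if (isset($aliases[strtolower($charset)])) {
        $charset = $aliases[strtolower($charset)];
    }

    // Use mbstring when it lists the charset by name ...
    $supported = array_map('strtolower', mb_list_encodings());
    if (in_array(strtolower($charset), $supported, true)) {
        return mb_convert_encoding($body, 'UTF-8', $charset);
    }

    // ... otherwise try iconv, and keep the original bytes if that fails too.
    $converted = @iconv($charset, 'UTF-8//IGNORE', $body);
    return $converted === false ? $body : $converted;
}

// "caf\xE9" is "café" encoded as ISO-8859-1; the result is its UTF-8 form.
$utf8 = sketchToUtf8('text/html; charset=ISO-8859-1', "caf\xE9");
```

Trying mbstring first and falling back to iconv mirrors the requirement that both extensions be installed, but the real library may divide the work between them differently.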


Requirements

  • PHP 5.3 or later
  • The mbstring and iconv extensions


  1. Wrap the response object:
use Diggin\Http\Charset\WrapperFactory;

// $url is the address of the page to fetch.
$client = new Zend\Http\Client($url);
$response = $client->send();
// Wrap the response; from here on, getBody() returns the body converted to UTF-8.
$response = WrapperFactory::factory($response);

For more examples, see demos/Diggin/Http/Charset.

Guzzle & Goutte

guzzle-plugin-AutoCharsetEncodingPlugin adds support for use with Guzzle 3.

Usage with Behat, by @MugeSo

Technical Information

Diggin_Http_Charset is based on HTMLScraping.


Diggin_Http_Charset is licensed under the LGPL (GNU Lesser General Public License).

Similar library


TODO

  • Handle non-text/html content types.
  • Provide better APIs that follow the ZF2 coding standard.
  • Cope with more charsets :-\