Miscellaneous notes on PHP programming

Extracting article content in PHP, like Safari's "Reader" view

● Extracting article content in PHP, like Safari's "Reader" view

https://packagist.org/search/?q=Readability

● j0k3r/php-readability

https://packagist.org/packages/j0k3r/php-readability

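composer require j0k3r/php-readability

<?php
// Load Composer's autoloader so the Readability class can be found.
require __DIR__ . '/vendor/autoload.php';

use Readability\Readability;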

$url = 'http://www.medialens.org/index.php/alerts/alert-archive/alerts-2013/729-thatcher.html';

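// Fetch the HTML with whatever you like (Guzzle, Buzz, cURL, ...).
$html = file_get_contents($url);
if ($html === false) {
    // file_get_contents() returns false on failure; stop before parsing.
    exit('Could not fetch the page.');
}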

$readability = new Readability($html, $url);
// Or, to skip the Tidy cleanup pass:
// $readability = new Readability($html, $url, 'libxml', false);
// init() returns true when article content was found.
$result = $readability->init();

if ($result) {
    // display the title of the page
    echo $readability->getTitle()->textContent;
    // display the *readability* content
    echo $readability->getContent()->textContent;
} else {
    echo 'Looks like we couldn\'t find the content. :(';
}
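
The sample above fetches the page with a bare file_get_contents() call. As a rough sketch of a slightly more defensive fetch step, the helper below adds a timeout and returns null when the request fails or the body is empty. fetchHtml(), the 10-second default, and the user agent string are hypothetical names and values chosen for illustration; they are not part of php-readability.

```php
<?php
// Hypothetical helper (not part of php-readability): fetch a page and
// return null instead of false when the request fails or the body is empty.
function fetchHtml(string $url, float $timeoutSeconds = 10.0): ?string
{
    // These options only apply to http(s) URLs; other wrappers ignore them.
    $context = stream_context_create([
        'http' => [
            'timeout' => $timeoutSeconds,       // assumed default of 10 seconds
            'follow_location' => 1,             // follow redirects
            'user_agent' => 'Mozilla/5.0 (compatible; example-fetcher)', // made-up value
        ],
    ]);

    // @ suppresses the warning; failure is reported through the return value.
    $html = @file_get_contents($url, false, $context);

    if ($html === false || trim($html) === '') {
        return null;
    }

    return $html;
}
```

With this in place, the fetch line could become $html = fetchHtml($url); followed by a null check before the result is passed to new Readability($html, $url).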

No.1093
07/08 00:19

composer