peterujah / email-crawl
PHP email web crawler that uses cURL and the command line interface to extract email addresses from websites.
- php: ^7.0 || ^8.0
PHP Email Web Crawler is a simple, easy-to-use class that uses cURL and the command line interface to extract email addresses from websites. It can also deep-extract emails by following links found on the initial target website.
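To make the idea concrete, here is a minimal sketch of the two steps described above: pulling email addresses out of a fetched page, and collecting links for a deeper pass. It is not the library's code. The helper names `extractEmails` and `extractLinks`, the regular expressions, and the sample HTML are all assumptions chosen for illustration, and the email pattern is deliberately simple, so it will miss some valid addresses.

```php
<?php
// Hypothetical helper (not part of peterujah/email-crawl):
// find email-like strings in a page body.
function extractEmails(string $html): array
{
    // Simplified pattern, assumed for illustration only.
    $pattern = '/[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}/';
    preg_match_all($pattern, $html, $matches);

    // Lowercase, de-duplicate, and reindex the matches.
    return array_values(array_unique(array_map('strtolower', $matches[0])));
}

// Hypothetical helper: collect absolute http(s) links that a
// "deep" crawl could visit next.
function extractLinks(string $html): array
{
    preg_match_all('/href=["\'](https?:\/\/[^"\']+)["\']/i', $html, $matches);

    return array_values(array_unique($matches[1]));
}

// Sample page body (made up for this example).
$html = '<a href="https://example.com/contact">Contact</a> '
      . '<p>Mail Info@Example.com or info@example.com, sales@example.org.</p>';

print_r(extractEmails($html));
print_r(extractLinks($html));
```

In a real crawl the `$html` string would come from a cURL request to the target, and each collected link would be fetched and scanned in turn until the configured limit is reached.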
Installation is super-easy via Composer:
composer require peterujah/email-crawl
Basic Usage
Initialize the email crawl instance
$craw = new EmailCrawl("", 200);
Start the email crawl scan
$craw->craw();
Get the scan response as a CrawlResponse instance
$response = $craw->getResponse();
Get the response emails, one per line
$data = $response->inLine();
Get the response emails separated by commas
$data = $response->withComma();
Get the response emails as an array
$data = $response->asArray();
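As a rough picture of the three output shapes, the sketch below builds them from a plain array with `implode`. This is an assumption about what the formats look like, not the library's implementation; the sample addresses are made up, and the exact separators used by `inLine()` and `withComma()` may differ.

```php
<?php
// Stand-in data; in real use this would come from $response->asArray().
$emails = ['info@example.com', 'sales@example.org'];

// One address per line (assumed equivalent of inLine()).
$inLine = implode(PHP_EOL, $emails);

// Comma-separated (assumed equivalent of withComma()).
$withComma = implode(',', $emails);

echo $inLine, PHP_EOL;
echo $withComma, PHP_EOL;
```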
Print the response emails
$response->printCommandResult($data);
Save the response emails to a file. This saves the result as a JSON string
Save the response emails to a file. If string data is passed, it is saved as given; otherwise the result is saved as a JSON string
$response->saveAs("/path/save/craw/", $data);
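The fallback described above (use the string if one is passed, otherwise save JSON) can be sketched as follows. The helper `resolveSavePayload` is hypothetical and only illustrates the decision; how the library names the file inside the target directory is not shown here.

```php
<?php
// Hypothetical helper (not the library's code): decide what gets written.
function resolveSavePayload(array $emails, $data = null): string
{
    // A non-empty string passed by the caller is saved as-is.
    if (is_string($data) && $data !== '') {
        return $data;
    }

    // Otherwise fall back to a JSON string of the collected emails.
    $json = json_encode($emails);

    return $json === false ? '[]' : $json;
}

echo resolveSavePayload(['info@example.com']), PHP_EOL;
echo resolveSavePayload(['info@example.com'], 'info@example.com'), PHP_EOL;
```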
Create a file named craw.php and add this example code to it.
With this example you can run your crawl directly from the command line, a browser, or PHP shell_exec
<?php
error_reporting(E_ALL);
ini_set('display_errors', '1');

require __DIR__ . '/plugins/autoload.php';

use Peterujah\NanoBlock\EmailCrawl;

$target = "";
$limit = 50;

if (!empty($argv[1])) {
    if (filter_var($argv[1], FILTER_VALIDATE_URL)) {
        // First argument is a URL; the optional second argument is the limit
        $target = $argv[1];
        $limit = $argv[2] ?? 50;
    } else {
        // Otherwise the argument is a base64-encoded, serialized options array (see exec.php)
        $req = unserialize(base64_decode($argv[1]));
        $target = $req["target"];
        $limit = $req["max"] ?? 50;
    }
}

$craw = new EmailCrawl($target, $limit);
$response = $craw->craw()->getResponse();
$data = $response->inLine();

// Print the result, then save it under ./craw/
$response->printCommandResult($data)->saveAs(__DIR__ . "/craw/", $data);
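The argument handling in craw.php accepts either a plain URL or a base64-encoded, serialized options array. The sketch below isolates that branch into a hypothetical function, `parseCrawArgs`, so it can be tried without running a crawl. It makes two changes that are assumptions rather than part of the original script: it passes `allowed_classes => false` to `unserialize` so that command-line input cannot instantiate objects, and it casts the limit to an integer.

```php
<?php
// Hypothetical helper mirroring the argument branch in craw.php.
function parseCrawArgs(array $argv): array
{
    $target = '';
    $limit = 50;

    if (!empty($argv[1])) {
        if (filter_var($argv[1], FILTER_VALIDATE_URL)) {
            // Form 1: URL followed by an optional limit.
            $target = $argv[1];
            $limit = (int) ($argv[2] ?? 50);
        } else {
            // Form 2: base64-encoded serialized array with "target" and "max".
            $decoded = base64_decode($argv[1], true);
            $req = $decoded === false
                ? false
                : @unserialize($decoded, ['allowed_classes' => false]);

            if (is_array($req)) {
                $target = (string) ($req['target'] ?? '');
                $limit = (int) ($req['max'] ?? 50);
            }
        }
    }

    return ['target' => $target, 'limit' => $limit];
}

// Example with a made-up target.
$packed = base64_encode(serialize(['target' => 'https://example.com', 'max' => 10]));
print_r(parseCrawArgs(['craw.php', $packed]));
```

Input that is neither a URL nor a valid packed array falls through to the defaults instead of raising an error, which is a design choice of this sketch.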
To execute the crawl through the command line interface, run the command below
php craw.php 50
To execute the crawl through PHP shell_exec, create a file called exec.php and add the example script below.
Set PHP_SHELL_EXECUTION_PATH to your php executable path.
Once done navigate to
<?php
define("PHP_SHELL_EXECUTION_PATH", "path/to/php");

$crawOptions = array(
    'target' => '',
    'max' => 50,
);

// Pack the options into a single argument for craw.php
$crawRequest = base64_encode(serialize($crawOptions));
$crawScript = __DIR__ . "/craw.php";
$crawLogs = __DIR__ . "/craw_logs.log";

// Run craw.php and append its stdout and stderr to the log file
shell_exec(PHP_SHELL_EXECUTION_PATH . " " . $crawScript . " " . $crawRequest . " 'alert' >> " . $crawLogs . " 2>&1");
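Because the script above builds the shell command by plain concatenation, paths containing spaces or shell metacharacters could break it. One possible hardening, shown as a sketch only, is to quote each piece with PHP's built-in `escapeshellarg`. The function `buildCrawCommand` and its parameters are hypothetical, and the sketch leaves out the extra `'alert'` argument from the original script because its purpose is not documented here.

```php
<?php
// Hypothetical helper: build the shell_exec command with quoted arguments.
function buildCrawCommand(string $php, string $script, array $options, string $log): string
{
    // Same packing as exec.php: serialize, then base64-encode.
    $request = base64_encode(serialize($options));

    return escapeshellarg($php) . ' '
        . escapeshellarg($script) . ' '
        . escapeshellarg($request)
        . ' >> ' . escapeshellarg($log) . ' 2>&1';
}

// Example values (assumed for illustration).
$options = ['target' => 'https://example.com', 'max' => 50];
$cmd = buildCrawCommand('php', __DIR__ . '/craw.php', $options, __DIR__ . '/craw_logs.log');

echo $cmd, PHP_EOL;
// shell_exec($cmd); // uncomment to actually run the crawl
```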
It is advisable to run this code from the command line interface for better performance.