akdr/selma

Library for easier use of PHP-Webdriver

v0.6.0 2022-12-06 18:58 UTC

README

A PHP-WebDriver wrapper that aims to simplify web scraping.

Usage

To use the wrapper you need to have a Selenium Hub up and running.

To set one up, search for a guide online or use the docker-compose.yml in the docker directory. The file contains instructions for starting the hub once Docker is installed.

Navigation

Navigation handles browser navigation and manipulation. It is used by the Element class and must be started before you try to scrape.

// Example of starting a navigation. The first argument is the location of Selenium Hub and
// the second is an array of Chrome options.
use Akdr\Selma\Navigation;
$nav = new Navigation('http://localhost:4444/wd/hub', ["window-size=1920,4000", "--headless", "--disable-gpu", "--no-sandbox"]);
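
For orientation, the two constructor arguments correspond to what you would otherwise pass to php-webdriver yourself. The following is a minimal sketch of an equivalent raw php-webdriver setup, not Selma's actual internals; it assumes php-webdriver is installed through Composer and that a Selenium Hub is listening on localhost:4444. The URL https://example.com is only a placeholder.

```php
<?php
// Assumption: php-webdriver is installed via Composer in this directory.
require __DIR__ . '/vendor/autoload.php';

use Facebook\WebDriver\Chrome\ChromeOptions;
use Facebook\WebDriver\Remote\DesiredCapabilities;
use Facebook\WebDriver\Remote\RemoteWebDriver;

// Second Navigation argument: the list of Chrome command-line options.
$options = new ChromeOptions();
$options->addArguments(['window-size=1920,4000', '--headless', '--disable-gpu', '--no-sandbox']);

// Attach the options to a Chrome capability set.
$capabilities = DesiredCapabilities::chrome();
$capabilities->setCapability(ChromeOptions::CAPABILITY, $options);

// First Navigation argument: the Selenium Hub endpoint.
$driver = RemoteWebDriver::create('http://localhost:4444/wd/hub', $capabilities);

// Placeholder page, only to show that the session works.
$driver->get('https://example.com');
echo $driver->getTitle(), PHP_EOL;

// Always end the session so the hub can free the browser slot.
$driver->quit();
```

If this script cannot connect, the hub is most likely not running or not reachable at the given address, and Navigation would fail for the same reason.
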
Available methods:

Element

The Element class handles everything DOM-related. It searches for DOM elements, extracts text, fills in inputs, and clicks elements.

// Example of using Element to fill out a form and then click the submit button.
use Akdr\Selma\Element;
use Akdr\Selma\Navigation;

$nav = new Navigation('http://localhost:4444/wd/hub', ["window-size=1920,4000", "--headless", "--disable-gpu", "--no-sandbox"]);

// The first time, we need to instantiate Element with our browser;
// after that we can keep reusing it through the set() method.

// Enter the text "Selma is being used" into the input.
$element = new Element($nav, [
    'selector' => '#form-input',
    'input' => "Selma is being used"
]);

// Click the submit button
$element->set([
    'selector' => '#submit-button',
    'click' => true,
    'delay' => 400000
]);

// Select the response, which is a span without a class or id inside a container.

$container = $element->set([
    'selector' => '.container'
]);

$response = $element->set([
    'element' => $container->element,
    'selector' => 'span',
    'attribute' => 'text'
]);

// Finally, read the response and get the integer while removing the rest of the text.
$response->getValue('int');
Available methods

To get an Element and manipulate it, use either the constructor or the set() method. Both take an array as their argument, and its keys are executed in the order they are given. The 'selector' key must always be set and must come first, with the exception of 'element', which may precede it.

To read the value of the attribute, chain ->getValue('int'|'float'|null) onto the constructor or set().
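
How getValue() converts the text is not spelled out here, but the example at the end of this README shows "Title Number 3.14" becoming 314 for 'int' and 3.14 for 'float'. The sketch below is one way such a conversion could be written in plain PHP. castScrapedValue() is a hypothetical helper for illustration, not part of Selma, and Selma's real behavior may differ for other inputs.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper, NOT Selma's implementation: converts scraped text
 * to an int, a float, or leaves it as a string.
 *
 * @return int|float|string
 */
function castScrapedValue(string $raw, ?string $type = null)
{
    switch ($type) {
        case 'int':
            // Drop every non-digit character, then cast what is left.
            return (int) preg_replace('/\D+/', '', $raw);
        case 'float':
            // Use the first decimal number found in the string.
            if (preg_match('/-?\d+(?:\.\d+)?/', $raw, $match) === 1) {
                return (float) $match[0];
            }
            return 0.0;
        default:
            // No type requested: return the text unchanged.
            return $raw;
    }
}

$text  = castScrapedValue('Title Number 3.14');          // 'Title Number 3.14'
$int   = castScrapedValue('Title Number 3.14', 'int');   // 314
$float = castScrapedValue('Title Number 3.14', 'float'); // 3.14
```

Note that in this sketch 'int' removes the decimal point along with the letters, which is why 3.14 turns into 314 rather than 3.
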

Public properties

If you need the underlying RemoteWebElement, read the public property 'element'.

If you need the boolean result of a 'hasClass' check, read the public property 'hasClass'.

Example:

use Akdr\Selma\Navigation;
use Akdr\Selma\Element;

// Setup the browser and initiate the element class.
$nav = new Navigation('http://localhost:4444/wd/hub', ["window-size=1920,4000", "--headless", "--disable-gpu", "--no-sandbox"]);
$element = new Element($nav, []);

// <a href="https://comparico.se" class="title">Title Number 3.14</a>
$title = $element->set([
    'selector' => 'a',
    'hasClass' => 'title',
    'attribute' => 'text'
]);

// Returns the RemoteWebElement
$title->element; // Facebook\WebDriver\Remote\RemoteWebElement

// Returns the class-bool
$title->hasClass; // true

// Fetch the attribute
$title->getValue(); // Title Number 3.14
$title->getValue('int'); // 314
$title->getValue('float'); // 3.14
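
Because 'element' exposes the underlying RemoteWebElement, php-webdriver's own methods remain available for cases the wrapper does not cover. The sketch below assumes php-webdriver is installed through Composer; describeLink() is a hypothetical helper written for illustration and is not part of Selma.

```php
<?php
// Assumption: php-webdriver is installed via Composer in this directory.
require __DIR__ . '/vendor/autoload.php';

use Facebook\WebDriver\Remote\RemoteWebElement;

/**
 * Hypothetical helper, not part of Selma: collects a few facts about a
 * link by calling php-webdriver directly on the raw element.
 */
function describeLink(RemoteWebElement $link): array
{
    return [
        'text'    => $link->getText(),            // visible text of the element
        'href'    => $link->getAttribute('href'), // value of the href attribute
        'visible' => $link->isDisplayed(),        // whether the browser renders it
    ];
}

// Usage, with $title taken from the example above:
// $info = describeLink($title->element);
```
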