Google Scholar Profile Parser is a PHP library that parses the HTML of a scholar's profile page on the Google Scholar website and transforms its data into a regular PHP data structure.
The data parsed for a scholar is:
- their list of publications (title, link, authors, publisher details, citations)
- their citation statistics (number of citations, h-index, i10-index)
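To illustrate the general idea of turning profile HTML into a PHP array, here is a minimal sketch that relies only on the PHP DOM extension. It is not this library's API: the function name `parsePublications`, the markup and the class names (`publication`, `title`, `authors`, `citations`) are hypothetical and much simpler than the real Google Scholar page.

```php
<?php
// Hypothetical, simplified markup: NOT the real Google Scholar HTML.
$html = <<<'HTML'
<table>
  <tr class="publication">
    <td class="title"><a href="/paper-1">First paper</a></td>
    <td class="authors">A. Author, B. Author</td>
    <td class="citations">12</td>
  </tr>
  <tr class="publication">
    <td class="title"><a href="/paper-2">Second paper</a></td>
    <td class="authors">A. Author</td>
    <td class="citations">3</td>
  </tr>
</table>
HTML;

/**
 * Hypothetical helper: extracts publications from the simplified markup above.
 */
function parsePublications(string $html): array
{
    $document = new DOMDocument();
    // Collect libxml warnings internally instead of emitting PHP warnings.
    libxml_use_internal_errors(true);
    $document->loadHTML($html);
    libxml_clear_errors();

    $xpath = new DOMXPath($document);
    $publications = [];
    foreach ($xpath->query('//tr[@class="publication"]') as $row) {
        // Each lookup is relative to the current row.
        $link = $xpath->query('.//td[@class="title"]/a', $row)->item(0);
        $authors = $xpath->query('.//td[@class="authors"]', $row)->item(0);
        $citations = $xpath->query('.//td[@class="citations"]', $row)->item(0);

        $publications[] = [
            'title' => trim($link->textContent),
            'link' => $link->getAttribute('href'),
            'authors' => array_map('trim', explode(',', $authors->textContent)),
            'citations' => (int) trim($citations->textContent),
        ];
    }

    return $publications;
}

$publications = parsePublications($html);
print_r($publications);
```

Each publication becomes a plain associative array, so the result can be passed to `json_encode()` or iterated over without any further dependency.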
As explained by this Wikipedia page:
"Google Scholar is a freely accessible web search engine that indexes the full text or metadata of scholarly literature across an array of publishing formats and disciplines."
Google Scholar is a website which indexes scholars' publications and citations.
Unfortunately, the Google Scholar website doesn't provide an API, and I needed a way to fetch a scholar's data.
While looking for a PHP library that parses a profile page from the Google Scholar website, I only found Scholar parser by Daniel Schreij. However, I was unhappy with that library's dependency on PhantomJS, whose development is suspended (and will likely not resume, leaving users without support). So I decided to rewrite the library, redesigning it to depend only on PHP and no longer on JavaScript.
As stated in composer.json, it requires:
- PHP 7.1+
- PHP DOM extension
To run this library on PHP 5.6+, install version 1.x.
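As a quick way to confirm that an environment meets the requirements listed above, the following sketch checks the PHP version and the DOM extension. The function name `checkRequirements` is hypothetical and not part of this library; the version threshold is taken from the requirements above.

```php
<?php
/**
 * Hypothetical helper: returns a list of unmet requirements (empty if all are met).
 */
function checkRequirements(): array
{
    // Minimum version for the current release, as listed in the requirements.
    $minimumPhpVersion = '7.1.0';

    $problems = [];
    if (version_compare(PHP_VERSION, $minimumPhpVersion, '<')) {
        $problems[] = sprintf(
            'PHP %s or later is required, found %s.',
            $minimumPhpVersion,
            PHP_VERSION
        );
    }
    if (!extension_loaded('dom')) {
        $problems[] = 'The PHP DOM extension is not loaded.';
    }

    return $problems;
}

$problems = checkRequirements();
echo $problems ? implode(PHP_EOL, $problems) . PHP_EOL : 'Requirements met.' . PHP_EOL;
```

Composer performs equivalent checks itself when installing the package, so this is only useful for diagnosing a target environment by hand.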
Use Composer to download and install this library as well as its dependencies.
composer require bborrel/google-scholar-profile-parser
See the examples in the library's documentation.
This library uses SemVer for versioning. For available versions, see the tags on this repository. For feature changes, see the CHANGELOG.md file.
The code of this library:
- follows the PSR-1 and PSR-12 coding standards
- follows the PSR-4 autoloading standard
- is statically analysed with PHPQA (which wraps several tools, notably PHPCS, PHPMD, PHPStan and Psalm) and by Code Climate (which is set up with the plugins Phan, PHPMD and SonarPHP)
- is unit tested with PHPUnit (code coverage on Coveralls)
- is mutation tested with Infection
- is tested for compatibility with different versions of PHP (see .travis.yml for details)
- has some of its dependencies (those listed by the PHP Security Advisories Database) checked for known security issues
- is continuously integrated on TravisCI
These tools are installed with the library as long as you do not specify the option --no-dev when running the install or update Composer commands.
To run the static analysis tools and the unit tests via PHPQA:
./vendor/bin/phpqa --analyzedDirs=. --ignoredDirs=build,tests,vendor --report
To see the reports generated by PHPQA, open the file ./build/phpqa.html in a browser.
This library is licensed under the GPL-3.0-only license; see the LICENSE.md file for details.