DBLP Citation Reporter

Creates a citation-based report of impactful computer science conferences in DBLP.

Based on the DBLPParser by Isaac Changhau, a Python parser for the dataset of the DBLP computer science bibliography. The DTD file that describes the XML format of the DBLP data can be downloaded from here.

Software needed: make to run the scripts easily, wget to download the dataset from DBLP, gzip to decompress the dataset, and Python 3 to parse the XML. Windows users can install make, wget, and gzip from here.

Useful make targets and commands:

TODO: make report will output a small report of the most cited conferences
make dataset will download and decompress the most recent dataset from DBLP automatically (may take some time)
make clean will remove the dataset so that a fresh and updated dataset can be downloaded
make venv will create the Python virtual environment and install the pip package requirements
make clean-venv will remove the python virtual environment so it can be recreated from scratch
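
For readers without make, wget, or gzip, the download-and-decompress step behind make dataset can be approximated in plain Python. This is a minimal sketch, not the project's Makefile logic: the URL, the function names, and the file names are assumptions chosen for illustration, with the dataset directory taken from the sample path dataset/dblp.xml used later in this README.

```python
import gzip
import shutil
import urllib.request
from pathlib import Path

# Assumed values for illustration; they are not read from the project's Makefile.
DBLP_URL = "https://dblp.org/xml/dblp.xml.gz"
DATASET_DIR = Path("dataset")


def decompress(gz_path: Path, xml_path: Path) -> Path:
    """Stream-decompress gz_path into xml_path without loading it all into memory."""
    with gzip.open(gz_path, "rb") as src, open(xml_path, "wb") as dst:
        shutil.copyfileobj(src, dst)
    return xml_path


def fetch_dataset(url: str = DBLP_URL, target_dir: Path = DATASET_DIR) -> Path:
    """Download the compressed dump, unpack it, and return the path to the XML file."""
    target_dir.mkdir(parents=True, exist_ok=True)
    gz_path = target_dir / "dblp.xml.gz"
    urllib.request.urlretrieve(url, gz_path)  # large download; may take some time
    return decompress(gz_path, target_dir / "dblp.xml")
```

Calling fetch_dataset() would then play the role of make dataset; the matching DTD file still has to be placed next to the XML file.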

The DBLP Dataset Parser

The parser requires the DTD file, so make sure you have both the dblp-XXX.xml (dataset) and dblp-XXX.dtd files. Both files must be in the same directory, and the name of the DTD file must match the name given in the <!DOCTYPE> tag of the XML file. You can check that name with the head dblp-XXX.xml command, whose output looks like this:

<?xml version="1.0" encoding="ISO-8859-1"?>
<!DOCTYPE dblp SYSTEM "dblp-2017-08-29.dtd">
<dblp>
<phdthesis mdate="2016-05-04" key="phd/dk/Heine2010">
<author>Carmen Heine</author>
<title>Modell zur Produktion von Online-Hilfen.</title>
...
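
The same check can be done from Python before starting a long parse. The sketch below reads only the first few kilobytes of the XML file, extracts the DTD name from the <!DOCTYPE> tag, and tests whether a file with that name sits in the same directory. The function names declared_dtd and dtd_is_present are hypothetical helpers, not part of the parser.

```python
import re
from pathlib import Path
from typing import Optional, Union

# Matches a declaration such as: <!DOCTYPE dblp SYSTEM "dblp-2017-08-29.dtd">
DOCTYPE_RE = re.compile(r'<!DOCTYPE\s+dblp\s+SYSTEM\s+"([^"]+)"')


def declared_dtd(xml_path: Union[str, Path], head_bytes: int = 4096) -> Optional[str]:
    """Return the DTD file name declared in the DOCTYPE tag, or None if absent."""
    with open(xml_path, "rb") as f:
        # Only the head of the file is read, much like the head command.
        head = f.read(head_bytes).decode("iso-8859-1")
    match = DOCTYPE_RE.search(head)
    return match.group(1) if match else None


def dtd_is_present(xml_path: Union[str, Path]) -> bool:
    """True when the declared DTD file sits in the same directory as the XML file."""
    xml_path = Path(xml_path)
    name = declared_dtd(xml_path)
    return name is not None and (xml_path.parent / name).is_file()
```

If dtd_is_present('dataset/dblp.xml') returns False, either rename the DTD file to the declared name or move it next to the XML file.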

A sample use of the parser:

import sys

# context_iter, log_msg and parse_article are provided by the parser module.
def main():
    dblp_path = 'dataset/dblp.xml'
    save_path = 'article.json'
    try:
        # Fails with IOError if the XML file or its DTD cannot be loaded.
        context_iter(dblp_path)
        log_msg("LOG: Successfully loaded \"{}\".".format(dblp_path))
    except IOError:
        log_msg("ERROR: Failed to load file \"{}\". Please check your XML and DTD files.".format(dblp_path))
        sys.exit(1)
    parse_article(dblp_path, save_path, save_to_csv=False)  # saves in JSON format by default
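
To give a feel for the streaming style that a file of this size calls for, here is a small self-contained sketch that counts conference papers per venue. It is not the DBLPParser implementation and it counts records, not citations: it uses the standard-library xml.etree.ElementTree.iterparse on an invented in-memory sample, and count_by_venue, SAMPLE, and every key and venue name in it are hypothetical. The standard-library parser does not load the DTD, so it would likely fail on the full dump, whose character entities are defined there.

```python
import io
import xml.etree.ElementTree as ET
from collections import Counter


def count_by_venue(source) -> Counter:
    """Count <inproceedings> records per <booktitle>, streaming through the XML."""
    counts = Counter()
    for _event, elem in ET.iterparse(source, events=("end",)):
        if elem.tag == "inproceedings":
            venue = elem.findtext("booktitle")
            if venue:
                counts[venue] += 1
            elem.clear()  # release the record's children once it has been counted
    return counts


# Invented sample records, used only to exercise the function.
SAMPLE = b"""<dblp>
<inproceedings key="conf/x/1"><title>First.</title><booktitle>ConfA</booktitle></inproceedings>
<inproceedings key="conf/x/2"><title>Second.</title><booktitle>ConfA</booktitle></inproceedings>
<inproceedings key="conf/y/1"><title>Third.</title><booktitle>ConfB</booktitle></inproceedings>
<article key="journals/z/1"><title>Fourth.</title><journal>JournalZ</journal></article>
</dblp>"""

if __name__ == "__main__":
    print(count_by_venue(io.BytesIO(SAMPLE)).most_common(1))  # [('ConfA', 2)]
```

Processing one record at a time and clearing it afterwards keeps memory use low, which matters for a dataset that takes some time even to download.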

