Rüdiger Gleim edited this page Feb 8, 2018 · 6 revisions

WikiDragon is a Java framework for building and working on Wikimedia wikis offline. The typical workflow is:

  • Download an XML dump from the Wikimedia Foundation (e.g. https://dumps.wikimedia.org/simplewiki/20180201/simplewiki-20180201-pages-meta-current.xml.bz2).
  • Import it into a local database using WikiDragon (note that you can import multiple Wikis into one database).
  • Optionally pre-parse HTML content using XOWA integration.
  • Optionally create PageTiers, which represent time layers in a Wiki.
  • Browse through Wikis, Namespaces, Pages, Revisions and Users using WikiDragon's Java API
  • Use the export functions to export graph structures to GraphML or contents to TEI P5.
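
The download step can be scripted. The following is a minimal sketch, not part of WikiDragon: the class name DumpUrl and its build method are hypothetical, and the URL pattern is inferred solely from the example URL in the first step, so other dumps may not follow it.

```java
/**
 * Hypothetical helper (not part of WikiDragon) that builds the URL of a
 * "pages-meta-current" dump. The pattern is an assumption derived from
 * the single example URL given in the workflow above.
 */
public final class DumpUrl {
    private static final String BASE = "https://dumps.wikimedia.org";

    private DumpUrl() {}

    /**
     * @param wiki database name of the wiki, e.g. "simplewiki"
     * @param date dump date as eight digits (yyyyMMdd), e.g. "20180201"
     */
    public static String build(String wiki, String date) {
        if (!wiki.matches("[a-z0-9_]+")) {
            throw new IllegalArgumentException("Unexpected wiki name: " + wiki);
        }
        if (!date.matches("\\d{8}")) {
            throw new IllegalArgumentException("Date must be yyyyMMdd: " + date);
        }
        return String.format("%s/%s/%s/%s-%s-pages-meta-current.xml.bz2",
                BASE, wiki, date, wiki, date);
    }

    public static void main(String[] args) {
        // Reproduces the example URL from the workflow list.
        System.out.println(build("simplewiki", "20180201"));
    }
}
```

The resulting file is bzip2-compressed XML, and importing it is left to WikiDragon itself.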

Getting started

  • Check out a working copy. All relevant libraries are covered by Maven dependencies except for XOWA, which is used to render MediaWiki markup into HTML.
  • Create a directory named xowa_win, xowa_linux, or xowa_mac, depending on your operating system, in the project root.
  • Download XOWA from http://xowa.org/home/wiki/Help/Download_XOWA.html and unzip it into the xowa_win, xowa_linux, or xowa_mac directory.
  • Your resulting directory structure should look something like this:
    • xowa_linux
      • bin
      • user
      • readme.txt
      • xowa_linux_64.jar
      • xowa_linux_64.sh
  • Build the project using Maven, for example from your favourite Java IDE.
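
Before building, it can help to verify that XOWA was unpacked where the steps above expect it. The sketch below is an illustration only and not part of WikiDragon: the class XowaLayoutCheck and its methods are hypothetical, and it assumes that the bin and user subdirectories shown for xowa_linux also exist in the Windows and macOS packages. The platform-specific jar and script names are not checked.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

/** Hypothetical helper (not part of WikiDragon) that checks the XOWA directory layout. */
public final class XowaLayoutCheck {
    private XowaLayoutCheck() {}

    /** Maps an os.name value to the directory name used in the steps above. */
    public static String directoryFor(String osName) {
        String os = osName.toLowerCase(Locale.ROOT);
        // Test for macOS first: "darwin" also contains the substring "win".
        if (os.contains("mac") || os.contains("darwin")) {
            return "xowa_mac";
        }
        if (os.contains("win")) {
            return "xowa_win";
        }
        return "xowa_linux";
    }

    /** Returns the expected paths that are missing; an empty list means the layout looks fine. */
    public static List<String> missingEntries(Path projectRoot, String osName) {
        Path xowa = projectRoot.resolve(directoryFor(osName));
        List<String> missing = new ArrayList<>();
        if (!Files.isDirectory(xowa)) {
            missing.add(xowa.toString());
            return missing;
        }
        // Assumption: these subdirectories exist on every platform.
        for (String name : new String[] {"bin", "user"}) {
            Path entry = xowa.resolve(name);
            if (!Files.isDirectory(entry)) {
                missing.add(entry.toString());
            }
        }
        return missing;
    }

    public static void main(String[] args) {
        // Pass the project root as the first argument, or run from the project root.
        Path root = Paths.get(args.length > 0 ? args[0] : ".");
        List<String> missing = missingEntries(root, System.getProperty("os.name"));
        if (missing.isEmpty()) {
            System.out.println("XOWA directory layout looks complete.");
        } else {
            missing.forEach(path -> System.out.println("Missing: " + path));
        }
    }
}
```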