Puppet module to manage Druid based on the Imply.io stack. This module manages all the Druid daemons and Pivot.
Support for the Druid.io distribution is planned for the near future.
This module deploys the Imply.io tarball (see: http://imply.io/download) and lets you start the different Druid services as well as Pivot.
More information about the Imply.io bundle here: http://imply.io.
Files managed by this module:
- Deploy the imply tarball using Archive: puppet-archive
- Modify configuration in (by default):
/opt/imply/conf
- Manage all Druid and Pivot services:
/etc/init.d/druid-*
If asked, the module will also deploy Java and Nodejs.
Deploy the version 1.1.0 of the Imply bundle:
class { 'druid':
  imply_version => '1.1.0',
}
If you also want to install Java:
class { 'druid':
  install_java => true,
}
By default, the package 'openjdk-8-jdk' from the PPA 'ppa:openjdk-r/ppa' will be deployed. You can override this configuration.
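As a minimal sketch of such an override, the java_ppa and java_package parameters documented in the parameter list can be set alongside install_java. The package name used here is only an illustrative assumption and has not been tested with this module:

```puppet
class { 'druid':
  install_java => true,
  # PPA the package is fetched from (this is the module default)
  java_ppa     => 'ppa:openjdk-r/ppa',
  # Assumed alternative package; replace it with the one you need
  java_package => 'openjdk-8-jre-headless',
}
```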
Configure a Master node:
class { 'druid': }
class { 'druid::coordinator': }
class { 'druid::overlord': }
Configure a Data node:
class { 'druid': }
class { 'druid::middle_manager': }
class { 'druid::historical': }
Configure a Query node:
class { 'druid': }
class { 'druid::broker': }
class { 'druid::pivot': }
By default, the class druid::pivot will not deploy Nodejs. You can use another Puppet module to deploy it before starting Pivot, or use the install_nodejs parameter:
class { 'druid::pivot':
  install_nodejs => true,
}
Here is an example with MySQL as a Metadata Storage and Statsd emitter for the performance metrics:
class { 'druid':
  java_classpath_extensions => [
    'io/druid/extensions/mysql-metadata-storage/0.8.2/mysql-metadata-storage-0.8.2.jar',
    'mysql/mysql-connector-java/5.1.34/mysql-connector-java-5.1.34.jar',
  ],
  common_config             => {
    'extensions'              => {
      'localRepository' => 'dist/druid/extensions-repobla',
      'coordinates'     => [],
    },
    'metadata'                => {
      'storage' => {
        'type'      => 'mysql',
        'connector' => {
          'connectURI' => 'jdbc:mysql://db.example.com:3306/druid',
          'user'       => 'foo',
          'password'   => 'bar',
        },
      },
    },
    'emitter'                 => 'statsd',
    'emitter.statsd.hostname' => 'localhost',
    'emitter.statsd.port'     => 8125,
  },
}
Deploy the coordinator with some specific configuration:
class { 'druid::coordinator':
  config => {
    'coordinator' => {
      'period'                => 'PT30S',
      'period.indexingPeriod' => 'PT900S',
    },
  },
}
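The other node classes are documented as accepting a configuration hash in the same way. As a hedged sketch, assuming the parameter is also named config there and that nested keys are flattened into dotted Druid properties as in the coordinator example, an Overlord could be set to run tasks remotely like this. druid.indexer.runner.type is a standard Druid property, but its use through this module is an assumption and has not been checked against the module's templates:

```puppet
class { 'druid::overlord':
  config => {
    'indexer' => {
      'runner' => {
        # Assumed to be rendered as druid.indexer.runner.type=remote
        'type' => 'remote',
      },
    },
  },
}
```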
Logstash:
This module allows one to optionally add a second log4j2 appender that writes to a json_lines enabled logstash TCP socket (using https://github.com/DNSBelgium/log4j-jsonevent-layout)
class { 'druid':
  logstash_server      => 'log-endpoint.server.rocks',
  logstash_port        => 4561,
  logstash_user_fields => "servertype:druid, ip:${::ipaddress}",
}
It is recommended to add:
-DLog4jContextSelector=org.apache.logging.log4j.core.async.AsyncLoggerContextSelector
to your jvm options when using the logstash output so as to avoid blocking.
Fetch and deploy the Imply.io tarball.
This class will also deploy the Druid common configuration. See: Configuring Druid
Parameters within druid:
Version of the Imply.io tarball to deploy. See: http://imply.io/download
Default: 1.2.1
Define the installation method. For now, only tarball is supported.
Default: tarball
Where to deploy the tarball
Default: /opt
Name of the destination link
Default: imply
If true, the module will try to install Java. This parameter is used with java_ppa and java_package.
For the moment, only Debian-like distributions are supported.
Default: false
Define the name of the Ubuntu PPA which will be used to deploy Java
Requirement: $osfamily == Debian
Default: ppa:openjdk-r/ppa
Package name of Java
Default: openjdk-8-jdk
Java Home directory
Default: /usr/lib/jvm/java-8-openjdk-amd64
Druid configuration directory
Default: /opt/imply/conf/druid
Druid distribution directory
Default: /opt/imply/dist/druid
Druid username
Default: druid
Druid group name
Default: druid
If true, the module will start the Druid services and restart them when configuration changes are applied
Default: true
Define where Druid will find its JAR files
Default: /opt/imply/dist/druid/lib/*
Define the list of Java extensions to load when the Druid services start
For example, if you want to use MySQL as your metadata storage:
druid::java_classpath_extensions:
- 'io/druid/extensions/mysql-metadata-storage/0.8.2/mysql-metadata-storage-0.8.2.jar'
- 'mysql/mysql-connector-java/5.1.34/mysql-connector-java-5.1.34.jar'
Default: []
Log directory
Default: /var/log/druid
Hash defining the Druid Common configuration
See: http://druid.io/docs/latest/configuration/index.html
Default: {}
Hostname or IP of a logstash server listening for json_lines via TCP. Enables the appender when defined.
Default: undef
Port for the above server.
Default: 4561
String of key:value pairs separated by commas. Allows one to define custom fields in the json being sent to logstash.
Ex: "hostname:druidbox01, region:us-east-1"
Default: ''
Each Druid Node (see http://druid.io/docs/latest/design/design.html) has its own Puppet class.
Each of these classes uses the Puppet type druid::node to define the configuration and the daemon to start.
Parameters within druid::coordinator, druid::overlord, druid::historical, druid::middle_manager, druid::broker:
Name of the Druid Node
Default: name of the class. For druid::coordinator, $service == 'coordinator'
Listening host of the Druid Node
Default: localhost
Listening port of the Druid Node
Default: 8083
Java options for the Java daemon
Default: []
Hash defining the configuration of the Druid Node
Default: {}
This class will deploy and configure Pivot.
Parameters within druid::pivot:
String setting the home directory for the imply-ui distribution
Default: /opt/imply/dist/imply-ui
String setting the configuration directory of the imply-ui distribution
Default: /opt/imply/conf/pivot
Hash defining the configuration of the state storage options for the imply-ui
Default: {}
Port of Pivot
Default: 9095
Broker host used by Pivot
Default: localhost:8082
Print logs to stdout
Default: true
Enable file logging
Default: true
Location for Pivot log files
Default: /var/log/pivot
Location for Pivot license source
Default: undef
Max number of worker processes
Default: 0
If true, use a segment metadata query instead of a GET request to /druid/v2/datasources to determine datasource dimensions and metrics.
Default: false
Check for new dataSources periodically. Set to 0 to disable background introspection
Default: 0
Checks for new dataSources every time Pivot is loaded
Default: false
If true, the module will install NodeJS
Default: false
Version of NodeJS to install
Default: latest
This module has only been tested with Ubuntu 14.04 and Puppet 3.8.x, but should work with other Linux distributions.
Because the module uses a Launchpad PPA unless java_ppa is set to undef, you will have to change the default value if you are not on a Debian-like OS.
See CONTRIBUTING.md