nifi-standardize-date-bundle

NiFi processor to standardize date fields in a FlowFile.

Deploy Bundle

Clone this repository

git clone https://github.com/1904labs/nifi-standardize-date-bundle

Build the bundle

cd nifi-standardize-date-bundle
mvn initialize
mvn clean package

Copy the NAR file to $NIFI_HOME/lib

cp nifi-standardize-date-bundle/target/nifi-standardize-date-nar-$version.nar $NIFI_HOME/lib/

Start or restart NiFi

$NIFI_HOME/bin/nifi.sh start

Processor properties

FlowFile Format: The format of the incoming FlowFile. If AVRO, the output is automatically Snappy compressed.

Avro Schema: The schema to use if the FlowFile format is Avro.

Invalid Dates: A JSON object of key/value pairs, where each key is the name of a date field in the FlowFile and each value is the date format that field currently uses. For example: {"my_date_field": "MM/dd/yyyy"}
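
To illustrate how a field-to-pattern mapping like this might be applied, the sketch below parses each listed field with its configured pattern and rewrites it as an ISO-8601 date. It assumes the patterns follow java.time.format.DateTimeFormatter syntax and that ISO-8601 is the target format; the README states neither. The class and method names are hypothetical and are not taken from the processor's source.

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch; not the processor's actual implementation.
class InvalidDatesSketch {

    // Parse `raw` with the configured pattern and return an ISO-8601 date
    // (ISO-8601 as the target format is an assumption).
    static String standardize(String raw, String pattern) {
        DateTimeFormatter formatter = DateTimeFormatter.ofPattern(pattern);
        return LocalDate.parse(raw, formatter).toString();
    }

    // Apply every field -> pattern entry to a flat record; other fields are left untouched.
    static Map<String, String> standardizeRecord(Map<String, String> record,
                                                 Map<String, String> invalidDates) {
        Map<String, String> result = new LinkedHashMap<>(record);
        for (Map.Entry<String, String> entry : invalidDates.entrySet()) {
            String raw = result.get(entry.getKey());
            if (raw != null) {
                result.put(entry.getKey(), standardize(raw, entry.getValue()));
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Mirrors the property value {"my_date_field": "MM/dd/yyyy"}.
        Map<String, String> invalidDates = new LinkedHashMap<>();
        invalidDates.put("my_date_field", "MM/dd/yyyy");

        Map<String, String> record = new LinkedHashMap<>();
        record.put("id", "42");
        record.put("my_date_field", "03/15/2019");

        // prints {id=42, my_date_field=2019-03-15}
        System.out.println(standardizeRecord(record, invalidDates));
    }
}
```

A value that does not match its pattern makes LocalDate.parse throw a DateTimeParseException; how the processor itself handles such records is not described here.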

Timezone: The originating timezone of the date fields in the FlowFile. Short or standard IDs are accepted (e.g. 'CST' or 'America/Chicago').
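
As a rough illustration of what an "originating timezone" means in practice, the sketch below resolves either a short or a standard ID and uses it to interpret a local timestamp before converting it to UTC. It relies on java.time's ZoneId.SHORT_IDS alias map, which maps 'CST' to 'America/Chicago'. Whether the processor resolves IDs this way, and whether it emits UTC, are assumptions; the class and method names are hypothetical.

```java
import java.time.LocalDateTime;
import java.time.ZoneId;
import java.time.format.DateTimeFormatter;

// Hypothetical sketch; not the processor's actual implementation.
class TimezoneSketch {

    // Accept short IDs ("CST") via the JDK alias map, or standard IDs ("America/Chicago").
    static ZoneId resolveZone(String id) {
        return ZoneId.of(id, ZoneId.SHORT_IDS);
    }

    // Interpret `raw` as wall-clock time in the originating zone and
    // return the corresponding UTC instant as an ISO-8601 string (assumed output).
    static String toUtc(String raw, String pattern, String zoneId) {
        LocalDateTime local = LocalDateTime.parse(raw, DateTimeFormatter.ofPattern(pattern));
        return local.atZone(resolveZone(zoneId)).toInstant().toString();
    }

    public static void main(String[] args) {
        String pattern = "MM/dd/yyyy HH:mm:ss";
        // Winter: Chicago is on standard time (UTC-6).
        System.out.println(toUtc("01/15/2019 08:30:00", pattern, "CST"));
        // Summer: the same zone is on daylight time (UTC-5).
        System.out.println(toUtc("07/15/2019 08:30:00", pattern, "America/Chicago"));
    }
}
```

Because the short ID is resolved to a region rather than a fixed offset, daylight saving time is applied according to the date being converted.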

Notes

  • The incoming FlowFile is expected to contain one JSON object per line.
  • If the Invalid Dates property is not set, the processor automatically sends the FlowFile to the bypass relationship.
  • Avro is always Snappy compressed on output.
  • This processor uses a custom Avro library in order to handle Avro's union types. Until this issue is resolved, it will continue to use the custom library.
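
The first two notes can be sketched roughly as follows: the routing decision depends only on whether Invalid Dates is set, and line-delimited content is handled one record at a time. The "bypass" relationship name comes from the note above; the "success" name, the class, and the method names are hypothetical, and the per-record transform is left as a placeholder.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.UnaryOperator;

// Hypothetical sketch; not the processor's actual implementation.
class LineRoutingSketch {

    // An unset (null or blank) Invalid Dates property sends the FlowFile to "bypass".
    // "success" is an assumed name for the normal output relationship.
    static String chooseRelationship(String invalidDatesProperty) {
        boolean unset = invalidDatesProperty == null || invalidDatesProperty.trim().isEmpty();
        return unset ? "bypass" : "success";
    }

    // Treat the content as one JSON object per line and apply a transform to each line.
    static List<String> mapRecords(String content, UnaryOperator<String> perRecord) {
        List<String> out = new ArrayList<>();
        for (String line : content.split("\\R")) {
            if (line.trim().isEmpty()) {
                continue; // skip blank lines between records
            }
            out.add(perRecord.apply(line));
        }
        return out;
    }

    public static void main(String[] args) {
        String content = "{\"my_date_field\": \"03/15/2019\"}\n"
                       + "{\"my_date_field\": \"04/01/2019\"}\n";

        System.out.println(chooseRelationship(null));
        System.out.println(chooseRelationship("{\"my_date_field\": \"MM/dd/yyyy\"}"));

        // Identity transform stands in for the actual date rewrite.
        for (String record : mapRecords(content, line -> line)) {
            System.out.println(record);
        }
    }
}
```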

TODO

  • Use drop-down (with custom option) for timezone
  • Allow choice of Avro compression (Snappy, bzip2, etc.)
  • Infer Avro schema if not passed in
  • Better unit tests for Avro