This library lets you read and write a list of elements (even of different types, as long as they share the same parent) item by item from and to an XML file. The goal is to avoid loading a huge amount of data into memory when processing large files.
The dependency is available on Maven Central (see badge for version):
<dependency>
    <groupId>com.chavaillaz</groupId>
    <artifactId>jaxb-stream</artifactId>
</dependency>
You can find the following example in the StreamingTest class. Note that this library also works with JAXB classes generated from an XSD file. In that case, have a look at the StreamingXsdTest class.
Suppose you are storing different types of metrics in an XML file. Because of memory constraints and the number of entries, you cannot load them all at once, as plain JAXB would do by unmarshalling the whole file into the container class.
To process them anyway, you can use this library to read or write them item by item.
In this example, an interface Metric is implemented by multiple metric types:
- Disk metrics (class DiskMetric, XML element disk)
- Memory metrics (class MemoryMetric, XML element memory)
- Processor metrics (class ProcessorMetric, XML element processor)
Each metric defines an XML element by using the annotation @XmlRootElement. Those metrics would usually be stored in the container MetricsList, which represents a list of metrics. This container also defines an XML element, in this case metrics, the XML tag for that container.
Below is an XML file from that example:
<?xml version="1.0" ?>
<metrics>
    <disk>
        <disk>/</disk>
        <freePartitionSpace>688865050624</freePartitionSpace>
        <usablePartitionSpace>544384016384</usablePartitionSpace>
        <totalCapacity>700001001472</totalCapacity>
    </disk>
    <memory>
        <freeMemory>521889952</freeMemory>
        <maxMemory>8589934592</maxMemory>
        <totalMemory>536870912</totalMemory>
    </memory>
    <processor>
        <systemLoad>0.25</systemLoad>
        <processLoad>0.18</processLoad>
        <availableProcessors>16</availableProcessors>
    </processor>
    ...
</metrics>
For example, to write two metrics (memory and processor metrics), the following code can be used:
try (StreamingMarshaller marshaller = new StreamingMarshaller(MetricsList.class)) {
    marshaller.open(new FileOutputStream(fileName));
    marshaller.write(MemoryMetric.class, new MemoryMetric());
    marshaller.write(ProcessorMetric.class, new ProcessorMetric());
    ...
}
Note that you can also give the root element tag name instead of giving MetricsList.class.
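To make the idea of item-by-item writing concrete, here is a minimal sketch that uses only the JDK's StAX API (javax.xml.stream) rather than this library. It is not the library's implementation, and the class name StreamingWriteSketch, the method writeMetrics, and the generated values are assumptions made for illustration. It shows a container tag being opened once and each item being written as soon as it is produced, so no list of items is ever built in memory.

```java
import java.io.StringWriter;
import javax.xml.stream.XMLOutputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamWriter;

// Hypothetical illustration class, not part of jaxb-stream.
class StreamingWriteSketch {

    // Writes a <metrics> container, then emits `count` <memory> items one at a time.
    static String writeMetrics(int count) throws XMLStreamException {
        StringWriter out = new StringWriter();
        XMLStreamWriter writer = XMLOutputFactory.newInstance().createXMLStreamWriter(out);
        writer.writeStartDocument();
        writer.writeStartElement("metrics");
        for (int i = 0; i < count; i++) {
            // Each item is written immediately; nothing is accumulated in a list.
            writer.writeStartElement("memory");
            writer.writeStartElement("freeMemory");
            writer.writeCharacters(Long.toString(1000L + i)); // made-up value
            writer.writeEndElement(); // </freeMemory>
            writer.writeEndElement(); // </memory>
        }
        // writeEndDocument() closes any element that is still open, here <metrics>.
        writer.writeEndDocument();
        writer.close();
        return out.toString();
    }

    public static void main(String[] args) throws XMLStreamException {
        System.out.println(writeMetrics(2));
    }
}
```

In a real program the writer would wrap a file output stream instead of a StringWriter, which is what keeps memory usage flat regardless of the number of items.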
For example, to read the written metrics (memory and processor metrics), the following code can be used:
try (StreamingUnmarshaller unmarshaller = new StreamingUnmarshaller(MemoryMetric.class, ProcessorMetric.class)) {
    unmarshaller.open(new FileInputStream(fileName));
    unmarshaller.iterate((type, element) -> doWhatYouWant(element));
}
Alternatively, you can iterate over the elements yourself:
try (StreamingUnmarshaller unmarshaller = new StreamingUnmarshaller(MemoryMetric.class, ProcessorMetric.class)) {
    unmarshaller.open(new FileInputStream(fileName));
    while (unmarshaller.hasNext()) {
        doWhatYouWant(unmarshaller.next(YourObject.class));
    }
}
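For readers curious about what item-by-item reading means at the XML level, the following is a rough sketch built only on the JDK's StAX API (javax.xml.stream). It does not use this library and does not claim to mirror its internals. The class StreamingReadSketch and the method forEachItem are hypothetical names. The sketch visits each direct child of the root element in turn and passes its tag name to a callback, without ever holding the whole document in memory.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Hypothetical illustration class, not part of jaxb-stream.
class StreamingReadSketch {

    // Calls `handler` with the tag name of every direct child of the root element.
    static void forEachItem(String xml, Consumer<String> handler) throws XMLStreamException {
        XMLStreamReader reader =
                XMLInputFactory.newInstance().createXMLStreamReader(new StringReader(xml));
        int depth = 0; // 1 = root element, 2 = items inside the root
        try {
            while (reader.hasNext()) {
                int event = reader.next();
                if (event == XMLStreamConstants.START_ELEMENT) {
                    depth++;
                    if (depth == 2) {
                        handler.accept(reader.getLocalName());
                    }
                } else if (event == XMLStreamConstants.END_ELEMENT) {
                    depth--;
                }
            }
        } finally {
            reader.close();
        }
    }

    public static void main(String[] args) throws XMLStreamException {
        String xml = "<metrics><disk><disk>/</disk></disk>"
                + "<memory><freeMemory>1</freeMemory></memory><processor/></metrics>";
        List<String> seen = new ArrayList<>();
        forEachItem(xml, seen::add);
        System.out.println(seen);
    }
}
```

Tracking the nesting depth is what keeps the inner disk element of the disk metric from being mistaken for a separate item.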
Note that if the classes given to the StreamingUnmarshaller do not have the XmlRootElement annotation (for example if they are generated by XJC from an XSD), you can give the tag names with the classes using a Map.
If the XML file you would like to create or read has a complex structure (meaning the stream of elements does not start right after the root tag), you can extend both the marshaller and the unmarshaller and override the following methods:
- createDocumentStart in StreamingMarshaller to write the start of the XML file before the stream of elements
- close in StreamingMarshaller to write the end of the XML file (note that tags are closed automatically)
- skipDocumentStart in StreamingUnmarshaller to reach the stream of elements in the document
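As an illustration of what reaching the stream of elements can involve, here is a small sketch using only the JDK's StAX API (javax.xml.stream). It is not an override of skipDocumentStart, whose exact signature is not shown in this document. The class SkipStartSketch, its methods, and the report/header document layout are all assumptions chosen for the example. It advances a reader past a leading header until it reaches a nested container tag, then reports the first item found inside it.

```java
import java.io.StringReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Hypothetical illustration class, not part of jaxb-stream.
class SkipStartSketch {

    // Advances until the reader sits on the start tag named `containerTag`.
    // Returns false if the document ends before that tag is found.
    static boolean skipTo(XMLStreamReader reader, String containerTag) throws XMLStreamException {
        while (reader.hasNext()) {
            if (reader.next() == XMLStreamConstants.START_ELEMENT
                    && containerTag.equals(reader.getLocalName())) {
                return true;
            }
        }
        return false;
    }

    // Returns the tag name of the first item inside the container,
    // or null if the container is missing or empty.
    static String firstItemAfter(String xml, String containerTag) throws XMLStreamException {
        XMLStreamReader reader =
                XMLInputFactory.newInstance().createXMLStreamReader(new StringReader(xml));
        try {
            if (!skipTo(reader, containerTag)) {
                return null;
            }
            // nextTag() skips whitespace and stops on the next start or end tag.
            if (reader.nextTag() == XMLStreamConstants.START_ELEMENT) {
                return reader.getLocalName();
            }
            return null;
        } finally {
            reader.close();
        }
    }

    public static void main(String[] args) throws XMLStreamException {
        String xml = "<report><header><title>Daily</title></header>"
                + "<metrics><memory/><processor/></metrics></report>";
        System.out.println(firstItemAfter(xml, "metrics"));
    }
}
```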
If you have a feature request or have found a bug, you can:
- Open an issue
- Create a pull request
If you want to contribute:
- Please write tests covering all your changes
- Ensure you didn't break the build by running mvn test
- Fork the repo and create a pull request
This project is licensed under the Apache License 2.0.