import excel spreadsheets easily +

Our customers tend to send us files that they want us to import automatically or periodically into the database.

We use the agimatec-dbimport framework to process such files. The framework supports easy import of XML, comma-separted text files, fixed-column-length files and MS Excel spreadsheets (.xls) files.

An import is a small java programm or – even simpler – groovy script. The framework handles the file-io, parsing and can optionally store information into the database (number of rows, number of errors, status information).

To import excel spreadsheets is as easy as importing other formats. Some examples (for more refer to the homepage of dbimport)

Groovy example to print the columns in a spreadsheet:

// Import excel file spreadsheet.xls
LineImporterSpecGroovy spec = new LineImporterSpecGroovy({ processor ->
   def row = processor.currentRow;
   processor.log('row: ' + processor.rowCount + '; data: ' + row);
  // access a spefic column by header-name: row['FirstName'].value
});

spec.setHeaderSpec(LineImporterSpec.Header.FIRST); // first line contains column header names
spec.setLineTokenizerFactory(new ExcelRowTokenizerFactory());
Importer importer = new Importer(spec);
importer.importFrom(new FileInputStream("Spreadsheet.xls"));

The framework invokes the closure for each row in the file. To handle different data formats, just choose the adequate LineTokenizerFactory.

XML is not record-based (line-by-line), but the framework offers a similar handling when importing data from a XML file:

def spec = new XmlSlurperSpec(
{ doc -> doc.user },
{ processor ->
  def user = processor.current;
  processor.log('element: ' + processor.rowCount + ' with: ' + user);
  // access a specific element: user.address.country.text()
});

The Groovy XmlSlurper makes xml parsing easy (no schema, no model classes required). The first closure is responsible to return an iterator over the elements. The second closure is the one to process each element.

Leave a Reply

XHTML: You can use these tags: <a href="" title=""> <abbr title=""> <acronym title=""> <b> <blockquote cite=""> <cite> <code> <del datetime=""> <em> <i> <q cite=""> <strike> <strong>