PSI DataWorkshop 2006
Minutes and Presentations
Mon., Feb. 13th: Data management
Talk (Joerg)
File Archives & NetCDF
A general overview as introduction to the theme:
- Comparison of data storage in file archives and databases
- Metadata storage
- File size comparison: NetCDF vs. GRIB
Talk (Hannes)
CERA Data Base and NetCDF
How the CERA database stores data and how NetCDF will fit in:
- Overview CERA data base
- CERA interface
- NetCDF storage vs. GRIB storage
Talk (Veronika)
NetCDF and CLM
Adaptation of CLM model output to the NetCDF Climate and Forecast (CF) Metadata Conventions:
- Additional variables for regional models
- Variable standardisation for CF compliance
Talk (Ag)
NetCDF-enabling an Oracle database using Java
Work in progress by Lost Wax: NetCDF inside Oracle for the DEWS project:
- storing NetCDF data using the Java NC Libraries
- NetLobs – NetCDF ‘SmartLobs’
- Structure of ‘NetLobs’
Discussion:
A catalogue of model experiment descriptions is desired, searchable by model configuration, experimental design and achievements. It should serve local use at modelling sites as well as federated experiment inventories between sites. Metadata of different types and roles can be distinguished:
- Discovery metadata for experiments and results
- Usage metadata including grid descriptions
- Access metadata
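The separation of metadata roles listed above can be illustrated with a minimal sketch of a searchable experiment catalogue. This is not the design discussed at the workshop: the class `ExperimentRecord`, the function `search`, all field names and the two sample entries are hypothetical and serve only to show how discovery, usage and access metadata might be kept apart.

```python
from dataclasses import dataclass, field


@dataclass
class ExperimentRecord:
    # Discovery metadata: the fields a catalogue search runs against.
    experiment_id: str
    model_configuration: str
    experimental_design: str
    achievements: str
    # Usage metadata: what is needed to interpret the data, e.g. grid descriptions.
    usage: dict = field(default_factory=dict)
    # Access metadata: where and how the data can be obtained.
    access: dict = field(default_factory=dict)


def search(catalogue, **criteria):
    """Return records whose discovery fields contain every given substring."""
    hits = []
    for record in catalogue:
        if all(text.lower() in getattr(record, name).lower()
               for name, text in criteria.items()):
            hits.append(record)
    return hits


# Invented sample entries, for illustration only.
catalogue = [
    ExperimentRecord("exp-001", "coupled atmosphere-ocean", "control run",
                     "baseline climatology",
                     usage={"grid": "regular lat-lon"},
                     access={"location": "tape archive"}),
    ExperimentRecord("exp-002", "regional atmosphere", "scenario run",
                     "downscaled precipitation",
                     usage={"grid": "rotated lat-lon"},
                     access={"location": "disk archive"}),
]

for hit in search(catalogue, model_configuration="regional"):
    print(hit.experiment_id, hit.usage["grid"], hit.access["location"])
```

Only the discovery fields take part in the search; usage and access metadata are carried along and consulted once an experiment has been found.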
Action item: The existing cooperation between BADC, PSI, GO-ESSP, ESMF, NERC Data-Grid and C3Grid should be maintained.
Data storage of ESM results is realised on disks or on tapes and in database systems or in flat files. Different classes of applications can be identified:
- Specification of forcing for subsequent models
- Consideration of volume processes like up-/down-welling, convection, clouds, radiative transfer or chemical reactions
- Changes in near surface climatology
These application classes correspond to reasonable data storage structures of increasingly fine granularity:
- Coarse granularity: 4D time series
- Mid granularity: 3D time series
- Fine granularity: 2D time series.
A single most likely data storage structure could not be defined because PSI applications are too diverse and could not yet be assessed. Currently, PSI data structures include both 2D time series in a web-accessible tape archive (WDC Climate) and 4D time series in a disk-resident flat file archive combined with data extraction processing (BADC). It would be useful to store checksums (md5sum) with the data.
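The checksum suggestion can be sketched as follows, assuming one sidecar file per data file written in the `md5sum` text layout (hash, two spaces, file name). The function names and the `.md5` suffix are assumptions made for this illustration, not an agreed PSI convention.

```python
import hashlib
from pathlib import Path


def md5_of_file(path, chunk_size=1024 * 1024):
    """Compute the MD5 hex digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def write_checksum_file(path):
    """Store the checksum next to the data file (hypothetical '.md5' sidecar)."""
    path = Path(path)
    sidecar = path.with_name(path.name + ".md5")
    sidecar.write_text(f"{md5_of_file(path)}  {path.name}\n")
    return sidecar


def verify_checksum(path):
    """Return True if the file still matches the checksum stored in its sidecar."""
    path = Path(path)
    sidecar = path.with_name(path.name + ".md5")
    recorded = sidecar.read_text().split()[0]
    return recorded == md5_of_file(path)
```

Reading in chunks keeps memory use constant, which matters for large model output files.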
Action item: Application performance and usability should be investigated further, and the SRE (Standard Runtime Environment) should be extended by a data archive interface which supports all three levels of data granularity.
The standard data format, at least for data exchange, is NetCDF/CF. The corresponding list of standard variable names is widely accepted. This list has been extended with new variables as new model types were integrated into the PRISM SCE (Standard Compile Environment) and SRE.
Action item: The maintenance of the NetCDF/CF variable list should be supported by PSI in cooperation with the development of the data storage standard NetCDF/CF itself.
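How an agreed variable list can be used in practice is sketched below: each model output variable carries a `standard_name` attribute, which is checked against the accepted names. The three names shown exist in the CF standard name table, but the subset, the function `check_variables` and the sample variables are assumptions for illustration. The authoritative table is maintained by the CF community and a full compliance check covers far more than names.

```python
# Tiny illustrative subset of CF standard names (an assumption for this sketch;
# the authoritative table is maintained by the CF community).
ACCEPTED_STANDARD_NAMES = {
    "air_temperature",
    "precipitation_flux",
    "sea_surface_temperature",
}


def check_variables(variables):
    """Return a list of problems; an empty list means none were found."""
    problems = []
    for name, attributes in variables.items():
        standard_name = attributes.get("standard_name")
        if standard_name is None:
            problems.append(f"{name}: no standard_name attribute")
        elif standard_name not in ACCEPTED_STANDARD_NAMES:
            problems.append(f"{name}: unknown standard_name '{standard_name}'")
    return problems


# Hypothetical variable attributes as they might be read from a model output file.
model_output = {
    "tas": {"standard_name": "air_temperature", "units": "K"},
    "pr": {"standard_name": "precip", "units": "kg m-2 s-1"},
    "orog": {"units": "m"},
}

for problem in check_variables(model_output):
    print(problem)
```

A new model type that introduces variables outside the accepted set would show up here as unknown names, which is the point at which the list needs to be extended.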
Tue., Feb. 14th: Postprocessing
Talk (Jamie)
Overview coco
Where coco (CDMS overloaded for CF objects) fits in and what it has been used for:
- coco and cdat
- coco use at MetOffice
- IPCC AR4 data conversion via coco with Python as glue
Coco development has been frozen.
Talk (Ag)
cdat at BADC
Building tools on cdat, and its pros and cons:
- about the BADC
- cdat components
- BADC builds on cdat
- cf checker
- cases for cdat
- cases against cdat
Discussion:
CDAT has been recommended in the FP5 project PRISM. CDAT (Climate Data Analysis Tools) is a set of tools for most climate data processing, including graphics (for example via the VCS (Visualisation and Control System) package), but some difficulties have been pointed out:
- difficult to install,
- Python language may be too powerful,
- unstable GUI,
- how to influence the development?,
- first experience often negative,
- list of operating systems on which CDAT works is hard to find.
CDAT is developed and maintained by PCMDI. The package has been successfully installed at BADC, MetOffice and IPSL. The installation failed on DKRZ's data servers. Even PCMDI was not able to install it there in 2005. The problem has been identified as connected to the 64-bit Unix architecture of the server. PCMDI is investigating the installation problem further.
Action item (added by Ag):
- Ag: MPI is interested in putting its metadata into a browse catalogue and interface. Ag suggested MOLES (Metadata Objects and Linkages in the Environmental Sciences). Ask Bryan about this and how we can help MPI use it, then feed back to Michael.
- Charles: Get 10 common Ferret and GrADS scripts from web and write CDAT versions.
- Charles: Create a web page listing all systems/versions on which CDAT has installed properly, any information on systems where people had problems, and the system requirements.
- Charles: Need a test script for checking everything has installed properly.
Talk (Joerg)
cdo - Climate Data Operators
An easy-to-install alternative and addition to the existing diagnostic tools:
- introduction
- install and use
- operators
- examples for use
- cdi - Climate Data Interface
Discussion:
The CDO package is easy to install on most Unix platforms and is a flexible data processing package. CDO uses its own C version of SCRIP for regridding. The CDOs provide interfaces to the graphics tools GrADS and GMT but include no graphics themselves. MPI-M supports the development and M&D provides application support.
Connected to the CDOs is the CDI (Climate Data Interface). This interface provides a library for model input/output in different formats and can also be used by other applications or by programs written in C or Fortran. GRIB1, NetCDF/CF and some local formats are supported. The inclusion of GRIB2 and NetCDF-4 is planned.
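The command-line style of the CDOs discussed above can be sketched with a small Python wrapper that only builds the argument lists. The operators `copy`, `timmean` and `remapbil` and the `-f nc` option are taken from current CDO releases; whether they were available in this form in the 2006 version is an assumption. The helper names and file names are hypothetical.

```python
import shutil
import subprocess


def cdo_command(operator, infile, outfile, out_format=None):
    """Build the argument list for one CDO call (nothing is executed here)."""
    cmd = ["cdo"]
    if out_format is not None:
        cmd += ["-f", out_format]  # e.g. "nc" to write NetCDF output
    cmd += [operator, infile, outfile]
    return cmd


def run_if_available(cmd):
    """Run the command only if the binary is on PATH; return True if it ran."""
    if shutil.which(cmd[0]) is None:
        return False
    subprocess.run(cmd, check=True)
    return True


# Hypothetical file names, for illustration only.
examples = [
    # Convert a GRIB file to NetCDF.
    cdo_command("copy", "model_output.grb", "model_output.nc", out_format="nc"),
    # Mean over all time steps.
    cdo_command("timmean", "model_output.nc", "time_mean.nc"),
    # Bilinear regridding to a regular 1 degree global grid.
    cdo_command("remapbil,r360x180", "model_output.nc", "regridded.nc"),
]

for cmd in examples:
    print(" ".join(cmd))
```

Each call follows the pattern operator, input file, output file, which is what makes the tools easy to chain from scripts such as the SRE.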
Action item: At this stage it is hard to decide on one PSI-supported data processing package. An attempt should be made to solve the CDAT installation problem together with PCMDI; in parallel, the CDOs are offered as an alternative and an addition. CDI, cdunif and NetCDF-4 should be investigated as candidates for a common I/O data interface for ESMs. Open problems such as data compression, performance measurements and assurance of data integrity should also be considered in future.
Talk (Joerg)
SRE monitoring
Showing key-variables of a model run during run time on a website:
- part of IMDI
- LE-Graphics
- LE-grads
- tt4www.py
Discussion:
A monitoring tool is required which allows graphical inspection of long-running modelling experiments. The CDAT prototype from the FP5 PRISM project is presently neither portable nor stable. Because of the installation problems of CDAT at DKRZ, an alternative monitoring tool based on the CDOs and GrADS or GMT has been prototyped.
Action item: The SRE monitoring based on CDO and GrADS or GMT should be included in the SRE.
List of participants
| Name | Institute | E-mail |
|---|---|---|
| Rosalyn Hatcher | (CGAM) | r.s.hatcher(at)reading.ac.uk |
| Sophie Valcke | (CERFACS) | Sophie.Valcke(at)cerfacs.fr |
| Veronika Gayler | (M&D) | veronika.gayler(at)zmaw.de |
| Ag Stephens | (BADC) | a.stephens(at)rl.ac.uk |
| Jamie Kettleborough | (MetOffice) | jamie.kettleborough(at)metoffice.gov.uk |
| Jean-Yves Peterschmitt | (LSCE) | Jean-Yves.Peterschmitt(at)cea.fr |
| Uwe Schulzweida | (MPI) | uwe.schulzweida(at)zmaw.de |
| Frank Toussiant | (M&D) | frank.toussiant(at)zmaw.de |
| Joerg Wegner | (M&D) | joerg.wegner(at)zmaw.de |
| Michael Lautenschlager | (M&D) | michael.lautenschlager(at)zmaw.de |
| Charles Doutriaux | (PCMDI) | charles.doutriaux(at)cea.fr |
| Hannes Thiemann | (M&D) | hannes.thiemann(at)zmaw.de |
| Stephanie Legutke | (M&D) | stephanie.legutke(at)zmaw.de |
| Reinhard Budich | (MPI) | reinhard.budich(at)zmaw.de |
| Luis Kornblüh | (MPI) | luis.kornblueh(at)zmaw.de |

