$Header$ -*-text-*-

The netCDF Operators NCO version 4.9.4 have arrived.

http://nco.sf.net (Homepage, Mailing lists, Help)
http://github.com/nco (Source Code, Issues, Releases, Developers)

What's new?
Version 4.9.4 contains new features focused on de-interleaving time
coordinates and per-record weighting for ncra, high-frequency (e.g.,
diurnally-resolved) climatologies and new defaults for ncclimo, a
new distance-weighted extrapolation algorithm for ncremap, and
per-file weights for nces. New NCO-wide features include support for
unbuffered I/O that can speed up I/O for large record variables,
more precise quantization than Bit Grooming, and faster arithmetic
that takes advantage of SIMD directives on OpenMP-enabled builds.
Overall, this release is packed with interesting new features...read on.

Work on NCO 4.9.5 has commenced and will improve high-frequency
splitting and climo capabilities, and add more SIMD acceleration
and more GPU offloading support.

Enjoy,
Charlie

NEW FEATURES (full details always in ChangeLog):

A. NCO may now be configured with --enable-gpu at build-time to
   offload certain arithmetically intensive computations to GPUs
   on select architectures and with select compilers. This feature
   currently has no speed benefits, and needs a volunteer to lead
   development. Please contact me if interested.

B. All operators now support unbuffered I/O with netCDF3 files when 
   invoked with the flag --uio or longer synonyms --unbuffered_io or
   --share_all. This flag invokes the netCDF library NC_SHARE flag
   which enables unbuffered (non-cached) I/O. Unbuffered I/O may
   significantly reduce the time needed to read or write large
   record variables. Performance improvements may depend on netCDF
   version. Thanks to Barron Henderson for this suggestion.
   ncks       -v T in.nc out.nc # Default, buffered I/O
   ncks --uio -v T in.nc out.nc # Unbuffered I/O
   ncra --uio -v T,Q,U,V in*.nc out.nc
   http://nco.sf.net/nco.html#uio

C. The default quantization method has changed from Bit Grooming to
   Bit Rounding, contributed by Rostislav Kouznetsov with suggestions
   from Milan Klower. As implemented, Bit Rounding will substantially
   improve the precision for the specified number of significant
   digits to retain. A future version of NCO will re-tune the number
   of bits per retained digit to turn this precision advantage into
   a compression advantage. An article submitted by R. Kouznetsov to
   GMD describes Bit Rounding more...precisely.
   http://nco.sf.net/nco.html#bg
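
   The precision difference can be illustrated with a minimal Python
   sketch. It is not NCO's implementation: it assumes IEEE 754
   single-precision values, takes the number of retained mantissa bits
   (keep_bits, a hypothetical parameter) directly rather than a number
   of significant digits, and shows only the "shave" half of Bit
   Grooming, which alternately zeroes and sets the trailing bits of
   consecutive values. Rounding to nearest is done by adding half of
   the last retained bit before masking.

```python
import struct

def f32_to_bits(x):
    """Reinterpret a value as the 32-bit pattern of an IEEE 754 float."""
    return struct.unpack("<I", struct.pack("<f", x))[0]

def bits_to_f32(b):
    """Reinterpret a 32-bit pattern as an IEEE 754 float."""
    return struct.unpack("<f", struct.pack("<I", b))[0]

def keep_mask(keep_bits):
    """Mask that preserves sign, exponent, and keep_bits mantissa bits."""
    drop = 23 - keep_bits  # float32 stores 23 explicit mantissa bits
    return (0xFFFFFFFF << drop) & 0xFFFFFFFF

def bit_shave(x, keep_bits):
    """Zero the trailing mantissa bits (truncation toward zero)."""
    return bits_to_f32(f32_to_bits(x) & keep_mask(keep_bits))

def bit_round(x, keep_bits):
    """Round to the nearest value representable with keep_bits bits.

    Valid for finite values and 1 <= keep_bits <= 22.
    """
    half = 1 << (23 - keep_bits - 1)  # half of the last retained bit
    return bits_to_f32((f32_to_bits(x) + half) & keep_mask(keep_bits))

if __name__ == "__main__":
    x = bits_to_f32(f32_to_bits(2.718281828))  # nearest float32 to e
    for keep in (4, 8, 12):
        shaved, rounded = bit_shave(x, keep), bit_round(x, keep)
        print(keep, abs(shaved - x), abs(rounded - x))
```

   For every retained width the rounding error is no larger than the
   shaving error, which is the sense in which rounding is "more
   precise" for the same number of retained bits.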

D. The multi-file, multi-record operators ncra and ncrcat now
   support interleaved time-coordinates in groups of records.
   Interleaving (or de-interleaving, depending on one's perspective)
   means altering the order of records in a group to be processed.
   Specifically, the interleaving feature causes the operator to treat
   as sequential records those that are separated by multiples of the
   specified interleave parameter within a group of records.
   Specify the interleave parameter as the fifth hyperslab argument.
   The interleave feature sequences records with respect to their
   position relative to the beginning of each sub-cycle.
   Records that are a multiple of interleave from the beginning of
   the sub-cycle are first extracted (by ncrcat) or reduced (by ncra),
   then records offset from these by one, two, et cetera up to
   interleave-1.
   Thus interleaving allows deconvolution of periodic phenomena within
   a time-series. Some examples to reify the abstract:

   Let in1.nc = [1..10], in2.nc = [11..20], and in12.nc = [1..20].
   ncra -d time,,,,10,5 in1.nc ~/foo.nc # 3.5, 4.5, 5.5, 6.5, 7.5
   ncrcat -d time,0,4,,6,2 in1.nc ~/foo.nc # 1, 3, 5, 2, 4, 6 (+WARNING)
   ncrcat -d time,2,,10,4,2 in12.nc ~/foo.nc # 3, 5, 4, 6, 13, 15, 14, 16
   ncra   -d time,2,,10,4,2 in12.nc ~/foo.nc # 4, 5, 14, 15
   ncra -d time,,,,10,2 in1.nc in2.nc ~/foo.nc # 5, 6, 15, 16
   ncra -d time,,,,10,2 in12.nc ~/foo.nc # 5, 6, 15, 16
   http://nco.sf.net/nco.html#interleave
   http://nco.sf.net/nco.html#ilv
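
   The record ordering described above can be modeled in a few lines
   of Python. This is only an illustrative sketch, not NCO code: the
   function name and its explicit start, stride, ssc (sub-cycle
   length), and ilv (interleave) arguments are hypothetical, defaults
   for omitted hyperslab arguments are not modeled, and incomplete
   trailing groups (the WARNING case above) are simply skipped.

```python
def interleave_groups(records, start, stride, ssc, ilv):
    """Return the sets of records that are treated as sequential.

    Each group holds ssc consecutive records and begins stride records
    after the previous group. Within a group, records separated by
    multiples of ilv form one set.
    """
    sets = []
    for grp_start in range(start, len(records), stride):
        group = records[grp_start:grp_start + ssc]
        if len(group) < ssc:
            break  # assumption: ignore incomplete trailing groups
        for offset in range(ilv):
            sets.append(group[offset::ilv])
    return sets

def ncrcat_like(records, start, stride, ssc, ilv):
    """Concatenate the interleaved sets (extraction)."""
    return [r for s in interleave_groups(records, start, stride, ssc, ilv)
            for r in s]

def ncra_like(records, start, stride, ssc, ilv):
    """Average each interleaved set (reduction)."""
    return [sum(s) / len(s)
            for s in interleave_groups(records, start, stride, ssc, ilv)]

if __name__ == "__main__":
    in1 = list(range(1, 11))    # stands in for in1.nc  = [1..10]
    in12 = list(range(1, 21))   # stands in for in12.nc = [1..20]
    print(ncra_like(in1, 0, 10, 10, 5))
    print(ncrcat_like(in12, 2, 10, 4, 2))
    print(ncra_like(in12, 2, 10, 4, 2))
```

   With these inputs the sketch reproduces the values listed for the
   corresponding ncra and ncrcat examples.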

E. nces now supports the -w (or --wgt) option for per-file weights.
   This option is similar to the ncra --wgt option. The nces version
   also accepts a variable name that contains a scalar per-file-weight. 
   Per-file weights are useful when computing statistics of ensembles
   whose members should be weighted unevenly. For example, these
   three commands produce the same answers (the third assumes that the
   scalar variable var holds the weight 1 in 1.nc and 2 in 2.nc),
   though the second and third are much more flexible and can have
   non-integral weights:
   nces   1.nc 2.nc 2.nc out.nc
   nces -w 1,2 1.nc 2.nc out.nc
   nces -w var 1.nc 2.nc out.nc
   http://nco.sf.net/nco.html#nces
   http://nco.sf.net/nco.html#xmp_nces
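
   Why repeating a file and weighting it give the same answer can be
   checked with a short Python sketch. It assumes a normalized
   weighted mean and uses plain lists in place of file contents; the
   function name is hypothetical and this is not how nces is coded.

```python
def weighted_ensemble_mean(members, weights):
    """Element-wise weighted mean of equally shaped ensemble members."""
    total = sum(weights)
    n_elements = len(members[0])
    return [
        sum(w * m[i] for w, m in zip(weights, members)) / total
        for i in range(n_elements)
    ]

if __name__ == "__main__":
    file1 = [1.0, 2.0, 3.0]   # stands in for a variable in 1.nc
    file2 = [4.0, 8.0, 0.0]   # stands in for the same variable in 2.nc
    # Listing 2.nc twice with unit weights ...
    print(weighted_ensemble_mean([file1, file2, file2], [1, 1, 1]))
    # ... matches weighting 1.nc by 1 and 2.nc by 2
    print(weighted_ensemble_mean([file1, file2], [1, 2]))
    # Non-integral weights cannot be expressed by repeating files
    print(weighted_ensemble_mean([file1, file2], [0.25, 0.75]))
```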

F. ncra now supports the --per_record_weights (or --prw) option to
   utilize command-line weights specified by -w (or --wgt) for
   per-record weights instead of per-file-weights. This is useful when
   computing weighted averages with cyclically varying weights, since
   the weights given on the command line are repeated for the
   length of the timeseries. Consider, for example, a CMIP6 timeseries
   of historical monthly mean emissions that one wishes to convert to
   a timeseries of annual-mean emissions. One can weight each month
   by its number of days via:
   ncra --per_record_weights --mro -d time,,,12,12 --wgt \
        31,28,31,30,31,30,31,31,30,31,30,31 ~/monthly.nc ~/annual.nc
   Thanks to Philip Cameron-Smith of LLNL for this suggestion.
   http://nco.sf.net/nco.html#ncra
   http://nco.sf.net/nco.html#xmp_ncra
   http://nco.sf.net/nco.html#prw
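
   The arithmetic behind cyclically repeated per-record weights can be
   sketched in Python. This is an illustration under stated
   assumptions, not NCO code: the function name is hypothetical, the
   series is a plain list of monthly values that starts in January,
   and leap years are ignored, as in the command above.

```python
DAYS_PER_MONTH = [31, 28, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31]

def annual_means(monthly, weights=DAYS_PER_MONTH):
    """Day-weighted annual means from a monthly timeseries.

    The weights are reused for every complete group of len(weights)
    records, which mimics repeating the command-line weights along
    the timeseries.
    """
    n = len(weights)
    total = sum(weights)
    means = []
    for start in range(0, len(monthly) - n + 1, n):
        year = monthly[start:start + n]
        means.append(sum(w * v for w, v in zip(weights, year)) / total)
    return means

if __name__ == "__main__":
    # Two years of monthly means: constant 2.0, then a January-only pulse
    monthly = [2.0] * 12 + [365.0] + [0.0] * 11
    print(annual_means(monthly))
```

   In the second year only January is non-zero, so its 31 days out of
   365 set the annual mean.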

G. ncra accepts the new flag --promote_ints (or --prm_ints) to output
   statistics of integer-valued input variables in floating-point
   precision in the output file. By default, NCO arithmetic operators
   such as ncra auto-promote integers to double-precision prior to
   arithmetic, then conduct the arithmetic, then demote the values
   back to integers for final output. This default behavior quantizes
   the mantissa of the values and prevents, e.g., turning statistical
   means of Boolean (0 or 1-valued) input data into floating-point
   probabilities. The --promote_ints flag causes the statistical means
   of integer (including NC_BYTE) inputs to be output as
   single-precision floating-point (NC_FLOAT) variables. This allows
   arithmetic to be performed on Boolean values stored in the
   space-conserving NC_BYTE (single byte) format in input files.
   ncra --prm_ints in*.nc out.nc
   Thanks to Paul Ullrich of UC Davis for this suggestion.
   http://nco.sf.net/nco.html#prm_ints
   http://nco.sf.net/nco.html#promote_ints
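
   The effect of demoting an integer mean can be seen in a toy Python
   example. It is a sketch only: the function and its promote_ints
   argument are hypothetical, int() truncation stands in for the
   demotion step (NCO's exact conversion may differ), and Python
   floats are double-precision rather than NC_FLOAT.

```python
def mean_of_ints(values, promote_ints=False):
    """Average integer inputs, optionally keeping the fractional part."""
    mean = sum(values) / len(values)  # arithmetic in floating point
    if promote_ints:
        return mean                   # keep the floating-point result
    return int(mean)                  # demote back to integer

if __name__ == "__main__":
    flags = [0, 1, 1, 0, 1]           # Boolean data stored as integers
    print(mean_of_ints(flags))                      # fraction is lost
    print(mean_of_ints(flags, promote_ints=True))   # usable probability
```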

H. ncremap understands new dimensions used in DOE E3SM MPAS BGC
   simulations. ncremap also supports the new --pdq_opt to
   override internal presets and to future-proof itself against
   unexpected new dimensions from any model input.
   ncremap -P mpasseaice --map=map.nc in.nc out.nc
   ncremap --pdq='-a Time,new_dim,nCells'  --map=map.nc in.nc out.nc
   ncremap --pdq='-a time,new_dim,lat,lon' --map=map.nc in.nc out.nc
   Thanks to Ahmed Elshall for reporting the new dimensions.
   http://nco.sf.net/nco.html#pdq_opt

I. ncremap allows access to a new regridding algorithm based on
   distance-weighted extrapolation (DWE). DWE is similar to the ESMF
   nearestidavg extrapolation algorithm, and accepts the same two
   parameters as input: --xtr_xpn sets the (absolute value of) the
   exponent used in distance weighting (default is 2.0), and --xtr_nsp
   sets the number of source points used in the extrapolation (default
   is 8). ncremap can apply DWE to the entire destination grid, or
   just to points with missing/masked values.

   ncremap --alg_typ=nco_dwe -s src.nc -d dst.nc -m map.nc
   ncremap -a nco_dwe --xtr_xpn=1.0 -s src.nc -d dst.nc -m map.nc
   ncremap -a nco_dwe --xtr_nsp=1   -s src.nc -d dst.nc -m map.nc
   Thanks to Henry Butowsky for implementing the new method.
   http://nco.sf.net/nco.html#dwe
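
   The roles of the two parameters can be seen in a generic
   inverse-distance-weighting sketch in Python. This is not the
   ncremap implementation: the function name is hypothetical, points
   are treated as planar (x, y) pairs with Euclidean distances rather
   than positions on a sphere, and the arguments xpn and nsp merely
   mirror --xtr_xpn and --xtr_nsp.

```python
import math

def dwe_value(dst, src_pts, src_vals, xpn=2.0, nsp=8):
    """Distance-weighted estimate at dst from the nsp nearest sources.

    Each source point receives weight 1/distance**xpn, and the weights
    are normalized to sum to one.
    """
    nearest = sorted(
        (math.dist(dst, pt), val) for pt, val in zip(src_pts, src_vals)
    )[:nsp]
    if nearest[0][0] == 0.0:
        return nearest[0][1]  # dst coincides with a source point
    wgts = [dst_to_src ** -xpn for dst_to_src, _ in nearest]
    return sum(w * v for w, (_, v) in zip(wgts, nearest)) / sum(wgts)

if __name__ == "__main__":
    pts = [(0.0, 0.0), (2.0, 0.0)]
    vals = [1.0, 3.0]
    dst = (0.5, 0.0)
    print(dwe_value(dst, pts, vals))           # default exponent 2.0
    print(dwe_value(dst, pts, vals, xpn=1.0))  # gentler distance decay
    print(dwe_value(dst, pts, vals, nsp=1))    # nearest neighbor only
```

   A larger exponent pulls the estimate toward the closest source
   point, while nsp=1 reduces the scheme to nearest-neighbor.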

J. The ncks --dt_fmt option now applies equally well to JSON and XML
   output as to CDL output:
   % ncks -d time,0 -v time --cdl --dt_fmt=3 ~/nco/data/in.nc
   ...
   time = "1964-03-13T21:09:0.000000" ;
   ...
   % ncks -d time,0 -v time --json --dt_fmt=3 ~/nco/data/in.nc
   ...
   "data": ["1964-03-13T21:09:0.000000"]
   ...
   % ncks -d time,0 -v time --xml --dt_fmt=3 ~/nco/data/in.nc
   ...
   <ncml:values separator="*">1964-03-13T21:09:0.000000</ncml:values>
   ...

   Thanks to Troy Mare for this suggestion.
   http://nco.sf.net/nco.html#dt_fmt
   http://nco.sf.net/nco.html#json
   http://nco.sf.net/nco.html#xml

K. ncra, nces, and ncrcat introduce the --clm_nfo (or --cb) option to
   produce CF-conformant climatological times and bounds.
   This option takes a comma-separated argument list of five relevant 
   input parameters: --cb=yr_srt,yr_end,mth_srt,mth_end,tpd, 
   where yr_srt is the climatology start-year, yr_end is the
   climatology end-year, mth_srt is the climatology start-month (in
   [1..12] format), mth_end is the climatology end-month (in [1..12]
   format), and tpd is the number of timesteps per day (with the
   special exception that tpd=0 indicates monthly data, not
   diurnally-resolved data). A seasonal summer climatology created from
   monthly mean input data spanning June, 2000 to August, 2020 should
   call ncra with --clm_bnd=2000,2020,6,8,0, whereas a diurnally
   resolved climatology of the same period with 6-hourly input data
   resolution would use --clm_bnd=2000,2020,6,8,4. 
   ncra --cb=2014,2016,1,1,0 2014_01.nc 2015_01.nc 2016_01.nc clm_JAN.nc
   http://nco.sf.net/nco.html#cb
   http://nco.sf.net/nco.html#ncra

L. ncclimo has changed default settings for two parameters.
   As of this version, ncclimo sets the options "-a sdd
   --no_amwg_links" by default. For seasonally contiguous DJF climos 
   one must now explicitly set "-a scd". To create symbolic links
   to climatology files with AMWG names, one must now explicitly
   request --amwg_links.
   http://nco.sf.net/nco.html#dec_md
   http://nco.sf.net/nco.html#lnk_flg

M. ncclimo now supports high-frequency climos and splitting.
   Access these capabilities by specifying the climatology-mode
   options hfc and hfs, respectively. In both cases (climos and
   splitting) the input filenames will not be constructed
   automatically and must be provided via stdin, positional
   command-line arguments, or a directory to glob. For climos,
   ncclimo will detect the number of timesteps per day (tpd) in the 
   input data, and compute the climatological mean diurnal cycle
   from the input data. The output is similar to monthly climos,
   except each climatological monthly, seasonal, or annual output
   file will contain tpd timesteps to represent the diurnal cycle.

   # Split high-frequency timeseries into CMIP-like timeseries
   cd ${drc_in};ls ${caseid}*.h4.nc | ncclimo --clm_md=hfs -v=T \
     --ypf=1 --yr_srt=56 --yr_end=76 --drc_out=${HOME}
   # Generate diurnal climos from high-frequency CMIP6 timeseries
   cd ${drc_in};ls ${caseid}*.h4.nc | ncclimo --clm_md=hfc \
     -c ${caseid} --yr_srt=2001 --yr_end=2002 --drc_out=${HOME}
   http://nco.sf.net/nco.html#ncclimo
   http://nco.sf.net/nco.html#clm_md   
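
   The mean diurnal cycle described above amounts to averaging, over
   all days, the records that share the same time-of-day. A minimal
   Python sketch follows. It is not ncclimo's implementation: the
   function name is hypothetical, tpd is passed explicitly rather
   than detected, and the input is a plain list that starts at the
   first timestep of a day.

```python
def mean_diurnal_cycle(series, tpd):
    """Average a high-frequency series into tpd time-of-day means.

    Records tpd apart fall at the same time of day, so each output
    element is the mean over all complete days of one timestep slot.
    """
    n_days = len(series) // tpd
    return [
        sum(series[day * tpd + slot] for day in range(n_days)) / n_days
        for slot in range(tpd)
    ]

if __name__ == "__main__":
    # Two days of 6-hourly data (tpd=4)
    six_hourly = [0.0, 6.0, 12.0, 18.0, 2.0, 8.0, 14.0, 20.0]
    print(mean_diurnal_cycle(six_hourly, tpd=4))
```

   The output has tpd elements, which corresponds to the tpd timesteps
   stored in each climatological output file.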

N. ncclimo now outputs more CF-conformant climatological times and
   bounds for all climatologies. Previously, ncclimo output a
   time-centered value for climatological bounds; now it outputs
   an initial YYYYMMDD format, as recommended by CF examples such as
   http://cfconventions.org/Data/cf-conventions/cf-conventions-1.8/cf-conventions.html#climatological-statistics
   Example 7.13

BUG FIXES:

A. ncremap/ncks: Fix vertical interpolation from hybrid-to-hybrid
   files when surface pressure is in vertical grid file.
   This capability worked up to 4.9.1, and was inadvertently broken in
   4.9.2 and 4.9.3. The workaround is to use 4.9.1, or move
   the desired PS field from the gridfile to the input file.
   The solution is to upgrade. Thanks to Wuyin Lin for reporting.

Full release statement at http://nco.sf.net/ANNOUNCE

KNOWN PROBLEMS DUE TO NCO:

   This section of ANNOUNCE reports and reminds users of the
   existence and severity of known, not yet fixed, problems. 
   These problems occur with NCO 4.9.4 built/tested under
   MacOS 10.15.6 with netCDF 4.7.4 on HDF5 1.10.2 and with
   Linux with netCDF 4.8.0-development (2020715) on HDF5 1.8.19.

A. NOT YET FIXED (NCO problem)
   Correctly read arrays of NC_STRING with embedded delimiters in ncatted arguments

   Demonstration:
   ncatted -D 5 -O -a new_string_att,att_var,c,sng,"list","of","str,ings" ~/nco/data/in_4.nc ~/foo.nc
   ncks -m -C -v att_var ~/foo.nc

   20130724: Verified problem still exists
   TODO nco1102
   Cause: NCO parsing of ncatted arguments is not sophisticated
   enough to handle arrays of NC_STRINGS with embedded delimiters.

B. NOT YET FIXED (NCO problem?)
   ncra/ncrcat (not ncks) hyperslabbing can fail on variables with multiple record dimensions

   Demonstration:
   ncrcat -O -d time,0 ~/nco/data/mrd.nc ~/foo.nc

   20140826: Verified problem still exists
   20140619: Problem reported by rmla
   Cause: Unsure. Maybe ncra.c loop structure not amenable to MRD?
   Workaround: Convert to fixed dimensions then hyperslab

KNOWN PROBLEMS DUE TO BASE LIBRARIES/PROTOCOLS:

A. NOT YET FIXED (netCDF4 or HDF5 problem?)
   Specifying strided hyperslab on large netCDF4 datasets leads
   to slowdown or failure with recent netCDF versions.

   Demonstration with NCO <= 4.4.5:
   time ncks -O -d time,0,,12 ~/ET_2000-01_2001-12.nc ~/foo.nc
   Demonstration with NCL:
   time ncl < ~/nco/data/ncl.ncl   
   20140718: Problem reported by Parker Norton
   20140826: Verified problem still exists
   20140930: Finish NCO workaround for problem
   20190201: Possibly this problem was fixed in netCDF 4.6.2 by https://github.com/Unidata/netcdf-c/pull/1001
   Cause: Slow algorithm in nc_var_gets()?
   Workaround #1: Use NCO 4.4.6 or later (avoids nc_var_gets())
   Workaround #2: Convert file to netCDF3 first, then use stride
   Workaround #3: Compile NCO with netCDF >= 4.6.2

B. NOT YET FIXED (netCDF4 library bug)
   Simultaneously renaming multiple dimensions in netCDF4 file can corrupt output

   Demonstration:
   ncrename -O -d lev,z -d lat,y -d lon,x ~/nco/data/in_grp.nc ~/foo.nc # Completes but produces unreadable file foo.nc
   ncks -v one ~/foo.nc

   20150922: Confirmed problem reported by Isabelle Dast, reported to Unidata
   20150924: Unidata confirmed problem
   20160212: Verified problem still exists in netCDF library
   20160512: Ditto
   20161028: Verified problem still exists with netCDF 4.4.1
   20170323: Verified problem still exists with netCDF 4.4.2-development
   20170323: https://github.com/Unidata/netcdf-c/issues/381
   20171102: Verified problem still exists with netCDF 4.5.1-development
   20171107: https://github.com/Unidata/netcdf-c/issues/597
   20190202: Progress has recently been made in netCDF 4.6.3-development
   More details: http://nco.sf.net/nco.html#ncrename_crd

C. NOT YET FIXED (would require DAP protocol change?)
   Unable to retrieve contents of variables including period '.' in name
   Periods are legal characters in netCDF variable names.
   Metadata are returned successfully, data are not.
   DAP non-transparency: Works locally, fails through DAP server.

   Demonstration:
   ncks -O -C -D 3 -v var_nm.dot -p http://thredds-test.ucar.edu/thredds/dodsC/testdods in.nc # Fails to find variable

   20130724: Verified problem still exists. 
   Stopped testing because inclusion of var_nm.dot broke all test scripts.
   NB: Hard to fix since DAP interprets '.' as structure delimiter in HTTP query string.

   Bug tracking: https://www.unidata.ucar.edu/jira/browse/NCF-47

D. NOT YET FIXED (would require DAP protocol change)
   Correctly read scalar characters over DAP.
   DAP non-transparency: Works locally, fails through DAP server.
   Problem, IMHO, is with DAP definition/protocol

   Demonstration:
   ncks -O -D 1 -H -C -m --md5_dgs -v md5_a -p http://thredds-test.ucar.edu/thredds/dodsC/testdods in.nc

   20120801: Verified problem still exists
   Bug report not filed
   Cause: DAP translates scalar characters into 64-element (this
   dimension is user-configurable, but still...), NUL-terminated
   strings so MD5 agreement fails 

"Sticky" reminders:

A. Reminder that NCO works on most HDF4 and HDF5 datasets, e.g., 
   HDF4: AMSR MERRA MODIS ...
   HDF5: GLAS ICESat Mabel SBUV ...
   HDF-EOS5: AURA HIRDLS OMI ...

B. Pre-built executables for many OS's at:
   http://nco.sf.net#bnr

