CS 677 Big Data

NCDC Data Dictionary

Straight from the documentation:

DATA FIELDS / FORMATS

Each station file contains fixed-width formatted fields with a single set of 
subhourly (5-minute) data per line. A summary table of the fields and a 
detailed listing of field definitions/column formats are shown below. 

Please be sure to refer to the "Important Notes" section below for essential 
information.

All subhourly data are calculated over the 5-minute period *ending* at the 
UTC/LST times shown. Please note that the station's Local Standard Time is 
always used, regardless of its Daylight Saving Time status.
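
As an illustration of the "period ending" convention, the following minimal
Python sketch recovers the start and end of the 5-minute window from a date
and time pair. The function name observation_window is hypothetical, and
padding the time to four digits is a defensive assumption in case leading
zeros are dropped.

```python
from datetime import datetime, timedelta


def observation_window(date_str: str, time_str: str):
    """Return (start, end) of the 5-minute period that ends at the given time.

    date_str is YYYYMMDD and time_str is HHmm, as in the UTC_*/LST_* fields.
    """
    # Assumption: pad to four digits in case leading zeros were omitted.
    stamp = date_str.strip() + time_str.strip().zfill(4)
    end = datetime.strptime(stamp, "%Y%m%d%H%M")
    return end - timedelta(minutes=5), end


# 0000 closes the last 5-minute period of the previous day.
start, end = observation_window("20140101", "0000")
print(start, "->", end)
```

Because the timestamp marks the end of the window, a record stamped 0000
covers time that belongs to the previous calendar day; grouping by date on
the raw field would assign it to the following day instead.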

Field#  Name                           Units
---------------------------------------------
   1    WBANNO                         XXXXX
   2    UTC_DATE                       YYYYMMDD
   3    UTC_TIME                       HHmm
   4    LST_DATE                       YYYYMMDD
   5    LST_TIME                       HHmm
   6    CRX_VN                         XXXXXX
   7    LONGITUDE                      Decimal_degrees
   8    LATITUDE                       Decimal_degrees
   9    AIR_TEMPERATURE                Celsius
   10   PRECIPITATION                  mm
   11   SOLAR_RADIATION                W/m^2
   12   SR_FLAG                        X
   13   SURFACE_TEMPERATURE            Celsius
   14   ST_TYPE                        X
   15   ST_FLAG                        X
   16   RELATIVE_HUMIDITY              %
   17   RH_FLAG                        X
   18   SOIL_MOISTURE_5                m^3/m^3
   19   SOIL_TEMPERATURE_5             Celsius
   20   WETNESS                        Ohms
   21   WET_FLAG                       X
   22   WIND_1_5                       m/s
   23   WIND_FLAG                      X

   1    WBANNO  [5 chars]  cols 1 -- 5 
          The station WBAN number.

   2    UTC_DATE  [8 chars]  cols 7 -- 14 
          The UTC date of the observation.

   3    UTC_TIME  [4 chars]  cols 16 -- 19 
          The UTC time at the end of the 5-minute observation period. For example, 
          0420 designates the observational period starting just after 0415 
          and ending at 0420; and 0000 designates the last 5-minute period 
          of the previous day.

   4    LST_DATE  [8 chars]  cols 21 -- 28 
          The Local Standard Time (LST) date of the observation.

   5    LST_TIME  [4 chars]  cols 30 -- 33 
          The Local Standard Time (LST) time at the end of the 5-minute period 
          (see UTC_TIME description).

   6    CRX_VN  [6 chars]  cols 35 -- 40 
          The version number of the station datalogger program that was in 
          effect at the time of the observation. Note: This field should be 
          treated as text (i.e. string).

   7    LONGITUDE  [7 chars]  cols 42 -- 48 
          Station longitude, using WGS-84.

   8    LATITUDE  [7 chars]  cols 50 -- 56 
          Station latitude, using WGS-84.

   9    AIR_TEMPERATURE  [7 chars]  cols 58 -- 64 
          Average temperature, in degrees C. See Notes F and G.

   10   PRECIPITATION  [7 chars]  cols 66 -- 72 
          Total amount of precipitation, in mm. See Notes F and H.

   11   SOLAR_RADIATION  [6 chars]  cols 74 -- 79 
          Average global solar radiation received, in watts/meter^2.

   12   SR_FLAG  [1 chars]  cols 81 -- 81 
          QC flag for the average global solar radiation measurement. See Note 
          I.

   13   SURFACE_TEMPERATURE  [7 chars]  cols 83 -- 89 
          Average infrared surface temperature, in degrees C. See Note J.

   14   ST_TYPE  [1 chars]  cols 91 -- 91 
          The type of infrared surface temperature measurement: 'R' denotes 
          raw (uncorrected); 'C' denotes corrected; and 'U' is shown if the 
          type is unknown/missing. See Note J.

   15   ST_FLAG  [1 chars]  cols 93 -- 93 
          QC flag for the surface temperature measurement. See Note I.

   16   RELATIVE_HUMIDITY  [5 chars]  cols 95 -- 99 
          Relative humidity average, as a percentage. See Note K.

   17   RH_FLAG  [1 chars]  cols 101 -- 101 
          QC flag for the relative humidity measurement. See Note I.

   18   SOIL_MOISTURE_5  [7 chars]  cols 103 -- 109 
          Average soil moisture (volumetric water content in m^3/m^3) at 5 
          cm below the surface. See Note M.

   19   SOIL_TEMPERATURE_5  [7 chars]  cols 111 -- 117 
          Average soil temperature at 5 cm below the surface, in degrees C. 
          See Note M.

   20   WETNESS  [5 chars]  cols 119 -- 123 
          The presence or absence of moisture due to precipitation, in Ohms. 
          High values (>= 1000) indicate an absence of moisture.  Low values 
          (< 1000) indicate the presence of moisture.

   21   WET_FLAG  [1 chars]  cols 125 -- 125 
          QC flag for the wetness measurement. See Note I.

   22   WIND_1_5  [6 chars]  cols 127 -- 132 
          Average wind speed, in meters per second, at a height of 1.5 meters.

   23   WIND_FLAG  [1 chars]  cols 134 -- 134 
          QC flag for the wind speed measurement. See Note I.
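
The column positions listed above can be turned into a small fixed-width
parser. The sketch below is one possible approach, not an official reader:
the names FIELDS, parse_line, and build_line are hypothetical, and the sample
record is made up purely to exercise the column layout. All values are kept
as strings, which also satisfies the requirement that CRX_VN be treated as
text.

```python
# (name, first column, last column); columns are 1-based and inclusive,
# copied from the field definitions in this data dictionary.
FIELDS = [
    ("WBANNO", 1, 5), ("UTC_DATE", 7, 14), ("UTC_TIME", 16, 19),
    ("LST_DATE", 21, 28), ("LST_TIME", 30, 33), ("CRX_VN", 35, 40),
    ("LONGITUDE", 42, 48), ("LATITUDE", 50, 56),
    ("AIR_TEMPERATURE", 58, 64), ("PRECIPITATION", 66, 72),
    ("SOLAR_RADIATION", 74, 79), ("SR_FLAG", 81, 81),
    ("SURFACE_TEMPERATURE", 83, 89), ("ST_TYPE", 91, 91),
    ("ST_FLAG", 93, 93), ("RELATIVE_HUMIDITY", 95, 99),
    ("RH_FLAG", 101, 101), ("SOIL_MOISTURE_5", 103, 109),
    ("SOIL_TEMPERATURE_5", 111, 117), ("WETNESS", 119, 123),
    ("WET_FLAG", 125, 125), ("WIND_1_5", 127, 132),
    ("WIND_FLAG", 134, 134),
]


def parse_line(line: str) -> dict:
    """Slice one fixed-width record into a dict of stripped strings."""
    return {name: line[first - 1:last].strip() for name, first, last in FIELDS}


def build_line(values: dict) -> str:
    """Lay out values at the documented columns (demonstration helper only)."""
    chars = [" "] * FIELDS[-1][2]
    for name, first, last in FIELDS:
        width = last - first + 1
        chars[first - 1:last] = values[name].rjust(width)
    return "".join(chars)


# Made-up sample values, not taken from any real station file.
sample = {
    "WBANNO": "12345", "UTC_DATE": "20140101", "UTC_TIME": "0005",
    "LST_DATE": "20131231", "LST_TIME": "1905", "CRX_VN": "2.422",
    "LONGITUDE": "-82.61", "LATITUDE": "35.49", "AIR_TEMPERATURE": "1.5",
    "PRECIPITATION": "0.0", "SOLAR_RADIATION": "0", "SR_FLAG": "0",
    "SURFACE_TEMPERATURE": "-9999.0", "ST_TYPE": "U", "ST_FLAG": "3",
    "RELATIVE_HUMIDITY": "65", "RH_FLAG": "0",
    "SOIL_MOISTURE_5": "-99.000", "SOIL_TEMPERATURE_5": "-9999.0",
    "WETNESS": "1200", "WET_FLAG": "0", "WIND_1_5": "1.25",
    "WIND_FLAG": "0",
}

record = parse_line(build_line(sample))
print(record["WBANNO"], record["UTC_DATE"], record["AIR_TEMPERATURE"])
```

Slicing by column keeps every field addressable even if a value were blank.
Splitting each line on whitespace is a simpler alternative, but it relies on
every field being non-empty.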

    IMPORTANT NOTES:
        A.  All fields are separated from adjacent fields by at least one space.
        B.  Leading zeros are omitted.
        C.  Missing data are indicated by the lowest possible integer for a 
            given column format, such as -9999.0 for 7-character fields with 
            one decimal place or -99.000 for 7-character fields with three
            decimal places.
        D.  Subhourly data are calculated over the 5-minute period which *ends*
            at the time shown.
        E.  There are no quality flags for these derived quantities. When the 
            raw data are flagged as erroneous, these derived values are not 
            calculated, and are instead reported as missing. Therefore, these 
            fields may be assumed to always be good (unflagged) data, except 
            when they are reported as missing.
        F.  The 5-minute values reported in this dataset are calculated using 
            multiple independent measurements for temperature and precipitation. 
        G.  USCRN/USRCRN stations have multiple co-located temperature sensors 
            that make 10-second independent measurements used for the average. 
        H.  USCRN/USRCRN stations use a weighing bucket gauge outfitted with 
            three redundant, but independent, load cell sensors to monitor gauge
            depth. As a supplement, a disdrometer (wetness sensor) is used to 
            detect wetness. 
        I.  Quality control flags indicate the following: 0 denotes good data, 
            1 denotes field-length overflow, and 3 denotes erroneous data.
        J.  On 2013-01-07 at 1500 UTC, USCRN began reporting corrected surface 
            temperature measurements for some stations. These changes 
            impact previous users of the data because the corrected values 
            differ from uncorrected values. To distinguish between uncorrected 
            (raw) and corrected surface temperature measurements, a surface 
            temperature type field was added to the data product. The 
            possible values of this field are "R" to denote raw surface 
            temperature measurements, "C" to denote corrected surface 
            temperature measurements, and "U" for unknown/missing.
        K.  All USCRN stations now report 5-minute relative humidity averages; 
            however, the two Asheville, NC stations reported only hourly RH 
            values until 2007-02-22.
        L.  USRCRN stations do not measure solar radiation, surface temperature,
            relative humidity, wind speed or soil variables, so those fields 
            are shown as missing data.
        M.  USCRN stations have multiple co-located soil sensors that record 
            independent measurements. The soil values reported in this dataset 
            are calculated from these multiple independent measurements. Soil 
            moisture is the ratio of water volume over sample volume 
            (m^3 water/m^3 soil).
        N.  In accordance with Service Change Notice 14-25 from the National 
            Weather Service, NCDC stopped providing data from the 72 
            Southwest Regional Climate Reference Network (USRCRN) stations on 
            June 1, 2014. The historical data for these stations remain 
            available.
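
Notes C and I together suggest a simple cleaning step: treat the missing-data
sentinels as absent values and keep a measurement only when its QC flag is 0.
The Python sketch below illustrates that idea under stated assumptions. The
names MISSING_7CHAR, to_float, and usable are hypothetical, and only the two
sentinels that Note C names explicitly for 7-character fields are handled;
sentinels for other field widths would need to be added separately.

```python
# The two missing-data sentinels named in Note C for 7-character fields.
MISSING_7CHAR = {"-9999.0", "-99.000"}

# QC flag values from Note I.
GOOD, OVERFLOW, ERRONEOUS = "0", "1", "3"


def to_float(text: str):
    """Convert a 7-character numeric field to float, or None if missing."""
    text = text.strip()
    # Compare as text so a genuine reading is never mistaken for a sentinel.
    if text in MISSING_7CHAR:
        return None
    return float(text)


def usable(value_text: str, flag_text: str):
    """Return the value only if it is present and its QC flag is good."""
    value = to_float(value_text)
    if value is None or flag_text.strip() != GOOD:
        return None
    return value


# Made-up SURFACE_TEMPERATURE / ST_FLAG pairs for illustration.
for value_text, flag_text in [("   12.3", "0"), ("-9999.0", "0"), ("   12.3", "3")]:
    print(repr(value_text), flag_text, "->", usable(value_text, flag_text))
```

Fields without a QC flag, such as AIR_TEMPERATURE and PRECIPITATION, can go
through to_float alone.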