syslog Working Group                                        R. Gerhards
Internet-Draft                                             Adiscon GmbH
Expires: March 25, 2005                              September 24, 2004

The syslog Protocol

draft-ietf-syslog-protocol-06.txt

Status of this Memo

This document is an Internet-Draft and is subject to all provisions of section 3 of RFC 3667. By submitting this Internet-Draft, each author represents that any applicable patent or other IPR claims of which he or she is aware have been or will be disclosed, and any of which he or she become aware will be disclosed, in accordance with RFC 3668.

Internet-Drafts are working documents of the Internet Engineering Task Force (IETF), its areas, and its working groups. Note that other groups may also distribute working documents as Internet-Drafts.

Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress."

The list of current Internet-Drafts can be accessed at http://www.ietf.org/ietf/1id-abstracts.txt.

The list of Internet-Draft Shadow Directories can be accessed at http://www.ietf.org/shadow.html.

This Internet-Draft will expire on March 25, 2005.

Copyright Notice

Copyright (C) The Internet Society (2004).

Abstract

This document describes the syslog protocol which is used to convey event notification messages. It describes a layered architecture for an easily extensible syslog protocol. It also describes the basic message format and structured elements used to provide meta-information about the message.



Table of Contents

1.  Introduction
2.  Conventions Used in This Document
3.  Definitions
    3.1  Example Deployment Scenarios
4.  Transport Layer Protocol
    4.1  Minimum Required Transport Mapping
5.  Required syslog Format
    5.1  Message Length
    5.2  HEADER
        5.2.1  VERSION
        5.2.2  FACILITY
        5.2.3  SEVERITY
        5.2.4  TIMESTAMP
        5.2.5  HOSTNAME
        5.2.6  SENDER-NAME
        5.2.7  SENDER-INST
    5.3  STRUCTURED-DATA
        5.3.1  STR-DATA-ELT
        5.3.2  Examples
    5.4  MSG
    5.5  Examples
6.  Structured Data IDs
    6.1  time
        6.1.1  tzknown
        6.1.2  issynced
        6.1.3  syncaccuracy
        6.1.4  Examples
    6.2  origin
        6.2.1  ip
        6.2.2  enterpriseID
        6.2.3  software
        6.2.4  sw-version
        6.2.5  Example
7.  Security Considerations
    7.1  Diagnostic Logging
    7.2  Control Characters
    7.3  More than Maximum Message Length
    7.4  Message Length
    7.5  Message Truncation
    7.6  Single Source to a Destination
    7.7  Multiple Sources to a Destination
    7.8  Multiple Sources to Multiple Destinations
    7.9  Replaying
    7.10  Reliable Delivery
    7.11  Message Integrity
    7.12  Message Observation
    7.13  Misconfiguration
    7.14  Forwarding Loop
    7.15  Load Considerations
    7.16  Denial of Service
    7.17  Covert Channels
8.  Notice to RFC Editor
9.  IANA Considerations
    9.1  Version
    9.2  SD-IDs
10.  Authors and Working Group Chair
11.  Acknowledgments
12.  References
    12.1  Normative
    12.2  Informative
§  Author's Address
A.  Implementor Guidelines
    A.1  Message Length
    A.2  HEADER Parsing
    A.3  SEVERITY Values
    A.4  time-secfrac Precision
    A.5  Leap Seconds
    A.6  Syslog Senders Without Knowledge of Time
    A.7  Additional Information on SENDER-INST
    A.8  Notes on the time SD-ID
    A.9  Recommendation for Diagnostic Logging
§  Intellectual Property and Copyright Statements





1. Introduction

This document describes a layered architecture for syslog. The goal of this architecture is to separate functionality into different layers and thus provide easy extensibility.

This document describes the semantics of the syslog protocol, outlines the concept of transport mappings, and provides a standard format for all syslog messages. It also describes structured data elements, which can be used to transmit easily parsable, structured information.




2. Conventions Used in This Document

The keywords "MUST", "MUST NOT", "REQUIRED", "SHOULD", "SHOULD NOT", and "MAY" that appear in this document are to be interpreted as described in RFC 2119 [6].




3. Definitions

The following definitions will be used in this document:

Please note that a single application can have multiple roles at the same time.

The following principles apply to syslog communication:

3.1 Example Deployment Scenarios

The deployment scenarios shown in Diagram 1 are all valid; the first one is known to be the most prevalent. Other arrangements of these examples are also acceptable. As noted, relays may pass along all or some of the messages that they receive, and they may also pass along messages that they generate internally. The boxes represent syslog-enabled applications.


         +------+         +---------+
         |Sender|---->----|Collector|
         +------+         +---------+

         +------+         +-----+         +---------+
         |Sender|---->----|Relay|---->----|Collector|
         +------+         +-----+         +---------+

         +------+     +-----+            +-----+     +---------+
         |Sender|-->--|Relay|-->--..-->--|Relay|-->--|Collector|
         +------+     +-----+            +-----+     +---------+

         +------+         +-----+         +---------+
         |Sender|---->----|Relay|---->----|Collector|
         |      |-+       +-----+         +---------+
         +------+  \
                    \     +-----+         +---------+
                     +->--|Relay|---->----|Collector|
                          +-----+         +---------+

         +------+         +---------+
         |Sender|---->----|Collector|
         |      |-+       +---------+
         +------+  \
                    \     +-----+         +---------+
                     +->--|Relay|---->----|Collector|
                          +-----+         +---------+

         +------+         +-----+            +---------+
         |Sender|---->----|Relay|---->-------|Collector|
         |      |-+       +-----+        +---|         |
         +------+  \                    /    +---------+
                    \     +-----+      /
                     +->--|Relay|-->--/
                          +-----+

         +------+         +-----+               +---------+
         |Sender|---->----|Relay|---->----------|Collector|
         |      |-+       +-----+            +--|         |
         +------+  \                        /   +---------+
                    \     +--------+       /                 
                     \    |+------+|      /
                      +->-||Relay ||->---/
                          |+------||    /
                          ||Sender||->-/
                          |+------+|
                          +--------+
				

Diagram 1. Some possible syslog deployment scenarios.




4. Transport Layer Protocol

This document does not specify any transport layer protocol. Instead, it describes the format of a syslog message in a transport layer independent way. This will require that syslog transports be defined in other documents. The first transport is defined in [13] and is consistent with the traditional UDP transport.

Other transport mappings MUST ensure that all messages are transmitted unaltered to the destination. If a mapping needs to perform temporary transformations, it MUST guarantee that the message received at the final destination is an exact copy of the message sent from the initial originator. Otherwise cryptographic verifiers (like signatures) will be broken.

4.1 Minimum Required Transport Mapping

As noted, all implementations MUST have a UDP-based transport as described in [13]. This is to ensure interoperability between all systems implementing the protocol described in this document.




5. Required syslog Format

The syslog message has the following ABNF [8] definition:

   ; The general syslog message format
	
   SYSLOG-MSG      = HEADER SP STRUCTURED-DATA SP MSG
   
   HEADER          = VERSION SP FACILITY SP SEVERITY SP
                     TIMESTAMP SP HOSTNAME SP SENDER-NAME SP
                     SENDER-INST
   VERSION         = NONZERO-DIGIT 0*2DIGIT
   FACILITY        = "0" / (NONZERO-DIGIT 0*9DIGIT)
                     ; range 0..2147483648
   SEVERITY        = "0" / "1" / "2" / "3" / "4" / "5" /
                     "6" / "7"
   HOSTNAME        = 1*255PRINTUSASCII  ; a FQDN

   SENDER-NAME     = 1*48VISUAL
   SENDER-INST     = "-" / 1*16VISUAL
   VISUAL          = (%d33-57/%d59-126) ; all but SP
   
   TIMESTAMP       = full-date "T" full-time
   date-fullyear   = 4DIGIT
   date-month      = 2DIGIT  ; 01-12
   date-mday       = 2DIGIT  ; 01-28, 01-29, 01-30, 01-31 based on
                             ; month/year
   time-hour       = 2DIGIT  ; 00-23
   time-minute     = 2DIGIT  ; 00-59
   time-second     = 2DIGIT  ; 00-58, 00-59, 00-60 based on leap
                             ; second rules
   time-secfrac    = "." 1*6DIGIT
   time-offset     = "Z" / time-numoffset
   time-numoffset  = ("+" / "-") time-hour ":" time-minute

   partial-time    = time-hour ":" time-minute ":" time-second
                     [time-secfrac]
   full-date       = date-fullyear "-" date-month "-" date-mday
   full-time       = partial-time time-offset

   STRUCTURED-DATA = *STR-DATA-ELT
   STR-DATA-ELT    = "[" SD-ID 0*(1*SP SD-PARAM) "]"
   SD-PARAM        = PARAM-NAME "=" %d34 PARAM-VALUE %d34
   SD-ID           = SD-NAME
   PARAM-NAME      = SD-NAME
   PARAM-VALUE     = UTF-8-STRING
   SD-NAME         = 1*32OCTET ; VALID UTF-8 String
                     ; except '=', SP, ']', %d34 (")

   MSG             = *UTF-8-STRING
   UTF-8-STRING    = *OCTET ; Any VALID UTF-8 String

   OCTET           = %d00-255
   SP              = %d32
   PRINTUSASCII    = %d33-126
   NONZERO-DIGIT   = "1" / "2" / "3" / "4" / "5" /
                     "6" / "7" / "8" / "9"
   DIGIT           = "0" / NONZERO-DIGIT
			

5.1 Message Length

The maximum length of any syslog message is 16,777,216 octets. Any receiver receiving a larger message MUST discard the message. A diagnostic message SHOULD be logged in this case.

A receiver MUST be able to receive messages of a length of 480 octets or less. A receiver SHOULD be able to receive messages of a length of 65,535 octets or less. It is RECOMMENDED that receivers have the ability to receive messages up to the maximum message length.

If a receiver receives a message that is within the maximum length but longer than the receiver can handle, the receiver MAY discard or truncate it.
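
The length rules above can be sketched as a small receiver-side check. This is only an illustration: the function name, the `supported` parameter and the choice to truncate by default are assumptions, not part of this specification.

```python
MAX_MSG_LEN = 16_777_216   # maximum message length stated in this section
MIN_SUPPORTED = 480        # every receiver must handle at least this much


def apply_length_policy(raw: bytes, supported: int = 65_535,
                        truncate: bool = True):
    """Return the octets to process, or None if the message is dropped.

    `supported` is the largest message this (hypothetical) receiver handles.
    """
    if supported < MIN_SUPPORTED:
        raise ValueError("a receiver must support at least 480 octets")
    if len(raw) > MAX_MSG_LEN:
        return None                       # oversize: MUST be discarded
    if len(raw) > supported:
        # Within the protocol maximum but too long for this receiver:
        # discarding and truncating are both permitted.
        return raw[:supported] if truncate else None
    return raw
```

Note that truncating at an arbitrary octet position may split a multi-octet UTF-8 sequence; a real receiver would probably want to cut at a character boundary and log a diagnostic.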

5.2 HEADER

The code set used in the HEADER MUST be seven-bit ASCII in an eight-bit field as described in RFC 2234 [8]. These are the ASCII codes as defined in "USA Standard Code for Information Interchange" ANSI.X3-4.1968 [1].

If the header is not syntactically correct, the receiver MUST NOT try to parse some of the header fields in order to guess an interpretation. It MAY assume it is an RFC 3164 [11] compliant message and MAY decide to process it as such.
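
As a rough illustration of all-or-nothing HEADER parsing, the following sketch matches the whole HEADER with one regular expression and returns nothing if any field is malformed. It assumes a single SP between fields, as in the examples of Section 5.5; the `Header` type and `parse_header` name are hypothetical, and TIMESTAMP is only isolated here, not validated.

```python
import re
from typing import NamedTuple, Optional


class Header(NamedTuple):
    version: int
    facility: int
    severity: int
    timestamp: str
    hostname: str
    sender_name: str
    sender_inst: str
    rest: str          # STRUCTURED-DATA SP MSG, still unparsed


_HEADER_RE = re.compile(
    r"(?P<version>[1-9][0-9]{0,2}) "
    r"(?P<facility>0|[1-9][0-9]{0,9}) "
    r"(?P<severity>[0-7]) "
    r"(?P<timestamp>[^ ]+) "
    r"(?P<hostname>[\x21-\x7e]{1,255}) "                 # PRINTUSASCII
    r"(?P<sender_name>[\x21-\x39\x3b-\x7e]{1,48}) "      # VISUAL
    r"(?P<sender_inst>[\x21-\x39\x3b-\x7e]{1,16}) "      # "-" or VISUAL
    r"(?P<rest>.*)",
    re.DOTALL,
)


def parse_header(line: str) -> Optional[Header]:
    """Return the parsed HEADER, or None if it is not syntactically correct.

    No partial result is ever returned, so a caller cannot be tempted to
    guess at individual fields; it may instead fall back to RFC 3164 handling.
    """
    m = _HEADER_RE.fullmatch(line)
    if m is None:
        return None
    return Header(int(m["version"]), int(m["facility"]), int(m["severity"]),
                  m["timestamp"], m["hostname"], m["sender_name"],
                  m["sender_inst"], m["rest"])
```

A caller would typically check for `None` first and only then decide whether to treat the input as a legacy message.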

5.2.1 VERSION

The VERSION field denotes the version of the syslog protocol specification. The version number MUST be incremented for each new syslog protocol specification that changes the format. This document specifies the VERSION value "1". Some additional information about this is specified in Section 9.

5.2.2 FACILITY

FACILITY is an integer that can be used for filtering by the receiver. There exist some traditional FACILITY code semantics for the codes in the range from 0 to 23. These semantics are not closely followed by all senders. Therefore, no specific semantics for FACILITY codes are implied in this document.

In order to avoid confusion with RFC 3164 [11], facility codes below 24 SHOULD NOT be used by a sender. If the sender uses them, their usage SHOULD be limited to the spirit of the semantics defined in RFC 3164 [11].

5.2.3 SEVERITY

The SEVERITY field is used to indicate the severity that the sender of a message assigned to it. It contains one of these values:

        Numerical         Severity
          Code

           0       Emergency: system is unusable
           1       Alert: action must be taken immediately
           2       Critical: critical conditions
           3       Error: error conditions
           4       Warning: warning conditions
           5       Notice: normal but significant condition
           6       Informational: informational messages
           7       Debug: debug-level messages
	   			

5.2.4 TIMESTAMP

The TIMESTAMP field is a formalized timestamp derived from RFC 3339 [12].

While RFC 3339 [12] makes allowances for multiple syntaxes, this document REQUIRES a restricted set. The TIMESTAMP MUST follow these restrictions:

5.2.4.1 Syslog Senders Without Knowledge of Time

A syslog sender being incapable of obtaining system time MUST use the following TIMESTAMP:

2000-01-01T00:00:60Z

This TIMESTAMP is in the past and it shows a time that never existed, because 1 January 2000 had no leap second. It can never have existed in a valid syslog message of a time-aware sender. A receiver receiving that TIMESTAMP MUST treat it as being well-formed.

5.2.4.2 Examples

Example 1

     1985-04-12T23:20:50.52Z
					

This represents 20 minutes and 50.52 seconds after the 23rd hour of 12 April 1985 in UTC.

Example 2

     1985-04-12T19:20:50.52-04:00
					

This represents the same time as in example 1, but expressed in the eastern US time zone (daylight savings time being observed).

Example 3

     2003-10-11T22:14:15.003Z
					

This represents 11 October 2003 at 10:14:15pm, 3 milliseconds into the next second. The timestamp is in UTC. The timestamp provides millisecond resolution. The creator may have actually had a better resolution, but by providing just three digits for the fractional part, it does not tell us.

Example 4

      2003-08-24T05:14:15.000003-09:00
					

This represents 24 August 2003 at 05:14:15am, 3 microseconds into the next second. The microsecond resolution is indicated by the additional digits in time-secfrac. The timestamp indicates that its local time is -9 hours from UTC. This timestamp might be created in the US Pacific time zone during daylight savings time.

Example 5 - An Invalid TIMESTAMP

      2003-08-24T05:14:15.000000003-09:00
					

This example is nearly the same as Example 4, but it specifies time-secfrac in nanoseconds. This makes time-secfrac longer than the allowed 6 digits, which invalidates the TIMESTAMP.
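
A receiver-side check of these rules might look like the following sketch. The function name and the handling of leap seconds (folded onto second 59 because Python's `datetime` cannot represent second 60) are assumptions made for illustration only.

```python
import re
from datetime import datetime, timedelta, timezone

UNKNOWN_TIME = "2000-01-01T00:00:60Z"   # sent by time-unaware senders

_TS_RE = re.compile(
    r"(\d{4})-(\d{2})-(\d{2})T(\d{2}):(\d{2}):(\d{2})"
    r"(?:\.(\d{1,6}))?"            # time-secfrac: at most 6 digits
    r"(Z|[+-]\d{2}:\d{2})",        # time-offset
    re.ASCII,
)


def parse_timestamp(text: str):
    """Return an aware datetime, or None for the 'time unknown' marker.

    Raises ValueError if the TIMESTAMP is malformed.
    """
    if text == UNKNOWN_TIME:
        return None                # well-formed, but carries no time
    m = _TS_RE.fullmatch(text)
    if m is None:
        raise ValueError(f"malformed TIMESTAMP: {text!r}")
    year, month, day, hour, minute, second = (int(g) for g in m.groups()[:6])
    if second > 60:
        raise ValueError("time-second out of range")
    # Pad the fraction to microseconds: ".52" -> 520000, ".003" -> 3000.
    micro = int((m.group(7) or "").ljust(6, "0"))
    offset = m.group(8)
    if offset == "Z":
        tz = timezone.utc
    else:
        off_h, off_m = int(offset[1:3]), int(offset[4:6])
        if off_h > 23 or off_m > 59:
            raise ValueError("time-numoffset out of range")
        delta = timedelta(hours=off_h, minutes=off_m)
        tz = timezone(delta if offset[0] == "+" else -delta)
    # Simplification: a leap second (:60) is folded onto :59.
    return datetime(year, month, day, hour, minute, min(second, 59), micro, tz)
```

With this sketch, the nanosecond form shown in Example 5 raises `ValueError`, while the special TIMESTAMP of Section 5.2.4.1 is accepted and reported as "no time available".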

5.2.5 HOSTNAME

The HOSTNAME field identifies the machine that originally sent the syslog message.

The HOSTNAME field SHOULD contain the host name and the domain name of the originator in the format specified in STD 13 [3]. This format will be referred to in this document as FQDN.

If the FQDN is not known to the originator, but it knows its IP address and knows that address is statically assigned, it SHOULD use that IP address.

If the sender does not know its IP address, or if the address is dynamically assigned, and the FQDN is not known, it SHOULD specify its host name without domain name.

If the FQDN and the host name are not known, and the IP address is not statically assigned or it is not known if it is statically assigned, the sender SHOULD specify the IPv4 or IPv6 address it knows, even if it may be dynamically assigned.

If the sender does not know any identifying information, it SHOULD provide the value "0:0:0:0:0:0:0:0".

If an IPv4 address is used, it MUST be in the format of the dotted decimal notation as used in STD 13 [4]. If an IPv6 address is used, a valid textual representation described in RFC 2373 [9], Section 2 MUST be used.

If a sender has multiple IP addresses, it SHOULD use a consistent value in the HOSTNAME field. This consistent value MUST be one of its actual IP addresses. If a sender is running on a machine which has both statically and dynamically assigned addresses, then that consistent value SHOULD be from the statically assigned addresses. As an alternative, the sender MAY use the IP address of the interface that is used to send the message.
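
One reading of the preference order described in this section is: FQDN, then a statically assigned IP address, then the host name without domain, then a dynamically assigned IP address, and finally the fixed fallback value. The sketch below encodes that reading; the function and parameter names are hypothetical, and the caller is assumed to pass `None` for anything the sender does not know.

```python
from typing import Optional

NO_IDENTITY = "0:0:0:0:0:0:0:0"   # value used when nothing is known


def choose_hostname(fqdn: Optional[str] = None,
                    static_ip: Optional[str] = None,
                    short_name: Optional[str] = None,
                    dynamic_ip: Optional[str] = None) -> str:
    """Pick the HOSTNAME value from whatever identity the sender knows."""
    for candidate in (fqdn, static_ip, short_name, dynamic_ip):
        if candidate:
            return candidate
    return NO_IDENTITY
```

A sender with several addresses would call this once and reuse the result, so that HOSTNAME stays consistent across messages.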

5.2.6 SENDER-NAME

The SENDER-NAME SHOULD identify the device or application that generated the message. It is a string without further semantics. It is intended for filtering messages on the receiver.

SENDER-NAME is similar to the TAG field described in [11], but without the instance description that often could be found in TAG.

5.2.7 SENDER-INST

The SENDER-INST SHOULD identify a specific instance of the sender. It is RECOMMENDED that SENDER-INST contains the operating system process ID, together with a thread ID, if these things exist. No specific format is REQUIRED.

The dash character ("-") is a reserved character that MUST only be used to indicate an unidentified instance.

5.3 STRUCTURED-DATA

STRUCTURED-DATA transports data in a well defined, easily parsable and interpretable format. There are multiple usage scenarios. For example, it may transport meta-information about the syslog message or application-specific information such as traffic counters or IP addresses.

STRUCTURED-DATA contains zero, one or multiple structured data elements, which are referred to as "STR-DATA-ELT" in this document.

The code set used in STRUCTURED-DATA must be UNICODE, encoded in UTF-8 as specified in RFC 2279 [7]. A sender MAY issue any valid UTF-8 sequence. A receiver MUST accept any valid UTF-8 sequence. It MUST NOT fail if control characters are present in the STRUCTURED-DATA part.

If STRUCTURED-DATA is malformed, a diagnostic entry SHOULD be logged. It is RECOMMENDED that STRUCTURED-DATA is considered to be non-existing in such cases. A receiver MAY also discard the message.

5.3.1 STR-DATA-ELT

A STR-DATA-ELT consists of a name and parameter name-value pairs. The name is referred to as SD-ID. It is case-sensitive and uniquely identifies the type and purpose of the element. The name-value pairs are referred to as "SD-PARAM".

5.3.1.1 SD-ID

SD-IDs MUST NOT contain SP or the characters '=', '"', or ']'. IANA controls ALL SD-IDs without a hyphen ('-') in the second character position. Experimental or vendor-specific SD-IDs MUST start with "x-". Values with a hyphen on the second character position and the first character position not being a lower case "x" are undefined and SHOULD NOT be used. Receivers MAY accept them.

If a receiver receives a well-formed but unknown SD-ID, it SHOULD ignore the element.

5.3.1.2 SD-PARAM

Each SD-PARAM consists of a name, referred to as PARAM-NAME, and a value, referred to as PARAM-VALUE.

PARAM-NAME is case-sensitive and MUST NOT contain SP or the characters '=', '"', or ']'.

Inside PARAM-VALUE, the characters '"', '\' and ']' MUST be escaped. This is necessary to avoid parsing errors. Escaping ']' would not strictly be necessary but is REQUIRED by this specification to avoid parser implementation errors. Each of these three characters MUST be escaped as '\"', '\\' and '\]' respectively.

A backslash ('\') followed by none of the three described characters is considered an invalid escape sequence. Upon reception of such an invalid escape sequence, the receiver MUST replace the two-character sequence with only the second character received. It is RECOMMENDED that the receiver log a diagnostic in this case.
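
The escaping rules can be sketched as a pair of helper functions. The names are hypothetical, and the treatment of a lone backslash at the very end of a value (kept as-is) is an assumption, since that case is not covered by the text above.

```python
def escape_param_value(value: str) -> str:
    """Escape '"', '\\' and ']' for use inside PARAM-VALUE."""
    out = []
    for ch in value:
        if ch in '"\\]':
            out.append("\\")
        out.append(ch)
    return "".join(out)


def unescape_param_value(escaped: str):
    """Undo the escaping; return (value, number_of_invalid_escapes).

    An invalid escape (backslash followed by any other character) is
    replaced by the second character alone, and counted so that the
    caller can log a diagnostic.
    """
    out = []
    invalid = 0
    i = 0
    while i < len(escaped):
        ch = escaped[i]
        if ch == "\\" and i + 1 < len(escaped):
            nxt = escaped[i + 1]
            if nxt not in '"\\]':
                invalid += 1
            out.append(nxt)
            i += 2
        else:
            out.append(ch)     # ordinary character (or trailing backslash)
            i += 1
    return "".join(out), invalid
```

For example, the value `a"b` is transmitted as `a\"b`, and a received `a\nb` is read as `anb` with one invalid escape reported.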

5.3.2 Examples

All examples in this section only show the structured data part of the message. Examples should be considered to be on one line. They are wrapped on multiple lines for readability purposes only. A description is given after each example.

Example 1 - Valid

        [x-example-iut iut="3" EventSource="Application"
        EventID="1011"]
					

This example is a structured data element with an experimental SD-ID of type "x-example-iut" which has three parameters.

Example 2 - Valid

        [x-example-iut iut="3" EventSource="Application"
        EventID="1011"][x-example-priority class="high"]
					

This is the same example as in 1, but with a second structured data element. Please note that the second structured data element immediately follows the first one.

Example 3 - Invalid

        [x-example-iut iut="3" EventSource="Application"
        EventID="1011"] [x-example-priority class="high"]
					

This is nearly the same example as 2, but it has a subtle error. Please note that there is a SP character between the two structured data elements ("]SP["). This is invalid. It will cause the STRUCTURED-DATA field to end after the first element. The second element will be interpreted as part of the MSG field.

Example 4 - Invalid

        [ x-example-iut iut="3" EventSource="Application"
        EventID="1011"][x-example-priority class="high"]
					

This example again is nearly the same as 2. It has another subtle error. Please note the SP character after the initial bracket. A structured data element SD-ID must immediately follow the beginning bracket, so the SP character invalidates the STRUCTURED-DATA. Thus, the receiver MAY discard this message.

Example 5 - Valid

        [sigSig Ver="1" RSID="1234" ... Signature="......"]
					

Example 5 shows a hypothetical IANA-assigned SD-ID. Please note that the dots denote missing fields, which have been left out for brevity.
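
The behavior described for Examples 1 through 4 can be reproduced with a small parser sketch. It is deliberately lenient in places: it counts characters rather than octets for the 32-octet name limit, tolerates an unescaped ']' inside a quoted value, and returns PARAM-VALUEs still in escaped form. The function name and the return shape are assumptions made for this illustration.

```python
import re

# SD-NAME: 1..32 characters, none of SP, '=', '"', ']'
_ID_RE = re.compile(r'\[([^ ="\]]{1,32})')
# 1*SP PARAM-NAME "=" quoted PARAM-VALUE (backslash escapes kept raw)
_PARAM_RE = re.compile(r' +([^ ="\]]{1,32})="((?:\\.|[^"\\])*)"', re.DOTALL)


def parse_structured_data(text: str):
    """Split leading STRUCTURED-DATA from `text`.

    Returns (elements, rest), where elements is a list of
    (sd_id, [(param_name, raw_value), ...]) and rest is whatever follows
    the last element. Raises ValueError on a malformed element.
    """
    elements = []
    pos = 0
    while pos < len(text) and text[pos] == "[":
        m = _ID_RE.match(text, pos)
        if m is None:
            raise ValueError("SD-ID must immediately follow '['")
        sd_id, pos = m.group(1), m.end()
        params = []
        while True:
            p = _PARAM_RE.match(text, pos)
            if p is None:
                break
            params.append((p.group(1), p.group(2)))
            pos = p.end()
        if pos >= len(text) or text[pos] != "]":
            raise ValueError("unterminated or malformed STR-DATA-ELT")
        pos += 1
        elements.append((sd_id, params))
    return elements, text[pos:]
```

Fed the text of Example 3, this sketch returns one element and leaves the second bracketed group in the remainder, which a receiver would then treat as part of MSG. Fed Example 4, it raises `ValueError` because of the SP after the opening bracket.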

5.4 MSG

The MSG part contains a freeform message that gives some detailed information of the event.

The code set used in MSG MUST be UNICODE, encoded in UTF-8 as specified in RFC 2279Yergeau, F., UTF-8, a transformation format of ISO 10646, January 1998.[7]. A sender MAY issue any valid UTF-8 sequence. A receiver MUST accept any valid UTF-8 sequence. It MUST NOT fail if control characters are present in the MSG part.

5.5 Examples

The following are examples of valid syslog messages. A description of each example can be found below it. The examples are based on similar examples from RFC 3164 [11] and may be familiar to readers.

Example 1

     1 888 4 2003-10-11T22:14:15.003Z mymachine.example.com su -  'su
     root' failed for lonvick on /dev/pts/8
					

In this example, the VERSION is 1 and the FACILITY has the value of 888. The message was created on 11 October 2003 at 10:14:15pm UTC, 3 milliseconds into the next second. The message originated from a host that identifies itself as "mymachine.example.com". The SENDER-NAME is "su" and the SENDER-INST is unknown. Note the two SP characters following SENDER-INST. The second SP character is the STRUCTURED-DATA delimiter. It indicates that no STRUCTURED-DATA is present in this message. The MSG is "'su root' failed for lonvick...".

Example 2

      1 20 6 2003-08-24T05:14:15.000003-09:00 192.0.2.1
      myproc 10 %% It's time to
      make the do-nuts. %%  Ingredients: Mix=OK, Jelly=OK #
      Devices: Mixer=OK, Jelly_Injector=OK, Frier=OK # Transport:
      Conveyer1=OK, Conveyer2=OK # %%
					

In this example, the VERSION is again 1. The FACILITY is within the legacy syslog range (20). The severity is 6 ("Informational" semantics). It was created on 24 August 2003 at 5:14:15am, with a -9 hour offset from UTC, 3 microseconds into the next second. The HOSTNAME is "192.0.2.1", so the sender did not know its FQDN and used the IPv4 address instead. The SENDER-NAME is "myproc" and the SENDER-INST is "10". The message is "%% It's time to make the do-nuts......".

Example 3 - with STRUCTURED-DATA

        1 888 4 2003-10-11T22:14:15.003Z mymachine.example.com
        EvntSLog - [x-example-iut iut="3" EventSource="Application"
        EventID="1011"] An application event log entry...
					

This example is modeled after example 1. However, this time it contains STRUCTURED-DATA, a single element with the value "[x-example-iut iut="3" EventSource="Application" EventID="1011"]". The MSG itself is "An application event log entry...".

Example 4 - STRUCTURED-DATA Only

        1 888 4 2003-10-11T22:14:15.003Z mymachine.example.com
        EvntSLog - [x-example-iut iut="3" EventSource="Application"
        EventID="1011"][x-example-priority class="high"]
					

This example shows a message with only STRUCTURED-DATA and no MSG part. This is a valid case.
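
To show how the parts fit together on the sending side, the following sketch assembles a message from already-formatted fields, using a single SP between fields as in the examples above. The function name and its parameters are hypothetical, and apart from a SEVERITY range check no field validation is attempted.

```python
def build_message(facility: int, severity: int, timestamp: str,
                  hostname: str, sender_name: str, sender_inst: str = "-",
                  structured_data: str = "", msg: str = "",
                  version: int = 1) -> str:
    """Assemble HEADER SP STRUCTURED-DATA SP MSG.

    `structured_data` is expected to be already serialized (and escaped),
    or empty if the message carries none.
    """
    if not 0 <= severity <= 7:
        raise ValueError("SEVERITY must be in 0..7")
    header = " ".join([str(version), str(facility), str(severity),
                       timestamp, hostname, sender_name, sender_inst])
    # Both delimiters are always written, so empty STRUCTURED-DATA
    # shows up as two consecutive SP characters after SENDER-INST.
    return f"{header} {structured_data} {msg}"
```

Called with the field values of Example 1 and no structured data, this yields the two consecutive SP characters after the "-" that the description of Example 1 points out.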




6. Structured Data IDs

This section defines the initial IANA-registered SD-IDs. See Section 5.3 for a definition of structured data elements. All SD-IDs are optional.

6.1 time

The SD-ID "time" MAY be used by the original sender to describe its notion of system time. This SD-ID SHOULD be written if the sender is not properly synchronized with a reliable external time source or if it does not know if its time zone information is correct. The main use of this structured data element is to provide some information on the level of trust of the TIMESTAMP described in Section 5.2.4.

6.1.1 tzknown

The "tzknown" parameter indicates if the original sender knows its time zone. If it does so, the value "1" MUST be used. If the time zone information is in doubt, the value "0" MUST be used. If the sender knows its time zone but decides to emit UTC, the value "1" MUST be used (because the time zone is known).

6.1.2 issynced

The "issynced" parameter indicates if the original sender is synchronized to a reliable external time source, e.g. via NTP. If the original sender is time synchronized, the value "1" SHOULD be used. If not, the value "0" MUST be used.

6.1.3 syncaccuracy

The "syncaccuracy" parameter indicates how accurate the original sender thinks the time synchronization it participates in is. It is an integer describing the maximum number of milliseconds that the clock may be off between synchronization intervals.

If the value "0" is used for "issynced", this parameter MUST NOT be specified. If the value "1" is used for "issynced" but the "syncaccuracy" parameter is absent, a receiver SHOULD assume that the time information provided is accurate enough to be considered correct. The "syncaccuracy" parameter SHOULD ONLY be written if the original sender actually has knowledge of the reliability of the external time source. In practice, in most cases, it will gain this in-depth knowledge only through operator configuration.

6.1.4 Examples

The following is an example of a system that knows neither its time zone nor whether it is being synchronized:

[time tzknown="0" issynced="0"]

With this information, the sender indicates that its time information cannot be trusted. This may be a hint for the receiver to use its local time instead of the message-provided TIMESTAMP for correlation of multiple messages from different senders.

The following is an example of a system that knows its time zone and knows that it is properly synchronized to a reliable external source:

[time tzknown="1" issynced="1"]

Note: this case SHOULD be assumed by a receiver if no "time" SD-ID is provided by the sender.

The following is an example of a system that knows both its time zone and that it is externally synchronized. It also knows the accuracy of the external synchronization:

[time tzknown="1" issynced="1" syncaccuracy="60000"]

The difference between this and the previous example is that the sender expects that its clock will be kept within 60 seconds of the official time. So if the sender reports it is 9:00:00, it is no earlier than 8:59:00 and no later than 9:01:00.
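
The arithmetic in the last example can be written down as follows. This is only a sketch of how a receiver might turn "syncaccuracy" into an interval; the function name is hypothetical.

```python
from datetime import datetime, timedelta


def trust_window(timestamp: datetime, syncaccuracy_ms: int):
    """Return (earliest, latest) actual time implied by syncaccuracy."""
    delta = timedelta(milliseconds=syncaccuracy_ms)
    return timestamp - delta, timestamp + delta
```

With a reported time of 9:00:00 and syncaccuracy="60000", the window runs from 8:59:00 to 9:01:00.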

6.2 origin

The SD-ID "origin" MAY be used to indicate the origin of a syslog message. The following parameters can be used. All parameters are optional.

6.2.1 ip

The "ip" parameter denotes the IP address that the sender knows it had at the time of sending this message. It MUST contain the textual representation of an IP address as outlined in Section 5.2.5.

If a sender has multiple IP addresses, it MAY either use a single one of its IP addresses in the "ip" parameter or it MAY include multiple "ip" parameters in a single "origin" structured data element.

6.2.2 enterpriseID

The "enterpriseID" parameter MUST be an 'SMI Network Management Private Enterprise Code', maintained by IANA, whose prefix is iso.org.dod.internet.private.enterprise (1.3.6.1.4.1). The number which follows is unique and may be registered by an on-line form at http://www.iana.org/. Only that number MUST be specified in the "enterpriseID" parameter. The complete up-to-date list of Enterprise Numbers is maintained by IANA at http://www.iana.org/assignments/enterprise-numbers.

By specifying an enterpriseID, the vendor allows more specific parsing of the message. This may be of aid to log analyzers and similar processes.

6.2.3 software

The "software" parameter uniquely identifies the software that generated this message. If it is used, "enterpriseID" SHOULD also be specified, so that a specific vendor's software can be identified. The "software" parameter is not the same as the SENDER-NAME header parameter. It always contains the name of the generating software while SENDER-NAME can contain anything else, including an operator-configured value.

Specifying the "software" parameter is an aid to log analyzers and similar processes.

The "software" parameter is a string. It MUST NOT be longer than 48 characters.

6.2.4 sw-version

The "sw-version" parameter uniquely identifies the version of the software that generated the message. If it is used, the "software" and "enterpriseID" parameters SHOULD be provided, too.

Specifying the "sw-version" parameter is an aid to log analyzers and similar processes.

The "sw-version" parameter is a string. It MUST NOT be longer than 32 characters.

6.2.5 Example

The following is an example with multiple IP addresses:

[origin ip="192.0.2.1" ip="192.0.2.129"]

In this example, the sender indicates that it has two IP addresses, one being 192.0.2.1 and the other one being 192.0.2.129.




7. Security Considerations

7.1 Diagnostic Logging

This document, in multiple sections, recommends that an implementation write a diagnostic message to indicate unusual situations or other noteworthy events. Diagnostic messages are a useful tool for finding configuration issues as well as system penetrations.

Unfortunately, diagnostic logging can cause issues by itself, for example if an attacker tries to create a denial of service condition by deliberately sending malformed messages that will lead to the creation of diagnostic log entries. Due to sheer volume, the resulting diagnostic log entries may exhaust system resources, e.g. processing power, I/O capability or simply storage space. For example, an attacker could flood a system with messages generating diagnostic log entries after having compromised a system. If the log entries are stored for example in a circular buffer, the flood of diagnostic log entries would eventually overwrite useful previous diagnostics.

Besides this risk, diagnostic messages, if they occur too frequently, can become meaningless. Common practice is to turn off diagnostic logging if it is too verbose. This potentially removes important diagnostic information which could aid the operator.

7.2 Control Characters

This document does not impose any restrictions on the MSG content. As such, MSG MAY contain control characters, including the NUL character.

In some programming languages (most notably C and C++), the NUL (0x00) character traditionally has a special significance as string terminator. Most, if not all, implementations of these languages assume that a string will not extend beyond the first NUL character. This is primarily a restriction of the supporting run-time libraries. Please note that this restriction is often carried over to programs and scripting languages implemented in those languages. As such, NUL characters must be considered with great care and be properly handled. An attacker may deliberately include NUL characters to hide information after them. Incorrect handling of the NUL character may also invalidate cryptographic checksums that are transmitted inside the message.

Many popular text editors are also written in languages with this restriction. This means that NUL characters SHOULD NOT be written to a file in an unencoded way - otherwise it would potentially render the file unreadable.

The same is true for other control characters. For example, deliberately included backspace characters may be used by an attacker to render parts of the log message unreadable. Similar approaches exist for almost all control characters.

Finally, invalid UTF-8 sequences may be used by an attacker to inject ASCII control characters. This is why invalid UTF-8 sequences are not allowed and MUST be rejected.
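
The following is a minimal sketch of how a receiver might apply these considerations before writing MSG to a text file. The function name sanitize_msg and the backslash escape notation are assumptions chosen for illustration; this document does not define an encoding for stored control characters. The sketch covers only the ASCII control range and DEL.

```python
def sanitize_msg(raw: bytes) -> str:
    """Decode MSG strictly and escape control characters for text storage."""
    # errors="strict" raises UnicodeDecodeError on invalid UTF-8 sequences,
    # so such messages are rejected instead of being silently repaired.
    text = raw.decode("utf-8", errors="strict")
    out = []
    for ch in text:
        code = ord(ch)
        if ch == "\\":
            out.append("\\\\")  # keep the escape notation unambiguous
        elif code < 0x20 or code == 0x7F:
            out.append(f"\\x{code:02x}")  # hypothetical escape notation
        else:
            out.append(ch)
    return "".join(out)


stored = sanitize_msg(b"ok\x00hidden\x08")
```

Escaping, rather than dropping, the control characters keeps text that an attacker tried to hide after a NUL visible to the operator. The unmodified bytes would still be needed if a checksum or signature is to be verified later.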

7.3 More than Maximum Message Length

The message length MUST NOT exceed the maximum value outlined in Section 5. Various problems may result if a sender sends out messages with a greater length. While this document forbids oversize messages, an attacker may deliberately introduce them. As such, it is vital that each receiver performs the necessary sanity checks.

7.4 Message Length

An attacker might deliberately send messages of the maximum size. This could lead to massive resource consumption and potentially denial of service on the receiver or an interim system. Besides the DoS itself, this could result in the loss of vital log data. As such, a DoS attack could be used as a way to hide another attack.

To avoid this problem, the network operator may limit the size of the received message to some value below the maximum supported by the protocol. An implementation may also provide a feature where only a configured number of maximum size messages are allowed and truncation occurs if these occur too frequently.
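
An operator-configured size limit of this kind might be applied as sketched below. The function name apply_size_limit and its limit parameter are hypothetical; the actual limit is a local configuration choice and is not defined here. The sketch assumes the input is valid UTF-8 and cuts on a character boundary, so that truncation does not itself create an invalid sequence.

```python
def apply_size_limit(raw: bytes, limit: int) -> tuple[bytes, bool]:
    """Return (message, truncated) with the message cut to at most `limit` bytes."""
    if limit <= 0:
        raise ValueError("limit must be positive")
    if len(raw) <= limit:
        return raw, False
    end = limit
    # If the first dropped byte is a UTF-8 continuation byte (10xxxxxx), the
    # cut would split a character; back up to just before its lead byte.
    while end > 0 and (raw[end] & 0xC0) == 0x80:
        end -= 1
    return raw[:end], True


message, truncated = apply_size_limit("a\u00e9".encode("utf-8"), 2)
```

The returned flag lets the receiver write a diagnostic message when truncation took place.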

7.5 Message Truncation

Messages exceeding the minimum size that must be supported may be discarded or truncated by the receiver or interim systems. As such, vital log information may be lost. Even messages within that size may be lost if a non-reliable transport mapping is used.

In order to prevent information loss, messages should be shorter than the minimum supported size outlined in Section 5.1. For best performance and reliability, messages SHOULD be as small as possible. Important information SHOULD be placed as early in the message as possible, as the information at the beginning of the message is less likely to be discarded by a size-limited receiver.

If an application includes user-supplied data within a syslog message, the application should limit the size of this data. Otherwise, an attacker may provide large data in the hope of exploiting this potential weakness.

7.6 Single Source to a Destination

The syslog messages are usually presented (placed in a file, displayed on the console, etc.) in the order in which they are received. This is not always in accordance with the sequence in which they were generated. As they are transmitted across an IP network, some out-of-order receipt should be expected. This may lead to some confusion, as messages may be received that would indicate that a process has stopped before it was started. This is somewhat rectified by the TIMESTAMP. However, the accuracy of the TIMESTAMP may not always be sufficient.

It is desirable to use a transport with guaranteed delivery, if one is available.

7.7 Multiple Sources to a Destination

In syslog, there is no concept of unified event numbering. Single senders are free to include a sequence number within the MSG but that can hardly be coordinated between multiple senders. In such cases, multiple senders may report that each one is sending message number one. Again, this may be rectified somewhat by the TIMESTAMP. As has been noted, however, even messages from a single sender to a single collector may be received out of order. This situation is compounded when there are several senders configured to send their syslog messages to a single collector. Messages from one sender may be delayed so the collector receives messages from another sender first even though the messages from the first sender were generated before the messages from the second. If there is no sufficiently-precise timestamp or coordinated sequence number, then the messages may be presented in the order in which they were received which may give an inaccurate view of the sequence of actual events.

7.8 Multiple Sources to Multiple Destinations

The plethora of configuration options available to the network administrators may further skew the perception of the order of events. It is possible to configure a group of senders to send status messages (or other informative messages) to one collector, while sending messages of relatively higher importance to another collector. Additionally, the messages may be sent to different files on the same collector. If the messages do not contain sufficiently-precise timestamps from the source, it may be difficult to order the messages if they are kept in different places. An administrator may not be able to determine if a record in one file occurred before or after a record in a different file. This may be somewhat alleviated by placing marking messages with a timestamp into all destination files. If these have coordinated timestamps, then there will be some indication of the time of receipt of the individual messages. As such, it is highly recommended to use the best available precision in the TIMESTAMP and to use automatic time synchronization on each system (as, for example, can be done via NTP).
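
When records do carry precise, synchronized timestamps, records kept in different files can be put back into one sequence. The sketch below illustrates this under stated assumptions: each record is a (timestamp, text) pair, each file is already sorted by time, and the function name merge_by_timestamp and the sample records are hypothetical.

```python
import heapq
from datetime import datetime


def merge_by_timestamp(*logs):
    """Merge record lists, each already sorted by time, into one ordered list."""
    return list(
        heapq.merge(*logs, key=lambda rec: datetime.fromisoformat(rec[0]))
    )


# Hypothetical records from two destination files on the same collector.
file_a = [
    ("2003-10-11T22:13:14.003+00:00", "process started"),
    ("2003-10-11T22:13:14.250+00:00", "process stopped"),
]
file_b = [("2003-10-11T22:13:14.100+00:00", "link up")]

merged = merge_by_timestamp(file_a, file_b)
```

The result is only as trustworthy as the clocks of the senders, which is why time synchronization matters here.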

7.9 Replaying

Messages may be recorded and replayed at a later time. An attacker may record a set of messages that indicate normal activity of a machine. At a later time, that attacker may remove that machine from the network and replay the syslog messages to the collector. Even with a TIMESTAMP field in the HEADER part, an attacker may record the packets and could simply modify them to reflect the current time before retransmitting them. The administrators may find nothing unusual in the received messages and their receipt would falsely indicate normal activity of the machine.

Cryptographically signing messages could prevent the alteration of TIMESTAMPs and thus the replay attack.

7.10 Reliable Delivery

As there is no mechanism described within this document to ensure delivery, and since the underlying transport may be lossy (e.g. UDP), some messages may be lost. They may either be dropped through network congestion, or they may be maliciously intercepted and discarded. The consequences of the drop of one or more syslog messages cannot be determined. If the messages are simple status updates, then their non-receipt may either not be noticed, or it may cause an annoyance for the system operators. On the other hand, if the messages are more critical, then the administrators may not become aware of a developing and potentially serious problem. Messages may also be intercepted and discarded by an attacker as a way to hide unauthorized activities.

It is RECOMMENDED to use a reliable transport mapping to prevent this problem.

7.11 Message Integrity

Besides being discarded, syslog messages may be damaged in transit, or an attacker may maliciously modify them. In such cases, the original contents of the message will not be delivered to the collector. Additionally, if an attacker is positioned between the sender and collector of syslog messages, they may be able to intercept and modify those messages while in-transit to hide unauthorized activities.

7.12 Message Observation

While there are no strict guidelines pertaining to the MSG format, most syslog messages are generated in human readable form with the assumption that capable administrators should be able to read them and understand their meaning. Neither the syslog protocol nor the syslog application has mechanisms to provide confidentiality of the messages in transit. In most cases passing clear-text messages is a benefit to the operations staff if they are sniffing the packets off the wire. The operations staff may be able to read the messages and associate them with other events seen from other packets crossing the wire to track down and correct problems. Unfortunately, an attacker may also be able to observe the human-readable contents of syslog messages. The attacker may then use the knowledge gained from those messages to compromise a machine or do other damage.

7.13 Misconfiguration

Since there is no control information distributed about any messages or configurations, it is wholly the responsibility of the network administrator to ensure that the messages are actually going to the intended recipient. Cases have been noted where senders were inadvertently configured to send syslog messages to the wrong receiver. In many cases, the inadvertent receiver may not be configured to receive syslog messages and it will probably discard them. In certain other cases, the receipt of syslog messages has been known to cause problems for the unintended recipient. If messages are not going to the intended recipient, then they cannot be reviewed or processed.

Using a reliable transport mapping can guard against these problems.

7.14 Forwarding Loop

As shown in Figure 1, machines may be configured to relay syslog messages to subsequent relays before reaching a collector. In one particular case, an administrator found that he had mistakenly configured two relays to forward messages with certain Priority values to each other. When either of these machines received or generated that type of message, it would forward it to the other relay. That relay would, in turn, forward it back. This cycle caused degradation to the intervening network as well as to the processing availability on the two devices. Network administrators must take care to not cause such a death spiral.

7.15 Load Considerations

Network administrators must take the time to estimate the appropriate size of the syslog receivers. An attacker may perform a Denial of Service attack by filling the disk of the collector with false messages. Placing the records in a circular file may alleviate this but that has the consequence of not ensuring that an administrator will be able to review the records in the future. Along this line, a receiver or collector must have a network interface capable of receiving all messages sent to it.

Administrators and network planners must also critically review the network paths between the devices, the relays, and the collectors. Generated syslog messages should not overwhelm any of the network links.

In order to reduce the impact of this issue, it is recommended to use transports with guaranteed delivery.

7.16 Denial of Service

As with any system, an attacker may just overwhelm a receiver by sending more messages to it than can be handled by the infrastructure or the device itself. Implementors should attempt to provide features that minimize this threat, such as only receiving syslog messages from known IP addresses.
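
A receiver-side check against a list of known addresses might look like the sketch below. The function name make_source_filter and the configured entries are hypothetical. Note that with a transport such as UDP the source address can be forged, so a filter of this kind reduces the threat but does not remove it.

```python
import ipaddress


def make_source_filter(allowed):
    """Build a predicate that accepts only sources inside the configured entries."""
    networks = [ipaddress.ip_network(entry) for entry in allowed]

    def is_allowed(source: str) -> bool:
        addr = ipaddress.ip_address(source)
        return any(addr in net for net in networks)

    return is_allowed


# Hypothetical configuration: one IPv4 prefix and one single IPv6 host.
is_allowed = make_source_filter(["192.0.2.0/25", "2001:db8::1"])
accept_first = is_allowed("192.0.2.1")
accept_second = is_allowed("192.0.2.129")
```

Messages from sources that fail the check can be dropped before any parsing takes place, which keeps the cost per rejected message low.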

7.17 Covert Channels

Nothing in this protocol attempts to eliminate covert channels. Indeed, the unformatted message syntax in the packets could be very amenable to sending embedded secret messages. In fact, just about every aspect of syslog messages lends itself to the conveyance of covert signals. For example, a collusionist could send odd and even PRI values to indicate Morse Code dashes and dots.




8. Notice to RFC Editor

This is a note to the RFC editor. This ID is submitted along with ID draft-ietf-syslog-transport-udp and they cross-reference each other. When RFC numbers are determined for each of these IDs, these references will be updated to use the RFC numbers. This section will be removed at that time.




9. IANA Considerations

9.1 Version

IANA must maintain a registry of VERSION values as described in Section 5.2.1.

For this document, IANA must register the VERSION "1". New VERSION numbers must monotonically increment (the next VERSION will be "2") and will be registered via the Specification Required method as described in RFC 2434 [10].

9.2 SD-IDs

IANA must maintain a registry of Structured Data ID (SD-ID) values as described in Section 6. These are the SD-IDs which do NOT have a hyphen ("-") in the second character position.

New SD-ID values may be registered through the Specification Required method as described in RFC 2434 [10].

For this document, IANA must register the SD-IDs "time" and "origin".




10. Authors and Working Group Chair

The working group can be contacted via the mailing list:

      syslog-sec@employees.org
				

The current Chair of the Working Group may be contacted at:

      Chris Lonvick
      Cisco Systems
      Email: clonvick@cisco.com
				

The author of this draft is:

      Rainer Gerhards
      Email: rgerhards@adiscon.com

      Phone: +49-9349-92880
      Fax: +49-9349-928820
      
      Adiscon GmbH
      Mozartstrasse 21
      97950 Grossrinderfeld
      Germany
				




11. Acknowledgments

The authors wish to thank Chris Lonvick, Jon Callas, Andrew Ross, Albert Mietus, Anton Okmianski, Tina Bird, David Harrington and all other people who commented on various versions of this proposal.




12. References




12.1 Normative

[1] American National Standards Institute, "USA Code for Information Interchange", ANSI X3.4, 1968.
[2] Postel, J., "Internet Protocol", STD 5, RFC 791, September 1981.
[3] Mockapetris, P., "Domain names - concepts and facilities", STD 13, RFC 1034, November 1987.
[4] Mockapetris, P., "Domain names - implementation and specification", STD 13, RFC 1035, November 1987.
[5] Baker, F., "Requirements for IP Version 4 Routers", RFC 1812, June 1995.
[6] Bradner, S., "Key words for use in RFCs to Indicate Requirement Levels", BCP 14, RFC 2119, March 1997.
[7] Yergeau, F., "UTF-8, a transformation format of ISO 10646", RFC 2279, January 1998.
[8] Crocker, D. and P. Overell, "Augmented BNF for Syntax Specifications: ABNF", RFC 2234, November 1997.
[9] Hinden, R. and S. Deering, "IP Version 6 Addressing Architecture", RFC 2373, July 1998.
[10] Narten, T. and H. Alvestrand, "Guidelines for Writing an IANA Considerations Section in RFCs", BCP 26, RFC 2434, October 1998.
[11] Lonvick, C., "The BSD Syslog Protocol", RFC 3164, August 2001.
[12] Klyne, G. and C. Newman, "Date and Time on the Internet: Timestamps", RFC 3339, July 2002.
[13] Okmianski, A., "Transmission of syslog messages over UDP", RFC 9999, August 2004.



12.2 Informative

[14] Malkin, G., "Internet Users' Glossary", RFC 1983, August 1996.
[15] New, D. and M. Rose, "Reliable Delivery for syslog", RFC 3195, November 2001.



Author's Address

  Rainer Gerhards
  Adiscon GmbH
  Mozartstrasse 21
  Grossrinderfeld, BW 97950
  Germany
EMail:  rgerhards@adiscon.com



Appendix A. Implementor Guidelines

Information in this section is given as an aid to implementors. While this information is considered to be helpful, it is not normative. As such, an implementation is NOT REQUIRED to implement it in order to claim compliance to this specification.

A.1 Message Length

Implementors should note the message size limitations outlined in Section 5.1 and try to keep the most important parts early in the message (within the minimum guaranteed length). This ensures they will be seen by the receiver even if it (or a relay on the message path) truncates the message.

A.2 HEADER Parsing

This section RECOMMENDS a message header parsing method based on the VERSION field described in Section 5.2.1.

The receiver MUST check the VERSION. If the VERSION is within the set of versions supported by the receiver, it MUST parse the message according to the correct syslog protocol specification.

If the receiver does not support the specified VERSION, it SHOULD log a diagnostic message. It SHOULD NOT parse beyond the VERSION field. This is because the header format may have changed in a newer version. It SHOULD NOT try to process the message, but it MAY try this if the administrator has configured the receiver to do so. In the latter case, the results may be undefined. If the administrator has configured the receiver to parse a non-supported version, it SHOULD assume that these messages are legacy syslog messages and parse and process them with respect to RFC 3164 [11]. To be precise, a receiver receiving an unknown VERSION number, or a message without a valid VERSION, MUST discard the message by default. However, the administrator may configure it to not discard these messages. If that happens, the receiver MUST parse it according to RFC 3164 [11]. The administrator may again override this setting and configure the receiver to parse the messages in any way. It would be considered good form if the receiver were to attempt to ensure that no application reliability issues occur.

The spirit behind these guidelines is that the administrator may sometimes need the power to override version-specific parsing, but this should be done in the most secure and reliable way. Therefore, the receiver MUST use the appropriate defaults specified above. This document is specific on this point because it is common experience that parsing unknown formats often leads to security issues.
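
The default behavior described in this section can be sketched as a small decision function. The names choose_action, SUPPORTED_VERSIONS and allow_legacy, as well as the returned action strings, are hypothetical and serve only to make the rule concrete; a missing or invalid VERSION is represented as None.

```python
# Hypothetical set of VERSION values this receiver implements.
SUPPORTED_VERSIONS = {"1"}


def choose_action(version, allow_legacy=False):
    """Decide how to treat a message based on its VERSION field.

    `version` is None when no valid VERSION could be read.
    `allow_legacy` models the administrator's explicit override.
    """
    if version in SUPPORTED_VERSIONS:
        return "parse-v" + version
    if allow_legacy:
        # Only on explicit configuration: treat as a legacy message.
        return "parse-rfc3164"
    # Secure default: do not parse beyond the VERSION field.
    return "discard"


default_action = choose_action("2")
override_action = choose_action("2", allow_legacy=True)
```

Keeping the override as an explicit parameter that defaults to off mirrors the requirement that discarding is the default.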

A.3 SEVERITY Values

This section describes guidelines for using SEVERITY as outlined in Section 5.2.3.

All implementations SHOULD try to assign the most appropriate severity to their message. Most importantly, messages designed to enable debugging or testing of software SHOULD be assigned severity 7. Severity 0 SHOULD be reserved for messages of very high importance (like serious hardware failures or imminent power failure). An implementation MAY use severities 0 and 7 for other purposes if this is configured by the administrator.

Since severities are very subjective, the receiver SHOULD NOT assume that all senders have the same definition of severity.

A.4 time-secfrac Precision

The TIMESTAMP described in Section 5.2.4 supports fractional seconds. This provides ground for a very common coding error, where leading zeros are removed from the fractional seconds. For example, the TIMESTAMP "2003-10-11T22:13:14.003" may be erroneously written as "2003-10-11T22:13:14.3". This would indicate 300 milliseconds instead of the 3 milliseconds actually meant.
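
The error typically arises when the fraction is formatted as a plain integer. A minimal sketch of the correct, zero-padded formatting follows; the function name format_timestamp and the choice of millisecond precision are assumptions made for the example.

```python
def format_timestamp(date_time: str, milliseconds: int) -> str:
    """Append a millisecond fraction, always padded to three digits."""
    if not 0 <= milliseconds <= 999:
        raise ValueError("milliseconds must be in the range 0..999")
    return f"{date_time}.{milliseconds:03d}"


correct = format_timestamp("2003-10-11T22:13:14", 3)
# The faulty variant drops the leading zeros and changes the meaning.
faulty = "2003-10-11T22:13:14" + "." + str(3)
```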

A.5 Leap Seconds

The TIMESTAMP described in Section 5.2.4 permits leap seconds, as described in RFC 3339 [12].

The value "60" in the time-second field is used to indicate a leap second. This MUST NOT be misinterpreted. Developers and implementors are advised to replace the value "60", if seen in the header, with the value "59" if it cannot otherwise be processed, e.g. stored to a database. It SHOULD NOT be converted to the first second of the next minute. Please note that such a conversion, if done on the message text itself, will cause cryptographic signatures to become invalid. As such, it is suggested that the adjustment not be performed when the plain message text is to be stored (e.g. for later verification of signatures).
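
The suggested replacement might be implemented as sketched below. The function name storable_timestamp is hypothetical. The sketch adjusts only a copy of the TIMESTAMP that is handed to storage which cannot represent second 60, and leaves the original message text untouched.

```python
import re

# Matches a TIMESTAMP whose time-second field is "60" (a leap second).
LEAP_RE = re.compile(r"^(\d{4}-\d{2}-\d{2}T\d{2}:\d{2}):60")


def storable_timestamp(timestamp: str) -> str:
    """Return a copy with second "60" replaced by "59"; other values pass through."""
    return LEAP_RE.sub(r"\1:59", timestamp, count=1)


original = "2005-12-31T23:59:60.250"
adjusted = storable_timestamp(original)
```

Because the function returns a new string, the original value remains available for later verification of signatures.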

A.6 Syslog Senders Without Knowledge of Time

In Section 5.2.4.1, a specific TIMESTAMP for usage by senders without knowledge of time is defined. This is done to support a special case when a sender is not aware of time at all. It can be argued whether or not such a sender is something that can actually be found in today's IT infrastructure. However, discussion has indicated that those things may exist in practice and as such there should be a guideline established for this case. It may also be assumed that this class of senders will most probably be found in embedded devices.

Note well: an implementation MUST emit a valid TIMESTAMP if the underlying operating system, programming system and hardware supports the clock function. A proper TIMESTAMP MUST be emitted even if it is difficult, but doable, to obtain the system time. The TIMESTAMP described in Section 5.2.4.1 MUST only be used when it is actually impossible to obtain time information. This rule SHOULD NOT be used as an excuse for lazy implementations.

If a receiver receives that special TIMESTAMP, it SHOULD know that the sender had no idea of what the time actually is and act accordingly.

A.7 Additional Information on SENDER-INST

The objective behind SENDER-INST is to provide a quick way to detect a new instance of the same sender. It must be noted that this is not reliable, as a second incarnation of a sender may actually be able to use the same SENDER-INST value as the prior one. Properly used, the SENDER-INST can be helpful for analysis purposes.

A.8 Notes on the time SD-ID

It is RECOMMENDED that the value of "0" be the default for the "tzknown" parameter. It SHOULD only be changed to "1" after the administrator has specifically configured the time zone. The value "1" MAY be used as the default if the underlying operating system provides accurate time zone information. It is still advised that the administrator explicitly acknowledges the correctness of the time zone information.

It is important not to create a false impression of accuracy with the time SD-ID. A sender MUST only indicate a given accuracy if it actually knows it is within these bounds. It is generally assumed that the sender gains this in-depth knowledge through operator configuration. As such, by default, an accuracy SHOULD NOT be provided.

A.9 Recommendation for Diagnostic Logging

In Section 7.1, this document describes the need for diagnostic logging as well as its potential problems. In this section, a real-world approach to useful diagnostic logging is RECOMMENDED.

While this document recommends writing meaningful diagnostic logs, it also recommends allowing an operator to limit the amount of diagnostic logging. At a minimum, an implementation SHOULD differentiate between critical, informational and debugging diagnostic messages. Critical messages should only be issued in truly critical states, e.g. an expected or ongoing malfunction of the application or parts of it. A strong indication of an ongoing attack may also be considered critical. As a guideline, there should be very few critical messages. Informational messages should indicate all conditions not fully correct, but still within the bounds of normal processing. A diagnostic message logging the fact that a malformed message has been received is a good example of this category. A debug diagnostic message should not be needed during normal operation, but merely as a tool for setting up or testing a system (which includes the process of an operator configuring multiple syslog applications in a complex environment). An application may decide to not provide any debugging diagnostic messages.

An administrator should be able to configure the level for which diagnostic messages will be written. Diagnostics of non-configured levels should not be written but discarded. An implementor may create as many different levels of diagnostic messages as seem useful - the above recommendation is just based on real-world experience of what is considered useful. Please note that experience also shows that too many levels of diagnostics typically do no good, because the typical administrator may no longer be able to understand what each level means.

Even with this categorization, a single diagnostic (or a set of them) may be generated frequently when a specific condition exists (or a system is being attacked). This leads to the security issues outlined in Section 7.1. To solve this, it is recommended that an implementation allow a limit to be set on how many duplicate diagnostic messages are generated within a given amount of time. For example, an administrator should be able to configure that groups of 50 identical messages within a specified time period are logged with only a single diagnostic message. All subsequent identical messages will be discarded until the next time interval. It is usually considered good form to generate a subsequent message identifying the number of duplicate messages that were discarded. While this causes some information loss, it is considered a good compromise between avoiding overruns and providing the most in-depth diagnostic information. An implementation offering this feature should allow the administrator to configure the number of duplicate messages as well as the time interval to whatever the administrator considers reasonable. It is up to the implementor to decide what the term "duplicate" means. Some may decide that only totally identical (in byte-to-byte comparison) messages are actually duplicates; others may say that a message of identical type with just some changed parameter (e.g. a changed remote host address) is also a duplicate. Both approaches have their advantages and disadvantages. It is probably best to leave this configurable as well and allow the administrator to set the parameters.
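
One possible shape of such a limiter is sketched below. It is a simplified variation, not the only reading of the recommendation: it writes the first occurrence in each time interval, discards identical ones until the interval ends, and reports the discarded count when the next occurrence arrives. The class name DuplicateLimiter, the wording of the summary line, and the use of the full message text as the duplicate key are assumptions; the caller supplies the current time so that the interval is configurable and the logic is easy to test.

```python
class DuplicateLimiter:
    """Suppress repeated diagnostics within a configurable time interval."""

    def __init__(self, interval: float):
        self.interval = interval
        self._state = {}  # key -> [interval_start, suppressed_count]

    def offer(self, key: str, now: float) -> list[str]:
        """Return the diagnostic lines to write for this occurrence of `key`."""
        entry = self._state.get(key)
        if entry is None or now - entry[0] >= self.interval:
            lines = []
            if entry is not None and entry[1] > 0:
                # Report how many duplicates were discarded in the old interval.
                lines.append(f"previous message repeated {entry[1]} more times")
            self._state[key] = [now, 0]
            lines.append(key)
            return lines
        entry[1] += 1  # duplicate inside the interval: count it, write nothing
        return []


limiter = DuplicateLimiter(interval=60.0)
first = limiter.offer("malformed message received", now=0.0)
second = limiter.offer("malformed message received", now=1.0)
third = limiter.offer("malformed message received", now=2.0)
later = limiter.offer("malformed message received", now=61.0)
```

A looser definition of "duplicate" could be obtained by deriving the key from the message type only, rather than from the full text.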



