Sandesh

Introduction

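Sandesh is the name of a unified infrastructure in the Contrail Virtual Networking solution to support the collection of system logs, object logs, User Visible Entities (UVEs), traces, and flow statistics, each of which is described under Sandesh Types below.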

Sandesh is also the name of the XML over TCP protocol used by the software modules in the Contrail Controller and Contrail vRouter to send the above information to the Analytics node.

Software modules on the Controller, such as the ControlNode, ApiServer, Schema, ServiceMonitor, QueryEngine, and OpServer, as well as the vRouter agent, are generators of this information. They send it over the Sandesh protocol to a module called the Collector on the Analytics node. The Collector receives the information and populates the Analytics database, which can then be queried through the REST API provided by the Analytics node for real-time or offline analysis and debugging.

The Sandesh software framework is designed for scalable cross-language development. It combines a generator-side library with a code generation engine so that software developers working on the generators can easily produce and report information to the Analytics node. The languages currently supported are C++, Python, and C. The code generation engine is based on Apache Thrift (http://thrift.apache.org).

Sandesh Code Generator

Overview

The Sandesh compiler, or code generator, lets developers define the data types and messages to be sent to the Analytics node in a simple definition file. Taking that file as input, the compiler generates the code needed to send the messages to the Collector. Instead of writing boilerplate code to serialize and transport objects to the Collector, developers working on generators can get right down to business. The data types supported in the IDL file are bool, byte, i16, i32, i64, string, list, map, struct, const static string, sandesh, u16, and u32. Annotations in the IDL file convey information to the Collector; for example, the annotation (key=<Table-Name>) indicates that the message should be stored in a particular indexed table in the analytics database.

The Sandesh compiler translates a Sandesh IDL file (.sandesh) into source code that is used by the generators. To generate the source from a sandesh file, run:

sandesh --gen <language> <Sandesh filename>

For example, for a sandesh file vns.sandesh, running sandesh --gen cpp vns.sandesh generates the following auto-generated C++ code:

vns_types.h
vns_types.cpp
vns_constants.h
vns_constants.cpp
vns_html.cpp
vns_html_template.cpp
vns_request_skeleton.cpp
vns.html
vns.xsl
vns.xml
style.css

Similarly, running sandesh --gen py vns.sandesh generates the following auto-generated Python code:

gen_py/__init__.py
gen_py/vns
gen_py/vns/__init__.py
gen_py/vns/ttypes.py
gen_py/vns/constants.py
gen_py/vns/http_request.py
gen_py/vns/vns.xml
gen_py/vns/vns.xsl
gen_py/vns/vns.html
gen_py/vns/request_skeleton.py
gen_py/vns/style.css
gen_py/vns/index.html
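
The compiler invocation above can also be driven from a build script. The following is a minimal sketch, assuming a `sandesh` binary is available on the PATH; `sandesh_command` is a hypothetical helper name and is not part of Sandesh. It only builds the command line shown earlier and does not run the compiler.

```python
import shlex

def sandesh_command(language, sandesh_file):
    """Return the argument list for: sandesh --gen <language> <Sandesh filename>."""
    return ["sandesh", "--gen", language, sandesh_file]

# Print the two invocations used in the examples above.
for language in ("cpp", "py"):
    print(shlex.join(sandesh_command(language, "vns.sandesh")))
```

The returned list can be passed to `subprocess.run(..., check=True)` to actually run the compiler.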

Sandesh Types

Developers can define different types of sandesh in the sandesh file, depending on the type of information to be conveyed to the Analytics node. The currently supported types of sandesh are:

  1. Systemlog

    Use Case:

    Structured log replacement for syslog

    Example:

    systemlog sandesh BgpPeerTableMessageLog {
        1: string PeerType;
        2: "Peer";
        3: string Peer;
        4: "in table";
        5: string Table;
        6: ":";
        7: string Message;
    }
    
    Contrail-Logs Output:
    Apr 01 13:36:21.666916 a6s23.contrail.juniper.net [ControlNode:BGP] : BgpPeerTableMessageLog:1 Xmpp Peer default-global-system-config:a6s23 in table inet.0 : Register routing-table for RibInAdd, RibOutAdd
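
The message text in the output above is the systemlog's typed fields and quoted literals joined in declaration order. The sketch below only models that interleaving; it is not the generated code, and `render_systemlog` and the `("const", ...)` / `("field", ...)` tuples are hypothetical names chosen for illustration.

```python
def render_systemlog(elements):
    """Join literal strings and field values in declaration order."""
    return " ".join(str(value) for _kind, value in elements)

# ("const", text) mirrors a quoted literal in the IDL,
# ("field", value) mirrors a typed field filled in by the generator.
message = [
    ("field", "Xmpp"),                                # 1: string PeerType
    ("const", "Peer"),                                # 2: "Peer"
    ("field", "default-global-system-config:a6s23"),  # 3: string Peer
    ("const", "in table"),                            # 4: "in table"
    ("field", "inet.0"),                              # 5: string Table
    ("const", ":"),                                   # 6: ":"
    ("field", "Register routing-table for RibInAdd, RibOutAdd"),  # 7: string Message
]
print(render_systemlog(message))
```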
    
  2. Objectlog

    Use Case:

    Logging state transitions and lifetime events for objects such as VirtualMachine or VirtualNetwork. Objectlog is useful for performing historical state queries on an object. Objects have an object-id, which is indicated using the annotation (key="Object<Name>"). For example, RoutingInstanceInfo below has name as the key.

    Example:
    struct RoutingInstanceInfo {
        1: string name (key="ObjectRoutingInstance");
        2: optional string route_distinguisher;
        3: optional string operation;
        4: optional string peer;
        5: optional string family;
        6: optional list<string> add_import_rt;
        7: optional list<string> remove_import_rt;
        8: optional list<string> add_export_rt;
        9: optional list<string> remove_export_rt;
        10: string hostname;
    }
    
    objectlog sandesh RoutingInstanceCollector {
        1: RoutingInstanceInfo routing_instance;
    }
    

    Contrail-Logs Output:

    Apr 01 13:36:20.769677 a6s23.contrail.juniper.net [ControlNode:__default__] : RoutingInstanceCollector:11 [RoutingInstanceInfo: name = default-domain:service:default-virtual-network:default-virtual-network, operation = , [add_import_rt: target:64512:4], [add_export_rt: target:64512:4], hostname = a6s23]
    
  3. User Visible Entities (UVE)

    Use Case:

    User Visible Entities (UVEs) are used to represent the system-wide state of externally visible objects. UVEs are a special case of an objectlog. They are used to display the operational state of an object like VirtualMachine (VM) or VirtualNetwork (VN) in the Contrail VNS, by aggregating information from uve sandesh messages across generator types (configuration node, vRouter Agent, and control node) and across nodes. UVE definitions, like objectlog definitions, need the key annotation.

    Details and Example:

    For example, consider the VirtualNetwork (VN) uve sandesh definition below. We specify its state in two tiers: Configuration and vRouter Agent. For each tier, we return a single structure, even though a given VN might live on many modules in that tier. A VN might be present on many vRouter Agents. These agents are expected to send UVE sandesh messages when any attribute of the VN changes state. The vRouter Agent tier of the VN UVE looks like:

    struct UveInterVnStats {
        1: string other_vn (aggtype="listkey")
        2: i64 out_tpkts
        3: i64 in_tpkts
    }

    struct UveVirtualNetworkAgent {
        1: string name (key="ObjectVNTable")
        2: optional bool deleted
        3: optional i32 total_acl_rules
        4: optional i32 total_analyzers (aggtype="sum")
        5: optional i64 in_tpkts
        7: optional i64 out_tpkts
        9: optional list<UveInterVnStats> stat (aggtype="append")
        11: optional list<string> vm_list (aggtype="union")
    }

    uve sandesh UveVirtualNetworkAgentTrace {
        1: UveVirtualNetworkAgent data
    }
    

    Following are the guidelines that need to be followed when defining a sandesh UVE:

    1. A Sandesh UVE must have a single attribute (the UVE struct), which must be named "data".
    2. The UVE struct must have a mandatory field called "name". This field must carry a "key" annotation naming the analytics database index table that corresponds to this UVE type (each UVE type has exactly one analytics database index table, which is the same across all tiers of the UVE type).
    3. The UVE struct may have an optional field named "deleted". A backend module implicitly announces the existence of an object by filling out any attribute(s) of the UVE struct and sending the sandesh UVE. It must, however, explicitly announce the object's deletion by filling out the "deleted" field and sending the sandesh UVE.
    4. The UVE struct can have any number of other optional or mandatory fields. These fields can themselves be structs. The "aggtype" annotation lets the sender choose how an attribute is aggregated across multiple modules.
    5. When sending a list attribute of a UVE, the developer must send the entire list or not send the attribute at all (do not send only the changed or added elements). Each time a list attribute is re-sent, its previous value is completely overwritten.
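
Guidelines 2, 3, and 5 can be illustrated with a small model of how one generator's UVE updates might be merged on the receiving side. This is a sketch under assumptions: UVE messages are represented as plain dictionaries, and `apply_uve_update` and `is_live` are hypothetical names, not part of the Sandesh library.

```python
def apply_uve_update(cache, update):
    """Merge one UVE message from a single generator into that generator's state."""
    name = update["name"]  # guideline 2: the "name" field is mandatory
    state = cache.setdefault(name, {"name": name})
    for attr, value in update.items():
        # Guideline 5: a re-sent list replaces the previous list entirely.
        state[attr] = list(value) if isinstance(value, list) else value
    return state

def is_live(cache, name):
    """Guideline 3: an object exists once reported, until "deleted" is set."""
    return name in cache and not cache[name].get("deleted", False)

cache = {}
apply_uve_update(cache, {"name": "vn1", "vm_list": ["vm1", "vm2"]})
apply_uve_update(cache, {"name": "vn1", "total_acl_rules": 4})
apply_uve_update(cache, {"name": "vn1", "vm_list": ["vm2"]})  # overwrites, does not merge
print(cache["vn1"])
print(is_live(cache, "vn1"))
apply_uve_update(cache, {"name": "vn1", "deleted": True})     # explicit deletion
print(is_live(cache, "vn1"))
```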

    Aggregation Rules:

    1. No "aggtype"

      In this case, the attribute is expected to have the same value in all modules. For example, each vRouter Agent should have the same value for "total_acl_rules" for a given VN. The aggregated value of "total_acl_rules" will be a list of lists. This top-level list will have one list per unique reported value of "total_acl_rules". Each of these sublists will have the value itself as its first element, and the remaining elements will be the modules that reported that value.
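
The list-of-lists shape described above can be sketched as follows. The module names and the `aggregate_plain` helper are hypothetical, and the ordering of the sublists is an assumption made for the example.

```python
def aggregate_plain(reports):
    """reports maps module name -> reported value.

    Returns one sublist per unique value: [value, module, module, ...].
    """
    grouped = {}
    for module, value in sorted(reports.items()):
        grouped.setdefault(value, []).append(module)
    return [[value] + modules for value, modules in grouped.items()]

# Two agents agree on total_acl_rules, one disagrees.
print(aggregate_plain({"agent-a1": 4, "agent-a2": 4, "agent-a3": 5}))
# [[4, 'agent-a1', 'agent-a2'], [5, 'agent-a3']]
```

When every module reports the same value, the result contains a single sublist, which makes disagreement between modules easy to spot.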

    2. aggtype=sum

      This is only valid for integer types. In this case, the aggregate value reported is the sum of the values reported by all modules. For example, each vRouter Agent tracks the number of analyzer instances attached to a VN using the attribute "total_analyzers". The aggregate value is the sum of the values of this attribute across all agents on which the given VN lives.

    3. aggtype=counter

      This is also valid only for integer types. Just like "aggtype=sum", the aggregate value will be a sum of values across modules. However, we will also store this value when the object is deleted. For example, the "in_tpkts" attribute will report the total number of input packets seen for this VN, even if some vRouter Agents have restarted several times (and have lost track of the packet statistics from their previous runs).
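
The difference between "sum" and "counter" can be sketched as below. This is an illustrative model only: how the aggregator actually retains values from deleted objects is not specified here, so the `retained` total and the class and function names are assumptions.

```python
def aggregate_sum(reports):
    """aggtype=sum: add up the current value from every reporting module."""
    return sum(reports.values())

class CounterAggregate:
    """aggtype=counter: like sum, but values from deleted reports are retained."""

    def __init__(self):
        self.live = {}      # module -> last reported value
        self.retained = 0   # total carried over from deleted reports (assumed model)

    def report(self, module, value):
        self.live[module] = value

    def delete(self, module):
        # Keep the last value instead of discarding it.
        self.retained += self.live.pop(module, 0)

    def total(self):
        return self.retained + sum(self.live.values())

print(aggregate_sum({"agent-a1": 2, "agent-a2": 1}))  # 3

in_tpkts = CounterAggregate()
in_tpkts.report("agent-a1", 200)
in_tpkts.report("agent-a2", 100)
in_tpkts.delete("agent-a1")      # agent restarts and loses its statistics
in_tpkts.report("agent-a1", 50)  # packet count from the new run
print(in_tpkts.total())          # 350
```

With a plain sum, the same sequence would report 150, because the 200 packets counted before the restart would be lost.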

    4. aggtype=union

      This is valid for a list of structs or a list of elementary types. The aggregate value is a list created by taking the set union of the lists across all modules reporting on the object. For example, "vm_list" is a list of all VMs that are attached to this VN across all agents.
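
A minimal sketch of the union rule follows. `aggregate_union` is a hypothetical helper; it compares elements by equality so that it also works for struct-like values such as dictionaries, and the output ordering is an assumption.

```python
def aggregate_union(reports):
    """reports maps module name -> list. Returns the set union as a list."""
    union = []
    for module in sorted(reports):
        for item in reports[module]:
            if item not in union:  # equality check, so unhashable structs work too
                union.append(item)
    return union

print(aggregate_union({"agent-a1": ["vm1", "vm2"], "agent-a2": ["vm2", "vm3"]}))
# ['vm1', 'vm2', 'vm3']
```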

    5. aggtype=append

      This is valid for a list of structs. The struct type can have multiple integer-type attributes, and exactly one attribute with the annotation "aggtype=listkey" (as on "other_vn" in UveInterVnStats above), which is treated as the key. During aggregation, we combine the lists from all modules and create a set of structs with unique keys. If two struct elements have the same key, they are combined by adding together all the integer attributes. We maintain the struct elements of deleted objects as well. For example, VN1 exists on both agent A1 and agent A2. It exchanges packets with VN2 on both agents as well. The "stat" lists on A1 and on A2 both have "UveInterVnStats" entries for VN2. A1 has received 200 packets from VN2 and A2 has received 100 packets from VN2. The aggregated "stat" list will have a single entry for VN2 with 300 packets.
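
The VN1/VN2 example above can be worked through in a short sketch. Structs are modelled as dictionaries, and `aggregate_append` is a hypothetical helper; retention of entries for deleted objects is not modelled.

```python
def aggregate_append(reports, key_field):
    """Combine per-module lists of structs, summing integer fields per key."""
    merged = {}
    for module in sorted(reports):
        for entry in reports[module]:
            key = entry[key_field]
            target = merged.setdefault(key, {key_field: key})
            for field, value in entry.items():
                if field != key_field:
                    target[field] = target.get(field, 0) + value
    return list(merged.values())

stat = aggregate_append(
    {
        "A1": [{"other_vn": "VN2", "in_tpkts": 200, "out_tpkts": 0}],
        "A2": [{"other_vn": "VN2", "in_tpkts": 100, "out_tpkts": 0}],
    },
    key_field="other_vn",
)
print(stat)  # [{'other_vn': 'VN2', 'in_tpkts': 300, 'out_tpkts': 0}]
```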

    6. aggtype=stats

      This allows the developer to store and retrieve historical data for this field. For example, it gives the last 10 values reported for CPU utilization (%). It also gives the highest 5 values reported in the previous hour. The optional hbin gives a histogram for the same field over the course of its existence.

  4. Trace

    Use Case:

    Lightweight in-memory buffer logs for frequently occurring events

    Example:

    trace sandesh XmppRxStream {
        1: "Received xmpp message from: ";
        2: string IPaddress;
        3: "Port";
        4: i32 port;
        5: "Size: ";
        6: i32 size;
        7: "Packet: ";
        8: string packet;
        9: "$";
    }
    

    Log file output when local logging is enabled:

    2013-03-29 01:56:21 [139959143454464]: XmppRxStream: Received xmpp message from: 10.84.13.23 Port 5269 Size: 196 Packet: <?xml version="1.0"?>
    <stream:stream from="" to="default-global-system-config:a6s23" id="++123" version="1.0" xml:lang="en" xmlns="jabber:client" xmlns:stream="http://etherx.jabber.org/streams" > $ src/xmpp/xmpp_connection.cc 248 0
    
  5. Traceobject

    Use Case:

    Lightweight in-memory buffer logs for frequently occurring object state transitions

    Example:

    traceobject sandesh RoutingInstanceCreate {
        1: string name;
        2: list<string> import_rt;
        3: list<string> export_rt;
        4: string virtual_network;
        5: i32 index;
    }
    

    Log file output when local logging is enabled:

    2013-03-29 01:46:29 [140461185882112]: RoutingInstanceTableCreate: name = default-domain:default-project:ip-fabric:__default__ table = bgp.l3vpn.0 family = inet-vpn file = src/bgp/routing-instance/routing_instance.cc line = 233 more = 0
    

    Note about trace and traceobject:

    The developer needs to create a Sandesh trace buffer of a given size, in which the trace and traceobject sandesh are stored. HTTP introspect (explained under Request and Response below) can be used to view the trace buffer. Tracing to the buffer can be enabled or disabled, and multiple types of trace and traceobject sandesh can be traced into a single trace buffer.
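
The behaviour of a trace buffer can be sketched as a fixed-size in-memory buffer with an enable flag. This is a model for illustration only: the `TraceBuffer` class is hypothetical, and overwriting the oldest entry when the buffer is full is an assumption about the buffer's policy.

```python
from collections import deque

class TraceBuffer:
    """Fixed-size in-memory buffer shared by several trace message types."""

    def __init__(self, size):
        self.entries = deque(maxlen=size)  # oldest entries drop off when full (assumed)
        self.enabled = True

    def trace(self, message_type, text):
        if self.enabled:
            self.entries.append((message_type, text))

    def dump(self):
        """Return the buffered entries, oldest first."""
        return list(self.entries)

buf = TraceBuffer(size=2)
buf.trace("XmppRxStream", "Received xmpp message from: 10.84.13.23")
buf.trace("RoutingInstanceCreate", "name = ip-fabric")
buf.trace("XmppRxStream", "Received xmpp message from: 10.84.13.24")
buf.enabled = False
buf.trace("XmppRxStream", "ignored while tracing is disabled")
print(buf.dump())
```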

  6. Buffer

    Use Case:

    Exchanging data between kernel and userspace. Buffer sandesh are not sent to the Collector; they provide only marshaling and unmarshaling.

    Example:

    buffer sandesh vr_nexthop_req {
        1:  sandesh_op  h_op;
        2:  byte        nhr_type;
        3:  byte        nhr_family;
        4:  i32         nhr_id;
        5:  i32         nhr_rid;
        6:  i32         nhr_encap_oif_id;
        7:  i32         nhr_encap_len;
        8:  i32         nhr_encap_family;
        9:  i32         nhr_vrf;
        10: i32         nhr_tun_sip;
        11: i32         nhr_tun_dip;
        12: i16         nhr_tun_sport;
        13: i16         nhr_tun_dport;
        14: i32         nhr_ref_cnt;
        15: i32         nhr_marker;
        16: i16         nhr_flags;
        17: list<byte>  nhr_encap;
        18: list<i32>   nhr_nh_list;
        19: i32         nhr_label;
    }
    

  7. Flowlog

    Use Case:

    Reporting flow statistics for historical analysis

    Example:

    struct FlowDataIpv4 {
        1: string flowuuid;
        2: byte direction_ing;
        3: optional string sourcevn;
        4: optional i32 sourceip;
        5: optional string destvn;
        6: optional i32 destip;
        7: optional byte protocol;
        8: optional i16 sport;
        9: optional i16 dport;
        10: optional byte tos;
        11: optional byte tcp_flags;
        12: optional string vm;
        13: optional string input_interface;
        14: optional string output_interface;
        15: optional i32 mpls_label;
        16: optional string reverse_uuid;
        17: optional i64 setup_time;
        18: optional i64 teardown_time;
        19: optional i32 min_interarrival;
        20: optional i32 max_interarrival;
        21: optional i32 mean_interarrival;
        22: optional i32 stddev_interarrival;
        23: optional i64 bytes;
        24: optional i64 packets;
        25: optional binary data_sample;
    }
    
    flowlog sandesh FlowDataIpv4Object {
        1: FlowDataIpv4 flowdata;
    }
    

  8. Request and Response

    Use Case:

    A request is used to send commands from a requestor to a generator. A response carries the reply from the generator back to the requestor. The requestor can be the Analytics node Collector or an HTTP web server running in the generator, which is used for debugging the software modules in the Contrail Controller and Contrail vRouter. Request and response sandesh provide a debugging facility via HTTP introspect: a developer can use a web browser to send a sandesh request and get back a sandesh response, which gives the software modules a RESTful API.

    Example:

    request sandesh SandeshLoggingParamsSet {
        1: bool enable;
        2: string category;
        3: string level;
    }
    
    response sandesh SandeshLoggingParams {
        1: bool enable;
        2: string category;
        3: string level;
    }
    

    Notes:

    The developer is expected to provide the implementation of the request handling function. For the above example, in C++ this is the implementation of the SandeshLoggingParamsSet::HandleRequest function, and in Python a bound method named handle_request is expected to be present in SandeshLoggingParamsSet.
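
For Python, attaching a handler might look like the sketch below. The two classes are stand-ins written by hand for this example, not the code produced by the Sandesh compiler; `logging_params`, `handle_logging_params_set`, and the field values are hypothetical. Returning the response object from the handler is also an assumption made to keep the sketch self-contained, since the real mechanism for delivering a response is not described here.

```python
class SandeshLoggingParams:
    """Stand-in for the generated response class."""
    def __init__(self, enable, category, level):
        self.enable, self.category, self.level = enable, category, level

class SandeshLoggingParamsSet:
    """Stand-in for the generated request class."""
    def __init__(self, enable, category, level):
        self.enable, self.category, self.level = enable, category, level

# Hypothetical module-level logging state that the request modifies.
logging_params = {"enable": False, "category": "", "level": "info"}

def handle_logging_params_set(request):
    """Apply the requested settings and build the matching response."""
    logging_params.update(
        enable=request.enable, category=request.category, level=request.level
    )
    return SandeshLoggingParams(**logging_params)

# Attaching the function to the class makes it a bound method on each request.
SandeshLoggingParamsSet.handle_request = handle_logging_params_set

response = SandeshLoggingParamsSet(True, "BGP", "debug").handle_request()
print(response.enable, response.category, response.level)
```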