1 of 12

Operations

Perform usage operations on NSO.

NEDs and Adding Devices

Learn about NEDs, their types, and how to work with them.

Network Element Drivers, NEDs, provides the connectivity between NSO and the devices. NEDs are installed as NSO packages. For information on how to add a package for a new device type, see NSO .

To see the list of installed packages (you will not see the F5 BigIP):

The core parts of a NED are:

A Driver Element: Running in a Java VM.
Data Model: Independent of the underlying device interface technology, NEDs come with a data model in YANG that specifies configuration data and operational data that is supported for the device.
- For native NETCONF devices, the YANG comes from the device.
- For JunOS, NSO generates the model from the JunOS XML schema.
- For SNMP devices, NSO generates the model from the MIBs.
- For CLI devices, the NED designer writes the YANG to map the CLI.
Code: For NETCONF and SNMP devices, there is no code. For CLI devices there is a minimum of code managing connecting over SSH/Telnet and looking for version strings. The rest is auto-rendered from the data model.

There are four categories of NEDs depending on the device interface:

NETCONF NED: The device supports NETCONF, for example, Juniper.
CLI NED: Any device with a CLI that resembles a Cisco CLI.
Generic NED: Proprietary protocols like REST, and non-Cisco CLIs.
SNMP NED: An SNMP device.

Device Authentication

Every device needs an auth group that tells NSO how to authenticate to the device:

The CLI snippet above shows that there is a mapping from the NSO users admin and oper to the remote user and password to be used on the devices. There are two options, either a mapping from the local user to the remote user or to pass the credentials. Below is a CLI example to create a new authgroup foobar and map NSO user jim:

This auth group will pass on joe's credentials to the device.

There is a similar structure for SNMP devices authgroups snmp-group that supports SNMPv1/v2c, and SNMPv3 authentication.

The SNMP auth group above has a default auth group for non-mapped users.

Connecting Devices for Different NED Types

Make sure you know the authentication information and created authgroups as above. Also, try all information like port numbers and authentication information, and that you can read and set the configuration over for example CLI if it is a CLI NED. So if it is a CLI device try to ssh (or telnet) to the device and do show and set configuration first of all.

All devices have a admin-state with default value southbound-locked. This means that if you do not set this value to unlocked no commands will be sent to the device.

CLI NEDs

(See also examples.ncs/getting-started/using-ncs/2-real-device-cisco-ios). Straightforward, adding a new device on a specific address, standard SSH port:

NETCONF NEDs, JunOS

See also /examples.ncs/getting-started/using-ncs/3-real-device-juniper. Make sure that NETCONF over SSH is enabled on the JunOS device:

Then you can create a NSO netconf device as:

SNMP NEDs

(See also examples.ncs/snmp-ned/basic/README .) First of all, let's explain SNMP NEDs a bit. By default all read-only objects are mapped to operational data in NSO and read-write objects are mapped to configuration data. This means that a sync-from operation will load read-write objects into NSO. How can you reach read-only objects? Note the following is true for all NED types that have modeled operational data. The device configuration exists at devices device config and has a copy in CDB. NSO can speak live to the device to fetch for example counters by using the path devices device live-status:

In many cases, SNMP NEDs are used for reading operational data in parallel with a CLI NED for writing and reading configuration data. More on that later.

Before trying NSO use net-snmp command line tools or your favorite SNMP Browser to try that all settings are ok.

Adding an SNMP device assuming that NED is in place:

MIB Groups are important. A MIB group is just a named collection of SNMP MIB Modules. If you do not specify any MIB group for a device, NSO will try with all known MIBs. It is possible to create MIB groups with wild cards such as CISCO*.

Generic NEDs

Generic devices are typically configured like a CLI device. Make sure you set the right address, port, protocol, and authentication information.

Below is an example of setting up NSO with F5 BigIP:

Live Status Protocol

Assume that you have a Cisco device that you would like NSO to configure over CLI but read statistics over SNMP. This can be achieved by adding settings for live-device-protocol:

Device c0 has a config tree from the CLI NED and a live-status tree (read-only) from the SNMP NED using all MIBs in the group snmp.

Multi-NEDs for Statistics

Sometimes we wish to use a different protocol to collect statistics from the live tree than the protocol that is used to configure a managed device. There are many interesting use cases where this pattern applies. For example, if we wish to access SNMP data as statistics in the live tree on a Juniper router, or alternatively, if we have a CLI NED to a Cisco-type device, and wish to access statistics in the live tree over SNMP.

The solution is to configure additional protocols for the live tree. We can have an arbitrary number of NEDs associated to statistics data for an individual managed device.

The additional NEDs are configured under /devices/device/live-status-protocol.

In the configuration snippet below, we have configured two additional NEDs for statistics data.

Administrative State for Devices

Devices have an admin-state with following values:

unlocked: the device can be modified and changes will be propagated to the real device.
southbound-locked: the device can be modified but changes will not be propagated to the real device. Can be used to prepare configurations before the device is available in the network.
locked: the device can only be read.

The admin-state value southbound-locked is the default. This means if you create a new device without explicitly setting this value configuration changes will not propagate to the network. To see default values, use the pipe target details

Troubleshooting NEDs

To analyze NED problems, turn on the tracing for a device and look at the trace file contents.

NSO pools SSH connections and trace settings are only affecting new connections so therefore any open connection must be closed before the trace setting will take effect. Now you can inspect the raw communication between NSO and the device:

Device Communication Failure

If NSO fails to talk to the device, the typical root causes are:

Timeout Problems

Some devices are slow to respond, latency on connections, etc. Fine-tune the connect, read, and write timeouts for the device:

These settings can be set in profiles shared by devices.

Device Management Interface Problems

Examples, not enabling the NETCONF SSH subsystem on Juniper, not enabling the SNMP agent, using the wrong port numbers, etc. Use standalone tools to make sure that you can connect, read configuration, and write configuration over the device interface that NSO is using

Access Rights

The NSO-mapped user does not have access rights to do the operation on the device. Make sure the authgroups settings are OK and test them manually to read and write configuration with those credentials.

NED Data Model and Device Version Problems

If the device is upgraded and existing commands actually change in an incompatible way, the NED has to be updated. This can be done by editing the YANG data model for the device or by using Cisco support.

NSO Device Manager

Learn the concepts of NSO device management.

The NSO device manager is the center of NSO. The device manager maintains a flat list of all managed devices. NSO keeps the primary copy of the configuration for each managed device in CDB. Whenever a configuration change is done to the list of device configuration primary copies, the device manager will partition this network configuration change into the corresponding changes for the managed devices. The device manager passes on the required changes to the NEDs (Network Element Drivers). A NED needs to be installed for every type of device OS, like Cisco IOS NED, Cisco XR NED, Juniper JUNOS NED, etc. The NEDs communicate through the native device protocol southbound.

The NEDs fall into the following categories:

NETCONF-capable device: The Device Manager will produce NETCONF edit-config RPC operations for each participating device.
SNMP device: The Device Manager translates the changes made to the configuration into the corresponding SNMP SET PDUs.
Device with Cisco CLI: The device has a CLI with the same structure as Cisco IOS or XR routers. The Device Manager and a CLI NED are used to produce the correct sequence of CLI commands which reflects the changes made to the configuration.
Other devices: For devices that do not fit into any of the above-mentioned categories, a corresponding Generic NED is invoked. Generic NEDs are used for proprietary protocols like REST and for CLI flavors that do not resemble IOS or XR. The Device Manager will inform the Generic NED about the made changes and the NED will translate these to the appropriate operations toward the device.

NSO orchestrates an atomic transaction that has the very desirable characteristic of either the transaction as a whole ending up on all participating devices and in the NSO primary copy, or alternatively, the whole transaction getting aborted and resultingly, all changes getting automatically rolled back.

The architecture of the NETCONF protocol is the enabling technology making it possible to push out configuration changes to managed devices and then in the case of other errors, roll back changes. Devices that do not support NETCONF, i.e., devices that do not have transactional capabilities can also participate, however depending on the device, error recovery may not be as good as it is for a proper NETCONF-enabled device.

To understand the main idea behind the NSO device manager it is necessary to understand the NSO data model and how NSO incorporates the YANG data models from the different managed devices.

The NEDs will publish YANG data models even for non-NETCONF devices. In the case of SNMP the YANG models are generated from the MIBs. For JunOS devices the JunOS NED generates a YANG from the JunOS XML Schema. For Schema-less devices like CLI devices, the NED developer writes YANG models corresponding to the CLI structure. The result of this is the device manager and NSO CDB has YANG data models for all devices independent of the underlying protocol.

Throughout this section, we will use the examples.ncs/service-provider/mpls-vpn example. The example network consists of Cisco ASR 9k and Juniper core routers (P and PE) and Cisco IOS-based CE routers.

Managed Device Tree

The central part of the NSO YANG model, in the file tailf-ncs-devices.yang, has the following structure:

Each managed device is uniquely identified by its name, which is a free-form text string. This is typically the DNS name of the managed device but could equally well be the string format of the IP address of the managed device or anything else. Furthermore, each managed device has a mandatory address/port pair that together with the authgroup leaf provides information to NSO on how to connect and authenticate over SSH/NETCONF to the device. Each device also has a mandatory parameter device-type that specifies which southbound protocol to use for communication with the device.

The following device types are available:

NETCONF
CLI: A corresponding CLI NED is needed to communicate with the device. This requires YANG models with the appropriate annotations for the device CLI.
SNMP: The device speaks SNMP, preferably in read-write mode.
Generic NED: A corresponding Generic NED is needed to communicate with the device. This requires YANG models and Java code.

The NSO CLI command below lists the NED types for the devices in the example network.

The empty container /ncs:devices/device/config is used as a mount point for the YANG models from the different managed devices.

As previously mentioned, NSO needs the following information to manage a device:

The IP/Port of the device and authentication information.
Some or all of the YANG data models for the device.

In the example setup, the address and authentication information are provided in the NSO database (CDB) initialization file. There are many different ways to add new managed devices. All of the NSO northbound interfaces can be used to manipulate the set of managed devices. This will be further described later.

Once NSO has started you can inspect the meta information for the managed devices through the NSO CLI. This is an example session:

Alternatively, this information could be retrieved from the NSO northbound NETCONF interface by running the simple Python-based netconf-console program towards the NSO NETCONF server.

All devices in the above two examples (Show Device Configuration in NSO CLI and Show Device Configuration in NETCONF) have /devices/device/state/admin-state set to unlocked, this will be described later in this section.

The NED Packages

To communicate with a managed device, a NED for that device type needs to be loaded by NSO. A NED contains the YANG model for the device and corresponding driver code to talk CLI, REST, SNMP, etc. NEDs are distributed as packages.

The CLI command in the above example (Installed Packages) shows all the loaded packages. NSO loads packages at startup and can reload packages at run-time. By default, the packages reside in the packages directory in the NSO run-time directory.

Starting the NSO Daemon

Once you have access to the network information for a managed device, its IP address and authentication information, as well as the data models of the device, you can actually manage the device from NSO.

You start the ncs daemon in a terminal like:

Which is the same as, NSO loads it config from a ncs.conf file

During development, it is sometimes convenient to run ncs in the foreground as:

Once the daemon is running, you can issue the command:

To get more information about options to ncs do:

The ncs --status command produces a lengthy list describing for example which YANG modules are loaded in the system. This is a valuable debug tool.

The same information is also available in the NSO CLI (and thus through all available northbound interfaces, including Maapi for Java programmers)

Synchronizing Devices

When the NSO daemon is running and has been initialized with IP/Port and authentication information as well as imported all modules you can start to manage devices through NSO.

NSO provides the ability to synchronize the configuration to or from the device. If you know that the device has the correct configuration you can choose to synchronize from a managed device whereas if you know NSO has the correct device configuration and the device is incorrect, you can choose to synchronize from NSO to the device.

In the normal case, the configuration on the device and the copy of the configuration inside NSO should be identical.

In a cold start situation like in the mpls-vpn example, where NSO is empty and there are network devices to talk to, it makes sense to synchronize from the devices. You can choose to synchronize from one device at a time or from all devices at once. Here is a CLI session to illustrate this.

The command devices sync-from, in example (Synchronize from Devices), is an action that is defined in the NSO data model. It is important to understand the model-driven nature of NSO. All devices are modeled in YANG, network services like MPLS VPN are also modeled in YANG, and the same is true for NSO itself. Anything that can be performed over the NSO CLI or any north-bound is defined in the YANG files. The NSO YANG files are located here:

All packages comes with YANG files as well. For example the directory packages/cisco-ios/src/yang/ contains the YANG definition of an IOS device.

The tailf-ncs.yang is the main part of the NSO YANG data model. The file mode tailf-ncs.yang includes all parts of the model from different files.

The actions sync-from and sync-to are modeled in the file tailf-ncs-devices.yang. The sync action(s) are defined as:

Synchronizing from NSO to the device is common when a device has been configured out-of-band. NSO has no means to enforce that devices are not directly reconfigured behind the scenes of NSO; however, once an out-of-band configuration has been performed, NSO can detect the fact. When this happens it may (or may not, depending on the situation at hand) make sense to synchronize from NSO to the device, i.e. undo the rogue reconfigurations.

The command to do that is:

A dry-run option is available for the action sync-to.

This makes it possible to investigate the changes before they are transmitted to the devices.

Partial `sync-from`

Configuring Devices

It is now possible to configure several devices through the NSO inside the same network transaction. To illustrate this, start the NSO CLI from a terminal application.

The example above (Configure Devices) illustrates a multi-host transaction. In the same transaction, three hosts were re-configured. Had one of them failed, or been non-operational, the transaction as a whole would have failed.

As seen from the output of the command commit dry-run outformat native, NSO generates the native CLI and NETCONF commands which will be sent to each device when the transaction is committed.

Since the /devices/device/config path contains different models depending on the augmented device model NSO uses the data model prefix in the CLI names; ios, cisco-ios-xr and junos. Different data models might use the same name for elements and the prefix avoids name clashes.

NSO uses different underlying techniques to implement the atomic transactional behavior in case of any error. NETCONF devices are straightforward using confirmed commit. For CLI devices like IOS NSO calculates the reverse diff to restore the configuration to the state before the transaction was applied.

Connection Management

Each managed device needs to be configured with the IP address and the port where the CLI, NETCONF server, etc. of the managed device listens for incoming requests.

Connections are established on demand as they are needed. It is possible to explicitly establish connections, but that functionality is mostly there for troubleshooting connection establishment. We can, for example, do:

We were able to connect to all managed devices. It is also possible to explicitly attempt to test connections to individual managed devices:

Three configuration parameters can be used to control the connection establishment: connect-timeout, read-timeout, and write-timeout. In the NSO data model file tailf-ncs-devices.yang, these timeouts are modeled as:

Thus, to change these parameters (globally for all managed devices) you do:

Or, to use a profile:

Authentication Groups

When NSO connects to a managed device, it requires authentication information for that device. The authgroups are modeled in the NSO data model:

Each managed device must refer to a named authgroup. The purpose of an authentication group is to map local users to remote users together with the relevant SSH authentication information.

Southbound authentication can be done in two ways. One is to configure the stored user and credential components as shown in the example below (Configured authgroup) and the next example (authgroup default-map). The other way is to configure a callback to retrieve user and credentials on demand as shown in the example below (authgroup-callback).

In the example above (Configured authgroup) in the auth group named default, the two local users oper and admin shall use the remote users' name oper and admin respectively with identical passwords.

Inside an authgroup, all local users need to be enumerated. Each local user name must have credentials configured which should be used for the remote host. In centralized AAA environments, this is usually a bad strategy. You may also choose to instantiate a default-map. If you do that it probably only makes sense to specify the same user name/password pair should be used remotely as the pair that was used to log into NSO.

In the example (Configured authgroup), only two users admin and oper were configured. If the default-map in example (authgroup default-map) is configured, all local users not found in the umap list will end up in the default-map. For example, if the user rocky logs in to NSO with the password secret. Since NSO has a built-in SSH server and also a built-in HTTPS server, NSO will be able to pick up the clear text passwords and can then reuse the same password when NSO attempts to establish southbound SSH connections. The user rocky will end up in the default-map and when NSO attempts to propagate rocky's changes towards the managed devices, NSO will use the remote user name rocky with whatever password rocky used to log into NSO.

Authenticating southbound using stored configuration has two main components to define remote user and remote credentials. This is defined by the authgroup. As for the southbound user, there exist two options, the same user logged in to NSO or another user, as specified in the authgroup. As for the credentials, there are three options.

Regular password.
Finally, an interesting option is to use the 'same-pass' option. Since NSO runs its own SSH server and its own SSL server, NSO can pick up the password of a user in clear text. Hence, if the 'same-pass' option is chosen for an authgroup, NSO will reuse the same password when attempting to connect southbound to a managed device.

Connecting Using SSH Keyboard-Interactive (Multi-Factor) Authentication

NSO can connect to a device that is using multi-factor authentication. For this, the authgroup must be configured with an executable for handling the keyboard-interactive part, and optionally some opaque data that is passed to the executable. ie., the /devices/authgroups/group/umap/mfa/executable and /devices/authgroups/group/umap/mfa/opaque (or under default-map for users that are not in umap) must be configured.

The prompts from the SSH server (including the password prompt and any additional challenge prompts) are passed to the stdin of the executable along with some other relevant data. The executable must write a single line to its stdout as the reply to the prompt. This is the reply that NSO sends to the SSH server.

For example, with the above configured for the authgroup, if the user admin is trying to login to the device dev0 with password admin, this is the line that is sent to the stdin of the handle_mfa.py script:

The input to the script is the device, username, password, opaque data, as well as the name, instruction, and prompt from the SSH server. All these fields are base64 encoded, and separated by a semi-colon (';'). So, the above line in effect encodes the following:

A small Python program can be used to implement the keyboard-interactive authentication towards a device, such as:

This script will then be invoked with the above fields for every prompt from the server, and the corresponding output from the script will be sent as the reply to the server.

Using a Callback to Provide Device Credentials

In the case of authenticating southbound using a callback, remote user and remote credentials are obtained by an action invocation. The action is defined by the callback-node and action-name as in the example below (authgroup-callback) and supported credentials are remote password and optionally a secondary password for the provided local user, authgroup, and device.

Caveats

Authentication groups and the functionality they bring come with some limitations on where and how it is used.

The callback option that enables authgroup-callback feature is not applicable for members of snmp-group list.
Generic devices that implement their own authentication scheme do not use any mapping or callback functionality provided by Authgroups.
Cluster nodes use their own authgroups and mapping model, thus functionality differs, e.g. callback option is not applicable.

Device Session Pooling

Opening a session towards a managed device is potentially time and resource-consuming. Also, the probability that a recently accessed device is still subject to further requests is reasonably high. These are motives for having a managed devices session pool in NSO.

The NSO device session pool is by default active and normally needs no maintenance. However, under certain circumstances, it might be of interest to modify its behavior. Examples can be when some device type has characteristics that make session pooling undesired, or when connections to a specific device are very costly, and therefore the time that open sessions can stay in the pool should increase.

Changes from the default configuration of the NSO session pool should only be performed when absolutely necessary and when all effects of the change are understood.

NSO presents operational data that represent the current state of the session pool. To visualize this, we use the CLI to connect to NSO and force connection to all known devices:

We can now list all open sessions in the session-pool. But note that this is a live pool. Sessions will only remain open for a certain amount of time, the idle time.

In addition to the idle time for sessions, we can also see the type of device, current number of pooled sessions, and maximum number of pooled sessions.

We can close pooled sessions for specific devices.

And we can close all pooled sessions in the session pool.

The session pool configuration is found in the tailf-ncs-devices.yang submodel. The following part of the YANG device-profile-parameters grouping controls how the session pool is configured:

This grouping can be found in the NSO model under /ncs:devices/global-settings/session-pool, /ncs:devices/profiles/profile/session-pool and /ncs:devices/device/session-pool to be able to control session pooling for all devices, a group of devices, and a specific device respectively.

In addition under /ncs:devices/global-settings/session-pool/default it is possible to control the global max size of the session pool, as defined by the following yang snippet:

Let's illustrate the possibilities with an example configuration of the session pool:

In the above configuration, the default idle time is set to 100 seconds for all devices. A device profile called small is defined which contains a max-session value of 3 sessions, this profile is set on all ce* devices. The devices pe0 has a max-sessions 0 which implies that this device cannot be pooled. Let's connect all devices and see what happens in the session pool:

Now, we set an upper limit to the maximum number of sessions in the pool. Setting the value to 4 is too small for a real situation but serves the purpose of illustration:

The number of open sessions in the pool will be adjusted accordingly:

Device Session Limits

Some devices only allow a small number of concurrent sessions, in the extreme case it only allows one (for example through a terminal server). For this reason, NSO can limit the number of concurrent sessions to a device and make operations wait if the maximum number of sessions has been reached.

In other situations, we need to limit the number of concurrent connect attempts made by NSO. For example, the devices managed by NSO talk to the same server for authentication which can only handle a limited number of connections at a time.

The configuration for session limits is found in the tailf-ncs-devices.yang submodel. The following part of the YANG device-profile-parameters grouping controls how the session limits are configured:

This grouping can be found in the NSO model under /ncs:devices/global-settings/session-limits, /ncs:devices/profiles/profile/session-limits and /ncs:devices/device/session-limits to be able to control session limits for all devices, a group of devices, and a specific device respectively.

In addition, under /ncs:devices/global-settings/session-limits, it is possible to control the number of concurrent connect attempts allowed and the maximum time to wait for a device to be available to connect.

Tracing Device Communication

It is possible to turn on and off NED traffic tracing. This is often a good way to troubleshoot problems. To understand the trace output, a basic prerequisite is a good understanding of the native device interface. For NETCONF devices, an understanding of NETCONF RPC is a prerequisite. Similarly for CLI NEDs, a good understanding of the CLI capabilities of the managed devices is required.

To turn on southbound traffic tracing, we need to enable the feature and we must also configure a directory where we want the trace output to be written. It is possible to have the trace output in two different formats, pretty and raw. The format of the data depends on the type of the managed device. For NETCONF devices, the pretty mode indents all the XML data for enhanced readability and the raw mode does not. Sometimes when the XML is broken, raw mode is required to see all the data received. Tracing in raw mode will also signal to the corresponding NED to log more verbose tracing information.

To enable tracing, do:

The trace setting only affects new NED connections, so to ensure that we get any tracing data, we can do:

The above command terminates all existing connections.

At this point, if you execute a transaction towards one or several devices and then view the trace data.

It is possible to clear all existing trace files through the command

Finally, it is worth mentioning the trace functionality does not come for free. It is fairly costly to have the trace turned on. Also, there exists no trace log wrapping functionality.

Checking Device Configuration

When managing large networks with NSO a good strategy is to consider the NSO copy of the network configuration to be the main primary copy. All device configuration changes must go through NSO and all other device re-configurations are considered rogue.

NSO does not contain any functionality which disallows rogue re-configurations of managed devices, however, it does contain a mechanism whereby it is a very cheap operation to discover if one or several devices have been configured out-of-band.

The underlying mechanism for cheap check-sync is to compare time stamps, transaction IDs, hash-sums, etc., depending on what the device supports. This is in order not to have to read the full configuration to check if the NSO copy is in sync.

The transaction IDs are stored in CDB and can be viewed as:

Some of the devices do not have a transaction ID, this is the case where the NED has not implemented the cheap check-sync mechanism. Although it is called transaction-id, the underlying value in the device can be anything to detect a config change, like for example a time-stamp.

To check for consistency, we execute:

Alternatively for all (or a subset) managed devices:

The following YANG grouping is used for the return value from the check-sync command:

Comparing Device Configurations

In the previous section, we described how we can easily check if a managed device is in sync. If the device is not in sync, we are interested to know what the difference is. The CLI sequence below shows how to modify ce0 out-of-band using the ncs-netsim tool. Finally, the sequence shows how to do an explicit configuration comparison.

The diff in the above output should be interpreted as: what needs to be done in NSO to become in sync with the device.

Previously in the example (Synchronize from Devices), NSO was brought in sync with the devices by fetching configuration from the devices. In this case, where the device has a rogue re-configuration, NSO has the correct configuration. In such cases, you want to reset the device configuration to what is stored inside NSO.

When you decide to reset the configuration with the copy kept in NSO use the option dry-run in conjunction with sync-to and inspect what will be sent to the device:

As this is the desired data to send to the device a sync-to can now safely be performed.

The device configuration should now be in sync with the copy in NSO and compare-config ought to yield an empty output:

Initialize Device

There exist several ways to initialize new devices. The two common ways are to initialize a device from another existing device or to use device templates.

From Other

For example, another CE router has been added to our example network. You want to base the configuration of that host on the configuration of the managed device ce0 which has a valid configuration:

If the configuration is accurate you can create a new managed device based on that configuration as:

In the example above (Instantiate Device from Other) the commands first create the new managed device, ce9 and then populates the configuration of the new device based on the configuration of ce0.

This new configuration might not be entirely correct, you can modify any configuration before committing it.

The above concludes the instantiation of a new managed device. The new device configuration is committed and NSO returned OK without the device existing in the network (netsim). Try to force a sync to the device:

The device is southbound locked, this is a mode that is used where you can reconfigure a device, but any changes done to it are never sent to the managed device. This will be thoroughly described in the next section. Devices are by default created southbound locked. Default values are not shown if not explicitly requested:

By Template

Another alternative to instantiating a device from the actual working configuration of another device is to have a number of named device templates that manipulate the configuration.

The template tree looks like this:

The tree for device templates is generated from all device YANG models. All constraints are removed and the data type of all leafs is changed to string.

A device template is created by setting the desired data in the configuration. The created device template is stored in NSO CDB.

In the following CLI session, a new device ce10 is created:

Initialize the newly created device ce10 with the device template ce-initialize:

When initializing devices, NSO does not have any knowledge about the capabilities of the device, no connect has been done. This can be overridden by the option accept-empty-capabilities

Inspect the changes made by the template ce-initialize:

Device Templates

Device templates are part of the NSO configuration. Device templates are created and changed in the tree /devices/template/config the same way as any other configuration data and are affected by rollbacks and upgrades. Device templates can only manipulate configuration data in the /devices/device/config tree i.e., only device data.

The $NCS_DIR/examples.ncs/service-provider/mpls-vpn example comes with a pre-populated template for SNMP settings.

The variable $DEVICE is used internally by NSO and can not be used in a template.

Templates can be created like any configuration data and use the CLI tab completion to navigate. Variables can be used instead of hard-coded values. In the template above the community string is a variable. The template can cover several device types/NEDs, by making use of the namespace information. This will make sure that only devices modeled with this particular namespace will be affected by this part of the template. Hence, it is possible for one template to handle a multitude of devices from various manufacturers.

Applying the snmp1 template, providing a value for the COMMUNITY template variable:

The result of applying the template:

Debug

By adding the CLI pipe flag debug template when applying a template, the CLI will output detailed information on what is happening when the template is being applied:

Renaming Devices in NSO

The usual way to rename an instance in a list is to delete it and create a new instance. Aside from having to explicitly create all its children, an obvious problem with this method is the dependencies - if there is a leafref that refers to this instance, this method of deleting and recreating will fail unless the leafref is also explicitly reset to the value of the new instance.

The /devices/device/rename action renames an existing device and fixes the node/data dependencies in CDB. When renaming a device, the action fixes the following dependencies:

Leafrefs and instance-identifiers (both config true and config false).
Monitor and kick-node of kickers, if they refer to this device.
Diff-sets and forward-diff-sets of services that touch this device (This includes nano-services and also zombies).

NSO maintains a history of past renames at /devices/device/rename-history.

Examples

The rename action takes a device lock to prevent modifications to the device while renaming it. Depending on the input parameters, the action will either immediately fail if it cannot get the device lock, or wait wait a specified amount of seconds before timing out.

The parameter no-wait-for-lock makes the action fail immediately if the device lock is unavailable, while a timeout of infinity can be used to make it wait indefinitely for the lock.

Limitations

If a nano-service has components whose names are derived from the device name, and that device is renamed, the corresponding service components in its plan are not automatically renamed.

For example, let's say the nano-service has components with names matching device names.

If this device is renamed, the corresponding nano-service component is not renamed.

To handle this, the component with the old name must be force-back-tracked and the service re-deployed.

When a device is renamed, all components that derive their name from that device's name in all the service instances must be force-back-tracked.

Auto-configuring Devices in NSO

Provisioning new devices in NSO requires the user to be familiar with the concept of Network Element Drivers and the unique ned-id they use to distinguish their schema. For an end user interacting with a northbound client of NSO, the concept of a ned-id might feel too abstract. It could be challenging to know what device type and ned-id to select when configuring a device for the first time in NSO. After initial configuration, there are also additional steps required before the device can be operated from NSO.

NSO can auto-configure devices during initial provisioning. Under /devices/device/auto-configure, a user can specify either the ned-id explicitly or a combination of the device vendor and product-family or operating-system. These are meta-data specified in the package-meta-data.xml file in the NED package. Based on the combination of this meta-data or using the ned-id explicitly configured, a ned-id from a matching NED package is selected from the currently loaded packages. If multiple packages match the given combination, the package with the latest version is selected. In the same transaction, NSO also fetches the host keys if required, and synchronizes the configuration from the device, making it ready to operate in a single step.

Examples

NSO will auto-configure a new device in a transaction if either /devices/device/auto-configure/vendor or /devices/device/auto-configure/ned-id is set in that transaction.

One can configure either vendor and product-family, or vendor and operating-system or just the ned-id explicitly.

The admin-state for the device, if configured, will be honored. I.e., while auto-configuring a new device, if the admin-state is set to be southbound-locked, NSO will only pick the ned-id automatically. NSO will not fetch host keys and synchronize config from the device.

Many NEDs require additional custom configuration to be operational. This applied in particular to Generic NEDs. Information about such additional configuration can be found in the files README.md and README-ned-settings.md bundled with the NED package.

`oper-state` and `admin-state`

NSO differentiates between oper-state and admin-state for a managed device. oper-state is the actual state of the device. We have chosen to implement a very simple oper-state model. A managed device oper-state is either enabled or disabled. oper-state can be mapped to an alarm for the device. If the device is disabled, we may have additional error information. For example, the ce9 device created from another device and ce10 created with a device template in the previous section is disabled, and no connection has been established with the device, so its state is completely unknown:

Or, a slightly more interesting CLI usage:

If you manually stop a managed device, for example ce0, NSO doesn't immediately indicate that. NSO may have an active SSH connection to the device, but the device may voluntarily choose to close its end of that (idle) SSH connection. Thus the fact that a socket from the device to NSO is closed by the managed device doesn't indicate anything. The only certain method NSO has to decide a managed device is non-operational - from the point of view of NSO - is NSO cannot SSH connect to it. If you manually stop managed device ce0, you still have:

NSO cannot draw any conclusions from the fact that a managed device closed its end of the SSH connection. It may have done so because it decided to time out an idle SSH connection. Whereas if NSO tried to initiate any operations towards the dead device, the device would be marked as oper-state disabled:

Now, NSO has failed to connect to it, NSO knows that ce0 is dead:

This concludes the oper-state discussion. The next state to be illustrated is the admin-state. The admin-state is what the operator configures, this is the desired state of the managed device.

In tailf-ncs.yang we have the following configuration definition for admin-state:

In the example above (tailf-ncs-devices.yang - admin-state), you can see the four different admin states for a managed device as defined in the YANG model.

locked - This means that all changes to the device are forbidden. Any transaction which attempts to manipulate the configuration of the device will fail. It is still possible to read the configuration of the device.
unlocked -This is the state a device is set into when the device is operational. All changes to the device are attempted to be sent southbound.
southbound-locked - This is the default value. It means that it is possible to manipulate the configuration of the device but changes done to the device configuration are never pushed to the device. This mode is useful during e.g. pre-provisioning, or when we instantiate new devices.
config-locked - This means that any transaction which attempts to manipulate the configuration of the device will fail. It is still possible to read the configuration of the device and send live-status commands or RPCs.

Configuration Source

NSO manages a set of devices that are given to NSO through any means like CLI, inventory system integration through XML APIs, or configuration files at startup. The list of devices to manage in an overall integrated network management solution is shared between different tools and therefore it is important to keep an authoritative database of this and share it between different tools including NSO. The purpose of this part is to identify the source of the population of managed devices. The source attribute should indicate the source of the managed device like "inventory", "manual", or "EMS".

These attributes should be automatically set by the integration towards the inventory source, rather than manipulated manually.

added-by-user: Identify the user who loaded the managed device.
context: In what context was the device loaded.
when: When the device was added to NSO.
from-ip: From which IP the load activity was run.
source: Identify the source of the managed device such as the inventory system name or the name of the source file.

Capabilities, Modules, and Revision Management

The NETCONF protocol mandates that the first thing both the server and the client have to do is to send its list of NETCONF capabilities in the <hello> message. A capability indicates what the peer can do. For example the validate:1.0 indicates that the server can validate a proposed configuration change, whereas the capability http://acme.com/if indicates the device implements the http://acme.com proprietary capability.

The NEDs report the capabilities of the devices at connection time. The NEDs also load the YANG modules for NSO. For a NETCONF/YANG device, all this is straightforward, for non-NETCONF devices the NEDs do the translation.

The capabilities announced by a device also contain the YANG version 1 modules supported. In addition to this, YANG version 1.1 modules are advertised in the YANG library module on the device. NSO checks both the capabilities and the YANG library to find out which YANG modules a device supports.

The capabilities and modules detected by NSO are available in two different lists, /devices/device/capability and devices/device/module. The capability list contains all capabilities announced and all YANG modules in the YANG library. The module list contains all YANG modules announced that are also supported by the NED in NSO.

NSO can be used to handle all or some of the YANG configuration modules for a device. A device may announce several modules through its capability list which NSO ignores. NSO will only handle the YANG modules for a device which are loaded (and compiled through ncsc --ncs-compile-bundle) or ncsc --ncs-compile-module) all other modules for the device are ignored. If you require a situation where NSO is entirely responsible for a device so that complete device backup/configurations are stored in NSO you must ensure NSO indeed has support for all modules for the device. It is not possible to automate this process since a capability URI doesn't necessarily indicate actual configuration.

Discovery of a NETCONF Device

When a device is added to NSO its NED ID must be set. For a NETCONF device, it is possible to configure the generic NETCONF NED id netconf (defined in the YANG module tailf-ncs-ned). If this NED ID is configured, we can then ask NSO to connect to the device and then check the capability list to see which modules this device implements.

We can also check which modules the loaded NEDs support. Then we can pick the most suitable NED and configure the device with this NED ID.

Configuration Datastore Support

NSO works best if the managed devices support the NETCONF candidate configuration datastore. However, NSO reads the capabilities of each managed device and executes different sequences of NETCONF commands towards different types of devices.

For implementations of the NETCONF protocol that do not support the candidate datastore, and in particular, devices that do not support NETCONF commit with a timeout, NSO tries to do the best of the situation.

NSO divides devices into the following groups.

start_trans_running: This mode is used for devices that support the Tail-f proprietary transaction extension defined by http://tail-f.com/ns/netconf/transactions/1.0. Read more on this in the Tail-f ConfD user guide. In principle it's a means to - over the NETCONF interface - control transaction processing towards the running data store. This may be more efficient than going through the candidate data store. The downside is that it is Tail-f proprietary non-standardized technology.
lock_candidate: This mode is used for devices that support the candidate data store but disallow direct writes to the running data store.
lock_reset_candidate: This mode is used for devices that support the candidate data and also allow direct writes to the running data store. This is the default mode for Tail-f ConfD NETCONF server. Since the running data store is configurable, we must, before each configuration attempt, copy all of the running to the candidate. (ConfD has optimized this particular usage pattern, so this is a very cheap operation for ConfD)
startup: This mode is used for devices that have writable running, no candidate but do support the startup data store. This is the typical mode for Cisco-like devices.
running-only: This mode is used for devices that only support writable running.
NED: The transaction is controlled by a Network Element Driver. The exact transaction mode depends on the type of the NED.

Which category NSO chooses for a managed device depends on which NETCONF capabilities the device sends to NSO in its NETCONF hello message. You can see in the CLI what NSO has decided for a device as in:

NSO talking to ConfD device running in its standard configuration, thus lock-reset-candidate.

Another important discriminator between managed devices is whether they support the confirmed commit with a timeout capability, i.e., the confirmed-commit:1.0 standard NETCONF capability. If a device supports this capability, NSO utilizes it. This is the case with for example Juniper routers.

If a managed device does not support this capability, NSO attempts to do the best it can.

This is how NSO handles common failure scenarios:

The operator aborts the transaction, or the NSO loses the SSH connection to another managed device which is also participating in the same network transaction. If the device does support the confirmed-commit capability, NSO aborts the outstanding yet-uncommitted transaction simply by closing the SSH connection. When the device does not support the confirmed-commit capability, NSO has the reverse diff and simply sends the precise undo information to the device instead.
The device rejects the transaction in the first place, i.e. the NSO attempts to modify its running data store. This is an easy case since NSO then simply aborts the transaction as a whole in the initial commit confirmed [time] attempt.
NSO loses SSH connectivity to the device during the timeout period. This is a real error case and the configuration is now in an unknown state. NSO will abort the entire transaction, but the configuration of the failing managed device is now probably in error. The correct procedure once network connectivity has been restored to the device is to sync it in the direction from NSO to the device. The NSO copy of the device configuration will be what was configured before the failed transaction.

Thus, even if not all participating devices have first-class NETCONF server implementations, NSO will attempt to fake the confirmed-commit capability.

Action Proxy

When the managed device defines top-level NETCONF RPCs or alternatively, define tailf:action points inside the YANG model, these RPCs and actions are also imported into the data model that resides in NSO.

For example, the Juniper NED comes with a set of JunOS RPCs defined in: $NCS_DIR/packages/neds/juniper-junos/src/yang/junos-rpc.yang

Thus, since all RPCs and actions from the devices are accessible through the NSO data model, these actions are also accessible through all NSO northbound APIs, REST, JAVA MAAPI, etc. Hence it is possible to - from user scripts/code - invoke actions and RPCs on all managed devices. The RPCs are augmented below an RPC container:

In the simulated environment of the mpls-vpn example, these RPCs might not have been implemented.

Device Groups

The NSO device manager has a concept of groups of devices. A group is nothing more than a named group of devices. What makes this interesting is that we can invoke several different actions in the group, thus implicitly invoking the action on all members in the group. This is especially interesting for the apply-template action.

The definition of device groups resides at the same layer in the NSO data model as the device list, thus we have:

The MPLS VPN example comes with a couple of pre-defined device-groups:

Device groups are created like below:

Device groups can reference other device groups. There is an operational attribute that flattens all members in the group. The CLI sequence below adds the PE group to my-group. Then it shows the configuration of that group followed by the status of this group. The status for the group contains a members attribute that lists all device members.

Once you have a group, you can sync and check-sync the entire group.

However, what makes device groups really interesting is the ability to apply a template to a group. You can use the pre-populated templates to apply SNMP settings to device groups.

Policies

Policies allow you to specify network-wide constraints that always must be true. If someone tries to apply a configuration change over any northbound interface that would be evaluated to false, the configuration change is rejected by NSO. Policies can be of type warning means that it is possible to override them, or error which cannot be overridden.

Assume you would like to enforce all CE routers to have a Gigabit interface 0/1.

As seen in the example above (Policies) , a policy rule has (an optional) for each statement and a mandatory expression and error message. The foreach statement evaluates to a node set, and the expression is then evaluated on each node. So in this example, the expression would be evaluated for every device in NSO which begins with ce. The name variable in the warning message refers to a leaf available from the for-each node set.

Validation is always performed at commit but can also be requested interactively.

Note any configuration can be activated or deactivated. This means that to temporarily turn off a certain policy you can deactivate it. Note also that if the configuration was changed by any other means than NSO by local tools to the device like a CLI, a devices sync-from operation might fail if the device configuration violates the policy.

Commit Queue

One of the strengths of NSO is the concept of network-wide transactions. When you commit data to NSO that spans multiple devices in the /ncs:devices/device tree, NSO will - within the NSO transaction - commit the data on all devices or none, keeping the network consistent with CDB. The NSO transaction doesn't return until all participants have acknowledged the proposed configuration change. The downside of this is that the slowest device in each transaction limits the overall transactional throughput in NSO. Such things as out-of-sync checks, network latency, calculation of changes sent southbound, or device deficiencies all affect the throughput.

Typically when automation software north of NSO generates network change requests it may very well be the case more requests arrive than what can be handled. In NSO deployment scenarios where you wish to have higher transactional throughput than what is possible using network-wide transactions, you can use the commit queue instead. The goal of the commit queue is to increase the transactional throughput of NSO while keeping an eventual consistency view of the database. With the commit queue, NSO will compute the configuration change for each participating device, put it in an outbound queue item, and immediately return. The queue is then independently run.

Another use case where you can use the commit queue is when you wish to push a configuration change to a set of devices and don't care about whether all devices accept the change or not. You do not want the default behavior for transactions which is to reject the transaction as a whole if one or more participating devices fail to process its part of the transaction.

An example of the above could be if you wish to set a new NTP server on all managed devices in our entire network, if one or more devices currently are non-operational, you still want to push out the change. You also want the change automatically pushed to the non-operational devices once they go live again.

The big upside of this scheme is that the transactional throughput through NSO is considerably higher. Also, transient devices are handled better. The downsides are:

If a device rejects the proposed change, NSO and the device are now out of sync until any error recovery is performed. Whenever this happens, an NSO alarm (called commit-through-queue-failed) is generated.
While a transaction remains in the queue, i.e., it has been accepted for delivery by NSO but is not yet delivered, the view of the network in NSO is not (yet) correct. Eventually, though, the queued item will be delivered, thus achieving eventual consistency.

To facilitate the two use cases of the commit queue the outbound queue item can be either in an atomic or non-atomic mode.

In atomic mode the outbound queue item will push all configuration changes concurrently once there are no intersecting devices ahead in the queue. If any device rejects the proposed change, all device configuration changes in the queue item will be rejected as a whole, leaving the network in a consistent state. The atomic mode also allows for automatic error recovery to be performed by NSO.

In the non-atomic mode, the outbound queue item will push configuration changes for a device whenever all occurrences of it are completed or it doesn't exist ahead in the queue. The drawback to this mode is that there is no automatic error recovery that can be performed by NSO.

In the following sequences, the simulated device ce0 is stopped to illustrate the commit queue. This can be achieved by the following sequence including returning to the NSO CLI config mode:

By default, the commit queue is turned off. You can configure NSO to run a transaction, device, or device group through the commit queue in a number of different ways, either by providing a flag to the commit command as:

Or, by configuring NSO to always run all transactions through the commit queue as in:

Or, by configuring a number of devices to run through the commit queue as default:

When enabling the commit queue as default on a per device/device group basis, an NSO transaction will compute the configuration change for each participating device, put the devices enabled for the commit queue in the outbound queue, and then proceed with the normal transaction behavior for those devices not commit queue enabled. The transaction will still be successfully committed even if some of the devices added to the outbound queue will fail. If the transaction fails in the validation phase the entire transaction will be aborted, including the configuration change for those devices added to the commit queue. If the transaction fails after the validation phase, the configuration change for the devices in the commit queue will still be delivered.

Do some changes and commit through the commit queue:

Commit Queue Scheduling

In the example above (Commit through Commit Queue), the commit affected three devices, ce0, ce1 and ce2. If you immediately would have launched yet another transaction, as in the second one (see example below), manipulating an interface of ce2, that transaction would have been queued instead of immediately launched. The idea here is to queue entire transactions that touch any device that has anything queued ahead in the queue.

Each transaction committed through the queues becomes a queue item. A queue item has an ID number. A bigger number means that it's scheduled later. Each queue item waits for something to happen. A queue item is in either of three states.

waiting: The queue item is waiting for other queue items to finish. This is because the waiting queue item has participating devices that are part of other queue items, ahead in the queue. It is waiting for a set of devices, to not occur ahead of itself in the queue.
executing: The queue item is currently being processed. Multiple queue items can run currently as long as they don't share any managed devices. Transient errors might be present. These errors occur when NSO fails to communicate with some of the devices. The errors are shown in the leaf-list transient-errors. Retries will take place at intervals specified in /ncs:devices/global-settings/commit-queue/retry-timeout. Examples of transient errors are connection failures and that the changes are rejected due to the device being locked. Transient errors are potentially bad since the queue might grow if new items are added, waiting for the same device.
locked: This queue item is locked and will not be processed until it has been unlocked, see the action /ncs:devices/commit-queue/queue-item/unlock. A locked queue item will block all subsequent queue items that are using any device in the locked queue item.

Viewing and Manipulating the Commit Queue

You can view the queue in the CLI. There are three different view modes, summary, normal, and detailed. Depending on the output, both the summary and the normal look good:

The age field indicated how many seconds a queue item has been in the queue.

You can also view the queue items in detailed mode:

The queue items are stored persistently, thus if NSO is stopped and restarted, the queue remains the same. Similarly, if NSO runs in HA (High Availability) mode, the queue items are replicated, ensuring the queue is processed even in case of failover.

A number of useful actions are available to manipulate the queue:

devices commit-queue add-lock device [ ... ]. This adds a fictive queue item to the commit queue. Any queue item, affecting the same devices, which is entering the commit queue will have to wait for this lock item to be unlocked or deleted. If no devices are specified, all devices in NSO are locked.
devices commit-queue clear. This action clears the entire queue. All devices present in the commit queue will, after this action, have executed be out of sync. The clear action is a rather blunt tool and is not recommended to be used in any normal use case.
devices commit-queue prune device [ ... ] . This action prunes all specified devices from all queue items in the commit queue. The affected devices will, after this action has been executed, be out of sync. Devices that are currently being committed to will not be pruned unless the force option is used. Atomic queue items will not be affected, unless all devices in it are pruned. The force option will brutally kill an ongoing commit. This could leave the device in a bad state. It is not recommended in any normal use case.
devices commit-queue set-atomic-behaviour atomic [ true,false ]. This action sets the atomic behavior of all queue items. If these are set to false, the devices contained in these queue items can start executing if the same devices in other non-atomic queue items ahead of it in the queue are completed. If set to true, the atomic integrity of these queue items is preserved.
devices commit-queue wait-until-empty. This action waits until the commit queue is empty. The default is to wait infinity. A timeout can be specified to wait for a number of seconds. The result is empty if the queue is empty or timeout if there are still items in the queue to be processed.
devices commit-queue queue-item [ id ] lock. This action puts a lock on an existing queue item. A locked queue item will not start executing until it has been unlocked.
devices commit-queue queue-item [ id ] unlock. This action unlocks a locked queue item. Unlocking a queue item that is not locked is silently ignored.
devices commit-queue queue-item [ id ] delete. This action deletes a queue item from the queue. If other queue items are waiting for this (deleted) item, they will all automatically start to run. The devices of the deleted queue item will, after the action has been executed, be out of sync if they haven't started executing. Any error option set for the queue item will also be disregarded. The force option will brutally kill an ongoing commit. This could leave the device in a bad state. It is not recommended in any normal use case.
devices commit-queue queue-item [ id ] prune device [ ... ]. This action prunes the specified devices from the queue item. Devices that are currently being committed to will not be pruned unless the force option is used. Atomic queue items will not be affected, unless all devices in it are pruned. The force option will brutally kill an ongoing commit. This could leave the device in a bad state. It is not recommended in any normal use case.
devices commit-queue queue-item [ id ] set-atomic-behaviour atomic [ true,false ]. This action sets the atomic behavior of this queue item. If this is set to false, the devices contained in this queue item can start executing if the same devices in other non-atomic queue items ahead of it in the queue are completed. If set to true, the atomic integrity of the queue item is preserved.
devices commit-queue queue-item [ id ] wait-until-completed. This action waits until the queue item is completed. The default is to wait infinity. A timeout can be specified to wait for a number of seconds. The result is completed if the queue item is completed or timeout if the timer expired before the queue item was completed.
devices commit-queue queue-item [ id ] retry. This action retries devices with transient errors instead of waiting for the automatic retry attempt. The device option will let you specify the devices to retry.

A typical use scenario is where one or more devices are not operational. In the example above (Viewing Queue Items), there are two queue items, waiting for the device ce0 to come alive. ce0 is listed as a transient error, and this is blocking the entire queue. Whenever a queue item is blocked because another item ahead of it cannot connect to a specific managed device, an alarm is generated:

Block other affecting device ce0 from entering the commit queue:
Now queue item 9577950918 is blocking other items using ce0 from entering the queue.
Prune the usage of the device ce0 from all queue items in the commit queue:
The lock will be in the queue until it has been deleted or unlocked. Queue items affecting other devices are still allowed to enter the queue.
Fix the problem with the device ce0, remove the lock item and sync from the device:

Commit Queue in a Cluster Environment

In an LSA cluster, each remote NSO has its own commit queue. When committing through the commit queue on the upper node NSO will automatically create queue items on the lower nodes where the devices in the transaction reside. The progress of the lower node queue items is monitored through a queue item on the upper node. The remote NSO is treated as a device in the queue item and the remote queue items and devices are opaque to the user of the upper node.

Generally, it is not recommended to interfere with the queue items of the lower nodes that have been created by an upper NSO. This can cause the upper queue item to not synchronize with the lower ones correctly.

Configuring Commit Queue in a Cluster Environment

To be able to track the commit queue on the lower cluster nodes, NSO uses the built-in stream ncs-events that generates northbound notifications for internal events. This stream is required if running the commit queue in a clustered scenario. It is enabled in ncs.conf:

In addition, the commit queue needs to be enabled in the cluster configuration.

Error Recovery with Commit Queue

The goal of the commit queue is to increase the transactional throughput of NSO while keeping an eventual consistency view of the database. This means no matter if changes committed through the commit queue originate as pure device changes or as the effect of service manipulations the effects on the network should eventually be the same as if performed without a commit queue no matter if they succeed or not. This should apply to a single NSO node as well as NSO nodes in an LSA cluster.

Depending on the selected error-option NSO will store the reverse of the original transaction to be able to undo the transaction changes and get back to the previous state. This data is stored in the /ncs:devices/commit-queue/completed tree from where it can be viewed and invoked with the rollback action. When invoked the data will be removed.

The error option can be configured under /ncs:devices/global-settings/commit-queue/error-option. Possible values are: continue-on-error, rollback-on-error, and stop-on-error. The continue-on-error value means that the commit queue will continue on errors. No rollback data will be created. The rollback-on-error value means that the commit queue item will roll back on errors. The commit queue will place a lock on the failed queue item, thus blocking other queue items with overlapping devices from being executed. The rollback action will then automatically be invoked when the queue item has finished its execution. The lock will be removed as part of the rollback. The stop-on-error means that the commit queue will place a lock on the failed queue item, thus blocking other queue items with overlapping devices from being executed. The lock must then either manually be released when the error is fixed or the rollback action under /devices/commit-queue/completed be invoked. The rollback action is as:

The error option can also be given as a commit parameter.

To guarantee service integrity NSO checks for overlapping service or device modifications against the items in the commit queue and returns an error if such exists. If a service instance does a shared set on the same data as a service instance in the queue actually changed, the reference count will be increased but no actual change is pushed to the device(s). This will give a false positive that the change is actually deployed in the network. The rollback-on-error and stop-on-error error options will automatically create a queue lock on the involved services and devices to prevent such a case.

In a clustered environment, different parts of the resulting configuration change set will end up on different lower nodes. This means on some nodes the queue item could succeed and on others, it could not.

The error option in a cluster environment will originate on the upper node. The reverse of the original transaction will be committed on this node and propagated through the cluster down to the lower nodes. The net effect of this is the state of the network will be the same as before the original change.

As the error option in a cluster environment will originate on the upper node, any configuration on the lower nodes will be meaningless.

When NSO is recovering from a failed commit, the rollback data of the failed queue items in the cluster is applied and committed through the commit queue. In the rollback, the no-networking flag will be set on the commits towards the failed lower nodes or devices to get CDB consistent with the network. Towards the successful nodes or devices, the commit is done as before. This is what the rollback action in /ncs:devices/commit-queue/completed/queue-item does.

TR1; service s1 creates ce0:a and ce1:b. The nodes a and b are created in CDB. In the changes of the queue item, CQ1, a and b are created.
TR2; service s2 creates ce1:c and ce2:d. The nodes c and d are created in CDB. In the changes of the queue item, CQ2, c, and d are created.
The queue item from TR1, CQ1, starts to execute. The node a cannot be created on the device. The node b was created on the device but that change is reverted as a failed to be created.

The reverse of TR1, the rollback of CQ1, TR3, is committed.
TR3; service s1 is applied with the old parameters. Thus the effect of TR1 is reverted. Nothing needs to be pushed towards the network, so no queue item is created.
TR2; as the queue item from TR2, CQ2, is not the same service instance and has no overlapping data on the ce1 device, this queue item executes as normal.

NSO1:TR1; service s1 dispatches the service to NSO2 and NSO3 through the queue item NSO1:CQ1. In the changes of NSO1:CQ1, NSO2:s1 and NSO3:s1 are created.
NSO1:TR2; service s2 dispatches the service to NSO2 through the queue item NSO1:CQ2. In the changes of NSO1:CQ2, NSO2:s2 is created.
The queue item from NSO2:TR1, NSO2:CQ1, starts to execute. The node a cannot be created on the device. The node b was created on the device, but that change is reverted as a failed to be created.
The queue item from NSO3:TR1, NSO3:CQ1, starts to execute. The changes in the queue item are committed successfully to the network.

The reverse of TR1, rollback of CQ1, TR3, is committed on all nodes part of TR1 that failed.
NSO2:TR3; service s1 is applied with the old parameters. Thus the effect of NSO2:TR1 is reverted. Nothing needs to be pushed towards the network, so no queue item is created.
NSO1:TR3; service s1 is applied with the old parameters. Thus the effect of NSO1:TR1 is reverted. A queue item is created to push the transaction changes to the lower nodes that didn't fail.
NSO3:TR3; service s1 is applied with the old parameters. Thus the effect of NSO3:TR1 is reverted. Since the changes in the queue item NSO3:CQ1 was successfully committed to the network a new queue item NSO3:CQ3 is created to revert those changes.

If for some reason the rollback transaction fails there are, depending on the failure, different techniques to reconcile the services involved:

Make sure that the commit queue is blocked to not interfere with the error recovery procedure. Do a sync-from on the non-completed device(s) and then re-deploy the failed service(s) with the reconcile option to reconcile original data, i.e., take control of that data. This option acknowledges other services controlling the same data. The reference count will indicate how many services control the data. Release any queue lock that was created.
Make sure that the commit queue is blocked to not interfere with the error recovery procedure. Use un-deploy with the no-networking option on the service and then do sync-from on the non-completed device(s). Make sure the error is fixed and then re-deploy the failed service(s) with the reconcile option. Release any queue lock that was created.

Commit Queue Tuning

As the goal of the commit queue is to increase the transactional throughput of NSO it means that we need to calculate the configuration change towards the device(s) outside of the transaction lock. To calculate a configuration change NSO needs a pre-commit running and a running view of the database. The key enabler to support this in the commit queue is to allow different views of the database to live beyond the commit. In NSO, this is implemented by keeping a snapshot database of the configuration tree for devices and storing configuration changes towards this snapshot database on a per-device basis. The snapshot database is updated when a device in the queue has been processed. This snapshot database is stored on disk for persistence (the S.cdb file in the ncs-cdb directory).

The snapshot database could be populated in two ways. This is controlled by the /ncs-config/cdb/snapshot/pre-populate setting in the ncs.conf file. The parameter controls whether the snapshot database should be pre-populated during the upgrade or not. Switching this on or off implies different trade-offs.

If set to false, NSO is optimized for the default transaction behavior. The snapshot database is populated in a lazy manner (when a device is committed through the commit queue for the first time after an upgrade). The drawback is that this commit will suffer performance-wise, which is especially true for devices with large configurations. Subsequent commits on the same device will not have the same penalty.

If true, NSO is optimized for systems using the commit queue extensively. This will lead to better performance when committing using the commit queue with no additional penalty for first-time commits. The drawbacks are that the time to do upgrades will increase and also an almost twofold increase in NSO memory consumption.

NETCONF Call Home

With NETCONF SSH Call Home, the NETCONF client listens for TCP connection requests from NETCONF servers. The SSH client protocol is started when the connection is accepted. The SSH client validates the server's presented host key with credentials stored in NSO. If no matching host key is found the TCP connection is closed immediately. Otherwise, the SSH connection is established, and NSO is enabled to communicate with the device. The SSH connection is kept open until the device itself terminates the connection, an NSO user disconnects the device, or the idle connection timeout is triggered (configurable in the ncs.conf file).

NSO will generate an asynchronous notification event whenever there is a connection request. An application can subscribe to these events and, for example, add an unknown device to the device tree with the information provided, or invoke actions on the device if it is known.

If an SSH connection is established, any outstanding configuration in the commit queue for the device will be pushed. Any notification stream for the device will also be reconnected.

NETCONF Call Home is enabled and configured under /ncs-config/netconf-call-home in the ncs.conf file. By default NETCONF Call Home is disabled.

A device can be connected through the NETCONF Call Home client only if /devices/device/state/admin-state is set to call-home. This state prevents any southbound communication to the device unless the connection has already been established through the NETCONF Call Home client protocol.

Notifications

The NSO device manager has built-in support for device notifications. Notifications are a means for the managed devices to send structured data asynchronously to the manager. NSO has native support for NETCONF event notifications (see RFC 5277) but could also receive notifications from other protocols implemented by the Network Element Drivers.

Notifications can be utilized in various use-case scenarios. It can be used to populate alarms in the Alarm manager, collect certain types of errors over time, build a network-wide audit log, react to configuration changes, etc.

The basic mode of operation is the manager subscribes to one or more named notification channels which are announced by the managed device. The manager keeps an open SSH channel towards the managed device, and then, the managed device may asynchronously send structured XML data on the SSH channel.

The notification support in NSO is usable as is without any further programming. However, NSO cannot understand any semantics contained inside the received XML messages, thus for example a notification with a content of "Clear Alarm 456" cannot be processed by NSO without any additional programming.

When you add programs to interpret and act upon notifications, make sure that resulting operations are idempotent. This means that they should be able to be called any number of times while guaranteeing that side effects only occur once. The reason for this is that, for example, replaying notifications can sometimes mean that your program will handle the same notifications multiple times.

In the tailf-ncs.yang data model, you find a YANG data model that can be used to:

Setup subscriptions. A subscription is configuration data from the point of view of NSO, thus if NSO is restarted, all configured subscriptions are automatically resumed.
Inspect which named streams a managed device publishes.
View all received notifications.

An Example Session

In this section, we will use the examples.ncs/web-server-farm/basic example.

Let's dive into an example session with the NSO CLI. In the NSO example collection, the webserver publishes two NETCONF notification structures, indicating what they intend to send to any interested listeners. They all have the YANG module:

Follow the instructions in the README file if you want to run the example: build the example, start netsim, and start NCS.

The above shows how we can inspect - as status data - which named streams the managed device publishes. Each stream also has some associated data. The data model for that looks like this:

Let's set up a subscription for the stream called interface. The subscriptions are NSO configuration data, thus to create a subscription we need to enter configuration mode:

The above example created subscriptions for the interface stream on all web servers, i.e. managed devices, www0, www1, and www2. Each subscription must have an associated stream to it, this is however not the key for an NSO notification, the key is a free-form text string. This is because we can have multiple subscriptions to the same stream. More on this later when we describe the filter that can be associated with a subscription. Once the notifications start to arrive, they are read by NSO and stored in stable storage as CDB operational data. they are stored under each managed device - and we can view them as:

Each received notification has some associated metadata, such as the time the event was received by NSO, which subscription and which stream is associated with the notification, and also which user created the subscription.

It is fairly instructive to inspect the XML that goes on the wire when we create a subscription and then also receive the first notification. We can do:

Thus, once the subscription has been configured, NSO continuously receives, and stores in CDB oper persistent storage, the notifications sent from the managed device. The notifications are stored in a circular buffer, to set the size of the buffer, we can do:

The default value is 200. Once the size of the circular buffer is exceeded, the older notification is removed.

Subscription Status

A running subscription can be in either of three states. The YANG model has:

If a subscription is in the failed state, an optional failure-reason field indicates the reason for the failure. If a subscription fails due to, not being able to connect to the managed device or if the managed device closed its end of the SSH socket, NSO will attempt to automatically reconnect. The re-connect attempt interval is configurable.

SNMP Notifications

SNMP Notifications (v1, v2c, v3) can be received by NSO and acted upon. The SNMP receiver is a stand-alone process and by default, all notifications are ignored. IP addresses must be opted in and a handler must be defined to take actions on certain notifications. This can be used to for example listen to configuration change notifications and trigger a log action or a resync for example

Inactive Configuration

NSO can configure inactive parameters on the devices that support inactive configuration. Currently, these devices include Juniper devices and devices that announce http://tail-f.com/ns/netconf/inactive/1.0 capability. NSO itself implements http://tail-f.com/ns/netconf/inactive/1.0 capability which is formally defined in tailf-netconf-inactive YANG module.

To recap, a node that is marked as inactive exists in the data store but is not used by the server. The nodes announced as inactive by the device will also be inactive in the device's configuration in NSO, and activating/deactivating a node in NSO will push the corresponding change to the device. This also means that for NSO to be able to manage inactive configuration, both /ncs-config/enable-inactive and /ncs-config/netconf-north-bound/capabilities/inactive need to be enabled in ncs.conf.

If the inactive feature is disabled in ncs.conf, NSO will still be able to manage devices that have inactive configuration in their datastore, but the inactive attribute will be ignored, so the data will appear as active in NSO and it would not be possible for NSO to activate/deactivate such nodes in the device.

SSH Key Management

Learn about NSO SSH key management.

The SSH protocol uses public key technology for two distinct purposes:

Server Authentication: This use is a mandatory part of the protocol. It allows an SSH client to authenticate the server, i.e. verify that it is really talking to the intended server and not some man-in-the-middle intruder. This requires that the client has prior knowledge of the server's public keys, and the server proves its possession of one of the corresponding private keys by using it to sign some data. These keys are normally called 'host keys', and the authentication procedure is typically referred to as 'host key verification' or 'host key checking'.
Client Authentication: This use is one of several possible client authentication methods, i.e. it is an alternative to the commonly used password authentication. The server is configured with one or more public keys which are authorized for authentication of a user. The client proves possession of one of the corresponding private keys by using it to sign some data - i.e. the exact reverse of the server authentication provided by host keys. The method is called 'public key authentication' in SSH terminology.

These two usages are fundamentally independent, i.e. host key verification is done regardless of whether the client authentication is via public key, password, or some other method. However host key verification is of particular importance when client authentication is done via password, since failure to detect a man-in-the-middle attack in this case will result in the cleartext password being divulged to the attacker.

NSO as SSH Server

NSO can act as an SSH server for northbound connections to the CLI or the NETCONF agent, and for connections from other nodes in an NSO cluster - cluster connections use NETCONF, and the server side setup used is the same as for northbound connections to the NETCONF agent. It is possible to use either the NSO built-in SSH server or an external server such as OpenSSH, for all of these cases. When using an external SSH server, host keys for server authentication and authorized keys for client/user authentication need to be set up per the documentation for that server, and there is no NSO-specific key management in this case.

When the NSO built-in SSH server is used, the setup is very similar to the one OpenSSH uses:

Host Keys

The private host key(s) must be placed in the directory specified by /ncs-config/aaa/ssh-server-key-dir in ncs.conf, and named either ssh_host_dsa_key (for a DSA key) or ssh_host_rsa_key (for a RSA key). The key(s) must be in PEM format (e.g. as generated by the OpenSSH ssh-keygen command), and must not be encrypted - protection can be achieved by file system permissions (not enforced by NSO). The corresponding public key(s) is/are typically stored in the same directory with a .pub extension to the file name, but they are not used by NSO. The NSO installation creates a DSA private/public key pair in the directory specified by the default ncs.conf.

Public Key Authentication

The public keys that are authorized for authentication of a given user must be placed in the user's SSH directory. Refer to Public Key Login for details on how NSO searches for the keys to use.

NSO as SSH Client

NSO can act as an SSH client for connections to managed devices that use SSH (this is always the case for devices accessed via NETCONF, typically also for devices accessed via CLI), and for connections to other nodes in an NSO cluster. In all cases, a built-in SSH client is used. The $NCS_DIR/examples.ncs/getting-started/using-ncs/8-ssh-keys example in the NSO example collection has a detailed walk-through of the NSO functionality that is described in this section.

Host Key Verification

Verification Level

The level of host key verification can be set globally via /ssh/host-key-verification. The possible values are:

reject-unknown: The host key provided by the device or cluster node must be known by NSO for the connection to succeed.
reject-mismatch: The host key provided by the device or cluster node may be unknown, but it must not be different from the "known" key for the same key algorithm, for the connection to succeed.
none: No host key verification is done - the connection will never fail due to the host key provided by the device or cluster node.

The default is reject-unknown, and it is not recommended to use a different value, although it can be useful or needed in certain circumstances. E.g. none maybe useful in a development scenario, and temporary use of reject-mismatch maybe motivated until host keys have been configured for a set of existing managed devices.

Allowing SSH Connections With Unknown Host Keys

admin@ncs(config)# ssh host-key-verification reject-mismatch
admin@ncs(config)# commit
Commit complete.

Connection to a Managed Device

The public host keys for a device that is accessed via SSH are stored in the /devices/device/ssh/host-key list. There can be several keys in this list, one each for the ssh-ed25519 (ED25519 key), ssh-dss (DSA key) and ssh-rsa (RSA key) key algorithms. In case a device has entries in its live-status-protocol list that use SSH, the host keys for those can be stored in the /devices/device/live-status-protocol/ssh/host-key list, in the same way as the device keys - however if /devices/device/live-status-protocol/ssh does not exist, the keys from /devices/device/ssh/host-key are used for that protocol. The keys can be configured e.g. via input directly in the CLI, but in most cases, it will be preferable to use the actions described below to retrieve keys from the devices. These actions will also retrieve any live-status-protocol keys for a device.

The level of host key verification can also be set per device, via /devices/device/ssh/host-key-verification. The default is to use the global value (or default) for /ssh/host-key-verification, but any explicitly set value will override the global value. The possible values are the same as for /ssh/host-key-verification.

There are several actions that can be used to retrieve the host keys from a device and store them in the NSO configuration:

/devices/fetch-ssh-host-keys: Retrieve the host keys for all devices. Successfully retrieved keys are committed to the configuration.
/devices/device-group/fetch-ssh-host-keys: Retrieve the host keys for all devices in a device group. Successfully retrieved keys are committed to the configuration.
/devices/device/ssh/fetch-host-keys: Retrieve the host keys for one or more devices. In the CLI, range expressions can be used for the device name, e.g. using '*' will retrieve keys for all devices, etc. The action will commit the retrieved keys if possible, i.e. if the device entry is already committed, otherwise (i.e. if the action is invoked from "configure mode" when the device entry has been created but not committed), the keys will be written to the current transaction, but not committed.

The fingerprints of the retrieved keys will be reported as part of the result from these actions, but it is also possible to ask for the fingerprints of already retrieved keys by invoking the /devices/device/ssh/host-key/show-fingerprint action (/devices/device/live-status-protocol/ssh/host-key/show-fingerprint for live-status protocols that use SSH).

Retrieving SSH Host Keys for All Configured Devices

admin@ncs# devices fetch-ssh-host-keys
fetch-result {
    device c0
    result unchanged
    fingerprint {
        algorithm ssh-dss
        value 03:64:fc:b7:87:bd:34:5e:3b:6e:d8:71:4d:3f:46:76
    }
}
fetch-result {
    device h0
    result unchanged
    fingerprint {
        algorithm ssh-dss
        value 03:64:fc:b7:87:bd:34:5e:3b:6e:d8:71:4d:3f:46:76
    }
}

Connection to an NSO Cluster Node

This is very similar to the case of a connection to a managed device, it differs mainly in locations - and in the fact that SSH is always used for connection to a cluster node. The public host keys for a cluster node are stored in the /cluster/remote-node/ssh/host-key list, in the same way as the host keys for a device. The keys can be configured e.g. via input directly in the CLI, but in most cases, it will be preferable to use the action described below to retrieve keys from the cluster node.

The level of host key verification can also be set per cluster node, via /cluster/remote-node/ssh/host-key-verification. The default is to use the global value (or default) for /ssh/host-key-verification, but any explicitly set value will override the global value. The possible values are the same as for /ssh/host-key-verification.

The /cluster/remote-node/ssh/fetch-host-keys action can be used to retrieve the host keys for one or more cluster nodes. In the CLI, range expressions can be used for the node name, e.g. using '*' will retrieve keys for all nodes, etc. The action will commit the retrieved keys if possible, but if it is invoked from "configure mode" when the node entry has been created but not committed, the keys will be written to the current transaction, but not committed.

The fingerprints of the retrieved keys will be reported as part of the result from this action, but it is also possible to ask for the fingerprints of already retrieved keys by invoking the /cluster/remote-node/ssh/host-key/show-fingerprint action.

Retrieving SSH Host Keys for All Cluster Nodes

admin@ncs# cluster remote-node * ssh fetch-host-keys
cluster remote-node ncs1 ssh fetch-host-keys
    result updated
    fingerprint {
        algorithm ssh-dss
        value 03:64:fc:b7:87:bd:34:5e:3b:6e:d8:71:4d:3f:46:76
    }
cluster remote-node ncs2 ssh fetch-host-keys
    result updated
    fingerprint {
        algorithm ssh-dss
        value 03:64:fc:b7:87:bd:34:5e:3b:6e:d8:71:4d:3f:46:76
    }
cluster remote-node ncs3 ssh fetch-host-keys
    result updated
    fingerprint {
        algorithm ssh-dss
        value 03:64:fc:b7:87:bd:34:5e:3b:6e:d8:71:4d:3f:46:76
    }

Public Key Authentication

Private Key Selection

The private key used for public key authentication can be taken either from the SSH directory for the local user or from a list of private keys in the NSO configuration. The user's SSH directory is determined according to the same logic as for the server-side public keys that are authorized for authentication of a given user, see Public Key Login, but of course, different files in this directory are used, see below. Alternatively, the key can be configured in the /ssh/private-key list, using an arbitrary name for the list key. In both cases, the key must be in PEM format (e.g. as generated by the OpenSSH ssh-keygen command), and it may be encrypted or not. Encrypted keys configured in /ssh/private-key must have the passphrase for the key configured via /ssh/private-key/passphrase.

Connection to a Managed Device

The specific private key to use is configured via the authgroup indirection and the umap selection mechanisms as for password authentication, just a different alternative. Setting /devices/authgroups/group/umap/public-key (or default-map instead of umap for users that are not in umap) without any additional parameters will select the default of using a file called id_dsa in the local user's SSH directory, which must have an unencrypted key. A different file name can be set via /devices/authgroups/group/umap/public-key/private-key/file/name. For an encrypted key, the passphrase can be set via /devices/authgroups/group/umap/public-key/private-key/file/passphrase, or /devices/authgroups/group/umap/public-key/private-key/file/use-password can be set to indicate that the password used (if any) by the local user when authenticating to NSO should also be used as a passphrase for the key. To instead select a private key from the /ssh/private-key list, the name of the key is set via /devices/authgroups/group/umap/public-key/private-key/name.

Configuring a Private Key File for Publickey Authentication to Devices

admin@ncs(config)# devices authgroups group default umap admin
admin@ncs(config-umap-admin)# public-key private-key file name /home/admin/.ssh/id-dsa
admin@ncs(config-umap-admin)# public-key private-key file passphrase
(<AES encrypted string>): *********
admin@ncs(config-umap-admin)# commit
Commit complete.

Connection to an NSO Cluster Node

This is again very similar to the case of a connection to a managed device, since the same authgroup/umap scheme is used. Setting /cluster/authgroup/umap/public-key (or default-map instead of umap for users that are not in umap) without any additional parameters will select the default of using a file called id_dsa in the local user's SSH directory, which must have an unencrypted key. A different file name can be set via /cluster/authgroup/umap/public-key/private-key/file/name. For an encrypted key, the passphrase can be set via /cluster/authgroup/umap/public-key/private-key/file/passphrase, or /cluster/authgroup/umap/public-key/private-key/file/use-password can be set to indicate that the password used (if any) by the local user when authenticating to NSO should also be used as a passphrase for the key. To instead select a private key from the /ssh/private-key list, the name of the key is set via /cluster/authgroup/umap/public-key/private-key/name.

Configuring a Private Key File for Publickey Authentication in Cluster

admin@ncs(config)# cluster authgroup default umap admin
admin@ncs(config-umap-admin)# public-key private-key file name /home/admin/.ssh/id-dsa
admin@ncs(config-umap-admin)# public-key private-key file passphrase
(<AES encrypted string>): *********
admin@ncs(config-umap-admin)# commit
Commit complete.

Alarm Manager

Manage NSO alarms with native alarm manager.

NSO embeds a generic alarm manager. It manages NSO native alarms and can easily be extended with application-specific alarms. Alarm sources can be notifications from devices, undesired states on services detected or anything provided via the Java API.

The Alarm Manager has three main components:

Alarm List: A list of alarms in NSO. Each list entry represents an alarm state for a specific device, an object within the device, and an alarm type.
Alarm Model: For each alarm type, you can configure the mapping to for example X.733 alarm standard parameters that are sent as notifications northbound.
Operator Actions: Actions to set operator states on alarms such as acknowledgement, and also actions to administratively manage the alarm list such as deleting alarms.

The alarm manager is accessible over all northbound interfaces. A read-only view including an SNMP alarm table and alarm notifications is available in an SNMP Alarm MIB. This MIB is suitable for integration with SNMP-based alarm systems.

To populate the alarm list there is a dedicated Java API. This API lets a developer add alarms, change states on alarms, etc. A common usage pattern is to use the SNMP notification receiver to map a subset of the device traps into alarms.

Alarm Concepts

First of all, it is important to clearly define what an alarm means: "An alarm denotes an undesirable state in a resource for which an operator action is required". Alarms are often confused with general logging and event mechanisms, thereby overflooding the operator with alarms. In NSO, the alarm manager shows undesired resource states that an operator should investigate. NSO contains other mechanisms for logging in general. Therefore, NSO does not naively populate the alarm list with traps received in the SNMP notification receiver.

Before looking into how NSO handles alarms, it is important to define the fundamental concepts. We make a clear distinction between alarms and events in general. Alarms should be taken seriously and be investigated. Alarms have states; they go active with a specific severity, they change severity, and they are cleared by the resource. The same alarm may become active again. A common mistake is to confuse the operator view with the resource view. The model described so far is the resource view. The resource itself may consider the alarm cleared. The alarm manager does not automatically delete cleared alarms. An alarm that has existed in the network may still need investigation. There are dedicated actions an operator can use to manage the alarm list, for example, delete the alarms based on criteria such as cleared and date. These actions can be performed over all northbound interfaces.

Rather than viewing alarms as a list of alarm notifications, NSO defines alarms as states on objects. The NSO alarm list uses four keys for alarms: the alarming object within a device, the alarm type, and an optional specific problem.

Alarm types are normally unique identifiers for a specific alarm state and are defined statically. An alarm type corresponds to the well-known X.733 alarm standard tuple event type and probable cause. A specific problem is an optional key that is string-based and can further redefine an alarm type at run-time. This is needed for alarms that are not known before a system is deployed.

Imagine a system with general digital inputs. A MIB might specify traps called input-high, or input-low. When defining the SNMP notification reception, an integrator might define an alarm type called "External-Alarm". input-high might imply a major alarm and input-low might imply clear.

At installation, some detectors report "fire-alarm" and some "door-open" alarms. This is configured at the device and sent as free text in the SNMP var-binds. This is then managed by using the specific problem field of the NSO alarm manager to separate these different alarm types.

The data model for the alarm manager is outlined below.

This means that we have a list with key: (managed device, managed object, alarm type, specific problem). In the example above, we might have the following different alarms:

Device : House1; Managed Object : Detector1; Alarm-Type : External Alarm; Specific Problem = Smoke;
Device : House1; Managed Object : Detector2; Alarm-Type : External Alarm; Specific Problem = Door Open;

Each alarm entry shows the last status change for the alarm and also a child list with all status changes sorted in chronological order.

is-cleared: was the last state change clear?
last-status-change: timestamp for the last status change.
last-perceived-severity: last severity (not equal to clear).
last-alarm-text: the last alarm text (not equal to clear).
status-change, event-time: the time reported by the device.
status-change, received-time: the time the state change was received by NSO.
status-change, perceived-severity: the new perceived severity.
status-change, alarm-text: descriptive text associated with the new alarm status.

It is fundamental to define alarm types (specific problem) and the managed objects with a fine-grained mechanism that still is extensible. For objects we allow YANG instance-identifiers to refer to a YANG instance identifier, an SNMP OID, or a string. Strings can be used when the underlying object is not modeled. We use YANG identities to define alarm types. This has the benefit that alarm types can be defined in a named hierarchy and thereby provide an extensible mechanism. To support "dynamic alarm types" so that alarms can be separated by information only available at run-time, the string-based field-specific problem can also be used.

So far we have described the model based on the resource view. It is common practice to let operators manipulate the alarms corresponding to the operator's investigation. We clearly separate the resource and the operator view, for example, there is no such thing as an operator "clearing an alarm". Rather the alarm entries can have a corresponding alarm handling state. Operators may want to acknowledge an alarm and set the alarm state to closed or similar.

Alarm List Administrative Actions

We also support some alarm list administrative actions:

Synchronize alarms: try to read the alarm states in the underlying resources and update the alarm list accordingly (this action needs to be implemented by user code for specific applications).
Purge alarms: delete entries in the alarm list based on several different filter criteria.
Filter alarms: with an XPATH as filter input, this action returns all alarms fulfilling the filter.
Compress alarms: since every entry may contain a large amount of state change entries this action compresses the history to the latest state change.

Alarms can be forwarded over NSO northbound interfaces. In many telecom environments, alarms need to be mapped to X.733 parameters. We provide an alarm model where every alarm type is mapped to the corresponding X.733 parameters such as event type and probable cause. In this way, it is easy to integrate NSO alarms into whatever X.733 enumerated values the upper fault management system requires.

The Alarm Model

The central part of the YANG Alarm model tailf-ncs-alarms.yang has the following structure.

tailf-ncs-alarms.yang

module tailf-ncs-alarms {

  namespace "http://tail-f.com/ns/ncs-alarms";
  prefix "al";
  ...
 typedef managed-object-t {
    type union {
      type instance-identifier {
        require-instance false;
        }
      type yang:object-identifier;
      type string;
    }


  ...
  typedef event-type  {
    type enumeration {
      enum other {value 1;}
      enum communicationsAlarm {value 2;}
      enum qualityOfServiceAlarm {value 3;}
      enum processingErrorAlarm {value 4;}
      enum equipmentAlarm {value 5;}
      ...
    }
    description
    "...";
    reference
    "ITU Recommendation X.736, 'Information Technology - Open
     Systems Interconnection - System Management: Security
     Alarm Reporting Function', 1992";
  }

  typedef severity-t  {
    type enumeration {
      enum cleared {value 1;}
      enum indeterminate {value 2;}
      enum critical {value 3;}
      enum major {value 4;}
      enum minor {value 5;}
      enum warning {value 6;}
    }
    description
      "...";
  }
  ...
  identity alarm-type {
    description
    "Base identity for alarm types."
    ...
  }

  identity ncs-dev-manager-alarm {
    base alarm-type;
  }

  identity ncs-service-manager-alarm {
    base alarm-type;
  }

  identity connection-failure {
    base ncs-dev-manager-alarm;
    description
      "NCS failed to connect to a device";
  }
  ....
  container alarm-model {
    list alarm-type {
      key "type";
        leaf type {
          type alarm-type-t;
        }

        uses alarm-model-parameters;
     }
  }

      ...


    container alarm-list {
      config false;
      leaf number-of-alarms {
        type yang:gauge32;
      }

      leaf last-changed {
        type yang:date-and-time;
      }

      list alarm {
        key "device type managed-object specific-problem";
        uses common-alarm-parameters;
        leaf is-cleared {
          type boolean;
          mandatory true;
        }

        leaf last-status-change {
          type yang:date-and-time;
          mandatory true;
        }

        leaf last-perceived-severity {
          type severity-t;
        }

        leaf last-alarm-text {
          type alarm-text-t;
        }

        list status-change {
          key event-time;
          min-elements 1;
          uses alarm-state-change-parameters;
        }

        leaf last-alarm-handling-change {
          type yang:date-and-time;
        }

        list alarm-handling {
          key time;
          leaf time {
            tailf:info "Time stamp for operator action";
            type yang:date-and-time;
          }
          leaf state {
            tailf:info "The operators view of the alarm state";
            type alarm-handling-state-t;
            mandatory true;
            description
              "The operators view of the alarm state.";
          }
          ...
        }
        ...
        notification alarm-notification {
        ...
        rpc synchronize-alarms {
        ...
        rpc compress-alarms {
        ...
        rpc purge-alarms {

The first part of the YANG listing above shows the definition for managed-object type in order for alarms to refer to YANG, SNMP, and other resources. We also see basic definitions from the X.733 standard for severity levels.

Note well the definition of alarm type using YANG identities. In this way, we can create a structured alarm-type hierarchy all rooted at alarm-type. For you to add your specific alarm types, define your own alarm types YANG file and add identities using alarm-type as a base.

The alarm-model container contains the mapping from alarm types to X.733 parameters used for north-bound interfaces.

The alarm-list container is the actual alarm list where we maintain a list mapping (device, managed-object, alarm-type, specific-problem) to the corresponding alarm state changes [(time, severity, text)].

Finally, we see the northbound alarm notification and alarm administrative actions.

Alarm Handling

The NSO alarm manager has support for the operator to acknowledge alarms. We call this alarm handling. Each alarm has an associated list of alarm handling entries as:

container alarms {
  ....
  container alarm-list {
    config false;
    ....
    list alarm {
      key "device type managed-object specific-problem";

      .....

      list alarm-handling {
        key time;
        leaf time {
          type yang:date-and-time;
          description
            "Time-stamp for operator action on alarm.";
        }
        leaf state {
          mandatory true;
          type alarm-handling-state-t;
          description
            "The operators view of the alarm state";
        }
        leaf user {
          description "Which user has acknowledged this alarm";
          mandatory true;
          type string;
        }
        leaf description {
          description "Additional optional textual information regarding
            this new alarm-handling entry";
          type string;
        }
      }

        tailf:action handle-alarm {
          tailf:info "Set the operator state of this alarm";
          description
            "An action to allow the operator to add an entry to the
             alarm-handling list. This is a means for the operator to indicate
             the level of human intervention on an alarm.";
          input {
            leaf state {
              type alarm-handling-state-t;
              mandatory true;
            }
          }
        }
      }

The following typedef defines the different states an alarm can be set into.

Alarm state

  typedef alarm-handling-state-t  {
    type enumeration {
      enum none {
        value 1;
      }
      enum ack {
        value 2;
      }
      enum investigation {
        value 3;
      }
      enum observation {
        value 4;
      }
      enum closed {
        value 5;
      }
    }
    description
      "Operator actions on alarms";
  }

It is of course also possible to manipulate the alarm handling list from either Java code or Javascript code running in the web browser using the js_maapi library.

Below is a simple scenario to illustrate the alarm concepts. The example can be found in examples.ncs/service-provider/simple-mpls-vpn.

$ make stop clean all start
$ ncs-netsim stop pe0
$ ncs-netsim stop pe1
$ ncs_cli -u admin -C
admin connected from 127.0.0.1 using console on host
admin@ncs# devices connect
...
connect-result {
    device pe0
    result false
    info Failed to connect to device pe0: connection refused
}
connect-result {
    device pe1
    result false
    info Failed to connect to device pe1: connection refused
}
...
admin@ncs# show alarms alarm-list
alarms alarm-list number-of-alarms 2
alarms alarm-list last-changed 2015-02-18T08:02:49.162436+00:00
alarms alarm-list alarm pe0 connection-failure /devices/device[name='pe0'] ""
 is-cleared              false
 last-status-change      2015-02-18T08:02:49.162734+00:00
 last-perceived-severity major
 last-alarm-text         "Failed to connect to device pe0: connection refused"
 status-change 2015-02-18T08:02:49.162734+00:00
  received-time      2015-02-18T08:02:49.162734+00:00
  perceived-severity major
  alarm-text         "Failed to connect to device pe0: connection refused"
alarms alarm-list alarm pe1 connection-failure /devices/device[name='pe1'] ""
 is-cleared              false
 last-status-change      2015-02-18T08:02:49.162436+00:00
 last-perceived-severity major
 last-alarm-text         "Failed to connect to device pe1: connection refused"
 status-change 2015-02-18T08:02:49.162436+00:00
  received-time      2015-02-18T08:02:49.162436+00:00
  perceived-severity major
  alarm-text         "Failed to connect to device pe1: connection refused"

In the above scenario, we stop two of the devices and then ask NSO to connect to all devices. This results in two alarms for pe0 and pe1. Note that the key for the alarm is the device name, the alarm type, the full path to the object (in this case, the device and not an object within the device), and finally an empty string for the specific problem.

In the next command sequence, we start the device and request NSO to connect. This will clear the alarms.

admin@ncs# exit
$ ncs-netsim start pe0
DEVICE pe0 OK STARTED
$ ncs-netsim start pe1
DEVICE pe1 OK STARTED
$ ncs_cli -u admin -C
$ admin@ncs# devices connect
...
connect-result {
    device pe0
    result true
    info (admin) Connected to pe0 - 127.0.0.1:10028
}
connect-result {
    device pe1
    result true
    info (admin) Connected to pe1 - 127.0.0.1:10029
}
...
admin@ncs# show alarms alarm-list
alarms alarm-list number-of-alarms 2
alarms alarm-list last-changed 2015-02-18T08:05:04.942637+00:00
alarms alarm-list alarm pe0 connection-failure /devices/device[name='pe0'] ""
 is-cleared              true
 last-status-change      2015-02-18T08:05:04.942637+00:00
 last-perceived-severity major
 last-alarm-text         "Failed to connect to device pe0: connection refused"
 status-change 2015-02-18T08:02:49.162734+00:00
  received-time      2015-02-18T08:02:49.162734+00:00
  perceived-severity major
  alarm-text         "Failed to connect to device pe0: connection refused"
 status-change 2015-02-18T08:05:04.942637+00:00
  received-time      2015-02-18T08:05:04.942637+00:00
  perceived-severity cleared
  alarm-text         "Connected as admin"
alarms alarm-list alarm pe1 connection-failure /devices/device[name='pe1'] ""
 is-cleared              true
 last-status-change      2015-02-18T08:05:04.84115+00:00
 last-perceived-severity major
 last-alarm-text         "Failed to connect to device pe1: connection refused"
 status-change 2015-02-18T08:02:49.162436+00:00
  received-time      2015-02-18T08:02:49.162436+00:00
  perceived-severity major
  alarm-text         "Failed to connect to device pe1: connection refused"
 status-change 2015-02-18T08:05:04.84115+00:00
  received-time      2015-02-18T08:05:04.84115+00:00
  perceived-severity cleared
  alarm-text         "Connected as admin"

Note that there are two status-change entries for the alarm and that the alarm is cleared. In the following scenario, we will state that the alarm is closed and finally purge (delete) all alarms that are cleared and closed (Again, note the distinction between operator states and the states from the underlying resources).

admin@ncs# alarms alarm-list alarm pe0 connection-failure /devices/device[name='pe0']
          "" handle-alarm state closed description Fixed

admin@ncs# show alarms alarm-list alarm alarm-handling

DEVICE  TYPE                 STATE   USER   DESCRIPTION
---------------------------------------------------------
pe0     connection-failure   closed  admin  Fixed

admin@ncs# alarms purge-alarms alarm-handling-state-filter { state closed }
Value for 'alarm-status' [any,cleared,not-cleared]: cleared
purged-alarms 1

Assume that you need to configure the northbound parameters. This is done using the alarm model. A logical mapping of the connection problem above is to map it to X.733 probable cause connectionEstablishmentError (22) . This is done in the NSO CLI in the following way:

admin@ncs# config
Entering configuration mode terminal
admin@ncs(config)# alarms alarm-model alarm-type connection-failure probable-cause 22
admin@ncs(config-alarm-type-connection-failure/*)# commit
Commit complete.
admin@ncs(config-alarm-type-connection-failure/*)# show full-configuration
alarms alarm-model alarm-type connection-failure *
 event-type     communicationsAlarm
 has-clear      true
 kind-of-alarm  root-cause
 probable-cause 22

Plug-and-Play Scripting

Use NSO's plug-and-play scripting mechanism to add new functionality to NSO.

A scripting mechanism can be used together with the CLI (scripting is not available for any other northbound interfaces). This section is intended for users who are familiar with UNIX shell scripting and/or programming. With the scripting mechanism, an end-user can add new functionality to NSO in a plug-and-play-like manner. No special tools are needed.

There are three categories of scripts:

command scripts: Used to add new commands to the CLI.
policy scripts: Invoked at validation time and may control the outcome of a transaction. Policy scripts have the mandate to cause a transaction to abort.
post-commit scripts: Invoked when a transaction has been committed. Post-commit scripts can for example be used for logging, sending external events etc.

The terms 'script' and 'scripting' used throughout this description refer to how functionality can be added without a requirement for integration using the NSO programming APIs. NSO will only run the scripts as UNIX executables. Thus they may be written as shell scripts, or by using another scripting language that is supported by the OS, e.g., Python, or even as compiled code. The scripts are run with the same user ID as NSO.

The examples in this section are written using shell scripts as the least common denominator, but they can be written in another suitable language, e.g., Python or C.

Script Storage

Scripts are stored in a directory tree with a predefined structure where there is a sub-directory for each script category:

scripts/
        command/
        policy/
        post-commit/

For all script categories, it suffices to just add a valid script in the correct sub-directory to enable the script. See the details for each script category for how a valid script of that category is defined. Scripts with a name beginning with a dot character ('.') are ignored.

The directory path to the location of the scripts is configured with the /ncs-config/scripts/dir configuration parameter. It is possible to have several script directories. The sample ncs.conf file that comes with the NSO release specifies two script directories: ./scripts and ${NCS_DIR}/scripts.

Script Interface

All scripts are required to provide a formal description of their interface. When the scripts are loaded, NSO will invoke the scripts with (one of) the following as an argument depending on the script category.

--command
--policy
--post-commit

The script must respond by writing its formal interface description on stdout and exit normally. Such a description consists of one or more sections. Which sections are required, depends on the category of the script.

The sections do however have a common syntax. Each section begins with the keyword begin followed by the type of section. After that one or more lines of settings follow. Each such setting begins with a name, followed by a colon character (:), and after that the value is stated. The section ends with the keyword end. Empty lines and spaces may be used to improve readability.

For examples see each corresponding section below.

Script Loading

Scripts are automatically loaded at startup and may also be manually reloaded with the CLI command script reload. The command takes an optional verbosity parameter which may have one of the following values:

diff: Shows info about those scripts that have been changed since the latest (re)load. This is the default.
all: Shows info about all scripts regardless of whether they have been changed or not.
errors: Shows info about those scripts that are erroneous, regardless of whether they have been changed or not. Typical errors are invalid file permissions and syntax errors in the interface description.

Yet another parameter may be useful when debugging the reload of scripts:

debug: Shows additional debug info about the scripts.

An example session reloading scripts:

admin@ncs# script reload all
$NCS_DIR/examples.ncs/getting-started/using-ncs/7-scripting/scripts:
ok
command:
    add_user.sh: unchanged
    echo.sh: unchanged
policy:
    check_dir.sh: unchanged
post-commit:
    show_diff.sh: unchanged
/opt/ncs/scripts: ok
command:
    device_brief.sh: unchanged
    device_brief_c.sh: unchanged
    device_list.sh: unchanged
    device_list_c.sh: unchanged
    device_save.sh: unchanged

Command Scripts

Command scripts are used to add new commands to the CLI. The scripts are executed in the context of a transaction. When the script is run in oper mode, this is a read-only transaction, when it is run in config mode, it is a read-write transaction. In that context, the script may make use of the environment variables NCS_MAAPI_USID and NCS_MAAPI_THANDLE in order to attach to the active transaction. This makes it simple to make use of the ncs-maapi command (see the ncs-maapi(1) in Manual Pages manual page) for various purposes.

Each command script must be able to handle the argument --command and, when invoked, write a command section to stdout. If the CLI command is intended to take parameters, one param section per CLI parameter must also be emitted.

The command is not paginated by default in the CLI and will only do so if it is piped to more.

joe@io> example_command_script | more

`command` Section

The following settings can be used to define a command:

modes: Defines in which CLI mode(s) that the command should be available. The value can be oper, config or both (separated with space).
styles: Defines in which CLI styles the command should be available. The value can be one or more of c, i and j (separated with space). c means Cisco style, imeans Cisco IOS, and j J-style.
cmdpath: Is the full CLI command path. For example, the command path my script echo implies that the command will be called my script echo in the CLI.
help: Command help text.

An example of a command section is:

begin command
  modes: oper
  styles: c i j
  cmdpath: my script echo
  help: Display a line of text
end

`param` Section

Now let's look at various aspects of a parameter. This may both affect the parameter syntax for the end-user in the CLI as well as what the command script will get as arguments.

The following settings can be used to customize each CLI parameter:

name: Optional name of the parameter. If provided, the CLI will prompt for this name before the value. By default, the name is not forwarded to the script. See flag and prefix.
type: The type of the parameter. By default each parameter has a value, but by setting the type to void the CLI will not prompt for a value. To be useful the void type must be combined with name and either flag or prefix.
presence: Controls whether the parameter must be present in the CLI input or not. Can be set to optional or mandatory.
words: Controls the number of words that the parameter value may consist of. By default, the value must consist of just one word (possibly quoted if it contains spaces). If set to any, the parameter may consist of any number of words. This setting is only valid for the last parameter.
flag: Extra argument added before the parameter value. For example, if set to -f and the user enters logfile, the script will get -f logfile as arguments.
prefix: Extra string prepended to the parameter value (as a single word). For example, if set to --file= and the user enters logfile, the script will get --file=logfile as argument.
help: Parameter help text.

If the command takes a parameter to redirect the output to a file, a param section might look like this:

begin param
 name: file
 presence: optional
 flag: -f
 help: Redirect output to file
end

Full `command` Example

A command denying changes the configured trace-dir for a set of devices, it can use the check_dir.sh script.

#!/bin/bash

set -e

while [ $# -gt 0 ]; do
    case "$1" in
        --command)
            # Configuration of the command
            #
            # modes   - CLI mode (oper config)
            # styles  - CLI style (c i j)
            # cmdpath - Full CLI command path
            # help    - Command help text
            #
            # Configuration of each parameter
            #
            # name     - (optional) name of the parameter
            # more     - (optional) true or false
            # presence - optional or mandatory
            # type     - void - A parameter without a value
            # words    - any - Multi word param. Only valid for the last param
            # flag     - Extra word added before the parameter value
            # prefix   - Extra string prepended to the parameter value
            # help     - Command help text
            cat << EOF

begin command
  modes: config
  styles: c i j
  cmdpath: user-wizard
  help: Add a new user
end
EOF
            exit
            ;;
        *)
            break
            ;;
    esac
    shift
done

## Ask for user name
while true; do
    echo -n "Enter user name: "
    read user

    if [ ! -n "${user}" ]; then
        echo "You failed to supply a user name."
    elif ncs-maapi --exists "/aaa:aaa/authentication/users/user{${user}}"; then
        echo "The user already exists."
    else
        break
    fi
done

## Ask for password
while true; do
    echo -n "Enter password: "
    read -s pass1
    echo

    if [ "${pass1:0:1}" == "$" ]; then
        echo -n "The password must not start with $. Please choose a "
        echo    "different password."
    else
        echo -n "Confirm password: "
        read -s pass2
        echo

        if [ "${pass1}" != "${pass2}" ]; then
            echo "Passwords do not match."
        else
            break
        fi
    fi
done

groups=`ncs-maapi --keys "/nacm/groups/group"`
while true; do
    echo "Choose a group for the user."
    echo -n "Available groups are: "
    for i in ${groups}; do echo -n "${i} "; done
    echo
    echo -n "Enter group for user: "
    read group

    if [ ! -n "${group}" ]; then
        echo "You must enter a valid group."
    else
        for i in ${groups}; do
            if [ "${i}" == "${group}" ]; then
                # valid group found
                break 2;
            fi
        done
        echo "You entered an invalid group."
    fi
    echo
done

echo "Creating user"

ncs-maapi --create "/aaa:aaa/authentication/users/user{${user}}"
ncs-maapi --set "/aaa:aaa/authentication/users/user{${user}}/password" \
                "${pass1}"

echo "Setting home directory to: /homes/${user}"
ncs-maapi --set "/aaa:aaa/authentication/users/user{${user}}/homedir" \
            "/homes/${user}"

echo "Setting ssh key directory to: /homes/${user}/ssh_keydir"
ncs-maapi --set "/aaa:aaa/authentication/users/user{${user}}/ssh_keydir" \
            "/homes/${user}/ssh_keydir"

ncs-maapi --set "/aaa:aaa/authentication/users/user{${user}}/uid" "1000"
ncs-maapi --set "/aaa:aaa/authentication/users/user{${user}}/gid" "100"

echo "Adding user to the ${group} group."
gusers=`ncs-maapi --get "/nacm/groups/group{${group}}/user-name"`

for i in ${gusers}; do
    if [ "${i}" == "${user}" ]; then
        echo "User already in group"
        exit 0
    fi
done

ncs-maapi --set "/nacm/groups/group{${group}}/user-name" "${gusers} ${user}"

Calling $NCS_DIR/examples.ncs/getting-started/using-ncs/7-scripting/scripts/command/echo.sh with the argument --command argument produces a command section and a couple of param sections:

$ ./echo.sh --command
begin command
  modes: oper
  styles: c i j
  cmdpath: my script echo
  help: Display a line of text
end

begin param
 name: nolf
 type: void
 presence: optional
 flag: -n
 help: Do not output the trailing newline
end

begin param
 name: file
 presence: optional
 flag: -f
 help: Redirect output to file
end

begin param
 presence: mandatory
 words: any
 help: String to be displayed
end

In the complete example $NCS_DIR/examples.ncs/getting-started/using-ncs/7-scripting , there is a README file and a simple command script scripts/command/echo.sh.

Policy Scripts

Policy scripts are invoked at validation time before a change is committed. A policy script can reject the data, accept it, or accept it with a warning. If a warning is produced, it will be displayed for interactive users (e.g. through the CLI or Web UI). The user may choose to abort or continue to commit the transaction.

Policy scripts are typically assigned to individual leafs or containers. In some cases, it may be feasible to use a single policy script, e.g. on the top-level node of the configuration. In such a case, this script is responsible for the validation of all values and their relationships throughout the configuration.

All policy scripts are invoked on every configuration change. The policy scripts can be configured to depend on certain subtrees of the configuration, which can save time but it is very important that all dependencies are stated and also updated when the validation logic of the policy script is updated. Otherwise, an update may be accepted even though a dependency should have denied it.

There can be multiple dependency declarations for a policy script. Each declaration consists of a dependency element specifying a configuration subtree that the validation code is dependent upon. If any element in any of the subtrees is modified, the policy script is invoked. A subtree is specified as an absolute path.

If there are no declared dependencies, the root of the configuration tree (/) is used, which means that the validation code is executed when any configuration element is modified. If dependencies are declared on a leaf element, an implicit dependency on the leaf itself is added.

Each policy script must handle the argument --policy and, when invoked, write a policy section to stdout. The script must also perform the actual validation when invoked with the argument --keypath.

`policy` Section

The following settings can be used to configure a policy script:

keypath: Mandatory. The keypath is the path to a node in the configuration data tree. The policy script will be associated with this node. The path must be absolute. A keypath can for example be /devices/device/c0. The script will be invoked if the configuration node, referred to by the keypath, is changed or if any node in the subtree under the node (if the node is a container or list) is changed.
dependency: Declaration of a dependency. The dependency must be an absolute key path. Multiple dependency settings can be declared. Default is /.
priority: An optional integer parameter specifying the order policy scripts will be evaluated, in order of increasing priority, where a lower value is higher priority. The default priority is 0.
call: This optional setting can only be used if the associated node, declared as keypath, is a list. If set to once, the policy script is only called once even though there exists many list entries in the data store. This is useful if we have a huge amount of instances or if values assigned to each instance have to be validated in comparison with its siblings. Default is each.

A policy that will be run for every change on or under /devices/device.

begin policy
 keypath: /devices/device
 dependency: /devices/global-settings
 priority: 4
 call: each
end

Validation

When NSO has concluded that the policy script should be invoked to perform its validation logic, the script is invoked with the option --keypath. If the registered node is a leaf, its value will be given with the --value option. For example --keypath /devices/device/c0 or if the node is a leaf --keypath /devices/device/c0/address --value 127.0.0.1.

Once the script has performed its validation logic it must exit with a proper status.

The following exit statuses are valid:

0: Validation ok. Vote for commit.
1: When the outcome of the validation is dubious, it is possible for the script to issue a warning message. The message is extracted from the script output on stdout. An interactive user can choose to abort or continue to commit the transaction. Non-interactive users automatically vote for commit.
2: When the validation fails, it is possible for the script to issue an error message. The message is extracted from the script output on stdout. The transaction will be aborted.

Full `policy` Example

A policy denying changes the configured trace-dir for a set of devices, it can use the check_dir.sh script.

#!/bin/sh

usage_and_exit() {
    cat << EOF
Usage: $0 -h
       $0 --policy
       $0 --keypath <keypath> [--value <value>]

  -h                    display this help and exit
  --policy              display policy configuration and exit
  --keypath <keypath>   path to node
  --value <value>       value of leaf

Return codes:

  0 - ok
  1 - warning message is printed on stdout
  2 - error message   is printed on stdout
EOF
    exit 1
}

while [ $# -gt 0 ]; do
    case "$1" in
        -h)
            usage_and_exit
            ;;
        --policy)
            cat << EOF
begin policy
  keypath: /devices/global-settings/trace-dir
  dependency: /devices/global-settings
  priority: 2
  call: each
end
EOF
            exit 0
            ;;
        --keypath)
            if [ $# -lt 2 ]; then
                echo "<ERROR> --keypath <keypath> - path omitted"
                usage_and_exit
            else
                keypath=$2
                shift
            fi
            ;;
        --value)
            if [ $# -lt 2 ]; then
                echo "<ERROR> --value <value> - leaf value omitted"
                usage_and_exit
            else
                value=$2
                shift
            fi
            ;;
        *)
            usage_and_exit
            ;;
    esac
    shift
done

if [ -z "${keypath}" ]; then
    echo "<ERROR> --keypath <keypath> is mandatory"
    usage_and_exit
fi

if [ -z "${value}" ]; then
    echo "<ERROR> --value <value> is mandatory"
    usage_and_exit
fi

orig="./logs"
dir=${value}
# dir=`ncs-maapi --get /devices/global-settings/trace-dir`
if [ "${dir}" != "${orig}" ] ; then
    echo "/devices/global-settings/trace-dir: must retain it original value (${orig})"
    exit 2
fi

Trying to change that parameter would result in an aborted transaction

admin@ncs(config)# devices global-settings trace-dir ./testing
admin@ncs(config)# commit
Aborted: /devices/global-settings/trace-dir: must retain it original
value (./logs)

In the complete example $NCS_DIR/examples.ncs/getting-started/using-ncs/7-scripting/ there is a README file and a simple policy script scripts/policy/check_dir.sh.

Post-commit Scripts

Post-commit scripts are run when a transaction has been committed, but before any locks have been released. The transaction hangs until the script has returned. The script cannot change the outcome of the transaction. Post-commit scripts can for example be used for logging, sending external events etc. The scripts run as the same user ID as NSO.

The script is invoked with --post-commit at script (re)load. In future releases, it is possible that the post-commit section will be used for control of the post-commit scripts behavior.

At post-commit, the script is invoked without parameters. In that context, the script may make use of the environment variables NCS_MAAPI_USID and NCS_MAAPI_THANDLE in order to attach to the active (read-only) transaction.

This makes it simple to make use of the ncs-maapi command. Especially the command ncs-maapi --keypath-diff / may turn out to be useful, as it provides a listing of all updates within the transaction on a format that is easy to parse.

`post-commit` Section

All post-commit scripts must be able to handle the argument --post-commit and, when invoked, write an empty post-commit section to stdout:

begin post-commit
end

Full `post-commit` Example

Assume the administrator of a system would want to have a mail each time a change is performed on the system, a script such as mail_admin.sh:

#!/bin/bash

set -e

if [ $# -gt 0 ]; then
    case "$1" in
        --post-commit)
            cat &lt;&lt; EOF
begin post-commit
end
EOF
            exit 0
            ;;
        *)
            echo
            echo "Usage: $0 [--post-commit]"
            echo
            echo "  --post-commit Mandatory for post-commit scripts"
            exit 1
            ;;
    esac
else
    file="mail_admin.log"
    NCS_DIFF=$(ncs-maapi --keypath-diff /)
    mail -s "NCS Mailer" admin@example.com &lt;&lt;EOF
AutoGenerated mail from NCS

$NCS_DIFF
EOF
fi

If the admin then loads this script:

admin@ncs# script reload debug
$NCS_DIR/examples.ncs/getting-started/using-ncs/1-simulated-cisco-ios/scripts:
ok
    post-commit:
        mail_admin.sh: new
--- Output from
$NCS_DIR/examples.ncs/getting-started/using-ncs/1-simulated-cisco-ios/scripts/post-commit/mail_admin.sh
--post-commit ---
1: begin post-commit
2: end
3:
---
admin@ncs# config
Entering configuration mode terminal
admin@ncs(config)# devices global-settings trace-dir ./again
admin@ncs(config)# commit
Commit complete.

This configuration change will produce an email to admin@example.com with subject NCS Mailer and body.

AutoGenerated mail from NCS
value set  : /devices/global-settings/trace-dir

In the complete example $NCS_DIR/examples.ncs/getting-started/using-ncs/7-scripting/ , there is a README file and a simple post-commit script scripts/post-commit/show_diff.sh.

Compliance Reporting

Audit and verify your network for configuration compliance.

When the network configuration is broken, there is a need to gather information and verify the network. NSO has numerous functions to show different aspects of such a network configuration verification. However, to simplify this task, compliance reporting can assemble information using a selection of these NSO functions and present the resulting information in one report. This report aims to answer two fundamental questions:

Who has done what?
Is the network correctly configured?

What defines a correctly configured network? Where is the authoritative configuration kept? Naturally, NSO, with the configurations stored in CDB, is the authority. Checking the live devices against the NSO-stored device configuration is a fundamental part of compliance reporting. Compliance reporting can also be based on one or a number of stored templates which the live devices are compared against. The compliance reports can also be a combination of both approaches.

Compliance reporting can be configured to check the current situation, check historical events, or both. To assemble historical events, rollback files are used. Therefore this functionality must be enabled in NSO before report execution, otherwise, the history view cannot be presented.

The reports can be created in either plain text, HTML, or DocBook XML format. In addition, the data can also be exported to a SQLite database file. The DocBook XML format allows you to use the report in further post-processing, such as creating a PDF using Apache FOP and your own custom styling.

Reports can be generated using either the CLI or Web UI. The suggested and favored way of generating compliance reports is via the Web UI, which provides a convenient way of creating, configuring, and consuming compliance reports. In the NSO Web UI, compliance reporting options are accessible from the Tools menu (see Web User Interface for more information). The CLI options are described in the sections below.

Creating Compliance Report Definitions

It is possible to create several named compliance report definitions. Each named report defines the devices, services, and/or templates that should be part of the network configuration verification.

Let us walk through a simple compliance report definition. This example is based on the examples.ncs/service-provider/mpls-vpn example. For the details of the included services and devices in this example, see the README file.

Each report definition has a name and can specify device and service checks. Device checks are further classified into sync and configuration checks. Device sync checks verify the in-sync status of the devices included in the report, while device configuration checks verify individual device configuration against a compliance template (see Device Configuration Checks).

For device checks, you can select the devices to be checked in four different ways:

all-devices - Check all defined devices.
device-group - Specified list of device groups.
device - Specified list of devices.
select-devices - Specified by an XPath expression.

Consider the following example report definition named gold-check:

ncs(config)# compliance reports report gold-check
ncs(config-report-gold-check)# device-check all-devices

This report definition, when executed, checks whether all devices known to NSO are in sync.

For such a check, the behavior of the verification can be specified:

To request a check-sync action to verify that the device is currently in sync. This behavior is controlled by the leaf current-out-of-sync (default true).
To scan the commit log (i.e. rollback files) for changes on the devices and report these. This behavior is controlled by the leaf historic-changes (default true).

ncs(config-report-gold-check)# device-check ?
Possible completions:
  all-devices            Report on all devices
  current-out-of-sync    Should current check-sync action be performed?
  device                 Report on specific devices
  device-group           Report on specific device groups
  historic-changes       Include commit log events from within the report
                         interval
  select-devices         Report on devices selected by an XPath expression
  <cr>

For the example gold-check, you can also use service checks. This type of check verifies if the specified service instances are in sync, that is if the network devices contain configuration as defined by these services. You can select the services to be checked in four different ways:

all-services - Check all known service instances.
service - Specified list of service instances.
select-services - Specified list of service instances through an XPath expression.
service-type - Specified list of service types.

For service checks, the verification behavior can be specified as well:

To request a check-sync action to verify that the service is currently in sync. This behavior is controlled by the leaf current-out-of-sync (default true).
To scan the commit log (i.e. rollback files) for changes on the services and report these. This behavior is controlled by the leaf historic-changes (default true).

ncs(config-report-gold-check)# service-check ?
Possible completions:
  all-services          Report on all services
  current-out-of-sync   Should current check-sync action be performed?
  historic-changes      Include commit log events from within the report
                        interval
  select-services       Report on services selected by an XPath expression
  service               Report on specific services
  service-type          The type of service.
  <cr>

In the example report, you might choose the default behavior and check all instances of the l3vpn service:

ncs(config-report-gold-check)# service-check service-type /l3vpn:vpn/l3vpn:l3vpn
ncs(config-report-gold-check)# commit
Commit complete.
ncs(config-report-gold-check)# show full-configuration
compliance reports report gold-check
 device-check all-devices
 service-check service-type /l3vpn:vpn/l3vpn:l3vpn
!

You can also use the web UI to define compliance reports. See the section Compliance Reporting for details.

Running Compliance Reports

Compliance reporting is a read-only operation. When running a compliance report, the result is stored in a file located in a sub-directory compliance-reports under the NSO state directory. NSO has operational data for managing this report storage which makes it possible to list existing reports.

Here is an example of such a report listing:

ncs# show compliance report-results
compliance report-results report 1
 name              gold-check
 title             "GOLD NW 1"
 time              2015-02-04T18:48:57+00:00
 who               admin
 compliance-status violations
 location          http://.../report_1_admin_1_2015-2-4T18:48:57:0.xml
compliance report-results report 2
 name              gold-check
 title             "GOLD NW 2"
 time              2015-02-04T18:51:48+00:00
 who               admin
 compliance-status violations
 location          http://.../report_2_admin_1_2015-2-4T18:51:48:0.text
compliance report-results report 3
 name              gold-check
 title             "GOLD NW 3"
 time              2015-02-04T19:11:43+00:00
 who               admin
 compliance-status violations
 location          http://.../report_3_admin_1_2015-2-4T19:11:43:0.text

There is also a remove action to remove report results (and the corresponding file):

ncs# compliance report-results report 2..3 remove
ncs# show compliance report-results
compliance report-results report 1
 name              gold-check
 title             "GOLD NW 1"
 time              2015-02-04T18:48:57+00:00
 who               admin
 compliance-status violations
 location          http://.../report_1_admin_1_2015-2-4T18:48:57:0.xml

When running the report, there are a number of parameters that can be specified with the specific run action.

The parameters that are possible to specify for a report run action are:

title: The title in the resulting report.
from: The date and time from which the report should start the information gathering. If not set, the oldest available information is implied.
to: The date and time when the information gathering should stop. If not set, the current date and time are implied. If set, no new check-syncs of devices and/or services will be attempted.
outformat: One of xml, html, text, or sqlite. If xml is specified, the report will formatted using the DocBook schema.

We will request a report run with a title and formatted as text.

ncs# compliance reports report gold-check run \
> title "My First Report" outformat text

In the above command, the report was run without a from or a to argument. This implies that historical information gathering will be based on all available information. This includes information gathered from rollback files.

When a from argument is supplied to a compliance report run action, this implies that only historical information younger than the from date and time is checked.

ncs# compliance reports report gold-check run \
> title "First check" from 2015-02-04T00:00:00

When a to argument is supplied, this implies that historical information will be gathered for all logged information up to the date and time of the to argument.

ncs# compliance reports report gold-check run \
> title "Second check" to 2015-02-05T00:00:00

The from and a to arguments can be combined to specify a fixed historic time interval.

ncs# compliance reports report gold-check run \
> title "Third check" from 2015-02-04T00:00:00 to 2015-02-05T00:00:00

When a compliance report is run, the action will respond with a flag indicating if any discrepancies were found. Also, it reports how many devices and services have been verified in total by the report.

ncs# compliance reports report gold-check run \
> title "Fourth check" outformat text
time 2015-2-4T20:42:45.019012+00:00
compliance-status violations
info Checking 17 devices and 2 services
location http://.../report_7_admin_1_2015-2-4T20:42:45.019012+00:00.text

Below is an example of a compliance report result (in text format):

Compliance Report Result

$ cat ./state/compliance-reports/report_7_admin_1_2015-2-4T20\:42\:45.019012+00\:00.text
reportcookie : g2gCbQAAAAtGaWZ0aCBjaGVja20AAAAKZ29sZC1jaGVjaw==

Compliance report : Fourth check

        Publication date : 2015-2-4 20:42:45
        Produced by user : admin

Chapter : Summary

        Compliance result titled "Fourth check" defined by report "gold-check"
        Resulting in violations
        Checking 17 devices and 2 services
        Produced 2015-2-4 20:42:45
        From : Oldest available information
        To : 2015-2-4 20:42:45

Devices out of sync

p0

        check-sync unsupported for device

p1

        check-sync unsupported for device

p2

        check-sync unsupported for device

p3

        check-sync unsupported for device

pe0

        check-sync unsupported for device

pe1

        check-sync unsupported for device

pe3

        check-sync unsupported for device



Template discrepancies

gold-conf

        Discrepancies in device
        ce0
        ce1
        ce2
        ce3


Chapter : Details


Commit list

        SeqNo   ID      User    Client  Timestamp            Label  Comment
        0       10031   admin   cli     2015-02-04 20:31:42
        1       10030   admin   cli     2015-02-04 20:03:41
        2       10029   admin   cli     2015-02-04 19:54:40
        3       10028   admin   cli     2015-02-04 19:45:20
        4       10027   admin   cli     2015-02-04 18:38:05


Service commit changes

        No service data commits saved for the time interval


Device commit changes

        No device data commits saved for the time interval


Service differences

        No service data diffs found


Template discrepancies details

gold-conf

Device ce0

 config {
     ios:snmp-server {
+        community public {
+        }
     }
 }

Device ce1

 config {
     ios:snmp-server {
+        community public {
+        }
     }
 }

Device ce2

 config {
     ios:snmp-server {
+        community public {
+        }
     }
 }

Device ce3

 config {
     ios:snmp-server {
+        community public {
+        }
     }
 }

Device Configuration Checks

Services are the preferred way to manage device configuration in NSO as they provide numerous benefits (see Why services? in Development). However, on your journey to full automation, perhaps you only use NSO to configure a subset of all the services (configuration) on the devices. In this case, you can still perform generic configuration validation on other parts with the help of device configuration checks.

Often, each device will have a somewhat different configuration, such as its own set of IP addresses, which makes checking against a static template impossible. For this reason, NSO supports compliance templates.

These templates are similar to but separate from, device templates. With compliance templates, you use regular expressions to check compliance, instead of simple fixed values. You can also define and reference variables that get their values when a report is run. All selected devices are then checked against the compliance template and the differences (if any) are reported as a compliance violation.

You can create a compliance template from scratch. For example, to check that the router uses only internal DNS servers from the 10.0.0.0/8 range, you might create a compliance template such as:

admin@ncs(config)# compliance template internal-dns
admin@ncs(config-template-internal-dns)# ned-id router-nc-1.0 config sys dns server 10\\\\..+

Here, the value of the /sys/dns/server must start with 10., followed by any string (the regular expression .+). Since a dot has a special meaning with regular expressions (any character), it must be escaped with a backslash to match only the actual dot character. But note the required multiple escaping (\\\\) in this case.

As these expressions can be non-trivial to construct, the templates have a check command that allows you to quickly check compliance for a set of devices, which is a great development aid.

admin@ncs(config)# show full-configuration devices device ex0 config sys dns server
devices device ex0
 config
  sys dns server 10.2.3.4
  !
  sys dns server 192.168.100.10
  !
 !
!
admin@ncs(config)# compliance template internal-dns
admin@ncs(config-template-internal-dns)# check device ex0
check-result {
    device ex0
    result violations
    diff  config {
     sys {
         dns {
+            # after server 10.2.3.4
+            /* No match of 10\\..+ */
+            server 192.168.100.10;
         }
     }
 }

}

Alternatively, you can use the /compliance/create-template action when you already have existing device templates that you would like to use as a starting point for a compliance template. For example:

admin@ncs(config)# show full-configuration devices template use-internal-dns
devices template use-internal-dns
 ned-id router-nc-1.0
  config
   ! Tags: replace (/devices/template{use-internal-dns}/ned-id{router-nc-1.0:router-nc-1.0}/config/r:sys/dns)
   sys dns server 10.8.8.8
   !
  !
 !
!
admin@ncs(config)# compliance create-template name internal-dns device-template use-internal-dns
admin@ncs(config)# show configuration
compliance template internal-dns
 ned-id router-nc-1.0
  config
   ! Tags: replace (/compliance/template{internal-dns}/ned-id{router-nc-1.0:router-nc-1.0}/config/r:sys/dns)
   sys dns server 10.8.8.8
   !
  !
 !
!
admin@ncs(config)# compliance template internal-dns
admin@ncs(config-template-internal-dns)# ned-id router-nc-1.0 config sys dns server 10\\\\..+

Finally, to use compliance templates in a report, reference them from device-check/template:

admin@ncs(config-report-gold-check)# device-check template internal-dns

Additional Configuration Checks

In some cases, it is insufficient to only check that the required configuration is present, as other configurations on the device can interfere with the desired functionality. For example, a service may configure a routing table entry for the 198.51.100.0/24 network. If someone also configures a more specific entry, say 198.51.100.0/28, that entry will take precedence and may interfere with the way the service requires the traffic to be routed. In effect, this additional configuration can render the service inoperable.

To help operators ensure there is no such extraneous configuration on the managed devices, the compliance reporting feature supports the so-called strict mode. This mode not only checks whether the required configuration is present but also reports any configuration present on the device that is not part of the template.

You can configure this mode in the report definition, when specifying the device template to check against, for example:

ncs(config)# compliance reports report gold-check
ncs(config-report-gold-check)# device-check template internal-dns strict

Listing Packages

View currently loaded packages.

NSO Packages contain data models and code for a specific function. It might be a NED for a specific device, a service application like MPLS VPN, a WebUI customization package, etc. Packages can be added, removed, and upgraded in run-time.

The currently loaded packages can be viewed with the following command:

Show Currently Loaded Packages

admin@ncs# show packages
packages package cisco-ios
 package-version 3.0
 description     "NED package for Cisco IOS"
 ncs-min-version [ 3.0.2 ]
 directory       ./state/packages-in-use/1/cisco-ios
 component upgrade-ned-id
  upgrade java-class-name com.tailf.packages.ned.ios.UpgradeNedId
 component cisco-ios
  ned cli ned-id  cisco-ios
  ned cli java-class-name com.tailf.packages.ned.ios.IOSNedCli
  ned device vendor Cisco
NAME      VALUE
---------------------
show-tag  interface

 build-info date "2015-01-29 23:40:12"
 build-info file ncs-3.4_HEAD-cisco-ios-3.0.tar.gz
 build-info arch linux.x86_64
 build-info java "compiled Java class data, version 50.0 (Java 1.6)"
 build-info package name cisco-ios
 build-info package version 3.0
 build-info package ref 3.0
 build-info package sha1 a8f1329
 build-info ncs version 3.4_HEAD
 build-info ncs sha1 81a1e4c
 build-info dev-support version 0.99
 build-info dev-support branch e4d3fa7
 build-info dev-support sha1 e4d3fa7
 oper-status up

Thus the above command shows that NSO currently has only one package loaded, the NED package for Cisco IOS. The output includes the name and version of the package, the minimum required NSO version, the Java components included, package build details, and finally the operational status of the package. The operational status is of particular importance - if it is anything other than up, it indicates that there was a problem with the loading or the initialization of the package. In this case, an item error-info may also be present, giving additional information about the problem. To show only the operational status for all loaded packages, this command can be used:

admin@ncs# show packages package * oper-status
packages package cisco-ios
 oper-status up

Lifecycle Operations

Manipulate and manage existing services and devices.

Devices and services are the most important entities in NSO. Once created, they may be manipulated in several different ways. The three main categories of operations that affect the state of services and devices are:

Commit Flags: Commit flags modify the transaction semantics.
Device Actions: Explicit actions that modify the devices.
Service Actions: Explicit actions that modify the services.

The purpose of this section is more of a quick reference guide, an enumeration of commonly used commands. The context in which these commands should be used is found in other parts of the documentation.

Commit Flags

Commit flags may be present when issuing a commit command:

commit <flag>

Some of these flags may be configured to apply globally for all commits, under /devices/global-settings, or per device profile, under /devices/profiles.

Some of the more important flags are:

and-quit: Exit to (CLI operational mode) after commit.
check: Validate the pending configuration changes. Equivalent to validate command (See NSO CLI ).
comment | label: Add a commit comment/label visible in compliance reports, rollback files, etc.
dry-run: Validate and display the configuration changes but do not perform the actual commit. Neither CDB nor the devices are affected. Instead, the effects that would have taken place are shown in the returned output. The output format can be set with the outformat option. Possible output formats are: xml, cli, and native.
- The xml format displays all changes in the whole data model. The changes will be displayed in NETCONF XML edit-config format, i.e., the edit-config that would be applied locally (at NCS) to get a config that is equal to that of the managed device.
- The cli format displays all changes in the whole data model. The changes will be displayed in CLI curly bracket format.
- The native format displays only changes under /devices/device/config. The changes will be displayed in native device format. The native format can be used with the reverse option to display the device commands for getting back to the current running state in the network if the commit is successfully executed. Beware that if any changes are done later on the same data, the reverse device commands returned are invalid.
no-networking: Validate the configuration changes, and update the CDB but do not update the actual devices. This is equivalent to first setting the admin state to southbound locked, then issuing a standard commit. In both cases, the configuration changes are prevented from being sent to the actual devices.
If the commit implies changes, it will make the device out-of-sync.
The sync-to command can then be used to push the change to the network.
no-out-of-sync-check: Commit even if the device is out of sync. This can be used in scenarios where you know that the change you are doing is not in conflict with what is on the device and do not want to perform the action sync-from first. Verify the result by using the action compare-config.
The device's sync state is assumed to be unknown after such a commit and the stored last-transaction-id value is cleared.
no-overwrite: NSO will check that the data that should be modified has not changed on the device compared to NSO's view of the data. This is a fine-granular sync check; NSO verifies that NSO and the device are in sync regarding the data that will be modified. If they are not in sync, the transaction is aborted. This parameter is particularly useful in brownfield scenarios where the device is always out of sync due to being directly modified by operators or other management systems.
The device's sync state is assumed to be unknown after such a commit and the stored last-transaction-id value is cleared.
no-revision-drop: Fail if one or more devices have obsolete device models. When NSO connects to a managed device the version of the device data model is discovered. Different devices in the network might have different versions. When NSO is requested to send configuration to devices, NSO defaults to drop any configuration that only exists in later models than the device supports. This flag forces NSO to never silently drop any data set operations towards a device.
no-deploy: Commit without invoking the service create method, i.e., write the service instance data without activating the service(s). The service(s) can later be redeployed to write the changes of the service(s) to the network.
reconcile: Reconcile the service data. All data which existed before the service was created will now be owned by the service. When the service is removed, that data will also be removed. In technical terms, the reference count will be decreased by one for everything that existed before the service. If manually configured data exists below in the configuration tree that data is kept unless the option discard-non-service-config is used.
use-lsa: Force handling of the LSA nodes as such. This flag tells NSO to propagate applicable commit flags and actions to the LSA nodes without applying them on the upper NSO node itself. The commit flags affected are: dry-run, no-networking, no-out-of-sync-check, no-overwrite and no-revision-drop.
no-lsa: Do not handle any of the LSA nodes as such. These nodes will be handled as any other device.
commit-queue: Commit through the commit queue (see Commit Queue). While the configuration change is committed to CDB immediately it is not committed to the actual device but rather queued for eventual commit to increase transaction throughput. This enables the use of the commit queue feature for individual commit commands without enabling it by default. Possible operation modes are: async, sync and bypass.
- If the async mode is set, the operation returns successfully if the transaction data has been successfully placed in the queue.
- The sync mode will cause the operation to not return until the transaction data has been sent to all devices, or a timeout occurs. If the timeout occurs the transaction data stays in the queue and the operation returns successfully. The timeout value can be specified with the timeout or infinity option. By default, the timeout value is determined by what is configured in /devices/global-settings/commit-queue/sync.
- The bypass mode means that if /devices/global-settings/commit-queue/enabled-by-default is true, the data in this transaction will bypass the commit queue. The data will be written directly to the devices. The operation will still fail if the commit queue contains one or more entries affecting the same device(s) as the transaction to be committed. In addition, the commit-queue flag has a number of other useful options that affect the resulting queue item:
- The tag option sets a user-defined opaque tag that is present in all notifications and events sent referencing the queue item.
- The block-others option will cause the resulting queue item to block subsequent queue items which use any of the devices in this queue item, from being queued.
- The lock option will place a lock on the resulting queue item. The queue item will not be processed until it has been unlocked, see the actions unlock and lock in /devices/commit-queue/queue-item. No following queue items, using the same devices, will be allowed to execute as long as the lock is in place.
- The atomic option sets the atomic behavior of the resulting queue item. If this is set to false, the devices contained in the resulting queue item can start executing if the same devices in other non-atomic queue items ahead of it in the queue are completed. If set to true, the atomic integrity of the queue item is preserved.
- Depending on the selected error-option, NSO will store the reverse of the original transaction to be able to undo the transaction changes and get back to the previous state. This data is stored in the /devices/commit-queue/completed tree from where it can be viewed and invoked with the rollback action. When invoked, the data will be removed. Possible values are: continue-on-error, rollback-on-error, and stop-on-error.
  - The continue-on-error value means that the commit queue will continue on errors. No rollback data will be created.
  - The rollback-on-error value means that the commit queue item will roll back on errors. The commit queue will place a lock on the failed queue item, thus blocking other queue items with overlapping devices from being executed. The rollback action will then automatically be invoked when the queue item has finished its execution. The lock will be removed as part of the rollback.
  - The stop-on-error means that the commit queue will place a lock on the failed queue item, thus blocking other queue items with overlapping devices from being executed. The lock must then either manually be released when the error is fixed, or the rollback action under /devices/commit-queue/completed be invoked.
  Read about error recovery in Commit Queue for a more detailed explanation.
trace-id: Use the provided trace ID as part of the log messages emitted while processing. If no trace ID is given, NSO is going to generate and assign a trace ID to the processing.

All commands in NSO can also have pipe commands. A useful pipe command for commit is details:

ncs% commit | details

This will give feedback on the steps performed in the commit.

When working with templates, there is a pipe command debug which can be used to troubleshoot templates. To enable debugging on all templates use:

ncs% commit | debug template

When configuring using many templates the debug output can be overwhelming. For this reason, there is an option to only get debug information for one template, in this example, a template named l3vpn:

ncs% commit | debug template l3vpn

Device Actions

Actions for devices can be performed globally on the /devices path and for individual devices on /devices/device/name. Many actions are also available on device groups as well as device ranges.

add-capability

This action adds a capability to the list of capabilities. If uri is specified, then it is parsed as a YANG capability string and module, revision, feature and deviation parameters are derived from the string. If module is specified, then the namespace is looked up in the list of loaded namespaces, and the capability string is constructed automatically. If the module is specified and the attempt to look it up fails, then the action does nothing. If module is specified or can be derived from the capability string, then the module is also added/replaced in the list of modules. This action is only intended to be used for pre-provisioning; it is not possible to override capabilities and modules provided by the NED implementation using this action.

apply-template

Take a named template and apply its configuration here.

If the accept-empty-capabilities parameter is included, the template is applied to devices even if the capability of the device is unknown.

This action will behave differently depending on whether it is invoked with a transaction or not. When invoked with a transaction (such as via the CLI) it will apply the template to it and leave it to the user to commit or revert the resulting changes. If invoked without a transaction (for example when invoked via RESTCONF), the action will automatically create one and commit the resulting changes. An error will be returned and the transaction aborted if the template failed to apply on any of the devices.

check-sync

Check if the NSO copy of the device configuration is in sync with the actual device configuration, using device-specific mechanisms. This operation is usually cheap as it only compares a signature of the configuration from the device rather than comparing the entire configuration.

Depending on the device the signature is implemented as a transaction-id, timestamp, hash-sum or not at all. The capability must be supported by the corresponding NED. The output might say unsupported, and then the only way to perform this would be to do a full compare-config command.

As some NEDs implement the signature as a hash-sum of the entire configuration, this operation might for some devices be just as expensive as performing a full compare-config command.

check-yang-modules

Check if the device YANG modules loaded by NSO have revisions that are compatible with the ones reported by the device.

This can indicate for example that the device has a YANG module of later revision than the corresponding NED.

clear-trace

Clear all trace files for all active traces for all managed devices.

compare-config

Retrieve the config from the device and compare it to the NSO locally stored copy.

connect

Set up a session to the unlocked device. This is not used in real operational scenarios. NSO automatically establishes connections on demand. However, it is useful for test purposes when installing new NEDs, adding devices, etc.

When a device is southbound locked, all southbound communication is turned off. The override-southbound-locked flag overrides the southbound lock for connection attempts. Thus, this is a way to update the capabilities including revision information for a managed device although the device is southbound locked.

copy-capabilities

This action copies the list of capabilities and the list of modules from another device or profile. When used on a device, this action is only intended to be used for pre-provisioning: it is not possible to override capabilities and modules provided by the NED implementation using this action.

Note that this action overwrites the existing list of capabilities.

delete-config

Delete the device configuration in NSO without executing the corresponding delete on the managed device.

disconnect

Close all sessions to the device.

fetch-ssh-host-keys

Retrieve the SSH host keys from all devices, or all devices in the given device group, and store them in each device's ssh/host-key list. Successfully retrieved new or updated keys are always committed by the action.

find-capabilities

This action populates the list of capabilities based on the configured ned-id for the device, if possible. NSO will look up the package corresponding to the ned-id and add all the modules from these packages to the list of device capabilities and list of modules. It is the responsibility of the caller to verify that the automatically populated list of capabilities matches the actual device's capabilities. The list of capabilities can then be fine-tuned using add-capability and capability/remove actions. Currently, this approach will only work for CLI and generic devices. This action is only intended to be used for pre-provisioning: it is not possible to override capabilities and modules provided by the NED implementation using this action.

Note that this action overwrites the existing list of capabilities.

instantiate-from-other-device

Instantiate the configuration for the device as a copy of the configuration of some other already working device.

load-native-config

Load configuration data in native format into the transaction. This action is only applicable to devices with NETCONF, CLI, and generic NEDs.

The action can load the configuration data either from a file in the local filesystem or as a string through the northbound client. If loading XML the data must be a valid XML document, either with a single namespace or wrapped in a config node with the http://tail-f.com/ns/config/1.0 namespace.

The verbose option can be used to show additional parse information reported by the NED. By default, the behavior is to merge the configuration that is applied. This can be changed by setting the mode option to replace. This will replace the entire device configuration.

This action will behave differently depending on if it is invoked with a transaction or not. When invoked with a transaction (such as via the CLI), it will load the configuration into it and leave it to the user to commit or revert the resulting changes. If invoked without a transaction (for example, when invoked via RESTCONF), the action will automatically create one and commit the resulting changes.

migrate

Change the NED identity and migrate all data. As a side-effect reads and commits the actual device configuration.

The action reports what paths have been modified and the services affected by those changes. If the verbose option is used, all service instances are reported instead of just the service points. If the dry-run option is used, the action simply reports what it would do.

If the no-networking option is used, no southbound traffic is generated toward the devices. Only the device configuration in CDB is used for the migration. If used, NSO can not know if the device is in sync. To determine this, the compare-config or the sync-from action must be used.

partial-sync-from

Synchronize parts of the devices' configuration by pulling from the network.

ping

ICMP ping the device.

scp-from

Secure copy the file from the device.

The port option specifies the port to connect to on the device. If this leaf is not configured, NSO will use the port for the management interface of the device.

The preserve option preserves modification times, access times, and modes from the original file. This is not always supported by the device.

scp-to

Secure copy file to the device.

The port option specifies the port to connect to on the device. If this leaf is not configured, NSO will use the port for the management interface of the device.

The preserve option preserves modification times, access times, and modes from the original file. This is not always supported by the device.

sync-from

Synchronize the NSO copy of the device configuration by reading the actual device configuration. The change will be immediately committed to NSO.

If the dry-run option is used, the action simply reports (in different formats) what it would do. The verbose option can be used to show additional parse information reported by the NED.

If you have any services that has created configuration on the device the corresponding service might be out-of-sync. Use the commands check-sync and re-deploy to reconcile this.

sync-to

Synchronize the device configuration by pushing the NSO copy to the device.

NSO pushes a minimal diff to the device. The diff is calculated by reading the configuration from the device and comparing it with the configuration in NSO.

If the dry-run option is used, the action simply reports (in different formats) what it would do.

Some of the operations above can't be performed while the device is being committed to (or waiting in the commit queue). This is to avoid getting inconsistent data when reading the configuration. The wait-for-lock option in these specifies a timeout to wait for a device lock to be placed in the commit queue. The lock will be automatically released once the action has been executed. If the no-wait-for-lock option is specified, the action will fail immediately for the device if the lock is taken for the device or if the device is placed in the commit queue. The wait-for-lock and the no-wait-for-lock options are device settings as well, they can be set as a device profile, device, and global setting. The no-wait-for-lock option is set in the global settings by default. If neither wait-for-lock and the no-wait-for-lock options are provided together with the action, the device setting is used.

Service Actions

Service actions are performed on the service instance.

check-sync

Check if the service has been undermined, i.e., if the service was to be re-deployed, would it do anything? This action will invoke the FASTMAP code to create the change set that is compared to the existing data in CDB locally.

If outformat is a boolean, true is returned if the service is in sync, i.e., a re-deploy would do nothing. If outformat is cli, xml or native, the changes that the service would do to the network if re-deployed are returned.

If configuration changes have been made out-of-band then deep-check-sync is needed to detect an out-of-sync condition.

The deep option is used to recursively check-sync stacked services. The shallow option only check-sync the topmost service.

deep-check-sync

Check if the service has been undermined on the device itself. The action check-sync compares the output of the service code to what is stored in CDB locally. This action retrieves the configuration from the devices touched by the service and compares the forward diff set of the service to the retrieved data. This is thus a fairly heavy-weight operation. As opposed to the check-sync action that invokes the FASTMAP code, this action re-applies the forward diff-set. This is the same output you see when inspecting the get-modifications operational field in the service instance.

If the device is in sync with CDB, the output of this action is identical to the output of the cheaper check-sync action.

get-modifications

Returns the data the service modified, either in CLI curly bracket format or NETCONF XML edit-config format. The modifications are shown as if the service instance was the only instance that modifies the data. This data is only available if the parameter /services/global-settings/collect-forward-diff is set to true.

If the parameter reverse is given, the modifications needed to reverse the effect of the service is shown. The modifications are shown as if this service instance was the last service instance. This will be applied if the service is deleted. This data is always available.

The deep option is used to recursively get-modifications for stacked services. The shallow option only get-modifications for the topmost service.

re-deploy

Run the service code again, possibly writing the changes of the service to the network once again. There are several reasons for performing this operation such as:

a device sync-from action has been performed to incorporate an out-of-band change.
data referenced by the service has changed such as topology information, QoS policy definitions, etc.

The deep option is used to recursively re-deploy stacked services. The shallow option only re-deploy the topmost service.

If the dry-run option is used, the action simply reports (in different formats) what it would do.

Use the option reconcile if the service should reconcile original data, i.e., take control of that data. This option acknowledges other services controlling the same data. All data which existed before the service was created will now be owned by the service. When the service is removed that data will also be removed. In technical terms, the reference count will be decreased by one for everything that existed prior to the service. If manually configured data exists below in the configuration tree that data is kept unless the option discard-non-service-config is used.

Note: The action is idempotent. If no configuration diff exists then nothing needs to be done.

Note: The NSO general principle of minimum change applies.

reactive-re-deploy

This is a tailored re-deploy intended to be used in the reactive FASTMAP scenario. It differs from the ordinary re-deploy in that this action does not take any commit parameters.

This action will re-deploy the services as a shallow depth re-deploy. It will be performed with the same user as the original commit. Also, the commit parameters will be identical to the latest commit involving this service.

By default, this action is asynchronous and returns nothing. Use the sync leaf to get synchronous behavior and block until the service re-deploy transaction is committed. The sync leaf also means that the action will possibly return a commit result, such as commit queue ID if any, or an error if the transaction failed.

touch

This action marks the service as changed.

Executing the action touch followed by a commit is the same as executing the action re-deploy shallow.

By using the action touch, several re-deploys can be performed in the same transaction.

un-deploy

Undo the effects of the service instance but keep the service itself. The service can later be re-deployed. This is a means to deactivate a service while keeping it in the system.

Network Simulator

Use NSO's network simulator to simulate your network and test functionality.

The ncs-netsim program is a useful tool to simulate a network of devices to be managed by NSO. It makes it easy to test NSO packages towards simulated devices. All you need is the NSO NED packages for the devices that you need to simulate. The devices are simulated with the Tail-f ConfD product.

All the NSO examples use ncs-netsim to simulate the devices. A good way to learn how to use ncs-netsim is to study them.

Using Netsim

The ncs-netsim tool takes any number of NED packages as input. The user can specify the number of device instances per package (device type) and a string that is used as a prefix for the name of the devices. The command takes the following parameters:

admin$ ncs-netsim --help
Usage ncs-netsim  [--dir <NetsimDir>]
            create-network <NcsPackage> <NumDevices> <Prefix> |
            create-device <NcsPackage> <DeviceName> |
            add-to-network <NcsPackage> <NumDevices> <Prefix> |
            add-device <NcsPackage> <DeviceName> |
            delete-network                     |
            [-a | --async]  start [devname]    |
            [-a | --async ] stop [devname]     |
            [-a | --async ] reset [devname]    |
            [-a | --async ] restart [devname]  |
            list                      |
            is-alive [devname]        |
            status [devname]          |
            whichdir                  |
            ncs-xml-init [devname]    |
            ncs-xml-init-remote <RemoteNodeName> [devname] |
            [--force-generic]         |
            packages                  |
            netconf-console devname [XpathFilter] |
            [-w | --window] [cli | cli-c | cli-i] devname

Assume that you have prepared an NSO package for a device called router. (See the examples.ncs/getting-started/developing-with-ncs/0-router-network example). Also, assume the package is in ./packages/router. At this point, you can create the simulated network by:

$ ncs-netsim create-network ./packages/router 3 device --dir ./netsim

This creates three devices; device0, device1, and device2. The simulated network is stored in the ./netsim directory. The output structure is:

          ./netsim/device/
               device0/<ConfD files>, <log files>
               device1/
               ....

There is one separate directory for every ConfD simulating the devices.

The network can be started with:

$ ncs-netsim start

You can add more devices to the network in a similar way as it was created. E.g. if you created a network with some Juniper devices and want to add some Cisco IOS devices. Point to the NED you want to use (See {NCS_DIR}/packages/neds/) and run the command. Remember to start the new devices after they have been added to the network.

$ ncs-netsim add-to-network ${NCS_DIR}/packages/neds/cisco-ios 2 c-device --dir ./netsim

To extract the device data from the simulated network to a file in XML format:

$ ncs-netsim ncs-xml-init > devices.xml

This data is usually used to load the simulated network into NSO. Putting the XML file in the ./ncs-cdb folder will load it when NSO starts. If NSO is already started it can be reloaded while running.

$ ncs_load -l -m devices.xml

The generated device data creates devices of the same type as the device being simulated. This is true for NETCONF, CLI, and SNMP devices. When simulating generic devices, the simulated device will run as a netconf device.

Under very special circumstances, one can choose to force running the simulation as a generic device with the option --force-generic.

The simulated network device info can be shown with:

 $ ncs-netsim list
...
 name=device0 netconf=12022 snmp=11022 ipc=5010 cli=10022 dir=examples.ncs/getting-started/developing-
with-ncs/0-router-network/netsim/device/device0
...

Here you can see the device name, the working directory, and the port number for different services to be accessed on the simulated device (NETCONF SSH, SNMP, IPC, and direct access to the CLI).

You can reach the CLI of individual devices with:

$ ncs-netsim cli-c device0

The simulated devices actually provide three different styles of CLI:

cli: J-Style
cli-c: Cisco XR Style
cli-i: Cisco IOS Style

Individual devices can be started and stopped with:

$ ncs-netsim start device0
$ ncs-netsim stop device0

You can check the status of the simulated network. Either a short version just to see if the device is running or a more verbose with all the information.

$ ncs-netsim is-alive device0
$ ncs-netsim status device0

View which packages are used in the simulated network:

$ ncs-netsim packages

It is also possible to reset the network back to the state of initialization:

$ ncs-netsim reset

When you are done, remove the network:

$ ncs-netsim delete-network

Using ConfD Tools with Netsim

The netsim tool includes a standard ConfD distribution and the ConfD C API library (libconfd) that the ConfD tools use. The library is built with default settings where the values for MAXDEPTH and MAXKEYLEN are 20 and 9, respectively. These values define the size of confd_hkeypath_t struct and this size is related to the size of data models in terms of depth and key lengths. Default values should be big enough even for very large and complex data models. But in some rare cases, one or both of these values might not be large enough for a given data model.

One might observe a limitation when the data models that are used by simulated devices exceed these limits. Then it would not be possible to use the ConfD tools that are provided with the netsim. To overcome this limitation, it is advised to use the corresponding NSO tools to perform desired tasks on devices.

NSO and ConfD tools and Python APIs are basically the same except for naming, the default IPC port and the MAXDEPTH and MAXKEYLEN values, where for NSO tools, the values are set to 60 and 18, respectively. Thus, the advised solution is to use the NSO tools and NSO Python API with netsim.

E.g. Instead of using the below command:

$ CONFD_IPC_PORT=5010 ${NCS_DIR}/netsim/confd/bin/confd_load -m -l *.xml

One may use:

$ NCS_IPC_PORT=5010 ncs_load -m -l *.xml

Learn More

The README file in examples.ncs/getting-started/developing-with-ncs/0-router-network gives a good introduction on how to use ncs-netsim.

Lifecycle Operations

Manipulate and manage existing services and devices.

Commit Flags: Commit flags modify the transaction semantics.
Device Actions: Explicit actions that modify the devices.
Service Actions: Explicit actions that modify the services.

Commit Flags

Commit flags may be present when issuing a commit command:

commit <flag>

Some of these flags may be configured to apply globally for all commits, under /devices/global-settings, or per device profile, under /devices/profiles.

Some of the more important flags are:

and-quit: Exit to (CLI operational mode) after commit.
check: Validate the pending configuration changes. Equivalent to validate command (See NSO CLI ).
comment | label: Add a commit comment/label visible in compliance reports, rollback files, etc.
dry-run: Validate and display the configuration changes but do not perform the actual commit. Neither CDB nor the devices are affected. Instead, the effects that would have taken place are shown in the returned output. The output format can be set with the outformat option. Possible output formats are: xml, cli, and native.
- The xml format displays all changes in the whole data model. The changes will be displayed in NETCONF XML edit-config format, i.e., the edit-config that would be applied locally (at NCS) to get a config that is equal to that of the managed device.
- The cli format displays all changes in the whole data model. The changes will be displayed in CLI curly bracket format.
- The native format displays only changes under /devices/device/config. The changes will be displayed in native device format. The native format can be used with the reverse option to display the device commands for getting back to the current running state in the network if the commit is successfully executed. Beware that if any changes are done later on the same data, the reverse device commands returned are invalid.
no-networking: Validate the configuration changes, and update the CDB but do not update the actual devices. This is equivalent to first setting the admin state to southbound locked, then issuing a standard commit. In both cases, the configuration changes are prevented from being sent to the actual devices.
If the commit implies changes, it will make the device out-of-sync.
The sync-to command can then be used to push the change to the network.
no-out-of-sync-check: Commit even if the device is out of sync. This can be used in scenarios where you know that the change you are doing is not in conflict with what is on the device and do not want to perform the action sync-from first. Verify the result by using the action compare-config.
The device's sync state is assumed to be unknown after such a commit and the stored last-transaction-id value is cleared.
no-overwrite: NSO will check that the data that should be modified has not changed on the device compared to NSO's view of the data. This is a fine-granular sync check; NSO verifies that NSO and the device are in sync regarding the data that will be modified. If they are not in sync, the transaction is aborted. This parameter is particularly useful in brownfield scenarios where the device is always out of sync due to being directly modified by operators or other management systems.
The device's sync state is assumed to be unknown after such a commit and the stored last-transaction-id value is cleared.
no-revision-drop: Fail if one or more devices have obsolete device models. When NSO connects to a managed device the version of the device data model is discovered. Different devices in the network might have different versions. When NSO is requested to send configuration to devices, NSO defaults to drop any configuration that only exists in later models than the device supports. This flag forces NSO to never silently drop any data set operations towards a device.
no-deploy: Commit without invoking the service create method, i.e., write the service instance data without activating the service(s). The service(s) can later be redeployed to write the changes of the service(s) to the network.
reconcile: Reconcile the service data. All data which existed before the service was created will now be owned by the service. When the service is removed, that data will also be removed. In technical terms, the reference count will be decreased by one for everything that existed before the service. If manually configured data exists below in the configuration tree that data is kept unless the option discard-non-service-config is used.
use-lsa: Force handling of the LSA nodes as such. This flag tells NSO to propagate applicable commit flags and actions to the LSA nodes without applying them on the upper NSO node itself. The commit flags affected are: dry-run, no-networking, no-out-of-sync-check, no-overwrite and no-revision-drop.
no-lsa: Do not handle any of the LSA nodes as such. These nodes will be handled as any other device.
commit-queue: Commit through the commit queue (see Commit Queue). While the configuration change is committed to CDB immediately it is not committed to the actual device but rather queued for eventual commit to increase transaction throughput. This enables the use of the commit queue feature for individual commit commands without enabling it by default. Possible operation modes are: async, sync and bypass.
- If the async mode is set, the operation returns successfully if the transaction data has been successfully placed in the queue.
- The sync mode will cause the operation to not return until the transaction data has been sent to all devices, or a timeout occurs. If the timeout occurs the transaction data stays in the queue and the operation returns successfully. The timeout value can be specified with the timeout or infinity option. By default, the timeout value is determined by what is configured in /devices/global-settings/commit-queue/sync.
- The bypass mode means that if /devices/global-settings/commit-queue/enabled-by-default is true, the data in this transaction will bypass the commit queue. The data will be written directly to the devices. The operation will still fail if the commit queue contains one or more entries affecting the same device(s) as the transaction to be committed. In addition, the commit-queue flag has a number of other useful options that affect the resulting queue item:
- The tag option sets a user-defined opaque tag that is present in all notifications and events sent referencing the queue item.
- The block-others option will cause the resulting queue item to block subsequent queue items which use any of the devices in this queue item, from being queued.
- The lock option will place a lock on the resulting queue item. The queue item will not be processed until it has been unlocked, see the actions unlock and lock in /devices/commit-queue/queue-item. No following queue items, using the same devices, will be allowed to execute as long as the lock is in place.
- The atomic option sets the atomic behavior of the resulting queue item. If this is set to false, the devices contained in the resulting queue item can start executing if the same devices in other non-atomic queue items ahead of it in the queue are completed. If set to true, the atomic integrity of the queue item is preserved.
- Depending on the selected error-option, NSO will store the reverse of the original transaction to be able to undo the transaction changes and get back to the previous state. This data is stored in the /devices/commit-queue/completed tree from where it can be viewed and invoked with the rollback action. When invoked, the data will be removed. Possible values are: continue-on-error, rollback-on-error, and stop-on-error.
  - The continue-on-error value means that the commit queue will continue on errors. No rollback data will be created.
  - The rollback-on-error value means that the commit queue item will roll back on errors. The commit queue will place a lock on the failed queue item, thus blocking other queue items with overlapping devices from being executed. The rollback action will then automatically be invoked when the queue item has finished its execution. The lock will be removed as part of the rollback.
  - The stop-on-error means that the commit queue will place a lock on the failed queue item, thus blocking other queue items with overlapping devices from being executed. The lock must then either manually be released when the error is fixed, or the rollback action under /devices/commit-queue/completed be invoked.
  Read about error recovery in Commit Queue for a more detailed explanation.
trace-id: Use the provided trace ID as part of the log messages emitted while processing. If no trace ID is given, NSO is going to generate and assign a trace ID to the processing.

All commands in NSO can also have pipe commands. A useful pipe command for commit is details:

ncs% commit | details

This will give feedback on the steps performed in the commit.

When working with templates, there is a pipe command debug which can be used to troubleshoot templates. To enable debugging on all templates use:

ncs% commit | debug template

ncs% commit | debug template l3vpn

Device Actions

Actions for devices can be performed globally on the /devices path and for individual devices on /devices/device/name. Many actions are also available on device groups as well as device ranges.

add-capability

apply-template

Take a named template and apply its configuration here.

If the accept-empty-capabilities parameter is included, the template is applied to devices even if the capability of the device is unknown.

check-sync

As some NEDs implement the signature as a hash-sum of the entire configuration, this operation might for some devices be just as expensive as performing a full compare-config command.

check-yang-modules

Check if the device YANG modules loaded by NSO have revisions that are compatible with the ones reported by the device.

This can indicate for example that the device has a YANG module of later revision than the corresponding NED.

clear-trace

Clear all trace files for all active traces for all managed devices.

compare-config

Retrieve the config from the device and compare it to the NSO locally stored copy.

connect

copy-capabilities

Note that this action overwrites the existing list of capabilities.

delete-config

Delete the device configuration in NSO without executing the corresponding delete on the managed device.

disconnect

Close all sessions to the device.

fetch-ssh-host-keys

find-capabilities

Note that this action overwrites the existing list of capabilities.

instantiate-from-other-device

Instantiate the configuration for the device as a copy of the configuration of some other already working device.

load-native-config

Load configuration data in native format into the transaction. This action is only applicable to devices with NETCONF, CLI, and generic NEDs.

migrate

Change the NED identity and migrate all data. As a side-effect reads and commits the actual device configuration.

partial-sync-from

Synchronize parts of the devices' configuration by pulling from the network.

ping

ICMP ping the device.

scp-from

Secure copy the file from the device.

The port option specifies the port to connect to on the device. If this leaf is not configured, NSO will use the port for the management interface of the device.

The preserve option preserves modification times, access times, and modes from the original file. This is not always supported by the device.

scp-to

Secure copy file to the device.

The port option specifies the port to connect to on the device. If this leaf is not configured, NSO will use the port for the management interface of the device.

The preserve option preserves modification times, access times, and modes from the original file. This is not always supported by the device.

sync-from

Synchronize the NSO copy of the device configuration by reading the actual device configuration. The change will be immediately committed to NSO.

If the dry-run option is used, the action simply reports (in different formats) what it would do. The verbose option can be used to show additional parse information reported by the NED.

If you have any services that has created configuration on the device the corresponding service might be out-of-sync. Use the commands check-sync and re-deploy to reconcile this.

sync-to

Synchronize the device configuration by pushing the NSO copy to the device.

NSO pushes a minimal diff to the device. The diff is calculated by reading the configuration from the device and comparing it with the configuration in NSO.

If the dry-run option is used, the action simply reports (in different formats) what it would do.

Service Actions

Service actions are performed on the service instance.

check-sync

If configuration changes have been made out-of-band then deep-check-sync is needed to detect an out-of-sync condition.

The deep option is used to recursively check-sync stacked services. The shallow option only check-sync the topmost service.

deep-check-sync

If the device is in sync with CDB, the output of this action is identical to the output of the cheaper check-sync action.

get-modifications

The deep option is used to recursively get-modifications for stacked services. The shallow option only get-modifications for the topmost service.

re-deploy

Run the service code again, possibly writing the changes of the service to the network once again. There are several reasons for performing this operation such as:

a device sync-from action has been performed to incorporate an out-of-band change.
data referenced by the service has changed such as topology information, QoS policy definitions, etc.

The deep option is used to recursively re-deploy stacked services. The shallow option only re-deploy the topmost service.

If the dry-run option is used, the action simply reports (in different formats) what it would do.

Note: The action is idempotent. If no configuration diff exists then nothing needs to be done.

Note: The NSO general principle of minimum change applies.

reactive-re-deploy

This is a tailored re-deploy intended to be used in the reactive FASTMAP scenario. It differs from the ordinary re-deploy in that this action does not take any commit parameters.

touch

This action marks the service as changed.

Executing the action touch followed by a commit is the same as executing the action re-deploy shallow.

By using the action touch, several re-deploys can be performed in the same transaction.

un-deploy

Undo the effects of the service instance but keep the service itself. The service can later be re-deployed. This is a means to deactivate a service while keeping it in the system.

Basic Operations

Learn basic operational scenarios and common CLI commands.

This section helps you to get started with NSO, learn basic operational scenarios, and get acquainted with the most common CLI commands.

Setup

Make sure that you have installed NSO and that you have sourced the ncsrc file in $NCS_DIR. This sets up the paths and environment variables to run NSO. As this must be done every time before running NSO, it is recommended to add it to your profile.

We will use the NSO network simulator to simulate three Cisco IOS routers. NSO will talk Cisco CLI to those devices. You will use the NSO CLI and Web UI to perform the tasks. Sometimes you will use the native Cisco device CLI to inspect configuration or do out-of-band changes.

Note that both the NSO software (NCS) and the simulated network devices run on your local machine.

Starting the Simulator

To start the simulator:

Go to examples.ncs/getting-started/using-ncs/1-simulated-cisco-ios. First of all, we will generate a network simulator with three Cisco devices. They will be called c0, c1, and c2.
Most of this section follows the procedure in the README file, so it is useful to have it opened as well.
Perform the following command:
```
$ ncs-netsim create-network $NCS_DIR/packages/neds/cisco-ios 3 c
```
This creates three simulated devices all running Cisco IOS and they will be named c0, c1, c2.

Start the simulator.

$ ncs-netsim start
DEVICE c0 OK STARTED
DEVICE c1 OK STARTED
DEVICE c2 OK STARTED

Run the CLI toward one of the simulated devices.

$ ncs-netsim cli-i c1
admin connected from 127.0.0.1 using console *

c1> enable
c1# show running-config
class-map m
match mpls experimental topmost 1
match packet length max 255
match packet length min 2
match qos-group 1
!
...
c1# exit

This shows that the device has some initial configurations.

Starting NSO and Reading Device Configuration

The previous step started the simulated Cisco devices. It is now time to start NSO.

The first action is to prepare directories needed for NSO to run and populate NSO with information on the simulated devices. This is all done with the ncs-setup command. Make sure that you are in the examples.ncs/getting-started/using-ncs/1-simulated-cisco-ios directory. (Again ignore the details for the time being).
```
$ ncs-setup --netsim-dir ./netsim --dest . 
```
Note the . at the end of the command referring to the current directory. What the command does is create directories needed for NSO in the current directory and populate NSO with devices that are running in netsim. We call this the "run-time" directory.
Start NSO.
```
$ ncs
```
Start the NSO CLI as the user admin with a Cisco XR-style CLI.
```
$ ncs_cli -C -u admin
```
NSO also supports a J-style CLI, that is started by using a -J modification to the command like this.
```
$ ncs_cli -J -u admin
```
Throughout this user guide, we will show the commands in Cisco XR style.
At this point, NSO only knows the address, port, and authentication information of the devices. This management information was loaded to NSO by the setup utility. It also tells NSO how to communicate with the devices by using NETCONF, SNMP, Cisco IOS CLI, etc. However, at this point, the actual configuration of the individual devices is unknown.
```
admin@ncs# show running-config devices device
devices device c0
 address   127.0.0.1
 port      10022
...
 authgroup default
 device-type cli ned-id cisco-ios
 state admin-state unlocked
 config
  no ios:service pad
  no ios:ip domain-lookup
  no ios:ip http secure-server
  ios:ip source-route
 !
!  ...
```

Let us analyze the above CLI command. First of all, when you start the NSO CLI it starts in operational mode, so to show configuration data, you have to explicitly run show running-config.

NSO manages a list of devices, each device is reached by the path devices device "name" . You can use standard tab completion in the CLI to learn this.

The address and port fields tells NSO where to connect to the device. For now, they all live in local host with different ports. The device-type structure tells NSO it is a CLI device and the specific CLI is supported by the Network Element Driver (NED) cisco-ios. A more detailed explanation of how to configure the device-type structure and how to choose NEDs will be addressed later in this guide.

So now NSO can try to connect to the devices:

admin@ncs# devices connect
connect-result {
    device c0
    result true
    info (admin) Connected to c0 - 127.0.0.1:10022
}
connect-result {
    device c1
    result true
    info (admin) Connected to c1 - 127.0.0.1:10023
}
connect-result {
    device c2
    result true
    info (admin) Connected to c2 - 127.0.0.1:10024
}....

NSO does not need to have the connections active continuously, instead, NSO will establish a connection when needed and connections are pooled to conserve resources. At this time, NSO can read the configurations from the devices and populate the configuration database, CDB.

The following command will synchronize the configurations of the devices with the CDB and respond with true if successful:

admin@ncs# devices sync-from
sync-result {
    device c0
    result true
}....

The NSO data store, CDB, will store the configuration for every device at the path devices device "name" config, everything after this path is the configuration in the device. NSO keeps this synchronized. The synchronization is managed with the following principles:

At initialization, NSO can discover the configuration as shown above.
The modus operandi when using NSO to perform configuration changes is that the network engineer uses NSO (CLI, WebUI, REST,...) to modify the representation in NSO CDB. The changes are committed to the network as a transaction that includes the actual devices. Only if all changes happen on the actual devices, will it be committed to the NSO data store. The transaction also covers the devices so if any of the devices participating in the transaction fails, NSO will roll back the configuration changes on all modified devices. This works even in the case of devices that do not natively support roll-back like Cisco IOS CLI.
NSO can detect out-of-band changes and reconcile them by either updating the CDB or modifying the configuration on the devices to reflect the currently stored configuration.

NSO only needs to be synchronized with the devices in the event of a change being made outside of NSO. Changes made using NSO will reflected in both the CDB and the devices. The following actions do not need to be taken:

Perform configuration change via NSO.
Perform sync-from action.

The above incorrect (or not necessary) sequence stems from the assumption that the NSO CLI talks directly to the devices. This is not the case; the northbound interfaces in NSO modify the configuration in the NSO data store, NSO calculates a minimum difference between the current configuration and the new configuration, giving only the changes to the configuration to the NEDS that runs the commands to the devices. All this as one single change-set.

View the configuration of the c0 device using the command:

admin@ncs# show running-config devices device c0 config
devices device c0
 config
  no ios:service pad
  ios:ip vrf my-forward
   bgp next-hop Loopback 1
  !
...

Or, show a particular piece of configuration from several devices:

admin@ncs# show running-config devices device c0..2 config ios:router
devices device c0
 config
  ios:router bgp 64512
   aggregate-address 10.10.10.1 255.255.255.251
   neighbor 1.2.3.4 remote-as 1
   neighbor 1.2.3.4 ebgp-multihop 3
   neighbor 2.3.4.5 remote-as 1
   neighbor 2.3.4.5 activate
   neighbor 2.3.4.5 capability orf prefix-list both
   neighbor 2.3.4.5 weight 300
  !
 !
!
devices device c1
 config
  ios:router bgp 64512
...

Or, show a particular piece of configuration from all devices:

admin@ncs# show running-config devices device config ios:router

The CLI can pipe commands, try TAB after | to see various pipe targets:

admin@ncs# show running-config devices device config ios:router \
                     | display xml | save router.xml

The above command shows the router config of all devices as XML and then saves it to a file router.xml.

Writing Device Configuration

To change the configuration, enter configure mode.

admin@ncs# config
Entering configuration mode terminal
admin@ncs(config)#

Change or add some configuration across the devices, for example:

 admin@ncs(config)# devices device c0..2 config ios:router bgp 64512
                       neighbor 10.10.10.0 remote-as 64502
admin@ncs(config-router)#

Transaction Commit

It is important to understand how NSO applies configuration changes to the network. At this point, the changes are local to NSO, no configurations have been sent to the devices yet. Since the NSO Configuration Database, CDB is in sync with the network, NSO can calculate the minimum diff to apply the changes to the network.

The command below compares the ongoing changes with the running database:

admin@ncs(config-router)# top
admin@ncs(config)# show configuration
devices device c0
 config
  ios:router bgp 64512
   neighbor 10.10.10.0 remote-as 64502
...

It is possible to dry-run the changes to see the native Cisco CLI output (in this case almost the same as above):

admin@ncs(config)# commit dry-run outformat native
native {
    device {
        name c0
        data router bgp 64512
              neighbor 10.10.10.0 remote-as 64502
             !
...

The changes can be committed to the devices and the NSO CDB simultaneously with a single commit. In the commit command below, we pipe to details to understand the actions being taken.

admin@ncs% commit | details

Transaction Rollback

Changes are committed to the devices and the NSO database as one transaction. If any of the device configurations fail, all changes will be rolled back and the devices will be left in the state that they were in before the commit and the NSO CDB will not be updated.

There are numerous options to the commit command which will affect the behavior of the atomic transactions:

admin@ncs(config)# commit TAB
Possible completions:
  and-quit               Exit configuration mode
  check                  Validate configuration
  comment                Add a commit comment
  commit-queue           Commit through commit queue
  label                  Add a commit label
  no-confirm             No confirm
  no-networking          Send nothing to the devices
  no-out-of-sync-check   Commit even if out of sync
  no-overwrite           Do not overwrite modified data on the device
  no-revision-drop       Fail if device has too old data model
  save-running           Save running to file
  ---
  dry-run                Show the diff but do not perform commit

As seen by the details output, NSO stores a roll-back file for every commit so that the whole transaction can be rolled back manually. The following is an example of a rollback file:

admin@ncs(config)# do file show logs/rollback1000
Possible completions:
     rollback10001  rollback10002  rollback10003  \
                           rollback10004  rollback10005
admin@ncs(config)# do file show logs/rollback10005
# Created by: admin
# Date: 2014-09-03 14:35:10
# Via: cli
# Type: delta
# Label:
# Comment:
# No: 10005

ncs:devices {
    ncs:device c0 {
        ncs:config {
            ios:router {
                ios:bgp 64512 {
                    delete:
                    ios:neighbor 10.10.10.0;
                }
            }
        }
    }

(Viewing files as an operational command, prefixing a command in configuration mode with do executes in operational mode.) To perform a manual rollback, first load the rollback file:

admin@ncs(config)# rollback-files apply-rollback-file fixed-number 10005

apply-rollback-file by default restores to that saved configuration, adding selective as a parameter allows you to just roll back the delta in that specific rollback file. Show the differences:

admin@ncs(config)# show configuration
devices device c0
 config
  ios:router bgp 64512
   no neighbor 10.10.10.0 remote-as 64502
  !
 !
!
devices device c1
 config
  ios:router bgp 64512
   no neighbor 10.10.10.0 remote-as 64502
  !
 !
!
devices device c2
 config
  ios:router bgp 64512
   no neighbor 10.10.10.0 remote-as 64502
  !
 !
!

Commit the rollback:

admin@ncs(config)# commit
Commit complete.

Trace Log

A trace log can be created to see what is going on between NSO and the device CLI enable trace. Use the following command to enable trace:

admin@ncs(config)# devices global-settings trace raw trace-dir logs
admin@ncs(config)# commit
Commit complete.
admin@ncs(config)# devices disconnect

Note that the trace settings only take effect for new connections, so it is important to disconnect the current connections. Make a change to for example c0:

admin@ncs(config)# devices device c0 config ios:interface FastEthernet
                                1/2 ip address  192.168.1.1 255.255.255.0
admin@ncs(config-if)# commit dry-run outformat native
admin@ncs(config-if)# commit

Note the use of the command commit dry-run outformat native. This will display the net result device commands that will be generated over the native interface without actually committing them to the CDB or the devices. In addition, there is the possibility to append the reverse flag that will display the device commands for getting back to the current running state in the network if the commit is successfully executed.

Exit from the NSO CLI and return to the Unix Shell. Inspect the CLI trace:

 less logs/ned-cisco-ios-c0.trace

More on Device Management

Device Groups

As seen above, ranges can be used to send configuration commands to several devices. Device groups can be created to allow for group actions that do not require naming conventions. A group can reference any number of devices. A device can be part of any number of groups, and groups can be hierarchical.

The command sequence below creates a group of core devices and a group with all devices. Note that you can use tab completion when adding the device names to the group. Also, note that it requires configuration mode. (If you are still in the Unix Shell from the steps above, do $ncs_cli -C -u admin).

admin@ncs(config)# devices device-group core device-name [ c0 c1 ]
admin@ncs(config-device-group-core)# commit

admin@ncs(config)# devices device-group all device-name c2 device-group core
admin@ncs(config-device-group-all)# commit

admin@ncs(config)# show full-configuration devices device-group
devices device-group all
 device-name  [ c2 ]
 device-group [ core ]
!
devices device-group core
 device-name [ c0 c1 ]
!

admin@ncs(config)# do show devices device-group
NAME  MEMBER        INDETERMINATES  CRITICALS  MAJORS  MINORS  WARNINGS
-------------------------------------------------------------------------
all   [ c0 c1 c2 ]  0               0          0       0       0
core  [ c0 c1 ]     0               0          0       0       0

Note well the do show which shows the operational data for the groups. Device groups have a member attribute that shows all member devices, flattening any group members.

Device groups can contain different devices as well as devices from different vendors. Configuration changes will be committed to each device in its native language without needing to be adjusted in NSO.

You can, for example, at this point use the group to check if all core are in sync:

admin@ncs# devices device-group core check-sync
sync-result {
    device c0
    result in-sync
}
sync-result {
    device c1
    result in-sync
}

Device Templates

Assume that we would like to manage permit lists across devices. This can be achieved by defining templates and applying them to device groups. The following CLI sequence defines a tiny template, called community-list :

admin@ncs(config)# devices template community-list
                                ned-id cisco-ios-cli-3.0
                                config ios:ip
                                community-list standard test1
                                permit permit-list 64000:40

admin@ncs(config-permit-list-64000:40)# commit
Commit complete.
admin@ncs(config-permit-list-64000:40)# top

admin@ncs(config)# show full-configuration devices template
devices template community-list
 config
  ios:ip community-list standard test1
   permit permit-list 64000:40
   !
  !
 !
!
[ok][2013-08-09 11:27:28]

This can now be applied to a device group:

admin@ncs(config)# devices device-group core apply-template \
                                 template-name community-list
admin@ncs(config)# show configuration
devices device c0
 config
  ios:ip community-list standard test1 permit 64000:40
 !
!
devices device c1
 config
  ios:ip community-list standard test1 permit 64000:40
 !
!
admin@ncs(config)# commit dry-run outformat native
native {
    device {
        name c0
        data ip community-list standard test1 permit 64000:40
    }
    device {
        name c1
        data ip community-list standard test1 permit 64000:40
    }
}
admin@ncs(config)# commit
Commit complete.

What if the device group core contained different vendors? Since the configuration is written in IOS the above template would not work on Juniper devices. Templates can be used on different device types (read NEDs) by using a prefix for the device model. The template would then look like:

template community-list {
  config {
    junos:configuration {
    ...
    }
    ios:ip {
    ...
    }

The above indicates how NSO manages different models for different device types. When NSO connects to the devices, the NED checks the device type and revision and returns that to NSO. This can be inspected (note, in operational mode):

admin@ncs# show devices device module
NAME  NAME                       REVISION    FEATURES  DEVIATIONS
-------------------------------------------------------------------
c0    tailf-ned-cisco-ios        2014-02-12  -         -
      tailf-ned-cisco-ios-stats  2014-02-12  -         -
c1    tailf-ned-cisco-ios        2014-02-12  -         -
      tailf-ned-cisco-ios-stats  2014-02-12  -         -
c2    tailf-ned-cisco-ios        2014-02-12  -         -
      tailf-ned-cisco-ios-stats  2014-02-12  -         -

So here we see that c0 uses a tailf-ned-cisco-ios module which tells NSO which data model to use for the device. Every NED package comes with a YANG data model for the device (except for third-party YANG NED for which the YANG device model must be downloaded and fixed before it can be used). This renders the NSO data store (CDB) schema, the NSO CLI, WebUI, and southbound commands.

The model introduces namespace prefixes for every configuration item. This also resolves issues around different vendors using the same configuration command for different configuration elements. Note that every item is prefixed with ios:

admin@ncs# show running-config devices device c0 config ios:ip community-list
devices device c0
 config
  ios:ip community-list 1 permit
  ios:ip community-list 2 deny
  ios:ip community-list standard s permit
  ios:ip community-list standard test1 permit 64000:40
 !
!

Another important question is how to control if the template merges the list or replaces the list. This is managed via tags. The default behavior of templates is to merge the configuration. Tags can be inserted at any point in the template. Tag values are merge, replace, delete, create and nocreate.

Assume that c0 has the following configuration:

admin@ncs# show running-config devices device c0 config ios:ip community-list
devices device c0
 config
  ios:ip community-list 1 permit
  ios:ip community-list 2 deny
  ios:ip community-list standard s permit}

If we apply the template the default result would be:

admin@ncs# show running-config devices device c0 config ios:ip community-list
devices device c0
 config
  ios:ip community-list 1 permit
  ios:ip community-list 2 deny
  ios:ip community-list standard s permit
  ios:ip community-list standard test1 permit 64000:40
 !
!

We could change the template in the following way to get a result where the permit list would be replaced rather than merged. When working with tags in templates, it is often helpful to view the template as a tree rather than a command view. The CLI has a display option for showing a curly-braces tree view that corresponds to the data-model structure rather than the command set. This makes it easier to see where to add tags.

admin@ncs(config)# show full-configuration devices template
devices template community-list
 config
  ios:ip community-list standard test1
   permit permit-list 64000:40
   !
  !
 !
!
admin@ncs(config)# show full-configuration devices \
                                 template | display curly-braces
template community-list {
    config {
        ios:ip {
            community-list {
                standard test1 {
                    permit {
                        permit-list 64000:40;
                    }
                }
            }
        }
    }
}


admin@ncs(config)# tag add devices template community-list
                                ned-id cisco-ios-cli-3.0
                                config ip community-list replace
admin@ncs(config)# commit
Commit complete.
admin@ncs(config)# show full-configuration devices
                                 template | display curly-braces
template community-list {
    config {
        ios:ip {
            /* Tags: replace */
            community-list {
                standard test1 {
                    permit {
                        permit-list 64000:40;
                    }
                }
            }
        }
    }
}

Different tags can be added across the template tree. If we now apply the template to the device c0 which already have community lists, the following happens:

admin@ncs(config)# show full-configuration devices device c0 \
                                 config ios:ip community-list
devices device c0
 config
  ios:ip community-list 1 permit
  ios:ip community-list 2 deny
  ios:ip community-list standard s permit
  ios:ip community-list standard test1 permit 64000:40
 !
!
admin@ncs(config)# devices device c0 apply-template \
                                 template-name community-list
admin@ncs(config)# show configuration
devices device c0
 config
  no ios:ip community-list 1 permit
  no ios:ip community-list 2 deny
  no ios:ip community-list standard s permit
 !
!

Any existing values in the list are replaced in this case. The following tags are available:

merge (default): the template changes will be merged with the existing template.
replace: the template configuration will be replaced by the new configuration.
create: the template will create those nodes that do not exist. If a node already exists this will result in an error.
nocreate: the merge will only affect configuration items that already exist in the template. It will never create the configuration with this tag, or any associated commands inside it. It will only modify existing configuration structures.
delete: delete anything from this point.

Note that a template can have different tags along the tree nodes.

A problem with the above template is that every value is hard-coded. What if you wanted a template where the community-list name and permit-list value are variables passed to the template when applied? Any part of a template can be a variable, (or actually an XPATH expression). We can modify the template to use variables in the following way:

admin@ncs(config)# no devices template community-list config ios:ip \
                                community-list standard test1
admin@ncs(config)# devices template community-list config ios:ip \
                                community-list standard \
                                {$LIST-NAME} permit permit-list {$AS}

admin@ncs(config-permit-list-{$AS})# commit
Commit complete.

admin@ncs(config-permit-list-{$AS})# top
admin@ncs(config)# show full-configuration devices template
devices template community-list
 config
  ios:ip community-list standard {$LIST-NAME}
   permit permit-list {$AS}
   !
  !
 !
!

The template now requires two parameters when applied (tab completion will prompt for the variable):

admin@ncs(config)# devices device-group all apply-template
template-name community-list variable { name LIST-NAME value 'test2' }
variable { name AS value '60000:30' }

admin@ncs(config)# commit

Note, that the replace tag was still part of the template and it would delete any existing community lists, which is probably not the desired outcome in the general case.

The template mechanism described so far is "fire-and-forget". The templates do not have any memory of what happened to the network, or which devices they touched. A user can modify the templates without anything happening to the network until an explicit apply-template action is performed. (Templates are of course, as all configuration changes, applied as a transaction). NSO also supports service templates that are more advanced in many ways, more information on this will be presented later in this guide.

Also, note that device templates have some additional restrictions on the values that can be supplied when applying the template. In particular, a value must either be a number or a single-quoted string. It is currently not possible to specify a value that contains a single quote (').

Policies

To make sure that configuration is applied according to site or corporate rules, you can use policies. Policies are validated at every commit, they can be of type error that implies that the change cannot go through or a warning which means that you have to confirm a configuration that gives a warning.

A policy is composed of:

Policy name.
Iterator: loop over a path in the model, for example, all devices, all services of a specific type.
Expression: a boolean expression that must be true for every node returned from the iterator, for example, SNMP must be turned on.
Warning or error: a message displayed to the user. If it is of the type warning, the user can still commit the change, if of type error the change cannot be made.

An example is shown below:

admin@ncs(config)# policy rule class-map
Possible completions:
  error-message     Error message to print on expression failure
  expr              XPath 1.0 expression that returns a boolean
  foreach           XPath 1.0 expression that returns a node set
  warning-message   Warning message to print on expression failure

admin@ncs(config)# policy rule class-map foreach /devices/device \
       expr config/ios:class-map[name='a'] \
       warning-message "Device {name} must have a class-map a"

admin@ncs(config-rule-class-map)# top

admin@ncs(config)# commit
Commit complete.

admin@ncs(config)# show full-configuration policy
policy rule class-map
 foreach         /devices/device
 expr            config/ios:class-map[ios:name='a']
 warning-message "Device {name} must have a class-map a"
!

Now, if we try to delete a class-map a, we will get a policy violation:

admin@ncs(config)# no devices device c2 config ios:class-map match-all a
admin@ncs(config)# validate
Validation completed with warnings:
  Device c2 must have a class-map a

admin@ncs(config)# commit
The following warnings were generated:
  Device c2 must have a class-map a
Proceed? [yes,no] yes
Commit complete.

admin@ncs(config)# validate
Validation completed with warnings:
  Device c2 must have a class-map a

The {name} variable refers to the node set from the iterator. This node-set will be the list of devices in NSO and the devices have an attribute called 'name'.

To understand the syntax for the expressions a pipe target in the CLI can be used:

admin@ncs(config)# show full-configuration devices device c2 config \
                                 ios:class-map | display xpath
/ncs:devices/ncs:device[ncs:name='c2']/ncs:config/ \
ios:class-map[ios:name='cmap1']/ios:prematch match-all
...

To debug policies look at the end of logs/xpath.trace. This file will show all validated XPATH expressions and any errors.

4-Sep-2014::11:05:30.103 Evaluating XPath for policy: class-map:
  /devices/device
get_next(/ncs:devices/device) = {c0}
XPath policy match: /ncs:devices/device{c0}
get_next(/ncs:devices/device{c0}) = {c1}
XPath policy match: /ncs:devices/device{c1}
get_next(/ncs:devices/device{c1}) = {c2}
XPath policy match: /ncs:devices/device{c2}
get_next(/ncs:devices/device{c2}) = false
exists("/ncs:devices/device{c2}/config/class-map{a}") = true
exists("/ncs:devices/device{c1}/config/class-map{a}") = true
exists("/ncs:devices/device{c0}/config/class-map{a}") = true

Validation scripts can also be defined in Python, see more about that in Plug-and-Play Scripting.

Out-of-band Changes, Transactions, and Pre-Provisioning

In reality, network engineers will still modify configurations using other tools like out-of-band CLI or other management interfaces. It is important to understand how NSO manages this. The NSO network simulator supports CLI towards the devices. For example, we can use the IOS CLI on say c0 and delete a permit-list.

From the UNIX shell, start a CLI session towards c0.

$ ncs-netsim cli-i c0

c0> enable
c0# configure
Enter configuration commands, one per line. End with CNTL/Z.

c0(config)# show full-configuration ip community-list
ip community-list standard test1 permit
ip community-list standard test2 permit 60000:30
c0(config)# no ip community-list standard test2
c0(config)#
c0# exit
$

Start the NSO CLI again:

$ ncs_cli -C -u admin

NSO detects if its configuration copy in CDB differs from the configuration in the device. Various strategies are used depending on device support: transaction IDs, time stamps, and configuration hash-sums. For example, an NSO user can request a check-sync operation:

admin@ncs# devices check-sync
sync-result {
    device c0
    result out-of-sync
    info got: e54d27fe58fda990797d8061aa4d5325 expected: 36308bf08207e994a8a83af710effbf0

}
sync-result {
    device c1
    result in-sync
}
sync-result {
    device c2
    result in-sync
}

admin@ncs# devices device-group core check-sync
sync-result {
    device c0
    result out-of-sync
    info got: e54d27fe58fda990797d8061aa4d5325 expected: 36308bf08207e994a8a83af710effbf0

}
sync-result {
    device c1
    result in-sync
}

NSO can also compare the configurations with the CDB and show the difference:

admin@ncs# devices device c0 compare-config
diff
 devices {
     device c0 {
         config {
             ios:ip {
                 community-list {
+                    standard test1 {
+                        permit {
+                        }
+                    }
-                    standard test2 {
-                        permit {
-                            permit-list 60000:30;
-                        }
-                    }
                 }
             }
         }
     }
 }

At this point, we can choose if we want to use the configuration stored in the CDB as the valid configuration or the configuration on the device:

admin@ncs# devices sync-
Possible completions:
  sync-from   Synchronize the config by pulling from the devices
  sync-to     Synchronize the config by pushing to the devices

admin@ncs# devices sync-to

In the above example, we chose to overwrite the device configuration from NSO.

NSO will also detect out-of-sync when committing changes. In the following scenario, a local c0 CLI user adds an interface. Later the NSO user tries to add an interface:

$ ncs-netsim cli-i c0

c0> enable
c0# configure
Enter configuration commands, one per line. End with CNTL/Z.
c0(config)#  interface FastEthernet 1/0 ip address 192.168.1.1 255.255.255.0
c0(config-if)#
c0# exit

$ ncs_cli -C -u admin

admin@ncs# config
Entering configuration mode terminal

admin@ncs(config)# devices device c0 config ios:interface \
       FastEthernet1/1 ip address 192.168.1.1 255.255.255.0

admin@ncs(config-if)# commit
Aborted: Network Element Driver: device c0: out of sync

At this point, we have two diffs:

The device and NSO CDB (devices device compare-config).
The ongoing transaction and CDB (show configuration).

admin@ncs(config)# devices device c0 compare-config
diff
 devices {
     device c0 {
         config {
             ios:interface {
                 FastEthernet 1/0 {
                     ip {
                         address {
                             primary {
+                                mask 255.255.255.0;
+                                address 192.168.1.1;
                             }
                         }
                     }
                 }
             }
         }
     }
 }

admin@ncs(config)# show configuration
devices device c0
 config
  ios:interface FastEthernet1/1
   ip address 192.168.1.1 255.255.255.0
  exit
 !
!

To resolve this, you can choose to synchronize the configuration between the devices and the CDB before committing. There is also an option to over-ride the out-of-sync check:

admin@ncs(config)# commit no-out-of-sync-check

Or:

admin@ncs(config)# devices global-settings out-of-sync-commit-behaviour
Possible completions:
  accept  reject

As noted before, all changes are applied as complete transactions of all configurations on all of the devices. Either all configuration changes are completed successfully or all changes are removed entirely. Consider a simple case where one of the devices is not responding. For the transaction manager, an error response from a device or a non-responding device, are both errors and the transaction should automatically rollback to the state before the commit command was issued.

Stop c0:

$ ncs-netsim stop c0
DEVICE c0 STOPPED

Go back to the NSO CLI and perform a configuration change over c0 and c1:

admin@ncs(config)# devices device c0 config ios:ip community-list \
                                 standard test3 permit 50000:30
admin@ncs(config-config)# devices device c1 config ios:ip \
                                community-list standard test3 permit 50000:30

admin@ncs(config-config)# top
admin@ncs(config)# show configuration
devices device c0
 config
  ios:ip community-list standard test3 permit 50000:30
 !
!
devices device c1
 config
  ios:ip community-list standard test3 permit 50000:30
 !
!

admin@ncs(config)# commit
Aborted: Failed to connect to device c0: connection refused: Connection refused
admin@ncs(config)# *** ALARM connection-failure: Failed to connect to
device c0: connection refused: Connection refused

NSO sends commands to all devices in parallel, not sequentially. If any of the devices fail to accept the changes or report an error, NSO will issue a rollback to the other devices. Note, that this works also for non-transactional devices like IOS CLI and SNMP. This works even for non-symmetrical cases where the rollback command sequence is not just the reverse of the commands. NSO does this by treating the rollback as it would any other configuration change. NSO can use the current configuration and previous configuration and generate the commands needed to roll back from the configuration changes.

The diff configuration is still in the private CLI session, it can be restored, modified (if the error was due to something in the config), or in some cases, fix the device.

NSO is not a best-effort configuration management system. The error reporting coupled with the ability to completely rollback failed changes to the devices, ensures that the configurations stored in the CDB and the configurations on the devices are always consistent and that no failed or orphan configurations are left on the devices.

First of all, if the above was not a multi-device transaction, meaning that the change should be applied independently device per device, then it is just a matter of performing the commit between the devices.

Second, NSO has a commit flag commit-queue async or commit-queue sync. The commit queue should primarily be used for throughput reasons when doing configuration changes in large networks. Atomic transactions come with a cost, the critical section of the database is locked when committing the transaction on the network. So, in cases where there are northbound systems of NSO that generate many simultaneous large configuration changes these might get queued. The commit queue will send the device commands after the lock has been released, so the database lock is much shorter. If any device fails, an alarm will be raised.

admin@ncs(config)# commit commit-queue async
commit-queue-id 2236633674
Commit complete.

admin@ncs(config)# do show devices commit-queue | notab
devices commit-queue queue-item 2236633674
 age       11
 status    executing
 devices   [ c0 c1 c2 ]
 transient c0
  reason "Failed to connect to device c0: connection refused"
 is-atomic true

Go to the UNIX shell, start the device, and monitor the commit queue:

$ncs-netsim start c0
DEVICE c0 OK STARTED

$ncs_cli -C -u admin

admin@ncs# show devices commit-queue
devices commit-queue queue-item 2236633674
 age       11
 status    executing
 devices   [ c0 c1 c2 ]
 transient c0
  reason "Failed to connect to device c0: connection refused"
 is-atomic true

admin@ncs# show devices commit-queue
devices commit-queue queue-item 2236633674
 age       11
 status    executing
 devices   [ c0 c1 c2 ]
 is-atomic true

admin@ncs# show devices commit-queue
% No entries found.

Devices can also be pre-provisioned, this means that the configuration can be prepared in NSO and pushed to the device when it is available. To illustrate this, we can start by adding a new device to NSO that is not available in the network simulator:

admin@ncs(config)# devices device c3 address 127.0.0.1 port 10030 \
                                authgroup default device-type cli
                                ned-id cisco-ios
admin@ncs(config-device-c3)# state admin-state southbound-locked
admin@ncs(config-device-c3)# commit

Above, we added a new device to NSO with an IP address local host, and port 10030. This device does not exist in the network simulator. We can tell NSO not to send any commands southbound by setting the admin-state to southbound-locked (actually the default). This means that all configuration changes will succeed, and the result will be stored in CDB. At any point in time when the device is available in the network, the state can be changed and the complete configuration pushed to the new device. The CLI sequence below also illustrates a powerful copy configuration command that can copy any configuration from one device to another. The from and to paths are separated by the keyword to.

admin@ncs(config)# copy cfg merge devices device c0 config \
                                ios:ip community-list to \
                                devices device c3 config ios:ip community-list
admin@ncs(config)# show configuration
devices device c3
 config
  ios:ip community-list standard test2 permit 60000:30
  ios:ip community-list standard test3 permit 50000:30
 !
!


admin@ncs(config)# commit

admin@ncs(config)# devices check-sync
...

sync-result {
    device c3
    result locked
}

As shown above, check-sync operations will tell the user that the device is southbound locked. When the device is available in the network, the device can be synchronized with the current configuration in the CDB using the sync-to action.

About Conflicts

Different users or management tools can of course run parallel sessions to NSO. All ongoing sessions have a logical copy of CDB. An important case needs to be understood if there is a conflict when multiple users attempt to modify the same device configuration at the same time with different changes. First, let's look at the CLI sequence below, user admin to the left, user joe to the right.

admin@ncs(config)# devices device c0 config ios:snmp-server community fozbar

      joe@ncs(config)# devices device c0 config ios:snmp-server community fezbar

admin@ncs(config-config)# commit

      System message at 2014-09-04 13:15:19...
      Commit performed by admin via console using cli.
      joe@ncs(config-config)# commit
      joe@ncs(config)# show full-configuration devices device c0 config ios:snmp-server
      devices device c0
        config
          ios:snmp-server community fezbar
          ios:snmp-server community fozbar
        !
      !

There is no conflict in the above sequence, community is a list so both joe and admin can add items to the list. Note that user joe gets information about the user admin committing.

On the other hand, if two users modify an ordered-by user list in such a way that one user rearranges the list, along with other non-conflicting modifications, and one user deletes the entire list, the following happens:

admin@ncs(config)# no devices device c0 config access-list 10

      joe@ncs(config)# move devices device c0 config access-list 10 permit 168.215.202.0 0.0.0.255 first
      joe@ncs(config)# devices device c0 config logging history informational
      joe@ncs(config)# devices device c0 config logging source-interface Vlan512
      joe@ncs(config)# devices device c0 config logging 10.1.22.122
      joe@ncs(config)# devices device c0 config logging 66.162.108.21
      joe@ncs(config)# devices device c0 config logging 50.58.29.21

admin@ncs% commit

      System message at 2022-09-01 14:17:59...
      Commit performed by admin via console using cli.
      joe@ncs(config-config)# commit
      Aborted: Transaction 542 conflicts with transaction 562 started by user admin: 'devices device c0 config access-list 10' read-op on-descendant write-op delete in work phase(s)
      --------------------------------------------------------------------------
      This transaction is in a non-resolvable state.
      To attempt to reapply the configuration changes made in the CLI,
      in a new transaction, revert the current transaction by running
      the command 'revert' followed by the command 'reapply-commands'.
      --------------------------------------------------------------------------

In this case, joe commits a change to access-list after admin and a conflict message is displayed. Since the conflict is non-resolvable, the transaction has to be reverted. To reapply the changes made by joe to logging in a new transaction, the following commands are entered:

      joe@ncs(config)# revert no-confirm
      joe@ncs(config)# reapply-commands best-effort
      move devices device c0 config access-list 10 permit 168.215.202.0 0.0.0.255 first
      Error: on line 1: move devices device c0 config access-list 10 permit 168.215.202.0 0.0.0.255 first
      devices device c0 config
      logging history informational
      logging facility local0
      logging source-interface Vlan512
      logging 10.1.22.122
      logging 66.162.108.21
      logging 50.58.29.21
      joe@ncs(config-config)# show config
      logging facility local0
      logging history informational
      logging 10.1.22.122
      logging 50.58.29.21
      logging 66.162.108.21
      logging source-interface Vlan512
      joe@ncs(config-config)# commit
      Commit complete.

In this case, joe tries to reapply the changes made in the previous transaction and since access-list 10 has been removed, the move command will fail when applied by the reapply-commands command. Since the mode is best-effort, the next command will be processed. The changes to logging will succeed and joe then commits the transaction.

Manage Network Services

Manage the life-cycle of network services.

NSO can also manage the life-cycle for services like VPNs, BGP peers, and ACLs. It is important to understand what is meant by service in this context:

NSO abstracts the device-specific details. The user only needs to enter attributes relevant to the service.
The service instance has configuration data itself that can be represented and manipulated.
A service instance configuration change is applied to all affected devices.

Service Configuration Features

The following are the features that NSO uses to support service configuration:

Service Modeling: Network engineers can model the service attributes and the mapping to device configurations. For example, this means that a network engineer can specify at data-model for VPNs with router interfaces, VLAN ID, VRF, and route distinguisher.
Service Life-cycle: While less sophisticated configuration management systems can only create an initial service instance in the network they do not support changing or deleting a service instance. With NSO you can at any point in time modify service elements like the VLAN id of a VPN and NSO can generate the corresponding changes to the network devices.
Service Instance: The NSO service instance has configuration data that can be represented and manipulated. The service model on run-time updates all NSO northbound interfaces so that a network engineer can view and manipulate the service instance over CLI, WebUI, REST, etc.
References between Service Instances and Device Configuration: NSO maintains references between service instances and device configuration. This means that a VPN instance knows exactly which device configurations it created or modified. Every configuration stored in the CDB is mapped to the service instance that created it.

Service Example

An example is the best method to illustrate how services are created and used in NSO. As described in the sections about devices and NEDs, it was said that NEDs come in packages. The same is true for services, either if you design the services yourself or use ready-made service applications, it ends up in a package that is loaded into NSO.

Watch a video presentation of this demo on YouTube.

The example examples.ncs/service-provider/mpls-vpn will be used to explain NSO Service Management features. This example illustrates Layer-3 VPNs in a service provider MPLS network. The example network consists of Cisco ASR 9k and Juniper core routers (P and PE) and Cisco IOS-based CE routers. The Layer-3 VPN service configures the CE/PE routers for all endpoints in the VPN with BGP as the CE/PE routing protocol. The layer-2 connectivity between CE and PE routers is expected to be done through a Layer-2 ethernet access network, which is out of scope for this example. The Layer-3 VPN service includes VPN connectivity as well as bandwidth and QOS parameters.

The service configuration only has references to CE devices for the end-points in the VPN. The service mapping logic reads from a simple topology model that is configuration data in NSO, outside the actual service model and derives what other network devices to configure.

The topology information has two parts:

The first part lists connections in the network and is used by the service mapping logic to find out which PE router to configure for an endpoint. The snippets below show the configuration output in the Cisco-style NSO CLI.

 topology connection c0
 endpoint-1 device ce0 interface GigabitEthernet0/8 ip-address 192.168.1.1/30
 endpoint-2 device pe0 interface GigabitEthernet0/0/0/3 ip-address 192.168.1.2/30
 link-vlan 88
!
topology connection c1
 endpoint-1 device ce1 interface GigabitEthernet0/1 ip-address 192.168.1.5/30
 endpoint-2 device pe1 interface GigabitEthernet0/0/0/3 ip-address 192.168.1.6/30
 link-vlan 77
!

The second part lists devices for each role in the network and is in this example only used to dynamically render a network map in the Web UI.
```
topology role ce
 device [ ce0 ce1 ce2 ce3 ce4 ce5 ]
!
topology role pe
 device [ pe0 pe1 pe2 pe3 ]
!
```

The QOS configuration in service provider networks is complex and often requires a lot of different variations. It is also often desirable to be able to deliver different levels of QOS. This example shows how a QOS policy configuration can be stored in NSO and referenced from VPN service instances. Three different levels of QOS policies are defined; GOLD, SILVER, and BRONZE with different queuing parameters.

 qos qos-policy GOLD
 class BUSINESS-CRITICAL
  bandwidth-percentage 20
 !
 class MISSION-CRITICAL
  bandwidth-percentage 20
 !
 class REALTIME
  bandwidth-percentage 20
  priority
 !
!
qos qos-policy SILVER
 class BUSINESS-CRITICAL
  bandwidth-percentage 25
 !
 class MISSION-CRITICAL
  bandwidth-percentage 25
 !
 class REALTIME
  bandwidth-percentage 10
 !

Three different traffic classes are also defined with a DSCP value that will be used inside the MPLS core network as well as default rules that will match traffic to a class.

qos qos-class BUSINESS-CRITICAL
 dscp-value af21
 match-traffic ssh
  source-ip      any
  destination-ip any
  port-start     22
  port-end       22
  protocol       tcp
 !
!
qos qos-class MISSION-CRITICAL
 dscp-value af31
 match-traffic call-signaling
  source-ip      any
  destination-ip any
  port-start     5060
  port-end       5061
  protocol       tcp
 !
!

Running the Example

Run the example as follows:

Make sure that you start clean, i.e. no old configuration data is present. If you have been running this or some other example before, make sure to stop any NSO or simulated network nodes (ncs-netsim) that you may have running. Output like 'connection refused (stop)' means no previous NSO was running and 'DEVICE ce0 connection refused (stop)...' no simulated network was running, which is good.
```
Copy$ 
```
This will set up the environment and start the simulated network.
Before creating a new L3VPN service, we must sync the configuration from all network devices and then enter config mode. (A hint for this complete section is to have the README file from the example and cut and paste the CLI commands).
```
Copyncs# 
```

Add another VPN.

top
!
vpn l3vpn ford
as-number 65200
endpoint main-office
ce-device    ce2
ce-interface GigabitEthernet0/5
ip-network   192.168.1.0/24
bandwidth    10000000
!
endpoint branch-office1
ce-device    ce3
ce-interface GigabitEthernet0/5
ip-network   192.168.2.0/24
bandwidth    5500000
!
endpoint branch-office2
ce-device    ce5
ce-interface GigabitEthernet0/5
ip-network   192.168.7.0/24
bandwidth    1500000
!

The above sequence showed how NSO can be used to manipulate service abstractions on top of devices. Services can be defined for various purposes such as VPNs, Access Control Lists, firewall rules, etc. Support for services is added to NSO via a corresponding service package.

A service package in NSO comprises two parts:

Service model: the attributes of the service, and input parameters given when creating the service. In this example name, as-number, and end-points.
Mapping: what is the corresponding configuration of the devices when the service is applied. The result of the mapping can be inspected by the commit dry-run outformat native command.

We later show how to define this, for now, assume that the job is done.

Service-Life Cycle Management

Service Changes

When NSO applies services to the network, NSO stores the service configuration along with resulting device configuration changes. This is used as a base for the FASTMAP algorithm which automatically can derive device configuration changes from a service change.

Example 1

Going back to the example L3 VPN above, any part of volvo VPN instance can be modified.

A simple change like changing the as-number on the service results in many changes in the network. NSO does this automatically.

ncs(config)# vpn l3vpn volvo as-number 65102
ncs(config-l3vpn-volvo)# commit dry-run outformat native
native {
    device {
        name ce0
        data no router bgp 65101
             router bgp 65102
              neighbor 192.168.1.2 remote-as 100
              neighbor 192.168.1.2 activate
              network 10.10.1.0
             !
...
ncs(config-l3vpn-volvo)# commit

Example 2

Let us look at a more challenging modification.

A common use case is of course to add a new CE device and add that as an end-point to an existing VPN. Below is the sequence to add two new CE devices and add them to the VPNs. (In the CLI snippets below we omit the prompt to enhance readability).

First, we add them to the topology:

top
!
topology connection c7
endpoint-1 device ce7 interface GigabitEthernet0/1 ip-address 192.168.1.25/30
endpoint-2 device pe3 interface GigabitEthernet0/0/0/2 ip-address 192.168.1.26/30
link-vlan 103
!
topology connection c8
endpoint-1 device ce8 interface GigabitEthernet0/1 ip-address 192.168.1.29/30
endpoint-2 device pe3 interface GigabitEthernet0/0/0/2 ip-address 192.168.1.30/30
link-vlan 104
!
ncs(config)#commit

Note well that the above just updates NSO local information on topological links. It has no effect on the network. The mapping for the L3 VPN services does a look-up in the topology connections to find the corresponding pe router.

Next, we add them to the VPNs:

top
!
vpn l3vpn ford
endpoint new-branch-office
ce-device    ce7
ce-interface GigabitEthernet0/5
ip-network   192.168.9.0/24
bandwidth    4500000
!
vpn l3vpn volvo
endpoint new-branch-office
ce-device    ce8
ce-interface GigabitEthernet0/5
ip-network   10.8.9.0/24
bandwidth    4500000
!

Before we send anything to the network, let's look at the device configuration using a dry run. As you can see, both new CE devices are connected to the same PE router, but for different VPN customers.

ncs(config)#commit dry-run outformat native

Finally, commit the configuration to the network

(config)#commit

Service Impacting Out-of-band Changes

Next, we will show how NSO can be used to check if the service configuration in the network is up to date.

In a new terminal window, we connect directly to the device ce0 which is a Cisco device emulated by the tool ncs-netsim.

$ ncs-netsim cli-c ce0

We will now reconfigure an edge interface that we previously configured using NSO.

 enable
ce0# configure
Enter configuration commands, one per line. End with CNTL/Z.
ce0(config)# no policy-map volvo
ce0(config)# exit
ce0# exit

Going back to the terminal with NSO, check the status of the network configuration:

ncs# devices check-sync
sync-result {
    device ce0
    result out-of-sync
    info got: c5c75ee593246f41eaa9c496ce1051ea expected: c5288cc0b45662b4af88288d29be8667
...

ncs# vpn l3vpn * check-sync
vpn l3vpn ford check-sync
    in-sync true
vpn l3vpn volvo check-sync
    in-sync true

ncs# vpn l3vpn * deep-check-sync
vpn l3vpn ford deep-check-sync
    in-sync true
vpn l3vpn volvo deep-check-sync
    in-sync false

The CLI sequence above performs 3 different comparisons:

Real device configuration versus device configuration copy in NSO CDB.
Expected device configuration from the service perspective and device configuration copy in CDB.
Expected device configuration from the service perspective and real device configuration.

Notice that the service volvo is out of sync with the service configuration. Use the check-sync outformat cli to see what the problem is:

ncs# vpn l3vpn volvo deep-check-sync outformat cli
cli  devices {
         devices {
             device ce0 {
                 config {
    +                ios:policy-map volvo {
    +                    class class-default {
    +                        shape {
    +                            average {
    +                                bit-rate 12000000;
    +                            }
    +                        }
    +                    }
    +                }
                 }
             }
         }
     }

Assume that a network engineer considers the real device configuration to be authoritative:

ncs# devices device ce0 sync-from
result true

And then restore the service:

ncs# vpn l3vpn volvo re-deploy dry-run { outformat native } 
native {
    device {
        name ce0
        data policy-map volvo
               class class-default
                shape average 12000000
               !
              !

    }
}
ncs# vpn l3vpn volvo re-deploy

Service Deletion

In the same way, as NSO can calculate any service configuration change, it can also automatically delete the device configurations that resulted from creating services:

ncs(config)# no vpn l3vpn ford
ncs(config)# commit dry-run
cli  devices {
         device ce7
             config {
    -            ios:policy-map ford {
    -                class class-default {
    -                    shape {
    -                        average {
    -                            bit-rate 4500000;
    -                    }
    -                }
    -            }
    -        }
...

It is important to understand the two diffs shown above. The first diff as an output to show configuration shows the diff at the service level. The second diff shows the output generated by NSO to clean up the device configurations.

Finally, we commit the changes to delete the service.

(config)# commit

Viewing Service Configurations

Service instances live in the NSO data store as well as a copy of the device configurations. NSO will maintain relationships between these two.

Show the configuration for a service

ncs(config)# show full-configuration vpn l3vpn
vpn l3vpn volvo
 as-number 65102
 endpoint branch-office1
  ce-device    ce1
  ce-interface GigabitEthernet0/11
  ip-network   10.7.7.0/24
  bandwidth    6000000
 !
...

You can ask NSO to list all devices that are touched by a service and vice versa:

ncs# show vpn l3vpn device-list
NAME   DEVICE LIST
----------------------------------------
volvo  [ ce0 ce1 ce4 ce8 pe0 pe2 pe3 ]


ncs# show devices device service-list
NAME  SERVICE LIST
-------------------------------------
ce0   [ "/l3vpn:vpn/l3vpn{volvo}" ]
ce1   [ "/l3vpn:vpn/l3vpn{volvo}" ]
ce2   [  ]
ce3   [  ]
ce4   [ "/l3vpn:vpn/l3vpn{volvo}" ]
ce5   [  ]
ce6   [  ]
ce7   [  ]
ce8   [ "/l3vpn:vpn/l3vpn{volvo}" ]
p0    [  ]
p1    [  ]
p2    [  ]
p3    [  ]
pe0   [ "/l3vpn:vpn/l3vpn{volvo}" ]
pe1   [  ]
pe2   [ "/l3vpn:vpn/l3vpn{volvo}" ]
pe3   [ "/l3vpn:vpn/l3vpn{volvo}" ]

Note that operational mode in the CLI was used above. Every service instance has an operational attribute that is maintained by the transaction manager and shows which device configuration it created. Furthermore, every device configuration has backward pointers to the corresponding service instances:

ncs(config)# show full-configuration devices device ce3 \
                    config | display service-meta-data
devices device ce3
 config
  ...
  /* Refcount: 1 */
  /* Backpointer: [ /l3vpn:vpn/l3vpn:l3vpn[l3vpn:name='ford'] ] */
  ios:interface GigabitEthernet0/2.100
   /* Refcount: 1 */
   description Link to PE / pe1 - GigabitEthernet0/0/0/5
   /* Refcount: 1 */
   encapsulation dot1Q 100
   /* Refcount: 1 */
   ip address 192.168.1.13 255.255.255.252
   /* Refcount: 1 */
   service-policy output ford
  exit

ncs(config)# show full-configuration devices device ce3 config \
                     | display curly-braces | display service-meta-data
...
ios:interface {
    GigabitEthernet 0/1;
    GigabitEthernet 0/10;
    GigabitEthernet 0/11;
    GigabitEthernet 0/12;
    GigabitEthernet 0/13;
    GigabitEthernet 0/14;
    GigabitEthernet 0/15;
    GigabitEthernet 0/16;
    GigabitEthernet 0/17;
    GigabitEthernet 0/18;
    GigabitEthernet 0/19;
    GigabitEthernet 0/2;
    /* Refcount: 1 */
    /* Backpointer: [ /l3vpn:vpn/l3vpn:l3vpn[l3vpn:name='ford'] ] */
    GigabitEthernet 0/2.100 {
        /* Refcount: 1 */
        description "Link to PE / pe1 - GigabitEthernet0/0/0/5";
        encapsulation {
            dot1Q {
                /* Refcount: 1 */
                vlan-id 100;
            }
        }
        ip {
            address {
                primary {
                    /* Refcount: 1 */
                    address 192.168.1.13;
                    /* Refcount: 1 */
                    mask    255.255.255.252;
                }
            }
        }
        service-policy {
            /* Refcount: 1 */
            output ford;
        }
    }

ncs(config)# show full-configuration devices device ce3 config \
                   | display service-meta-data | context-match Backpointer
devices device ce3
  /* Refcount: 1 */
  /* Backpointer: [ /l3vpn:vpn/l3vpn:l3vpn[l3vpn:name='ford'] ] */
  ios:interface GigabitEthernet0/2.100
devices device ce3
  /* Refcount: 2 */
  /* Backpointer: [ /l3vpn:vpn/l3vpn:l3vpn[l3vpn:name='ford'] ] */
  ios:interface GigabitEthernet0/5

The reference counter above makes sure that NSO will not delete shared resources until the last service instance is deleted. The context-match search is helpful, it displays the path to all matching configuration items.

Using Commit Queues

As described in Commit Queue, the commit queue can be used to increase the transaction throughput. When the commit queue is for service activation, the services will have states reflecting outstanding commit queue items.

When committing a service using the commit queue in async mode the northbound system can not rely on the service being fully activated in the network when the activation requests return.

We will now commit a VPN service using the commit queue and one device is down.

$ ncs-netsim stop ce0
DEVICE ce0 STOPPED

ncs(config)# show configuration
vpn l3vpn volvo
 as-number 65101
 endpoint branch-office1
  ce-device    ce1
  ce-interface GigabitEthernet0/11
  ip-network   10.7.7.0/24
  bandwidth    6000000
 !
 endpoint main-office
  ce-device    ce0
  ce-interface GigabitEthernet0/11
  ip-network   10.10.1.0/24
  bandwidth    12000000
 !
!

ncs# commit commit-queue async
commit-queue-id 10777927137
Commit complete.
ncs(config)# *** ALARM connection-failure: Failed to connect to device ce0: connection refused: Connection refused

This service is not provisioned fully in the network, since ce0 was down. It will stay in the queue either until the device starts responding or when an action is taken to remove the service or remove the item. The commit queue can be inspected. As shown below we see that we are waiting for ce0. Inspecting the queue item shows the outstanding configuration.

ncs# show devices commit-queue | notab
devices commit-queue queue-item 10777927137
 age       1934
 status    executing
 devices   [ ce0 ce1 pe0 ]
 transient ce0
  reason "Failed to connect to device ce0: connection refused"
 is-atomic true

ncs# show vpn l3vpn volvo commit-queue | notab
commit-queue queue-item 1498812003922

The commit queue will constantly try to push the configuration towards the devices. The number of retry attempts and at what interval they occur can be configured.

ncs# show full-configuration devices global-settings commit-queue | details
devices global-settings commit-queue enabled-by-default false
devices global-settings commit-queue atomic true
devices global-settings commit-queue retry-timeout 30
devices global-settings commit-queue retry-attempts unlimited

If we start ce0 and inspect the queue, we will see that the queue will finally be empty and that the commit-queue status for the service is empty.

ncs# show devices commit-queue | notab
devices commit-queue queue-item 10777927137
 age       3357
 status    executing
 devices   [ ce0 ce1 pe0 ]
 transient ce0
  reason "Failed to connect to device ce0: connection refused"
 is-atomic true

ncs# show devices commit-queue | notab
devices commit-queue queue-item 10777927137
 age       3359
 status    executing
 devices   [ ce0 ce1 pe0 ]
 is-atomic true

ncs# show devices commit-queue
% No entries found.

ncs# show vpn l3vpn volvo commit-queue
% No entries found.

ncs# show devices commit-queue completed | notab
devices commit-queue completed queue-item 10777927137
 when               2015-02-09T16:48:17.915+00:00
 succeeded          true
 devices            [ ce0 ce1 pe0 ]
 completed          [ ce0 ce1 pe0 ]
 completed-services [ /l3vpn:vpn/l3vpn:l3vpn[l3vpn:name='volvo'] ]

Un-deploying Services

In some scenarios, it makes sense to remove the service configuration from the network but keep the representation of the service in NSO. This is called to un-deploy a service.

ncs# vpn l3vpn volvo check-sync
in-sync false
ncs# vpn l3vpn volvo re-deploy
ncs# vpn l3vpn volvo check-sync
in-sync true

Defining Your Own Services

Overview

To have NSO deploy services across devices, two pieces are needed:

A service model in YANG: the service model shall define the black-box view of a service; which are the input parameters given when creating the service? This YANG model will render an update of all NSO northbound interfaces, for example, the CLI.
Mapping, given the service input parameters, what is the resulting device configuration? This mapping can be defined in templates, code, or a combination of both.

Defining the Service Model

The first step is to generate a skeleton package for a service (for details, see Packages). Create a directory under, for example, ~/my-sim-iossimilar to how it is done for the 1-simulated-cisco-ios/ example. Make sure that you have stopped any running NSO and netsim.

Navigate to the simulated ios directory and create a new package for the VLAN service model:

$ cd examples.ncs/getting-started/using-ncs/1-simulated-cisco-ios/packages

If the packages folder does not exist yet, such as when you have not run this example before, you will need to invoke the ncs-setup and ncs-netsim create-network commands as described in the 1-simulated-cisco-ios README file.

The next step is to create the template skeleton by using the ncs-make-package utility:

$ ncs-make-package --service-skeleton template --root-container vlans --no-test  vlan

This results in a directory structure:

vlan
   package-meta-data.xml
   src
   templates

For now, let's focus on the src/yang/vlan.yang file.

            module vlan {
              namespace "http://com/example/vlan";
              prefix vlan;

              import ietf-inet-types {
                prefix inet;
              }
              import tailf-ncs {
                prefix ncs;
              }

              container vlans {
              list vlan {
                key name;

                uses ncs:service-data;
                ncs:servicepoint "vlan";

                leaf name {
                  type string;
                }

                // may replace this with other ways of refering to the devices.
                leaf-list device {
                  type leafref {
                    path "/ncs:devices/ncs:device/ncs:name";
                  }
                }

                // replace with your own stuff here
                leaf dummy {
                  type inet:ipv4-address;
                }
              }
              } // container vlans {
            }

If this is your first exposure to YANG, you can see that the modeling language is very straightforward and easy to understand. See RFC 7950 for more details and examples for YANG. The concept to understand in the above-generated skeleton is that the two lines of uses ncs:service-data and ncs:servicepoint "vlan" tells NSO that this is a service. The ncs:service-data grouping together with the ncs:servicepoint YANG extension provides the common definitions for a service. The two are implemented by the $NCS_DIR/src/ncs/yang/tailf-ncs-services.yang. So if a user wants to create a new VLAN in the network what should be the parameters? - A very simple service model would look like below (modify the src/yang/vlan.yang file):

  augment /ncs:services {
    container vlans {
      key name;

      uses ncs:service-data;
      ncs:servicepoint "vlan";
      leaf name {
        type string;
      }

      leaf vlan-id {
        type uint32 {
          range "1..4096";
        }
      }

      list device-if {
        key "device-name";
          leaf device-name {
            type leafref {
              path "/ncs:devices/ncs:device/ncs:name";
            }
          }
          leaf interface-type {
            type enumeration {
              enum FastEthernet;
              enum GigabitEthernet;
              enum TenGigabitEthernet;
            }
          }
          leaf interface {
            type string;
          }
      }
    }
}

This simple VLAN service model says:

We give a VLAN a name, for example, net-1, this must also be unique, it is specified as key.
The VLAN has an id from 1 to 4096.
The VLAN is attached to a list of devices and interfaces. To make this example as simple as possible the interface reference is selected by picking the type and then the name as a plain string.

The good thing with NSO is that already at this point you could load the service model to NSO and try if it works well in the CLI etc. Nothing would happen to the devices since we have not defined the mapping, but this is normally the way to iterate a model and test the CLI towards the network engineers.

To build this service model cd to $NCS_DIR/examples.ncs/getting-started/using-ncs/1-simulated-cisco-ios/packages/vlan/src and type make (assuming you have the make build system installed).

$ make

Go to the root directory of the simulated-ios example:

$ cd $NCS_DIR/examples.ncs/getting-started/using-ncs/1-simulated-cisco-ios

Start netsim, NSO, and the CLI:

$ncs-netsim start
$ncs --with-package-reload
$ncs_cli -C -u admin

When starting NSO above we give NSO a parameter to reload all packages so that our newly added vlan package is included. Packages can also be reloaded without restart. At this point we have a service model for VLANs, but no mapping of VLAN to device configurations. This is fine, we can try the service model and see if it makes sense. Create a VLAN service:

admin@ncs(config)# services vlan net-0 vlan-id 1234 \
device-if c0 interface-type FastEthernet interface 1/0
admin@ncs(config-device-if-c0)# top
admin@ncs(config)# show configuration
services vlan net-0
 vlan-id 1234
 device-if c0
  interface-type FastEthernet
  interface      1/0
 !
!
admin@ncs(config)# services vlan net-0 vlan-id 1234 \
device-if c1 interface-type FastEthernet interface 1/0
admin@ncs(config-device-if-c1)# top
admin@ncs(config)# show configuration
services vlan net-0
 vlan-id 1234
 device-if c0
  interface-type FastEthernet
  interface      1/0
 !
 device-if c1
  interface-type FastEthernet
  interface      1/0
 !
!
admin@ncs(config)# commit dry-run outformat cli
cli {
    local-node {
        data  services {
             +    vlan net-0 {
             +        vlan-id 1234;
             +        device-if c0 {
             +            interface-type FastEthernet;
             +            interface 1/0;
             +        }
             +        device-if c1 {
             +            interface-type FastEthernet;
             +            interface 1/0;
             +        }
             +    }
              }
    }
}
admin@ncs(config)# commit
Commit complete.
admin@ncs(config)# no services vlan
admin@ncs(config)# commit
Commit complete.

Committing service changes does not affect the devices since we have not defined the mapping. The service instance data will just be stored in NSO CDB.

Note that you get tab completion on the devices since they are leafrefs to device names in CDB, the same for interface-type since the types are enumerated in the model. However the interface name is just a string, and you have to type the correct interface name. For service models where there is only one device type like in this simple example, we could have used a reference to the ios interface name according to the IOS model. However that makes the service model dependent on the underlying device types and if another type is added, the service model needs to be updated and this is most often not desired. There are techniques to get tab completion even when the data type is a string, but this is omitted here for simplicity.

Make sure you delete the vlan service instance as above before moving on with the example.

Defining the Mapping

Now it is time to define the mapping from service configuration to actual device configuration. The first step is to understand the actual device configuration. Hard-wire the VLAN towards a device as example. This concrete device configuration is a boilerplate for the mapping, it shows the expected result of applying the service.

admin@ncs(config)# devices device c0 config ios:vlan 1234
admin@ncs(config-vlan)# top
admin@ncs(config)# devices device c0 config ios:interface \
                   FastEthernet 10/10 switchport trunk allowed vlan 1234
admin@ncs(config-if)# top
admin@ncs(config)# show configuration
devices device c0
 config
  ios:vlan 1234
  !
  ios:interface FastEthernet10/10
   switchport trunk allowed vlan 1234
  exit
 !
!
admin@ncs(config)# commit

The concrete configuration above has the interface and VLAN hard-wired. This is what we now will make into a template instead. It is always recommended to start like the above and create a concrete representation of the configuration the template shall create. Templates are device-configuration where parts of the config are represented as variables. These kinds of templates are represented as XML files. Show the above as XML:

admin@ncs(config)# show full-configuration devices device c0 \
                                 config ios:vlan | display xml
<config xmlns="http://tail-f.com/ns/config/1.0">
  <devices xmlns="http://tail-f.com/ns/ncs">
  <device>
    <name>c0</name>
      <config>
      <vlan xmlns="urn:ios">
        <vlan-list>
          <id>1234</id>
        </vlan-list>
      </vlan>
      </config>
  </device>
  </devices>
</config>

admin@ncs(config)# show full-configuration devices device c0 \
                                config ios:interface FastEthernet 10/10 | display xml
<config xmlns="http://tail-f.com/ns/config/1.0">
  <devices xmlns="http://tail-f.com/ns/ncs">
  <device>
    <name>c0</name>
      <config>
      <interface xmlns="urn:ios">
      <FastEthernet>
        <name>10/10</name>
        <switchport>
          <trunk>
            <allowed>
              <vlan>
                <vlans>1234</vlans>
              </vlan>
            </allowed>
          </trunk>
        </switchport>
      </FastEthernet>
      </interface>
      </config>
  </device>
  </devices>
</config>
admin@ncs(config)#

Now, we shall build that template. When the package was created a skeleton XML file was created in packages/vlan/templates/vlan.xml

<config-template xmlns="http://tail-f.com/ns/config/1.0"
                 servicepoint="vlan">
  <devices xmlns="http://tail-f.com/ns/ncs">
    <device>
      <!--
          Select the devices from some data structure in the service
          model. In this skeleton the devices are specified in a leaf-list.
          Select all devices in that leaf-list:
      -->
      <name>{/device}</name>
      <config>
        <!--
            Add device-specific parameters here.
            In this skeleton the service has a leaf "dummy"; use that
            to set something on the device e.g.:
            <ip-address-on-device>{/dummy}</ip-address-on-device>
        -->
      </config>
    </device>
  </devices>
</config-template>

We need to specify the right path to the devices. In our case, the devices are identified by /device-if/device-name (see the YANG service model).

For each of those devices, we need to add the VLAN and change the specified interface configuration. Copy the XML config from the CLI and replace it with variables:

<config-template xmlns="http://tail-f.com/ns/config/1.0"
                 servicepoint="vlan">
  <devices xmlns="http://tail-f.com/ns/ncs">
    <device>
      <name>{/device-if/device-name}</name>
      <config>
        <vlan xmlns="urn:ios">
          <vlan-list tags="merge">
            <id>{../vlan-id}</id>
          </vlan-list>
        </vlan>
        <interface xmlns="urn:ios">
          <?if {interface-type='FastEthernet'}?>
            <FastEthernet tags="nocreate">
              <name>{interface}</name>
              <switchport>
                <trunk>
                  <allowed>
                    <vlan tags="merge">
                      <vlans>{../vlan-id}</vlans>
                    </vlan>
                  </allowed>
                </trunk>
              </switchport>
            </FastEthernet>
          <?end?>
          <?if {interface-type='GigabitEthernet'}?>
            <GigabitEthernet tags="nocreate">
              <name>{interface}</name>
              <switchport>
                <trunk>
                  <allowed>
                    <vlan tags="merge">
                      <vlans>{../vlan-id}</vlans>
                    </vlan>
                  </allowed>
                </trunk>
              </switchport>
            </GigabitEthernet>
          <?end?>
          <?if {interface-type='TenGigabitEthernet'}?>
            <TenGigabitEthernet tags="nocreate">
              <name>{interface}</name>
              <switchport>
                <trunk>
                  <allowed>
                    <vlan tags="merge">
                      <vlans>{../vlan-id}</vlans>
                    </vlan>
                  </allowed>
                </trunk>
              </switchport>
            </TenGigabitEthernet>
          <?end?>
        </interface>
      </config>
    </device>
  </devices>
</config-template>

Walking through the template can give a better idea of how it works. For every /device-if/device-name from the service model do the following:

Add the VLAN to the VLAN list, the tag merge tells the template to merge the data into an existing list (the default is to replace).
For every interface within that device, add the VLAN to the allowed VLANs and set the mode to trunk. The tag nocreate tells the template to not create the named interface if it does not exist

It is important to understand that every path in the template above refers to paths from the service model in vlan.yang.

Request NSO to reload the packages:

admin@ncs# packages reload
reload-result {
    package cisco-ios
    result true
}
reload-result {
    package vlan
    result true
}

Previously we started NCS with a reload package option, the above shows how to do the same without starting and stopping NSO.

We can now create services that will make things happen in the network. (Delete any dummy service from the previous step first). Create a VLAN service:

admin@ncs(config)# services vlan net-0 vlan-id 1234 device-if c0 \
                                 interface-type FastEthernet interface 1/0
admin@ncs(config-device-if-c0)# top
admin@ncs(config)# services vlan net-0 device-if c1 \
                                 interface-type FastEthernet interface 1/0
admin@ncs(config-device-if-c1)# top
admin@ncs(config)# show configuration
services vlan net-0
 vlan-id 1234
 device-if c0
  interface-type FastEthernet
  interface      1/0
 !
 device-if c1
  interface-type FastEthernet
  interface      1/0
 !
!
admin@ncs(config)# commit dry-run outformat native
native {
    device {
        name c0
        data interface FastEthernet1/0
              switchport trunk allowed vlan 1234
             exit
    }
    device {
        name c1
        data vlan 1234
             !
             interface FastEthernet1/0
              switchport trunk allowed vlan 1234
             exit
    }
}
admin@ncs(config)# commit
Commit complete.

When working with services in templates, there is a useful debug option for commit which will show the template and XPATH evaluation.

admin@ncs(config)# commit | debug
Possible completions:
 template   Display template debug info
 xpath      Display XPath debug info
admin@ncs(config)# commit | debug template

We can change the VLAN service:

admin@ncs(config)# services vlan net-0 vlan-id 1222
admin@ncs(config-vlan-net-0)# top
admin@ncs(config)# show configuration
services vlan net-0
 vlan-id 1222
!
admin@ncs(config)# commit dry-run outformat native
native {
    device {
        name c0
        data no vlan 1234
             vlan 1222
             !
             interface FastEthernet1/0
              switchport trunk allowed vlan 1222
             exit
    }
    device {
        name c1
        data no vlan 1234
             vlan 1222
             !
             interface FastEthernet1/0
              switchport trunk allowed vlan 1222
             exit
    }
}

It is important to understand what happens above. When the VLAN ID is changed, NSO can calculate the minimal required changes to the configuration. The same situation holds true for changing elements in the configuration or even parameters of those elements. In this way, NSO does not need explicit mapping to define a VLAN change or deletion. NSO does not overwrite a new configuration on the old configuration. Adding an interface to the same service works the same:

admin@ncs(config)# services vlan net-0 device-if c2 interface-type FastEthernet interface 1/0
admin@ncs(config-device-if-c2)# top
admin@ncs(config)# commit dry-run outformat native
native {
    device {
        name c2
        data vlan 1222
             !
             interface FastEthernet1/0
              switchport trunk allowed vlan 1222
             exit
    }
}
admin@ncs(config)# commit
Commit complete.

To clean up the configuration on the devices, run the delete command as shown below:

admin@ncs(config)# no services vlan net-0
admin@ncs(config)# commit dry-run outformat native
native {
    device {
        name c0
        data no vlan 1222
             interface FastEthernet1/0
              no switchport trunk allowed vlan 1222
             exit
    }
    device {
        name c1
        data no vlan 1222
             interface FastEthernet1/0
              no switchport trunk allowed vlan 1222
             exit
    }
    device {
        name c2
        data no vlan 1222
             interface FastEthernet1/0
              no switchport trunk allowed vlan 1222
             exit
    }
}
admin@ncs(config)# commit
Commit complete.

To make the VLAN service package complete edit the package-meta-data.xml to reflect the service model purpose. This example showed how to use template-based mapping. NSO also allows for programmatic mapping and also a combination of the two approaches. The latter is very flexible if some logic needs to be attached to the service provisioning that is expressed as templates and the logic applies device agnostic templates.

Reactive FASTMAP and Nano Services

FASTMAP is the NSO algorithm that renders any service change from the single definition of the create service. As seen above, the template or code only has to define how the service shall be created, NSO is then capable of defining any change from that single definition.

A limitation in the scenarios described so far is that the mapping definition could immediately do its work as a single atomic transaction. This is sometimes not possible. Typical examples are external allocation of resources such as IP addresses from an IPAM, spinning up VMs, and sequencing in general.

Nano services using Reactive FASTMAP handle these scenarios with an executable plan that the system can follow to provision the service. The general idea is to implement the service as several smaller (nano) steps or stages, by using reactive FASTMAP and provide a framework to safely execute actions with side effects.

The example in examples.ncs/development-guide/nano-services/netsim-sshkey implements key generation to files and service deployment of the key to set up network elements and NSO for public key authentication to illustrate this concept. The example is described in more detail in Develop and Deploy a Nano Service.

Reconciling Existing Services

A very common situation when we wish to deploy NSO in an existing network is that the network already has existing services implemented in the network. These services may have been deployed manually or through another provisioning system. The task is to introduce NSO and import the existing services into NSO. The goal is to use NSO to manage existing services, and to add additional instances of the same service type, using NSO. This is a non-trivial problem since existing services may have been introduced in various ways. Even if the service configuration has been done consistently it resembles the challenges of a general solution for rendering a corresponding C-program from assembler.

One of the prerequisites for this to work is that it is possible to construct a list of the already existing services. Maybe such a list exists in an inventory system, an external database, or maybe just an Excel spreadsheet. It may also be the case that we can:

Import all managed devices into NSO.
Execute a full sync-from on the entire network.
Write a program, using Python/Maapi or Java/Maapi that traverses the entire network configuration and computes the services list.

The first thing we must do when we wish to reconcile existing services is to define the service YANG model. The second thing is to implement the service mapping logic and do it in such a way that given the service input parameters when we run the service code, they would all result in a configuration that is already there in the existing network.

The basic principles for reconciliation are:

Read the device configuration to NSO using the sync-from action. This will get the device configuration that is a result of any existing services as well.
Instantiate the services according to the principles above.

Performing the above actions with the default behavior would not render the correct reference counters since NSO did not create the original configuration. The service activation can be run with dedicated flags to take this into account. See the NSO User Guide for a detailed process.

Brown-field Networks

In many cases, a service activation solution like NSO is deployed in parallel with existing activation solutions. It is then desirable to make sure that NSO does not conflict with the device configuration rendered from the existing solution.

NSO has a commit flag that will restrict the device configuration to not overwrite data that NSO did not create: commit no-overwrite

Advanced Services Orchestration

Some services need to be set up in stages where each stage can consist of setting up some device configuration and then waiting for this configuration to take effect before performing the next stage. In this scenario, each stage must be performed in a separate transaction which is committed separately. Most often an external notification or other event must be detected and trigger the next stage in the service activation.

NSO supports the implementation of such staged services with the use of Reactive FASTMAP patterns in nano services.

From the user's perspective, it is not important how a certain service is implemented. The implementation should not have an impact on how the user creates or modifies a service. However, knowledge about this can be necessary to explain the behavior of a certain service.

In short the life-cycle of an RFM nano service in not only controlled by the direct create/set/delete operations. Instead, there are one or many implicit reactive-re-deploy requests on the service that are triggered by external event detection. If the user examines an RFM service, e.g. using get-modification, the device impact will grow over time after the initial create.

Nano Service Plans

Nano services autonomously will do reactive-re-deploy until all stages of the service are completed. This implies that a nano service normally is not completed when the initial create is committed. For the operator to understand that a nano service has run to completion there must typically be some service-specific operational data that can indicate this.

Plans are introduced to standardize the operational data that can show the progress of the nano service. This gives the user a standardized view of all nano services and can directly answer the question of whether a service instance has run to completion or not.

A plan consists of one or many component entries. Each component consists of two or many state entries where the state can be in status not-reached, reached, or failed. A plan must have a component named self and can have other components with arbitrary names that have meaning for the implementing nano service. A plan component must have a first state named init and a last state named ready. In between init and ready, a plan component can have additional state entries with arbitrary naming.

The purpose of the self component is to describe the main progress of the nano service as a whole. Most importantly the self component last state named ready must have the status reached if and only if the nano service as a whole has been completed. Other arbitrary components as well as states are added to the plan if they have meaning for the specific nano service i.e. more specific progress reporting.

A plan also defines an empty leaf failed which is set if and only if any state in any component has a status set to failed. As such this is an aggregation to make it easy to verify if a RFM service is progressing without problems or not.

The following is an illustration of using the plan to report the progress of a nano service:

ncs# show vpn l3vpn volvo plan
NAME                    TYPE   STATE              STATUS       WHEN
------------------------------------------------------------------------------------
self                    self   init               reached      2016-04-08T09:22:40
                               ready              not-reached  -
endpoint-branch-office  l3vpn  init               reached      2016-04-08T09:22:40
                               qos-configured     reached      2016-04-08T09:22:40
                               ready              reached      2016-04-08T09:22:40
endpoint-head-office    l3vpn  init               reached      2016-04-08T09:22:40
                               pe-created         not-reached  -
                               ce-vpe-topo-added  not-reached  -
                               vpe-p0-topo-added  not-reached  -
                               qos-configured     not-reached  -
                               ready              not-reached  -

Service Progress Monitoring

Plans were introduced to standardize the operational data that show the progress of reactive fastmap (RFM) nano services. This gives the user a standardized view of all nano services and can answer the question of whether a service instance has run to completion or not. To keep track of the progress of plans, Service Progress Monitoring (SPM) is introduced. The idea with SPM is that time limits are put on the progress of plan states. To do so, a policy and a trigger are needed.

A policy defines what plan components and states need to be in what status for the policy to be true. A policy also defines how long time it can be false without being considered jeopardized and how long time it can be false without being considered violated. Further, it may define an action, that is called in case of a policy being jeopardized, violated, or successful.

A trigger is used to associate a policy with a service and a component.

The following is an illustration of using an SPM to track the progress of an RFM service, in this case, the policy specifies that the self-components ready state must be reached for the policy to be true:

ncs# show vpn l3vpn volvo service-progress-monitoring
                                                               JEOPARDY                       VIOLATION           SUCCESS
NAME  POLICY         START TIME           JEOPARDY TIME        RESULT    VIOLATION TIME       RESULT     STATUS   TIME
---------------------------------------------------------------------------------------------------------------------------
self  service-ready  2016-04-08T09:22:40  2016-04-08T09:22:40  -         2016-04-08T09:22:40  -          running  -

NSO Device Manager

Learn the concepts of NSO device management.

The NEDs fall into the following categories:

NETCONF-capable device: The Device Manager will produce NETCONF edit-config RPC operations for each participating device.
SNMP device: The Device Manager translates the changes made to the configuration into the corresponding SNMP SET PDUs.
Device with Cisco CLI: The device has a CLI with the same structure as Cisco IOS or XR routers. The Device Manager and a CLI NED are used to produce the correct sequence of CLI commands which reflects the changes made to the configuration.
Other devices: For devices that do not fit into any of the above-mentioned categories, a corresponding Generic NED is invoked. Generic NEDs are used for proprietary protocols like REST and for CLI flavors that do not resemble IOS or XR. The Device Manager will inform the Generic NED about the made changes and the NED will translate these to the appropriate operations toward the device.

To understand the main idea behind the NSO device manager it is necessary to understand the NSO data model and how NSO incorporates the YANG data models from the different managed devices.

Managed Device Tree

The central part of the NSO YANG model, in the file tailf-ncs-devices.yang, has the following structure:

tailf-ncs-devices.yang

submodule tailf-ncs-devices {
  belongs-to tailf-ncs {
    prefix ncs;
  }
  ...
  container devices {
    ......
    list device {
      key name;

      description
        "This list contains all devices managed by NCS.";

      leaf name {
        type string;
        description
          "A string uniquely identifying the managed device.";
      }

      leaf address {
        type inet:host;
        mandatory true;
        description
          "IP address or host name for the management interface on
           the device.";
      }
      leaf port {
        type inet:port-number;
        description
          "Port for the management interface on the device.  If this leaf
           is not configured, NCS will use a default value based on the
           type of device.  For example, a NETCONF device uses port 830,
           a CLI device over SSH uses port 22, and a SNMP device uses
           port 161.";
      }
      ....
      leaf authgroup {
        ....
      }
      container device-type {
      .......
      container config {
         ...
      }
  }
}

The following device types are available:

NETCONF
CLI: A corresponding CLI NED is needed to communicate with the device. This requires YANG models with the appropriate annotations for the device CLI.
SNMP: The device speaks SNMP, preferably in read-write mode.
Generic NED: A corresponding Generic NED is needed to communicate with the device. This requires YANG models and Java code.

The NSO CLI command below lists the NED types for the devices in the example network.

ncs(config)# show full-configuration devices device device-type
devices device ce0
 device-type cli ned-id cisco-ios-cli-3.8
!
...
devices device p0
 device-type cli ned-id cisco-iosxr-cli-3.5
!
devices device p1
 device-type cli ned-id cisco-iosxr-cli-3.5
!
...
devices device pe2
 device-type netconf ned-id juniper-junos-nc-3.0
!

The empty container /ncs:devices/device/config is used as a mount point for the YANG models from the different managed devices.

As previously mentioned, NSO needs the following information to manage a device:

The IP/Port of the device and authentication information.
Some or all of the YANG data models for the device.

Once NSO has started you can inspect the meta information for the managed devices through the NSO CLI. This is an example session:

Example: Show Device Configuration in NSO CLI

ncs(config)# show full-configuration devices device
devices device ce0
 address   127.0.0.1
 port      10022
 ssh host-key ssh-dss
 ...
 authgroup default
 device-type cli ned-id cisco-ios-cli-3.8
 state admin-state unlocked
 config
 ...
 !
!
devices device ce1
 address   127.0.0.1
 port      10023
 ssh host-key ssh-dss
...
 !
 authgroup default
 device-type cli ned-id cisco-ios-cli-3.8
 state admin-state unlocked
 config
 ...
 !
!

Alternatively, this information could be retrieved from the NSO northbound NETCONF interface by running the simple Python-based netconf-console program towards the NSO NETCONF server.

Example: Show Device Configuration in NETCONF

$ netconf-console --get-config -x "/devices/device[name='ce0']"
<?xml version="1.0" encoding="UTF-8"?>
<rpc-reply xmlns="urn:ietf:params:xml:ns:netconf:base:1.0" message-id="1">
  <data>
    <devices xmlns="http://tail-f.com/ns/ncs">
      <device>
        <name>ce0</name>
        <address>127.0.0.1</address>
        <port>10022</port>
        <ssh>
          <host-key>
            <algorithm>ssh-dss</algorithm>

            ...

        <authgroup>default</authgroup>
        <device-type>
          <cli>
          <ned-id xmlns:cisco-ios-cli-3.8="http://tail-f.com/ns/ned-id/cisco-ios-cli-3.8">
            cisco-ios-cli-3.8:cisco-ios-cli-3.8
          </ned-id>
          </cli>
        </device-type>
        <state>
          <admin-state>unlocked</admin-state>
        </state>
        <config>

        ...

        </config>
      </device>
    </devices>
  </data>
</rpc-reply>

The NED Packages

Example: Installed Packages

ncs# show packages
packages package cisco-ios-cli-3.8
 package-version 3.8.0.1
 description     "NED package for Cisco IOS"
 ncs-min-version [ 3.2.2 3.3 3.4 ]
 directory       ./state/packages-in-use/1/cisco-ios-cli-3.8
 component IOSDp2
  callback java-class-name [ com.tailf.packages.ned.ios.IOSDp2 ]
 component IOSDp
  callback java-class-name [ com.tailf.packages.ned.ios.IOSDp ]
 component cisco-ios
  ned cli ned-id  cisco-ios-cli-3.8
  ned cli java-class-name com.tailf.packages.ned.ios.IOSNedCli
  ned device vendor Cisco
 ...
 oper-status up
packages package cisco-iosxr-cli-3.5
 package-version 3.5.0.7
 description     "NED package for Cisco IOS XR"
 ncs-min-version [ 3.2.2 3.3 ]
 directory       ./state/packages-in-use/1/cisco-iosxr-cli-3.5
 component cisco-ios-xr
  ned cli ned-id  cisco-iosxr-cli-3.5
  ned cli java-class-name com.tailf.packages.ned.iosxr.IosxrNedCli
  ned device vendor Cisco
 ...
 oper-status up
packages package juniper-junos-nc-3.0
 package-version 3.0.14.2
 description     "NED package for all JunOS based Juniper routers"
 ncs-min-version [ 3.0.0.1 3.1 3.2 3.3 3.4 ]
 directory       ./state/packages-in-use/1/juniper-junos-nc-3.0
 component junos
  ned netconf ned-id juniper-junos-nc-3.0
  ned device vendor Juniper
 oper-status up
 ...

$ ls -l $NCS_DIR/examples.ncs/service-provider/mpls-vpn
total 160
...
drwxr-xr-x   8 stefan  staff    272 Oct  1 16:57 packages
...
$ ls -l $NCS_DIR/examples.ncs/service-provider/mpls-vpn/packages
total 24
cisco-ios
cisco-iosxr
juniper-junos
...

Starting the NSO Daemon

You start the ncs daemon in a terminal like:

% ncs

Which is the same as, NSO loads it config from a ncs.conf file

% ncs -c ./ncs.conf

During development, it is sometimes convenient to run ncs in the foreground as:

% ncs -c ./ncs.conf --foregound --verbose

Once the daemon is running, you can issue the command:

% ncs --status
vsn: 7.1
SMP support: yes, using 8 threads
Using epoll: yes
available modules: backplane,netconf,cdb,cli,snmp,webui
...
... lots of output

To get more information about options to ncs do:

% ncs --help

The ncs --status command produces a lengthy list describing for example which YANG modules are loaded in the system. This is a valuable debug tool.

The same information is also available in the NSO CLI (and thus through all available northbound interfaces, including Maapi for Java programmers)

ncs# show ncs-state
ncs-state version 7.1
ncs-state smp number-of-threads 8
ncs-state epoll true
ncs-state daemon-status started
...

Synchronizing Devices

When the NSO daemon is running and has been initialized with IP/Port and authentication information as well as imported all modules you can start to manage devices through NSO.

In the normal case, the configuration on the device and the copy of the configuration inside NSO should be identical.

Example: Synchronize From Devices

ncs(config)# devices sync-from
sync-result {
    device ce0
    result true
}
sync-result {
    device ce1
    result true
}
sync-result {
    device ce2
    result true
...
ncs(config)# show full-configuration devices device ce0
devices device ce0
...
 config
  no ios:service pad
  no ios:ip domain-lookup
  no ios:ip http secure-server
  ios:ip source-route
  ios:interface GigabitEthernet0/1
  exit
  ios:interface GigabitEthernet0/10
  exit
  ios:interface GigabitEthernet0/11
  exit
...
[ok][2010-04-13 16:29:15]

$ls $NCS_DIR/src/ncs/yang/

All packages comes with YANG files as well. For example the directory packages/cisco-ios/src/yang/ contains the YANG definition of an IOS device.

The tailf-ncs.yang is the main part of the NSO YANG data model. The file mode tailf-ncs.yang includes all parts of the model from different files.

The actions sync-from and sync-to are modeled in the file tailf-ncs-devices.yang. The sync action(s) are defined as:

Example: tailf-ncs-devices.yang sync actions

  grouping sync-from-output {
    list sync-result {
      key device;
      leaf device {
        type leafref {
          path "/devices/device/name";
        }
      }
      uses sync-result;
    }
  }

  grouping sync-result {
    description
      "Common result data from a 'sync' action.";

    choice outformat {
      leaf result {
        type boolean;
      }
      anyxml result-xml;
      leaf cli {
        tailf:cli-preformatted;
        type string;
      }
    }
    leaf info {
      type string;
      description
        "If present, contains additional information about the result.";
    }
  }

  ...

  container devices {

    ...

    tailf:action sync-from {
      description
        "Synchronize the configuration by pulling from all unlocked
         devices.";
      tailf:info "Synchronize the config by pulling from the devices";
      tailf:actionpoint ncsinternal {
        tailf:internal;
      }
      input {
        leaf suppress-positive-result {
          type empty;
          description
            "Use this additional parameter to only return
             devices that failed to sync.";
        }
        container dry-run {
          presence "";
          leaf outformat {
            type outformat2;
            description
              "Report what would be done towards CDB, without
               actually doing anything.";
          }
        }
      }
      output {
        uses sync-from-output;
      }
    }

    ...

    tailf:action sync-to {
      ...
    }

    ...

    list device {
      description
        "This list contains all devices managed by NCS.";

      key name;

      leaf name {
        description "A string uniquely identifying the managed device";
        type string;
      }

      ...

      tailf:action sync-from {
        description
          "Synchronize the configuration by pulling from the device.";
        tailf:info "Synchronize the config by pulling from the device";
        tailf:actionpoint ncsinternal {
          tailf:internal;
        }
        input {
          container dry-run {
            presence "";
            leaf outformat {
              type outformat2;
              description
                "Report what would be done towards CDB, without
                 actually doing anything.";
            }
          }
        }
        output {
          uses sync-result;
        }
      }
      tailf:action sync-to {

      ...

The command to do that is:

ncs# devices device ce0 sync-to
result true

A dry-run option is available for the action sync-to.

ncs# devices device ce0 sync-to dry-run
data {
    ...
}

This makes it possible to investigate the changes before they are transmitted to the devices.

Partial `sync-from`

It is possible to synchronize a part of the configuration (a certain subtree) from the device using the partial-sync-from action located under /devices. While it is primarily intended to be used by service developers as described in , it is also possible to use directly from the NSO CLI (or any other northbound interface). The example below (Example of Running partial-sync-from Action via CLI) illustrates using this action via CLI, using a router device from examples.ncs/getting-started/developing-with-ncs/0-router-network.

Example: Example of Running partial-sync-from Action via CLI

$ ncs_cli -C -u admin
ncs# devices partial-sync-from path [ \
/devices/device[name='ex0']/config/r:sys/interfaces/interface[name='eth0'] \
/devices/device[name='ex1']/config/r:sys/dns/server ]
sync-result {
    device ex0
    result true
}
sync-result {
    device ex1
    result true
}
ncs# show running-config devices device ex0..1 config
devices device ex0
 config
  r:sys interfaces interface eth0
   unit 0
    enabled
   !
   unit 1
    enabled
   !
   unit 2
    enabled
    description "My Vlan"
    vlan-id     18
   !
  !
 !
!
devices device ex1
 config
  r:sys dns server 10.2.3.4
  !
 !
!

Configuring Devices

It is now possible to configure several devices through the NSO inside the same network transaction. To illustrate this, start the NSO CLI from a terminal application.

Example: Configure Devices

$ ncs_cli -C -u admin
ncs# config
Entering configuration mode terminal
ncs(config)# devices device pe1 config cisco-ios-xr:snmp-server \
     community public RO
ncs(config-config)# top
ncs(config)# devices device ce0 config ios:snmp-server community public RO
ncs(config-config)# devices device pe2 config junos:configuration \
      snmp community public view RO
ncs(config-community-public)# top
ncs(config)# show configuration
devices device ce0
 config
  ios:snmp-server community public RO
 !
!
devices device pe1
 config
  cisco-ios-xr:snmp-server community public RO
 !
!
devices device pe2
 config
  ! first
  junos:configuration snmp community public
   view RO
  !
 !
!
ncs(config)# commit dry-run outformat native
native {
    device {
        name ce0
        data snmp-server community public RO
    }
    device {
        name pe1
        data snmp-server community public RO
    }
    device {
        name pe2
        data <rpc xmlns="urn:ietf:params:xml:ns:netconf:base:1.0"
                  message-id="1">
               <edit-config xmlns:nc="urn:ietf:params:xml:ns:netconf:base:1.0">
                 <target>
                   <candidate/>
                 </target>
                 <test-option>test-then-set</test-option>
                 <error-option>rollback-on-error</error-option>
                 <config>
                   <configuration xmlns="http://xml.juniper.net/xnm/1.1/xnm">
                     <snmp>
                       <community>
                         <name>public</name>
                         <view>RO</view>
                       </community>
                     </snmp>
                   </configuration>
                 </config>
               </edit-config>
             </rpc>
    }
}
ncs(config)# commit

As seen from the output of the command commit dry-run outformat native, NSO generates the native CLI and NETCONF commands which will be sent to each device when the transaction is committed.

Connection Management

Each managed device needs to be configured with the IP address and the port where the CLI, NETCONF server, etc. of the managed device listens for incoming requests.

ncs# devices connect
connect-result {
    device ce0
    result true
    info (admin) Connected to ce0 - 127.0.0.1:10022
}
connect-result {
    device ce1
    result true
    info (admin) Connected to ce1 - 127.0.0.1:10023
}
...

We were able to connect to all managed devices. It is also possible to explicitly attempt to test connections to individual managed devices:

ncs# devices device ce0 connect
result true
info (admin) Connected to ce0 - 127.0.0.1:10022

Established connections are typically not closed right away when not needed, but rather pooled according to the rules described in . This applies to NETCONF sessions as well as sessions established by CLI or generic NEDs via a connection-oriented protocol. In addition to session pooling, underlying SSH connections for NETCONF devices are also reused. Note that a single NETCONF session occupies one SSH channel inside an SSH connection, so multiple NETCONF sessions can co-exist in a single connection. When an SSH connection has been idle (no SSH channels open) for 2 minutes, the SSH connection is closed. If a new connection is needed later, a connection is established on demand.

submodule tailf-ncs-devices {
  ...
  container devices {
    ...
    grouping timeouts {
      description
        "Timeouts used when communicating with a managed device.";

      leaf connect-timeout {
        type uint32;
        units "seconds";
        description
          "The timeout in seconds for new connections to managed
           devices.";
      }
      leaf read-timeout {
        type uint32;
        units "seconds";
        description
          "The timeout in seconds used when reading data from a
           managed device.";
      }
      leaf write-timeout {
        type uint32;
        units "seconds";
        description
          "The timeout in seconds used when writing data to a
           managed device.";
      }
    }
    ...
    container global-settings {
      ...
      uses timeouts {
        description
          "These timeouts can be overridden per device.";

        refine connect-timeout {
          default 20;
        }
        refine read-timeout {
          default 20;
        }
        refine write-timeout {
          default 20;
        }
      }
      ....

Thus, to change these parameters (globally for all managed devices) you do:

ncs(config)# devices global-settings connect-timeout 30
ncs(config)# devices global-settings read-timeout 30
ncs(config)# commit

Or, to use a profile:

ncs(config)# devices profiles profile slow-devices connect-timeout 60
ncs(config-profile-slow-devices)# read-timeout 60
ncs(config-profile-slow-devices)# write-timeout 60
ncs(config-profile-slow-devices)# commit

ncs(config)# devices device ce3 device-profile slow-devices
ncs(config-device-ce3)# commit

Authentication Groups

When NSO connects to a managed device, it requires authentication information for that device. The authgroups are modeled in the NSO data model:

Example: tailf-ncs-devices.yang - Authgroups

submodule tailf-ncs-devices {
  ...
  container devices {
    ...

    container authgroups {
      description
        "Named authgroups are used to decide how to map a local NCS user to
         remote authentication credentials on a managed device.

         The list 'group' is used for NETCONF and CLI managed devices.

         The list 'snmp-group' is used for SNMP managed devices.";

      list group {
        key name;

        description
          "When NCS connects to a managed device, it locates the
           authgroup configured for that device.  Then NCS looks up
           the local NCS user name in the 'umap' list.  If an entry is
           found, the credentials configured is used when
           authenticating to the managed device.

           If no entry is found in the 'umap' list, the credentials
           configured in 'default-map' are used.

           If no 'default-map' has been configured, and the local NCS
           user name is not found in the 'umap' list, the connection
           to the managed device fails.";

        grouping remote-user-remote-auth {
          description
            "Remote authentication credentials.";

          choice login-credentials {
            mandatory true;
            case stored {
              choice remote-user {
                mandatory true;
                leaf same-user {
                  type empty;
                  description
                    "If this leaf exists, the name of the local NCS user is used
                     as the remote user name.";
                }
                leaf remote-name {
                  type string;
                  description
                    "Remote user name.";
                }
              }

              choice remote-auth {
                mandatory true;
                leaf same-pass {
                  type empty;
                  description
                    "If this leaf exists, the password used by the local user
                     when logging in to NCS is used as the remote password.";
                }
                leaf remote-password {
                  type tailf:aes-256-cfb-128-encrypted-string;
                  description
                    "Remote password.";
                }
                case public-key {
                  uses public-key-auth;
                }
              }
              leaf remote-secondary-password {
                type tailf:aes-256-cfb-128-encrypted-string;
                description
                  "Some CLI based devices require a second
                   additional password to enter config mode";
              }
            }
            case callback {
              leaf callback-node {
                description
                  "Invoke a standalone action to retrieve login credentials for
                  managed devices on the 'callback-node' instance.

                  The 'action-name' action is invoked on the callback node that
                  is specified by an instance identifer.";
                mandatory true;
                type instance-identifier;
              }
              leaf action-name {
                description
                  "The action to call when a notification is received.

                  The action must use 'authgroup-callback-input-params'
                  grouping for input and 'authgroup-callback-output-params'
                  grouping for output from tailf-ncs-devices.yang.";
                type yang:yang-identifier;
                mandatory true;
                tailf:validate ncs {
                   tailf:internal;
                   tailf:dependency "../callback-node";
                }
              }
            }
          }
        }

        grouping mfa-grouping {
          container mfa {
            presence "MFA";
            description
              "Settings for handling multi-factor authentication towards
               the device";
            leaf executable {
              description "Path to the external executable handling MFA";
              type string;
              mandatory true;
            }
            leaf opaque {
              description
                "Opaque data for the external MFA executable.
                 This string will be base64 encoded and passed to the MFA
                 executable along with other parameters";
              type string;
            }
          }
        }

        leaf name {
          type string;
          description
            "The name of the authgroup.";
        }

        container default-map {
          presence "Map unknown users";
          description
            "If an authgroup has a default-map, it is used if a local
             NCS user is not found in the umap list.";
          tailf:info "Remote authentication parameters for users not in umap";
          uses remote-user-remote-auth;
          uses mfa-grouping;
        }

        list umap {
          key local-user;
          description
            "The umap is a list with the local NCS user name as key.
             It maps the local NCS user name to remote authentication
             credentials.";
          tailf:info "Map NCS users to remote authentication parameters";
          leaf local-user {
            type string;
            description
              "The local NCS user name.";
          }
          uses remote-user-remote-auth;
          uses mfa-grouping;
        }
      }

Each managed device must refer to a named authgroup. The purpose of an authentication group is to map local users to remote users together with the relevant SSH authentication information.

Example: Configured authgroup

ncs(config)# show full-configuration devices authgroups
devices authgroups group default
 umap admin
  remote-name     admin
  remote-password $4$wIo7Yd068FRwhYYI0d4IDw==
 !
 umap oper
  remote-name     oper
  remote-password $4$zp4zerM68FRwhYYI0d4IDw==
 !
!
devices authgroups snmp-group default
 default-map community-name public
 umap admin
  usm remote-name admin
  usm security-level auth-priv
  usm auth md5 remote-password $4$wIo7Yd068FRwhYYI0d4IDw==
  usm priv des remote-password $4$wIo7Yd068FRwhYYI0d4IDw==
 !
!

Example: authgroup default-map

ncs(config)# devices authgroups group default default-map same-user same-pass
ncs(config-group-default)# commit
Commit complete.
ncs(config-group-default)# top
ncs(config)# show full-configuration devices authgroups
devices authgroups group default
 default-map same-user
 default-map same-pass
 umap admin
  remote-name     admin
  remote-password $4$wIo7Yd068FRwhYYI0d4IDw==
 !
 umap oper
  remote-name     oper
  remote-password $4$zp4zerM68FRwhYYI0d4IDw==
 !
!
devices authgroups snmp-group default
 default-map community-name public
 umap admin
  usm remote-name admin
  usm security-level auth-priv
  usm auth md5 remote-password $4$wIo7Yd068FRwhYYI0d4IDw==
  usm priv des remote-password $4$wIo7Yd068FRwhYYI0d4IDw==
 !
!

Regular password.
Public key. This means that a private key, either from a file in the user's SSH key directory, or one that is configured in the /ssh/private-key list in the NSO configuration, is used for authentication. Refer to for the details on how the private key is selected.
Finally, an interesting option is to use the 'same-pass' option. Since NSO runs its own SSH server and its own SSL server, NSO can pick up the password of a user in clear text. Hence, if the 'same-pass' option is chosen for an authgroup, NSO will reuse the same password when attempting to connect southbound to a managed device.

Connecting Using SSH Keyboard-Interactive (Multi-Factor) Authentication

Example: Configuring Authgroup For Keyboard-interactive Authentication

admin@ncs(config)# devices authgroups group mfa umap admin
admin@ncs(config-umap-admin)# remote-name admin remote-password
(<AES encrypted string>): *********
admin@ncs(config-umap-admin)# mfa executable ./handle_mfa.py opaque foobar
admin@ncs(config-umap-admin)# commit
Commit complete.

[ZGV2MA==;YWRtaW4=;YWRtaW4=;Zm9vYmFy;;;YWRtaW5AbG9jYWxob3N0J3MgcGFzc3dvcmQ6IA==;]

[dev0;admin;admin;foobar;;;admin@localhost's password:;]

A small Python program can be used to implement the keyboard-interactive authentication towards a device, such as:

#!/usr/bin/env python3
import base64
line = input()
(device, user, passwd, opaque, name, instr, prompt, _) = map(
        lambda x: base64.b64decode(x).decode('utf-8'),
        line.strip('[]').split(';'))
if prompt == "admin@localhost's password: ":
    print(passwd)
elif prompt == "Enter SMS passcode:":
    print("secretSMScode")
else:
    print("2")

This script will then be invoked with the above fields for every prompt from the server, and the corresponding output from the script will be sent as the reply to the server.

Using a Callback to Provide Device Credentials

With remote passwords, you may encounter issues if you use special characters, such as quotes (") and backslash (\) in your password. See for recommendations on how to avoid running into password issues.

Example: authgroup-callback

ncs(config)# devices authgroups group default umap oper
ncs(config-umap-oper)# callback-node /callback action-name auth-cb
ncs(config-group-oper)# commit
Commit complete.
ncs(config-group-oper)# top
ncs(config)# show full-configuration devices authgroups
devices authgroups group default
 default-map same-user
 default-map same-pass
 umap admin
  remote-name     admin
  remote-password $4$wIo7Yd068FRwhYYI0d4IDw==
 !
 umap oper
  callback-node /callback
  action-name   auth-cb
 !
!
devices authgroups snmp-group default
 default-map community-name public
 umap admin
  usm remote-name admin
  usm security-level auth-priv
  usm auth md5 remote-password $4$wIo7Yd068FRwhYYI0d4IDw==
  usm priv des remote-password $4$wIo7Yd068FRwhYYI0d4IDw==
 !
!

Example: authgroup-callback.yang

module authgroup-callback {
  namespace "http://com/example/authgroup-callback";
  prefix authgroup-callback;

  import tailf-common {
    prefix tailf;
  }

  import tailf-ncs {
    prefix ncs;
  }

  container callback {
    description
      "Example callback that defines an action to retrieve
       remote authentication credentials";
    tailf:action auth-cb {
      tailf:actionpoint auth-cb-point;
      input {
        uses ncs:authgroup-callback-input-params;
      }
      output {
        uses ncs:authgroup-callback-output-params;
      }
    }
  }
}

In the example above (authgroup-callback), the configuration for the umap entry of the oper user is changed to use a callback to retrieve southbound authentication credentials. Thus, NSO is going to invoke the action auth-cb defined in the callback-node callback. The callback node is of type instance-identifier and refers to the container called callback defined in the example, (authgroup-callback.yang), which includes an action defined by action-name auth-cb and uses groupings authgroup-callback-input-params and authgroup-callback-output-params for input and output parameters respectively. In the example, (authgroup-callback), authgroup-callback module was loaded in NSO within an example package. Package development and action callbacks are not described here but more can be read in , the section called and .

Caveats

Authentication groups and the functionality they bring come with some limitations on where and how it is used.

The callback option that enables authgroup-callback feature is not applicable for members of snmp-group list.
Generic devices that implement their own authentication scheme do not use any mapping or callback functionality provided by Authgroups.
Cluster nodes use their own authgroups and mapping model, thus functionality differs, e.g. callback option is not applicable.

Device Session Pooling

Changes from the default configuration of the NSO session pool should only be performed when absolutely necessary and when all effects of the change are understood.

NSO presents operational data that represent the current state of the session pool. To visualize this, we use the CLI to connect to NSO and force connection to all known devices:

$ ncs_cli -C -u admin

admin connected from 127.0.0.1 using console on ncs
ncs# devices connect suppress-positive-result

We can now list all open sessions in the session-pool. But note that this is a live pool. Sessions will only remain open for a certain amount of time, the idle time.

ncs# show devices session-pool
        DEVICE            MAX        IDLE
DEVICE  TYPE    SESSIONS  SESSIONS   TIME
-------------------------------------------
ce0     cli     1         unlimited  30
ce1     cli     1         unlimited  30
ce2     cli     1         unlimited  30
ce3     cli     1         unlimited  30
ce4     cli     1         unlimited  30
ce5     cli     1         unlimited  30
pe0     cli     1         unlimited  30
pe1     cli     1         unlimited  30
pe2     cli     1         unlimited  30

In addition to the idle time for sessions, we can also see the type of device, current number of pooled sessions, and maximum number of pooled sessions.

We can close pooled sessions for specific devices.

ncs# devices session-pool pooled-device pe0 close
ncs# devices session-pool pooled-device pe1 close
ncs# devices session-pool pooled-device pe2 close
ncs# show devices session-pool
        DEVICE            MAX        IDLE
DEVICE  TYPE    SESSIONS  SESSIONS   TIME
-------------------------------------------
ce0     cli     1         unlimited  30
ce1     cli     1         unlimited  30
ce2     cli     1         unlimited  30
ce3     cli     1         unlimited  30
ce4     cli     1         unlimited  30
ce5     cli     1         unlimited  30

And we can close all pooled sessions in the session pool.

ncs# devices session-pool close
ncs# show devices session-pool
% No entries found.

The session pool configuration is found in the tailf-ncs-devices.yang submodel. The following part of the YANG device-profile-parameters grouping controls how the session pool is configured:

grouping device-profile-parameters {

  ...

    container session-pool {
      tailf:info "Control how sessions to related devices can be pooled.";
      description
        "NCS uses NED sessions when performing transactions, actions
         etc towards a device. When such a task is completed the NED
         session can either be closed or pooled.

         Pooling a NED session means that the session to the
         device is kept open for a configurable amount of
         time. During this time the session can be re-used for a new
         task. Thus the pooling concept exists to reduce the number
         of new connections needed towards a device that is often
         used.

         By default NCS uses pooling for all device types except
         SNMP. Normally there is no need to change the default
         values.";

      leaf max-sessions {
        type union {
          type enumeration {
            enum unlimited;
          }
          type uint32;
        }
        description
          "Controls the maximum number of open sessions in the pool for
           a specific device. When this threshold is exceeded the oldest
           session in the pool will be closed.
           A Zero value will imply that pooling is disabled for
           this specific device. The label 'unlimited' implies that no
           upper limit exists for this specific device";
      }

      leaf idle-time {
        tailf:info
          "The maximum time that a session is kept open in the pool";
        type uint32 {
          range "1 .. max";
        }
        units "seconds";
        description
          "The maximum time that a session is kept open in the pool.
           If the session is not requested and used before the
           idle-time has expired, the session is closed.
           If no idle-time is set the default is 30 seconds.";
      }
    }
  }
}

In addition under /ncs:devices/global-settings/session-pool/default it is possible to control the global max size of the session pool, as defined by the following yang snippet:

container global-settings {
  tailf:info "Global settings for all managed devices.";
  description
    "Global settings for all managed devices. Some of these
     settings can be overridden per managed device.";

  uses device-profile-parameters {

    ...

    augment session-pool {
      leaf pool-max-sessions {
        type union {
          type enumeration {
            enum unlimited;
          }
          type uint32;
        }
        description
          "Controls the grand total session count in the pool.
           Independently on how different devices are pooled the grand
           total session count can never exceed this value.
           A Zero value will imply that pooling is disabled for all devices.
           The label 'unlimited' implies that no upper limit exists for
           the number open sessions in the pool";
      }
    }
  }
}

Let's illustrate the possibilities with an example configuration of the session pool:

ncs# configure
ncs(config)# devices global-settings session-pool idle-time 100
ncs(config)# devices profiles profile small session-pool max-sessions 3
ncs(config-profile-small)# top
ncs(config)# devices device ce* device-profile small
ncs(config-device-ce*)# top
ncs(config)# devices device pe0 session-pool max-sessions 0
ncs(config-device-pe0)# top
ncs(config)# commit
Commit complete.
ncs(config)# exit

ncs# devices connect suppress-positive-result
ncs# show devices session-pool
        DEVICE            MAX        IDLE
DEVICE  TYPE    SESSIONS  SESSIONS   TIME
-------------------------------------------
ce0     cli     1         3          100
ce1     cli     1         3          100
ce2     cli     1         3          100
ce3     cli     1         3          100
ce4     cli     1         3          100
ce5     cli     1         3          100
pe1     cli     1         unlimited  100
pe2     cli     1         unlimited  100

Now, we set an upper limit to the maximum number of sessions in the pool. Setting the value to 4 is too small for a real situation but serves the purpose of illustration:

ncs# configure
ncs(config)# devices global-settings session-pool pool-max-sessions 4
ncs(config)# commit
Commit complete.
ncs(config)# exit

The number of open sessions in the pool will be adjusted accordingly:

ncs# show devices session-pool
        DEVICE            MAX        IDLE
DEVICE  TYPE    SESSIONS  SESSIONS   TIME
-------------------------------------------
ce4     cli     1         3          100
ce5     cli     1         3          100
pe1     cli     1         unlimited  100
pe2     cli     1         unlimited  100

Device Session Limits

grouping device-profile-parameters {

  ...

    container session-limits {
      tailf:info "Parameters for limiting concurrent access to the device.";
      leaf max-sessions {
        type union {
          type enumeration {
            enum unlimited;
          }
          type uint32 {
            range "1..max";
          }
        }
        default unlimited;
        description
          "Puts a limit to the total number of concurrent sessions
           allowed for the device. The label 'unlimited' implies that no
           upper limit exists for this device.";
      }
    }

  ...

  }

container global-settings {
  tailf:info "Global settings for all managed devices.";
  description
    "Global settings for all managed devices. Some of these
     settings can be overridden per managed device.";

  uses device-profile-parameters {

  ...

    augment session-limits {
      description
        "Parameters for limiting concurrent access to devices.";
      container connect-rate {
        leaf burst {
          type union {
            type enumeration {
              enum unlimited;
            }
            type uint32 {
              range "1..max";
            }
          }
          default unlimited;
          description
            "The number of concurrent connect attempts allowed.
             For example, the devices managed by NSO talk to the same
             server for authentication which can only handle a limited
             number of connections at a time. Then we can limit
             the concurrency of connect attempts with this setting.";
        }
      }
      leaf max-wait-time {
        tailf:info
          "Max time in seconds to wait for device to be available.";
        type union {
          type enumeration {
            enum unlimited;
          }
          type uint32 {
            range "0..max";
          }
        }
        units "seconds";
        default 10;
        description
          "Max time in seconds to wait for a device being available
           to connect. When the maximum time is reached an error
           is returned. Setting this to 0 means that the error is
           returned immediately.";
      }
    }

  ...

}

Tracing Device Communication

To enable tracing, do:

ncs(config)# devices global-settings trace raw trace-dir .logs
ncs(config)# commit

The trace setting only affects new NED connections, so to ensure that we get any tracing data, we can do:

ncs(config)# devices disconnect

The above command terminates all existing connections.

At this point, if you execute a transaction towards one or several devices and then view the trace data.

ncs(config)# do file show logs/ned-cisco-ios-ce0.trace
>> 8-Oct-2014::18:23:18.512 CLI CONNECT to ce0-127.0.0.1:10022 as admin (Trace=true)

  *** output 8-Oct-2014::18:23:18.514 ***
-- SSH connecting to host: 127.0.0.1:10022 --
-- SSH initializing session --

  *** input 8-Oct-2014::18:23:18.547 ***

admin connected from 127.0.0.1 using ssh on ncs
...
ce0(config)#
  *** output 8-Oct-2014::18:23:19.428 ***
snmp-server community topsecret RW

It is possible to clear all existing trace files through the command

ncs(config)# devices clear-trace

Finally, it is worth mentioning the trace functionality does not come for free. It is fairly costly to have the trace turned on. Also, there exists no trace log wrapping functionality.

Checking Device Configuration

The transaction IDs are stored in CDB and can be viewed as:

ncs# show devices device state last-transaction-id
NAME  LAST TRANSACTION ID
----------------------------------------
ce0   ef3bbd344ef94b3fecec5cb93ac7458c
ce1   48e91db163e294bf5c3978d154922c9
ce2   48e91db163e294bf5c3978d154922c9
ce3   48e91db163e294bf5c3978d154922c9
ce4   48e91db163e294bf5c3978d154922c9
ce5   48e91db163e294bf5c3978d154922c9
ce6   48e91db163e294bf5c3978d154922c9
ce7   48e91db163e294bf5c3978d154922c9
ce8   48e91db163e294bf5c3978d154922c9
p0    -
p1    -
p2    -
p3    -
pe0   -
pe1   -
pe2   1412-581909-661436
pe3   -

To check for consistency, we execute:

ncs# devices check-sync
sync-result {
    device ce0
    result in-sync
}
...
sync-result {
    device p1
    result unsupported
}
...

Alternatively for all (or a subset) managed devices:

ncs# devices device ce0..3 check-sync
devices device ce0 check-sync
    result in-sync
devices device ce1 check-sync
    result in-sync
devices device ce2 check-sync
    result in-sync
devices device ce3 check-sync
    result in-sync

The following YANG grouping is used for the return value from the check-sync command:

grouping check-sync-result {
    description
      "Common result data from a 'check-sync' action.";

    leaf result {
      type enumeration {
        enum unknown {
          description
            "NCS have no record, probably because no
             sync actions have been executed towards the device.
             This is the initial state for a device.";
        }
        enum locked {
          tailf:code-name 'sync_locked';
          description
            "The device is administratively locked, meaning that NCS
             cannot talk to it.";
        }
        enum in-sync {
          tailf:code-name 'in-sync-result';
          description
            "The configuration on the device is in sync with NCS.";
        }
        enum out-of-sync {
          description
            "The device configuration is known to be out of sync, i.e.,
             it has been reconfigured out of band.";
        }
        enum unsupported {
          description
            "The device doesn't support the tailf-netconf-monitoring
             module.";
        }
        enum error {
          description
            "An error occurred when NCS tried to check the sync status.
             The leaf 'info' contains additional information.";
        }
      }
    }
  }

Comparing Device Configurations

$ ncs-netsim cli-i ce0
admin connected from 127.0.0.1 using console on ncs
ce0> enable
ce0# configure
Enter configuration commands, one per line. End with CNTL/Z.
ce0(config)# snmp-server community foobar RW
ce0(config)# exit
ce0# exit
$ ncs_cli -C -u admin

admin connected from 127.0.0.1 using console on ncs
ncs# devices device ce0 check-sync
result out-of-sync
info got: 290fa2b49608df9975c9912e4306110 expected: ef3bbd344ef94b3fecec5cb93ac7458c

ncs# devices device ce0 compare-config
diff
 devices {
     device ce0 {
         config {
             ios:snmp-server {
+                community foobar {
+                    RW;
+                }
             }
         }
     }
 }

The diff in the above output should be interpreted as: what needs to be done in NSO to become in sync with the device.

When you decide to reset the configuration with the copy kept in NSO use the option dry-run in conjunction with sync-to and inspect what will be sent to the device:

ncs# devices device ce0 sync-to dry-run
data
      no snmp-server community foobar RW
ncs#

As this is the desired data to send to the device a sync-to can now safely be performed.

ncs# devices device ce0 sync-to
result true
ncs#

The device configuration should now be in sync with the copy in NSO and compare-config ought to yield an empty output:

ncs# devices device ce0 compare-config
ncs#

Initialize Device

There exist several ways to initialize new devices. The two common ways are to initialize a device from another existing device or to use device templates.

From Other

For example, another CE router has been added to our example network. You want to base the configuration of that host on the configuration of the managed device ce0 which has a valid configuration:

ncs(config)# show full-configuration devices device ce0
devices device ce0
 address   127.0.0.1
 port      10022
 ssh host-key ssh-dss
  key-data "AAAAB3NzaC1kc3MAAACBAO9tkTdZgAqJMz8m...
 !
 authgroup default
 device-type cli ned-id cisco-ios-cli-3.8
 state admin-state unlocked
 config
  no ios:service pad
  no ios:ip domain-lookup
  no ios:ip http secure-server
  ios:ip source-route
  ios:interface GigabitEthernet0/1
  exit
  ios:interface GigabitEthernet0/10
  exit
  ios:interface GigabitEthernet0/11
  exit
  ios:interface GigabitEthernet0/12
  exit
  ios:interface GigabitEthernet0/13
  exit
  ios:interface GigabitEthernet0/14
  exit
....

If the configuration is accurate you can create a new managed device based on that configuration as:

Example: Instantiate Device from Other

ncs(config)# devices device ce9 address 127.0.0.1 port 10031
ncs(config-device-ce9)# device-type cli ned-id cisco-ios-cli-3.8
ncs(config-device-ce9)# authgroup default
ncs(config-device-ce9)# instantiate-from-other-device device-name ce0
ncs(config-device-ce9)# top
ncs(config)# show configuration
devices device ce9
 address   127.0.0.1
 port      10031
 authgroup default
 device-type cli ned-id cisco-ios-cli-3.8
 config
  no ios:service pad
  no ios:ip domain-lookup
  no ios:ip http secure-server
  ios:ip source-route
  ios:interface GigabitEthernet0/1
  exit
....
ncs(config)# commit
Commit complete.

This new configuration might not be entirely correct, you can modify any configuration before committing it.

ncs(config)# devices device ce9 sync-to
result false
info Device ce9 is southbound locked

(config)# show full-configuration devices device ce9 state | details
devices device ce9
 state admin-state southbound-locked
!

By Template

Another alternative to instantiating a device from the actual working configuration of another device is to have a number of named device templates that manipulate the configuration.

The template tree looks like this:

submodule tailf-ncs-devices {
  namespace "http://tail-f.com/ns/ncs";
  ...
container devices {
    ........
    list template {
      description
        "This list is used to define named template configurations that
         can be used to either instantiate the configuration for new
         devices, or to apply snippets of configurations to existing
         devices.
         ...
         ";

      key name;
      leaf name {
        description "The name of a specific template configuration";
        type string;
      }
      list ned-id {
        key id;
        leaf id {
          type identityref {
            base ned:ned-id;
          }
        }
        container config {
          tailf:mount-point ncs-template-config;
          tailf:cli-add-mode;
          tailf:cli-expose-ns-prefix;
          description
            "This container is augmented with data models from the devices.";
        }
      }
    }

The tree for device templates is generated from all device YANG models. All constraints are removed and the data type of all leafs is changed to string.

A device template is created by setting the desired data in the configuration. The created device template is stored in NSO CDB.

Example: Create ce-initialize Template

ncs(config)# devices template ce-initialize ned-id cisco-ios-cli-3.8 config
ncs(config-config)# no ios:service pad
ncs(config-config)# no ios:ip domain-lookup
ncs(config-config)# ios:ip dns server
ncs(config-config)# no ios:ip http server
ncs(config-config)# no ios:ip http secure-server
ncs(config-config)# ios:ip source-route true
ncs(config-config)# ios:interface GigabitEthernet 0/1
ncs(config- GigabitEthernet-0/1)# exit
ncs(config-config)# ios:interface GigabitEthernet 0/2
ncs(config- GigabitEthernet-0/2)# exit
ncs(config-config)# ios:interface GigabitEthernet 0/3
ncs(config- GigabitEthernet-0/3)# exit
ncs(config-config)# ios:interface Loopback 0
ncs(config-Loopback-0)# exit
ncs(config-config)# ios:snmp-server community public RO
ncs(config-community-public)# exit
ncs(config-config)# ios:snmp-server trap-source GigabitEthernet 0/2
ncs(config-config)# top
ncs(config)# commit

The device template created in the example above (Create ce-initialize template) can now be used to initialize single devices or device groups, see .

In the following CLI session, a new device ce10 is created:

ncs(config)# devices device ce10 address 127.0.0.1 port 10032
ncs(config-device-ce10)# device-type cli ned-id cisco-ios-cli-3.8
ncs(config-device-ce10)# authgroup default
ncs(config-device-ce10)# top
ncs(config)# commit

Initialize the newly created device ce10 with the device template ce-initialize:

ncs(config)# devices device ce10 apply-template template-name ce-initialize
apply-template-result {
    device ce10
    result no-capabilities
    info No capabilities found for device: ce10. Has a sync-from the device
         been performed?
}

When initializing devices, NSO does not have any knowledge about the capabilities of the device, no connect has been done. This can be overridden by the option accept-empty-capabilities

ncs(config)# devices device ce10 \
apply-template template-name ce-initialize accept-empty-capabilities
apply-template-result {
    device ce10
    result ok
}

Inspect the changes made by the template ce-initialize:

ncs(config)# show configuration
devices device ce10
 config
  ios:ip dns server
  ios:interface GigabitEthernet0/1
  exit
  ios:interface GigabitEthernet0/2
  exit
  ios:interface GigabitEthernet0/3
  exit
  ios:interface Loopback0
  exit
  ios:snmp-server community public RO
  ios:snmp-server trap-source GigabitEthernet0/2
 !
!

Device Templates

This section shows how device templates can be used to create and change device configurations. See in Templates for other ways of using templates.

The $NCS_DIR/examples.ncs/service-provider/mpls-vpn example comes with a pre-populated template for SNMP settings.

ncs(config)# show full-configuration devices template
devices template snmp1
 ned-id cisco-ios-cli-3.8
  config
   ios:snmp-server community {$COMMUNITY}
    RO
   !
  !
 !
 ned-id cisco-iosxr-cli-3.5
  config
   cisco-ios-xr:snmp-server community {$COMMUNITY}
    RO
   !
  !
 !
 ned-id juniper-junos-nc-3.0
  config
   junos:configuration snmp community {$COMMUNITY}
    authorization read-only
   !
  !
 !
!

The variable $DEVICE is used internally by NSO and can not be used in a template.

A template can be applied to a device, a device group, and a range of devices. It can be used as shown in to create the day-zero config for a newly created device.

Applying the snmp1 template, providing a value for the COMMUNITY template variable:

ncs(config)# devices device ce2 apply-template template-name \
      snmp1 variable { name COMMUNITY value 'FUZBAR' }
ncs(config)# show configuration
devices device ce2
 config
  ios:snmp-server community FUZBAR RO
 !
!
ncs(config)# commit dry-run outformat native
native {
    device {
        name ce2
        data snmp-server community FUZBAR RO
    }
}
ncs(config)# commit
Commit complete.

The result of applying the template:

ncs(config)# show full-configuration devices device ce2 config\
   ios:snmp-server
devices device ce2
 config
  ios:snmp-server community FUZBAR RO
 !
!

Debug

By adding the CLI pipe flag debug template when applying a template, the CLI will output detailed information on what is happening when the template is being applied:

ncs(config)# devices device ce2 apply-template template-name \
      snmp1 variable { name COMMUNITY value 'FUZBAR' } | debug template
Operation 'merge' on existing node: /devices/device[name='ce2']
The device /devices/device[name='ce2'] does not support
namespace 'http://tail-f.com/ned/cisco-ios-xr' for node "'snmp-server'"
Skipping...
The device /devices/device[name='ce2'] does not support
namespace 'http://xml.juniper.net/xnm/1.1/xnm' for node "configuration"
Skipping...
Variable $COMMUNITY is set to "FUZBAR"
Operation 'merge' on non-existing node:
/devices/device[name='ce2']/config/ios:snmp-server/community[name='FUZBAR']
Operation 'merge' on non-existing node:
/devices/device[name='ce2']/config/ios:snmp-server/community[name='FUZBAR']/RO

Renaming Devices in NSO

The /devices/device/rename action renames an existing device and fixes the node/data dependencies in CDB. When renaming a device, the action fixes the following dependencies:

Leafrefs and instance-identifiers (both config true and config false).
Monitor and kick-node of kickers, if they refer to this device.
Diff-sets and forward-diff-sets of services that touch this device (This includes nano-services and also zombies).

NSO maintains a history of past renames at /devices/device/rename-history.

Examples

admin@ncs> request devices device ex0 rename new-name foo
result true
[ok][2024-04-16 20:51:51]
admin@ncs> show devices device foo rename-history | tab
FROM  TO   WHEN                              USER
----------------------------------------------------
ex0   foo  2024-04-16T18:51:51.578439+00:00  admin

[ok][2024-04-16 20:52:07]
admin@ncs> show configuration devices device ex0
---------------------------------------------^
syntax error: element does not exist
[error][2024-04-16 20:52:09]
admin@ncs> show configuration devices device foo
address   127.0.0.1;
port      12022;
...

admin@ncs> request devices commit-queue add-lock device [ ex1 ]
commit-queue-id 1713297244546
[ok][2024-04-16 21:54:04]
admin@ncs> request devices device ex1 rename new-name foo wait-for-lock { timeout 5 }
result false
info ex1: A timeout occured when trying to add device lock to the commit queue
[ok][2024-04-16 21:54:26]

The parameter no-wait-for-lock makes the action fail immediately if the device lock is unavailable, while a timeout of infinity can be used to make it wait indefinitely for the lock.

Limitations

If a nano-service has components whose names are derived from the device name, and that device is renamed, the corresponding service components in its plan are not automatically renamed.

For example, let's say the nano-service has components with names matching device names.

admin@ncs% run show vlan-state test plan | tab
                                                                           POST
              BACK                                                         ACTION
TYPE  NAME  TRACK  GOAL  STATE        STATUS   WHEN                 ref  STATUS
---------------------------------------------------------------------------------
self  self  false  -     init         reached  2024-04-16T21:38:34  -    -
                         ready        reached  2024-04-16T21:38:34  -    -
vlan  ex1   false  -     init         reached  2024-04-16T21:38:34  -    -
                         router-init  reached  2024-04-16T21:38:34  -    -
                         ready        reached  2024-04-16T21:38:34  -    -
vlan  ex2   false  -     init         reached  2024-04-16T21:38:34  -    -
                         router-init  reached  2024-04-16T21:38:34  -    -
                         ready        reached  2024-04-16T21:38:34  -    -

[ok][2024-04-16 21:38:44]

If this device is renamed, the corresponding nano-service component is not renamed.

admin@ncs% request devices device ex1 rename new-name newex1
result true
[ok][2024-04-16 21:39:21]

[edit]
admin@ncs% run show vlan-state test plan | tab
                                                                           POST
              BACK                                                         ACTION
TYPE  NAME  TRACK  GOAL  STATE        STATUS   WHEN                 ref  STATUS
---------------------------------------------------------------------------------
self  self  false  -     init         reached  2024-04-16T21:38:34  -    -
                         ready        reached  2024-04-16T21:38:34  -    -
vlan  ex1   false  -     init         reached  2024-04-16T21:38:34  -    -
                         router-init  reached  2024-04-16T21:38:34  -    -
                         ready        reached  2024-04-16T21:38:34  -    -
vlan  ex2   false  -     init         reached  2024-04-16T21:38:34  -    -
                         router-init  reached  2024-04-16T21:38:34  -    -
                         ready        reached  2024-04-16T21:38:34  -    -

[ok][2024-04-16 21:39:24]

To handle this, the component with the old name must be force-back-tracked and the service re-deployed.

admin@ncs% request vlan-state test plan component vlan ex1 force-back-track
result true
[ok][2024-04-16 21:39:51]

[edit]
admin@ncs% run show vlan-state test plan | tab
                                                                         POST
            BACK                                                         ACTION
TYPE  NAME  TRACK  GOAL  STATE        STATUS   WHEN                 ref  STATUS
---------------------------------------------------------------------------------
self  self  false  -     init         reached  2024-04-16T21:38:34  -    -
                         ready        reached  2024-04-16T21:38:34  -    -
vlan  ex2   false  -     init         reached  2024-04-16T21:38:34  -    -
                         router-init  reached  2024-04-16T21:38:34  -    -
                         ready        reached  2024-04-16T21:38:34  -    -

[ok][2024-04-16 21:39:54]

[edit]
admin@ncs% request vlan test re-deploy
[ok][2024-04-16 21:40:02]

[edit]
admin@ncs% run show vlan-state test plan | tab
                                                                           POST
              BACK                                                         ACTION
TYPE  NAME    TRACK  GOAL  STATE        STATUS   WHEN                 ref  STATUS
-----------------------------------------------------------------------------------
self  self    false  -     init         reached  2024-04-16T21:38:34  -    -
                           ready        reached  2024-04-16T21:40:02  -    -
vlan  ex2     false  -     init         reached  2024-04-16T21:38:34  -    -
                           router-init  reached  2024-04-16T21:38:34  -    -
                           ready        reached  2024-04-16T21:38:34  -    -
vlan  newex1  false  -     init         reached  2024-04-16T21:40:02  -    -
                           router-init  reached  2024-04-16T21:40:02  -    -
                           ready        reached  2024-04-16T21:40:02  -    -

[ok][2024-04-17 08:40:05]

When a device is renamed, all components that derive their name from that device's name in all the service instances must be force-back-tracked.

Auto-configuring Devices in NSO

Examples

NSO will auto-configure a new device in a transaction if either /devices/device/auto-configure/vendor or /devices/device/auto-configure/ned-id is set in that transaction.

admin@ncs% show packages package component ned device
packages package router-nc-1.0
 component router
  ned device vendor "Acme Inc."
  ned device product-family [ "Acme Netconf router 1.0" ]
  ned device operating-system [ AcmeOS "AcmeOS 2.0" ]
[ok][2024-04-16 19:53:20]
admin@ncs% set devices device mydev address 127.0.0.1 port 12022 authgroup default
[ok][2024-04-16 19:53:34]

[edit]
admin@ncs% set devices device mydev auto-configure vendor "Acme Inc." operating-system AcmeOs
[ok][2024-04-16 19:53:36]

[edit]
admin@ncs% commit | details
...
 2024-04-16T19:53:37.655 device mydev: auto-configuring...
 2024-04-16T19:53:37.659 device mydev: configuring admin state... ok (0.000 s)
 2024-04-16T19:53:37.659 device mydev: fetching ssh host keys... ok (0.011 s)
 2024-04-16T19:53:37.671 device mydev: copying configuration from device... ok (0.054 s)
 2024-04-16T19:53:37.726 device mydev: auto-configuring: ok (0.070 s)
...

One can configure either vendor and product-family, or vendor and operating-system or just the ned-id explicitly.

admin@ncs% set devices device d1 auto-configure vendor "Acme Inc." product-family "Acme router"

admin@ncs% set devices device d2 auto-configure vendor "Acme Inc." operating-system AcmeOS

admin@ncs% set devices device d3 auto-configure ned-id router-nc-1.0

admin@ncs% set devices device mydev2 auto-configure vendor "Acme Inc." operating-system AcmeOS
[ok][2024-04-16 20:03:05]

[edit]
admin@ncs% set devices device mydev2 state admin-state southbound-locked
[ok][2024-04-16 20:03:05]

[edit]
admin@ncs% commit | details
...
 2024-04-16T20:03:08.604 device mydev2: auto-configuring...
 2024-04-16T20:03:08.606 device mydev2: configuring admin state... ok (0.000 s)
 2024-04-16T20:03:08.606 device mydev2: fetching ssh host keys... skipped - 'southbound-locked' configured (0.001 s)
 2024-04-16T20:03:08.608 device mydev2: auto-configuring: ok (0.003 s)
...

`oper-state` and `admin-state`

ncs# show devices device ce9 state oper-state
state oper-state disabled

Or, a slightly more interesting CLI usage:

ncs# show devices device state oper-state
      OPER
NAME  STATE
----------------
ce0   enabled
ce1   enabled
ce10  disabled
ce2   enabled
ce3   enabled
ce4   enabled
ce5   enabled
ce6   enabled
ce7   enabled
ce8   enabled
ce9   disabled
p0    enabled
p1    enabled
p2    enabled
p3    enabled
pe0   enabled
pe1   enabled
pe2   enabled
pe3   enabled

ncs# show devices device ce0..9 state oper-state
      OPER
NAME  STATE
----------------
ce0   enabled
ce1   enabled
ce2   enabled
ce3   enabled
ce4   enabled
ce5   enabled
ce6   enabled
ce7   enabled
ce8   enabled
ce9   disabled

$ ncs-netsim stop ce0
DEVICE ce0 STOPPED
$ ncs_cli -C -u admin
ncs# show devices device ce0 state oper-state
state oper-state enabled

ncs(config)# devices device ce0 config ios:snmp-server contact joe@acme.com
ncs(config-config)# commit
Aborted: Failed to connect to device ce0: connection refused: Connection refused
ncs(config-config)# *** ALARM connection-failure: Failed to
connect to device ce0: connection refused: Connection refused

Now, NSO has failed to connect to it, NSO knows that ce0 is dead:

ncs# show devices device ce0 state oper-state
state oper-state disabled

This concludes the oper-state discussion. The next state to be illustrated is the admin-state. The admin-state is what the operator configures, this is the desired state of the managed device.

In tailf-ncs.yang we have the following configuration definition for admin-state:

Example: tailf-ncs-devices.yang - admin-state

submodule tailf-ncs-devices {
  ....

  typedef admin-state {
    type enumeration {
      enum locked {
        description
          "When a device is administratively locked, it is not possible
           to modify its configuration, and no changes are ever
           pushed to the device.";
      }
      enum unlocked {
        description
          "Device is assumed to be operational.
           All changes are attempted to be sent southbound.";
      }
      enum southbound-locked {
        description
          "It is possible to configure the device, but
           no changes are sent to the device. Useful admin mode
           when pre provisioning devices. This is the default
           when a new device is created.";
      }
      enum config-locked {
        description
          "It is possible to send live-status commands or RPCs
           but it is not possible to modify the configuration
           of the device.";
      }
    }
  }

  ....
  container devices {
     ....
     container state {
        ....
        leaf admin-state {
          type admin-state;
          default southbound-locked;
        }

        leaf admin-state-description {
          type string;
          description
            "Reason for the admin state.";

        }

In the example above (tailf-ncs-devices.yang - admin-state), you can see the four different admin states for a managed device as defined in the YANG model.

locked - This means that all changes to the device are forbidden. Any transaction which attempts to manipulate the configuration of the device will fail. It is still possible to read the configuration of the device.
unlocked -This is the state a device is set into when the device is operational. All changes to the device are attempted to be sent southbound.
southbound-locked - This is the default value. It means that it is possible to manipulate the configuration of the device but changes done to the device configuration are never pushed to the device. This mode is useful during e.g. pre-provisioning, or when we instantiate new devices.
config-locked - This means that any transaction which attempts to manipulate the configuration of the device will fail. It is still possible to read the configuration of the device and send live-status commands or RPCs.

Configuration Source

Example: tailf-ncs-devices.yang - source

submodule tailf-ncs-devices {
  ...
      container source {
        tailf:info "How the device was added to NCS";
        leaf added-by-user {
          type string;
        }
        leaf context {
          type string;
        }
        leaf when {
          type yang:date-and-time;
        }
        leaf from-ip {
          type inet:ip-address;
        }
        leaf source {
          type string;
          reference "TMF518 NRB Network Resource Basics";
        }
      }

These attributes should be automatically set by the integration towards the inventory source, rather than manipulated manually.

added-by-user: Identify the user who loaded the managed device.
context: In what context was the device loaded.
when: When the device was added to NSO.
from-ip: From which IP the load activity was run.
source: Identify the source of the managed device such as the inventory system name or the name of the source file.

Capabilities, Modules, and Revision Management

ncs# show devices device ce0 capability
capability urn:ietf:params:netconf:capability:with-defaults:1.0?basic-mode=trim
capability urn:ios
 revision 2015-03-16
 module   tailf-ned-cisco-ios
capability urn:ios-stats
 revision 2015-03-16
 module   tailf-ned-cisco-ios-stats

ncs#  show devices device ce0 capability module
NAME                       REVISION    FEATURE  DEVIATION
-----------------------------------------------------------
tailf-ned-cisco-ios        2015-03-16  -        -
tailf-ned-cisco-ios-stats  2015-03-16  -        -

Discovery of a NETCONF Device

ncs(config)# devices device foo address 127.0.0.1 port 12033 authgroup default
ncs(config-device-foo)# device-type netconf ned-id netconf
ncs(config-device-foo)# state admin-state unlocked
ncs(config-device-foo)# commit
Commit complete.
ncs(config-device-foo)# exit
ncs(config)# exit
ncs# devices fetch-ssh-host-keys device foo
fetch-result {
    device foo
    result updated
    fingerprint {
        algorithm ssh-rsa
        value 14:3c:79:87:69:8e:e2:f0:6d:43:07:8c:89:41:fd:7f
    }
}
ncs# devices device foo connect
result true
info (admin) Connected to foo - 127.0.0.1:12033
ncs# show devices device foo capability
capability :candidate:1.0
capability :confirmed-commit:1.0
...
capability http://xml.juniper.net/xnm/1.1/xnm
 module junos
capability urn:ietf:params:xml:ns:yang:ietf-yang-types
 revision 2013-07-15
 module   ietf-yang-types
capability urn:juniper-rpc
 module junos-rpc
...

We can also check which modules the loaded NEDs support. Then we can pick the most suitable NED and configure the device with this NED ID.

ncs# show devices ned-ids
ID                    NAME                          REVISION
--------------------------------------------------------------
cisco-ios-xr-v2       tailf-ned-cisco-ios-xr        -
                      tailf-ned-cisco-ios-xr-stats  -
lsa-netconf
netconf
snmp
alu-sr-cli-3.4        tailf-ned-alu-sr              -
                      tailf-ned-alu-sr-stats        -
cisco-ios-cli-3.8     tailf-ned-cisco-ios           -
                      tailf-ned-cisco-ios-stats     -
cisco-iosxr-cli-3.5   tailf-ned-cisco-ios-xr        -
                      tailf-ned-cisco-ios-xr-stats  -
juniper-junos-nc-3.0  junos                         -
                      junos-rpc                     -
ncs# config
Entering configuration mode terminal
ncs(config)# devices device foo device-type netconf ned-id juniper-junos-nc-3.0
ncs(config-device-foo)# commit
Commit complete.

Configuration Datastore Support

NSO divides devices into the following groups.

start_trans_running: This mode is used for devices that support the Tail-f proprietary transaction extension defined by http://tail-f.com/ns/netconf/transactions/1.0. Read more on this in the Tail-f ConfD user guide. In principle it's a means to - over the NETCONF interface - control transaction processing towards the running data store. This may be more efficient than going through the candidate data store. The downside is that it is Tail-f proprietary non-standardized technology.
lock_candidate: This mode is used for devices that support the candidate data store but disallow direct writes to the running data store.
lock_reset_candidate: This mode is used for devices that support the candidate data and also allow direct writes to the running data store. This is the default mode for Tail-f ConfD NETCONF server. Since the running data store is configurable, we must, before each configuration attempt, copy all of the running to the candidate. (ConfD has optimized this particular usage pattern, so this is a very cheap operation for ConfD)
startup: This mode is used for devices that have writable running, no candidate but do support the startup data store. This is the typical mode for Cisco-like devices.
running-only: This mode is used for devices that only support writable running.
NED: The transaction is controlled by a Network Element Driver. The exact transaction mode depends on the type of the NED.

ncs# show devices device ce0 state transaction-mode
state transaction-mode ned
ncs# show devices device pe2 state transaction-mode
state transaction-mode lock-candidate

NSO talking to ConfD device running in its standard configuration, thus lock-reset-candidate.

If a managed device does not support this capability, NSO attempts to do the best it can.

This is how NSO handles common failure scenarios:

The operator aborts the transaction, or the NSO loses the SSH connection to another managed device which is also participating in the same network transaction. If the device does support the confirmed-commit capability, NSO aborts the outstanding yet-uncommitted transaction simply by closing the SSH connection. When the device does not support the confirmed-commit capability, NSO has the reverse diff and simply sends the precise undo information to the device instead.
The device rejects the transaction in the first place, i.e. the NSO attempts to modify its running data store. This is an easy case since NSO then simply aborts the transaction as a whole in the initial commit confirmed [time] attempt.
NSO loses SSH connectivity to the device during the timeout period. This is a real error case and the configuration is now in an unknown state. NSO will abort the entire transaction, but the configuration of the failing managed device is now probably in error. The correct procedure once network connectivity has been restored to the device is to sync it in the direction from NSO to the device. The NSO copy of the device configuration will be what was configured before the failed transaction.

Thus, even if not all participating devices have first-class NETCONF server implementations, NSO will attempt to fake the confirmed-commit capability.

Action Proxy

For example, the Juniper NED comes with a set of JunOS RPCs defined in: $NCS_DIR/packages/neds/juniper-junos/src/yang/junos-rpc.yang

module junos-rpc {
  ...
  rpc request-package-add {
  ...
  rpc request-reboot {
  ...
  rpc get-software-information {
  ...
  rpc ping {

ncs(config)# devices device pe2 rpc rpc-
Possible completions:
  rpc-get-software-information  rpc-idle-timeout  rpc-ping \
  rpc-request-package-add  rpc-request-reboot

ncs(config)# devices device pe2 rpc \
rpc-get-software-information get-software-information brief

In the simulated environment of the mpls-vpn example, these RPCs might not have been implemented.

Device Groups

The definition of device groups resides at the same layer in the NSO data model as the device list, thus we have:

Example: Device Groups

submodule tailf-ncs-devices {
  namespace "http://tail-f.com/ns/ncs";
  ...
  container devices {
     .....
    list device {
     ...
     }
    list device-group {
      key name;
      leaf name {
        type string;
      }
      description
        "A named group of devices, some actions can be
         applied to an entire  group of devices, for example
         apply-template, and the sync actions.";
      leaf-list device-name {
        type leafref {
          path "/devices/device/name";
        }
      }
      leaf-list device-group {
        type leafref {
          path "/devices/device-group/name";
        }
        description
          "A list of device groups contained in this device group.

           Recursive definitions are not valid.";
      }
      leaf-list member {
        type leafref {
          path "/devices/device/name";
        }
        config false;
        description
          "The current members of the device-group.  This is a flat list
           of all the devices in the group.";
      }
      uses connect-grouping ;
      uses sync-grouping;
      uses check-sync-grouping;
      uses apply-template-grouping;
    }
  }
}

The MPLS VPN example comes with a couple of pre-defined device-groups:

ncs(config)# show full-configuration devices device-group
devices device-group C
 device-name [ ce0 ce1 ce3 ce4 ce5 ce6 ce7 ce8 ]
!
devices device-group P
 device-name [ p0 p1 p2 p3 ]
!
devices device-group PE
 device-name [ pe0 pe1 pe2 pe3 ]
!

Device groups are created like below:

Example: Create Device Group

ncs(config)# devices device-group my-group device-name ce0
ncs(config-device-group-my-group)# device-name pe
Possible completions:
  pe0  pe1  pe2  pe3
ncs(config-device-group-my-group)# device-name pe0
ncs(config-device-group-my-group)# device-name p0
ncs(config-device-group-my-group)# commit

ncs(config-device-group-my-group)# device-group PE
ncs(config-device-group-my-group)# commit

ncs(config)# show full-configuration devices device-group my-group
devices device-group my-group
 device-name  [ ce0 p0 pe0 ]
 device-group [ PE ]
!
ncs(config)# exit

ncs# show devices device-group my-group
NAME      MEMBER                      INDETERMINATES  CRITICALS  MAJORS  MINORS  WARNINGS
-------------------------------------------------------------------------------------------
my-group  [ ce0 p0 pe0 pe1 pe2 pe3 ]  0               0          1       0       0

Once you have a group, you can sync and check-sync the entire group.

ncs# devices device-group C sync-to

However, what makes device groups really interesting is the ability to apply a template to a group. You can use the pre-populated templates to apply SNMP settings to device groups.

ncs(config)# devices device-group C apply-template \
template-name snmp1 variable { name COMMUNITY value 'cinderella' }
ncs(config)# show configuration
devices device ce0
 config
  ios:snmp-server community cinderella RO
 !
!
devices device ce1
 config
  ios:snmp-server community cinderella RO
 !
!
...
ncs(config)# commit

Policies

Assume you would like to enforce all CE routers to have a Gigabit interface 0/1.

Example: Policies

ncs(config)# policy rule gb-one-zero
ncs(config-rule-gb-one-zero)# foreach /ncs:devices/device[starts-with(name,'ce')]/config
ncs(config-rule-gb-one-zero)# expr ios:interface/ios:GigabitEthernet[ios:name='0/1']
ncs(config-rule-gb-one-zero)# warning-message "{../name} should have 0/1 interface"
ncs(config-rule-gb-one-zero)# commit
zork(config-rule-gb-one-zero)# top
zork(config)# !
ncs(config)# show full-configuration policy
policy rule gb-one-zero
 foreach         /ncs:devices/device[starts-with(name,'ce')]/config
 expr            ios:interface/ios:GigabitEthernet[ios:name='0/1']
 warning-message "{../name} should have 0/1 interface"
!
ncs(config)# no devices device ce0 config ios:interface GigabitEthernet 0/1
ncs(config)# validate
Validation completed with warnings:
  ce0 should have 0/1 interface
ncs(config)# no devices device ce1 config ios:interface GigabitEthernet 0/1
ncs(config)# validate
Validation completed with warnings:
  ce1 should have 0/1 interface
  ce0 should have 0/1 interface
ncs(config)# commit
The following warnings were generated:
  ce1 should have 0/1 interface
  ce0 should have 0/1 interface
Proceed? [yes,no] yes
Commit complete.

Validation is always performed at commit but can also be requested interactively.

Commit Queue

The big upside of this scheme is that the transactional throughput through NSO is considerably higher. Also, transient devices are handled better. The downsides are:

If a device rejects the proposed change, NSO and the device are now out of sync until any error recovery is performed. Whenever this happens, an NSO alarm (called commit-through-queue-failed) is generated.
While a transaction remains in the queue, i.e., it has been accepted for delivery by NSO but is not yet delivered, the view of the network in NSO is not (yet) correct. Eventually, though, the queued item will be delivered, thus achieving eventual consistency.

To facilitate the two use cases of the commit queue the outbound queue item can be either in an atomic or non-atomic mode.

In the following sequences, the simulated device ce0 is stopped to illustrate the commit queue. This can be achieved by the following sequence including returning to the NSO CLI config mode:

$ ncs-netsim stop ce0
DEVICE ce0 STOPPED
$ ncs_cli -C -u admin

admin connected from 127.0.0.1 using console on ncs
ncs# config

ncs(config)# commit commit-queue
Possible completions:
  async    Commit through commit queue and return immediately
  bypass   Bypass commit-queue when queue is enabled by default
  sync     Commit through commit queue and wait for reply
ncs(config)# commit commit-queue async

Or, by configuring NSO to always run all transactions through the commit queue as in:

ncs(config)# devices global-settings commit-queue enabled-by-default
[false,true] (false): true
ncs(config)# commit

Or, by configuring a number of devices to run through the commit queue as default:

ncs(config)# devices device ce0..2 commit-queue enabled-by-default
[false,true] (false): true
ncs(config)# commit

Do some changes and commit through the commit queue:

Example: Commit through Commit Queue

ncs(config)# devices device ce0..2 config ios:snmp-server \
    trap-source GigabitEthernet 0/1
ncs(config-config)# commit
commit-queue-id 9494446997
Commit complete.
ncs(config-config)# *** ALARM connection-failure: Failed to
connect to device ce0: connection refused: Connection refused

Commit Queue Scheduling

ncs(config)# devices device ce0 config ios:interface GigabitEthernet 0/25
ncs(config-if)# commit
commit-queue-id 9494530158
Commit complete.
ncs(config-if)# *** ALARM commit-through-queue-blocked:
Commit Queue item 9494530158 is blocked because qitem 9494446997
cannot connect to ce0

waiting: The queue item is waiting for other queue items to finish. This is because the waiting queue item has participating devices that are part of other queue items, ahead in the queue. It is waiting for a set of devices, to not occur ahead of itself in the queue.
executing: The queue item is currently being processed. Multiple queue items can run currently as long as they don't share any managed devices. Transient errors might be present. These errors occur when NSO fails to communicate with some of the devices. The errors are shown in the leaf-list transient-errors. Retries will take place at intervals specified in /ncs:devices/global-settings/commit-queue/retry-timeout. Examples of transient errors are connection failures and that the changes are rejected due to the device being locked. Transient errors are potentially bad since the queue might grow if new items are added, waiting for the same device.
locked: This queue item is locked and will not be processed until it has been unlocked, see the action /ncs:devices/commit-queue/queue-item/unlock. A locked queue item will block all subsequent queue items that are using any device in the locked queue item.

Viewing and Manipulating the Commit Queue

You can view the queue in the CLI. There are three different view modes, summary, normal, and detailed. Depending on the output, both the summary and the normal look good:

Example: Viewing Queue Items

ncs# show devices commit-queue | notab
devices commit-queue queue-item 9494446997
 age       144
 status    executing
 devices   [ ce0 ce1 ce2 ]
 transient ce0
  reason "Failed to connect to device ce0: connection refused"
 is-atomic true
devices commit-queue queue-item 9494530158
 age         61
 status      blocked
 devices     [ ce0 ]
 waiting-for [ ce0 ]
 is-atomic   true

The age field indicated how many seconds a queue item has been in the queue.

You can also view the queue items in detailed mode:

ncs# show devices commit-queue queue-item 9494530158 details | notab
devices commit-queue queue-item 9494530158
 age         278
 status      blocked
 devices     [ ce0 ]
 waiting-for [ ce0 ]
 is-atomic   true
 modification ce0
  data       <interface xmlns="urn:ios">
               <GigabitEthernet>
                 <name>0/25</name>
               </GigabitEthernet>
             </interface>

  local-user admin

The commit queue is disabled when both HA is enabled, and its HA role is none, i.e., not primary or secondary. See .

A number of useful actions are available to manipulate the queue:

devices commit-queue add-lock device [ ... ]. This adds a fictive queue item to the commit queue. Any queue item, affecting the same devices, which is entering the commit queue will have to wait for this lock item to be unlocked or deleted. If no devices are specified, all devices in NSO are locked.
devices commit-queue clear. This action clears the entire queue. All devices present in the commit queue will, after this action, have executed be out of sync. The clear action is a rather blunt tool and is not recommended to be used in any normal use case.
devices commit-queue prune device [ ... ] . This action prunes all specified devices from all queue items in the commit queue. The affected devices will, after this action has been executed, be out of sync. Devices that are currently being committed to will not be pruned unless the force option is used. Atomic queue items will not be affected, unless all devices in it are pruned. The force option will brutally kill an ongoing commit. This could leave the device in a bad state. It is not recommended in any normal use case.
devices commit-queue set-atomic-behaviour atomic [ true,false ]. This action sets the atomic behavior of all queue items. If these are set to false, the devices contained in these queue items can start executing if the same devices in other non-atomic queue items ahead of it in the queue are completed. If set to true, the atomic integrity of these queue items is preserved.
devices commit-queue wait-until-empty. This action waits until the commit queue is empty. The default is to wait infinity. A timeout can be specified to wait for a number of seconds. The result is empty if the queue is empty or timeout if there are still items in the queue to be processed.
devices commit-queue queue-item [ id ] lock. This action puts a lock on an existing queue item. A locked queue item will not start executing until it has been unlocked.
devices commit-queue queue-item [ id ] unlock. This action unlocks a locked queue item. Unlocking a queue item that is not locked is silently ignored.
devices commit-queue queue-item [ id ] delete. This action deletes a queue item from the queue. If other queue items are waiting for this (deleted) item, they will all automatically start to run. The devices of the deleted queue item will, after the action has been executed, be out of sync if they haven't started executing. Any error option set for the queue item will also be disregarded. The force option will brutally kill an ongoing commit. This could leave the device in a bad state. It is not recommended in any normal use case.
devices commit-queue queue-item [ id ] prune device [ ... ]. This action prunes the specified devices from the queue item. Devices that are currently being committed to will not be pruned unless the force option is used. Atomic queue items will not be affected, unless all devices in it are pruned. The force option will brutally kill an ongoing commit. This could leave the device in a bad state. It is not recommended in any normal use case.
devices commit-queue queue-item [ id ] set-atomic-behaviour atomic [ true,false ]. This action sets the atomic behavior of this queue item. If this is set to false, the devices contained in this queue item can start executing if the same devices in other non-atomic queue items ahead of it in the queue are completed. If set to true, the atomic integrity of the queue item is preserved.
devices commit-queue queue-item [ id ] wait-until-completed. This action waits until the queue item is completed. The default is to wait infinity. A timeout can be specified to wait for a number of seconds. The result is completed if the queue item is completed or timeout if the timer expired before the queue item was completed.
devices commit-queue queue-item [ id ] retry. This action retries devices with transient errors instead of waiting for the automatic retry attempt. The device option will let you specify the devices to retry.

ncs# show alarms alarm-list alarm ce0 commit-through-queue-blocked
alarms alarm-list alarm ce0 commit-through-queue-blocked /devices/device[name='ce0'] 9494530158
 is-cleared              false
 last-status-change      2015-02-09T16:48:17.915+00:00
 last-perceived-severity warning
 last-alarm-text         "Commit queue item 9494530158 is blocked because item 9494446997 cannot connect to ce0"
 status-change 2015-02-09T16:48:17.915+00:00
  received-time      2015-02-09T16:48:17.915+00:00
  perceived-severity warning
  alarm-text         "Commit queue item 9494530158 is blocked because item 9494446997 cannot connect to ce0"

Block other affecting device ce0 from entering the commit queue:

ncs(config)# devices commit-queue add-lock device [ ce0 ] block-others
commit-queue-id 9577950918
ncs# show devices commit-queue | notab
devices commit-queue queue-item 9494446997
 age       1444
 status    executing
 devices   [ ce0 ce1 ce2 ]
 transient ce0
  reason "Failed to connect to device ce0: connection refused"
 is-atomic true
devices commit-queue queue-item 9494530158
 age         1361
 status      blocked
 devices     [ ce0 ]
 waiting-for [ ce0 ]
 is-atomic   true
devices commit-queue queue-item 9577950918
 age         55
 status      locked
 devices     [ ce0 ]
 waiting-for [ ce0 ]
 is-atomic   true

Now queue item 9577950918 is blocking other items using ce0 from entering the queue.

Prune the usage of the device ce0 from all queue items in the commit queue:

ncs(config)# devices commit-queue set-atomic-behaviour atomic false
ncs(config)# devices commit-queue prune device [ ce0 ]
num-affected-queue-items 2
num-deleted-queue-items 1
ncs(config)# show devices commit-queue | notab
devices commit-queue queue-item 9577950918
 age              102
 status           locked
 kilo-bytes-size  1
 devices          [ ce0 ]
 is-atomic        true

The lock will be in the queue until it has been deleted or unlocked. Queue items affecting other devices are still allowed to enter the queue.

Fix the problem with the device ce0, remove the lock item and sync from the device:

ncs(config)# devices commit-queue queue-item 9577950918 delete
ncs(config)# devices device ce0 sync-from
result true

Commit Queue in a Cluster Environment

Example: Commit Queue in an LSA Cluster

ncs(config)# show configuration
vpn l3vpn volvo
 as-number 65101
 endpoint branch-office1
  ce-device    ce1
  ce-interface GigabitEthernet0/11
  ip-network   10.7.7.0/24
  bandwidth    6000000
 !
 endpoint main-office
  ce-device    ce0
  ce-interface GigabitEthernet0/11
  ip-network   10.10.1.0/24
  bandwidth    12000000
 !
!

ncs(config-if)# commit commit-queue async
commit-queue-id 9494530158

ncs# show devices commit-queue | notab
devices commit-queue queue-item 9494446997
 age       60
 status    executing
 devices   [ lsa-nso2 lsa-nso3 ]
 is-atomic true

ncs# show devices commit-queue | notab
devices commit-queue queue-item 9494446997
 age       66
 status    executing
 devices   [ lsa-nso2 ]
 completed [ lsa-nso3 ]
 is-atomic true

ncs# show devices commit-queue
% No entries found.

Configuring Commit Queue in a Cluster Environment

Example: Enabling the ncs-events Stream

<stream>
  <name>ncs-events</name>
  <description>NCS event according to tailf-ncs-devices.yang</description>
  <replay-support>true</replay-support>
  <builtin-replay-store>
    <enabled>true</enabled>
    <dir>./state</dir>
    <max-size>S10M</max-size>
    <max-files>50</max-files>
  </builtin-replay-store>
</stream>

In addition, the commit queue needs to be enabled in the cluster configuration.

ncs(config)# cluster commit-queue enabled
ncs(config)# commit

For more detailed information on how to set up clustering, see .

Error Recovery with Commit Queue

Example: Viewing Completed Queue items

ncs# show devices commit-queue completed | notab
devices commit-queue completed queue-item 9494446997
 when      2015-02-09T16:48:17.915+00:00
 succeeded false
 devices   [ ce0 ce1 ce2 ]
 failed ce0
  reason "Failed to connect to device ce0: closed"
devices commit-queue completed queue-item 9494530158
 when      2015-02-09T16:48:17.915+00:00
 succeeded false
 devices   [ ce0 ]
 failed ce0
  reason "Deleted by user"

Example: Execute Rollback Action

ncs(config)# devices commit-queue completed queue-item 9494446997 rollback

The error option can also be given as a commit parameter.

As the error option in a cluster environment will originate on the upper node, any configuration on the lower nodes will be meaningless.

TR1; service s1 creates ce0:a and ce1:b. The nodes a and b are created in CDB. In the changes of the queue item, CQ1, a and b are created.
TR2; service s2 creates ce1:c and ce2:d. The nodes c and d are created in CDB. In the changes of the queue item, CQ2, c, and d are created.
The queue item from TR1, CQ1, starts to execute. The node a cannot be created on the device. The node b was created on the device but that change is reverted as a failed to be created.

The reverse of TR1, the rollback of CQ1, TR3, is committed.
TR3; service s1 is applied with the old parameters. Thus the effect of TR1 is reverted. Nothing needs to be pushed towards the network, so no queue item is created.
TR2; as the queue item from TR2, CQ2, is not the same service instance and has no overlapping data on the ce1 device, this queue item executes as normal.

NSO1:TR1; service s1 dispatches the service to NSO2 and NSO3 through the queue item NSO1:CQ1. In the changes of NSO1:CQ1, NSO2:s1 and NSO3:s1 are created.
NSO1:TR2; service s2 dispatches the service to NSO2 through the queue item NSO1:CQ2. In the changes of NSO1:CQ2, NSO2:s2 is created.
The queue item from NSO2:TR1, NSO2:CQ1, starts to execute. The node a cannot be created on the device. The node b was created on the device, but that change is reverted as a failed to be created.
The queue item from NSO3:TR1, NSO3:CQ1, starts to execute. The changes in the queue item are committed successfully to the network.

The reverse of TR1, rollback of CQ1, TR3, is committed on all nodes part of TR1 that failed.
NSO2:TR3; service s1 is applied with the old parameters. Thus the effect of NSO2:TR1 is reverted. Nothing needs to be pushed towards the network, so no queue item is created.
NSO1:TR3; service s1 is applied with the old parameters. Thus the effect of NSO1:TR1 is reverted. A queue item is created to push the transaction changes to the lower nodes that didn't fail.
NSO3:TR3; service s1 is applied with the old parameters. Thus the effect of NSO3:TR1 is reverted. Since the changes in the queue item NSO3:CQ1 was successfully committed to the network a new queue item NSO3:CQ3 is created to revert those changes.

If for some reason the rollback transaction fails there are, depending on the failure, different techniques to reconcile the services involved:

Make sure that the commit queue is blocked to not interfere with the error recovery procedure. Do a sync-from on the non-completed device(s) and then re-deploy the failed service(s) with the reconcile option to reconcile original data, i.e., take control of that data. This option acknowledges other services controlling the same data. The reference count will indicate how many services control the data. Release any queue lock that was created.
Make sure that the commit queue is blocked to not interfere with the error recovery procedure. Use un-deploy with the no-networking option on the service and then do sync-from on the non-completed device(s). Make sure the error is fixed and then re-deploy the failed service(s) with the reconcile option. Release any queue lock that was created.

Commit Queue Tuning

NETCONF Call Home

The NSO device manager has built-in support for the NETCONF Call Home client protocol operations over SSH as defined in .

If an SSH connection is established, any outstanding configuration in the commit queue for the device will be pushed. Any notification stream for the device will also be reconnected.

NETCONF Call Home is enabled and configured under /ncs-config/netconf-call-home in the ncs.conf file. By default NETCONF Call Home is disabled.

Notifications

In the tailf-ncs.yang data model, you find a YANG data model that can be used to:

Setup subscriptions. A subscription is configuration data from the point of view of NSO, thus if NSO is restarted, all configured subscriptions are automatically resumed.
Inspect which named streams a managed device publishes.
View all received notifications.

Notifications must be defined at the top level of a YANG module. NSO does currently not support defining notifications inside lists or containers as specified in section 7.16 in .

An Example Session

In this section, we will use the examples.ncs/web-server-farm/basic example.

Example: notif.yang

module notif {
  namespace "http://router.com/notif";
  prefix notif;

  import ietf-inet-types {
    prefix inet;
  }


  notification startUp {
    leaf node-id {
      type string;
    }
  }

  notification linkUp {
    leaf ifName {
      type string;
      mandatory true;
    }
    leaf extraId {
      type string;
    }
    list linkProperty {
      max-elements 64;
      leaf newlyAdded {
        type empty;
      }
      leaf flags {
        type uint32;
        default 0;
      }
      list extensions {
        max-elements 64;
        leaf name {
          type uint32;
          mandatory true;
        }
        leaf value {
          type uint32;
          mandatory true;
        }
      }
    }

    list address {
      key ip;
      leaf ip {
        type inet:ipv4-address;
      }
      leaf mask {
        type inet:ipv4-address;
      }
    }

    leaf-list iface-flags {
      type enumeration {
        enum UP;
        enum DOWN;
        enum BROADCAST;
        enum RUNNING;
        enum MULTICAST;
        enum LOOPBACK;
      }
    }
  }


  notification linkDown {
    leaf ifName {
      type string;
      mandatory true;
    }
  }
}

Follow the instructions in the README file if you want to run the example: build the example, start netsim, and start NCS.

admin@ncs# show devices device pe2 notifications stream | notab
notifications stream NETCONF
 description    "default NETCONF event stream"
 replay-support false
notifications stream tailf-audit
 description    "Tailf Commit Audit events"
 replay-support true
notifications stream interface
 description              "Example notifications"
 replay-support           true
 replay-log-creation-time 2014-10-14T11:21:12+00:00
 replay-log-aged-time     2014-10-14T11:53:19.649207+00:00

The above shows how we can inspect - as status data - which named streams the managed device publishes. Each stream also has some associated data. The data model for that looks like this:

Example: tailf-ncs.yang Notification Streams

module tailf-ncs {
  namespace "http://tail-f.com/ns/ncs";
  ...
  container devices {
     list device {
       ....
       container notifications {
          ....

          list stream {
             description "A list of the notification streams
                          provided by the device. NCS reads this list in
                          real time";

             config false;
             key name;
             leaf name {
               description "The name of the the stream";
               type string;
             }
             leaf description {
               description "A textual description of the stream";
               type string;
             }
             leaf replay-support {
               description "An indication of whether or not event replay
                            is available on this stream.";
               type boolean;
             }
             leaf replay-log-creation-time {
               description "The timestamp of the creation of the log
                           used to support the replay function on
                           this stream.
                           Note that this might be earlier then
                           the earliest available
                           notification in the log.  This object
                           is updated if the log resets
                           for some reason.";

               type yang:date-and-time;
             }
             leaf replay-log-aged-time {
               description "The timestamp of the last notification
                            aged out of the log";
               type yang:date-and-time;
             }
           }

Let's set up a subscription for the stream called interface. The subscriptions are NSO configuration data, thus to create a subscription we need to enter configuration mode:

Example: Configuring a Subscription

admin@ncs(config)# devices device www0..2 notifications \
      subscription mysub stream interface
admin@ncs(config-subscription-mysub)# commit

Example: Viewing the Received Notifications

admin@ncs# show devices device notifications | notab
devices device www0
 notifications subscription mysub
  local-user admin
  status     running
 notifications stream NETCONF
  description    "default NETCONF event stream"
  replay-support false
 notifications stream tailf-audit
  description    "Tailf Commit Audit events"
  replay-support true
 notifications stream interface
  description              "Example notifications"
  replay-support           true
  replay-log-creation-time 2014-10-14T11:21:12+00:00
  replay-log-aged-time     2014-10-14T11:56:45.755964+00:00
 notifications notification-name startUp
  uri http://router.com/notif
 notifications notification-name linkUp
  uri http://router.com/notif
 notifications notification-name linkDown
  uri http://router.com/notif
 notifications received-notifications notification 2014-10-14T11:54:43.692371+00:00 0
  user          admin
  subscription  mysub
  stream        interface
  received-time 2014-10-14T11:54:43.695191+00:00
  data linkUp ifName eth2
  data linkUp linkProperty
   newlyAdded
   flags      42
   extensions
    name  1
    value 3
   extensions
    name  2
    value 4668
  data linkUp address 192.168.128.55
   mask 255.255.255.0

It is fairly instructive to inspect the XML that goes on the wire when we create a subscription and then also receive the first notification. We can do:

ncs(config)# devices global-settings trace pretty trace-dir ./logs
ncs(config)# commit

ncs(config)# devices disconnect

ncs(config)# devices device pe2 notifications \
     subscription foo stream interface
ncs(config-subscription-foo)# top
ncs(config)# exit

ncs# file show ./logs/netconf-pe2.trace
<<<<in 14-Oct-2014::13:59:52.295 device=pe2 session-id=14
<notification xmlns="urn:ietf:params:xml:ns:netconf:notification:1.0">
  <eventTime>2014-10-14T11:58:51.816077+00:00</eventTime>
  <linkUp xmlns="http://router.com/notif">
    <ifName>eth2</ifName>
    <linkProperty>
      <newlyAdded/>
      <flags>42</flags>
      <extensions>
        <name>1</name>
        <value>3</value>
      </extensions>
      <extensions>
        <name>2</name>
        <value>4668</value>
      </extensions>
    </linkProperty>
    <address>
      <ip>192.168.128.55</ip>
      <mask>255.255.255.0</mask>
    </address>
  </linkUp>
</notification>
 .........

ncs(config)# devices device www0 notifications \
   received-notifications max-size 100
admin@ncs(config-device-www0)# commit

The default value is 200. Once the size of the circular buffer is exceeded, the older notification is removed.

Subscription Status

A running subscription can be in either of three states. The YANG model has:

module tailf-ncs {
  namespace "http://tail-f.com/ns/ncs";
  ...
  container devices {
     list device {
       ....
       container notifications {
          ....
          list subscription {
             .....
            leaf status {
            description "Is this subscription currently running";
            config false;
            type enumeration {
              enum running {
                description "The subscription is established and we should
                             be receiving notifications";
              }
              enum connecting {
                description "Attempting to establish the subscription";
              }
              enum failed {
                description
                "The subscription has failed, unless the failure is
                 in the connection establishing, i.e connect() failed
                 there will be no automatic re-connect";
              }
            }
          }

ncs# show devices device notifications subscription
             LOCAL           FAILURE  ERROR
NAME  NAME   USER   STATUS   REASON   INFO
---------------------------------------------
www0  foo    admin  running  -        -
      mysub  admin  running  -        -
www1  mysub  admin  running  -        -
www2  mysub  admin  running  -        -

SNMP Notifications

These actions are programmed in Java, see the for how to do this.

Operations

NEDs and Adding Devices

Device Authentication

Connecting Devices for Different NED Types

CLI NEDs

NETCONF NEDs, JunOS

SNMP NEDs

Generic NEDs

Live Status Protocol

Multi-NEDs for Statistics

Administrative State for Devices

Troubleshooting NEDs

Device Communication Failure

NSO Device Manager

Managed Device Tree

The NED Packages

Starting the NSO Daemon

Synchronizing Devices

Partial sync-from

Configuring Devices

Connection Management

Authentication Groups

Connecting Using SSH Keyboard-Interactive (Multi-Factor) Authentication

Using a Callback to Provide Device Credentials

Caveats

Device Session Pooling

Device Session Limits

Tracing Device Communication

Checking Device Configuration

Comparing Device Configurations

Initialize Device

From Other

By Template

Device Templates

Tags

Debug

Renaming Devices in NSO

Examples

Limitations

Auto-configuring Devices in NSO

Examples

oper-state and admin-state

Configuration Source

Capabilities, Modules, and Revision Management

Discovery of a NETCONF Device

Configuration Datastore Support

Action Proxy

Device Groups

Policies

Commit Queue

Commit Queue Scheduling

Viewing and Manipulating the Commit Queue

Commit Queue in a Cluster Environment

Configuring Commit Queue in a Cluster Environment

Error Recovery with Commit Queue

Commit Queue Tuning

NETCONF Call Home

Notifications

An Example Session

Subscription Status

SNMP Notifications

Inactive Configuration

SSH Key Management

NSO as SSH Server

Host Keys

Public Key Authentication

NSO as SSH Client

Host Key Verification

Verification Level

Connection to a Managed Device

Connection to an NSO Cluster Node

Public Key Authentication

Private Key Selection

Connection to a Managed Device

Connection to an NSO Cluster Node

Alarm Manager

Alarm Concepts

Alarm List Administrative Actions

The Alarm Model

Alarm Handling

Partial `sync-from`

`oper-state` and `admin-state`

`command` Section

`param` Section

Full `command` Example

`policy` Section

Full `policy` Example

`post-commit` Section

Full `post-commit` Example

`command` Section