Problems with uk.pool.ntp.org stability

How Nutanix helped us uncover an unexpected issue with a public NTP source

For many years we have used uk.pool.ntp.org as the default NTP source to synchronise time on all our systems. It is becoming increasingly important to keep all systems tightly synchronised to a common time reference, especially now that distributed clusters are becoming more common with the advent of hyperconverged systems.

Recently we started to get alerts on our Nutanix clusters. It began with one cluster, so together with Nutanix we investigated any configuration or networking issues that could be causing it. When two other clusters on another site started displaying errors as well, we began to suspect the time source, unlikely as that seemed. The problem was intermittent, and was detected by a built-in NTP configuration check that the Nutanix system performs hourly. Errors were reported between 5% and 15% of the time, varying by node. Every detected error raised an alert, which was getting wearing, so we had to make some sort of change.

All change!

We therefore changed all systems over to time.google.com – Google’s new (as of last December) free NTP source. We use their Global DNS Cloud service, so, assuming that the level of reliability would be the same, we decided to at least try this alternate time source. Immediately the errors stopped. This indicates that there is a problem server in the UK NTP Org pool. As we’d been having trouble for weeks on and off it doesn’t look like it’s going to get...
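For anyone wanting to compare time sources themselves, the following is a minimal sketch of a single SNTP query in Python. It is not the check Nutanix runs; the function names are our own, and it assumes the standard 48-byte NTP packet layout and the usual four-timestamp offset calculation.

```python
import socket
import struct
import time

# Seconds between the NTP epoch (1900-01-01) and the Unix epoch (1970-01-01).
NTP_EPOCH_DELTA = 2208988800


def build_sntp_request():
    """Return a 48-byte client request: LI=0, version 3, mode 3 (client)."""
    return b"\x1b" + 47 * b"\0"


def ntp_to_unix(seconds, fraction):
    """Convert a 64-bit NTP timestamp (two 32-bit words) to Unix time."""
    return seconds - NTP_EPOCH_DELTA + fraction / 2**32


def clock_offset(response, t1, t4):
    """Estimate the local clock offset in seconds from one server reply.

    t1 is the local send time and t4 the local receive time (Unix seconds).
    """
    fields = struct.unpack("!BBbb11I", response[:48])
    t2 = ntp_to_unix(fields[11], fields[12])  # server receive timestamp
    t3 = ntp_to_unix(fields[13], fields[14])  # server transmit timestamp
    return ((t2 - t1) + (t3 - t4)) / 2


def query_offset(server, timeout=2.0):
    """Send one request to `server` on UDP port 123 and return the offset."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        t1 = time.time()
        sock.sendto(build_sntp_request(), (server, 123))
        response, _ = sock.recvfrom(512)
        t4 = time.time()
    return clock_offset(response, t1, t4)
```

Calling query_offset("uk.pool.ntp.org") and query_offset("time.google.com") repeatedly, and logging timeouts as well as offsets, gives a rough picture of how each source behaves over time. A single query proves little, since the pool name resolves to different servers on each lookup.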

Ethernet Fabric – Towards Unified Networking

As Ethernet networks grow they invariably become more complex and difficult to manage, and because of the way they have to be configured they often suffer performance penalties as they grow. Ethernet networks also have to contend with an increasingly diverse set of applications, and now with iSCSI and AoE (as well as FCoE) we are seeing block storage being added to this mix. Different use cases have different requirements: while web traffic is fairly latency tolerant, the latest virtualised block storage requires very low latency and high bandwidth to maintain system performance. To combine and scale the requirements of the Enterprise network, Ethernet needs to evolve into a new form that is capable of meeting these demands – Ethernet Fabric.

Ethernet Fabric is born out of the need to produce flatter, more intelligent, simpler, more scalable and more efficient networks as demands on those networks increase. Traditional Ethernet networks were fine in the days of client/server, but as virtualisation develops into true heterogeneous cloud resource pools, the way the network ties these layers together needs to adapt if the network is not to become the bottleneck. Ethernet Fabric was developed to address these requirements.

Flatter

Classic Ethernet networks are hierarchical with three or more tiers. Traffic has to move up and down this logical tree to flow between server racks (left hand image above), which adds latency and creates congestion on inter-switch links (ISLs). This is because in order to prevent traffic loops only 1 path in a redundant connection architecture can be allowed to be active...
Using MPLS to add routability to Coraid’s AoE

A common point raised when comparing the iSCSI and AoE protocols is that AoE is not routable, and therefore not satisfactory for Enterprise use where there may be many sites between which you wish to share data. In fact this feature is what led to the development of iSCSI in the first place. It is true that AoE is a light Layer 2 protocol carried directly in Ethernet frames, and therefore by definition it is stopped when it meets a router. This provides security in that data cannot inadvertently be routed out of the network, but it also causes a headache when data needs to be routed away from a common LAN segment, with DR being a common requirement needing this feature. Because AoE does not have a built-in authentication method like iSCSI, and can only secure data with LUN to MAC address masking, it would also be a risk to expose the data to any external network directly.

Coraid have their own solution for this, which requires placement of an AoE gateway at the LAN segment edge before the router, which can then route encrypted AoE packets over IP to another gateway on another network. This works well over IP but is vendor specific, so what if you like the idea of AoE but want a more generic solution – AoE is Open Source after all? In research at University College Dublin, they found that AoE over MPLS provides a routable protocol which can be implemented without a need for tunnels, and with a very modest increase in the header size in comparison with raw AoE. As a side...

A comparison of AoE to FC and iSCSI protocols

One of the first issues I have to contend with when talking about Coraid storage and its use of the ATA-over-Ethernet (AoE) protocol to transfer data is the response “Ethernet? Oh, so it’s iSCSI then?”. No, it isn’t. AoE was built from the ground up as an open source data transfer protocol, specifically concerned with finding the most efficient way to transmit raw disk I/O commands over raw Ethernet while keeping the overhead as low as possible to maximize throughput. In many ways AoE is more akin to Fibre Channel (FC) than it is to iSCSI, in that it is a non-routable protocol designed for locally based storage rather than for sending data over the Internet. Like FC, AoE can be made to route over the Internet when it needs to, such as in site-to-site DR applications, but the non-routable nature of the protocol makes accidental exposure of data to unauthorized networks that much harder.

So, to help differentiate the data transfer protocols upon which all your networked storage systems are based, this blog entry is here to dispel some of the myths about AoE. The only real similarity between AoE and iSCSI is that they both use Ethernet as the transport medium. iSCSI runs over TCP/IP, a Layer 4 transport, whereas AoE runs directly at Layer 2, and after that things get very different.

Data delivery the iSCSI way

The diagram below shows how data is sent from a client to a disk device using the iSCSI protocol.

iSCSI is a connection-based protocol, as is FC, and therefore requires sequenced serial delivery of the...

Clearing up some misconceptions about the AoE protocol

I have stumbled upon an interesting blog conversation regarding the AoE protocol, a cornerstone of the Coraid Etherdrive storage systems we recommend. I have come across many storage professionals who just don’t seem to like simplicity in anything, and AoE is so simple that it takes a lot of convincing for some to take it seriously – this is no exception! The blog to which I am referring was entitled “ATA over Ethernet for converged data center networks? No way” and originally published here. Many points were raised, and these were picked up by Coraid. The response makes interesting reading, not only for the way in which the points raised were dealt with, but also for the way it shows just how far we still need to go to convince some storage professionals that you can have simple in an Enterprise! The response in its entirety appears below:

No sequencing

The protocol does not contain a single sequence number that would allow servers and storage arrays to differentiate between requests or split a single request into multiple Ethernet frames. A server can thus have only a single outstanding request with any particular storage array. (Or maybe LUN – who knows? The protocol specifications are silent.)

Answer: As packets between initiator and target are not connection based, sequence numbers are irrelevant. A client can, however, have multiple requests outstanding with different tag values, which is how a target differentiates between requests. Spreading between Ethernet frames is performed by the client, which is responsible for turning a large request into a series of MTU sized requests (a 64KB...
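The tagging and splitting described in that answer can be sketched in a few lines of Python. This is an illustration under stated assumptions, not Coraid's initiator code: the function names are our own, the header sizes (10 bytes of AoE header and 12 bytes of ATA header after the Ethernet header) and the 0x88A2 EtherType follow the published AoE specification as we understand it, and the sequential tag numbering is an arbitrary choice, since a tag only needs to be unique among outstanding requests.

```python
import struct

AOE_ETHERTYPE = 0x88A2  # registered EtherType for ATA over Ethernet
SECTOR_SIZE = 512       # bytes per ATA sector
AOE_HEADER_LEN = 10     # ver/flags, error, major, minor, command, tag
ATA_HEADER_LEN = 12     # ATA command header carried inside the AoE frame


def sectors_per_frame(mtu=1500):
    """Whole sectors that fit in one frame payload of `mtu` bytes."""
    return (mtu - AOE_HEADER_LEN - ATA_HEADER_LEN) // SECTOR_SIZE


def split_request(lba, sector_count, mtu=1500, first_tag=1):
    """Split one large request into (tag, lba, sectors) frame-sized requests.

    Each piece gets its own tag so that replies can be matched to requests
    in any order, without sequence numbers.
    """
    per_frame = sectors_per_frame(mtu)
    requests = []
    tag = first_tag
    while sector_count > 0:
        n = min(per_frame, sector_count)
        requests.append((tag, lba, n))
        lba += n
        sector_count -= n
        tag += 1
    return requests


def pack_aoe_header(dst_mac, src_mac, major, minor, command, tag):
    """Pack the 14-byte Ethernet header plus the 10-byte AoE header."""
    ver_flags = 0x10  # protocol version 1, no flags set
    error = 0
    return struct.pack(
        "!6s6sHBBHBBI",
        dst_mac, src_mac, AOE_ETHERTYPE,
        ver_flags, error, major, minor, command, tag,
    )
```

For example, split_request(0, 128) breaks a 64KB read starting at sector 0 into two-sector requests for a standard 1500-byte MTU, while passing mtu=9000 shows how jumbo frames cut the number of requests.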