Saturday, June 8, 2013

Troubleshooting Cisco Switches

When an outage or network incident takes place, it can often create an intense burden for the engineers called upon to assist. As is the case with many emergencies, the demand for action and results may supersede more effective ways of resolving the issue(s) at hand. Having a solid strategy for how to approach incidents can give you the calm confidence to effectively resolve the problem even in the midst of the proverbial storm.

In ancient societies, many rituals marked the passage of an individual from youth to recognized membership in the adult community. These ceremonies are often referred to as a "rite of passage" and mark an important milestone in the life of that person as well as the group of which they are a part. In most western cultures such rituals are no longer directly relevant, but certain experiences play a similar role. For almost every network engineer, the "rite of passage" is a network outage or problem that was unpleasantly memorable and particularly difficult, and that is retold in stories for years to come. The important point is that while these things can and do happen, they should remain rare. Keeping them rare is essentially what network troubleshooting and problem resolution are all about. In this white paper we will examine five steps for addressing issues and provide some tools for dealing with them when they arise.

Issues and problems with networks of any shape and size are inevitable, almost entirely because of human imperfection. On the one hand, imperfect people created the networking technology in use today, which guarantees that flaws will exist in that technology. Algorithms will malfunction, hardware will fail, and software will have bugs that create issues of various kinds. On the other hand, network engineers troubleshooting issues almost always have to deal with a crowd of end-users, managers, and company leadership, both when the network is in a steady state and when it is having problems.

This underscores a point made in other white papers and sources of information, namely that problems can and will arise. The key, then, is to prevent as many issues as possible through activities such as proactive maintenance and device monitoring. The best solution to a problem is to keep it from occurring in the first place. Doing so helps limit real problems to those that could not have been foreseen and builds confidence among end-users that matters are well in hand.

One reason for pointing out the nature of the human condition at the outset is strategic: some reported issues may not be network issues at all. An actual case involved Internet access at a large port authority on the west coast, where the customer asked its Internet Service Provider to investigate a problem. Upon examining the usage reports generated internally by the provider, the customer's support engineer noted that the spikes consuming all of the bandwidth were taking place at approximately 2:00 AM PST, when the offices were closed. When the engineer arrived at the customer site to report the findings, the customer reluctantly admitted that a janitor (who was dismissed shortly after) had been illegally downloading movies of questionable content. The port authority began with the assumption that the service provider was having a service issue, but investigation revealed the real source of the problem.
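The kind of check the support engineer made against the usage reports can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the sample data, the business hours (08:00 to 18:00), the 90% threshold, and the function name off_hours_spikes are all hypothetical and do not come from the provider's actual reports or tooling.

```python
from datetime import datetime

# Hypothetical samples: (timestamp, link utilization in percent).
# Real data would come from whatever usage reports the provider generates.
SAMPLES = [
    (datetime(2013, 6, 3, 10, 0), 42.0),
    (datetime(2013, 6, 3, 14, 0), 55.0),
    (datetime(2013, 6, 4, 2, 0), 98.0),
    (datetime(2013, 6, 4, 2, 5), 99.0),
    (datetime(2013, 6, 4, 11, 0), 47.0),
]


def off_hours_spikes(samples, open_hour=8, close_hour=18, threshold=90.0):
    """Return samples at or above the threshold taken outside business hours.

    The business hours and threshold are assumed values for illustration.
    """
    return [
        (ts, util)
        for ts, util in samples
        if util >= threshold and not (open_hour <= ts.hour < close_hour)
    ]


for ts, util in off_hours_spikes(SAMPLES):
    print(f"{ts:%Y-%m-%d %H:%M} {util:.0f}% utilization outside business hours")
```

The sketch ignores weekends and time zones. It is meant to show the question being asked of the data (when does the link saturate, and is anyone supposed to be in the office then?), not to serve as a monitoring tool.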

The investigation and diagnosis phase is the most critical part of the process, as it sets the stage for rallying resources and helps to narrow the scope of the actual issue. At the risk of sounding condescending, understand that end-users will probably not understand networking technology at even a fundamental level, and that the complaint may not have a technical foundation at all. For example, a user may report network slowness and may even be impatient, but when you question further, you may discover that they are downloading large files or streaming videos from the Internet. That individual may have felt that the network was at fault when the problem was of their own making. The skill required when interacting with end-users is to ask the right questions and get the information you need without giving offense.

In the healthcare field, patients visit a doctor with a set of symptoms that they need addressed and treated. In some cases, such as the common cold, treating the symptoms themselves is advised, mostly because nothing else can be done. In other situations, the physician may order a variety of tests to find the true root of the problem. Once the actual root cause is discovered, the healthcare practitioner can set about a treatment plan to address and resolve the problem.

