5 Reasons Why You Should Include LAN Switches in Your NCCM Scope

We’ve been doing a lot of blogging around here lately about NCCM and the importance of having an automated configuration and change management system. We’ve even published a Best practices guide for NCCM. One of the main points in any NCCM system is having consistent and accurate configuration backups of all of your “key” devices.

When I ask Network Managers to name their key devices, they generally start with WAN / Internet routers and Firewalls. This makes sense of course because, in a modern large-scale network, connectivity (WAN / Internet routers) & security (Firewalls) tend to get most of the attention. However, we think that it’s important not to overlook core and access switching layers. After all, without that “front line” connectivity – the internal user cannot get out to the WAN/Internet in the first place.

With that in mind, today’s blog offers up 5 Reasons Why You Should Include LAN Switches in Your NCCM Scope


1. Switch Failure

LAN switches tend to be some of the most heavily utilized devices in a network. They also don’t generally come with the top-quality hardware and redundant power supplies that core devices have. In many cases, they may also be located in less than pristine locations. Dirty manufacturing floors, dormitory closets, remote office kitchens – I have seen access switches in all of these places. When you combine a heavy workload with tough conditions and less expensive parts, you have a recipe for devices that will fail at a higher rate.

So, when the time comes to replace or upgrade a switch, having its configuration backed up, along with a system that can automate the provisioning of the new device, can be a real time and workload saver. Just put the IP address and some basic management information on the new device, and the NCCM tool should be able to take care of the rest in mere minutes.
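As a concrete illustration, once a replacement switch has an IP address and SSH access, its last known-good configuration can be pushed back automatically. The minimal sketch below uses the Python netmiko library; the device type, credentials, and backup file path are assumptions for illustration, and a commercial NCCM platform would handle this through its own provisioning workflow.

```python
# Minimal sketch: push a backed-up configuration to a replacement switch.
# Assumes netmiko is installed and the new switch is reachable over SSH
# with basic management settings already applied. All values are placeholders.
from netmiko import ConnectHandler

replacement_switch = {
    "device_type": "cisco_ios",     # hypothetical platform
    "host": "10.10.20.5",           # IP assigned during basic setup
    "username": "netops",
    "password": "********",
}

BACKUP_FILE = "backups/access-sw-14.last-good.cfg"  # from the NCCM repository

conn = ConnectHandler(**replacement_switch)
output = conn.send_config_from_file(config_file=BACKUP_FILE)  # replay the backup
conn.save_config()                                            # write to startup-config
conn.disconnect()
print(output)
```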

2. User Tracking

As the front line connectivity device for the majority of LAN users, the switch is the best place to track down user connections. You may want to know where a particular user is located, or maybe you are trying to troubleshoot an application performance issue; no matter what the cause, it’s important to have that connectivity data available to the IT department. NCCM systems may use layer 2 management data from CDP/LLDP as well as other techniques to gather this information. A good system will allow you to search for a particular IP/MAC/DNS and return connectivity information like which device/port it is connected to as well as when it was first and last seen on that port. This data can also be used to draw live topology maps which offer a great visualization of the network.
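To make the idea concrete, here is a minimal sketch of the kind of lookup such a system performs against the layer-2 data it has already collected. The record structure, field names, and sample values are invented for illustration; a real NCCM product keeps this information in its own inventory database.

```python
# Minimal sketch: search collected switch forwarding-table / CDP-LLDP data
# for a MAC or IP address. The data model here is purely illustrative.
from dataclasses import dataclass
from datetime import datetime

@dataclass
class PortRecord:
    switch: str
    port: str
    mac: str
    ip: str
    first_seen: datetime
    last_seen: datetime

def find_endpoint(records, query):
    """Return every record whose MAC or IP matches the query string."""
    q = query.lower()
    return [r for r in records if q in (r.mac.lower(), r.ip)]

# Example usage with one fabricated record:
records = [
    PortRecord("access-sw-03", "Gi1/0/17", "00:1A:2B:3C:4D:5E", "10.1.40.22",
               datetime(2015, 6, 1, 8, 3), datetime(2015, 6, 9, 17, 45)),
]
for hit in find_endpoint(records, "10.1.40.22"):
    print(f"{hit.ip} ({hit.mac}) is on {hit.switch} {hit.port}, "
          f"first seen {hit.first_seen}, last seen {hit.last_seen}")
```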

3. Policy Checking

Another area where the focus tends to be on “gateway” devices such as WAN routers and firewalls is policy checking. While those devices certainly should have lots of attention paid to them, especially in the area of security policies, we believe that it’s equally important not to neglect the access layer when it comes to compliance. In general terms, there are two aspects of policy checking that need to be addressed on these devices: QoS policies and regulatory compliance policies.

The vast majority of VoIP and Video systems will connect to the network via a traditional LAN switch. These switches, therefore, must have the correct VLAN and QoS configurations in order to accurately forward the traffic in the appropriate manner so that Quality of Service is maintained.

If your organization is subject to regulatory compliance standards such as PCI, HIPAA, etc., then those regulations apply to all devices and systems that connect to, or pass, sensitive data.

In both of these cases, it is incredibly important to ensure policy compliance on all of your devices, even the ones on the “edge” of your network.
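As a simple illustration of what an automated policy check can look like, the sketch below scans backed-up access-switch configurations for required lines. The specific commands (voice VLAN, QoS trust, logging host) are hypothetical placeholders; a real policy set would come from your own QoS design and compliance standards, and a commercial NCCM tool would use its own rules engine rather than a script like this.

```python
# Minimal sketch: flag access-switch config backups that are missing
# required policy lines. The "required" commands are illustrative only.
import glob

REQUIRED_LINES = [
    "switchport voice vlan 150",   # example voice VLAN
    "mls qos trust dscp",          # example QoS trust setting
    "logging host 10.0.0.50",      # example audit-logging requirement
]

def audit_config(path):
    with open(path) as f:
        text = f.read()
    return [line for line in REQUIRED_LINES if line not in text]

for cfg in glob.glob("backups/access-*.cfg"):   # hypothetical backup naming
    missing = audit_config(cfg)
    if missing:
        print(f"{cfg}: NON-COMPLIANT, missing {missing}")
    else:
        print(f"{cfg}: compliant")
```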

4. Asset Lifecycle Management

Especially in larger and more spread out organizations, just understanding what you have can be a challenge. At some point (and always when you are least prepared for it) you will get the “What do we have?” question from a manager. An NCCM system is exactly the right tool to use to answer this question. Even though NCCM is generally considered to be the tool for change – it is equally the tool for information. Only devices that are well documented can be managed and that documentation is best supplied through the use of an automated inventory discovery system. Likewise, when it is time for a technology refresh, or even the build out of a new location or network, understanding the current state of the existing network is the first step towards building an effective plan for the future.

5. New Service Initiatives

Whether you are a large IT shop or a service provider, new applications and services are always coming. In many cases, they will require widespread changes to the infrastructure. The change may be small or large, but if it needs to be implemented on a number of systems at the same time, it will require coordination and automation to get it done efficiently and successfully. In some instances, this will only require changes to the core, but in many cases it will require changes to the switch infrastructure as well. This is what NCCM tools were designed to do, and there is no reason you should be handcuffed in your efforts to implement change just because you haven’t added all of your devices into the NCCM platform.

Networks are complicated systems of many individual components spread throughout various locations, with interdependencies that can be hard to comprehend without the help of network management tools. While the temptation may be to focus on the core systems, we think it’s critical to view all parts – even the underappreciated LAN switch – as equal pieces of the puzzle that should not be overlooked when implementing an NCCM system.

Top 20 Best Practices for NCCM

Thanks to NMSaaS for the article.


Virtualization Visibility

See Virtual with the Clarity of Physical

The cost-saving shift to virtualization has challenged network teams to maintain accurate views. While application performance is often the first casualty when visibility is reduced, the right solution can match and in some cases even exceed the capabilities of traditional monitoring strategies.

Virtual Eyes

Network teams are the de facto “first responders” when application performance degrades. For this reason, it’s critical to maintain visibility into and around all virtual constructs for effective troubleshooting and optimal service delivery. Otherwise, much of the value of server virtualization and consolidation efforts may be offset by sub-par application performance.

Fundamentally, achieving comprehensive visibility of a virtualized server environment requires an understanding of the health of the underlying resources – host, hypervisor, and virtual switch (vSwitch) – along with the client and application traffic at the perimeter.

In addition, unique communication technologies like VXLAN and Cisco FabricPath must be supported for full visibility into the traffic in these environments. Without this support, network analyzers cannot gain comprehensive views into virtual data center (VDC) traffic.
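To see why analyzer support matters, consider VXLAN: what you capture on the physical network is a UDP-encapsulated outer packet, and the conversation you actually care about is the inner frame. The sketch below, assuming Scapy (2.4 or later, which ships a VXLAN layer) and a hypothetical capture file, shows the decapsulation step.

```python
# Minimal sketch: report the inner (tenant) conversations hidden inside
# VXLAN encapsulation in a capture file. File name is a placeholder.
from scapy.all import rdpcap
from scapy.layers.inet import IP
from scapy.layers.vxlan import VXLAN

for pkt in rdpcap("vdc_uplink.pcap"):
    if VXLAN in pkt:
        inner = pkt[VXLAN].payload          # inner Ethernet frame
        if IP in inner:
            print(f"VNI {pkt[VXLAN].vni}: "
                  f"{inner[IP].src} -> {inner[IP].dst} (inner traffic)")
```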

Step One: Get Status of Host and Virtualization Components

The host, hypervisor, and vSwitch are the foundation of the entire virtualization effort so their health is crucial. Polling technologies such as SNMP, WSD, and WMI can provide performance insight by interrogating the host and various virtualized elements. A fully-integrated performance management platform can not only provide these views, but also display relevant operating metrics in a single, user-friendly dashboard.

Metrics like CPU utilization, memory usage, and virtualized variables like individual VM instance status are examples of accessible data. Often, these parameters can point to the root cause of service issues that may otherwise manifest themselves indirectly.
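As a rough illustration of that polling step, the fragment below issues a single SNMP GET for per-processor load from the HOST-RESOURCES-MIB using the pysnmp library. The hostname, community string, and processor index are placeholders, and a real deployment would typically use SNMPv3 credentials rather than a community string.

```python
# Minimal sketch: poll per-processor load from a virtualization host via SNMP.
# Hostname, community string, and index are assumptions for illustration.
from pysnmp.hlapi import (getCmd, SnmpEngine, CommunityData,
                          UdpTransportTarget, ContextData,
                          ObjectType, ObjectIdentity)

error_indication, error_status, _, var_binds = next(getCmd(
    SnmpEngine(),
    CommunityData("public", mpModel=1),                   # SNMP v2c
    UdpTransportTarget(("esxi-host.example.com", 161)),
    ContextData(),
    # Processor index varies by platform; "1" is a placeholder.
    ObjectType(ObjectIdentity("HOST-RESOURCES-MIB", "hrProcessorLoad", 1)),
))

if error_indication or error_status:
    print("SNMP poll failed:", error_indication or error_status.prettyPrint())
else:
    for oid, value in var_binds:
        print(f"{oid.prettyPrint()} = {value.prettyPrint()}%")
```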


For example, poor response time of an application hosted on a virtualized server may have nothing to do with the service or the network, but may instead be tied to excessively high CPU utilization. Without this monitoring perspective, troubleshooting will be more difficult and time consuming.

Next Steps

Virtualization and consolidation offer significant upside for today’s dynamic data center model and for achieving optimal IT business service delivery. However, monitoring visibility must be maintained so potential application degradation issues can be detected and resolved before they impact the end user.

To learn more about how your team can achieve the same visibility in virtualized environments as you do in physical environments, download the complete 3 Steps to Server Virtualization Visibility White Paper now.

Thanks to Viavi Solutions for the article.

External Availability Monitoring – Why it Matters

Remember the “good old days” when everyone who worked got in their car and drove to a big office building every day? And any application a user needed was housed completely within the walls of the corporate datacenter? And partners and customers had to dial a phone to get a price or place an order? Well, if you are as old as I am, you may remember those days – but for the vast majority of you reading this, what I just described probably sounds about as common as a black and white TV.

The simple fact is that as the availability and ubiquity of the Internet has transformed the lives of people, it has equally (if not more dramatically) transformed IT departments. In some ways this has been an incredible boon; for example, I can now download and install new software in a fraction of the time it used to take to purchase and receive that same software on CDs (look it up, kids).

Users can now log in to almost any critical business application from anywhere there is a Wi-Fi connection. They can probably perform nearly 100% of their job function from their phone… in a Starbucks… or on an airplane. But of course, with all of the good comes (some of) the bad – or at least some difficult challenges for the IT staff whose job it is to keep all of those applications available to everyone, everywhere, all of the time. The (relatively) simple “rules” for IT monitoring need to be re-thought and extended for the modern workplace. This is where External Availability Monitoring comes in.

We define External Availability Monitoring (EAM) as the process through which your critical network services and the applications that run over them are continuously tested from multiple test points which simulate real world geo-diversity and connectivity options. Simply put, you need to constantly monitor the availability and performance of any public facing services. This could be your corporate website, VPN termination servers, public cloud based applications and more.

This type of testing matters because the most likely source of a service issue today is not a call from Bob on the 3rd floor, but rather Jane, who is currently in a hotel in South America and is having trouble downloading the latest presentation from the corporate intranet, which she needs to deliver tomorrow morning.

Without a proactive approach to continuous service monitoring, you are flying blind as to issues that impact the global availability – and therefore the operations – of your business.

So, how is this type of monitoring delivered? We think the best approach is to set up multiple types of tests, such as the following (a minimal scripted sketch of a few of them appears after the list):

  • ICMP Availability
  • TCP Connects
  • DNS Tests
  • URL Downloads
  • Multimedia (VoIP and Video) tests (from external agent to internal agent)
  • Customized application tests
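Here is a minimal sketch of three of these probes (TCP connect, DNS resolution, and a timed URL download) using only the Python standard library; ICMP is omitted because raw sockets usually require elevated privileges. The hostnames and URLs are placeholders for your own public-facing services.

```python
# Minimal sketch of three external availability probes using only the
# Python standard library. Hostnames and URLs are placeholders.
import socket, time, urllib.request

def tcp_connect(host, port, timeout=5):
    """Return the TCP connect time in milliseconds."""
    start = time.time()
    with socket.create_connection((host, port), timeout=timeout):
        return (time.time() - start) * 1000

def dns_lookup(name):
    """Return the resolved address and lookup time in milliseconds."""
    start = time.time()
    addr = socket.gethostbyname(name)
    return addr, (time.time() - start) * 1000

def url_download(url, timeout=10):
    """Return HTTP status, body size, and download time in milliseconds."""
    start = time.time()
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        body = resp.read()
        status = resp.status
    return status, len(body), (time.time() - start) * 1000

if __name__ == "__main__":
    print("VPN gateway TCP/443:", tcp_connect("vpn.example.com", 443), "ms")
    print("DNS:", dns_lookup("www.example.com"))
    print("Portal download:", url_download("https://www.example.com/"))
```

Run on a schedule from several geographically separate agents, even a script this simple produces the availability and response-time data points described above.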

These tests should be performed from multiple global locations (especially from anywhere your users commonly travel). This could even include work from home locations. At a base level, even a few test points can alert you quickly to availability issues.

More test points can increase the accuracy with which you can pinpoint some problems. It may be that the incident seems to be isolated to users in the Midwest or is only being seen on apps that reside on a particular cloud provider. The more diverse data you collect, the swifter and more intelligent your response can be.

Alerts should be enabled so that you can be notified immediately if there is an application degradation or “service down” situation. The last piece of the puzzle is being able to quickly correlate these issues with underlying internal network or external service provider problems.

We see this trend of an “any application, anywhere, anytime” service model becoming the standard for IT departments large and small. With this shift comes an even greater need for continuous & proactive External Availability Monitoring.


Thanks to NMSaaS for the article.

3 Key Cloud Monitoring Metrics

Reliably Measure Performance, and Ensure Effective Monitoring in the Cloud

When adopting monitoring strategies for cloud services, it’s important to establish performance metrics and baselines. This involves going beyond vendor-provided data to ensure your monitoring tools provide the metrics and baselines you actually need.

Taking a holistic approach ensures your staff can manage any performance issue and provide proof to cloud providers if their systems are the cause of the issue.

CLOUD MONITORING CHALLENGES

The primary issue with ensuring reliable cloud performance revolves around the lack of metrics for monitoring the Service Level Agreement (SLA) and performance. Understanding these issues is critical for developing successful monitoring strategies.

HERE ARE A FEW COMMON CHALLENGES

Obtaining Application Performance Agreements

While vendors will highlight service or resource availability in their SLAs, application performance metrics or expectations are typically absent. In the case of Salesforce.com, the SLA (if one is provided) discusses downtime, but there are no performance guarantees.

Lack of Performance Metrics

Similar to SLAs, organizations should not expect vendors to provide any meaningful performance metrics beyond service and resource availability. If managers rely on trust.salesforce.com to track CRM service performance and availability, they are limited to:

  • Monitoring server status
  • Transactions handled
  • Server processing speed

These reports fail to provide meaningful performance metrics to evaluate service degradation issues or to isolate problems to the cloud vendor.

No Meaningful Performance Benchmarks

Most Software as a Service (SaaS) vendors don’t offer any benchmarks or averages that allow you to forecast potential performance or service demand. While Infrastructure as a Service (IaaS) and Platform as a Service (PaaS) vendors will provide cloud performance benchmarks, these numbers don’t take into account:

  • Internet latency
  • The proximity of your users to the services
  • Your network environment

The challenge is to properly benchmark and create meaningful metrics for your organization.

These challenges require organizations to take a holistic approach to monitoring by implementing solutions that allow them to seamlessly view external components and performance as if they were a part of their internal network.

EFFECTIVE CLOUD MONITORING STRATEGIES

Given the lack of metrics to assess performance for a specific organization, how do engineers successfully manage user interaction with cloud services?

ENSURE SERVICE PERFORMANCE

While SLAs may not guarantee performance, cloud vendors should take action when clear proof shows their systems are the source of the problem.

How can you ensure reliable performance?

Set up synthetic transactions to execute a specific process on the cloud provider’s site. By regularly conducting these transactions, monitoring routes and response times, you can pinpoint the potential source of delay between the internal network, ISP, and cloud provider. This data along with web error codes can be provided to the cloud vendor to help them resolve issues.
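As a rough sketch of such a synthetic transaction, the fragment below times a two-step flow (load a login page, then run a search) against a hypothetical SaaS URL using the third-party requests library, recording the status codes and response times you could hand to the vendor alongside a traceroute from the same agent.

```python
# Minimal sketch of a synthetic transaction against a cloud service.
# URLs, query parameters, and step names are hypothetical.
import time
import requests

STEPS = [
    ("login page", "https://crm.example.com/login", {}),
    ("account search", "https://crm.example.com/search", {"q": "ACME"}),
]

results = []
for name, url, params in STEPS:
    start = time.time()
    resp = requests.get(url, params=params, timeout=15)
    elapsed_ms = (time.time() - start) * 1000
    results.append((name, resp.status_code, round(elapsed_ms, 1)))

for name, status, ms in results:
    print(f"{name}: HTTP {status} in {ms} ms")
```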

Network teams should have a true view of performance that tracks packets from the user over the ISP to the cloud provider, including any cloud-hosted components.

OVERCOME LACK OF PERFORMANCE METRICS

Depending upon the service, vendors will provide varying levels of detail. The type of service also impacts what you can monitor.

What metrics can you use?

In the case of SaaS, you can monitor user interactions and synthetic transactions, and rely on vendor-provided reports.

For PaaS, monitoring solutions such as Observer Infrastructure provide significant performance metrics. These metrics can be viewed alongside response time metrics for a more complete picture of service health.

In the case of IaaS, you have access to the server’s operating system and applications. In addition to polling performance metrics of cloud components, analysis devices can be placed on the cloud server for an end-to-end view of performance.

ADDRESS PERFORMANCE BENCHMARKS

With monitoring systems in place, it’s important to baseline response times and cloud component performance. These baselines can help you become proactive in your management strategy. How can you use them?

Using these baselines, you can set meaningful benchmarks for your specific organization. From this point, alarms can be set to proactively alert teams of degrading performance. Utilizing long-term packet capture, you can investigate performance from the user to the cloud and isolate potential problems.
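One simple way to turn a baseline into an alarm is to compare each new measurement against the recent mean plus a few standard deviations. The sketch below applies that rule to an in-memory list of response times; the three-sigma threshold and the sample numbers are purely illustrative.

```python
# Minimal sketch: alert when a new response-time sample exceeds the
# baseline mean by more than three standard deviations (illustrative rule).
import statistics

def is_degraded(baseline_ms, new_sample_ms, n_sigma=3):
    mean = statistics.mean(baseline_ms)
    stdev = statistics.pstdev(baseline_ms)
    return new_sample_ms > mean + n_sigma * stdev

baseline = [210, 225, 198, 240, 215, 230, 205, 220]   # ms, from normal periods
sample = 612                                          # ms, latest measurement

if is_degraded(baseline, sample):
    print(f"ALERT: response time {sample} ms exceeds baseline "
          f"({statistics.mean(baseline):.0f} ms mean)")
```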

Thanks to Network Instruments for the article.

Tracking Web Activity by MAC Address

Tracking web activity is nothing new. For many years IT managers have tried to get some sort of visibility at the network edge so that they can see what is happening. One of the main drivers for this is the need to keep the network secure. As Internet usage constantly grows, malicious, phishing, scamming and fraudulent sites continue to evolve.

While some firewalls and proxy servers include reporting capabilities, most are not up to the job. These systems were designed to block or control access and reporting was just added on at a later date. Server log files do not always have the answer either. They are meant to provide server administrators with data about the behaviour of the server, not what users are doing on the Internet.

Some vendors are pitching flow-type (NetFlow, IPFIX, etc.) tools to address the problem. The idea is that you get flow records from the edge of your network so you can see what IP address is connecting to what. However, as with server logs, NetFlow isn’t a web usage tracker. The main reason for this is that it does not look at HTTP headers, where a lot of the important information is stored.

One of the best data sources for web tracking is packet capture. You can enable packet capturing with SPAN/mirror ports, packet brokers, TAPs, or by using promiscuous mode on virtual platforms. The trick is to pull out the relevant information and discard the rest so you don’t end up storing massive packet captures.

Relevant information includes things like MAC address, source IP, destination IP, time, website, URI and username. You only see the big picture when you have all of these variables in front of you.
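The snippet below sketches that idea with Scapy: sniff HTTP requests arriving on a SPAN/mirror port and keep only the fields listed above (source MAC, IP addresses, host and URI) rather than whole packets. The interface name is a placeholder, and this plain-text parsing naturally applies only to unencrypted HTTP; a product such as LANGuardian does far more, but the principle is the same.

```python
# Minimal sketch: extract MAC, IPs, Host, and URI from HTTP requests seen
# on a SPAN/mirror port, discarding the rest of the packet.
# Interface name is a placeholder; only clear-text HTTP is parsed here.
from scapy.all import sniff
from scapy.layers.l2 import Ether
from scapy.layers.inet import IP, TCP
from scapy.packet import Raw

def handle(pkt):
    if not (Ether in pkt and IP in pkt and TCP in pkt and Raw in pkt):
        return
    payload = bytes(pkt[Raw].load)
    if not payload.startswith((b"GET ", b"POST ", b"HEAD ")):
        return
    lines = payload.split(b"\r\n")
    method, uri = lines[0].split(b" ")[:2]
    host = next((l.split(b":", 1)[1].strip() for l in lines
                 if l.lower().startswith(b"host:")), b"?")
    print(pkt[Ether].src, pkt[IP].src, "->", pkt[IP].dst,
          host.decode(errors="replace") + uri.decode(errors="replace"))

sniff(iface="eth1", filter="tcp port 80", prn=handle, store=False)
```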


Why track Internet activity?

  • Root out the source of ransomware and other security threats. Track it down to specific users, IP addresses or MAC addresses.
  • Maintain logs so that you can respond to third-party requests. Finding the source of BitTorrent use would be a common requirement on open networks.
  • Find out why your Internet connection is slow. Employees watching HD movies is a frequent cause.
  • Out-of-band network forensics for troubleshooting or identifying odd network traffic.

Customer Use Case

The end user is a very large airport in EMEA. The basic requirement and use case is tracking web activity and keeping a historical record of it for a period of one year; because most of the users are just passing through (thousands of wireless users every hour!), the only way to uniquely identify each user or device is by MAC address.

Luckily for us, because the LANGuardian HTTP decoder captures and analyses wire data off a SPAN or mirror port, it can easily track proxy or non-proxy traffic by IP or MAC address. The customer can also drill down to URI level when they need to investigate an incident. For them, LANGuardian is an ideal solution for tracking BYOD activity, as there are no modifications to the network and no agents, clients or logs required.

The MAC address variable is an important one when it comes to tracking devices on your network. Most networks use DHCP servers so you cannot rely on tracking activity based on IP addresses only. MAC addresses are unique per device so they will give you a reliable audit trail as to what is happening on your network.
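For example, if all you have are IP-based web logs, you still need to answer “which device held that IP at that time?”. The sketch below resolves that question against a list of DHCP lease records; the record layout and values are invented for illustration, since every DHCP server exports leases in its own format.

```python
# Minimal sketch: resolve which MAC address held an IP at a given time,
# using DHCP lease records. The record layout is purely illustrative.
from datetime import datetime

LEASES = [
    # (ip, mac, lease_start, lease_end)
    ("10.1.40.22", "00:1A:2B:3C:4D:5E",
     datetime(2015, 6, 9, 9, 0), datetime(2015, 6, 9, 13, 0)),
    ("10.1.40.22", "F4:5C:89:00:11:22",
     datetime(2015, 6, 9, 13, 5), datetime(2015, 6, 9, 17, 0)),
]

def mac_for(ip, when):
    """Return the MAC that held the IP at the given time, or None."""
    for lease_ip, mac, start, end in LEASES:
        if lease_ip == ip and start <= when <= end:
            return mac
    return None

print(mac_for("10.1.40.22", datetime(2015, 6, 9, 14, 30)))  # second device's MAC
```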

Thanks to NetFort for the article.

IT Heroes: App Troubles in the Skies

Each and every day, the bold men and women of IT risk it all to deliver critical applications and services. Their stories are unique. Their triumphs inspire. Learn how the US Air Force applies the intelligence gained from the Observer Platform in field testing applications, before deployment, to ensure peak performance on the battlefield and how it could do the same for you.

Hiding in Plain Sight

Nestled in the Florida panhandle, Eglin Air Force Base is about 100 miles from the Mississippi border as the crow flies. Its location on the storied Gulf Coast, near sleepy retirement towns like Destin and Fort Walton Beach, belies the fact that the 725-square-mile base is home to the 46th Test Squadron.

During a Joint Expeditionary Force Experiment (JEFX), the 46th Test Squadron’s Command and Control Performance Team Lead, Lee Paggeot, was on hand to make sure that hundreds of participants and myriad weapons, vehicles, and other devices stayed connected.

New Technology Revealed

As part of the JEFX experiment, Paggeot’s team focused on air-to-air communication, specifically what would be revealed as the Battlefield Airborne Communications Node. The result, they hoped, would be a flying gateway between multiple military communications networks, enabling increased coordination between forces.

“We had hundreds of systems – tons and tons of servers in a massive configuration,” says Paggeot who employed the Observer Performance Management Platform to closely monitor the sensitive airborne network.

A Costly Test

Paggeot’s secret weapon, the Observer Platform was key to ensuring the success of the expensive experiments.

“Sometimes we’d lose the event,” says Paggeot, remembering the days when network problems were far more difficult and costly to solve. “We would have spent thousands of dollars, a hundred thousand dollars. We wouldn’t know until the last day that it was a multicast broadcast storm. Now if that happens at any of our events, we know in minutes.”

Find out how this IT Hero helped the U.S. military prepare for a historic wartime event, while detailing technical deficiencies to resolve IT issues faster – 60 times faster.

Get the full Eglin Air Force Base Study Now.

Thanks to Network Instruments for the article.

Application Performance Management and the Cloud

The lack of innovation in traditional data centers has given way to developments in the cloud. The cloud offers flexible usage models such as Pay As You Go (PAYG) and multi-tenancy services, as offered by, for example, Amazon Web Services (AWS). The downside is that as the cloud’s capacity increases (400k registrations, AWS-Quadnet 2014), it is prone to more blackouts, security and compliance risks than we are led to believe.

The IT environment has become more complex around the cloud. The continued convergence of platforms and technologies has created additional challenges such as virtualization of legacy data centers, cloud hosting, Software Defined Networks (SDN), remote access, mobility (BYOD) and additional unstructured Big Data, part of which stems from consumerism and encompasses User Generated Content (UGC) such as social media (voice/video).

The confluence of hardware and software layered over an existing network architecture creates architectural complications in monitoring application and network performance, as well as visibility blind spots: bandwidth growth across the vertical network between VMs and physical servers, security and compliance protocols for remote and cloud environments, and so on.

This interplay of complexity – for example, packet loss, leaks and packet segmentation in a virtualized environment – can lead to delays of more than a few seconds in software performance synchronization. That can cause brownouts (lag, latency or degradation) and blackouts (crashes), which are detrimental to any commercial environment – such as a retail web UI, where a 2-second delay in web page loads (slow DNS) is far too much.

The issues in a virtualized cloud lie in the hypervisor, as it regularly changes IP addresses for VDIs. So the real measurement issue becomes getting insight into the core virtualized server environment.

When questioned, 79% of CTOs (Information Week study, 2010) cited software as “very important”, yet only 32% of APM service providers actually use specialized monitoring tools for the cloud. Without deep insight into PaaS (Platform as a Service) and IaaS (Infrastructure as a Service), there is no visibility into the performance of applications and networks. Tracking degradation, latency and jitter therefore becomes like finding a needle in the proverbial infrastructure haystack.

The debate surrounding cloud visibility and transparency is yet to be resolved, partly because synthetic tests, probes and passive agents only provide a mid-tier picture of the cloud. A passive virtual agent can be used to gain deep insight into the virtualized cloud environment. As the cloud market becomes more competitive, suppliers are being forced to disclose IaaS/PaaS performance data. Currently 59% of CTOs hold software in the cloud (Information Week, 2011) without any specialized APM solution, so one can only monitor the end-user experience or the resources used (CPU, memory, etc.) to get some idea of application and network performance over the wire.

The imperative is to ensure that your APM provider can cope with the intertwining complexities of the network, application, infrastructure and architecture. This means that a full arsenal of active and passive measurement tools needs to be deployed, whether by a pure-play APM vendor or a full MSP (Managed Service Provider) of end-to-end solutions that can set up, measure and translate outsourcing agreements and SLAs into core, critical, measurable metrics. Furthermore, new software and technology deployments can be compared to established benchmarks, allowing business decisions – such as application or hardware upgrades – to be made on current and relevant factual information, i.e. business transactions, end-user experience and network/application efficacy.

The convergence, consumerism, challenges and complexities around the cloud have increased. So have the proficiencies of the leading APM providers in dealing with cloud complexity, using agentless data collection mechanisms such as injecting probes into middleware or using routers and switches embedded with NetFlow data analysers. The data is used to compile reports and dashboards on packet loss, latency, jitter and so on. The generated reports allow comparisons of trends through semantic relationship testing, correlation and multivariate analysis, with automated and advanced statistical techniques allowing CTOs and CIOs to make real-time business decisions that provide a competitive advantage.

Thanks to APMDigest for the article.