@CloudExpo: Article

Log Management 101: Where Do Logs Come From?

Logs are machine data generated by any sort of application or the infrastructure used to run that application

We’ve had a lot of people asking for the Log Management Primer for a while now. And, surprisingly, many of these folks have a strong technical background, including developers. Some want it for themselves, and some want it to pass on to a colleague, manager, etc. I’m going to explain what logs are, where they come from and how you can get your logs.

If you’re a developer, this post probably isn’t for you, as we don’t dig into the code-level nitty-gritty, but it will give you a high-level overview of logs, where they come from and how they get sent to a third-party service.

Where Do Logs Come From?
Logs are machine data generated by any sort of application or the infrastructure used to run that application. They create a record of all the events that happen in an application system. They are usually unstructured or semi-structured and often contain common parameters such as a timestamp, IP address, process ID, etc.
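
To make "semi-structured" a little more concrete, here is a minimal Python sketch that pulls those common parameters out of a single log line. The line layout, the field names and the sample line are all made up for illustration; real formats vary from one application to the next.

```python
import re

# Hypothetical layout: "<timestamp> <client IP> <process>[<pid>]: <free text>"
LOG_PATTERN = re.compile(
    r"^(?P<timestamp>\S+) (?P<ip>\d{1,3}(?:\.\d{1,3}){3}) "
    r"(?P<process>[\w.-]+)\[(?P<pid>\d+)\]: (?P<message>.*)$"
)

def parse_log_line(line):
    """Return the structured fields of a log line, or None if it does not match."""
    match = LOG_PATTERN.match(line)
    if match is None:
        return None
    fields = match.groupdict()
    fields["pid"] = int(fields["pid"])  # process ID as a number, not text
    return fields

sample = "2017-10-31T09:15:02Z 203.0.113.7 webapp[4211]: GET /checkout returned 500"
parsed = parse_log_line(sample)
```

The structured part (timestamp, IP, process ID) can be picked out reliably, while the message at the end stays free text. That mix is what makes most logs semi-structured.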

Logs can come from throughout the application stack, including:

  • Mobile Apps/Devices
  • Client Browser
  • Web Application code
  • Platform as a Service
    • Application server
    • Database
    • Router
  • Infrastructure as a Service
    • Operating System
    • Other Services (think AWS S3, RDS etc…)
  • On Premise
    • Virtual Machines
    • Hypervisor
    • Network
  • Hardware
    • Server
    • Etc…

[Figure: application stack graphic]

How to Get Your Logs
Pretty much all levels of the application stack kick out log data. Different levels lend themselves to different methods of gaining access, though there are some similarities across them.

File System
For the most part, logs are sent to a file system by default. Without further action, that’s where they’d meet an untimely demise as they’re deleted to make way for new logs. The file system is not an ideal place for long-term storage of this data, and often only a relatively small amount is kept there because logs get rotated once they reach a certain size. If you want to store more of your log data, or if you want to perform any analysis, graphing, alerting, tagging, etc., then you’ll need more than just the file system. Often people will archive logs periodically (e.g., to S3) or will send them to a third-party service.
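
If you want to see rotation in action, the following is a small sketch using Python's standard `logging.handlers.RotatingFileHandler`. The tiny size limit, the temporary directory and the logger name are assumptions chosen so the effect shows up quickly; they are not recommended settings.

```python
import logging
import logging.handlers
import os
import tempfile

log_dir = tempfile.mkdtemp()
log_path = os.path.join(log_dir, "app.log")

# Rotate before the file grows past ~1 KB and keep only 2 old files
# (app.log.1 and app.log.2); anything older is deleted.
handler = logging.handlers.RotatingFileHandler(log_path, maxBytes=1024, backupCount=2)
logger = logging.getLogger("rotation_demo")
logger.setLevel(logging.INFO)
logger.propagate = False
logger.addHandler(handler)

for i in range(200):
    logger.info("event number %d with some padding text", i)
handler.close()

remaining_files = sorted(os.listdir(log_dir))
```

After the loop only `app.log` and two backups are left on disk, so the earliest events are already gone. That is the "untimely demise" described above, and the reason for archiving or forwarding logs before rotation catches up with them.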

Everything from the application down through the hardware level can send logs to the file system…it’s just a matter of how. For example, your applications, the app server, database, OS, and VMs will all normally send data straight to a file system.

So, now that you’ve got your logs flying to the file system, what do you do to get them out of there and into somewhere with a bit more longevity? When sending them on to a third-party logging tool, there are two main ways to do this: via syslog or via a collector agent.

Syslog
Syslog is the protocol you’ll most likely use if you’re running a Linux setup. If you’re not running Linux or another Unix-like system, move on; Syslog is mostly the domain of Linus Torvalds and those who use his creation.

Once you’ve sent logs to the file system, Syslog will step in and forward them to your log analysis tool, provided you have configured it accordingly. In some cases, you can also use it to forward directly from the PaaS layer (e.g. Logplex from Heroku supports this).

There are several flavors of Syslog, such as syslogd and rsyslog.

The benefits of Syslog are:

  • It’s available out of the box on virtually all Linux distros
  • It can be secure
    • You can send logs via TCP (reliable delivery, and it can be encrypted with TLS) or via UDP (no encryption and no delivery guarantee)
  • It’s a known, standard protocol that is widely supported (for instance, see our documentation on Syslog)
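
To give a feel for what actually travels over the wire, here is a simplified Python sketch that builds a syslog-style payload and sends it over UDP or TCP. The priority arithmetic (facility × 8 + severity) is part of the syslog standard; everything else is an assumption for illustration: the function names, the `myapp` tag, the newline framing on TCP, and the bare-bones message, which leaves out the timestamp and hostname a full syslog message would normally carry.

```python
import socket

def syslog_priority(facility, severity):
    """PRI value used by syslog: facility * 8 + severity."""
    return facility * 8 + severity

def build_message(facility, severity, tag, text):
    """Build a stripped-down syslog payload such as b'<14>myapp: hello'."""
    pri = syslog_priority(facility, severity)
    return "<{}>{}: {}".format(pri, tag, text).encode("utf-8")

def send_udp(payload, host, port):
    """Fire and forget: no connection and no delivery guarantee."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(payload, (host, port))

def send_tcp(payload, host, port):
    """Connection-oriented delivery; newline-terminated framing is assumed here."""
    with socket.create_connection((host, port), timeout=5) as sock:
        sock.sendall(payload + b"\n")

USER_FACILITY = 1  # "user-level messages"
INFO_SEVERITY = 6  # "informational"
payload = build_message(USER_FACILITY, INFO_SEVERITY, "myapp", "user logged in")
```

Note that neither function encrypts anything. To protect logs in transit you would wrap the TCP connection in TLS, for example with Python's `ssl` module, or let your syslog daemon handle that for you.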

The downsides? It can be challenging to configure if you don’t know what you are doing, although if you are following a good set of docs it should take no longer than a few minutes. Also, older flavors can have limitations: with syslogd, for example, you cannot send data from non-syslog log files that live elsewhere on your file system, outside of the /var/log folder where your syslog logs are kept. Rsyslog solves this, however, and ships with most distros these days. Finally, although Snare provides a syslog equivalent for Windows, this approach is largely only used on Linux systems.

Agent
An agent is a lightweight application that runs on your server and (in this case) forwards your logs from the file system to your log management tool. Agents are great for when you’re not just pulling logs out of your Linux box (where you may just use Syslog). That said, you can still use them on Linux if you’re so inclined. They’ll usually send logs to your cloud log solution via an API.
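
The core loop of such an agent can be sketched in a few lines of Python. This is not how the Logentries agent (or any particular agent) is implemented; the function names are hypothetical, and a plain list stands in for the remote service so the example stays self-contained.

```python
import os
import tempfile

def read_new_lines(path, offset):
    """Return (complete lines appended since `offset`, new offset in bytes)."""
    with open(path, "rb") as f:
        f.seek(offset)
        data = f.read()
    # Only forward complete lines; a partial last line waits for the next poll.
    end = data.rfind(b"\n") + 1
    lines = data[:end].decode("utf-8", errors="replace").splitlines()
    return lines, offset + end

def forward(lines, send):
    """Hand each line to `send`, a stand-in for the real API or socket call."""
    for line in lines:
        send(line)

# Demo: a temporary log file, and a list playing the part of the log service.
log_path = os.path.join(tempfile.mkdtemp(), "app.log")
received = []

with open(log_path, "w") as f:
    f.write("first event\nsecond event\npartial")
lines, offset = read_new_lines(log_path, 0)
forward(lines, received.append)

with open(log_path, "a") as f:
    f.write(" line now complete\n")
lines, offset = read_new_lines(log_path, offset)
forward(lines, received.append)
```

A real agent would run this in a polling loop, remember its offset across restarts, and cope with the file being rotated underneath it, all of which this sketch leaves out.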

The benefits of agents (or at least the Logentries agent):

  • Quick
  • Easy to set up
  • Secure (they send over TCP, which can be encrypted with TLS)
  • You can modify the source to filter sensitive data from being logged
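
As a rough idea of what filtering sensitive data could look like, here is a small Python sketch that scrubs a line before it is forwarded. The patterns, the placeholder text and the `scrub` function are illustrative assumptions, not part of any agent’s actual source.

```python
import re

# Illustrative patterns only; a real deployment would choose its own.
REDACTIONS = [
    (re.compile(r"\b\d{13,16}\b"), "[REDACTED_CARD]"),
    (re.compile(r"(password=)\S+", re.IGNORECASE), r"\1[REDACTED]"),
]

def scrub(line):
    """Apply every redaction pattern to a log line before it leaves the machine."""
    for pattern, replacement in REDACTIONS:
        line = pattern.sub(replacement, line)
    return line

scrubbed = scrub("login ok user=ann password=hunter2 card=4111111111111111")
```

Running the filter inside the agent means the sensitive values never reach the third-party service in the first place.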

The downsides are:

  • It must be kept up to date, although the Logentries agent is plugged into the relevant Linux package management systems, so this is taken care of in this instance
  • People are sometimes reluctant to run unknown pieces of code on their systems. We’ve open sourced our agent for this reason, so you can look at exactly what is running on your machine. That said, you may not have the time or the inclination to do this and may prefer a more tried and tested approach like syslog.
  • Deploying agents at scale can also be an issue, e.g. when you’re trying to deploy on ~100 servers, do you want to do that manually? Luckily, there are tools like Puppet or Chef to automate this.

Libraries
Libraries can be set up to send logs to a logging service from the application layer via an API. Each library supports a specific language (e.g. Java, Ruby, Node.js, C#, Python, etc…). The benefit of libraries is that you can still get your logs, even if you only have access at the application code level. Many PaaS providers do not provide file system access or a way to forward logs to a third-party service, so libraries are a must in this case.
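
In Python, for instance, a logging library usually plugs into the standard `logging` module as a handler. The sketch below shows the general shape under stated assumptions: `ApiLogHandler` and its `transport` argument are hypothetical names, and a list stands in for the HTTPS call a real library would make to its service.

```python
import json
import logging

class ApiLogHandler(logging.Handler):
    """Hypothetical handler: turns each record into JSON and passes it to `transport`."""

    def __init__(self, transport):
        super().__init__()
        self.transport = transport  # callable taking one bytes payload

    def emit(self, record):
        try:
            payload = json.dumps({
                "level": record.levelname,
                "logger": record.name,
                "message": record.getMessage(),
                "created": record.created,
            }).encode("utf-8")
            self.transport(payload)
        except Exception:
            self.handleError(record)

sent = []  # stands in for the remote logging service
logger = logging.getLogger("checkout")
logger.setLevel(logging.INFO)
logger.propagate = False
logger.addHandler(ApiLogHandler(sent.append))

logger.warning("payment retry for order %s", "A-1001")
```

Because the handler lives entirely in application code, it works even where you have no access to the file system or to syslog.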

Client-side libraries also allow you to get a view into what is happening from an end user’s perspective. For example, they can allow you to log from your end user’s browser so that you can get a full end-to-end view of your system. You can use our le.js library to do just that!

Libraries can also be used to log from your mobile apps – check out our Android library for this.

Conclusion
So there you have it: now you know where logs come from and how they get from the different parts of your application stack to your log analyzer – all logs, from the browser, to the backend, to the log management solution of your choice.

More Stories By Trevor Parsons

Trevor Parsons is Chief Scientist and Co-founder of Logentries. Trevor has over 10 years’ experience in enterprise software and, in particular, has specialized in developing enterprise monitoring and performance tools for distributed systems. He is also a research fellow at the Performance Engineering Lab Research Group and was formerly a Scientist at the IBM Center for Advanced Studies. Trevor holds a PhD from University College Dublin, Ireland.
