Welcome!

@CloudExpo Authors: Liz McMillan, Elizabeth White, Pat Romanski, Stackify Blog, Harry Trott

Related Topics: @CloudExpo, Microservices Expo, Containers Expo Blog, Agile Computing, @BigDataExpo, SDN Journal

@CloudExpo: Article

Log Management 101: Where Do Logs Come From?

Logs are machine data generated by any sort of application or the infrastructure used to run that application

We’ve had a lot of people asking for the Log Management Primer for a while now. And, surprisingly, many of these folks have a strong technical background, including developers. Some want it for themselves, and some want it to pass on to a colleague, manager, etc. I’m going to explain what logs are, where they come from and how you can get your logs.

If you’re a developer, this post probably isn’t for you as we don’t dig into the code level nitty gritty, but it will give you a high level overview of logs, where they come from and how they get sent to a third party service.

Where Do Logs Come From?
Logs are machine data generated by any sort of application or the infrastructure used to run that application. They created a record of all the events that happen in an application system. They are also usually unstructured or semi-structured and often contain common parameters such as a time stamp, IP address, process id, etc.

Logs can come from throughout the application stack, including:

  • Mobile Apps/Devices
  • Client Browser
  • Web Application code
  • Platform as a Service
    • Application server
    • Database
    • Router
  • Infrastructure as a Service
    • Operating System
    • Other Services (think AWS S3, RDS etc…)
  • On Premise
    • Virtual Machines
    • Hypervisor
    • Network
  • Hardware
    • Server
    • Etc…

Application_Stack_Graphic

How to Get Your Logs
Pretty much all levels of the application stack kick out log data. Different levels, though, lend themselves to different methods of gaining access. That said, there are some similarities across the levels, though.

File System
For the most part, logs are sent to a file system by default. Without further action, that’s where they’d meet an untimely demise as they’re deleted to make way for new logs. The file system is not and ideal place for long term storage for this data and often only a relatively small amount of data is stored here regularly as logs often get rotated after they reach a certain size. If you want to store more of your log data, or if you want to perform any analysis, graphing, alerting, tagging, etc., then you’ll need more than just the file system. Often people will archive logs periodically (e.g., to S3) or will send them to a third party service.

Everything from the application down through the hardware level can send logs to the file system…it’s just a matter of how. For example, your applications, the app server, database, OS, and VMs will all normally send data straight to a file system.

So, now that you’ve got your logs flying to the file system, what do you do to get them out of there and into somewhere with a bit more longevity? When sending them on to a third party logging tool there are two main ways to do this, via syslog or a collector agent.

Syslog
Syslog
is the protocol you’ll use, most likely, if you’re running a Linux setup. If you’re not running Linux, move on; Syslog is the domain of Linus Torvalds and those who use his creation.

Once you’ve sent logs to the file system, Syslog will step in and forward these to your log analysis tool. Note – you’ll need to configure syslog accordingly. Or, in some cases, you can use it to forward directly from the PaaS layer (e.g. logplex from Heroku supports this).

There are several flavors of Syslog,:

The benefits of Syslog are:

  • Its available out of the box on all Linux distros
  • It’s secure
    • You can send logs via TCP (secure) or UDP (unsecure)
    • It’s a known, standard protocol that is widely supported (for instance, see our documentation on Syslog)

The downfalls? It can be challenging to configure if you don’t know what you are doing. Although if you are following a good set of docs it should take no longer than a few minutes. Also older flavors of Linux can have limitations e.g. with syslogd you cannot send data from non-syslog log files that may exist elsewhere on your file system outside of the /var/logs folder where all your syslog logs live. Rsyslog solves this however and ships with most distros these days. Finally, although Snare is a windows equivalent syslog, this approach is largely only used on Linux systems.

Agent
An agent is a lightweight application that runs on your server and (in this case) forwards your logs from the file system to your log management tool. Agents are great for when you’re not just pulling logs out of your Linux box (where you may just use Syslog). That said, you can still use them on Linux if you’re so inclined. They’ll usually send logs to your cloud log solution via an API

The benefits of agents (or at least the Logentries agent):

  • Quick
  • Easy to set up
  • Secure (they use TCP)
  • You can modify the source to filter sensitive data from being logged

The downsides are:

  • It must be updated appropriately – although the Logentries agent is plugged into the relevant Linux package management systems – so this is taken care of in this instance
  • Also sometimes people are reluctant to run unknown pieces of code on their systems – we’ve open sourced our agent for this reason – so you can look at exactly what is running on your machine. That being said you may not have the time or the inclination to do this and may prefer to use a more tried and tested approach like syslog.
  • Scaling issues in relation to deploying agents can also arise - e.g. when you’re trying to deploy on ~100 servers …do you want to do that manually? Luckily, there are tools like Puppet or Chef to automate this.

Libraries
Libraries can be set up to send logs to a logging service from the application layer via an API. Each library supports a specific language (e.g. java, ruby, node.js, c#, python, etc…). The benefit of libraries is that you can still get your logs, even if you only have access at the application code level. Many PaaS providers do not provide file system access or a way to forward logs to a third party service – so libraries are a must in this case.

Client side libraries also allow you to get a view into what is happening from an end user’s perspective. For example, they can allow you to log from your end user’s browser so that you can get a full end to end view of your system. You can use our le.js library to do just that!

Libraries can also be used to log from your mobile apps – check out our android library for this.

Conclusion
So there you have it, now you know how where logs come from and how your logs get from the different parts of your application stack to your log analyzer – all logs from the browser, to the backend, to the log management solution of your choice.

More Stories By Trevor Parsons

Trevor Parsons is Chief Scientist and Co-founder of Logentries. Trevor has over 10 years experience in enterprise software and, in particular, has specialized in developing enterprise monitoring and performance tools for distributed systems. He is also a research fellow at the Performance Engineering Lab Research Group and was formerly a Scientist at the IBM Center for Advanced Studies. Trevor holds a PhD from University College Dublin, Ireland.

@CloudExpo Stories
SYS-CON Events announced today that Grape Up will exhibit at SYS-CON's 21st International Cloud Expo®, which will take place on Oct. 31 – Nov 2, 2017, at the Santa Clara Convention Center in Santa Clara, CA. Grape Up is a software company specializing in cloud native application development and professional services related to Cloud Foundry PaaS. With five expert teams that operate in various sectors of the market across the U.S. and Europe, Grape Up works with a variety of customers from emergi...
SYS-CON Events announced today that Golden Gate University will exhibit at SYS-CON's 21st International Cloud Expo®, which will take place on Oct 31 – Nov 2, 2017, at the Santa Clara Convention Center in Santa Clara, CA. Since 1901, non-profit Golden Gate University (GGU) has been helping adults achieve their professional goals by providing high quality, practice-based undergraduate and graduate educational programs in law, taxation, business and related professions. Many of its courses are taug...
@DevOpsSummit at Cloud Expo taking place Oct 31 - Nov 2, 2017, at the Santa Clara Convention Center, Santa Clara, CA, is co-located with the 21st International Cloud Expo and will feature technical sessions from a rock star conference faculty and the leading industry players in the world. The widespread success of cloud computing is driving the DevOps revolution in enterprise IT. Now as never before, development teams must communicate and collaborate in a dynamic, 24/7/365 environment. There is ...
DevOps at Cloud Expo, taking place October 31 - November 2, 2017, at the Santa Clara Convention Center in Santa Clara, CA, is co-located with 21st Cloud Expo and will feature technical sessions from a rock star conference faculty and the leading industry players in the world. The widespread success of cloud computing is driving the DevOps revolution in enterprise IT. Now as never before, development teams must communicate and collaborate in a dynamic, 24/7/365 environment. There is no time to w...
When shopping for a new data processing platform for IoT solutions, many development teams want to be able to test-drive options before making a choice. Yet when evaluating an IoT solution, it’s simply not feasible to do so at scale with physical devices. Building a sensor simulator is the next best choice; however, generating a realistic simulation at very high TPS with ease of configurability is a formidable challenge. When dealing with multiple application or transport protocols, you would be...
With Cloud Foundry you can easily deploy and use apps utilizing websocket technology, but not everybody realizes that scaling them out is not that trivial. In his session at 21st Cloud Expo, Roman Swoszowski, CTO and VP, Cloud Foundry Services, at Grape Up, will show you an example of how to deal with this issue. He will demonstrate a cloud-native Spring Boot app running in Cloud Foundry and communicating with clients over websocket protocol that can be easily scaled horizontally and coordinate...
yperConvergence came to market with the objective of being simple, flexible and to help drive down operating expenses. It reduced the footprint by bundling the compute/storage/network into one box. This brought a new set of challenges as the HyperConverged vendors are very focused on their own proprietary building blocks. If you want to scale in a certain way, let’s say you identified a need for more storage and want to add a device that is not sold by the HyperConverged vendor, forget about it....
21st International Cloud Expo, taking place October 31 - November 2, 2017, at the Santa Clara Convention Center in Santa Clara, CA, will feature technical sessions from a rock star conference faculty and the leading industry players in the world. Cloud computing is now being embraced by a majority of enterprises of all sizes. Yesterday's debate about public vs. private has transformed into the reality of hybrid cloud: a recent survey shows that 74% of enterprises have a hybrid cloud strategy. Me...
DevOps at Cloud Expo, taking place October 31 - November 2, 2017, at the Santa Clara Convention Center in Santa Clara, CA, is co-located with 21st Cloud Expo and will feature technical sessions from a rock star conference faculty and the leading industry players in the world. The widespread success of cloud computing is driving the DevOps revolution in enterprise IT. Now as never before, development teams must communicate and collaborate in a dynamic, 24/7/365 environment. There is no time to w...
Recently, WebRTC has a lot of eyes from market. The use cases of WebRTC are expanding - video chat, online education, online health care etc. Not only for human-to-human communication, but also IoT use cases such as machine to human use cases can be seen recently. One of the typical use-case is remote camera monitoring. With WebRTC, people can have interoperability and flexibility for deploying monitoring service. However, the benefit of WebRTC for IoT is not only its convenience and interopera...
Vulnerability management is vital for large companies that need to secure containers across thousands of hosts, but many struggle to understand how exposed they are when they discover a new high security vulnerability. In his session at 21st Cloud Expo, John Morello, CTO of Twistlock, will address this pressing concern by introducing the concept of the “Vulnerability Risk Tree API,” which brings all the data together in a simple REST endpoint, allowing companies to easily grasp the severity of t...
In his session at 20th Cloud Expo, Scott Davis, CTO of Embotics, discussed how automation can provide the dynamic management required to cost-effectively deliver microservices and container solutions at scale. He also discussed how flexible automation is the key to effectively bridging and seamlessly coordinating both IT and developer needs for component orchestration across disparate clouds – an increasingly important requirement at today’s multi-cloud enterprise.
Connecting to major cloud service providers is becoming central to doing business. But your cloud provider’s performance is only as good as your connectivity solution. Massive Networks will place you in the driver's seat by exposing how you can extend your LAN from any location to include any cloud platform through an advanced high-performance connection that is secure and dedicated to your business-critical data. In his session at 21st Cloud Expo, Paul Mako, CEO & CIO of Massive Networks, wil...
Any startup has to have a clear go –to-market strategy from the beginning. Similarly, any data science project has to have a go to production strategy from its first days, so it could go beyond proof-of-concept. Machine learning and artificial intelligence in production would result in hundreds of training pipelines and machine learning models that are continuously revised by teams of data scientists and seamlessly connected with web applications for tenants and users.
When shopping for a new data processing platform for IoT solutions, many development teams want to be able to test-drive options before making a choice. Yet when evaluating an IoT solution, it’s simply not feasible to do so at scale with physical devices. Building a sensor simulator is the next best choice; however, generating a realistic simulation at very high TPS with ease of configurability is a formidable challenge. When dealing with multiple application or transport protocols, you would be...
WebRTC is great technology to build your own communication tools. It will be even more exciting experience it with advanced devices, such as a 360 Camera, 360 microphone, and a depth sensor camera. In his session at @ThingsExpo, Masashi Ganeko, a manager at INFOCOM Corporation, will introduce two experimental projects from his team and what they learned from them. "Shotoku Tamago" uses the robot audition software HARK to track speakers in 360 video of a remote party. "Virtual Teleport" uses a...
IT organizations are moving to the cloud in hopes to approve efficiency, increase agility and save money. Migrating workloads might seem like a simple task, but what many businesses don’t realize is that application migration criteria differs across organizations, making it difficult for architects to arrive at an accurate TCO number. In his session at 21st Cloud Expo, Joe Kinsella, CTO of CloudHealth Technologies, will offer a systematic approach to understanding the TCO of a cloud application...
"With Digital Experience Monitoring what used to be a simple visit to a web page has exploded into app on phones, data from social media feeds, competitive benchmarking - these are all components that are only available because of some type of digital asset," explained Leo Vasiliou, Director of Web Performance Engineering at Catchpoint Systems, in this SYS-CON.tv interview at DevOps Summit at 20th Cloud Expo, held June 6-8, 2017, at the Javits Center in New York City, NY.
SYS-CON Events announced today that Secure Channels, a cybersecurity firm, will exhibit at SYS-CON's 21st International Cloud Expo®, which will take place on Oct 31 – Nov 2, 2017, at the Santa Clara Convention Center in Santa Clara, CA. Secure Channels, Inc. offers several products and solutions to its many clients, helping them protect critical data from being compromised and access to computer networks from the unauthorized. The company develops comprehensive data encryption security strategie...
SYS-CON Events announced today that App2Cloud will exhibit at SYS-CON's 21st International Cloud Expo®, which will take place on Oct. 31 – Nov 2, 2017, at the Santa Clara Convention Center in Santa Clara, CA. App2Cloud is an online Platform, specializing in migrating legacy applications to any Cloud Providers (AWS, Azure, Google Cloud).