@CloudExpo: Blog Post

Cousins of Cobol in Big Data Analytics

How DFSORT, REXX Support Big Data Analytics

In this article I would like to look at a few tools that are often overlooked when it comes to Big Data analytics. Organizations that already have a heavy investment in the mainframe, and would like to keep using it, can consider these tools for expanding their Big Data analytics reach.

DFSORT - Sorting & Merging Large Data Sets:

  • Long before RDBMSs took their place, Cobol programs had two major file manipulation operations:
      • The SORT operation accepts unsequenced input and produces output in a specified sequence
      • The MERGE operation compares records from two or more files and combines them in order
  • DFSORT adds the ability to do faster and easier sorting, merging, copying, reporting, and analysis of your business information, as well as versatile data handling at the record, field (fixed or variable position/length), and bit level
  • DFSORT is designed to optimize the efficiency and speed with which operations are completed through synergy with processor, device, and system features
  • A Cobol program will typically act as an intermediary, handling the file inputs and passing them to DFSORT
  • After all the input records have been passed to DFSORT, the sorting operation is executed. This operation arranges the entire set of records in the sequence specified by the keys
  • Much like SORT, the MERGE statement is also called from a Cobol job
  • Executing the MERGE statement begins the merge processing. This operation compares the keys of the records in the input files and passes the sequenced records on to create a merged output file
  • According to the vendor documentation, there is no maximum number of keys, which helps support the needs of Big Data analytics processing
  • Some of the advanced options of DFSORT also facilitate parallel sort processing, which goes well with the needs of Big Data analytics
  • Because Big Data analytical workloads can span multiple physical and virtual servers, including the mainframe, it is good to see that DFSORT has the option to sort records in EBCDIC, ASCII, or another collating sequence. This can bring uniformity to massively parallel sorting jobs that run on heterogeneous systems
  • The Job Control Language (JCL), which gives Hadoop-like management of large file processing jobs on the mainframe, has good features for specifying multiple input and output files for SORT and MERGE jobs
  • As is evident, this article is not intended as a tutorial for DFSORT. The various performance features can be looked up in the mainframe manuals, or you can ask the mainframe gurus in your organization
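To make the SORT, MERGE, and collating sequence ideas above concrete, here is a minimal sketch in Python rather than in DFSORT control statements. It is only an illustration of the concepts, not of how DFSORT itself works: the record layout, field names, and sample values are all assumptions made up for this example, and `cp037` is used as one common EBCDIC code page.

```python
import heapq
from operator import itemgetter

# Hypothetical records (region, customer_id, amount); not from any real file.
unsorted_input = [
    ("WEST", 1003, 250.0),
    ("EAST", 1001, 75.5),
    ("WEST", 1002, 120.0),
    ("EAST", 1004, 310.25),
]

# The sort keys: region first, then customer_id.
sort_keys = itemgetter(0, 1)

# SORT: accept unsequenced input and produce output in the key sequence.
sorted_output = sorted(unsorted_input, key=sort_keys)

# MERGE: combine two inputs that are each already in key sequence.
file_a = [("EAST", 1001, 75.5), ("WEST", 1002, 120.0)]
file_b = [("EAST", 1004, 310.25), ("WEST", 1003, 250.0)]
merged_output = list(heapq.merge(file_a, file_b, key=sort_keys))

# Collating sequence: the same values order differently when compared by
# their ASCII bytes and by their EBCDIC bytes (cp037 is an assumed code page).
values = ["a1", "A1", "11"]
ascii_order = sorted(values, key=lambda s: s.encode("ascii"))
ebcdic_order = sorted(values, key=lambda s: s.encode("cp037"))

print(sorted_output)
print(merged_output)
print(ascii_order, ebcdic_order)
```

The last two lists show why a common collating sequence matters on heterogeneous systems: ASCII places digits before uppercase before lowercase letters, while EBCDIC places lowercase before uppercase before digits, so two machines sorting the same keys natively would disagree on the order.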

REXX:

  • REXX (Restructured eXtended eXecutor) is another programming language that is used in the same ecosystem as Cobol and DFSORT and can contribute considerably to the Big Data analytical needs of enterprises
  • REXX has advantages in string manipulation, dynamic data typing, and storage management, and is generally considered to be very reliable and robust
  • One of the most important strengths of REXX that is relevant to Big Data analytics is its "character string" handling ability
  • There are some useful string manipulation functions such as COPIES(), WORDS(), STRIP(), and TRANSLATE(), which can go a long way toward meeting the Map/Reduce needs of typical Big Data analytical jobs
  • The PARSE instruction is also used frequently in REXX programs. It can take strings from a number of sources and break them apart into constituent parts using a fairly natural notation
  • PARSE is probably one of the most useful features of REXX in its positioning as a Big Data analytical tool
  • The REXX PARSE instruction divides a source string into constituent parts and assigns these to symbols as directed by the governing parsing template
  • REXX, DFSORT, and Cobol programs are interoperable, so that we could call a REXX program from Cobol, and all of these can be tied together with JCL
  • Again, this note is not meant as a tutorial for REXX; a lot of good documentation is available on using the string manipulation features of REXX
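For readers who do not know REXX, the string handling described above can be approximated in Python as follows. This is a rough sketch, not REXX itself: `parse_words` is a hypothetical helper that imitates only the simplest blank-delimited parsing template (one word per name, the remainder to the last name), the comments name the REXX function each line loosely corresponds to, and the sample record is invented.

```python
def parse_words(source, count):
    """Roughly imitate a blank-delimited template such as
    PARSE VAR source a b c: each name but the last receives one word,
    and the last name receives whatever remains."""
    parts = source.split(None, count - 1)
    # Names with no matching word are set to the empty string.
    parts += [""] * (count - len(parts))
    return parts


# Hypothetical input record with untidy spacing.
record = "  1001 EAST   Widget order pending  "

stripped = record.strip()          # comparable to STRIP(record)
word_count = len(record.split())   # comparable to WORDS(record)
ruler = "-" * 10                   # comparable to COPIES("-", 10)
upper = stripped.upper()           # comparable to TRANSLATE(stripped)

# Break the record into three named parts, template style.
cust_id, region, description = parse_words(stripped, 3)

print(word_count, cust_id, region, description)
```

Real REXX templates are considerably richer than this (literal patterns and positional patterns, for instance), so the helper should be read as a hint at the notation rather than as an equivalent.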

Summary: There is a strong need for enterprises to adopt Big Data analytics and start mining the huge sets of unstructured data, ignored so far, to arrive at meaningful business decisions. While newer frameworks like Hadoop and the new breed of analytical databases are going to satisfy this need, enterprises should not be spending their time on picking up tools and languages when it comes to Big Data analytics.

If there is a significant investment in legacy platforms like Cobol, JCL, REXX, and DFSORT, and the organization's direction is to keep using them, it is only prudent to make the best of their capabilities when arriving at options for Big Data analytics.

We are seeing that Big Data analytics depends mainly on Map/Reduce algorithms. These functions are aimed at crunching large data sets: the input files are read to create key/value pairs, and the map functions take these key/value pairs and generate other key/value pairs. The reducer function, in turn, depends on sorted key/value pairs, iterating over them to reduce the output further.
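The map, sort, and reduce flow described above can be sketched in a few lines of Python. This is a toy, single-process word count meant only to show where sorting sits between the two phases; the function names `map_phase` and `reduce_phase` and the input lines are assumptions for illustration, not part of Hadoop or any mainframe tool.

```python
from itertools import groupby
from operator import itemgetter

# Hypothetical input "file": one string per record.
lines = ["sort merge sort", "parse sort merge"]


def map_phase(records):
    """Emit a (key, value) pair for every word in every record."""
    for record in records:
        for word in record.split():
            yield (word, 1)


def reduce_phase(sorted_pairs):
    """Sum the values of each run of equal keys.
    The pairs must already be sorted by key for the grouping to work."""
    for key, group in groupby(sorted_pairs, key=itemgetter(0)):
        yield (key, sum(value for _, value in group))


mapped = list(map_phase(lines))
# The sort step between map and reduce brings equal keys together.
shuffled = sorted(mapped, key=itemgetter(0))
counts = dict(reduce_phase(shuffled))

print(counts)
```

Note that `reduce_phase` only works because its input is sorted by key, which is the dependency on sorting that the paragraph above points to.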

If we look at the way this logic works, there is a heavy need for sorting, merging, string manipulation, and parsing all the way through. Hence the tools mentioned above, DFSORT and REXX along with Cobol, are likely to satisfy the Big Data needs of large enterprises that have already invested in mainframe compute capacity.

 

More Stories By Srinivasan Sundara Rajan

Highly passionate about utilizing Digital Technologies to enable next generation enterprise. Believes in enterprise transformation through the Natives (Cloud Native & Mobile Native).
