|By Srinivasan Sundara Rajan||
|December 28, 2011 07:15 AM EST||
In this article I would like to look at a few tools which are overlooked when it comes to Big Data analytics. Organizations that have already heavy investment on Mainframe and would like to continue with the utilization of Mainframe can consider these tools for further expanding their Big Data Analytics reach.
DFSORT- Sorting & Merging Large Data Sets :
- Much before RDBMS have taken their place, Cobol programs have 2 major file manipulation operations namely:
- SORT operation accepts un-sequenced input and produces output in specified sequence
- The Merge operation compares records from two or more files and combines them in order
- DFSORT adds the ability to do faster and easier sorting, merging, copying, reporting and analysis of your business information, as well as versatile data handling at the record, fixed position/length or variable position/length field, and bit level.
- DFSORT is designed to optimize the efficiency and speed with which operations are completed through synergy with processor, device, and system features
- A Cobol program will typically act as a intermediary in handling the FILE inputs and passing them to DFSORT
- After all the input records have been passed to DFSORT, the sorting operation is executed. This operation arranges the entire set of records in the sequence specified by keys.
- Much like a SORT , MERGE statement is also called from a COBOL job
- The MERGE statement execution begins the MERGE processing. This operation compares keys with the records of the input files, and passes the sequenced records to create a MERGED output file
- As per the documentation from the vendor , there is no maximum number of keys which can support the needs for Big Data Analytics processing
- Some of the advanced options of DFSORT also facilitates parallel sort processing which goes well with needs of Big Data Analytics
- With the work loads of Big Data Analytical jobs can span multiple physical and virtual servers including mainframe, it is good to see that DFSORT has the option to sort records either in EBCDIC or ASCII or another collating sequence. This can result in uniformity of massively parallel sorting jobs if they run on heterogeneous systems
- The Job Control Language (JCL), which gives Hadoop like management of large file processing jobs in Mainframe have good features to specify multiple input and output file options for SORT and MERGE jobs
- As evident this article does not aim as a tutorial for DFSORT and various performance features can be looked from Mainframe manuals or can ask Mainframe Gurus in your organization.
- REXX (Restructured eXtended eXecutor) is another programming language that is used in the same eco system of Cobol and DFSORT and can considerably contribute to the Big Data Analytical needs of the enterprises
- REXX has advantages in string manipulation, Dynamic data typing, Storage Management and is generally considered to be very reliable and robust
- One of the most important strengths of REXX that is of relevance to Bigdata Analytics is its ‘'character string" handling ability.
- There are some useful string manipulation functions like COPIES (), WORDS(), STRIP(), TRANSLATE(), which can go a long way in the Map Reduce functionality needs of typical big data analytical jobs
- PARSE instruction is also used frequently in REXX programs. It is able to take strings from a number of sources and break them apart into constituent parts using a fairly natural notation
- Probably PARSE could be one of the highly useful feature of REXX in its positioning as a Big Data Analytical tool
- The REXX parse statement divides a source string into constituent parts and assigns these to symbols as directed by the governing parsing template
- REXX, DFSORT and Cobol programs can be inter operable such that we could call a REXX program from Cobol , and all these can be tied together with JCL
- Again this note is meant as a tutorial for REXX and lot of good documentation is available on utilizing the String manipulation features of REXX.
Summary : There is a strong need for enterprises to adopt Big Data Analytics and start mining the huge sets of unstructured data which has been ignored so far to arrive at meaningful business decisions. While newer frameworks like Hadoop or the new breed of analytical databases are going to satisfy this need, however enterprises should not be spending their time on picking up the tools and languages when it comes to Big Data Analytics.
If there is a significant investment and organization direction is to use the legacy platforms like Cobol, JCL, REXX, DFSORT it is only prudent to utilize best of their capabilities in arriving at options for Big Data Analytics.
We are seeing that Big Data Analytics is mainly dependent on Map / Reduce algorithms, these functions are aimed at crunching large data sets, like reading the input files and create key/value pair and map functions take these key/value pairs and generates another key/value pair. Further Reducer function also depends on sorted key/value pairs and iterate them and reduce the output further.
If we look at the way this logic works, there is a heavy need for sorting, merging, string manipulation and parsing all the way. Hence the tools mentioned above like DFSORT, REXX along with Cobol will likely to satisfy the Big Data needs of large enterprises if they have already invested on Mainframe compute capacity.
The cloud is becoming the de-facto way for enterprises to leverage common infrastructure while innovating and one of the biggest obstacles facing public cloud computing is security. In his session at 15th Cloud Expo, Jeff Aliber, a global marketing executive at Verizon, discussed how the best place for web security is in the cloud. Benefits include: Functions as the first layer of defense Easy operation –CNAME change Implement an integrated solution Best architecture for addressing network-l...
Dec. 19, 2014 02:00 AM EST Reads: 910
Leysin American School is an exclusive, private boarding school located in Leysin, Switzerland. Leysin selected an OpenStack-powered, private cloud as a service to manage multiple applications and provide development environments for students across the institution. Seeking to meet rigid data sovereignty and data integrity requirements while offering flexible, on-demand cloud resources to users, Leysin identified OpenStack as the clear choice to round out the school's cloud strategy. Additional...
Dec. 19, 2014 02:00 AM EST Reads: 1,847
DevOps Summit 2015 New York, co-located with the 16th International Cloud Expo - to be held June 9-11, 2015, at the Javits Center in New York City, NY - announces that it is now accepting Keynote Proposals. The widespread success of cloud computing is driving the DevOps revolution in enterprise IT. Now as never before, development teams must communicate and collaborate in a dynamic, 24/7/365 environment. There is no time to wait for long development cycles that produce software that is obsolete...
Dec. 18, 2014 09:45 PM EST Reads: 881
“DevOps is really about the business. The business is under pressure today, competitively in the marketplace to respond to the expectations of the customer. The business is driving IT and the problem is that IT isn't responding fast enough," explained Mark Levy, Senior Product Marketing Manager at Serena Software, in this SYS-CON.tv interview at DevOps Summit, held Nov 4–6, 2014, at the Santa Clara Convention Center in Santa Clara, CA.
Dec. 18, 2014 08:00 PM EST Reads: 1,144
“We help people build clusters, in the classical sense of the cluster. We help people put a full stack on top of every single one of those machines. We do the full bare metal install," explained Greg Bruno, Vice President of Engineering and co-founder of StackIQ, in this SYS-CON.tv interview at 15th Cloud Expo, held Nov 4–6, 2014, at the Santa Clara Convention Center in Santa Clara, CA.
Dec. 18, 2014 02:30 PM EST Reads: 879
Mobile commerce traffic is surpassing desktop, yet less than 20% of sales in the U.S. are mobile commerce sales. In his session at 15th Cloud Expo, Dan Franklin, Segment Manager, Commerce, at Verizon Digital Media Services, defined mobile devices and discussed how next generation means simplification. It means taking your digital content and turning it into instantly gratifying experiences.
Dec. 18, 2014 12:00 PM EST Reads: 1,041
“In the past year we've seen a lot of stabilization of WebRTC. You can now use it in production with a far greater degree of certainty. A lot of the real developments in the past year have been in things like the data channel, which will enable a whole new type of application," explained Peter Dunkley, Technical Director at Acision, in this SYS-CON.tv interview at @ThingsExpo, held Nov 4–6, 2014, at the Santa Clara Convention Center in Santa Clara, CA.
Dec. 18, 2014 11:30 AM EST Reads: 919
SYS-CON Events announced today that Windstream, a leading provider of advanced network and cloud communications, has been named “Silver Sponsor” of SYS-CON's 16th International Cloud Expo®, which will take place on June 9–11, 2015, at the Javits Center in New York, NY. Windstream (Nasdaq: WIN), a FORTUNE 500 and S&P 500 company, is a leading provider of advanced network communications, including cloud computing and managed services, to businesses nationwide. The company also offers broadband, p...
Dec. 18, 2014 11:00 AM EST Reads: 2,120
The major cloud platforms defy a simple, side-by-side analysis. Each of the major IaaS public-cloud platforms offers their own unique strengths and functionality. Options for on-site private cloud are diverse as well, and must be designed and deployed while taking existing legacy architecture and infrastructure into account. Then the reality is that most enterprises are embarking on a hybrid cloud strategy and programs. In this Power Panel at 15th Cloud Expo (http://www.CloudComputingExpo.com...
Dec. 18, 2014 10:30 AM EST Reads: 2,250
Verizon Enterprise Solutions is simplifying the cloud-purchasing experience for its clients, with the launch of Verizon Cloud Marketplace, a key foundational component of the company's robust ecosystem of enterprise-class technologies. The online storefront will initially feature pre-built cloud-based services from AppDynamics, Hitachi Data Systems, Juniper Networks, PfSense and Tervela. Available globally to enterprises using Verizon Cloud, Verizon Cloud Marketplace provides a one-stop shop fo...
Dec. 18, 2014 10:30 AM EST Reads: 1,805
The Internet of Things is not new. Historically, smart businesses have used its basic concept of leveraging data to drive better decision making and have capitalized on those insights to realize additional revenue opportunities. So, what has changed to make the Internet of Things one of the hottest topics in tech? In his session at @ThingsExpo, Chris Gray, Director, Embedded and Internet of Things, discussed the underlying factors that are driving the economics of intelligent systems. Discover ...
Dec. 18, 2014 10:15 AM EST Reads: 2,075
The move in recent years to cloud computing services and architectures has added significant pace to the application development and deployment environment. When enterprise IT can spin up large computing instances in just minutes, developers can also design and deploy in small time frames that were unimaginable a few years ago. The consequent move toward lean, agile, and fast development leads to the need for the development and operations sides to work very closely together. Thus, DevOps become...
Dec. 18, 2014 10:00 AM EST Reads: 1,892
"Our premise is Docker is not enough. That's not a bad thing - we actually love Docker. At ActiveState all our products are based on open source technology and Docker is an up-and-coming piece of open source technology," explained Bart Copeland, President & CEO of ActiveState Software, in this SYS-CON.tv interview at DevOps Summit at Cloud Expo®, held Nov 4-6, 2014, at the Santa Clara Convention Center in Santa Clara, CA.
Dec. 18, 2014 10:00 AM EST Reads: 1,836
"BSQUARE is in the business of selling software solutions for smart connected devices. It's obvious that IoT has moved from being a technology to being a fundamental part of business, and in the last 18 months people have said let's figure out how to do it and let's put some focus on it, " explained Dave Wagstaff, VP & Chief Architect, at BSQUARE Corporation, in this SYS-CON.tv interview at @ThingsExpo, held Nov 4-6, 2014, at the Santa Clara Convention Center in Santa Clara, CA.
Dec. 18, 2014 10:00 AM EST Reads: 1,778
SYS-CON Events announced today that AIC, a leading provider of OEM/ODM server and storage solutions, will exhibit at SYS-CON's 16th International Cloud Expo®, which will take place on June 9-11, 2015, at the Javits Center in New York City, NY. AIC is a leading provider of both standard OTS, off-the-shelf, and OEM/ODM server and storage solutions. With expert in-house design capabilities, validation, manufacturing and production, AIC's broad selection of products are highly flexible and are conf...
Dec. 18, 2014 09:45 AM EST Reads: 1,729
SYS-CON Events announced today that IDenticard will exhibit at SYS-CON's 16th International Cloud Expo®, which will take place on June 9-11, 2015, at the Javits Center in New York City, NY. IDenticard™ is the security division of Brady Corp (NYSE: BRC), a $1.5 billion manufacturer of identification products. We have small-company values with the strength and stability of a major corporation. IDenticard offers local sales, support and service to our customers across the United States and Canada...
Dec. 18, 2014 09:30 AM EST Reads: 1,885
"People are a lot more knowledgeable about APIs now. There are two types of people who work with APIs - IT people who want to use APIs for something internal and the product managers who want to do something outside APIs for people to connect to them," explained Roberto Medrano, Executive Vice President at SOA Software, in this SYS-CON.tv interview at Cloud Expo, held Nov 4–6, 2014, at the Santa Clara Convention Center in Santa Clara, CA.
Dec. 18, 2014 09:00 AM EST Reads: 1,222
“We are a managed services company. We have taken the key aspects of the cloud and the purposed data center and merged the two together and launched the Purposed Cloud about 18–24 months ago," explained Chetan Patwardhan, CEO of Stratogent, in this SYS-CON.tv interview at 15th Cloud Expo, held Nov 4–6, 2014, at the Santa Clara Convention Center in Santa Clara, CA.
Dec. 18, 2014 09:00 AM EST Reads: 1,185
The Internet of Things is a misnomer. That implies that everything is on the Internet, and that simply should not be - especially for things that are blurring the line between medical devices that stimulate like a pacemaker and quantified self-sensors like a pedometer or pulse tracker. The mesh of things that we manage must be segmented into zones of trust for sensing data, transmitting data, receiving command and control administrative changes, and peer-to-peer mesh messaging. In his session a...
Dec. 17, 2014 11:15 PM EST Reads: 1,273