By Srinivasan Sundara Rajan
December 28, 2011 07:15 AM EST
In this article I would like to look at a few tools that are often overlooked when it comes to Big Data Analytics. Organizations that have already invested heavily in the Mainframe, and that would like to keep using it, can consider these tools to extend their Big Data Analytics reach.
DFSORT - Sorting & Merging Large Data Sets:
- Long before RDBMSs took their place, COBOL programs had two major file manipulation operations, namely:
- The SORT operation accepts unsequenced input and produces output in a specified sequence
- The MERGE operation compares records from two or more files and combines them in order
- DFSORT adds the ability to do faster and easier sorting, merging, copying, reporting and analysis of your business information, as well as versatile data handling at the record, fixed position/length or variable position/length field, and bit level.
- DFSORT is designed to optimize the efficiency and speed with which operations are completed through synergy with processor, device, and system features
- A COBOL program will typically act as an intermediary, handling the file inputs and passing them to DFSORT
- After all the input records have been passed to DFSORT, the sorting operation is executed. This operation arranges the entire set of records in the sequence specified by keys.
- Much like SORT, the MERGE statement is also called from a COBOL job
- Executing the MERGE statement begins the merge processing. This operation compares the keys of the records in the input files and passes the sequenced records on to create a merged output file
- According to the vendor documentation, there is no maximum number of keys, which helps support the needs of Big Data Analytics processing
- Some of the advanced options of DFSORT also facilitate parallel sort processing, which fits well with the needs of Big Data Analytics
- Because the workloads of Big Data Analytical jobs can span multiple physical and virtual servers, including the Mainframe, it is good to see that DFSORT has the option to sort records in EBCDIC, ASCII, or another collating sequence. This can give uniform results for massively parallel sorting jobs that run on heterogeneous systems
- The Job Control Language (JCL), which gives Hadoop-like management of large file processing jobs on the Mainframe, has good features for specifying multiple input and output files for SORT and MERGE jobs
- As is evident, this article is not intended as a tutorial for DFSORT. The various performance features can be looked up in the Mainframe manuals, or you can ask the Mainframe gurus in your organization.
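As a rough illustration of the SORT, MERGE, and collating sequence ideas above, here is a minimal sketch in Python rather than in COBOL or DFSORT control statements. The record layout, field positions, and sample values are assumptions invented for this example, and `cp037` is simply one of the EBCDIC code pages that Python ships with.

```python
import heapq

# Hypothetical fixed-width records (an assumption for this sketch):
# columns 0-3 hold a region code, columns 4-9 hold an amount.
records = ["EAST000120", "WEST000045", "EAST000007", "NRTH000300"]

def region(rec):
    return rec[0:4]

def amount(rec):
    return int(rec[4:10])

# SORT: unsequenced input in, output ordered by two keys
# (region ascending, then amount descending).
sorted_records = sorted(records, key=lambda r: (region(r), -amount(r)))

# MERGE: two inputs that are each already in key order are combined
# into a single output that is still in key order.
file_a = ["EAST000007", "NRTH000300"]
file_b = ["EAST000120", "WEST000045"]
merged = list(heapq.merge(file_a, file_b, key=region))

# Collating sequence: the same keys sort differently depending on
# whether their ASCII or EBCDIC byte values are compared.
keys = ["a1", "A1", "1A"]
ascii_order = sorted(keys, key=lambda k: k.encode("ascii"))
ebcdic_order = sorted(keys, key=lambda k: k.encode("cp037"))
```

The last two lines show why a selectable collating sequence matters on heterogeneous systems: ASCII places digits before letters, while EBCDIC places lowercase letters first and digits last, so two machines can disagree on the order of identical keys unless the sequence is fixed.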
- REXX (Restructured eXtended eXecutor) is another programming language that is used in the same ecosystem as COBOL and DFSORT and can contribute considerably to the Big Data Analytical needs of enterprises
- REXX has advantages in string manipulation, dynamic data typing, and storage management, and is generally considered to be very reliable and robust
- One of the most important strengths of REXX that is relevant to Big Data Analytics is its "character string" handling ability.
- There are some useful string manipulation functions, like COPIES(), WORDS(), STRIP(), and TRANSLATE(), which can go a long way toward the Map/Reduce functionality needs of typical Big Data Analytical jobs
- The PARSE instruction is also used frequently in REXX programs. It is able to take strings from a number of sources and break them apart into constituent parts using a fairly natural notation
- PARSE is probably one of the most useful features of REXX in its positioning as a Big Data Analytical tool
- The REXX parse statement divides a source string into constituent parts and assigns these to symbols as directed by the governing parsing template
- REXX, DFSORT, and COBOL programs are interoperable, so that we could call a REXX program from COBOL, and all of these can be tied together with JCL
- Again, this note is not meant as a tutorial for REXX, and a lot of good documentation is available on utilizing the string manipulation features of REXX.
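To give a feel for the string functions and template parsing described above, here is a small sketch in Python. It only loosely mimics the REXX behavior and is not REXX itself. The helper `parse_words` and the sample log line are hypothetical and were made up for this illustration.

```python
def parse_words(source, names):
    """Split source on blanks and bind the pieces to names, template-style.

    The last name receives whatever text is left over. Names with no
    matching word are bound to an empty string.
    """
    parts = source.split(None, len(names) - 1)
    parts += [""] * (len(names) - len(parts))
    return dict(zip(names, parts))

# Hypothetical input line (an assumption for this sketch).
line = "  2011-12-28 ERROR disk quota exceeded on VOL001  "

cleaned = line.strip()              # in the spirit of STRIP(): drop outer blanks
word_count = len(cleaned.split())   # in the spirit of WORDS(): count the words
ruler = "-" * 3                     # in the spirit of COPIES('-', 3)
shouted = cleaned.upper()           # in the spirit of TRANSLATE() used to uppercase

# In the spirit of a PARSE template with three variables.
fields = parse_words(cleaned, ["date", "level", "message"])
```

The template idea is what makes this style attractive for log-like data: the names on the right describe the shape of the record, and the final name soaks up the free-text remainder.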
Summary: There is a strong need for enterprises to adopt Big Data Analytics and start mining the huge sets of unstructured data, ignored so far, to arrive at meaningful business decisions. While newer frameworks like Hadoop and the new breed of analytical databases are going to satisfy this need, enterprises should not be spending their time on picking up tools and languages when it comes to Big Data Analytics.
If there is a significant investment in legacy platforms like COBOL, JCL, REXX, and DFSORT, and the organization's direction is to keep using them, it is only prudent to make the best use of their capabilities when arriving at options for Big Data Analytics.
Big Data Analytics depends mainly on Map/Reduce algorithms. These functions are aimed at crunching large data sets: the input files are read to create key/value pairs, and map functions take these key/value pairs and generate another set of key/value pairs. The reducer function, in turn, depends on sorted key/value pairs, iterates over them, and reduces the output further.
If we look at the way this logic works, there is a heavy need for sorting, merging, string manipulation, and parsing all the way through. Hence the tools mentioned above, DFSORT and REXX along with COBOL, are likely to satisfy the Big Data needs of large enterprises that have already invested in Mainframe compute capacity.
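The Map/Reduce flow described above can be sketched in a few lines of Python to show where the parsing and sorting steps sit. This is a single-process toy, not Hadoop and not a Mainframe job. The function names `map_phase` and `reduce_phase` and the sample input are assumptions chosen for this illustration.

```python
from itertools import groupby
from operator import itemgetter

def map_phase(lines):
    """Parse each input line and emit (key, value) pairs, here (word, 1)."""
    for line in lines:
        for word in line.split():
            yield (word.lower(), 1)

def reduce_phase(pairs):
    """Sort the pairs by key, then collapse each run of equal keys into a total."""
    ordered = sorted(pairs, key=itemgetter(0))  # the sort the reducer relies on
    for key, group in groupby(ordered, key=itemgetter(0)):
        yield (key, sum(value for _, value in group))

# Hypothetical input lines (an assumption for this sketch).
lines = ["sort merge sort", "parse sort merge"]
counts = dict(reduce_phase(map_phase(lines)))
```

The map step is string parsing and the reduce step only works because its input has been sorted by key, which is the pairing of capabilities the article attributes to REXX and DFSORT.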