By Alan McMahon
January 28, 2013 08:45 AM EST
There are numerous applications for cost-effective data retention. Organizations can gain substantial competitive advantages when they can rely on retained data for improved decision making and trend analysis. Research enterprises can use large-scale data sets to study information more completely than ever before.
Simplified data retention on a massive scale speeds up access to Big Data. Big Data refers to data sets that are too large to analyze and manage using ordinary methods. This data, in both structured and unstructured form, is valuable and comes from sources such as trading systems.
In many cases existing systems cannot process data of this variety and volume. Some organizations store such data in file systems so as not to overburden their databases. This is only a temporary stopgap: because Big Data is growing at an exponential rate, it will not suffice in the long run. Machine-generated data is likely to exceed the processing capability of conventional systems, and the cost of extracting it can be so high that many organizations simply shy away from it.
Today technology is just beginning to address Big Data issues. Many organizations try to apply existing strategies to manage this data effectively. Standard methods from relational database queries to complex analysis tools are being used. Data retention software is also being applied to extract relevant information from Big Data sources.
Big Data retention technology that is scalable and easy to implement is now available. With this technology, Big Data can be accessed online using SQL along with business intelligence software. Such a system consists of storage platforms with specialized software and a massive-scale data repository developed for online data retention. It is designed to process machine-generated data at 40:1 compression ratios while keeping that data available online.
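To make the idea of SQL access concrete, here is a minimal sketch of the kind of aggregate query a business intelligence tool might issue against retained trading data. It uses Python's built-in `sqlite3` module purely as a stand-in, since the article does not describe the actual product's interface. The `trades` table, its columns, and the sample rows are all hypothetical.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database with a small, made-up trades table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE trades (symbol TEXT, trade_date TEXT, shares INTEGER)"
    )
    # Hypothetical sample rows; a real archive would hold billions of records.
    rows = [
        ("AAA", "2013-01-02", 500),
        ("BBB", "2013-01-02", 300),
        ("AAA", "2013-01-03", 200),
        ("CCC", "2013-01-03", 900),
        ("DDD", "2013-01-03", 100),
    ]
    conn.executemany("INSERT INTO trades VALUES (?, ?, ?)", rows)
    conn.commit()
    return conn


def top_symbols_by_volume(conn, limit=3):
    """Return the `limit` symbols with the highest total traded volume."""
    query = """
        SELECT symbol, SUM(shares) AS total_volume
        FROM trades
        GROUP BY symbol
        ORDER BY total_volume DESC
        LIMIT ?
    """
    return conn.execute(query, (limit,)).fetchall()


if __name__ == "__main__":
    conn = build_demo_db()
    for symbol, total_volume in top_symbols_by_volume(conn):
        print(symbol, total_volume)
```

The point of the sketch is only that ordinary SQL (grouping, ordering, limiting) remains the access path; how a given retention product executes such a query over compressed data is outside what this article specifies.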
Organizations that need to process Big Data may benefit from databases specifically designed for this purpose. Such databases are cost-effective and are already in use at numerous organizations internationally. They work in parallel, allowing tens of billions of records to be processed each day, while their retention capability is practically limitless. They can run on content-addressable storage (CAS), direct-attached storage (DAS), and storage area networks (SAN). The benefits of this kind of data storage and retrieval system include a smaller infrastructure, thanks to reduced demand for physical storage, and effective, configurable record management.
One Big Data retention solution has three components. The first is a pair of server-level service managers that share metadata and provide import and query capability. The second is a data archive residing on a cluster services node as well as on storage nodes; it is designed with enough scalability to process billions of objects. The third is shared storage, which can be local direct-attached storage, a network file system, or a comprehensive clustered file system.
This type of system was recently tested on 508 GB of artificially generated stock trading test data modeled after NASDAQ. In the import test, close to 12 billion records were imported within an hour. Compression reduced the data by 476.1 GB, so the archived data was only about 6.3% of its original size. A SQL query was then executed to select the three highest-volume stocks, each with well over 4 million trades per day. This query, run against 11.6 billion records, took approximately 5.5 seconds to execute.
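The compression figures reported for this test can be cross-checked with a few lines of arithmetic. The sketch below takes only the two sizes stated above (508 GB original, 476.1 GB saved) and derives the archive size, its share of the original, and the equivalent compression ratio. The function and key names are illustrative, not part of any product.

```python
def compression_summary(original_gb, saved_gb):
    """Derive archive size, percent of original, and ratio from reported sizes."""
    compressed_gb = original_gb - saved_gb
    return {
        "compressed_gb": compressed_gb,
        "percent_of_original": 100 * compressed_gb / original_gb,
        "ratio": original_gb / compressed_gb,
    }


if __name__ == "__main__":
    # Figures taken from the test described in the article.
    summary = compression_summary(original_gb=508.0, saved_gb=476.1)
    print(f"Archive size:        {summary['compressed_gb']:.1f} GB")
    print(f"Share of original:   {summary['percent_of_original']:.1f}%")
    print(f"Compression ratio:   {summary['ratio']:.1f}:1")
```

With the reported numbers, the archive comes to about 31.9 GB, which matches the stated figure of roughly 6.3% of the original size and corresponds to a compression ratio of about 16:1 for this particular data set.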
Big Data is high-volume, high-velocity and often highly variable as well. Big Data retention solutions can lead to better decision making, new discoveries and even process optimization. Science is a major area that can benefit from Big Data solutions, and meteorology is just one field that can reap rewards from new technological advances in data retention on a massive scale. The ability to do research and analysis with extremely large data sets gives greater understanding to those who model weather, oceanographic conditions, the economy or social trends. With new cost-effective technology available, many more organizations will consider the possibilities of Big Data retention in their enterprise.