Applied Information Management
 

What CIOs and CTOs Need to Know About Big Data

Big data systems are being implemented in multiple enterprise sectors, including commerce, science, and society. A few examples illustrate the use of big data in real-world settings, including the stock market, public health, the Sloan Digital Sky Survey (SDSS), and Walmart. These examples share several common traits: (a) they utilize large data stores, (b) they apply domain-appropriate analysis, and (c) they present the analytic results visually.

Massive database systems are driving new business growth.

The CIO and the CTO are being asked to build big data into operations to drive significant business growth. The availability of big data to business has been fueled by a shift toward massive data collection and analysis, which Gray and Szalay (2007) call the Fourth Paradigm. This shift, together with the emergence of new technologies including Hadoop and cloud computing, is causing businesses to realign IT operations to exploit the potential of big data. The CIO and the CTO need to ensure that big data complements and enhances the existing IT infrastructure. For example, big data can be used as an early-warning alert system to monitor existing business processes.

While the technology sector is ahead of the curve in its use of big data, others such as health care, manufacturing, and the public sector are not yet on board. This is problematic, since these organizations face cost pressures, and big data may be incorporated into business operations as a way to reduce costs and streamline processes. Ratner (2011) notes that customer-relationship management is a key cross-business-sector goal and describes how the CIO and CTO can use big data analytics to improve customer retention.

The CIO and the CTO need to prevent expensive IT failures by recognizing and planning for the limitations of big data. Once a big data project is underway, questions arise, such as how tightly to integrate big data into the existing business infrastructure. Executives should avoid placing too much emphasis on technology and not enough on business integration.

Hardware selection is crucial to the success of big data. While distributed computing is a requirement for analyzing big data, networking imposes limitations. When an aggregated data subset in a distributed system needs to be accessed frequently by other data subsets, the resulting throughput delays can cause inefficiencies.
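
As a rough illustration of the networking limitation, the following Python sketch estimates how long it takes to move one data subset between nodes. The function name, the subset size, the read frequency, and the link speeds are hypothetical values chosen for the arithmetic, not figures from the research paper, and the model deliberately ignores protocol overhead and contention.

```python
def transfer_seconds(size_gb: float, link_gbps: float, latency_ms: float = 0.5) -> float:
    """Estimate the time to move one data subset across a network link.

    Simplified model (an assumption): fixed latency plus size divided by
    bandwidth, ignoring protocol overhead and contention.
    """
    size_gigabits = size_gb * 8  # 1 byte = 8 bits
    return latency_ms / 1000 + size_gigabits / link_gbps


# Hypothetical scenario: a 50 GB aggregated subset read 20 times per hour.
SUBSET_GB = 50
READS_PER_HOUR = 20

for label, gbps in [("1 Gbps link", 1.0), ("10 Gbps link", 10.0)]:
    per_read = transfer_seconds(SUBSET_GB, gbps)
    busy_minutes = per_read * READS_PER_HOUR / 60
    print(f"{label}: {per_read:.1f} s per read, {busy_minutes:.1f} min of transfer per hour")
```

Under these assumed numbers, the slower link would need more than an hour of transfer time to serve one hour of demand, which is the kind of throughput delay that makes frequent cross-node access inefficient.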

There are also post-production limitations to what can be done with big data, such as misleading data and data weakness when taken out of a specific context. For example, big data provided by social media companies may be misleading if only a subset of the entire data set is provided.
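
The following Python sketch shows one way a partial data set can mislead. The sentiment scores, the activity counts, and the premise that a provider releases only its most active accounts are all invented for illustration; none of it is data from the research paper.

```python
from statistics import mean

# Hypothetical (account_activity, sentiment_score) pairs for a full data set.
# Scores range from -1 (negative) to +1 (positive); all values are made up.
full_data = [
    (950, 0.8), (900, 0.7), (870, 0.9), (820, 0.6),   # highly active accounts
    (120, -0.4), (90, -0.6), (60, -0.5), (40, -0.3),  # occasional accounts
    (30, -0.7), (10, -0.9),
]


def average_sentiment(rows):
    """Mean sentiment score over (activity, sentiment) rows."""
    return mean(score for _, score in rows)


# Assumed scenario: the provider releases only accounts above an activity cutoff.
ACTIVITY_CUTOFF = 500
provided_subset = [row for row in full_data if row[0] > ACTIVITY_CUTOFF]

print(f"Full data set:   {average_sentiment(full_data):+.2f}")
print(f"Provided subset: {average_sentiment(provided_subset):+.2f}")
```

In this made-up case the provided subset looks clearly positive while the full data set is slightly negative, so a conclusion drawn from the subset alone would point the wrong way.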

The promise of big data is made possible by the integration of key technologies such as sensors, data storage, distributed computing, and analytics. However, big data technology is still emerging, and the CIO and CTO need to be careful not to overinvest in technology that may quickly become obsolete.

The CIO and CTO must rely on a skilled and well-trained IT staff to be able to assess how external forces (regulation, security, and user behavior) affect the potential for big data to become a transformative force in the business. They need to hire, develop, and retain IT professionals, including data scientists.

References

  • Borkar, V., Carey, M., & Li, C. (2012). Inside big data management: Ogres, onions, or parfaits? EDBT. Retrieved April 29, 2012 from Extending Database Technology (EDBT) Association.
  • Boyd, D., & Crawford, K. (2012). Critical questions for big data: Provocations for a cultural, technological, and scholarly phenomenon. Information, Communication and Society, 15(5), 662-679. doi: 10.1080/1369118X.2012.678878.
  • Bryant, R. E., Katz, R. H., & Lazowska, E. D. (2008). Big-data computing: Creating revolutionary breakthroughs in commerce, science and society. Computing Research Association. Retrieved June 11, 2012 from Computing Community Consortium.
  • Gray, J., & Szalay, A. (2007). eScience–A transformed scientific method. Presentation to the Computer Science and Technology Board of the National Research Council, Mountain View, CA. Retrieved from SlideShare.
  • Jacobs, A. (2009). The pathologies of big data. Communications of the ACM, 52(8). Retrieved June 1, 2012 from Association for Computing Machinery.
  • Manovich, L. (2011). Trending: The promises and the challenges of big social data [Online]. In M. Gold (Ed.), Debates in the Digital Humanities. Minneapolis, MN: The University of Minnesota Press. Retrieved June 11, 2012 from Lev Manovich's website.
  • Manyika, J., Chui, M., Brown, B., Bughin, J., Dobbs, R., Roxburgh, C., & Byers, A. (2011). Big data: The next frontier for innovation, competition, and productivity [Kindle edition]. McKinsey Global Institute. Retrieved June 11, 2012 from McKinsey Global Institute.
  • Ratner, B. (2011). Statistical and machine-learning data mining: Techniques for better predictive modeling and analysis of big data (2nd ed.). New York: CRC Press.
  • Schreyögg, G., & Kliesch-Eberl, M. (2007). How dynamic can organizational capabilities be? Towards a dual-process model of capability dynamization. Strategic Management Journal, 28(9), 913-933. doi: 10.1002/smj.613.
  • Shah, S., Horne, A., & Capellá, J. (2012, April). Good data won't guarantee good decisions. Harvard Business Review. Retrieved June 11, 2012 from Harvard Business Review.

AIM alumnus Bob Brehm

Research Paper Author: Robert Brehm, software developer, Zoom Software Solutions—2012 University of Oregon, AIM Program Graduate.

Abstract: The nature of business computing is changing due to the proliferation of massive data sets, referred to as big data, that can be used to produce business analytics (Borkar, Carey, & Li, 2012). This annotated bibliography presents literature published between 2000 and 2012. It provides information to CIOs and CTOs about big data by: (a) identifying business examples, (b) describing the relationship to data-intensive computing, (c) exploring opportunities and limitations, and (d) identifying cost factors.
