Chapter 08
Understanding Big Data and Its Impact on Business
True / False Questions
1. Big data is a collection of large, complex data sets, including structured and
unstructured data, that cannot be analyzed using traditional database methods and
tools.
True False
2. The four common characteristics of big data are variety, veracity, volume, and velocity.
True False
3. Variety includes different forms of structured and unstructured data.
True False
4. Veracity includes the uncertainty of data, including biases, noise, and abnormalities.
True False
5. Volume includes the scale of data.
True False
6. Velocity includes the analysis of streaming data as it travels around the Internet.
True False
7. Velocity includes different forms of structured and unstructured data.
True False
8. Volume includes the uncertainty of data, including biases, noise, and abnormalities.
True False
9. Distributed computing processes and manages algorithms across many machines in
a computing environment.
True False
Copyright © 2018 McGraw-Hill Education. All rights reserved. No reproduction or distribution without the prior written consent of
McGraw-Hill Education.
By IBM Cloud Education | 29 June 2021 | 6 min read
A look into structured and unstructured data, their key differences and which form best meets your business needs.
Not all data is created equal. Some data is structured, but most of it is unstructured. Structured and unstructured data are sourced, collected and scaled in different ways, and each resides in a different type of database. In this article, we’ll take a deep dive into both types so that you can get the most out of your data.
What is structured data?
Structured data, typically categorized as quantitative data, is highly organized and easily decipherable by machine learning algorithms. Developed by IBM in 1974, structured query language (SQL) is the programming language used to manage structured data. By using a relational (SQL) database, business users can quickly input, search and manipulate structured data. Examples of structured data include dates, names, addresses and credit card numbers.
Pros and cons of structured data
The benefits of structured data are tied to ease of use and access, while its liabilities revolve around data inflexibility.
What is unstructured data?
Unstructured data, typically categorized as qualitative data, cannot be processed and analyzed via conventional data tools and methods. Since unstructured data does not have a predefined data model, it is best managed in non-relational (NoSQL) databases. Another way to manage unstructured data is to use data lakes to preserve it in raw form.
The importance of unstructured data is rapidly increasing. Recent projections indicate that unstructured data is over 80% of all enterprise data, while 95% of businesses prioritize unstructured data management. Examples of unstructured data include text, mobile activity, social media posts and Internet of Things (IoT) sensor data.
Pros and cons of unstructured data
The benefits of unstructured data involve advantages in format, speed and storage, while its liabilities revolve around expertise and available resources.
What are the key differences between structured and unstructured data?
While structured (quantitative) data gives a “bird’s-eye view” of customers, unstructured (qualitative) data provides a deeper understanding of customer behavior and intent.
What is semi-structured data?
Semi-structured data (e.g., JSON, CSV, XML) is the “bridge” between structured and unstructured data. It does not have a predefined data model and is more complex than structured data, yet easier to store than unstructured data. Semi-structured data uses “metadata” (e.g., tags and semantic markers) to identify specific data characteristics and scale data into records and preset fields. Metadata ultimately enables semi-structured data to be better cataloged, searched and analyzed than unstructured data.
- Example of metadata usage: An online article displays a headline, a snippet, a featured image, image alt-text, slug, etc., which helps differentiate one piece of web content from similar pieces.
- Example of semi-structured data vs. structured data: A tab-delimited file containing customer data versus a database containing CRM tables.
- Example of semi-structured data vs. unstructured data: A tab-delimited file versus a list of comments from a customer’s Instagram.
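The contrast between a tab-delimited file and a database table can be made concrete with a small sketch. The example below is illustrative only: the customer rows, the column names and the `customers` table are all made up for this sketch, and it uses Python's standard `csv` and `sqlite3` modules rather than any particular IBM product. The file carries its only "metadata" in the header row, while the table enforces a fixed schema that can be queried with SQL.

```python
import csv
import io
import sqlite3

# Hypothetical tab-delimited export; the header row is its only metadata.
TSV_TEXT = (
    "customer_id\tname\tcity\n"
    "1\tAda\tLondon\n"
    "2\tGrace\tNew York\n"
    "3\tLinus\tHelsinki\n"
)


def load_customers(tsv_text):
    """Parse tab-delimited text into a list of dicts keyed by the header row."""
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    return list(reader)


def to_table(rows):
    """Copy the parsed rows into an in-memory relational table with a fixed schema."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE customers ("
        "customer_id INTEGER PRIMARY KEY, name TEXT NOT NULL, city TEXT NOT NULL)"
    )
    # Named placeholders are filled from each row dict.
    conn.executemany(
        "INSERT INTO customers (customer_id, name, city) "
        "VALUES (:customer_id, :name, :city)",
        rows,
    )
    conn.commit()
    return conn


rows = load_customers(TSV_TEXT)
conn = to_table(rows)

# Once the data is structured, a SQL query can filter it directly.
london = conn.execute(
    "SELECT name FROM customers WHERE city = ?", ("London",)
).fetchall()
print(london)
```

In the parsed file every value is still a string and nothing stops a row from having a missing or extra field. The table, by contrast, rejects rows that violate its schema, which reflects the trade-off between flexibility and ease of querying described above.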
The future of data
Recent developments in artificial intelligence (AI) and machine learning (ML) are driving the future wave of data, which is enhancing business intelligence and advancing industrial innovation. In particular, the data formats and models covered in this article are helping business users to do the following:
- Analyze digital communications for compliance: Pattern recognition and email threading analysis software that can search email and chat data for potential noncompliance.
- Track high-volume customer conversations in social media: Text analytics and sentiment analysis that enable monitoring of marketing campaign results and identification of online threats.
- Gain new marketing intelligence: ML analytics tools that can quickly cover massive amounts of data to help businesses analyze customer behavior.
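To give a rough sense of what sentiment analysis over unstructured text involves, here is a deliberately simplified sketch. It assumes a tiny hand-written word list (`POSITIVE` and `NEGATIVE` are invented for illustration) and made-up comments. Production tools typically rely on much larger lexicons or trained ML models, so this should be read as a toy, not as how any specific product works.

```python
import re
from collections import Counter

# Toy word lists, invented for illustration only.
POSITIVE = {"love", "great", "fast", "helpful"}
NEGATIVE = {"slow", "broken", "refund", "terrible"}


def score(comment):
    """Return (positive word count - negative word count) for one comment."""
    words = re.findall(r"[a-z']+", comment.lower())
    return sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)


def summarize(comments):
    """Count how many comments are positive, negative or neutral."""
    labels = Counter()
    for comment in comments:
        s = score(comment)
        if s > 0:
            labels["positive"] += 1
        elif s < 0:
            labels["negative"] += 1
        else:
            labels["neutral"] += 1
    return labels


# Hypothetical free-text comments with no predefined fields.
comments = [
    "Love the new app, support was great and fast!",
    "Checkout is broken again. I want a refund.",
    "Ordered on Tuesday.",
]
summary = summarize(comments)
print(dict(summary))
```

The point of the sketch is that free text has no fields to query, so some structure (here, a score per comment) has to be derived before the data can be counted or tracked over time.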
Furthermore, smart and efficient usage of data formats and models can help you with the following:
- Understand customer needs at a deeper level to better serve them
- Create more focused and targeted marketing campaigns
- Track current metrics and create new ones
- Create better product opportunities and offerings
- Reduce operational costs
Structured and unstructured data and IBM
Whether you are a seasoned data expert or a novice business owner, being able to handle all forms of data is conducive to your success. By leveraging structured, semi-structured and unstructured data options, you can perform optimal data management that will ultimately benefit your mission.
To better understand data storage options for whatever kind of data best serves you, check out IBM Cloud Databases.