At what point during the analysis process does a data analyst use a changelog?

can use and view a changelog in spreadsheets and SQL to achieve similar results. Let's start with the spreadsheet. We can use Sheets' version history, which provides a real-time tracker of all the changes and who made them, from individual cells to the entire worksheet. To find this feature, click the File tab, and then select Version history. In the right panel, choose an earlier version. We can find who edited the file and the changes they made in the column next to their name. To return to the current version, go to the top left and click "Back." If you want to check out changes in a specific cell, you can right-click and select Show Edit History. Also, if you want others to be able to browse a sheet's version history, you'll need to assign permission.

Now let's switch gears and talk about SQL. The way you create and view a changelog with SQL depends on the software program you're using. Some companies even have their own separate software that keeps track of changelogs and important SQL queries. This gets pretty advanced. Essentially, all you have to do is specify exactly what you did and why when you commit a query to the repository as a new and improved query. This allows the company to revert to a previous version if something you've done crashes the system, which has happened to me before. Another option is to just add comments as you go while you're cleaning data in SQL. This will help you construct your changelog after the fact.

For now, we'll check out query history, which tracks all the queries you've run. You can click on any of them to revert to a previous version of your query or to bring up an older version to find what you've changed. Here's what we've got. I'm in the Query history tab. Listed on the bottom right are all the queries that were run, by date and time. You can click on this icon to the right of each individual query to bring it up in the Query editor.

Changelogs like these are a great way to keep yourself on track. They also let your team get real-time updates when they want them. But there's another way to keep the communication flowing, and that's reporting. Stick around, and you'll learn some easy ways to share your documentation and maybe impress your stakeholders in the process. See you in the next video.

Great, you're back. Let's set the stage. The crime is dirty data. We've gathered the evidence. It's been cleaned, verified, and cleaned again. Now it's time to present our evidence. We'll retrace the steps and present our case to our peers. As we discussed earlier, data cleaning, verifying, and reporting is a lot like a crime drama. Now it's our day in court. Just like a forensic scientist testifies on the stand about the evidence, data analysts are counted on to present their findings after a data cleaning effort. Earlier, we learned how to document and track every step of the data cleaning process, which means we have solid information to pull from.

As a quick refresher, documentation is the process of tracking changes, additions, deletions, and errors involved in a data cleaning effort. Changelogs are a good example of this. Since a changelog is staged chronologically, it provides a real-time account of every modification. Documenting will be a huge time saver for you as a future data analyst. It's basically a cheat sheet you can refer to if you're working with a similar dataset or need to address similar errors. While your team can view

Coursera Google Data Analytics Professional Certificate Course 5 – Analyze Data to Answer Questions quiz answers to all weekly questions (weeks 1 – 4):

  • Week 1: Organizing data to begin analysis
  • Week 2: Formatting and adjusting data
  • Week 3: Aggregating data for analysis
  • Week 4: Performing data calculations

You may also be interested in Google Data Analytics Professional Certificate Course 1: Foundations – Cliffs Notes.


Week 1: Organizing data to begin analysis

Organizing data makes the data easier to use in your analysis. In this part of the course, you’ll learn the importance of organizing your data through sorting and filtering. You’ll explore these processes in both spreadsheets and SQL as you continue to prepare your data for analysis.

Learning Objectives

  • Describe what is involved in the data analysis process with reference to goals and key tasks
  • Discuss the importance of organizing data before analysis with references to sorts and filters
  • Describe sorting as it relates to data in a spreadsheet or database with reference to functionality and benefits
  • Demonstrate an understanding of the steps involved in sorting and filtering data through the use of SQL queries

Answers to week 1 quiz questions

L2 Data analysis basics

Question 1

You are creating a spreadsheet that contains data about a volunteer theater production. You ask the volunteers which tasks they have already completed, then add that data to the spreadsheet. Next, you will use the information provided by the volunteers to figure out which tasks still need to be done. This is an example of which phase of analysis?

  • Formatting and adjusting data
  • Getting input from others
  • Organizing data
  • Transforming data

This is an example of getting input from others. Getting input means soliciting information from other sources to inform your decisions. Transforming data involves identifying the trends and patterns between the data.

Question 2

You are working with three datasets about voter turnout in your county. First, you identify relationships and patterns between the datasets. Then, you use formulas and functions to make calculations based on your data. This is an example of which phase of analysis?

  • Organizing data
  • Getting input from others
  • Transforming data
  • Formatting and adjusting data

This is an example of transforming data, which involves identifying relationships and patterns between the datasets and making calculations based on the data.

Question 3

You are working with a dataset from a local community college. You sort the students alphabetically by last name. This is an example of which phase of analysis?

  • Format and adjust data
  • Transform data
  • Get input from others
  • Organize data

Sorting a list of students alphabetically is an example of formatting and adjusting data. This is a step analysts take to rearrange the data to make it easier to work with.

L3 Organize data for analysis

Question 1

Fill in the blank: A data analyst uses _ to decide which data is relevant to their analysis and which data types and variables are appropriate.

  • database relationships
  • database references
  • database organization
  • database normalization

Database organization enables analysts to make decisions about which data is relevant to pull for a specific analysis. Database references let them access objects from other databases.

Question 2

You are working with a dataset that lists student athletes at a school. The Sport column designates the sport each athlete plays. Which of the following SQL queries would return only the athletes who play volleyball?

  • WHERE Sport = “Volleyball”
  • SPORT(“Volleyball”)
  • WHERE Sport = Volleyball
  • SPORT = “Volleyball”

The clause WHERE Sport = “Volleyball” will return only athletes who play volleyball. The general pattern is WHERE column = “value”.
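To see the WHERE clause in action, here is a minimal sketch that runs the filter against an in-memory SQLite database using Python's built-in sqlite3 module. The table name athletes, the Name column, and the sample rows are assumptions made up for illustration; only the Sport column comes from the question. SQLite prefers single quotes around text values, so the sketch uses 'Volleyball'.

```python
import sqlite3

# Hypothetical table and rows, invented purely for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE athletes (Name TEXT, Sport TEXT)")
conn.executemany(
    "INSERT INTO athletes VALUES (?, ?)",
    [("Ana", "Volleyball"), ("Ben", "Soccer"), ("Chi", "Volleyball")],
)

# The WHERE clause keeps only rows whose Sport value equals 'Volleyball'.
volleyball_players = [
    row[0]
    for row in conn.execute(
        "SELECT Name FROM athletes WHERE Sport = 'Volleyball' ORDER BY Name"
    )
]
print(volleyball_players)
```

Rows with any other Sport value are left out of the result.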

L4 Sort data in spreadsheets

Question 1

Which spreadsheet menu function is used to sort all data in a spreadsheet by the ranking of a specific sorted column?

  • Sort Data
  • Sort By Rank
  • Sort Range
  • Sort Sheet

Sort Sheet is used to sort all data in a spreadsheet by the ranking of a specific sorted column.

Question 2

In spreadsheets, data analysts can sort a range from the Data tab in the menu or by typing a function directly into an empty cell.

  • True
  • False

Sorting a range and sorting a sheet can both be done from the menu and written as a function. Analysts can work from the Data tab in the menu or type a function directly into an empty cell.

Question 3

An analyst uses =SORT to sort spreadsheet data in descending order. What do they type at the end of their sort function?

  • FALSE
  • DESCEND
  • LEFT
  • REVERSE

To sort a spreadsheet in descending order, the analyst types FALSE at the end of their sort function.

L5 Sort data in SQL

Question 1

A data analyst is writing a SQL query to sort data in a column in ascending order. The column is called column_title. What is the correct syntax for their query?

  • ORDER column_title ASC
  • ORDER BY column_title
  • ORDER BY column_title DESC
  • ORDER column_title

An ORDER BY statement sorts in ascending order by default. ORDER BY column_title is the syntax for this query.

Question 2

You want to sort a database table of newly released young adult novels. Which statement sorts action novels by word count in descending order?

  • WHERE genre = “Action” ORDER BY word_count DESC
  • WHERE word_count DESC ORDER BY genre = “Action”
  • WHERE genre = “Action” WHERE word_count DESC
  • WHERE genre = “Action” SORT BY word_count DESC

The correct statement is WHERE genre = "Action" ORDER BY word_count DESC. The WHERE clause filters for action novels, and the ORDER BY clause tells the database how to organize the data it returns.
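The filter-then-sort pattern can be tried out with a small sketch like the one below. It assumes a hypothetical novels table with made-up titles and word counts, and runs the query through Python's sqlite3 module.

```python
import sqlite3

# Hypothetical table and sample rows for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE novels (title TEXT, genre TEXT, word_count INTEGER)")
conn.executemany(
    "INSERT INTO novels VALUES (?, ?, ?)",
    [
        ("Book A", "Action", 60000),
        ("Book B", "Romance", 90000),
        ("Book C", "Action", 85000),
    ],
)

# WHERE filters to action novels; ORDER BY ... DESC puts the longest first.
action_by_length = conn.execute(
    "SELECT title, word_count FROM novels "
    "WHERE genre = 'Action' ORDER BY word_count DESC"
).fetchall()
print(action_by_length)
```

Note that WHERE has to come before ORDER BY in the statement, which is why the other answer options fail.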

Weekly challenge 1

Question 1

In the data analysis process, which of the following refers to a phase of analysis? Select all that apply.

  • Visualize the data
  • Organize data into understandable sections
  • Get input from others
  • Format data using sorts and filters

There are four phases of analysis: organize data, format and adjust data, get input from others, and transform data by observing relationships between data points and making calculations.

Question 2

During which phase of analysis can you find a correlation between two variables?

  • Format and adjust data
  • Get input from others
  • Organize data
  • Transform data

Finding a correlation between two variables occurs while transforming data.

Question 3

You are performing a calculation during your analysis of a dataset. Which phase of analysis are you in?

  • Transform data
  • Get input from others
  • Organize data
  • Format and adjust data

You are in the transform data phase of analysis. Performing calculations is part of transforming data, along with identifying relationships and patterns between data.

Question 4

Typically, a data analyst uses filters when they want to expand the amount of data they are working with.

  • True
  • False

Typically, a data analyst uses filters when they want to narrow down the amount of data they are working with.

Question 5

A data analyst is sorting spreadsheet data. They want to make sure that, when they rearrange the data, data across rows is kept together. What technique should they use to sort the data?

  • Sort Column
  • Sort Sheet
  • Sort Together
  • Sort Rows

Sort sheet sorts all of the data in a spreadsheet by a specific sorted column. Data across rows is kept together during the sort.

Question 6

A data analyst uses a function to sort a spreadsheet range between cells H1 and K65. They sort in ascending order by the first column, Column H. What is the syntax they are using?

  • =SORT(H1:K65, 1, TRUE)
  • =SORT(H1:K65, A, FALSE)
  • =SORT(H1:K65, A, TRUE)
  • =SORT(H1:K65, 1, FALSE)

The syntax is =SORT(H1:K65, 1, TRUE). The first part of the function sorts the data in the specified range. The 1 represents the first column. And a TRUE statement sorts in ascending order.
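For readers who think in code, the behavior of =SORT(range, column, is_ascending) can be approximated in Python. This is only a rough sketch: the helper name sort_range and the sample rows are assumptions, not part of any spreadsheet API.

```python
from operator import itemgetter


def sort_range(rows, sort_column, is_ascending):
    """Sort whole rows by a 1-based column index, like =SORT(range, column, TRUE/FALSE)."""
    return sorted(rows, key=itemgetter(sort_column - 1), reverse=not is_ascending)


# Hypothetical two-column range; each inner list is one spreadsheet row.
data = [["pear", 3], ["apple", 7], ["mango", 5]]

ascending = sort_range(data, 1, True)    # like =SORT(range, 1, TRUE)
descending = sort_range(data, 1, False)  # like =SORT(range, 1, FALSE)
print(ascending)
print(descending)
```

Each row moves as a unit, so the values in the second column stay attached to their labels.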

Question 7

A data analyst is querying a database that contains data about dental equipment inventory. They are only interested in data related to cleaning products. Which of the following sections of an SQL statement would return the correct result?

  • WHERE “Cleaning”
  • WHERE product = “Cleaning”
  • ORDER BY “Cleaning”
  • ORDER BY product = “Cleaning”

The correct section is WHERE Product = "Cleaning". A WHERE statement in SQL includes the name of the column, an equals sign, and the value(s) in the column to include.

Question 8

A data analyst would write the following section of a SQL query to sort Golden Retrievers, ordered by birth date, in ascending order:

WHERE Breed = "Golden Retriever" ORDER BY Birth_date
  • True
  • False

The query will return Golden Retrievers, ordered by birth date, in ascending order.

Week 2: Formatting and adjusting data

As you move closer to analyzing your data, you’ll want to have the data formatted and ready to go. In this part of the course, you’ll learn all about converting and formatting data, including how SQL queries can help you combine data. You’ll also find out the value of feedback and support from your colleagues and how it can lead to new learning that you can apply to your work.

Learning Objectives

  • Demonstrate an understanding of what is involved in the conversion and formatting of data
  • Demonstrate an understanding of the use of spreadsheets and SQL queries to combine multiple pieces of data
  • Discuss the importance of seeking feedback and support from others

Answers to week 2 quiz questions

L2 Convert and format

Question 1

A spreadsheet cell contains the coldest temperature ever recorded in New Zealand: -22 °Celsius. What function could be used to display that temperature in Fahrenheit?

  • =CONVERT(-22, C, F)
  • =CONVERT(-22, F, C)
  • =CONVERT(-22, “C”, “F”)
  • =CONVERT(-22, “F”, “C”)

=CONVERT(-22, “C”, “F”) will display -22 °C in Fahrenheit.
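The arithmetic behind this conversion is F = C × 9/5 + 32. A quick sketch in Python (the function name celsius_to_fahrenheit is made up for illustration) shows that -22 °C works out to about -7.6 °F:

```python
def celsius_to_fahrenheit(celsius):
    """Convert a Celsius temperature to Fahrenheit: F = C * 9/5 + 32."""
    return celsius * 9 / 5 + 32


coldest_f = celsius_to_fahrenheit(-22)
print(round(coldest_f, 1))
```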

Question 2

A data analyst wants to ensure spreadsheet tools continue to run correctly, even if someone enters the wrong data by mistake. Which data-validation menu option should they select?

  • Deny Help Text
  • Reject Invalid Inputs
  • Forbid Entry
  • Remove Validation

To ensure spreadsheet tools continue to run correctly, even if someone enters the wrong data by mistake, select Reject Invalid Inputs.

Question 3

A data analyst clicks on the Format Cells If drop-down menu and selects the option Text Is Exactly November. This changes the color of all the cells that contain the word November. What spreadsheet tool is the analyst using?

  • Conditional formatting
  • CONVERT
  • Filtering
  • Data validation

The data analyst is using conditional formatting. Conditional formatting is a spreadsheet tool that changes how cells appear when values meet specific conditions.

L3 Combine multiple data sets

Question 1

You are working on a project related to rental properties in the United States. You write the following query:

SELECT * FROM rentals.us_housing_units

How can you instruct the database to retrieve only the first 10 results?

  • RETRIEVE 10
  • LIMIT 10
  • RETURN 10
  • FIRST 10

To display only the first 10 results, you would type, LIMIT 10.
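Here is a small sketch of LIMIT at work, using Python's sqlite3 module. The unit_id column and the 25 sample rows are assumptions for illustration, and the table is named us_housing_units without the rentals dataset prefix because the sketch uses a single in-memory database.

```python
import sqlite3

# Hypothetical table with 25 made-up rows.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE us_housing_units (unit_id INTEGER)")
conn.executemany(
    "INSERT INTO us_housing_units VALUES (?)",
    [(i,) for i in range(1, 26)],
)

# LIMIT 10 caps the result at 10 rows, no matter how many rows the table holds.
first_ten = conn.execute("SELECT * FROM us_housing_units LIMIT 10").fetchall()
print(len(first_ten))
```

Without an ORDER BY clause, the database decides which 10 rows come back, so add one if the choice of rows matters.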

Question 2

What function can be used to confirm that spreadsheet cell B8 contains exactly 20 characters?

  • LEN = B8,20
  • =LEN(20)
  • LEN = B8
  • =LEN(B8)

The function =LEN(B8) will display the number of characters in cell B8. The LEN function returns the length of a string of text by counting the number of characters it contains.

Weekly challenge 2

Question 1

An analyst notes that the “160” in cell A9 is formatted as text, but it should be Australian dollars. What spreadsheet tool can help them select the right format?

  • CURRENCY
  • Format as Currency
  • EXCHANGE
  • Format as Dollar

The Format as Currency tool can be used to change the text to Australian dollars.

Question 2

You are creating a spreadsheet to help you with your job search. Every time you find an interesting job, you add it to the spreadsheet. Then, you want to indicate two possible options: Need to Apply or Applied. What spreadsheet tool will save you time by enabling you to create a dropdown list with Need to Apply and Applied as the possible options?

  • Data validation
  • FIND
  • Conditional formatting
  • Pop-up menus

Data validation can be used to add drop-down lists with predetermined options for Need to Apply and Applied.

Question 3

You are using a spreadsheet to keep track of your newspaper subscriptions. You add color to indicate if a subscription is current or has expired. Which spreadsheet tool changes how cells appear when values meet each expiration date?

  • Add color
  • CONVERT
  • Data validation
  • Conditional formatting

You are using conditional formatting. Conditional formatting changes how cells appear when values meet specific conditions.

Question 4

A data analyst wants to write a SQL query to combine data from two columns and into a new column. What function can they use?

  • CONCAT
  • JOIN
  • COMBINE
  • GROUP

They can use CONCAT, which joins multiple text strings from multiple sources.

Question 5

You are querying a database of ice cream flavors to determine which stores are selling the most mint chip. For your project, you only need the first 80 records. What clause should you add to the following SQL query?

SELECT flavors FROM ice_cream_table WHERE flavor = "mint_chip"
  • LIMIT = 80
  • LIMIT_80
  • LIMIT,80
  • LIMIT 80

To return only the first 80 records, type LIMIT 80.

Question 6

A data analyst is working with a spreadsheet that has very long text strings. They use a function to count the number of characters in cell G11. What is the correct syntax?

  • =LEN(G,11)
  • =LEN(G11)
  • =LEN(G:G11)
  • =LEN(“G11”)

The correct syntax is =LEN(G11). The LEN function counts the number of characters in a text string and the parameter for the function is the cell reference.

Question 7

Spreadsheet cell L6 contains the text string “Function.” To return the substring “Fun,” what is the correct syntax?

  • =RIGHT(3,L6)
  • =LEFT(L6, 3)
  • =RIGHT(L6, 3)
  • =LEFT(3,L6)

The function =LEFT(L6, 3) will return “Fun.” The LEFT function returns a set number of characters from the left side of a text string. In this case, it returns the first three characters of the string in L6.
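The LEFT, RIGHT, and LEN functions map closely onto string slicing in Python. The helper names left and right below are hypothetical stand-ins written for this sketch, not real spreadsheet or library functions.

```python
def left(text, num_chars):
    """Return the first num_chars characters, like =LEFT(cell, num_chars)."""
    return text[:num_chars]


def right(text, num_chars):
    """Return the last num_chars characters, like =RIGHT(cell, num_chars)."""
    return text[-num_chars:] if num_chars > 0 else ""


cell_l6 = "Function"
print(left(cell_l6, 3))   # counts from the start of the string
print(right(cell_l6, 3))  # counts from the end of the string
print(len(cell_l6))       # the Python counterpart of =LEN(L6)
```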

Question 8

Fill in the blank: When working with a database, data analysts can use the _ function to locate specific characters in a string.

  • IDENTIFY
  • WHERE
  • FIND
  • FROM

When working with a database, data analysts can use the FIND function to locate specific characters in a string.

Week 3: Aggregating data for analysis

As part of your analysis, you’ll often have to combine data in order to gain insights and complete business objectives. In this part of the course, you’ll explore the functions, procedures, and syntax involved in combining, or aggregating, data. You’ll learn how to do this from multiple cells in spreadsheets and from multiple database tables using SQL queries.

Learning Objectives

  • Demonstrate an understanding of functions and procedures that may be used to combine data from multiple cells in a spreadsheet
  • Demonstrate an understanding of functions and syntax to create SQL queries for combining data from multiple database tables
  • Use VLOOKUP to query data, trim data, convert text data to numeric data, and create a summary table from queried information

Answers to week 3 quiz questions

L2 Avoid common VLOOKUP pitfalls

Question 1

To change a text string in spreadsheet cell F8 to a numerical value, what is the correct function?

  • =VALUE(F8)
  • =MATCH(F8)
  • =NUM(F8)
  • =CONVERT(F8)

To change the text string in spreadsheet cell F8 to a numerical value, the correct syntax is =VALUE(F8). Within the parentheses, the VALUE syntax must include a reference to the specific cell whose value the function should convert.

Question 2

What is the purpose of an absolute reference within a function, such as “$C$3”?

  • To remove unnecessary instructions from a formula or function
  • To lock rows and columns so they won’t change when a function is copied
  • To represent missing values in a formula or function
  • To make formulas and functions unconditional

The purpose of an absolute reference is to lock the reference to a row or column so values won’t change when a function is copied.

Question 3

In VLOOKUP, TRUE tells the function to search for exact matches, and FALSE tells the function to look for approximate matches.

  • True
  • False

In VLOOKUP, TRUE tells the function to search for approximate matches, and FALSE tells the function to look for exact matches.

Question 4

The following is a selection from a spreadsheet:

   | A             | B                             | C
1  | Country       | Population in 2020 (millions) | Growth in population 2000-2020
2  | China         | 1,439,323,776                 | 13.4%
3  | India         | 1,380,004,385                 | 37.1%
4  | United States | 331,002,651                   | 17.3%
5  | Indonesia     | 273,523,615                   | 27.7%
6  | Pakistan      | 220,892,340                   | 44.9%
7  | Brazil        | 212,559,417                   | 21.9%
8  | Nigeria       | 206,139,589                   | 66.3%
9  | Bangladesh    | 164,689,383                   | 27.9%
10 | Russia        | 145,934,462                   | -0.8%

To search for the population of Nigeria, what is the correct VLOOKUP syntax?

  • =VLOOKUP(“Nigeria”, A2:C10, 2, false)
  • =VLOOKUP(Nigeria, A2:C10, 3, false)
  • =VLOOKUP(Nigeria, A2:C10, 3, true)
  • =VLOOKUP(Nigeria, A2,C10, 2, true)

To search for the population of Nigeria, the syntax is =VLOOKUP(“Nigeria”, A2:C10, 2, false). “Nigeria” is the reference. A2:C10 is the table array. The 2 indicates the position of the column from which the value should be returned. And the word false instructs the function to return an exact match.
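An exact-match VLOOKUP scans the first column of the table array for the search key and returns the value from the requested column of the matching row. The sketch below mimics that logic in Python. The function name vlookup_exact is hypothetical, and the table holds only three rows from the spreadsheet above.

```python
def vlookup_exact(search_key, table, index):
    """Mimic =VLOOKUP(search_key, table, index, FALSE) with a 1-based column index."""
    for row in table:
        if row[0] == search_key:  # only the first column is searched
            return row[index - 1]
    return None  # a spreadsheet would show #N/A here


# A few rows copied from the spreadsheet selection in the question.
table = [
    ["Pakistan", 220892340, "44.9%"],
    ["Brazil", 212559417, "21.9%"],
    ["Nigeria", 206139589, "66.3%"],
]

nigeria_population = vlookup_exact("Nigeria", table, 2)
nigeria_growth = vlookup_exact("Nigeria", table, 3)
print(nigeria_population, nigeria_growth)
```

Because only the first column is searched, the value you want must sit to the right of the search key.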

Question 5

The following is a selection from a spreadsheet:

  | A              | B                        | C          | D
1 | Location       | Building                 | Height     | Year completed
2 | Dubai          | Burj Khalifa             | 2,717 feet | 2010
3 | Shanghai       | Shanghai Tower           | 2,073 feet | 2015
4 | Mecca          | Makkah Royal Clock Tower | 1,972 feet | 2012
5 | Shenzhen       | Ping An Finance Center   | 1,965 feet | 2017
6 | St. Petersburg | Lakhta Center            | 1,516 feet | 2019
7 | Chicago        | Willis Tower             | 1,451 feet | 1974

To search for the height of the building in Mecca, what is the correct VLOOKUP syntax?

  • =VLOOKUP(Mecca, A2:D7, 2, false)
  • =VLOOKUP(Mecca, A2:D7, 2, true)
  • =VLOOKUP(Mecca, A2,D7, 3, true)
  • =VLOOKUP(“Mecca”, A2:D7, 3, false)

To search for the height of the building in Mecca, the correct syntax is =VLOOKUP(“Mecca”, A2:D7, 3, false). “Mecca” is the reference. A2:D7 is the table array. The 3 indicates the number of the column from which the value should be returned. And the word false instructs the function to return an exact match.

L3 Use JOINS to aggregate data in SQL

Question 1

A data analyst wants to retrieve only records from a database that have matching values in two different tables. Which JOIN function should they use?

  • INNER JOIN
  • RIGHT JOIN
  • LEFT JOIN
  • OUTER JOIN

To retrieve only records from a database that have matching values in two different tables, the analyst should use INNER JOIN.
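A short sketch can show what "matching values in two different tables" means in practice. The customers and orders tables and their rows are assumptions invented for this example, run through Python's sqlite3 module.

```python
import sqlite3

# Hypothetical tables: customer 2 has no order, and order 4 has no customer.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (customer_id INTEGER, name TEXT)")
conn.execute("CREATE TABLE orders (customer_id INTEGER, item TEXT)")
conn.executemany(
    "INSERT INTO customers VALUES (?, ?)",
    [(1, "Ana"), (2, "Ben"), (3, "Chi")],
)
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [(1, "laptop"), (3, "desk"), (4, "lamp")],
)

# INNER JOIN keeps only rows whose customer_id appears in both tables.
matched = conn.execute(
    "SELECT c.name, o.item FROM customers AS c "
    "INNER JOIN orders AS o ON c.customer_id = o.customer_id "
    "ORDER BY c.name"
).fetchall()
print(matched)
```

Ben and the lamp order both drop out because neither has a partner row in the other table.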

Question 2

You are writing a SQL query to instruct a database to count distinct values in a specified range. Which function should you include in your query?

  • COUNT DISTINCT
  • COUNT RANGE
  • COUNT VALUES
  • COUNT

To tell a database to count distinct values in a specified range, the analyst should use COUNT DISTINCT in their query.
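In most SQL dialects, COUNT DISTINCT is written as COUNT(DISTINCT column_name). The sketch below compares it with a plain COUNT on a hypothetical orders table; the table, column, and rows are assumptions for illustration.

```python
import sqlite3

# Hypothetical orders table in which some customers appear more than once.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (customer_id INTEGER)")
conn.executemany(
    "INSERT INTO orders VALUES (?)",
    [(1,), (1,), (2,), (3,), (3,), (3,)],
)

# COUNT(*) counts every row; COUNT(DISTINCT ...) counts each value only once.
total_rows, distinct_customers = conn.execute(
    "SELECT COUNT(*), COUNT(DISTINCT customer_id) FROM orders"
).fetchone()
print(total_rows, distinct_customers)
```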

Question 3

A data analyst wants to temporarily name a column in their query to make it easier to read and write. What technique should they use?

  • Aliasing
  • Tagging
  • Filtering
  • Naming

To temporarily name a column in a query to make it easier to read and write, the analyst should use aliasing.

L4 Work with subqueries

Question 1

Which of the following queries contain subqueries? Select all that apply.

  • SELECT call FROM recordings ORDER BY call.employee_id, call.start_time
  • SELECT first_name, last_name FROM customers WHERE …
  • SELECT employee_id FROM employees WHERE …
  • SELECT price FROM sales WHERE price …

The three queries with statements in parentheses contain subqueries.

Question 2

Fill in the blank: A data analyst uses aliasing to make it easier to read and write a query. Aliasing involves temporarily _ a table or column in a query.

  • hiding
  • removing
  • naming
  • copying

Aliasing involves temporarily naming a table or column in a query.
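As a rough illustration, the sketch below aliases both a table and a column with AS. The employees table and its contents are assumptions; the alias exists only for the duration of the query and does not rename anything in the database.

```python
import sqlite3

# Hypothetical employees table for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employees (employee_id INTEGER, name TEXT)")
conn.execute("INSERT INTO employees VALUES (1, 'Ana')")

# "AS e" aliases the table; "AS employee_name" aliases the output column.
cursor = conn.execute("SELECT e.name AS employee_name FROM employees AS e")
column_label = cursor.description[0][0]  # column name reported for the result
rows = cursor.fetchall()
print(column_label, rows)

# The underlying column keeps its original name after the query finishes.
stored_columns = [info[1] for info in conn.execute("PRAGMA table_info(employees)")]
print(stored_columns)
```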

Weekly challenge 3

Question 1

Fill in the blank: Data aggregation involves creating a _ collection of data that originally came from multiple sources.

  • modified
  • summarized
  • localized
  • expanded

Data aggregation involves creating a summarized collection of data from multiple sources.

Question 2

A data analyst uses the SUM function to add together numbers from a spreadsheet. However, after getting a zero result, they realize the numbers are actually text. What function can they use to convert the text to a numeric value?

  • FIGURE
  • DIGIT
  • VALUE
  • CONVERT

The analyst can use the VALUE function to convert the text that represents a number to a numeric value.

Question 3

When using VLOOKUP, there are some common limitations that data analysts should be aware of. One of these limitations is that VLOOKUP can only return a value from the data to the left of the matched value.

  • True
  • False

One limitation of VLOOKUP is that it can only return a value from the data to the right of the matched value.

Question 4

Fill in the blank: When writing a function, a data analyst wraps a table array in dollar signs. This is an _ , which is used to lock the array so rows and columns don’t change if the function is copied.

  • arbitrary reference
  • accurate reference
  • absolute reference
  • authentic reference

Wrapping a table array in dollar signs creates an absolute reference, which locks the array so rows and columns don’t change if the function is copied.

Question 5

The following is a selection from a spreadsheet:

   | A             | B                             | C
1  | Country       | Population in 2020 (millions) | Growth in population 2000-2020
2  | China         | 1,439,323,776                 | 13.4%
3  | India         | 1,380,004,385                 | 37.1%
4  | United States | 331,002,651                   | 17.3%
5  | Indonesia     | 273,523,615                   | 27.7%
6  | Pakistan      | 220,892,340                   | 44.9%
7  | Brazil        | 212,559,417                   | 21.9%
8  | Nigeria       | 206,139,589                   | 66.3%
9  | Bangladesh    | 164,689,383                   | 27.9%
10 | Russia        | 145,934,462                   | -0.8%

To search for the population of Pakistan, what is the correct VLOOKUP syntax?

  • =VLOOKUP(Pakistan, A2:B10, 3, false)
  • =VLOOKUP(“Pakistan”, A2:B10, 3, false)
  • =VLOOKUP(Pakistan, A2*B10, 2, false)
  • =VLOOKUP(“Pakistan”, A2:B10, 2, false)

To search for the population of Pakistan, the syntax is =VLOOKUP(“Pakistan”, A2:B10, 2, false). “Pakistan” is the reference. A2:B10 is the table array. The 2 indicates the number of the column from which the value should be returned. And the word false instructs the function to return an exact match.

Question 6

When creating a SQL query, which JOIN clause returns all matching records in two or more database tables?

  • OUTER
  • RIGHT
  • INNER
  • LEFT

The INNER JOIN clause returns all matching records in two or more database tables.

Question 7

A data analyst writes a query that asks a database to return only distinct values in a specified range, rather than including repeating values. Which function do they use?

  • RETURN
  • COUNT DISTINCT
  • RETURN VALUES
  • COUNT

When writing SQL queries, an analyst can use the COUNT DISTINCT function to return only distinct values in a range.

Question 8

Which of the following terms describe a subquery? Select all that apply.

  • Inner select
  • Nested query
  • Inner query
  • Small query

A subquery can also be called an inner query, inner select, or nested query.
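To make the idea concrete, here is a minimal sketch of a query with a subquery in parentheses. The sales table and its prices are assumptions for illustration: the inner query computes the average price, and the outer query returns the prices above it.

```python
import sqlite3

# Hypothetical sales table with made-up prices.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (price REAL)")
conn.executemany("INSERT INTO sales VALUES (?)", [(10.0,), (20.0,), (60.0,)])

# The inner query (in parentheses) runs first and yields the average, 30.0.
# The outer query then keeps only prices greater than that value.
above_average = [
    row[0]
    for row in conn.execute(
        "SELECT price FROM sales WHERE price > (SELECT AVG(price) FROM sales)"
    )
]
print(above_average)
```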

Week 4: Performing data calculations

Calculations are one of the more common tasks that data analysts complete during analysis. In this part of the course, you’ll explore formulas, functions, and pivot tables in spreadsheets and queries in SQL, all of which will help with your calculations. You’ll also learn about the benefits of using SQL to manage temporary tables.

Learning Objectives

  • Describe the use of functions to conduct basic calculations on data in spreadsheets
  • Discuss the use of pivot tables to conduct calculations on data in spreadsheets
  • Demonstrate an understanding of the use of SQL queries to complete calculations
  • Explain the importance of the data-validation process for ensuring accuracy and consistency in analysis
  • Discuss the use of SQL queries to manage temporary tables
  • Reflect on how conditional statements can be used to create complex queries and functions
  • Generate multiple points of summary based on a wide variety of conditions using COUNTIF, SUMIF, MAXIF, and AVERAGEIF

Answers to week 4 quiz questions

L2 Data calculations

Question 1

What is the correct spreadsheet formula for multiplying 50 and 233?

  • 50×233
  • =50×233
  • =50*233
  • 50*233

=50*233 is the correct formula for multiplying 50 and 233. Formulas begin with an equal sign (=). This is followed by the values to be computed. An asterisk (*) is the multiplication operator in spreadsheets.

Question 2

The following is a selection of a spreadsheet:

   | A              | B
1  | Expense        | Amount
2  | Rent           | $680.00
3  | Healthcare     | $101.00
4  | Groceries      | $185.00
5  | Clothing       | $41.00
6  | Transportation | $22.00
7  | Mobile phone   | $48.00
8  | Dining out     | $79.00
9  | Car insurance  | $65.00
10 | Dog walker     | $40.00
11 | Gym membership | $19.00
12 | Manicure       | $23.00

You are trying to determine what percentage of your monthly income is spent on big-ticket items, such as rent and groceries. To add together only the values from Column B that cost more than $150, what is the correct syntax?

  • =SUMIF(B2:B12,"<150")
  • =SUMIF(B2:B12,>150)
  • =SUMIF(B2:B12,<150)
  • =SUMIF(B2:B12,">150")

To add together only the values from Column B that cost more than $150, the correct syntax is =SUMIF(B2:B12,">150"). B2:B12 is the range, and more than 150 (>150) is the criterion.
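To check the result by hand, the same conditional sum can be sketched in Python using the amounts from the spreadsheet above. The variable names are made up for illustration. Only Rent ($680) and Groceries ($185) exceed $150, so the total is $865.

```python
# Amounts from cells B2 through B12 of the expense spreadsheet.
amounts = [680.00, 101.00, 185.00, 41.00, 22.00, 48.00,
           79.00, 65.00, 40.00, 19.00, 23.00]

# Equivalent of =SUMIF(B2:B12, ">150"): add only values greater than 150.
big_ticket_total = sum(amount for amount in amounts if amount > 150)
print(big_ticket_total)
```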

Question 3

A data analyst is working with a spreadsheet from a cosmetics company.

You may click the link to create a copy of the dataset: Cosmetics Inc.

Which of the following is an example of an array in this spreadsheet?

  • All cells with number values
  • All cells with values greater than 100
  • The values in cells B2 through B31
  • Cells D7 and D14

The values in cells B2 through B31 together are an example of an array. An array is a collection of values in spreadsheet cells.

L3 Pivot tables

Question 1

The following is a sample pivot table from a furniture company:

product     | SUM of purchase_price
bed         | $799.99
bookcase    | $58.89
chair       | $234.50
chaise      | $399.95
couch       | $9,000.00
desk        | $509.85
fan         | $111.92
lamp        | $160.97
mirror      | $199.95
ottoman     | $299.99
rug         | $808.65
vase        | $19.98
Grand Total | 12604.635

What is the purpose of the pivot table in this spreadsheet?

  • To organize all of the data into a smaller format
  • To find the average price of each product
  • To calculate purchase price data
  • To summarize data about each product

The purpose of the pivot table is to calculate purchase price data. The pivot table shows the total purchase price for each item and the total overall purchase price for all of the items. This pivot table doesn’t include all of the data from the transaction sheet.

Question 2

The following is a sample pivot table from a furniture company:

product     | SUM of purchase_price
bed         | $799.99
bookcase    | $58.89
chair       | $234.50
chaise      | $399.95
couch       | $9,000.00
desk        | $509.85
fan         | $111.92
lamp        | $160.97
mirror      | $199.95
ottoman     | $299.99
rug         | $808.65
vase        | $19.98
Grand Total | 12604.635

How could the pivot table be adjusted to show the same data, but only for products categorized as beige?

  • Add a filter to show only beige products
  • Add a new column labeled beige
  • Sort the current row by product color
  • Summarize the values by product

To show the same data, but only for products categorized as beige, add a filter to show only beige products.

Question 3

The following is a sample pivot table from a furniture company:

product     | SUM of purchase_price
bed         | $799.99
bookcase    | $58.89
chair       | $234.50
chaise      | $399.95
couch       | $9,000.00
desk        | $509.85
fan         | $111.92
lamp        | $160.97
mirror      | $199.95
ottoman     | $299.99
rug         | $808.65
vase        | $19.98
Grand Total | 12604.635

The value added to the pivot table is the purchase price of the products.

  • TRUE
  • FALSE

The value added to the pivot table is purchase price. In the pivot table editor, the Values menu shows “purchase_price” as the value in the pivot table.

Question 4

The following is a sample pivot table from a furniture company:

product     | SUM of purchase_price | Calculated Field 1
bed         | $799.99               | $0.00
bookcase    | $58.89                | $0.00
chair       | $234.50               | $0.00
chaise      | $399.95               | $0.00
couch       | $9,000.00             | $0.00
desk        | $509.85               | $0.00
fan         | $111.92               | $0.00
lamp        | $160.97               | $0.00
mirror      | $199.95               | $0.00
ottoman     | $299.99               | $0.00
rug         | $808.65               | $0.00
vase        | $19.98                | $0.00
Grand Total | 12604.635             | $0.00

Which spreadsheet tool should you use if you want to find an average value using values generated within a pivot table?

  • A filter
  • Conditional formatting
  • A calculated field
  • Data validation

To find an average value using values generated within a pivot table, use a calculated field. A calculated field is a new field within a pivot table that carries out certain calculations based on the values of other fields.

L4 SQL calculations

Question 1

You are creating a query to request worker information from your database. You will use that information to calculate employees’ weekly pay. What clause would you include to store pay values in a new weekly_pay column?

SELECT Employee_ID, number_of_hours, Hourly_rate
FROM Wages_table
  • (number_of_hours * Hourly_rate) AS weekly_pay
  • (weekly_pay * Hourly_rate) TO number_of_hours
  • (weekly_pay * Hourly_rate) AS number_of_hours
  • (number_of_hours * Hourly_rate) TO weekly_pay

To store pay values in the weekly_pay column, the correct statement is (number_of_hours * Hourly_rate) AS weekly_pay. The AS command gives a temporary name to the column.
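The completed query can be tried out with a small sketch like the one below. It uses Python's built-in sqlite3 module with an in-memory database; the table and column names come from the question, while the sample rows are made up for illustration.

```python
import sqlite3

# In-memory database with hypothetical sample rows.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Wages_table "
    "(Employee_ID INTEGER, number_of_hours REAL, Hourly_rate REAL)"
)
conn.executemany(
    "INSERT INTO Wages_table VALUES (?, ?, ?)",
    [(1, 40, 20.0), (2, 25, 18.0)],
)

# AS names the computed column weekly_pay in the query result only;
# the underlying table is not changed.
cursor = conn.execute(
    """
    SELECT Employee_ID,
           number_of_hours,
           Hourly_rate,
           (number_of_hours * Hourly_rate) AS weekly_pay
    FROM Wages_table
    ORDER BY Employee_ID
    """
)
column_names = [description[0] for description in cursor.description]
rows = cursor.fetchall()
print(column_names)
print(rows)
conn.close()
```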

Question 2

In a SQL query, which calculation does the modulo (%) operator perform?

  • It converts a decimal to a percent
  • It finds the square root of a number
  • It applies an exponent to a value
  • It returns the remainder of a division calculation

The modulo operator returns the remainder of a division calculation when included in a SQL query.
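A quick way to see the modulo operator at work is the sketch below, which runs it in SQLite through Python's sqlite3 module. The numbers 17 and 5 are arbitrary examples.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# 17 divided by 5 is 3 with 2 left over, so the modulo operator returns 2.
remainder = conn.execute("SELECT 17 % 5").fetchone()[0]
print(remainder)
conn.close()
```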

Question 3

You are working with a dataset with the column name “firstquarterexpenses.” How can you rename this column to make it more readable?

  • Firstquarterexpenses
  • first_quarter_expenses
  • first+quarter+expenses
  • first quarter expenses

You can rename the column first_quarter_expenses. Using underscores between words helps avoid potential issues while keeping the names readable.

L5 Data validation

Question 1

The entire data-validation process takes place before you begin your analysis.

  • True
  • False

The data-validation process takes place throughout your analysis. This process involves checking and rechecking the quality of your data so that it is complete, accurate, secure and consistent.

Question 2

You’re analyzing patient data for a health care company. During the data-validation process, you notice that the first date of service for some of the patients is later than the most recent date of service. Which type of data-validation check are you completing?

  • Data consistency
  • Data structure
  • Data type
  • Data range

This is a check for data consistency. During a data consistency check, you confirm that the data makes sense in the context of other related data.
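A data consistency check like this one could be sketched as follows. The patient records and the `dates_are_consistent` helper are hypothetical; the only rule taken from the question is that the first date of service should not be later than the most recent one.

```python
from datetime import date


def dates_are_consistent(first_service: date, latest_service: date) -> bool:
    """Return True when the first date of service is not after the latest one."""
    return first_service <= latest_service


# Hypothetical records: (patient_id, first date of service, most recent date of service)
patients = [
    ("P001", date(2021, 3, 1), date(2022, 6, 15)),
    ("P002", date(2022, 9, 10), date(2022, 1, 5)),  # first visit is after the latest visit
]

flagged = [
    patient_id
    for patient_id, first, latest in patients
    if not dates_are_consistent(first, latest)
]
print(flagged)
```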

Question 3

During analysis, you complete a data-validation check for errors in customer identification (ID) numbers. Customer IDs must be eight characters and can contain numbers only. Which of the following customer ID errors will a data-type check help you identify?

  • IDs with text
  • IDs that are repeated
  • IDs in the wrong column
  • IDs with more than eight characters

Completing a data-type check will help you identify customer IDs that contain text. The data type for IDs should be numeric only.
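The difference between a data-type check and the other possible ID problems can be sketched like this. The `check_customer_id` helper and its messages are hypothetical; the rules (eight characters, numbers only) are the ones stated in the question. Only the first check is a data-type check; the length test is included for contrast.

```python
from typing import List


def check_customer_id(customer_id: str) -> List[str]:
    """Return a list of problems found in one customer ID."""
    problems = []
    # Data-type check: every character must be a digit 0-9.
    if not (customer_id.isascii() and customer_id.isdigit()):
        problems.append("data type: contains non-numeric characters")
    # Length check (a separate kind of check): exactly eight characters.
    if len(customer_id) != 8:
        problems.append("length: must be exactly eight characters")
    return problems


print(check_customer_id("12345678"))
print(check_customer_id("1234A678"))
print(check_customer_id("123456789"))
```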

L6 Using SQL with temporary tables

Question 1

When are temporary tables automatically deleted?

  • After running a query in your SQL database
  • After running a report from the table
  • After completing all calculations in the table
  • After ending the session in a SQL database

Temporary tables are automatically deleted after ending the session in a SQL database.
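The sketch below shows this behavior in SQLite, using Python's sqlite3 module, where closing the connection ends the session. The `short_trips` table and the database file are hypothetical, and other SQL databases may manage sessions differently.

```python
import os
import sqlite3
import tempfile

# A throwaway database file so that two separate sessions can open it.
db_path = os.path.join(tempfile.mkdtemp(), "demo.db")

# Session 1: create a temporary table and use it.
session_1 = sqlite3.connect(db_path)
session_1.execute("CREATE TEMP TABLE short_trips (trip_id INTEGER)")
session_1.execute("INSERT INTO short_trips VALUES (1)")
rows_in_session_1 = session_1.execute(
    "SELECT COUNT(*) FROM short_trips"
).fetchone()[0]
session_1.close()  # ending the session discards the temporary table

# Session 2: the temporary table is gone.
session_2 = sqlite3.connect(db_path)
try:
    session_2.execute("SELECT COUNT(*) FROM short_trips")
    table_survived = True
except sqlite3.OperationalError:
    # SQLite reports that the table does not exist.
    table_survived = False
session_2.close()

print(rows_in_session_1, table_survived)
```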

Question 2

The following SQL query contains information about bike trips:

WITH 1_hr_trips AS (
    SELECT *
    FROM bigquery-public-data.new_york.citibike_trips
    WHERE tripduration = 60
)

What data will appear in the temporary table created through this query?

  • The total number of bike trips
  • Bike trips equal to or more than one hour
  • A random subset of bike trips
  • Bike trips that lasted exactly one hour

This temporary table will show bike trips that lasted exactly one hour. The name of the table is “1_hr_trips” and the query includes the condition that trips in the table equal one hour.
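To see which rows end up in a temporary table built with WITH, here is a minimal sketch that runs a complete query in SQLite. It makes several assumptions: a small local `citibike_trips` table stands in for the BigQuery public dataset, the rows are made up, and the temporary table is named `one_hr_trips` to stay clear of identifier rules that differ between SQL dialects.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE citibike_trips (trip_id INTEGER, tripduration INTEGER)")
conn.executemany(
    "INSERT INTO citibike_trips VALUES (?, ?)",
    [(1, 45), (2, 60), (3, 60), (4, 90)],
)

# The WITH clause defines the temporary table; the SELECT after it reads from it.
rows = conn.execute(
    """
    WITH one_hr_trips AS (
        SELECT *
        FROM citibike_trips
        WHERE tripduration = 60
    )
    SELECT trip_id, tripduration
    FROM one_hr_trips
    ORDER BY trip_id
    """
).fetchall()
print(rows)
conn.close()
```

Only the rows whose tripduration equals 60 are kept; shorter and longer trips are left out.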

Question 3

What benefit does a CREATE TABLE statement add to a temporary table?

  • Metadata about the data in the table
  • Access for anyone to use the table
  • Automated calculations
  • Specific naming conventions

The benefit of a CREATE TABLE statement is access: it adds the table to the database, so anyone with access to that database can use it.

Weekly challenge 4

Question 1

You are analyzing sales data in a spreadsheet. Which of the following could you find out by using the MAX function?

  • Total sales for the year
  • Difference between two months of sales
  • The month with the highest sales
  • Sales per month over a year

You could find out the month with the highest sales using the MAX function. The MAX function returns the largest numeric value from a range of cells.

Question 2

A data analyst is working with a spreadsheet from a furniture company.

Sample Transaction Table.

The analyst inputs a function to find the number of product prices that are less than $150.00. Which formula will return that result?

  • =SUMIF(G2:G30, ">150")
  • =COUNTIF(G2:G30, "<150")
  • =SUMIF(G2:G30, "<150")
  • =COUNTIF(G2:G30, ">=150")

The COUNTIF formula =COUNTIF(G2:G30, "<150") will allow the analyst to count all product price values in Column G that are less than $150.
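The counting logic behind COUNTIF can be mirrored in a few lines of Python. This is only a sketch: the `prices` list is a hypothetical stand-in for the values in G2:G30.

```python
# Hypothetical product prices standing in for cells G2:G30.
prices = [19.98, 58.89, 111.92, 150.00, 234.50, 799.99]

# Count the values that meet the condition, as COUNTIF does with "<150".
count_below_150 = sum(1 for price in prices if price < 150)
print(count_below_150)
```

Note that 150.00 itself is not counted, because the condition is strictly less than 150.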

Question 3

A data analyst is working in a spreadsheet and uses the SUMIF function in the formula below as part of their analysis.

=SUMIF(A1:A25, "<10", C1:C25)

Which part of this formula is the criteria or condition?

  • "<10"
  • A1:A25
  • C1:C25
  • =SUMIF

The criteria or condition for this SUMIF formula is "<10". This means that if any values in the range A1 through A25 are less than 10, their corresponding values in the range C1 through C25 will be added together.
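The same pairing of a criteria range with a sum range can be sketched in Python. The two lists are hypothetical stand-ins for A1:A25 and C1:C25.

```python
# Hypothetical values: column_a is the range being tested, column_c the range being summed.
column_a = [4, 12, 9, 10, 7]
column_c = [100, 200, 300, 400, 500]

# Add a value from column_c only when the value in the same row of column_a is below 10.
total = sum(c for a, c in zip(column_a, column_c) if a < 10)
print(total)
```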

Question 4

A data analyst is working in a spreadsheet and uses the SUMPRODUCT function in the formula below as part of their analysis.

=SUMPRODUCT(A2:A10,B2:B10)

How does the SUMPRODUCT function calculate the cell ranges identified in the parentheses?

  • It multiplies the values in the first range, then multiplies the values in the second range.
  • It adds the ranges, then multiplies them by the last value in the second array.
  • It adds the values in the first range, then adds the values in the second range.
  • It multiplies the ranges, then adds the sum of the products of the two ranges.

=SUMPRODUCT(A2:A10,B2:B10) calculates the cell ranges by multiplying each value in the first range by its corresponding value in the second range (the results are the products). Then, the formula adds those products together.
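The two stages of that calculation can be sketched in Python. The lists are hypothetical and shorter than A2:A10 and B2:B10, but the steps are the same.

```python
# Hypothetical values standing in for the two ranges.
range_a = [2, 3, 4]
range_b = [10, 20, 30]

# Step 1: multiply each value by its counterpart in the other range.
products = [a * b for a, b in zip(range_a, range_b)]

# Step 2: add the products together.
total = sum(products)
print(products, total)
```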

Question 5

A data analyst creates a pivot table in a spreadsheet containing movie data.

Movie Data Project.

If the analyst wants to summarize the data using the AVERAGE function in the Values menu, which spreadsheet columns could they add data from? Select all that apply.

  • Box Office Revenue
  • Budget
  • Movie Title
  • Genre

To summarize the data using the AVERAGE function, the analyst could use the Budget column or the Box Office Revenue column. Both have numeric values that the AVERAGE function could calculate.

Question 6

A data analyst uses the following SQL query to perform basic calculations on their data. Which types of operators is the analyst using in this SQL query? Select all that apply.

SELECT
    Yes_Responses,
    No_Responses,
    Total_Surveys,
    (Yes_Responses + No_Responses) / Total_Surveys AS Responses_Per_Survey
FROM 
    Survey_1
  • Subtraction
  • Multiplication
  • Addition
  • Division

The analyst is using the addition operator (+) to add the “yes” and “no” responses and the division operator (/) to divide that sum by the total number of surveys.
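The calculation can be tried out with the sketch below, which uses SQLite through Python's sqlite3 module and one made-up row. One caveat that is specific to this setup: SQLite truncates the result when both sides of a division are integers, so the sketch also multiplies by 1.0 to get a decimal result. Division behavior varies between SQL dialects, so treat this as an illustration only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Survey_1 "
    "(Yes_Responses INTEGER, No_Responses INTEGER, Total_Surveys INTEGER)"
)
conn.execute("INSERT INTO Survey_1 VALUES (30, 50, 100)")  # hypothetical row

row = conn.execute(
    """
    SELECT
        (Yes_Responses + No_Responses) / Total_Surveys AS integer_result,
        (Yes_Responses + No_Responses) * 1.0 / Total_Surveys AS Responses_Per_Survey
    FROM
        Survey_1
    """
).fetchone()
print(row)
conn.close()
```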

Question 7

A data analyst uses the following query to perform a calculation on a company’s inventory. Which of the following will be the return in the “Overstock” column for this query?

SELECT
    Total_Inventory % Total_Stores AS Overstock
FROM
    Shipment_1
  • The remainder when the values in “Total_Inventory” are divided by the values in “Total_Stores”
  • The percentage of the “Total_Inventory” that is located in “Total_Stores”
  • The difference between the values in “Total_Inventory” and the values in “Total_Stores”
  • The combined total of the values in “Total_Inventory” and the values in “Total_Stores”

The return for this query will be the remainder when the total inventory is divided by the total number of stores. The modulo operator (%) calculates the remainder when two values are divided.

Question 8

A data analyst completes a calculation in a SQL query using the AVG function. Which of the following best describes the return for this query?

SELECT 
    AVG (salary) AS avg_employee_salary 
FROM 
    employees 
WHERE 
    salary < 30000
  • The number of all salaries in the “employees” table
  • A single average of all of the salaries less than $30,000
  • A single count of salaries that average less than $30,000
  • The annual salary for each employee

The return for this query would be a single average of all of the salaries less than $30,000. The AVG function is an aggregate function that returns the average value of a group. In this query, the group is “salary” and the condition is salaries less than $30,000.
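A small sketch shows that the query returns a single row with a single value. The salaries are hypothetical; the table and column names follow the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employees (salary REAL)")
conn.executemany(
    "INSERT INTO employees VALUES (?)",
    [(20000,), (28000,), (45000,)],  # hypothetical salaries
)

# WHERE filters the rows first; AVG then collapses the remaining rows into one value.
result = conn.execute(
    """
    SELECT
        AVG(salary) AS avg_employee_salary
    FROM
        employees
    WHERE
        salary < 30000
    """
).fetchall()
print(result)
conn.close()
```

The 45000 salary is excluded by the WHERE condition, so the average is taken over the two lower salaries only.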

Question 9

Use the following SQL query to answer the question:

SELECT
    location,
    SUM(customer_orders) AS total_orders
FROM
    bulk_orders

Which statement should you add after the FROM statement to organize rows by location?

  • EXTRACT location
  • WHERE location
  • AS location
  • GROUP BY location

You should add the GROUP BY statement to organize rows by location. In this query, GROUP BY groups rows from the bulk_orders table with the same location value into summary rows.
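The completed query can be run as a sketch in SQLite through Python's sqlite3 module. The rows and city names are made up, and the ORDER BY line is added here only so the output order is predictable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE bulk_orders (location TEXT, customer_orders INTEGER)")
conn.executemany(
    "INSERT INTO bulk_orders VALUES (?, ?)",
    [("Boston", 5), ("Austin", 3), ("Boston", 7)],  # hypothetical rows
)

# GROUP BY collapses rows that share a location into one summary row each.
rows = conn.execute(
    """
    SELECT
        location,
        SUM(customer_orders) AS total_orders
    FROM
        bulk_orders
    GROUP BY location
    ORDER BY location
    """
).fetchall()
print(rows)
conn.close()
```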

Question 10

Fill in the blank: The data validation process involves checking and rechecking the quality of your data to make sure that it is complete and _. Select all that apply.

  • cited
  • accurate
  • consistent
  • secure

Data validation involves checking and rechecking the quality of your data to make sure it is complete, accurate, secure, and consistent.

What are the 5 steps to the data analysis process?

Step One: Ask The Right Questions. So you're ready to get started. ...
Step Two: Data Collection. This brings us to the next step: data collection. ...
Step Three: Data Cleaning. You've collected and combined data from multiple sources. ...
Step Four: Analyzing The Data. ...
Step Five: Interpreting The Results.

What are the 6 steps of the data analysis process?

Step one: Defining the question. The first step in any data analysis process is to define your objective. ...
Step two: Collecting the data. ...
Step three: Cleaning the data. ...
Step four: Analyzing the data. ...
Step five: Sharing your results. ...
Step six: Embrace your failures. ...

During which phase of data analysis would a data analyst use spreadsheets?

During which phase would a data analyst use spreadsheets or query languages to transform data in order to draw conclusions? The analyze step involves using data analytics tools such as spreadsheets and query languages to transform data in order to draw conclusions and make informed decisions.

What comes first in the data analysis process?

Steps of Data Analysis:
Step one: Determining the objective. The initial step is, of course, to determine our objective, which can also be termed a “problem statement”. ...
Step two: Gathering the data. ...
Step three: Cleaning the data. ...
Step four: Interpreting the data. ...
Step five: Sharing the results.