Tidy Data

Question 1
Marks : +2 | -2
Pass Ratio : 100%
Which of the following is a trait of tidy data?
each variable in one column
each observation in a different row
one table for each kind of variable
all of the mentioned
Explanation:
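Tidy data places each variable in its own column and each observation in its own row, with one table for each kind of variable. This layout makes summaries such as the sum of the observations, the number of occurrences, or their mean value easy to compute.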
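As a minimal sketch of these traits, a tidy table can be modelled in plain Python as a list of records: one dictionary per observation and one key per variable. The table contents and the names `tidy_rows` and `column` are hypothetical and chosen only for illustration.

```python
# Hypothetical tidy table: each dict is one observation (row),
# each key is one variable (column). The values are made up.
tidy_rows = [
    {"country": "A", "year": 1999, "cases": 10},
    {"country": "A", "year": 2000, "cases": 20},
    {"country": "B", "year": 1999, "cases": 30},
    {"country": "B", "year": 2000, "cases": 40},
]

def column(rows, name):
    """Return every value of one variable, in row order."""
    return [row[name] for row in rows]

# Because each variable lives in exactly one column,
# summaries such as a sum, a count, or a mean are one-liners.
cases = column(tidy_rows, "cases")
total_cases = sum(cases)
mean_cases = total_cases / len(cases)
print(total_cases, mean_cases)
```

The same `column` helper works for any variable in the table, which is the practical benefit of keeping one variable per column.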
Question 2
Marks : +2 | -2
Pass Ratio : 100%
Which of the following is an example of raw data?
complicated JSON from the Facebook API
complicated JSON from the Twitter API
unformatted Excel file
all of the mentioned
Explanation:
All of these are raw data. Tidy data is what a processing script produces from raw inputs like these.
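The processing step can be sketched as follows, assuming a small nested JSON payload. The payload shape, the field names, and the `tidy_posts` function are hypothetical; real API responses are structured differently and are much larger.

```python
import json

# Hypothetical raw payload, loosely in the style of a social media API response.
raw = """
{"users": [
  {"name": "ann", "posts": [{"id": 1, "likes": 3}, {"id": 2, "likes": 5}]},
  {"name": "bob", "posts": [{"id": 3, "likes": 0}]}
]}
"""

def tidy_posts(payload):
    """Flatten nested users -> posts into one row per post."""
    rows = []
    for user in json.loads(payload)["users"]:
        for post in user["posts"]:
            rows.append({
                "user": user["name"],
                "post_id": post["id"],
                "likes": post["likes"],
            })
    return rows

rows = tidy_posts(raw)
for row in rows:
    print(row)
```

Each output row is one observation (a post), and each key is one variable, so the result can be loaded directly into a table.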
Question 3
Marks : +2 | -2
Pass Ratio : 100%
Raw data in the real world is tidy and properly formatted.
True
False
Explanation:
Raw data from real-world sources is rarely tidy or consistently formatted. It usually has to be cleaned and restructured before analysis can help the business make better decisions.
Question 4
Marks : +2 | -2
Pass Ratio : 100%
Point out the wrong statement.
Tidy datasets are all alike but every messy dataset is messy in its own way
Most statistical datasets are data frames made up of rows and columns
Tidy datasets provide a standardized way to link the structure of a dataset with its semantics
None of the mentioned
Explanation:
The tidy data standard has been designed to simplify the development of data analysis tools that work well together.
Question 5
Marks : +2 | -2
Pass Ratio : 100%
Which of the following packages is used for tidying data?
tidyr
souryr
NumPy
all of the mentioned
Explanation:
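tidyr is an R package for tidying data. Its gather() and spread() functions convert between wide and long layouts; since tidyr 1.0.0 they are superseded by pivot_longer() and pivot_wider().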
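To illustrate what gathering and spreading do, here is a rough plain-Python analogue. It is not tidyr's implementation or API: the functions `gather` and `spread`, their parameters, and the sample scores are assumptions made for this sketch.

```python
def gather(rows, id_keys, key_name, value_name):
    """Wide -> long: turn every non-id column into its own (key, value) row."""
    long_rows = []
    for row in rows:
        ids = {k: row[k] for k in id_keys}
        for col, val in row.items():
            if col not in id_keys:
                long_rows.append({**ids, key_name: col, value_name: val})
    return long_rows

def spread(rows, id_keys, key_name, value_name):
    """Long -> wide: turn each distinct key back into its own column."""
    wide = {}
    for row in rows:
        ident = tuple(row[k] for k in id_keys)
        target = wide.setdefault(ident, {k: row[k] for k in id_keys})
        target[row[key_name]] = row[value_name]
    return list(wide.values())

# Hypothetical wide table: the column headers test1/test2 are really values.
wide_rows = [
    {"name": "ann", "test1": 90, "test2": 85},
    {"name": "bob", "test1": 70, "test2": 75},
]
long_rows = gather(wide_rows, ["name"], "test", "score")
round_trip = spread(long_rows, ["name"], "test", "score")
print(long_rows)
```

Gathering the two test columns yields four rows, one per student and test, and spreading those rows restores the original wide table.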
Question 6
Marks : +2 | -2
Pass Ratio : 100%
A strange binary file generated by a machine is an example of tidy data.
True
False
Explanation:
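A binary file produced by a machine is raw data, not tidy data. Like data sets stored in spreadsheets such as Microsoft Excel, it is a binary file rather than a plain ASCII text file, and it must be processed before analysis.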
Question 7
Marks : +2 | -2
Pass Ratio : 100%
Point out the correct statement.
Nearly 80% of data analysis is spent on wrangling data
Nearly 20% of data analysis is spent on data dredging
Nearly 80% of data analysis is spent on cleaning and preparing data
None of the mentioned
Explanation:
Data cleansing is the process of detecting and correcting (or removing) corrupt or inaccurate records from a record set, table, or database.
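A minimal sketch of this detect-and-correct-or-remove process, assuming records with a `name` and an `age` field. The field names, the `clean_records` function, and the 0 to 120 plausibility range are illustrative assumptions, not a standard.

```python
def clean_records(records):
    """Return (cleaned, rejected): correct what can be fixed, remove the rest."""
    cleaned, rejected = [], []
    for rec in records:
        # Correct: trim stray whitespace around the name.
        name = (rec.get("name") or "").strip()
        # Detect: an age that cannot be parsed as an integer is inaccurate.
        try:
            age = int(str(rec.get("age")).strip())
        except ValueError:
            age = None
        # Assumed plausibility range for an age, for illustration only.
        if name and age is not None and 0 <= age <= 120:
            cleaned.append({"name": name.title(), "age": age})
        else:
            rejected.append(rec)
    return cleaned, rejected

records = [
    {"name": "  ann ", "age": "34"},  # correctable: whitespace, age as text
    {"name": "bob", "age": "n/a"},    # inaccurate: age is not a number
    {"name": "", "age": 51},          # corrupt: missing name
    {"name": "cy", "age": 999},       # inaccurate: implausible age
]
cleaned, rejected = clean_records(records)
print(cleaned)
print(len(rejected))
```

Keeping the rejected records in a separate list, rather than silently dropping them, makes it possible to audit what the cleaning step removed.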
Question 8
Marks : +2 | -2
Pass Ratio : 100%
Which of the following is a common problem with messy data?
Column headers are values
Variables are stored in both rows and columns
A single observational unit is stored in multiple tables
All of the mentioned
Explanation:
Real datasets can, and often do, violate the three precepts of tidy data in almost every way imaginable.
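Two of the problems listed above can appear in the same table. The sketch below assumes a hypothetical weather-style table in which the day columns `d1` and `d2` are values stored as column headers, and the `element` column stores variable names (`tmax`, `tmin`) in rows. The data and the `tidy_weather` function are made up for illustration.

```python
# Hypothetical messy table: variables are stored in both rows and columns.
messy = [
    {"station": "S1", "element": "tmax", "d1": 30, "d2": 31},
    {"station": "S1", "element": "tmin", "d1": 18, "d2": 19},
]

def tidy_weather(rows):
    """Produce one row per (station, day) with tmax and tmin as columns."""
    tidy = {}
    for row in rows:
        for col, value in row.items():
            if col in ("station", "element"):
                continue
            day = int(col[1:])  # column header "d1" becomes the value 1
            rec = tidy.setdefault(
                (row["station"], day),
                {"station": row["station"], "day": day},
            )
            # The element name ("tmax" or "tmin") becomes a column.
            rec[row["element"]] = value
    return list(tidy.values())

tidy = tidy_weather(messy)
for rec in tidy:
    print(rec)
```

After tidying, each row is one observation (a station on a day) and each of `station`, `day`, `tmax`, and `tmin` is a single column.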
Question 9
Marks : +2 | -2
Pass Ratio : 100%
tidyr is a reframing of _______ designed to accompany the tidy data framework.
reshape5
dplyr
reshape2
all of the mentioned
Explanation:
tidyr is a reframing of reshape2. It does less than reshape2: it is designed specifically for tidying data, not for general reshaping.
Question 10
Marks : +2 | -2
Pass Ratio : 100%
Which of the following processes involves structuring datasets to facilitate analysis?
Data tidying
Data mining
Data booting
All of the mentioned
Explanation:
The principles of tidy data provide a standard way to organize data values within a dataset.