Chapter 5 Intro to data wrangling with tidyverse
The tidyverse is a modern collection of R packages designed to make working with data easier and more consistent. Now, before we begin data wrangling with dplyr, let’s take a closer look at how the tidyverse works and why it has become so central to modern R practice.
5.1 What makes the tidyverse special?
The tidyverse is not just a group of packages - it represents a philosophy of data analysis. Its design emphasizes clarity, consistency, and reproducibility. When you learn one tidyverse package, many of the same ideas and syntax patterns carry over to others.
At its heart are three key ideas:
- Tidy data principles
- Every tidyverse package expects data to be in tidy format, where:
- each variable forms a column;
- each observation forms a row, and
- each value has its own cell.
- A consistent syntax and grammar .
Most tidyverse functions are verb-based and take a data frame as their first argument. In this context, a verb simply means a function that performs one clear action on your data - such as filtering, sorting, or summarising. Because each function both takes in and returns a data frame, you can easily chain them together in a logical sequence. This leads us to the next key feature.
5.2 Why learn dplyr first?
Among all tidyverse packages, dplyr is often the best starting point because it provides a simple, intuitive grammar for data manipulation. It introduces five core verbs - filter(), arrange(), mutate(), summarise(), and group_by() - that cover most day-to-day data wrangling tasks.
Once you understand how these verbs interact with each other (and with the pipe), you will find it much easier to explore, clean, and prepare real data for analysis or visualization.
In the next section, we will use dplyr verbs to clean and transform data, and combine operations into pipelines for more readable code. By the end, you will not only know how to use dplyr, but also start thinking the tidyverse way-using clear, expressive code that mirrors the logic of your analysis.
5.3 Data wrangling with dplyr
So far, we have been using base Rto learn the fundamentals - how to navigate RStudio, create and work with different data types, and produce simple descriptive summaries. When we learned to import data, we also took a brief, first look at the tidyverse, a collection of modern R packages that make working with data easier and more consistent.
Now, we are going to explore one of the most widely used tidyverse packages, dplyr , which provides a simple and intuitive way to manipulate data. With dplyr, you can filter, arrange, transform, and summarise data using functions that read almost like plain English. These functions (called verbs) work seamlessly together, and when combined with the pipe operator (%>%), they allow you to build clear, step-by-step workflows that are both efficient and easy to follow.
By the end of this lesson you should be able to:
Save data efficiently by converting a .dta file into an .RData object.
Apply data transformation techniques using the five key dplyr verbs:
filter()
arrange()
mutate()
summarise()
group_by()
Use the pipe operator (
%>%) to write cleaner and more readable R code.Combine multiple dplyr verbs to perform complex data manipulations (e.g., grouped filtering or mutating).
Perform various join operations to merge and relate data across multiple tables.
5.4 Import Data
ICAN-ICAR 2025 is a nationally representative, household-based survey of \(89,185\) children in \(56,913\) households. The survey assesses children aged 5-16 on foundational numeracy and reading skills.
Download the file “ican-icar-2025-v1.dta” from DataFirst and save it to the data directory in your main project folder (make a folder called data if you haven’t already). Note that the metadata for this data is found on the site.
Load the tidyverse collection of packages, which loads the following packages: ggplot2, tibble, tidyr, readr, purrr, and dplyr.
- ggplot2: For data visualization (this will be revisited in detail later)
- tibble: For creating and working with tidy data frames
- tidyr: For tidying messy data
- readr:For importing rectangular data (like CSV files) quickly and efficiently
- purrr:For functional programming
- dplyr: For data manipulation using verbs like filter(), mutate(), summarise(), and arrange().
Load the tidyverse by running the command below:
5.5 Working with Tibbles
Tibbles are data frames (under the hood) that work very well with tidyverse packages. They are easier to read and manipulate than traditional data frames, especially when you have many rows or columns. In addition, each column reports its data type, a nice feature borrowed from str(). If your data is currently a standard data frame, convert it to a tibble using as_tibble().
Tibbles are designed so that you do not accidently overwhelm your console when you print large data frames.
Here’s an example.
A nice feature of tibbles is that if you display them in the console (by typing dat, for example) only the first 10 rows and all columns are shown.
## # A tibble: 96,452 × 147
## ...1 CountryName TierOneUnit VillageID Location HHID SubmissionDate duration hh06a hh06b1 hh06b2 hh06c hh06d hh07a hh07b hh07c
## <dbl> <chr> <chr> <chr> <chr> <chr> <date> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 1 Mozambique c101 c101 Urban uuid:f580… 2025-09-23 979 4 1 NA 1 1 2 2 1
## 2 2 Mozambique c101 c101 Urban uuid:b98e… 2025-09-23 986 7 1 NA 1 1 2 2 1
## 3 3 Mozambique c101 c101 Urban uuid:2611… 2025-09-23 2167 6 1 NA 1 1 2 2 1
## 4 4 Mozambique c101 c101 Urban uuid:d46b… 2025-09-23 2078 6 1 NA 1 1 1 1 1
## 5 5 Mozambique c101 c101 Urban uuid:df92… 2025-09-23 1251 5 1 NA 1 1 2 2 1
## 6 6 Mozambique c101 c101 Urban uuid:ade5… 2025-09-23 2199 5 1 NA 1 1 2 2 1
## 7 7 Mozambique c101 c101 Urban uuid:3f84… 2025-09-23 2889 5 1 NA 1 1 2 2 1
## 8 8 Mozambique c101 c101 Urban uuid:ff7b… 2025-09-23 1588 4 1 NA 1 0 2 2 1
## 9 9 Mozambique c101 c101 Urban uuid:7730… 2025-09-23 2203 4 1 NA 1 1 2 2 1
## 10 10 Mozambique c101 c101 Urban uuid:e210… 2025-09-24 4647 5 1 NA 2 1 2 2 1
## # ℹ 96,442 more rows
## # ℹ 131 more variables: hh07d <dbl>, hh07e <dbl>, hh07f <dbl>, hh07g <dbl>, hh07h <dbl>, hh07i <dbl>, hh07j <dbl>, hh07k <dbl>,
## # hh07l <dbl>, hh07m <dbl>, hh07n <dbl>, hh07o <dbl>, hh07p <dbl>, hh07p_3 <dbl>, hh07q <dbl>, ChildID <chr>, ch02 <dbl>, ch03 <dbl>,
## # ch04a <dbl>, ch04b <dbl>, ch04c <dbl>, ch04d <dbl>, ch04e <dbl>, ch04f <dbl>, ch05 <dbl>, ch06a <dbl>, ch06b <dbl>, ch06c <dbl>,
## # ch06d <dbl>, ch06e <dbl>, ch07a <dbl>, ch07b <dbl>, ch07c <dbl>, ch07d <dbl>, ch08 <dbl>, ch09 <dbl>, ch10a <dbl>, ch10b <dbl>,
## # ch10c <dbl>, pt00 <dbl>, pt01b <dbl>, pt01c <dbl>, pt01d <dbl>, pt01e <dbl>, pt01f <dbl>, pt02b <dbl>, pt02c <dbl>, pt02d <dbl>,
## # pt02e <dbl>, pt02f <dbl>, assessment <dbl>, sample <dbl>, l1_1 <dbl>, l1_2 <dbl>, l1_3 <dbl>, l1_4 <dbl>, l2_1 <dbl>, l2_2 <dbl>, …
To understand your data at a high-level, you can use skim() from skimr package :
| Name | dat |
| Number of rows | 96452 |
| Number of columns | 147 |
| _______________________ | |
| Column type frequency: | |
| character | 8 |
| Date | 1 |
| numeric | 138 |
| ________________________ | |
| Group variables | None |
Variable type: character
| skim_variable | n_missing | complete_rate | min | max | empty | n_unique | whitespace |
|---|---|---|---|---|---|---|---|
| CountryName | 0 | 1.0000000000000000000000 | 4 | 10 | 0 | 11 | 0 |
| TierOneUnit | 0 | 1.0000000000000000000000 | 4 | 4 | 0 | 121 | 0 |
| VillageID | 0 | 1.0000000000000000000000 | 4 | 4 | 0 | 627 | 0 |
| Location | 0 | 1.0000000000000000000000 | 5 | 5 | 0 | 2 | 0 |
| HHID | 0 | 1.0000000000000000000000 | 41 | 41 | 0 | 56880 | 0 |
| ChildID | 0 | 1.0000000000000000000000 | 48 | 66 | 0 | 96452 | 0 |
| AssessmentLanguage | 7153 | 0.9300000000000000488498 | 4 | 10 | 0 | 16 | 0 |
| EnrolmentStatus | 0 | 1.0000000000000000000000 | 13 | 18 | 0 | 3 | 0 |
Variable type: Date
| skim_variable | n_missing | complete_rate | min | max | median | n_unique |
|---|---|---|---|---|---|---|
| SubmissionDate | 0 | 1 | 2025-07-27 | 2025-11-11 | 2025-09-02 | 102 |
Variable type: numeric
| skim_variable | n_missing | complete_rate | mean | sd | p0 | p25 | p50 | p75 | p100 | hist |
|---|---|---|---|---|---|---|---|---|---|---|
| …1 | 0 | 1.00000000000000000000000 | 4.822650000000000000168e+04 | 2.784343999999999868990e+04 | 1.000000000000000000000e+00 | 24113.7500000000000000000000 | 4.822650000000000000168e+04 | 7.233925000000000000121e+04 | 9.645200000000000000074e+04 | ▇▇▇▇▇ |
| duration | 0 | 1.00000000000000000000000 | 2.142300000000000181769e+03 | 1.711170000000000072709e+03 | -3.813000000000000000010e+04 | 986.0000000000000000000000 | 1.658999999999999999927e+03 | 2.785999999999999999882e+03 | 2.014199999999999999949e+04 | ▁▁▁▇▁ |
| hh06a | 0 | 1.00000000000000000000000 | 7.629999999999999893419e+00 | 6.219999999999999751310e+00 | 1.000000000000000000000e+00 | 4.0000000000000000000000 | 6.000000000000000000000e+00 | 8.000000000000000000000e+00 | 8.000000000000000000000e+01 | ▇▁▁▁▁ |
| hh06b1 | 19208 | 0.80000000000000004440892 | 6.700000000000000399680e-01 | 4.699999999999999733546e-01 | 0.000000000000000000000e+00 | 0.0000000000000000000000 | 1.000000000000000000000e+00 | 1.000000000000000000000e+00 | 1.000000000000000000000e+00 | ▃▁▁▁▇ |
| hh06b2 | 77244 | 0.20000000000000001110223 | 2.879999999999999893419e+00 | 1.320000000000000062172e+00 | 1.000000000000000000000e+00 | 2.0000000000000000000000 | 2.000000000000000000000e+00 | 3.000000000000000000000e+00 | 8.000000000000000000000e+00 | ▇▅▂▁▁ |
| hh06c | 44238 | 0.54000000000000003552714 | 1.270000000000000017764e+00 | 4.400000000000000022204e-01 | 1.000000000000000000000e+00 | 1.0000000000000000000000 | 1.000000000000000000000e+00 | 2.000000000000000000000e+00 | 2.000000000000000000000e+00 | ▇▁▁▁▃ |
| hh06d | 0 | 1.00000000000000000000000 | 5.200000000000000177636e-01 | 5.000000000000000000000e-01 | 0.000000000000000000000e+00 | 0.0000000000000000000000 | 1.000000000000000000000e+00 | 1.000000000000000000000e+00 | 1.000000000000000000000e+00 | ▇▁▁▁▇ |
| hh07a | 0 | 1.00000000000000000000000 | 2.180000000000000159872e+00 | 4.600000000000000199840e-01 | 1.000000000000000000000e+00 | 2.0000000000000000000000 | 2.000000000000000000000e+00 | 2.000000000000000000000e+00 | 3.000000000000000000000e+00 | ▁▁▇▁▂ |
| hh07b | 0 | 1.00000000000000000000000 | 2.149999999999999911182e+00 | 5.300000000000000266454e-01 | 1.000000000000000000000e+00 | 2.0000000000000000000000 | 2.000000000000000000000e+00 | 2.000000000000000000000e+00 | 3.000000000000000000000e+00 | ▁▁▇▁▂ |
| hh07c | 0 | 1.00000000000000000000000 | 1.500000000000000000000e+00 | 8.000000000000000444089e-01 | 1.000000000000000000000e+00 | 1.0000000000000000000000 | 1.000000000000000000000e+00 | 2.000000000000000000000e+00 | 4.000000000000000000000e+00 | ▇▃▁▁▁ |
| hh07d | 0 | 1.00000000000000000000000 | 2.410000000000000142109e+00 | 1.580000000000000071054e+00 | 1.000000000000000000000e+00 | 1.0000000000000000000000 | 2.000000000000000000000e+00 | 4.000000000000000000000e+00 | 7.000000000000000000000e+00 | ▇▂▂▁▁ |
| hh07e | 34735 | 0.64000000000000001332268 | 9.899999999999999911182e-01 | 1.199999999999999955591e-01 | 0.000000000000000000000e+00 | 1.0000000000000000000000 | 1.000000000000000000000e+00 | 1.000000000000000000000e+00 | 1.000000000000000000000e+00 | ▁▁▁▁▇ |
| hh07f | 0 | 1.00000000000000000000000 | 9.300000000000000488498e-01 | 2.500000000000000000000e-01 | 0.000000000000000000000e+00 | 1.0000000000000000000000 | 1.000000000000000000000e+00 | 1.000000000000000000000e+00 | 1.000000000000000000000e+00 | ▁▁▁▁▇ |
| hh07g | 0 | 1.00000000000000000000000 | 4.299999999999999933387e-01 | 5.000000000000000000000e-01 | 0.000000000000000000000e+00 | 0.0000000000000000000000 | 0.000000000000000000000e+00 | 1.000000000000000000000e+00 | 1.000000000000000000000e+00 | ▇▁▁▁▆ |
| hh07h | 0 | 1.00000000000000000000000 | 3.499999999999999777955e-01 | 4.799999999999999822364e-01 | 0.000000000000000000000e+00 | 0.0000000000000000000000 | 0.000000000000000000000e+00 | 1.000000000000000000000e+00 | 1.000000000000000000000e+00 | ▇▁▁▁▅ |
| hh07i | 0 | 1.00000000000000000000000 | 5.100000000000000088818e-01 | 5.000000000000000000000e-01 | 0.000000000000000000000e+00 | 0.0000000000000000000000 | 1.000000000000000000000e+00 | 1.000000000000000000000e+00 | 1.000000000000000000000e+00 | ▇▁▁▁▇ |
| hh07j | 0 | 1.00000000000000000000000 | 2.200000000000000011102e-01 | 4.099999999999999755751e-01 | 0.000000000000000000000e+00 | 0.0000000000000000000000 | 0.000000000000000000000e+00 | 0.000000000000000000000e+00 | 1.000000000000000000000e+00 | ▇▁▁▁▂ |
| hh07k | 0 | 1.00000000000000000000000 | 5.699999999999999511502e-01 | 4.899999999999999911182e-01 | 0.000000000000000000000e+00 | 0.0000000000000000000000 | 1.000000000000000000000e+00 | 1.000000000000000000000e+00 | 1.000000000000000000000e+00 | ▆▁▁▁▇ |
| hh07l | 0 | 1.00000000000000000000000 | 4.899999999999999911182e-01 | 5.000000000000000000000e-01 | 0.000000000000000000000e+00 | 0.0000000000000000000000 | 0.000000000000000000000e+00 | 1.000000000000000000000e+00 | 1.000000000000000000000e+00 | ▇▁▁▁▇ |
| hh07m | 0 | 1.00000000000000000000000 | 8.900000000000000133227e-01 | 3.099999999999999977796e-01 | 0.000000000000000000000e+00 | 1.0000000000000000000000 | 1.000000000000000000000e+00 | 1.000000000000000000000e+00 | 1.000000000000000000000e+00 | ▁▁▁▁▇ |
| hh07n | 10304 | 0.89000000000000001332268 | 7.099999999999999644729e-01 | 4.500000000000000111022e-01 | 0.000000000000000000000e+00 | 0.0000000000000000000000 | 1.000000000000000000000e+00 | 1.000000000000000000000e+00 | 1.000000000000000000000e+00 | ▃▁▁▁▇ |
| hh07o | 0 | 1.00000000000000000000000 | 1.400000000000000133227e-01 | 3.499999999999999777955e-01 | 0.000000000000000000000e+00 | 0.0000000000000000000000 | 0.000000000000000000000e+00 | 0.000000000000000000000e+00 | 1.000000000000000000000e+00 | ▇▁▁▁▁ |
| hh07p | 0 | 1.00000000000000000000000 | 3.499999999999999777955e-01 | 4.799999999999999822364e-01 | 0.000000000000000000000e+00 | 0.0000000000000000000000 | 0.000000000000000000000e+00 | 1.000000000000000000000e+00 | 1.000000000000000000000e+00 | ▇▁▁▁▅ |
| hh07p_3 | 82754 | 0.14000000000000001332268 | 5.000000000000000277556e-02 | 2.099999999999999922284e-01 | 0.000000000000000000000e+00 | 0.0000000000000000000000 | 0.000000000000000000000e+00 | 0.000000000000000000000e+00 | 1.000000000000000000000e+00 | ▇▁▁▁▁ |
| hh07q | 0 | 1.00000000000000000000000 | 4.099999999999999755751e-01 | 4.899999999999999911182e-01 | 0.000000000000000000000e+00 | 0.0000000000000000000000 | 0.000000000000000000000e+00 | 1.000000000000000000000e+00 | 1.000000000000000000000e+00 | ▇▁▁▁▆ |
| ch02 | 0 | 1.00000000000000000000000 | 1.000000000000000000000e+01 | 3.250000000000000000000e+00 | 5.000000000000000000000e+00 | 7.0000000000000000000000 | 1.000000000000000000000e+01 | 1.300000000000000000065e+01 | 1.600000000000000000000e+01 | ▇▆▆▅▅ |
| ch03 | 0 | 1.00000000000000000000000 | 1.500000000000000000000e+00 | 5.000000000000000000000e-01 | 1.000000000000000000000e+00 | 1.0000000000000000000000 | 1.000000000000000000000e+00 | 2.000000000000000000000e+00 | 3.000000000000000000000e+00 | ▇▁▇▁▁ |
| ch04a | 0 | 1.00000000000000000000000 | 1.040000000000000035527e+00 | 2.300000000000000099920e-01 | 1.000000000000000000000e+00 | 1.0000000000000000000000 | 1.000000000000000000000e+00 | 1.000000000000000000000e+00 | 4.000000000000000000000e+00 | ▇▁▁▁▁ |
| ch04b | 0 | 1.00000000000000000000000 | 1.020000000000000017764e+00 | 2.000000000000000111022e-01 | 1.000000000000000000000e+00 | 1.0000000000000000000000 | 1.000000000000000000000e+00 | 1.000000000000000000000e+00 | 4.000000000000000000000e+00 | ▇▁▁▁▁ |
| ch04c | 0 | 1.00000000000000000000000 | 1.020000000000000017764e+00 | 1.900000000000000022204e-01 | 1.000000000000000000000e+00 | 1.0000000000000000000000 | 1.000000000000000000000e+00 | 1.000000000000000000000e+00 | 4.000000000000000000000e+00 | ▇▁▁▁▁ |
| ch04d | 0 | 1.00000000000000000000000 | 1.030000000000000026645e+00 | 2.300000000000000099920e-01 | 1.000000000000000000000e+00 | 1.0000000000000000000000 | 1.000000000000000000000e+00 | 1.000000000000000000000e+00 | 4.000000000000000000000e+00 | ▇▁▁▁▁ |
| ch04e | 0 | 1.00000000000000000000000 | 1.030000000000000026645e+00 | 2.300000000000000099920e-01 | 1.000000000000000000000e+00 | 1.0000000000000000000000 | 1.000000000000000000000e+00 | 1.000000000000000000000e+00 | 4.000000000000000000000e+00 | ▇▁▁▁▁ |
| ch04f | 0 | 1.00000000000000000000000 | 1.070000000000000062172e+00 | 3.099999999999999977796e-01 | 1.000000000000000000000e+00 | 1.0000000000000000000000 | 1.000000000000000000000e+00 | 1.000000000000000000000e+00 | 4.000000000000000000000e+00 | ▇▁▁▁▁ |
| ch05 | 0 | 1.00000000000000000000000 | 8.699999999999999955591e-01 | 3.300000000000000155431e-01 | 0.000000000000000000000e+00 | 1.0000000000000000000000 | 1.000000000000000000000e+00 | 1.000000000000000000000e+00 | 1.000000000000000000000e+00 | ▁▁▁▁▇ |
| ch06a | 12227 | 0.86999999999999999555911 | 4.070000000000000284217e+00 | 2.870000000000000106581e+00 | 0.000000000000000000000e+00 | 2.0000000000000000000000 | 4.000000000000000000000e+00 | 6.000000000000000000000e+00 | 1.200000000000000000043e+01 | ▇▅▆▂▁ |
| ch06b | 12227 | 0.86999999999999999555911 | 1.290000000000000035527e+00 | 5.200000000000000177636e-01 | 1.000000000000000000000e+00 | 1.0000000000000000000000 | 1.000000000000000000000e+00 | 2.000000000000000000000e+00 | 3.000000000000000000000e+00 | ▇▁▂▁▁ |
| ch06c | 12227 | 0.86999999999999999555911 | 8.299999999999999600320e-01 | 3.800000000000000044409e-01 | 0.000000000000000000000e+00 | 1.0000000000000000000000 | 1.000000000000000000000e+00 | 1.000000000000000000000e+00 | 1.000000000000000000000e+00 | ▂▁▁▁▇ |
| ch06d | 12227 | 0.86999999999999999555911 | 6.199999999999999955591e-01 | 4.899999999999999911182e-01 | 0.000000000000000000000e+00 | 0.0000000000000000000000 | 1.000000000000000000000e+00 | 1.000000000000000000000e+00 | 1.000000000000000000000e+00 | ▅▁▁▁▇ |
| ch06e | 20279 | 0.79000000000000003552714 | 2.000000000000000111022e-01 | 4.000000000000000222045e-01 | 0.000000000000000000000e+00 | 0.0000000000000000000000 | 0.000000000000000000000e+00 | 0.000000000000000000000e+00 | 1.000000000000000000000e+00 | ▇▁▁▁▂ |
| ch07a | 84225 | 0.13000000000000000444089 | 2.300000000000000099920e-01 | 4.199999999999999844569e-01 | 0.000000000000000000000e+00 | 0.0000000000000000000000 | 0.000000000000000000000e+00 | 0.000000000000000000000e+00 | 1.000000000000000000000e+00 | ▇▁▁▁▂ |
| ch07b | 93741 | 0.02999999999999999888978 | 2.022710000000000036364e+03 | 2.430000000000000159872e+00 | 2.014999999999999999879e+03 | 2022.0000000000000000000000 | 2.023999999999999999849e+03 | 2.023999999999999999849e+03 | 2.024999999999999999870e+03 | ▁▁▂▅▇ |
| ch07c | 93741 | 0.02999999999999999888978 | 3.740000000000000213163e+00 | 2.540000000000000035527e+00 | 0.000000000000000000000e+00 | 2.0000000000000000000000 | 3.000000000000000000000e+00 | 6.000000000000000000000e+00 | 1.200000000000000000043e+01 | ▇▆▆▂▁ |
| ch07d | 93741 | 0.02999999999999999888978 | 4.629999999999999893419e+00 | 1.840000000000000079936e+00 | 1.000000000000000000000e+00 | 3.0000000000000000000000 | 5.000000000000000000000e+00 | 6.000000000000000000000e+00 | 7.000000000000000000000e+00 | ▃▂▁▆▇ |
| ch08 | 0 | 1.00000000000000000000000 | 1.900000000000000022204e-01 | 4.000000000000000222045e-01 | 0.000000000000000000000e+00 | 0.0000000000000000000000 | 0.000000000000000000000e+00 | 0.000000000000000000000e+00 | 1.000000000000000000000e+00 | ▇▁▁▁▂ |
| ch09 | 0 | 1.00000000000000000000000 | 1.600000000000000033307e-01 | 3.699999999999999955591e-01 | 0.000000000000000000000e+00 | 0.0000000000000000000000 | 0.000000000000000000000e+00 | 0.000000000000000000000e+00 | 1.000000000000000000000e+00 | ▇▁▁▁▂ |
| ch10a | 12227 | 0.86999999999999999555911 | 4.899999999999999911182e-01 | 5.000000000000000000000e-01 | 0.000000000000000000000e+00 | 0.0000000000000000000000 | 0.000000000000000000000e+00 | 1.000000000000000000000e+00 | 1.000000000000000000000e+00 | ▇▁▁▁▇ |
| ch10b | 54923 | 0.42999999999999999333866 | 2.720000000000000195399e+00 | 1.590000000000000079936e+00 | 1.000000000000000000000e+00 | 2.0000000000000000000000 | 2.000000000000000000000e+00 | 3.000000000000000000000e+00 | 7.000000000000000000000e+00 | ▇▃▁▁▁ |
| ch10c | 0 | 1.00000000000000000000000 | 5.300000000000000266454e-01 | 9.100000000000000310862e-01 | 0.000000000000000000000e+00 | 0.0000000000000000000000 | 0.000000000000000000000e+00 | 1.000000000000000000000e+00 | 4.000000000000000000000e+00 | ▇▂▁▁▁ |
| pt00 | 0 | 1.00000000000000000000000 | 2.189999999999999946709e+00 | 1.110000000000000097700e+00 | 0.000000000000000000000e+00 | 1.0000000000000000000000 | 3.000000000000000000000e+00 | 3.000000000000000000000e+00 | 3.000000000000000000000e+00 | ▂▃▁▁▇ |
| pt01b | 14342 | 0.84999999999999997779554 | 3.604999999999999715740e+01 | 7.940000000000000390799e+00 | 1.800000000000000000087e+01 | 30.0000000000000000000000 | 3.500000000000000000000e+01 | 4.000000000000000000000e+01 | 8.000000000000000000000e+01 | ▅▇▂▁▁ |
| pt01c | 14342 | 0.84999999999999997779554 | 6.899999999999999467093e-01 | 4.600000000000000199840e-01 | 0.000000000000000000000e+00 | 0.0000000000000000000000 | 1.000000000000000000000e+00 | 1.000000000000000000000e+00 | 1.000000000000000000000e+00 | ▃▁▁▁▇ |
| pt01d | 40204 | 0.57999999999999996003197 | 1.530000000000000026645e+00 | 8.900000000000000133227e-01 | 0.000000000000000000000e+00 | 1.0000000000000000000000 | 1.000000000000000000000e+00 | 2.000000000000000000000e+00 | 4.000000000000000000000e+00 | ▂▇▆▃▁ |
| pt01e | 14342 | 0.84999999999999997779554 | 5.000000000000000000000e-01 | 5.000000000000000000000e-01 | 0.000000000000000000000e+00 | 0.0000000000000000000000 | 0.000000000000000000000e+00 | 1.000000000000000000000e+00 | 1.000000000000000000000e+00 | ▇▁▁▁▇ |
| pt01f | 55814 | 0.41999999999999998445688 | 2.100000000000000088818e+00 | 1.199999999999999955591e+00 | 1.000000000000000000000e+00 | 1.0000000000000000000000 | 1.000000000000000000000e+00 | 3.000000000000000000000e+00 | 4.000000000000000000000e+00 | ▇▁▁▅▃ |
| pt02b | 31684 | 0.67000000000000003996803 | 4.299000000000000199212e+01 | 9.539999999999999147349e+00 | 1.800000000000000000087e+01 | 36.0000000000000000000000 | 4.200000000000000000260e+01 | 4.900000000000000000087e+01 | 8.000000000000000000000e+01 | ▂▇▆▂▁ |
| pt02c | 31684 | 0.67000000000000003996803 | 7.299999999999999822364e-01 | 4.400000000000000022204e-01 | 0.000000000000000000000e+00 | 0.0000000000000000000000 | 1.000000000000000000000e+00 | 1.000000000000000000000e+00 | 1.000000000000000000000e+00 | ▃▁▁▁▇ |
| pt02d | 49106 | 0.48999999999999999111822 | 1.600000000000000088818e+00 | 9.200000000000000399680e-01 | 0.000000000000000000000e+00 | 1.0000000000000000000000 | 1.000000000000000000000e+00 | 2.000000000000000000000e+00 | 4.000000000000000000000e+00 | ▂▇▆▃▁ |
| pt02e | 31684 | 0.67000000000000003996803 | 8.699999999999999955591e-01 | 3.400000000000000244249e-01 | 0.000000000000000000000e+00 | 1.0000000000000000000000 | 1.000000000000000000000e+00 | 1.000000000000000000000e+00 | 1.000000000000000000000e+00 | ▁▁▁▁▇ |
| pt02f | 40199 | 0.57999999999999996003197 | 2.339999999999999857891e+00 | 1.250000000000000000000e+00 | 1.000000000000000000000e+00 | 1.0000000000000000000000 | 2.000000000000000000000e+00 | 4.000000000000000000000e+00 | 4.000000000000000000000e+00 | ▇▂▁▅▅ |
| assessment | 0 | 1.00000000000000000000000 | 9.300000000000000488498e-01 | 2.600000000000000088818e-01 | 0.000000000000000000000e+00 | 1.0000000000000000000000 | 1.000000000000000000000e+00 | 1.000000000000000000000e+00 | 1.000000000000000000000e+00 | ▁▁▁▁▇ |
| sample | 7153 | 0.93000000000000004884981 | 1.449999999999999955591e+00 | 5.000000000000000000000e-01 | 1.000000000000000000000e+00 | 1.0000000000000000000000 | 1.000000000000000000000e+00 | 2.000000000000000000000e+00 | 2.000000000000000000000e+00 | ▇▁▁▁▆ |
| l1_1 | 7153 | 0.93000000000000004884981 | 1.169999999999999928946e+00 | 9.100000000000000310862e-01 | 0.000000000000000000000e+00 | 0.0000000000000000000000 | 2.000000000000000000000e+00 | 2.000000000000000000000e+00 | 2.000000000000000000000e+00 | ▆▁▂▁▇ |
| l1_2 | 7153 | 0.93000000000000004884981 | 1.189999999999999946709e+00 | 8.699999999999999955591e-01 | 0.000000000000000000000e+00 | 0.0000000000000000000000 | 1.000000000000000000000e+00 | 2.000000000000000000000e+00 | 2.000000000000000000000e+00 | ▅▁▃▁▇ |
| l1_3 | 7153 | 0.93000000000000004884981 | 1.120000000000000106581e+00 | 8.800000000000000044409e-01 | 0.000000000000000000000e+00 | 0.0000000000000000000000 | 1.000000000000000000000e+00 | 2.000000000000000000000e+00 | 2.000000000000000000000e+00 | ▆▁▃▁▇ |
| l1_4 | 7153 | 0.93000000000000004884981 | 1.030000000000000026645e+00 | 8.800000000000000044409e-01 | 0.000000000000000000000e+00 | 0.0000000000000000000000 | 1.000000000000000000000e+00 | 2.000000000000000000000e+00 | 2.000000000000000000000e+00 | ▇▁▅▁▇ |
| l2_1 | 7153 | 0.93000000000000004884981 | 1.560000000000000053291e+00 | 7.399999999999999911182e-01 | 0.000000000000000000000e+00 | 1.0000000000000000000000 | 2.000000000000000000000e+00 | 2.000000000000000000000e+00 | 2.000000000000000000000e+00 | ▂▁▂▁▇ |
| l2_2 | 7153 | 0.93000000000000004884981 | 1.580000000000000071054e+00 | 7.299999999999999822364e-01 | 0.000000000000000000000e+00 | 1.0000000000000000000000 | 2.000000000000000000000e+00 | 2.000000000000000000000e+00 | 2.000000000000000000000e+00 | ▂▁▂▁▇ |
| l2_3 | 7153 | 0.93000000000000004884981 | 1.540000000000000035527e+00 | 7.500000000000000000000e-01 | 0.000000000000000000000e+00 | 1.0000000000000000000000 | 2.000000000000000000000e+00 | 2.000000000000000000000e+00 | 2.000000000000000000000e+00 | ▂▁▂▁▇ |
| l2_4 | 7153 | 0.93000000000000004884981 | 1.520000000000000017764e+00 | 7.600000000000000088818e-01 | 0.000000000000000000000e+00 | 1.0000000000000000000000 | 2.000000000000000000000e+00 | 2.000000000000000000000e+00 | 2.000000000000000000000e+00 | ▂▁▂▁▇ |
| l2_5 | 7153 | 0.93000000000000004884981 | 1.469999999999999973355e+00 | 7.900000000000000355271e-01 | 0.000000000000000000000e+00 | 1.0000000000000000000000 | 2.000000000000000000000e+00 | 2.000000000000000000000e+00 | 2.000000000000000000000e+00 | ▂▁▂▁▇ |
| l3_1 | 33971 | 0.65000000000000002220446 | 1.689999999999999946709e+00 | 6.500000000000000222045e-01 | 0.000000000000000000000e+00 | 2.0000000000000000000000 | 2.000000000000000000000e+00 | 2.000000000000000000000e+00 | 2.000000000000000000000e+00 | ▁▁▁▁▇ |
| l3_2 | 33971 | 0.65000000000000002220446 | 1.739999999999999991118e+00 | 5.999999999999999777955e-01 | 0.000000000000000000000e+00 | 2.0000000000000000000000 | 2.000000000000000000000e+00 | 2.000000000000000000000e+00 | 2.000000000000000000000e+00 | ▁▁▁▁▇ |
| l3_3 | 33971 | 0.65000000000000002220446 | 1.709999999999999964473e+00 | 6.300000000000000044409e-01 | 0.000000000000000000000e+00 | 2.0000000000000000000000 | 2.000000000000000000000e+00 | 2.000000000000000000000e+00 | 2.000000000000000000000e+00 | ▁▁▁▁▇ |
| l3_4 | 33971 | 0.65000000000000002220446 | 1.689999999999999946709e+00 | 6.400000000000000133227e-01 | 0.000000000000000000000e+00 | 2.0000000000000000000000 | 2.000000000000000000000e+00 | 2.000000000000000000000e+00 | 2.000000000000000000000e+00 | ▁▁▁▁▇ |
| l3_5 | 33971 | 0.65000000000000002220446 | 1.659999999999999920064e+00 | 6.700000000000000399680e-01 | 0.000000000000000000000e+00 | 2.0000000000000000000000 | 2.000000000000000000000e+00 | 2.000000000000000000000e+00 | 2.000000000000000000000e+00 | ▁▁▁▁▇ |
| l4_1 | 33971 | 0.65000000000000002220446 | 1.750000000000000000000e+00 | 5.999999999999999777955e-01 | 0.000000000000000000000e+00 | 2.0000000000000000000000 | 2.000000000000000000000e+00 | 2.000000000000000000000e+00 | 2.000000000000000000000e+00 | ▁▁▁▁▇ |
| l4_2 | 33971 | 0.65000000000000002220446 | 1.699999999999999955591e+00 | 6.300000000000000044409e-01 | 0.000000000000000000000e+00 | 2.0000000000000000000000 | 2.000000000000000000000e+00 | 2.000000000000000000000e+00 | 2.000000000000000000000e+00 | ▁▁▁▁▇ |
| l4_3 | 33971 | 0.65000000000000002220446 | 1.739999999999999991118e+00 | 5.999999999999999777955e-01 | 0.000000000000000000000e+00 | 2.0000000000000000000000 | 2.000000000000000000000e+00 | 2.000000000000000000000e+00 | 2.000000000000000000000e+00 | ▁▁▁▁▇ |
| l4_4 | 33971 | 0.65000000000000002220446 | 1.659999999999999920064e+00 | 6.600000000000000310862e-01 | 0.000000000000000000000e+00 | 2.0000000000000000000000 | 2.000000000000000000000e+00 | 2.000000000000000000000e+00 | 2.000000000000000000000e+00 | ▁▁▁▁▇ |
| l4_5 | 33971 | 0.65000000000000002220446 | 1.639999999999999902300e+00 | 6.800000000000000488498e-01 | 0.000000000000000000000e+00 | 1.0000000000000000000000 | 2.000000000000000000000e+00 | 2.000000000000000000000e+00 | 2.000000000000000000000e+00 | ▁▁▂▁▇ |
| l4_6 | 33971 | 0.65000000000000002220446 | 1.629999999999999893419e+00 | 6.800000000000000488498e-01 | 0.000000000000000000000e+00 | 2.0000000000000000000000 | 2.000000000000000000000e+00 | 2.000000000000000000000e+00 | 2.000000000000000000000e+00 | ▁▁▂▁▇ |
| l5_1 | 45843 | 0.52000000000000001776357 | 1.679999999999999937828e+00 | 6.700000000000000399680e-01 | 0.000000000000000000000e+00 | 2.0000000000000000000000 | 2.000000000000000000000e+00 | 2.000000000000000000000e+00 | 2.000000000000000000000e+00 | ▁▁▁▁▇ |
| l5_2 | 45843 | 0.52000000000000001776357 | 1.639999999999999902300e+00 | 6.899999999999999467093e-01 | 0.000000000000000000000e+00 | 2.0000000000000000000000 | 2.000000000000000000000e+00 | 2.000000000000000000000e+00 | 2.000000000000000000000e+00 | ▁▁▁▁▇ |
| l5_3 | 45843 | 0.52000000000000001776357 | 1.620000000000000106581e+00 | 6.999999999999999555911e-01 | 0.000000000000000000000e+00 | 1.0000000000000000000000 | 2.000000000000000000000e+00 | 2.000000000000000000000e+00 | 2.000000000000000000000e+00 | ▂▁▂▁▇ |
| l5_4 | 45843 | 0.52000000000000001776357 | 1.510000000000000008882e+00 | 7.600000000000000088818e-01 | 0.000000000000000000000e+00 | 1.0000000000000000000000 | 2.000000000000000000000e+00 | 2.000000000000000000000e+00 | 2.000000000000000000000e+00 | ▂▁▂▁▇ |
| l5_5 | 45843 | 0.52000000000000001776357 | 1.250000000000000000000e+00 | 8.599999999999999866773e-01 | 0.000000000000000000000e+00 | 0.0000000000000000000000 | 2.000000000000000000000e+00 | 2.000000000000000000000e+00 | 2.000000000000000000000e+00 | ▅▁▃▁▇ |
| l6_1 | 52471 | 0.46000000000000001998401 | 1.780000000000000026645e+00 | 5.500000000000000444089e-01 | 0.000000000000000000000e+00 | 2.0000000000000000000000 | 2.000000000000000000000e+00 | 2.000000000000000000000e+00 | 2.000000000000000000000e+00 | ▁▁▁▁▇ |
| l6_2 | 52471 | 0.46000000000000001998401 | 1.739999999999999991118e+00 | 5.699999999999999511502e-01 | 0.000000000000000000000e+00 | 2.0000000000000000000000 | 2.000000000000000000000e+00 | 2.000000000000000000000e+00 | 2.000000000000000000000e+00 | ▁▁▁▁▇ |
| l6_3 | 52471 | 0.46000000000000001998401 | 1.699999999999999955591e+00 | 5.999999999999999777955e-01 | 0.000000000000000000000e+00 | 2.0000000000000000000000 | 2.000000000000000000000e+00 | 2.000000000000000000000e+00 | 2.000000000000000000000e+00 | ▁▁▂▁▇ |
| l6_4 | 52471 | 0.46000000000000001998401 | 1.610000000000000097700e+00 | 6.700000000000000399680e-01 | 0.000000000000000000000e+00 | 1.0000000000000000000000 | 2.000000000000000000000e+00 | 2.000000000000000000000e+00 | 2.000000000000000000000e+00 | ▁▁▂▁▇ |
| l6_5 | 52471 | 0.46000000000000001998401 | 1.449999999999999955591e+00 | 7.700000000000000177636e-01 | 0.000000000000000000000e+00 | 1.0000000000000000000000 | 2.000000000000000000000e+00 | 2.000000000000000000000e+00 | 2.000000000000000000000e+00 | ▂▁▃▁▇ |
| n1 | 7153 | 0.93000000000000004884981 | 1.780000000000000026645e+00 | 5.500000000000000444089e-01 | 0.000000000000000000000e+00 | 2.0000000000000000000000 | 2.000000000000000000000e+00 | 2.000000000000000000000e+00 | 2.000000000000000000000e+00 | ▁▁▁▁▇ |
| n2 | 7153 | 0.93000000000000004884981 | 1.709999999999999964473e+00 | 5.799999999999999600320e-01 | 0.000000000000000000000e+00 | 2.0000000000000000000000 | 2.000000000000000000000e+00 | 2.000000000000000000000e+00 | 2.000000000000000000000e+00 | ▁▁▂▁▇ |
| n3 | 7153 | 0.93000000000000004884981 | 1.750000000000000000000e+00 | 5.799999999999999600320e-01 | 0.000000000000000000000e+00 | 2.0000000000000000000000 | 2.000000000000000000000e+00 | 2.000000000000000000000e+00 | 2.000000000000000000000e+00 | ▁▁▁▁▇ |
| n4 | 7153 | 0.93000000000000004884981 | 1.300000000000000044409e+00 | 6.800000000000000488498e-01 | 0.000000000000000000000e+00 | 1.0000000000000000000000 | 1.000000000000000000000e+00 | 2.000000000000000000000e+00 | 2.000000000000000000000e+00 | ▂▁▇▁▇ |
| n5 | 7153 | 0.93000000000000004884981 | 1.560000000000000053291e+00 | 6.899999999999999467093e-01 | 0.000000000000000000000e+00 | 1.0000000000000000000000 | 2.000000000000000000000e+00 | 2.000000000000000000000e+00 | 2.000000000000000000000e+00 | ▂▁▂▁▇ |
| n6 | 7153 | 0.93000000000000004884981 | 1.280000000000000026645e+00 | 7.299999999999999822364e-01 | 0.000000000000000000000e+00 | 1.0000000000000000000000 | 1.000000000000000000000e+00 | 2.000000000000000000000e+00 | 2.000000000000000000000e+00 | ▃▁▇▁▇ |
| n7 | 7153 | 0.93000000000000004884981 | 1.070000000000000062172e+00 | 7.500000000000000000000e-01 | 0.000000000000000000000e+00 | 0.0000000000000000000000 | 1.000000000000000000000e+00 | 2.000000000000000000000e+00 | 2.000000000000000000000e+00 | ▅▁▇▁▆ |
| n8 | 7153 | 0.93000000000000004884981 | 1.000000000000000000000e+00 | 7.399999999999999911182e-01 | 0.000000000000000000000e+00 | 0.0000000000000000000000 | 1.000000000000000000000e+00 | 2.000000000000000000000e+00 | 2.000000000000000000000e+00 | ▅▁▇▁▅ |
| n9 | 7153 | 0.93000000000000004884981 | 1.159999999999999920064e+00 | 8.399999999999999689138e-01 | 0.000000000000000000000e+00 | 0.0000000000000000000000 | 1.000000000000000000000e+00 | 2.000000000000000000000e+00 | 2.000000000000000000000e+00 | ▅▁▅▁▇ |
| n10 | 7153 | 0.93000000000000004884981 | 1.050000000000000044409e+00 | 8.199999999999999511502e-01 | 0.000000000000000000000e+00 | 0.0000000000000000000000 | 1.000000000000000000000e+00 | 2.000000000000000000000e+00 | 2.000000000000000000000e+00 | ▇▁▇▁▇ |
| n11 | 7153 | 0.93000000000000004884981 | 1.669999999999999928946e+00 | 6.400000000000000133227e-01 | 0.000000000000000000000e+00 | 2.0000000000000000000000 | 2.000000000000000000000e+00 | 2.000000000000000000000e+00 | 2.000000000000000000000e+00 | ▁▁▂▁▇ |
| n12 | 7153 | 0.93000000000000004884981 | 1.590000000000000079936e+00 | 6.800000000000000488498e-01 | 0.000000000000000000000e+00 | 1.0000000000000000000000 | 2.000000000000000000000e+00 | 2.000000000000000000000e+00 | 2.000000000000000000000e+00 | ▁▁▂▁▇ |
| n13 | 7153 | 0.93000000000000004884981 | 1.610000000000000097700e+00 | 6.700000000000000399680e-01 | 0.000000000000000000000e+00 | 1.0000000000000000000000 | 2.000000000000000000000e+00 | 2.000000000000000000000e+00 | 2.000000000000000000000e+00 | ▁▁▂▁▇ |
| n14 | 7153 | 0.93000000000000004884981 | 1.530000000000000026645e+00 | 7.099999999999999644729e-01 | 0.000000000000000000000e+00 | 1.0000000000000000000000 | 2.000000000000000000000e+00 | 2.000000000000000000000e+00 | 2.000000000000000000000e+00 | ▂▁▂▁▇ |
| n15 | 7153 | 0.93000000000000004884981 | 1.250000000000000000000e+00 | 6.999999999999999555911e-01 | 0.000000000000000000000e+00 | 1.0000000000000000000000 | 1.000000000000000000000e+00 | 2.000000000000000000000e+00 | 2.000000000000000000000e+00 | ▃▁▇▁▇ |
| n16 | 7153 | 0.93000000000000004884981 | 1.639999999999999902300e+00 | 6.800000000000000488498e-01 | 0.000000000000000000000e+00 | 2.0000000000000000000000 | 2.000000000000000000000e+00 | 2.000000000000000000000e+00 | 2.000000000000000000000e+00 | ▁▁▁▁▇ |
| n17 | 7153 | 0.93000000000000004884981 | 1.659999999999999920064e+00 | 6.800000000000000488498e-01 | 0.000000000000000000000e+00 | 2.0000000000000000000000 | 2.000000000000000000000e+00 | 2.000000000000000000000e+00 | 2.000000000000000000000e+00 | ▁▁▁▁▇ |
| n18 | 7153 | 0.93000000000000004884981 | 1.639999999999999902300e+00 | 6.899999999999999467093e-01 | 0.000000000000000000000e+00 | 2.0000000000000000000000 | 2.000000000000000000000e+00 | 2.000000000000000000000e+00 | 2.000000000000000000000e+00 | ▁▁▁▁▇ |
| n19 | 7153 | 0.93000000000000004884981 | 1.629999999999999893419e+00 | 6.899999999999999467093e-01 | 0.000000000000000000000e+00 | 2.0000000000000000000000 | 2.000000000000000000000e+00 | 2.000000000000000000000e+00 | 2.000000000000000000000e+00 | ▁▁▁▁▇ |
| n20 | 7153 | 0.93000000000000004884981 | 1.620000000000000106581e+00 | 6.999999999999999555911e-01 | 0.000000000000000000000e+00 | 1.0000000000000000000000 | 2.000000000000000000000e+00 | 2.000000000000000000000e+00 | 2.000000000000000000000e+00 | ▂▁▂▁▇ |
| n21 | 7153 | 0.93000000000000004884981 | 1.479999999999999982236e+00 | 7.800000000000000266454e-01 | 0.000000000000000000000e+00 | 1.0000000000000000000000 | 2.000000000000000000000e+00 | 2.000000000000000000000e+00 | 2.000000000000000000000e+00 | ▂▁▂▁▇ |
| n22 | 7153 | 0.93000000000000004884981 | 1.469999999999999973355e+00 | 7.800000000000000266454e-01 | 0.000000000000000000000e+00 | 1.0000000000000000000000 | 2.000000000000000000000e+00 | 2.000000000000000000000e+00 | 2.000000000000000000000e+00 | ▂▁▂▁▇ |
| n23 | 7153 | 0.93000000000000004884981 | 1.439999999999999946709e+00 | 7.900000000000000355271e-01 | 0.000000000000000000000e+00 | 1.0000000000000000000000 | 2.000000000000000000000e+00 | 2.000000000000000000000e+00 | 2.000000000000000000000e+00 | ▂▁▂▁▇ |
| n24 | 7153 | 0.93000000000000004884981 | 1.419999999999999928946e+00 | 8.000000000000000444089e-01 | 0.000000000000000000000e+00 | 1.0000000000000000000000 | 2.000000000000000000000e+00 | 2.000000000000000000000e+00 | 2.000000000000000000000e+00 | ▂▁▂▁▇ |
| n25 | 7153 | 0.93000000000000004884981 | 1.439999999999999946709e+00 | 7.900000000000000355271e-01 | 0.000000000000000000000e+00 | 1.0000000000000000000000 | 2.000000000000000000000e+00 | 2.000000000000000000000e+00 | 2.000000000000000000000e+00 | ▂▁▂▁▇ |
| n26 | 7153 | 0.93000000000000004884981 | 1.449999999999999955591e+00 | 7.800000000000000266454e-01 | 0.000000000000000000000e+00 | 1.0000000000000000000000 | 2.000000000000000000000e+00 | 2.000000000000000000000e+00 | 2.000000000000000000000e+00 | ▂▁▂▁▇ |
| n27 | 7153 | 0.93000000000000004884981 | 1.310000000000000053291e+00 | 8.299999999999999600320e-01 | 0.000000000000000000000e+00 | 1.0000000000000000000000 | 2.000000000000000000000e+00 | 2.000000000000000000000e+00 | 2.000000000000000000000e+00 | ▃▁▃▁▇ |
| n28 | 7153 | 0.93000000000000004884981 | 1.209999999999999964473e+00 | 8.299999999999999600320e-01 | 0.000000000000000000000e+00 | 0.0000000000000000000000 | 1.000000000000000000000e+00 | 2.000000000000000000000e+00 | 2.000000000000000000000e+00 | ▅▁▅▁▇ |
| n29 | 7153 | 0.93000000000000004884981 | 1.070000000000000062172e+00 | 8.199999999999999511502e-01 | 0.000000000000000000000e+00 | 0.0000000000000000000000 | 1.000000000000000000000e+00 | 2.000000000000000000000e+00 | 2.000000000000000000000e+00 | ▆▁▇▁▇ |
| n30 | 7153 | 0.93000000000000004884981 | 1.080000000000000071054e+00 | 8.499999999999999777955e-01 | 0.000000000000000000000e+00 | 0.0000000000000000000000 | 1.000000000000000000000e+00 | 2.000000000000000000000e+00 | 2.000000000000000000000e+00 | ▆▁▅▁▇ |
| n31 | 47996 | 0.50000000000000000000000 | 1.590000000000000079936e+00 | 6.300000000000000044409e-01 | 0.000000000000000000000e+00 | 1.0000000000000000000000 | 2.000000000000000000000e+00 | 2.000000000000000000000e+00 | 2.000000000000000000000e+00 | ▁▁▃▁▇ |
| n32 | 54717 | 0.42999999999999999333866 | 1.520000000000000017764e+00 | 6.400000000000000133227e-01 | 0.000000000000000000000e+00 | 1.0000000000000000000000 | 2.000000000000000000000e+00 | 2.000000000000000000000e+00 | 2.000000000000000000000e+00 | ▁▁▅▁▇ |
| n33 | 63307 | 0.34000000000000002442491 | 1.560000000000000053291e+00 | 6.500000000000000222045e-01 | 0.000000000000000000000e+00 | 1.0000000000000000000000 | 2.000000000000000000000e+00 | 2.000000000000000000000e+00 | 2.000000000000000000000e+00 | ▁▁▃▁▇ |
| n34 | 60377 | 0.36999999999999999555911 | 1.270000000000000017764e+00 | 7.800000000000000266454e-01 | 0.000000000000000000000e+00 | 1.0000000000000000000000 | 1.000000000000000000000e+00 | 2.000000000000000000000e+00 | 2.000000000000000000000e+00 | ▃▁▅▁▇ |
| n35 | 54717 | 0.42999999999999999333866 | 1.370000000000000106581e+00 | 7.299999999999999822364e-01 | 0.000000000000000000000e+00 | 1.0000000000000000000000 | 2.000000000000000000000e+00 | 2.000000000000000000000e+00 | 2.000000000000000000000e+00 | ▂▁▅▁▇ |
| n36 | 60377 | 0.36999999999999999555911 | 1.379999999999999893419e+00 | 7.299999999999999822364e-01 | 0.000000000000000000000e+00 | 1.0000000000000000000000 | 2.000000000000000000000e+00 | 2.000000000000000000000e+00 | 2.000000000000000000000e+00 | ▂▁▅▁▇ |
| IcarAssessTime | 23441 | 0.76000000000000000888178 | 6.509999999999999786837e+00 | 1.623000000000000042676e+01 | 1.020000000000000017764e+00 | 2.7000000000000001776357 | 4.900000000000000355271e+00 | 7.620000000000000106581e+00 | 7.097200000000000272727e+02 | ▇▁▁▁▁ |
| IcanAssessTime | 18215 | 0.81000000000000005329071 | 8.759999999999999786837e+00 | 2.378999999999999914865e+01 | 1.020000000000000017764e+00 | 3.6299999999999998934186 | 6.370000000000000106581e+00 | 9.949999999999999289457e+00 | 7.129800000000000181725e+02 | ▇▁▁▁▁ |
| AssHomeLang | 7153 | 0.93000000000000004884981 | 6.500000000000000222045e-01 | 4.799999999999999822364e-01 | 0.000000000000000000000e+00 | 0.0000000000000000000000 | 1.000000000000000000000e+00 | 1.000000000000000000000e+00 | 1.000000000000000000000e+00 | ▅▁▁▁▇ |
| HHWeightProvided | 0 | 1.00000000000000000000000 | 3.105789999999999963492e+03 | 3.794869999999999890658e+03 | 4.199999999999999844569e-01 | 584.4099999999999681676854 | 2.112119999999999890757e+03 | 3.926999999999999999781e+03 | 5.715222000000000116340e+04 | ▇▁▁▁▁ |
| MPLMath | 7314 | 0.92000000000000003996803 | 4.500000000000000111022e-01 | 5.000000000000000000000e-01 | 0.000000000000000000000e+00 | 0.0000000000000000000000 | 0.000000000000000000000e+00 | 1.000000000000000000000e+00 | 1.000000000000000000000e+00 | ▇▁▁▁▆ |
| MPLReading | 7314 | 0.92000000000000003996803 | 3.099999999999999977796e-01 | 4.600000000000000199840e-01 | 0.000000000000000000000e+00 | 0.0000000000000000000000 | 0.000000000000000000000e+00 | 1.000000000000000000000e+00 | 1.000000000000000000000e+00 | ▇▁▁▁▃ |
| MPLBoth | 7314 | 0.92000000000000003996803 | 2.899999999999999800160e-01 | 4.500000000000000111022e-01 | 0.000000000000000000000e+00 | 0.0000000000000000000000 | 0.000000000000000000000e+00 | 1.000000000000000000000e+00 | 1.000000000000000000000e+00 | ▇▁▁▁▃ |
| MathIRTScore | 7314 | 0.92000000000000003996803 | -4.199999999999999844569e-01 | 1.050000000000000044409e+00 | -3.069999999999999840128e+00 | -1.0700000000000000621725 | -3.499999999999999777955e-01 | 2.000000000000000111022e-01 | 2.140000000000000124345e+00 | ▂▃▇▅▂ |
| StandardErrorMath | 7314 | 0.92000000000000003996803 | 2.300000000000000099920e-01 | 1.700000000000000122125e-01 | 1.199999999999999955591e-01 | 0.1499999999999999944489 | 1.600000000000000033307e-01 | 2.500000000000000000000e-01 | 9.499999999999999555911e-01 | ▇▁▁▁▁ |
| ReadingIRTScore | 7314 | 0.92000000000000003996803 | -3.800000000000000044409e-01 | 1.310000000000000053291e+00 | -3.270000000000000017764e+00 | -1.3200000000000000621725 | -7.000000000000000666134e-02 | 5.100000000000000088818e-01 | 1.899999999999999911182e+00 | ▂▃▅▇▃ |
| StandardErrorReading | 7314 | 0.92000000000000003996803 | 3.499999999999999777955e-01 | 2.800000000000000266454e-01 | 8.000000000000000166533e-02 | 0.1100000000000000005551 | 2.399999999999999911182e-01 | 5.300000000000000266454e-01 | 1.080000000000000071054e+00 | ▇▂▂▁▂ |
5.6 The Five Key dplyr Verbs
5.6.1 Filtering rows with filter()
The filter() function from dplyr is used to extract rows from the data that meet certain conditions.
Here we illustrate the use of filter() by extracting all children from Mozambique.
## # A tibble: 8,255 × 147
## ...1 CountryName TierOneUnit VillageID Location HHID SubmissionDate duration hh06a hh06b1 hh06b2 hh06c hh06d hh07a hh07b hh07c
## <dbl> <chr> <chr> <chr> <chr> <chr> <date> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 1 Mozambique c101 c101 Urban uuid:f580… 2025-09-23 979 4 1 NA 1 1 2 2 1
## 2 2 Mozambique c101 c101 Urban uuid:b98e… 2025-09-23 986 7 1 NA 1 1 2 2 1
## 3 3 Mozambique c101 c101 Urban uuid:2611… 2025-09-23 2167 6 1 NA 1 1 2 2 1
## 4 4 Mozambique c101 c101 Urban uuid:d46b… 2025-09-23 2078 6 1 NA 1 1 1 1 1
## 5 5 Mozambique c101 c101 Urban uuid:df92… 2025-09-23 1251 5 1 NA 1 1 2 2 1
## 6 6 Mozambique c101 c101 Urban uuid:ade5… 2025-09-23 2199 5 1 NA 1 1 2 2 1
## 7 7 Mozambique c101 c101 Urban uuid:3f84… 2025-09-23 2889 5 1 NA 1 1 2 2 1
## 8 8 Mozambique c101 c101 Urban uuid:ff7b… 2025-09-23 1588 4 1 NA 1 0 2 2 1
## 9 9 Mozambique c101 c101 Urban uuid:7730… 2025-09-23 2203 4 1 NA 1 1 2 2 1
## 10 10 Mozambique c101 c101 Urban uuid:e210… 2025-09-24 4647 5 1 NA 2 1 2 2 1
## # ℹ 8,245 more rows
## # ℹ 131 more variables: hh07d <dbl>, hh07e <dbl>, hh07f <dbl>, hh07g <dbl>, hh07h <dbl>, hh07i <dbl>, hh07j <dbl>, hh07k <dbl>,
## # hh07l <dbl>, hh07m <dbl>, hh07n <dbl>, hh07o <dbl>, hh07p <dbl>, hh07p_3 <dbl>, hh07q <dbl>, ChildID <chr>, ch02 <dbl>, ch03 <dbl>,
## # ch04a <dbl>, ch04b <dbl>, ch04c <dbl>, ch04d <dbl>, ch04e <dbl>, ch04f <dbl>, ch05 <dbl>, ch06a <dbl>, ch06b <dbl>, ch06c <dbl>,
## # ch06d <dbl>, ch06e <dbl>, ch07a <dbl>, ch07b <dbl>, ch07c <dbl>, ch07d <dbl>, ch08 <dbl>, ch09 <dbl>, ch10a <dbl>, ch10b <dbl>,
## # ch10c <dbl>, pt00 <dbl>, pt01b <dbl>, pt01c <dbl>, pt01d <dbl>, pt01e <dbl>, pt01f <dbl>, pt02b <dbl>, pt02c <dbl>, pt02d <dbl>,
## # pt02e <dbl>, pt02f <dbl>, assessment <dbl>, sample <dbl>, l1_1 <dbl>, l1_2 <dbl>, l1_3 <dbl>, l1_4 <dbl>, l2_1 <dbl>, l2_2 <dbl>, …
Next we extract the children from Mozambique that received an Reading IRT Score greater than \(1.5\). Multiple filter conditions are created with & (and) and | (or).
## # A tibble: 390 × 147
## ...1 CountryName TierOneUnit VillageID Location HHID SubmissionDate duration hh06a hh06b1 hh06b2 hh06c hh06d hh07a hh07b hh07c
## <dbl> <chr> <chr> <chr> <chr> <chr> <date> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 18 Mozambique c101 c101 Urban uuid:789a… 2025-09-25 2511 7 1 NA 1 1 2 2 1
## 2 35 Mozambique c101 c101 Urban uuid:9a7b… 2025-09-25 2868 8 1 NA 1 1 2 2 1
## 3 204 Mozambique c101 c107 Urban uuid:00db… 2025-09-22 898 6 1 NA 2 1 2 2 1
## 4 221 Mozambique c101 c108 Rural uuid:a3f7… 2025-09-27 1643 12 1 NA 2 1 2 2 1
## 5 252 Mozambique c101 c108 Rural uuid:4b57… 2025-10-24 965 6 1 NA 2 1 2 2 1
## 6 421 Mozambique c101 c114 Rural uuid:b166… 2025-09-22 3217 4 1 NA 1 0 3 3 1
## 7 603 Mozambique c101 c120 Rural uuid:fff9… 2025-09-21 2585 6 1 NA 1 1 2 2 1
## 8 677 Mozambique c101 c122 Rural uuid:6354… 2025-09-27 1443 10 0 NA NA 1 2 2 1
## 9 730 Mozambique c102 c124 Rural uuid:06e4… 2025-10-24 630 4 0 NA NA 1 3 3 2
## 10 731 Mozambique c102 c124 Rural uuid:d48d… 2025-10-24 1187 3 0 NA NA 1 3 3 2
## # ℹ 380 more rows
## # ℹ 131 more variables: hh07d <dbl>, hh07e <dbl>, hh07f <dbl>, hh07g <dbl>, hh07h <dbl>, hh07i <dbl>, hh07j <dbl>, hh07k <dbl>,
## # hh07l <dbl>, hh07m <dbl>, hh07n <dbl>, hh07o <dbl>, hh07p <dbl>, hh07p_3 <dbl>, hh07q <dbl>, ChildID <chr>, ch02 <dbl>, ch03 <dbl>,
## # ch04a <dbl>, ch04b <dbl>, ch04c <dbl>, ch04d <dbl>, ch04e <dbl>, ch04f <dbl>, ch05 <dbl>, ch06a <dbl>, ch06b <dbl>, ch06c <dbl>,
## # ch06d <dbl>, ch06e <dbl>, ch07a <dbl>, ch07b <dbl>, ch07c <dbl>, ch07d <dbl>, ch08 <dbl>, ch09 <dbl>, ch10a <dbl>, ch10b <dbl>,
## # ch10c <dbl>, pt00 <dbl>, pt01b <dbl>, pt01c <dbl>, pt01d <dbl>, pt01e <dbl>, pt01f <dbl>, pt02b <dbl>, pt02c <dbl>, pt02d <dbl>,
## # pt02e <dbl>, pt02f <dbl>, assessment <dbl>, sample <dbl>, l1_1 <dbl>, l1_2 <dbl>, l1_3 <dbl>, l1_4 <dbl>, l2_1 <dbl>, l2_2 <dbl>, …
Here’s another way of writing the same condition as above:
## # A tibble: 390 × 147
## ...1 CountryName TierOneUnit VillageID Location HHID SubmissionDate duration hh06a hh06b1 hh06b2 hh06c hh06d hh07a hh07b hh07c
## <dbl> <chr> <chr> <chr> <chr> <chr> <date> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 18 Mozambique c101 c101 Urban uuid:789a… 2025-09-25 2511 7 1 NA 1 1 2 2 1
## 2 35 Mozambique c101 c101 Urban uuid:9a7b… 2025-09-25 2868 8 1 NA 1 1 2 2 1
## 3 204 Mozambique c101 c107 Urban uuid:00db… 2025-09-22 898 6 1 NA 2 1 2 2 1
## 4 221 Mozambique c101 c108 Rural uuid:a3f7… 2025-09-27 1643 12 1 NA 2 1 2 2 1
## 5 252 Mozambique c101 c108 Rural uuid:4b57… 2025-10-24 965 6 1 NA 2 1 2 2 1
## 6 421 Mozambique c101 c114 Rural uuid:b166… 2025-09-22 3217 4 1 NA 1 0 3 3 1
## 7 603 Mozambique c101 c120 Rural uuid:fff9… 2025-09-21 2585 6 1 NA 1 1 2 2 1
## 8 677 Mozambique c101 c122 Rural uuid:6354… 2025-09-27 1443 10 0 NA NA 1 2 2 1
## 9 730 Mozambique c102 c124 Rural uuid:06e4… 2025-10-24 630 4 0 NA NA 1 3 3 2
## 10 731 Mozambique c102 c124 Rural uuid:d48d… 2025-10-24 1187 3 0 NA NA 1 3 3 2
## # ℹ 380 more rows
## # ℹ 131 more variables: hh07d <dbl>, hh07e <dbl>, hh07f <dbl>, hh07g <dbl>, hh07h <dbl>, hh07i <dbl>, hh07j <dbl>, hh07k <dbl>,
## # hh07l <dbl>, hh07m <dbl>, hh07n <dbl>, hh07o <dbl>, hh07p <dbl>, hh07p_3 <dbl>, hh07q <dbl>, ChildID <chr>, ch02 <dbl>, ch03 <dbl>,
## # ch04a <dbl>, ch04b <dbl>, ch04c <dbl>, ch04d <dbl>, ch04e <dbl>, ch04f <dbl>, ch05 <dbl>, ch06a <dbl>, ch06b <dbl>, ch06c <dbl>,
## # ch06d <dbl>, ch06e <dbl>, ch07a <dbl>, ch07b <dbl>, ch07c <dbl>, ch07d <dbl>, ch08 <dbl>, ch09 <dbl>, ch10a <dbl>, ch10b <dbl>,
## # ch10c <dbl>, pt00 <dbl>, pt01b <dbl>, pt01c <dbl>, pt01d <dbl>, pt01e <dbl>, pt01f <dbl>, pt02b <dbl>, pt02c <dbl>, pt02d <dbl>,
## # pt02e <dbl>, pt02f <dbl>, assessment <dbl>, sample <dbl>, l1_1 <dbl>, l1_2 <dbl>, l1_3 <dbl>, l1_4 <dbl>, l2_1 <dbl>, l2_2 <dbl>, …
The %in% command is often useful when using dplyr verbs:
# Extract children from Mozambique who are ages 5, 9, and 13 years old and have an Reading IRT Score greater than 1.5
filter(dat, CountryName == "Mozambique", ch02 %in% c(5, 9, 13), ReadingIRTScore > 1.5)## # A tibble: 74 × 147
## ...1 CountryName TierOneUnit VillageID Location HHID SubmissionDate duration hh06a hh06b1 hh06b2 hh06c hh06d hh07a hh07b hh07c
## <dbl> <chr> <chr> <chr> <chr> <chr> <date> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 18 Mozambique c101 c101 Urban uuid:789a… 2025-09-25 2511 7 1 NA 1 1 2 2 1
## 2 252 Mozambique c101 c108 Rural uuid:4b57… 2025-10-24 965 6 1 NA 2 1 2 2 1
## 3 1083 Mozambique c102 c138 Urban uuid:d763… 2025-10-02 720 3 1 NA 1 1 3 3 1
## 4 1199 Mozambique c102 c141 Rural uuid:5603… 2025-10-09 581 4 1 NA 2 1 3 3 1
## 5 2165 Mozambique c103 c173 Urban uuid:a7af… 2025-08-11 2111 6 0 NA NA 0 3 3 4
## 6 2265 Mozambique c104 c176 Urban uuid:7fcf… 2025-09-22 2155 7 1 NA 1 1 2 2 1
## 7 2291 Mozambique c104 c177 Urban uuid:7363… 2025-09-18 991 3 1 NA 1 0 3 3 1
## 8 2343 Mozambique c104 c178 Urban uuid:63b9… 2025-09-22 2007 6 1 NA 1 1 2 2 1
## 9 2737 Mozambique c104 c187 Rural uuid:c37b… 2025-10-24 504 5 1 NA 2 1 3 3 4
## 10 3305 Mozambique c104 c205 Rural uuid:70cf… 2025-09-19 8859 2 1 NA 2 1 3 3 2
## # ℹ 64 more rows
## # ℹ 131 more variables: hh07d <dbl>, hh07e <dbl>, hh07f <dbl>, hh07g <dbl>, hh07h <dbl>, hh07i <dbl>, hh07j <dbl>, hh07k <dbl>,
## # hh07l <dbl>, hh07m <dbl>, hh07n <dbl>, hh07o <dbl>, hh07p <dbl>, hh07p_3 <dbl>, hh07q <dbl>, ChildID <chr>, ch02 <dbl>, ch03 <dbl>,
## # ch04a <dbl>, ch04b <dbl>, ch04c <dbl>, ch04d <dbl>, ch04e <dbl>, ch04f <dbl>, ch05 <dbl>, ch06a <dbl>, ch06b <dbl>, ch06c <dbl>,
## # ch06d <dbl>, ch06e <dbl>, ch07a <dbl>, ch07b <dbl>, ch07c <dbl>, ch07d <dbl>, ch08 <dbl>, ch09 <dbl>, ch10a <dbl>, ch10b <dbl>,
## # ch10c <dbl>, pt00 <dbl>, pt01b <dbl>, pt01c <dbl>, pt01d <dbl>, pt01e <dbl>, pt01f <dbl>, pt02b <dbl>, pt02c <dbl>, pt02d <dbl>,
## # pt02e <dbl>, pt02f <dbl>, assessment <dbl>, sample <dbl>, l1_1 <dbl>, l1_2 <dbl>, l1_3 <dbl>, l1_4 <dbl>, l2_1 <dbl>, l2_2 <dbl>, …
Let’s practice filtering data in R:
# Extract children from Kenya and Mali living in urban areas and are males
filter(dat, CountryName %in% c("Kenya", "Mali"), Location == "Urban", ch03 == 2)
# Extract children that live in a household with 5 to 10 individuals inclusive and were assessed in French
filter(dat, hh06a >= 5 & hh06a <= 10, AssessmentLanguage == "French")Try the following exercises on your own:
Extract all children who were assessed in English and met the minimum proficiency level for reading.
How many children from the previous question are from Tanzania.
Use
%in%to filter children from question 1 that live in 2, 9, or 20 indivdual household and receive help with homework.How many children from the previous question are assisted by their mother with homework?
5.6.2 Introducing the pipe
The pipe operator %>% is a very useful way of chaining together multiple operations. A typical format is something like:
data %>% operation 1 %>% operation 2
You read the code from left to right: Start with data, apply some operation (operation 1) to it, get a result, and then apply another operation (operation 2) to that result, to generate another result (the final result, in this example). A useful way to think of the pipe is as similar to “then”.
The main goal of the pipe is to make code easier, by focusing on the transformations rather than on what is being transformed. Usually this is the case, but it’s also possible to get carried away and end up with a huge whack of piped statements. Deciding when to break a block up is an art best learned by experience.
Example
## # A tibble: 8,255 × 147
## ...1 CountryName TierOneUnit VillageID Location HHID SubmissionDate duration hh06a hh06b1 hh06b2 hh06c hh06d hh07a hh07b hh07c
## <dbl> <chr> <chr> <chr> <chr> <chr> <date> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 1 Mozambique c101 c101 Urban uuid:f580… 2025-09-23 979 4 1 NA 1 1 2 2 1
## 2 2 Mozambique c101 c101 Urban uuid:b98e… 2025-09-23 986 7 1 NA 1 1 2 2 1
## 3 3 Mozambique c101 c101 Urban uuid:2611… 2025-09-23 2167 6 1 NA 1 1 2 2 1
## 4 4 Mozambique c101 c101 Urban uuid:d46b… 2025-09-23 2078 6 1 NA 1 1 1 1 1
## 5 5 Mozambique c101 c101 Urban uuid:df92… 2025-09-23 1251 5 1 NA 1 1 2 2 1
## 6 6 Mozambique c101 c101 Urban uuid:ade5… 2025-09-23 2199 5 1 NA 1 1 2 2 1
## 7 7 Mozambique c101 c101 Urban uuid:3f84… 2025-09-23 2889 5 1 NA 1 1 2 2 1
## 8 8 Mozambique c101 c101 Urban uuid:ff7b… 2025-09-23 1588 4 1 NA 1 0 2 2 1
## 9 9 Mozambique c101 c101 Urban uuid:7730… 2025-09-23 2203 4 1 NA 1 1 2 2 1
## 10 10 Mozambique c101 c101 Urban uuid:e210… 2025-09-24 4647 5 1 NA 2 1 2 2 1
## # ℹ 8,245 more rows
## # ℹ 131 more variables: hh07d <dbl>, hh07e <dbl>, hh07f <dbl>, hh07g <dbl>, hh07h <dbl>, hh07i <dbl>, hh07j <dbl>, hh07k <dbl>,
## # hh07l <dbl>, hh07m <dbl>, hh07n <dbl>, hh07o <dbl>, hh07p <dbl>, hh07p_3 <dbl>, hh07q <dbl>, ChildID <chr>, ch02 <dbl>, ch03 <dbl>,
## # ch04a <dbl>, ch04b <dbl>, ch04c <dbl>, ch04d <dbl>, ch04e <dbl>, ch04f <dbl>, ch05 <dbl>, ch06a <dbl>, ch06b <dbl>, ch06c <dbl>,
## # ch06d <dbl>, ch06e <dbl>, ch07a <dbl>, ch07b <dbl>, ch07c <dbl>, ch07d <dbl>, ch08 <dbl>, ch09 <dbl>, ch10a <dbl>, ch10b <dbl>,
## # ch10c <dbl>, pt00 <dbl>, pt01b <dbl>, pt01c <dbl>, pt01d <dbl>, pt01e <dbl>, pt01f <dbl>, pt02b <dbl>, pt02c <dbl>, pt02d <dbl>,
## # pt02e <dbl>, pt02f <dbl>, assessment <dbl>, sample <dbl>, l1_1 <dbl>, l1_2 <dbl>, l1_3 <dbl>, l1_4 <dbl>, l2_1 <dbl>, l2_2 <dbl>, …
The main usefulness of the pipe is when combining multiple operations:
Guided practice
5.6.3 Arranging rows with arrange()
The arrange() function from dplyr lets you order rows in your data based on one or more columns.
Example
Ordering children from Mozambique in a descending order of age (note the use of desc):
## # A tibble: 390 × 147
## ...1 CountryName TierOneUnit VillageID Location HHID SubmissionDate duration hh06a hh06b1 hh06b2 hh06c hh06d hh07a hh07b hh07c
## <dbl> <chr> <chr> <chr> <chr> <chr> <date> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 35 Mozambique c101 c101 Urban uuid:9a7b… 2025-09-25 2868 8 1 NA 1 1 2 2 1
## 2 677 Mozambique c101 c122 Rural uuid:6354… 2025-09-27 1443 10 0 NA NA 1 2 2 1
## 3 794 Mozambique c102 c127 Urban uuid:4c66… 2025-09-25 404 4 0 NA NA 1 3 3 2
## 4 1898 Mozambique c103 c165 Rural uuid:45c8… 2025-08-09 437 3 1 NA 2 1 2 2 1
## 5 2281 Mozambique c104 c176 Urban uuid:04e1… 2025-09-22 3718 9 1 NA 1 1 2 2 1
## 6 2310 Mozambique c104 c177 Urban uuid:b01c… 2025-09-18 2063 7 1 NA 1 1 2 2 1
## 7 2321 Mozambique c104 c177 Urban uuid:0f3a… 2025-09-18 4206 9 1 NA 1 1 3 3 1
## 8 3251 Mozambique c104 c203 Rural uuid:3877… 2025-09-21 643 3 0 NA NA 0 3 3 4
## 9 3355 Mozambique c105 c206 Urban uuid:0f4a… 2025-09-22 2927 7 1 NA 1 0 2 2 1
## 10 3378 Mozambique c105 c207 Urban uuid:7ecd… 2025-09-26 1052 5 1 NA 1 1 2 2 1
## # ℹ 380 more rows
## # ℹ 131 more variables: hh07d <dbl>, hh07e <dbl>, hh07f <dbl>, hh07g <dbl>, hh07h <dbl>, hh07i <dbl>, hh07j <dbl>, hh07k <dbl>,
## # hh07l <dbl>, hh07m <dbl>, hh07n <dbl>, hh07o <dbl>, hh07p <dbl>, hh07p_3 <dbl>, hh07q <dbl>, ChildID <chr>, ch02 <dbl>, ch03 <dbl>,
## # ch04a <dbl>, ch04b <dbl>, ch04c <dbl>, ch04d <dbl>, ch04e <dbl>, ch04f <dbl>, ch05 <dbl>, ch06a <dbl>, ch06b <dbl>, ch06c <dbl>,
## # ch06d <dbl>, ch06e <dbl>, ch07a <dbl>, ch07b <dbl>, ch07c <dbl>, ch07d <dbl>, ch08 <dbl>, ch09 <dbl>, ch10a <dbl>, ch10b <dbl>,
## # ch10c <dbl>, pt00 <dbl>, pt01b <dbl>, pt01c <dbl>, pt01d <dbl>, pt01e <dbl>, pt01f <dbl>, pt02b <dbl>, pt02c <dbl>, pt02d <dbl>,
## # pt02e <dbl>, pt02f <dbl>, assessment <dbl>, sample <dbl>, l1_1 <dbl>, l1_2 <dbl>, l1_3 <dbl>, l1_4 <dbl>, l2_1 <dbl>, l2_2 <dbl>, …
Subsequent arguments to arrange() can be used to arrange by multiple columns. Here we first children from Mozambique by age (in descending order) and then by household size (in ascending order)
Guided practice
We can also use the pipe to do the same thing:
Example
## # A tibble: 390 × 147
## ...1 CountryName TierOneUnit VillageID Location HHID SubmissionDate duration hh06a hh06b1 hh06b2 hh06c hh06d hh07a hh07b hh07c
## <dbl> <chr> <chr> <chr> <chr> <chr> <date> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 35 Mozambique c101 c101 Urban uuid:9a7b… 2025-09-25 2868 8 1 NA 1 1 2 2 1
## 2 677 Mozambique c101 c122 Rural uuid:6354… 2025-09-27 1443 10 0 NA NA 1 2 2 1
## 3 794 Mozambique c102 c127 Urban uuid:4c66… 2025-09-25 404 4 0 NA NA 1 3 3 2
## 4 1898 Mozambique c103 c165 Rural uuid:45c8… 2025-08-09 437 3 1 NA 2 1 2 2 1
## 5 2281 Mozambique c104 c176 Urban uuid:04e1… 2025-09-22 3718 9 1 NA 1 1 2 2 1
## 6 2310 Mozambique c104 c177 Urban uuid:b01c… 2025-09-18 2063 7 1 NA 1 1 2 2 1
## 7 2321 Mozambique c104 c177 Urban uuid:0f3a… 2025-09-18 4206 9 1 NA 1 1 3 3 1
## 8 3251 Mozambique c104 c203 Rural uuid:3877… 2025-09-21 643 3 0 NA NA 0 3 3 4
## 9 3355 Mozambique c105 c206 Urban uuid:0f4a… 2025-09-22 2927 7 1 NA 1 0 2 2 1
## 10 3378 Mozambique c105 c207 Urban uuid:7ecd… 2025-09-26 1052 5 1 NA 1 1 2 2 1
## # ℹ 380 more rows
## # ℹ 131 more variables: hh07d <dbl>, hh07e <dbl>, hh07f <dbl>, hh07g <dbl>, hh07h <dbl>, hh07i <dbl>, hh07j <dbl>, hh07k <dbl>,
## # hh07l <dbl>, hh07m <dbl>, hh07n <dbl>, hh07o <dbl>, hh07p <dbl>, hh07p_3 <dbl>, hh07q <dbl>, ChildID <chr>, ch02 <dbl>, ch03 <dbl>,
## # ch04a <dbl>, ch04b <dbl>, ch04c <dbl>, ch04d <dbl>, ch04e <dbl>, ch04f <dbl>, ch05 <dbl>, ch06a <dbl>, ch06b <dbl>, ch06c <dbl>,
## # ch06d <dbl>, ch06e <dbl>, ch07a <dbl>, ch07b <dbl>, ch07c <dbl>, ch07d <dbl>, ch08 <dbl>, ch09 <dbl>, ch10a <dbl>, ch10b <dbl>,
## # ch10c <dbl>, pt00 <dbl>, pt01b <dbl>, pt01c <dbl>, pt01d <dbl>, pt01e <dbl>, pt01f <dbl>, pt02b <dbl>, pt02c <dbl>, pt02d <dbl>,
## # pt02e <dbl>, pt02f <dbl>, assessment <dbl>, sample <dbl>, l1_1 <dbl>, l1_2 <dbl>, l1_3 <dbl>, l1_4 <dbl>, l2_1 <dbl>, l2_2 <dbl>, …
Finally, here’s an example of combining filter and arrange operations with the pipe:
## # A tibble: 390 × 147
## ...1 CountryName TierOneUnit VillageID Location HHID SubmissionDate duration hh06a hh06b1 hh06b2 hh06c hh06d hh07a hh07b hh07c
## <dbl> <chr> <chr> <chr> <chr> <chr> <date> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 35 Mozambique c101 c101 Urban uuid:9a7b… 2025-09-25 2868 8 1 NA 1 1 2 2 1
## 2 677 Mozambique c101 c122 Rural uuid:6354… 2025-09-27 1443 10 0 NA NA 1 2 2 1
## 3 794 Mozambique c102 c127 Urban uuid:4c66… 2025-09-25 404 4 0 NA NA 1 3 3 2
## 4 1898 Mozambique c103 c165 Rural uuid:45c8… 2025-08-09 437 3 1 NA 2 1 2 2 1
## 5 2281 Mozambique c104 c176 Urban uuid:04e1… 2025-09-22 3718 9 1 NA 1 1 2 2 1
## 6 2310 Mozambique c104 c177 Urban uuid:b01c… 2025-09-18 2063 7 1 NA 1 1 2 2 1
## 7 2321 Mozambique c104 c177 Urban uuid:0f3a… 2025-09-18 4206 9 1 NA 1 1 3 3 1
## 8 3251 Mozambique c104 c203 Rural uuid:3877… 2025-09-21 643 3 0 NA NA 0 3 3 4
## 9 3355 Mozambique c105 c206 Urban uuid:0f4a… 2025-09-22 2927 7 1 NA 1 0 2 2 1
## 10 3378 Mozambique c105 c207 Urban uuid:7ecd… 2025-09-26 1052 5 1 NA 1 1 2 2 1
## # ℹ 380 more rows
## # ℹ 131 more variables: hh07d <dbl>, hh07e <dbl>, hh07f <dbl>, hh07g <dbl>, hh07h <dbl>, hh07i <dbl>, hh07j <dbl>, hh07k <dbl>,
## # hh07l <dbl>, hh07m <dbl>, hh07n <dbl>, hh07o <dbl>, hh07p <dbl>, hh07p_3 <dbl>, hh07q <dbl>, ChildID <chr>, ch02 <dbl>, ch03 <dbl>,
## # ch04a <dbl>, ch04b <dbl>, ch04c <dbl>, ch04d <dbl>, ch04e <dbl>, ch04f <dbl>, ch05 <dbl>, ch06a <dbl>, ch06b <dbl>, ch06c <dbl>,
## # ch06d <dbl>, ch06e <dbl>, ch07a <dbl>, ch07b <dbl>, ch07c <dbl>, ch07d <dbl>, ch08 <dbl>, ch09 <dbl>, ch10a <dbl>, ch10b <dbl>,
## # ch10c <dbl>, pt00 <dbl>, pt01b <dbl>, pt01c <dbl>, pt01d <dbl>, pt01e <dbl>, pt01f <dbl>, pt02b <dbl>, pt02c <dbl>, pt02d <dbl>,
## # pt02e <dbl>, pt02f <dbl>, assessment <dbl>, sample <dbl>, l1_1 <dbl>, l1_2 <dbl>, l1_3 <dbl>, l1_4 <dbl>, l2_1 <dbl>, l2_2 <dbl>, …
Guided practice
Let’s attempt these exercises:
- Arrange children from urban Mali who receive help with homework in descending order using their total assessment time for math
#Tip: Use desc() to sort a column in descending order. Without desc(), arrange() sorts in ascending order by default.Independent Practice
Filter children who are 5 years old who have a lot of difficulty in seeing and attained an Math IRT score greater than 2 arrange them by descending ICAN assessment time. Combine
filter(),arrange(), and%>%into a single pipeline.Sort
datto find the most slowest child in reading assessment time. Did they meet minimum proficiency level for reading?How could you use arrange() to sort all missing values to the start? (Hint: use is.na()).
5.6.4 Selecting columns with select()
The ICAN-ICAR 2025 survey data has \(146\) columns. This is not unusual if one works with big data. It is highly likely that there is few columns (or variables) that you would be interested in. In that case, select() allows you to rapidly zoom in on a useful subset using operations based on the names of the variables.
select() is a quite similar to filter() for columns. The syntax is straightforward, the first argument gives the data, and then you list the variables you want to select!
Example
## # A tibble: 96,452 × 4
## ChildID Location ch02 MPLBoth
## <chr> <chr> <dbl> <dbl>
## 1 uuid:f5808f62-6a22-4247-ae68-9eb891657449_child1 Urban 8 0
## 2 uuid:b98e2157-5941-457c-8249-cea77f3c4b47_child1 Urban 12 0
## 3 uuid:26117653-2d45-474f-a7ea-6fa591e1bfe7_child1 Urban 8 0
## 4 uuid:d46bcfd9-f3e5-44ff-825e-6371ee6a1b9c_child1 Urban 9 0
## 5 uuid:df927f9a-f03b-480b-ada3-ddfde15b8ae8_child1 Urban 10 0
## 6 uuid:ade5f0d8-768a-479c-956b-976e569ff49d_child1 Urban 13 0
## 7 uuid:3f84d69b-e298-43d8-8c77-0ae0e5ad4ad8_child1 Urban 8 0
## 8 uuid:ff7b1d96-0e69-400c-955e-22097e185a4b_child1 Urban 8 1
## 9 uuid:7730b493-addc-4c60-8a8e-9d31376770c2_child1 Urban 16 0
## 10 uuid:e2108c51-8a38-452a-94f5-dd128cf38990_child1 Urban 8 0
## # ℹ 96,442 more rows
To exclude variables just put a minus sign in front of them:
Example
## # A tibble: 96,452 × 3
## ChildID ch02 MPLBoth
## <chr> <dbl> <dbl>
## 1 uuid:f5808f62-6a22-4247-ae68-9eb891657449_child1 8 0
## 2 uuid:b98e2157-5941-457c-8249-cea77f3c4b47_child1 12 0
## 3 uuid:26117653-2d45-474f-a7ea-6fa591e1bfe7_child1 8 0
## 4 uuid:d46bcfd9-f3e5-44ff-825e-6371ee6a1b9c_child1 9 0
## 5 uuid:df927f9a-f03b-480b-ada3-ddfde15b8ae8_child1 10 0
## 6 uuid:ade5f0d8-768a-479c-956b-976e569ff49d_child1 13 0
## 7 uuid:3f84d69b-e298-43d8-8c77-0ae0e5ad4ad8_child1 8 0
## 8 uuid:ff7b1d96-0e69-400c-955e-22097e185a4b_child1 8 1
## 9 uuid:7730b493-addc-4c60-8a8e-9d31376770c2_child1 16 0
## 10 uuid:e2108c51-8a38-452a-94f5-dd128cf38990_child1 8 0
## # ℹ 96,442 more rows
You can also use select() to reorder variables. A useful function here is everything(). This is useful if you have a handful of variables you’d like to move to the start of the data frame.
There are other helper functions that you can use within select():
- starts_with("abc") matches column names that begin with “abc”.
- ends_with("xyz") matches column names that end with "xyz". -contains(“ijk”) matches column names that contain “ijk”.
- num_range("x", 1:3) matchesx1,x2, andx3`.
# match columns that begin with "l1" but exclude location variable
dat %>%
select(ChildID, starts_with("l1"))## # A tibble: 96,452 × 5
## ChildID l1_1 l1_2 l1_3 l1_4
## <chr> <dbl> <dbl> <dbl> <dbl>
## 1 uuid:f5808f62-6a22-4247-ae68-9eb891657449_child1 1 1 1 1
## 2 uuid:b98e2157-5941-457c-8249-cea77f3c4b47_child1 2 2 1 2
## 3 uuid:26117653-2d45-474f-a7ea-6fa591e1bfe7_child1 0 0 0 0
## 4 uuid:d46bcfd9-f3e5-44ff-825e-6371ee6a1b9c_child1 2 1 2 2
## 5 uuid:df927f9a-f03b-480b-ada3-ddfde15b8ae8_child1 1 1 1 1
## 6 uuid:ade5f0d8-768a-479c-956b-976e569ff49d_child1 1 1 1 1
## 7 uuid:3f84d69b-e298-43d8-8c77-0ae0e5ad4ad8_child1 1 2 2 2
## 8 uuid:ff7b1d96-0e69-400c-955e-22097e185a4b_child1 2 2 2 2
## 9 uuid:7730b493-addc-4c60-8a8e-9d31376770c2_child1 2 1 2 1
## 10 uuid:e2108c51-8a38-452a-94f5-dd128cf38990_child1 2 1 0 1
## # ℹ 96,442 more rows
# match column names that contain "hh" (these are household-related variables)
dat %>%
select(ChildID, contains("hh"), -HHWeightProvided)## # A tibble: 96,452 × 25
## ChildID HHID hh06a hh06b1 hh06b2 hh06c hh06d hh07a hh07b hh07c hh07d hh07e hh07f hh07g hh07h hh07i hh07j hh07k hh07l hh07m hh07n
## <chr> <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 uuid:f5808… uuid… 4 1 NA 1 1 2 2 1 1 1 1 1 1 1 0 1 0 1 0
## 2 uuid:b98e2… uuid… 7 1 NA 1 1 2 2 1 1 1 1 1 1 1 1 1 1 1 0
## 3 uuid:26117… uuid… 6 1 NA 1 1 2 2 1 1 1 1 1 1 1 0 1 1 1 1
## 4 uuid:d46bc… uuid… 6 1 NA 1 1 1 1 1 4 1 1 1 1 1 0 1 1 1 0
## 5 uuid:df927… uuid… 5 1 NA 1 1 2 2 1 4 1 1 1 0 0 0 1 0 0 NA
## 6 uuid:ade5f… uuid… 5 1 NA 1 1 2 2 1 4 1 1 1 1 1 0 1 0 1 1
## 7 uuid:3f84d… uuid… 5 1 NA 1 1 2 2 1 1 1 1 1 1 1 1 1 1 1 1
## 8 uuid:ff7b1… uuid… 4 1 NA 1 0 2 2 1 1 1 1 1 1 0 1 1 0 1 1
## 9 uuid:7730b… uuid… 4 1 NA 1 1 2 2 1 4 1 1 1 1 0 0 1 0 1 1
## 10 uuid:e2108… uuid… 5 1 NA 2 1 2 2 1 4 1 1 0 0 0 0 0 1 1 1
## # ℹ 96,442 more rows
## # ℹ 4 more variables: hh07o <dbl>, hh07p <dbl>, hh07p_3 <dbl>, hh07q <dbl>
## # A tibble: 96,452 × 4
## ChildID l2_1 l2_2 l2_3
## <chr> <dbl> <dbl> <dbl>
## 1 uuid:f5808f62-6a22-4247-ae68-9eb891657449_child1 0 0 0
## 2 uuid:b98e2157-5941-457c-8249-cea77f3c4b47_child1 2 1 2
## 3 uuid:26117653-2d45-474f-a7ea-6fa591e1bfe7_child1 0 0 0
## 4 uuid:d46bcfd9-f3e5-44ff-825e-6371ee6a1b9c_child1 2 2 2
## 5 uuid:df927f9a-f03b-480b-ada3-ddfde15b8ae8_child1 1 1 1
## 6 uuid:ade5f0d8-768a-479c-956b-976e569ff49d_child1 2 2 2
## 7 uuid:3f84d69b-e298-43d8-8c77-0ae0e5ad4ad8_child1 0 2 2
## 8 uuid:ff7b1d96-0e69-400c-955e-22097e185a4b_child1 2 2 2
## 9 uuid:7730b493-addc-4c60-8a8e-9d31376770c2_child1 2 2 2
## 10 uuid:e2108c51-8a38-452a-94f5-dd128cf38990_child1 1 1 1
## # ℹ 96,442 more rows
See ?select() for more details
Independent Practice
Select only the child-relatd columns.
Exclude the numeracy-related items.
5.6.5 Adding new variables with mutate()
Mutating operations add a new column to a dataframe using existing columns. Here’s a trivial example to get started:
## # A tibble: 96,452 × 148
## ...1 CountryName TierOneUnit VillageID Location HHID SubmissionDate duration hh06a hh06b1 hh06b2 hh06c hh06d hh07a hh07b hh07c
## <dbl> <chr> <chr> <chr> <chr> <chr> <date> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 1 Mozambique c101 c101 Urban uuid:f580… 2025-09-23 979 4 1 NA 1 1 2 2 1
## 2 2 Mozambique c101 c101 Urban uuid:b98e… 2025-09-23 986 7 1 NA 1 1 2 2 1
## 3 3 Mozambique c101 c101 Urban uuid:2611… 2025-09-23 2167 6 1 NA 1 1 2 2 1
## 4 4 Mozambique c101 c101 Urban uuid:d46b… 2025-09-23 2078 6 1 NA 1 1 1 1 1
## 5 5 Mozambique c101 c101 Urban uuid:df92… 2025-09-23 1251 5 1 NA 1 1 2 2 1
## 6 6 Mozambique c101 c101 Urban uuid:ade5… 2025-09-23 2199 5 1 NA 1 1 2 2 1
## 7 7 Mozambique c101 c101 Urban uuid:3f84… 2025-09-23 2889 5 1 NA 1 1 2 2 1
## 8 8 Mozambique c101 c101 Urban uuid:ff7b… 2025-09-23 1588 4 1 NA 1 0 2 2 1
## 9 9 Mozambique c101 c101 Urban uuid:7730… 2025-09-23 2203 4 1 NA 1 1 2 2 1
## 10 10 Mozambique c101 c101 Urban uuid:e210… 2025-09-24 4647 5 1 NA 2 1 2 2 1
## # ℹ 96,442 more rows
## # ℹ 132 more variables: hh07d <dbl>, hh07e <dbl>, hh07f <dbl>, hh07g <dbl>, hh07h <dbl>, hh07i <dbl>, hh07j <dbl>, hh07k <dbl>,
## # hh07l <dbl>, hh07m <dbl>, hh07n <dbl>, hh07o <dbl>, hh07p <dbl>, hh07p_3 <dbl>, hh07q <dbl>, ChildID <chr>, ch02 <dbl>, ch03 <dbl>,
## # ch04a <dbl>, ch04b <dbl>, ch04c <dbl>, ch04d <dbl>, ch04e <dbl>, ch04f <dbl>, ch05 <dbl>, ch06a <dbl>, ch06b <dbl>, ch06c <dbl>,
## # ch06d <dbl>, ch06e <dbl>, ch07a <dbl>, ch07b <dbl>, ch07c <dbl>, ch07d <dbl>, ch08 <dbl>, ch09 <dbl>, ch10a <dbl>, ch10b <dbl>,
## # ch10c <dbl>, pt00 <dbl>, pt01b <dbl>, pt01c <dbl>, pt01d <dbl>, pt01e <dbl>, pt01f <dbl>, pt02b <dbl>, pt02c <dbl>, pt02d <dbl>,
## # pt02e <dbl>, pt02f <dbl>, assessment <dbl>, sample <dbl>, l1_1 <dbl>, l1_2 <dbl>, l1_3 <dbl>, l1_4 <dbl>, l2_1 <dbl>, l2_2 <dbl>, …
Now we have \(147\) columns, including this_is.
A more useful use of mutate is to construct a new variable based on existing variables. This is the way that mutate is almost always used.
Example
# create "math_savvy" column using MathIRTScore if the child has a score above 2
dat_new = dat %>%
mutate(math_savvy = if_else(MathIRTScore > 2, "Yes", "No"))Hopefully, you’re getting used to the pipe by now, so let’s embed a mutating operation within a larger pipe than we’ve used before.
# count using count() how many children are math-savvy across countries
# first exclude all NAs from math_savvy column
dat %>%
mutate(math_savvy = if_else(MathIRTScore > 2, "Yes", "No")) %>%
filter(!is.na(math_savvy)) %>%
count(CountryName, math_savvy)## # A tibble: 12 × 3
## CountryName math_savvy n
## <chr> <chr> <int>
## 1 Bangladesh No 6479
## 2 Kenya No 6669
## 3 Mali No 9588
## 4 Mexico No 8109
## 5 Mozambique No 8020
## 6 Nepal No 4694
## 7 Nicaragua No 6230
## 8 Pakistan No 7236
## 9 Pakistan Yes 1966
## 10 Senegal No 8098
## 11 Tanzania No 13166
## 12 Uganda No 8883
They are all from Pakistan!
Independent Practice
- Convert both the ICAN and ICAR assessment time for each assessed child from seconds to hours. Extract all assessed children that took more than hour for both ICAN and ICAR assessments.
5.6.6 Aggregating over rows with summarise()
The summarise() verb (or summarize() will also work) summarises the rows in a data frame in some way. When applied to the whole data frame, it will collapse it to a single row. For example, here we calculate the average age and the median Reading IRT Score for children in Uganda living in rural areas:
dat %>%
filter(CountryName == "Uganda") %>%
summarise(mean_age = mean(ch02),
median_irt_score = median(ReadingIRTScore, na.rm = TRUE))## # A tibble: 1 × 2
## mean_age median_irt_score
## <dbl> <dbl>
## 1 9.90 -0.538
You need to watch out for NAs when using summarise(). If one exists, operations like mean() will return NA. You can exclude NAs from calculations using na.rm = TRUE.
summarise() is most useful when combined with group_by(), which imposes a grouping structure on a data frame. After applying group_by(), subsequent dplyr verbs will be applied to individual groups, basically repeating the code for each group. That means that summarise() will calculate a summary for each group:
# tell dplyr to group the data by Country
dat_new = dat %>%
group_by(CountryName)
# apply summarize() to see how many children in each country surveyed
dat_new %>%
summarize(count = n())## # A tibble: 11 × 2
## CountryName count
## <chr> <int>
## 1 Bangladesh 6664
## 2 Kenya 7076
## 3 Mali 10091
## 4 Mexico 8306
## 5 Mozambique 8255
## 6 Nepal 4801
## 7 Nicaragua 7310
## 8 Pakistan 10510
## 9 Senegal 9117
## 10 Tanzania 14795
## 11 Uganda 9527
# get sorted counts (and present differently)
dat %>%
group_by(CountryName) %>%
summarize(count = n()) %>%
arrange(desc(count)) %>%
head(10) # take first ten rows## # A tibble: 10 × 2
## CountryName count
## <chr> <int>
## 1 Tanzania 14795
## 2 Pakistan 10510
## 3 Mali 10091
## 4 Uganda 9527
## 5 Senegal 9117
## 6 Mexico 8306
## 7 Mozambique 8255
## 8 Nicaragua 7310
## 9 Kenya 7076
## 10 Bangladesh 6664
You can also pass your own summary functions to the pipe:
# my own function, computes the 60% quantile of the Reading IRT Score for each country
compute_60q = function(x){quantile(x, probs = 0.60, na.rm = TRUE)}
# use it in a grouped summary
dat %>%
group_by(CountryName) %>%
summarize(count = n(), q60 = compute_60q(ReadingIRTScore)) %>%
head(10)## # A tibble: 10 × 3
## CountryName count q60
## <chr> <int> <dbl>
## 1 Bangladesh 6664 0.356
## 2 Kenya 7076 0.451
## 3 Mali 10091 -1.74
## 4 Mexico 8306 0.629
## 5 Mozambique 8255 -0.991
## 6 Nepal 4801 0.392
## 7 Nicaragua 7310 0.647
## 8 Pakistan 10510 0.439
## 9 Senegal 9117 -0.164
## 10 Tanzania 14795 0.315