Summary: Bds: Toolbox
-
1 tidyverse
-
Which helper functions can you use inside select() to pick variables by a pattern in their names?
- starts_with("abc"): matches names that begin with “abc”.
- ends_with("xyz"): matches names that end with “xyz”.
- contains("ijk"): matches names that contain “ijk”.
- matches("(.)\\1"): selects variables whose names match a regular expression; this particular pattern matches any name that contains a repeated character.
- num_range("x", 1:3): matches x1, x2 and x3.
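To make the matching rules above concrete, here is a rough Python analogue of each helper. It is only a sketch of the name-matching logic, not dplyr itself: the column names are hypothetical, and unlike the dplyr helpers (which ignore case by default) these comparisons are case-sensitive.

```python
import re

# Hypothetical column names, chosen only to exercise each rule.
columns = ["abc_total", "price_xyz", "my_ijk_flag", "address", "x1", "x2", "x3", "x10"]

starts = [c for c in columns if c.startswith("abc")]       # like starts_with("abc")
ends = [c for c in columns if c.endswith("xyz")]           # like ends_with("xyz")
contains = [c for c in columns if "ijk" in c]              # like contains("ijk")
repeated = [c for c in columns if re.search(r"(.)\1", c)]  # like matches("(.)\\1")

# like num_range("x", 1:3): only the exact names x1, x2 and x3
wanted = {f"x{i}" for i in range(1, 4)}
numbered = [c for c in columns if c in wanted]

print(starts, ends, contains, repeated, numbered)
```

Only "address" matches the regular expression, because it is the only name with the same character twice in a row, and "x10" is excluded from the numeric range.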
-
2 SQL
How can you select two variables by name from a dataframe in SQL, keeping only the rows where the first variable is larger than 5?
SELECT var_1, var_2
FROM dataframe
WHERE var_1 > 5
-
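The query on this card can be tried out against a throwaway in-memory SQLite database. This is a minimal sketch: the table contents and the extra column var_3 are made up for illustration, and SQLite is an assumption, since the card does not name a database.

```python
import sqlite3

con = sqlite3.connect(":memory:")
# Hypothetical table; var_3 exists only to show that it is not returned.
con.execute("CREATE TABLE dataframe (var_1 INTEGER, var_2 TEXT, var_3 REAL)")
con.executemany(
    "INSERT INTO dataframe VALUES (?, ?, ?)",
    [(3, "a", 0.1), (6, "b", 0.2), (9, "c", 0.3)],
)

rows = con.execute("""
    SELECT var_1, var_2
    FROM dataframe
    WHERE var_1 > 5
""").fetchall()
print(rows)
con.close()
```

Only the two named columns come back, and the row with var_1 = 3 is filtered out.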
How can you select all variables from a dataframe, but keep only the observations where var_1 equals 4 and var_2 is lower than 10?
SELECT *
FROM data
WHERE var_1 = 4 AND var_2 < 10
-
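A small sketch of how the two conditions combine, again using an in-memory SQLite database as an assumed backend. The rows and the note column are hypothetical and only label which rows should survive the filter.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE data (var_1 INTEGER, var_2 INTEGER, note TEXT)")
con.executemany(
    "INSERT INTO data VALUES (?, ?, ?)",
    [(4, 7, "keep"), (4, 12, "drop"), (5, 3, "drop"), (4, 9, "keep")],
)

# Both conditions must hold for a row to be returned.
rows = con.execute("""
    SELECT *
    FROM data
    WHERE var_1 = 4 AND var_2 < 10
""").fetchall()
print(rows)
con.close()
```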
How can you select two variables from the dataframe, rename them to 1 and 2, order the observations by var_1 in descending order, and show only the first six results?
SELECT var_1 AS "1", var_2 AS "2"
FROM data
ORDER BY var_1 DESC -- descending order
LIMIT 6
-
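The sketch below runs this card's query in SQLite (an assumed backend) on ten made-up rows. Note that the numeric aliases are quoted: a bare `AS 1` is a syntax error in SQLite, and quoting them as identifiers is the assumption made here.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE data (var_1 INTEGER, var_2 INTEGER)")
# Hypothetical data: var_1 runs from 1 to 10, var_2 is its square.
con.executemany("INSERT INTO data VALUES (?, ?)", [(i, i * i) for i in range(1, 11)])

cur = con.execute("""
    SELECT var_1 AS "1", var_2 AS "2"
    FROM data
    ORDER BY var_1 DESC
    LIMIT 6
""")
names = [col[0] for col in cur.description]  # column names after renaming
rows = cur.fetchall()
print(names, rows)
con.close()
```

The six rows returned are those with the largest var_1 values, because the sort is applied before the limit.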
What is the SQL equivalent of the following tidyverse code? data %>% select(id) %>% summarise(mean(id))
SELECT AVG(id)
FROM data
-
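As a quick check of the aggregate, the sketch below computes the same mean with SQLite (an assumed backend) on four hypothetical ids.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE data (id INTEGER, name TEXT)")
con.executemany(
    "INSERT INTO data VALUES (?, ?)",
    [(1, "a"), (2, "b"), (3, "c"), (6, "d")],
)

# AVG collapses all rows into a single value, like summarise(mean(id)).
mean_id = con.execute("SELECT AVG(id) FROM data").fetchone()[0]
print(mean_id)
con.close()
```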
Why does the following code not work? SELECT application_id, COUNT(*) GROUP BY application_id FROM domains
In SQL, the order of the clauses matters.
The GROUP BY clause always comes after the FROM clause.
-
What is the SQL equivalent of the following tidyverse code? domains %>% group_by(application_id) %>% summarise(n())
SELECT application_id, COUNT(*)
FROM domains
GROUP BY application_id
-
How would you left join data_1 onto data_2 (keeping every row of data_2), matching var_2 from data_1 with var_5 from data_2?
SELECT *
FROM data_2
LEFT JOIN data_1 ON data_1.var_2 = data_2.var_5
-
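A short sketch of what this left join returns, assuming SQLite and two tiny hypothetical tables. The column names are qualified with their table names here, and an ORDER BY is added only to make the printed output deterministic.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE data_2 (var_5 INTEGER, label TEXT)")
con.execute("CREATE TABLE data_1 (var_2 INTEGER, value REAL)")
con.executemany("INSERT INTO data_2 VALUES (?, ?)", [(1, "one"), (2, "two"), (3, "three")])
con.executemany("INSERT INTO data_1 VALUES (?, ?)", [(1, 10.0), (3, 30.0)])

# data_2 is the left table, so all of its rows are kept.
rows = con.execute("""
    SELECT *
    FROM data_2
    LEFT JOIN data_1 ON data_1.var_2 = data_2.var_5
    ORDER BY data_2.var_5
""").fetchall()
print(rows)
con.close()
```

The row with var_5 = 2 has no partner in data_1, so its data_1 columns come back as NULL (None in Python) instead of the row being dropped.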
How can you write nested queries in SQL?
For example:
SELECT var_1
FROM data_1
WHERE var_1 < ( SELECT AVG(var_1) FROM data_2)
This statement retrieves every value of var_1 in data_1 that is lower than the average of var_1 in data_2.
-
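The nested query can be traced with concrete numbers. This is a minimal sketch assuming SQLite, with hypothetical values chosen so that the inner average is easy to verify by hand.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE data_1 (var_1 INTEGER)")
con.execute("CREATE TABLE data_2 (var_1 INTEGER)")
con.executemany("INSERT INTO data_1 VALUES (?)", [(1,), (4,), (7,), (10,)])
con.executemany("INSERT INTO data_2 VALUES (?)", [(2,), (6,), (10,)])

# The inner query is evaluated first and yields a single number.
inner_avg = con.execute("SELECT AVG(var_1) FROM data_2").fetchone()[0]

# The outer query then compares each row of data_1 against that number.
rows = con.execute("""
    SELECT var_1
    FROM data_1
    WHERE var_1 < (SELECT AVG(var_1) FROM data_2)
""").fetchall()
print(inner_avg, rows)
con.close()
```

With an inner average of 6.0, only the values 1 and 4 from data_1 pass the filter.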
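How can you run an SQL statement in R?
con <- DBI::dbConnect(RSQLite::SQLite(), ":memory:")  # assumes an SQLite backend; swap in the driver for your database
DBI::dbGetQuery(con, "SELECT 1 AS x")  # pass the SQL statement as a string; the result is a data frame
DBI::dbDisconnect(con)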