Summary: BDS: Toolbox

Note: only 48 flashcards and notes are available for this material, so this summary may be incomplete.

Read the summary and the most important questions on BDS: Toolbox

  • 1 tidyverse

  • Which helper functions can you use in select() to pick variables whose names share a pattern?

    • starts_with("abc"): matches names that begin with "abc".
    • ends_with("xyz"): matches names that end with "xyz".
    • contains("ijk"): matches names that contain "ijk".
    • matches("(.)\\1"): matches names that fit a regular expression (here, any name containing a repeated character).
    • num_range("x", 1:3): matches x1, x2 and x3.
  • 2 SQL


  • How can you get two variables by name from a dataframe in SQL, and select only the observations where the first variable is larger than 5?


    SELECT var_1, var_2
    FROM dataframe
    WHERE var_1 > 5
  • How can you select all variables from a dataframe, but only the observations where var_1 has a value of 4 and var_2 has a value lower than 10?

    SELECT * 
    FROM data
    WHERE var_1 = 4 AND var_2 < 10
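
    The two filtering queries above can be tried out with a minimal sketch like the one below. It assumes Python's built-in sqlite3 module as the database; the table contents are made up purely for illustration, and other SQL dialects may differ in small details.

```python
import sqlite3

# Hypothetical in-memory table standing in for the "data" dataframe.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE data (var_1 INTEGER, var_2 INTEGER)")
con.executemany(
    "INSERT INTO data VALUES (?, ?)",
    [(4, 3), (4, 12), (6, 8), (9, 1)],
)

# Two named columns, keeping only rows where var_1 is larger than 5.
larger_than_5 = con.execute(
    "SELECT var_1, var_2 FROM data WHERE var_1 > 5"
).fetchall()

# All columns, keeping only rows that satisfy both conditions.
both_conditions = con.execute(
    "SELECT * FROM data WHERE var_1 = 4 AND var_2 < 10"
).fetchall()

print(larger_than_5)
print(both_conditions)
```

    WHERE is evaluated per row, so AND simply requires both comparisons to hold for the same observation.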
  • How can you select two variables from the dataframe, rename them to 1 and 2, order the observations by var_1 in descending order, and only show the first six results?


    SELECT var_1 AS "1", var_2 AS "2"
    FROM data
    ORDER BY var_1 DESC -- DESC sorts in descending order
    LIMIT 6
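
    As a rough check of how aliases, ordering and LIMIT interact, here is a small sketch assuming SQLite through Python's sqlite3 module, with invented rows. The purely numeric aliases are written in double quotes, since an unquoted number is generally not accepted as a column name.

```python
import sqlite3

# Hypothetical table with eight rows, so that LIMIT 6 actually cuts something off.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE data (var_1 INTEGER, var_2 TEXT)")
con.executemany(
    "INSERT INTO data VALUES (?, ?)",
    [(i, f"row{i}") for i in range(1, 9)],
)

cur = con.execute(
    """
    SELECT var_1 AS "1", var_2 AS "2"
    FROM data
    ORDER BY var_1 DESC  -- largest var_1 first
    LIMIT 6
    """
)
column_names = [d[0] for d in cur.description]  # names as renamed by AS
rows = cur.fetchall()

print(column_names)
print(rows)
```

    The sort happens before the limit, so the six rows returned are the six with the largest var_1.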
  • What would be the SQL equivalent of the following tidyverse code? data %>% select(id) %>% summarise(mean(id))

    SELECT AVG(id)
    FROM data
  • Why does the following code not work? SELECT application_id, COUNT(*) GROUP BY application_id FROM domains

    In SQL, the order of the clauses matters.
    The GROUP BY clause always comes after the FROM clause.
  • What is the SQL equivalent of the following tidyverse code? domains %>% group_by(application_id) %>% summarise(n())


    SELECT application_id, COUNT(*)
    FROM domains
    GROUP BY application_id
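
    Both points about GROUP BY (clause order and counting per group) can be seen in a short sketch. It assumes SQLite via Python's sqlite3 module and a made-up domains table; the ORDER BY is added only to make the output order predictable.

```python
import sqlite3

# Hypothetical domains table: application 1 has two domains, application 2 has one.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE domains (application_id INTEGER, domain TEXT)")
con.executemany(
    "INSERT INTO domains VALUES (?, ?)",
    [(1, "a.example"), (1, "b.example"), (2, "c.example")],
)

# GROUP BY placed before FROM: the database rejects the statement.
try:
    con.execute(
        "SELECT application_id, COUNT(*) GROUP BY application_id FROM domains"
    )
    misordered_failed = False
except sqlite3.OperationalError as err:
    misordered_failed = True
    print("rejected:", err)

# Correct clause order: SELECT, FROM, GROUP BY.
counts = con.execute(
    """
    SELECT application_id, COUNT(*)
    FROM domains
    GROUP BY application_id
    ORDER BY application_id
    """
).fetchall()
print(counts)
```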
  • How would you left join data_1 onto data_2, matching var_2 from data_1 with var_5 from data_2?


    SELECT *
    FROM data_2
    LEFT JOIN data_1 ON data_1.var_2 = data_2.var_5
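
    To see what the left join keeps, the sketch below runs it on two tiny invented tables, assuming SQLite through Python's sqlite3 module. The extra columns var_1 and var_6 are hypothetical, and the join columns are prefixed with their table names to make clear which table each one belongs to.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE data_1 (var_1 TEXT, var_2 INTEGER)")
con.execute("CREATE TABLE data_2 (var_5 INTEGER, var_6 TEXT)")
con.executemany("INSERT INTO data_1 VALUES (?, ?)", [("a", 1), ("b", 2)])
con.executemany("INSERT INTO data_2 VALUES (?, ?)", [(1, "x"), (3, "z")])

# Every row of data_2 (the table after FROM) is kept;
# columns from data_1 are NULL where no match exists.
rows = con.execute(
    """
    SELECT data_2.var_5, data_2.var_6, data_1.var_1
    FROM data_2
    LEFT JOIN data_1 ON data_1.var_2 = data_2.var_5
    ORDER BY data_2.var_5
    """
).fetchall()
print(rows)
```

    The row with var_5 = 3 has no partner in data_1, so its var_1 comes back as NULL (None in Python), while the unmatched row of data_1 (var_2 = 2) is dropped.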
  • How can you make nested queries in SQL?

    For example:
    SELECT var_1
    FROM data_1
    WHERE var_1 < ( SELECT AVG(var_1) FROM data_2)

    This statement retrieves every value of var_1 in data_1 that is lower than the average of var_1 in data_2.
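
    A small worked example of the nested query, assuming SQLite via Python's sqlite3 module and invented values: the inner query is evaluated first and yields a single number, which the outer WHERE then compares against.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE data_1 (var_1 INTEGER)")
con.execute("CREATE TABLE data_2 (var_1 INTEGER)")
con.executemany("INSERT INTO data_1 VALUES (?)", [(1,), (5,), (9,)])
con.executemany("INSERT INTO data_2 VALUES (?)", [(2,), (4,), (6,), (8,)])

# Inner query on its own: the average of var_1 in data_2.
average = con.execute("SELECT AVG(var_1) FROM data_2").fetchone()[0]

# Outer query: values in data_1 strictly below that average.
below_average = con.execute(
    """
    SELECT var_1
    FROM data_1
    WHERE var_1 < (SELECT AVG(var_1) FROM data_2)
    ORDER BY var_1
    """
).fetchall()

print(average)
print(below_average)
```

    With these values the average is 5, so only the value 1 qualifies; 5 itself is excluded because the comparison is strict.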
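  • How can you run a SQL statement in R?


    # Open a connection; the driver depends on your database (RSQLite is only an assumed example)
    con <- DBI::dbConnect(RSQLite::SQLite(), ":memory:")

    # Pass the SQL statement as a string; the result is returned as a data frame
    DBI::dbGetQuery(con, " *sql statement as string* ")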