  • Start

  • Essential Concepts

  • Hypothesis Testing

  • Rolke Given

  • Troubleshooting

  • R Concepts

  • Useful Functions

  • Help

  • Often during a session you create objects that you need only for a short time. When you no longer need them, use rm to get rid of them:

    x <- 1:10
    sum(x^2)
    rm(x)

    The <- is the assignment operator in R: it assigns what is on the right to the symbol on the left.
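
    As a small sketch of the create, use, remove cycle described above, the lines below check whether x is present before and after rm. The name total is only an illustrative choice, and the check assumes no other object called x is visible in the session.

    ```r
    x <- 1:10            # create a temporary object
    total <- sum(x^2)    # use it: 1 + 4 + ... + 100
    exists("x")          # TRUE while x is still in the workspace
    rm(x)                # remove it once it is no longer needed
    exists("x")          # FALSE after removal (assuming no other x is visible)
    ```

    The result stored in total survives the call to rm, because only the symbol x is removed.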

  • Data Types

  • Comparing population proportions

    #eSx

  • type I error

  • type II error

  • Now the first thing we need to recognize is this:

    One of the principles of science is that it is impossible to prove that a theory is correct, but it is possible to prove that a theory is false (a theory can be falsified).

  • The <- is the assignment operator in R: it assigns what is on the right to the symbol on the left.

    Data Types

    The most basic type of data in R is a vector, a sequence of values of the same type.

    Say we want the numbers 1.5, 3.6, 5.1 and 4.0 in an R vector called x. Then we can type

    x <- c(1.5, 3.6, 5.1, 4.0)

    “c” stands for concatenate, meaning “put together”.

    There are various ways to generate a vector, here are some examples:

    x <- 1:10
    x <- 10:1
    x <- 1:20*2
    x <- c(1:10,1:10*2)

    Sometimes you need parentheses, because the colon operator is applied before addition and subtraction:

    n <- 10
    1:n-1
    1:(n-1)
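
    The following sketch spells out what these expressions produce. The variable names evens, both, a and b are illustrative only. In R the colon is evaluated before * and -, which is why the parentheses matter.

    ```r
    n <- 10
    evens <- 1:20*2            # (1:20) * 2, i.e. 2, 4, ..., 40
    both  <- c(1:10, 1:10*2)   # 1, ..., 10 followed by 2, 4, ..., 20
    a <- 1:n-1                 # (1:n) - 1, i.e. 0, 1, ..., 9
    b <- 1:(n-1)               # 1, 2, ..., 9
    length(a)                  # 10
    length(b)                  # 9
    ```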

    The rep (“repeat”) command is very useful:

    rep(1,10)
    rep(1:3,10)
    rep(1:3,each=3)
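
    To make the difference between the three calls explicit, here is a short sketch with the results stored under illustrative names (r1 to r4). The last line uses the times argument of rep with a vector, which is not in the notes above but is part of base R.

    ```r
    r1 <- rep(1, 10)              # ten copies of the number 1
    r2 <- rep(1:3, 10)            # the whole vector 1 2 3 repeated ten times (30 values)
    r3 <- rep(1:3, each = 3)      # each element three times: 1 1 1 2 2 2 3 3 3
    r4 <- rep(1:3, times = 3:1)   # element i repeated times[i] times: 1 1 1 2 2 3
    ```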

    To find out how many elements a vector has, use the length command:

    length(x)

    The elements of a vector are accessed with the bracket notation:

    x <- 1:10*5
    x[3]
    x[1:3]
    x[c(1,3,8)]
    x[-3]
    x[-c(1,2,5)]
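
    The same calls again, annotated with what each one returns. This is only a commented restatement of the example above: positive indices select elements, negative indices drop them.

    ```r
    x <- 1:10*5       # 5, 10, 15, ..., 50
    x[3]              # third element: 15
    x[1:3]            # first three elements: 5 10 15
    x[c(1,3,8)]       # elements 1, 3 and 8: 5 15 40
    x[-3]             # everything except the third element
    x[-c(1,2,5)]      # everything except elements 1, 2 and 5
    length(x)         # 10
    ```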

  • Let’s start with

    ls()

    which shows you a “listing” of the objects (data, routines etc.) in your workspace.

    If you have worked for a while you might have things you need to save. Do that with

    File > Save Workspace

    If you quit the program without saving, everything you did will be lost. R stores your work in a somewhat unusual way: everything belonging to the same project (data, routines, graphs etc.) is stored in just one file, with the extension .RData.

    To quit R, type

    q()

    or click the x in the upper right corner.

    R has a nice recall feature, using the up and down arrow keys. Also, typing

    history()

    shows you the most recent things entered.

    R is case-sensitive, so a and A are two different things.

    Instead of numbers a vector can also consist of characters (letters, numbers, symbols etc.). These are identified by quotes:

    x <- c("A","B","7","%")

    A vector is either numeric or character, but never both. You can turn one into the other (if possible) as follows:

    x <- 1:10
    as.character(x)

    x <- c("1","5")
    as.numeric(x)
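
    The “if possible” matters. As a brief sketch, the lines below show what happens when a character value has no numeric meaning, and what c() does when types are mixed. The name y is illustrative, and suppressWarnings is used only to keep the warning that R would otherwise print out of the way.

    ```r
    as.character(1:3)            # "1" "2" "3"
    as.numeric(c("1", "5"))      # 1 5

    # "A" cannot be read as a number, so it becomes NA (R also issues a warning)
    y <- suppressWarnings(as.numeric(c("1", "5", "A")))
    is.na(y)                     # FALSE FALSE TRUE

    # mixing numbers and characters in c() converts everything to character
    class(c(1.5, "A"))           # "character"
    ```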

    A third type of data is logical, with values either TRUE or FALSE.

    x <- 1:10
    x>4
    [1] FALSE FALSE FALSE FALSE TRUE TRUE TRUE TRUE TRUE TRUE

    These are often used as conditions:

    x[x>4]
    [1] 5 6 7 8 9 10
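
    A few more things one can do with a logical vector, as a hedged sketch. The name big is illustrative; sum, which and & are base R.

    ```r
    x <- 1:10
    big <- x > 4        # logical vector, TRUE where the condition holds
    sum(big)            # TRUE counts as 1, so this counts the matches: 6
    which(big)          # positions of the TRUE values: 5 6 7 8 9 10
    x[x > 4 & x < 8]    # combine two conditions with &: 5 6 7
    ```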

    Data Frames

    Data frames are the basic format for data in R. They are essentially vectors of equal length put together as columns.
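
    A minimal sketch of building a data frame from two vectors. The data and the names name, score, scores and high are invented purely for illustration.

    ```r
    # hypothetical example data, made up for this sketch
    name  <- c("A", "B", "C")
    score <- c(1.5, 3.6, 5.1)

    scores <- data.frame(name = name, score = score)   # one column per vector
    nrow(scores)                         # number of rows: 3
    ncol(scores)                         # number of columns: 2
    scores$score                         # a column is again a vector
    high <- scores[scores$score > 2, ]   # rows selected with a logical condition
    ```

    Each column keeps its own type, so a data frame can hold a character column next to a numeric one even though a single vector cannot mix them.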

  • Error: could not find function __

    There are a few things you should check:

    Did you write the name of your function correctly? Names are case sensitive.
    Did you install the package that contains the function? install.packages("thePackage") (this only needs to be done once)
    Did you attach that package to the workspace? require(thePackage) or library(thePackage) (this should be done every time you start a new R session)

    If you’re not sure in which package that function is situated, you can do a few things:

    If you’re sure you installed and attached/loaded the right package, type help.search("some.function") or ??some.function to get an information box that can tell you in which package it is contained.
    find and getAnywhere can also be used to locate functions.
    If you have no clue about the package, you can use findFn in the sos package.
    RSiteSearch("some.function") or searching with rseek are alternative ways to find the function.
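
    The checks above can be tried on a function that is known to exist. The sketch below uses t.test from the stats package, which is attached by default in a standard R session; that default is an assumption about your setup.

    ```r
    exists("t.test")      # TRUE: the function is visible on the search path
    exists("T.test")      # FALSE: names are case sensitive
    find("t.test")        # where it was found: "package:stats"

    # TRUE if the package is installed, without attaching it
    requireNamespace("stats", quietly = TRUE)
    ```

    If exists returns FALSE but requireNamespace returns TRUE, the package is installed but not attached, and library(thePackage) is the missing step.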

  • To see which data sets are attached, use

    search()

    This also shows you which packages are attached.

  • ggplot

  • Student’s t-Test

    Description

    Performs one and two sample t-tests on vectors of data.

    Usage

    t.test(x, ...)

    ## Default S3 method:
    t.test(x, y = NULL,
           alternative = c("two.sided", "less", "greater"),
           mu = 0, paired = FALSE, var.equal = FALSE,
           conf.level = 0.95, ...)

    ## S3 method for class 'formula'
    t.test(formula, data, subset, na.action, ...)

    Arguments

    x
    a (non-empty) numeric vector of data values.
    y
    an optional (non-empty) numeric vector of data values.
    alternative
    a character string specifying the alternative hypothesis, must be one of "two.sided" (default), "greater" or "less". You can specify just the initial letter.
    mu
    a number indicating the true value of the mean (or difference in means if you are performing a two sample test).
    paired
    a logical indicating whether you want a paired t-test.
    var.equal
    a logical variable indicating whether to treat the two variances as being equal. If TRUE then the pooled variance is used to estimate the variance otherwise the Welch (or Satterthwaite) approximation to the degrees of freedom is used.
    conf.level
    confidence level of the interval.
    formula
    a formula of the form lhs ~ rhs where lhs is a numeric variable giving the data values and rhs a factor with two levels giving the corresponding groups.
    data
    an optional matrix or data frame (or similar: see model.frame) containing the variables in the formula formula. By default the variables are taken from environment(formula).
    subset
    an optional vector specifying a subset of observations to be used.
    na.action
    a function which indicates what should happen when the data contain NAs. Defaults to getOption("na.action").
    ...
    further arguments to be passed to or from methods.

    Details

    The formula interface is only applicable for the 2-sample tests.

    alternative = "greater" is the alternative that x has a larger mean than y.

    If paired is TRUE then both x and y must be specified and they must be the same length. Missing values are silently removed (in pairs if paired is TRUE). If var.equal is TRUE then the pooled estimate of the variance is used. By default, if var.equal is FALSE then the variance is estimated separately for both groups and the Welch modification to the degrees of freedom is used.

    If the input data are effectively constant (compared to the larger of the two means) an error is generated.

    Value

    A list with class “htest” containing the following components:

    statistic
    the value of the t-statistic.
    parameter
    the degrees of freedom for the t-statistic.
    p.value
    the p-value for the test.
    conf.int
    a confidence interval for the mean appropriate to the specified alternative hypothesis.
    estimate
    the estimated mean or difference in means depending on whether it was a one-sample test or a two-sample test.
    null.value
    the specified hypothesized value of the mean or mean difference depending on whether it was a one-sample test or a two-sample test.
    alternative
    a character string describing the alternative hypothesis.
    method
    a character string indicating what type of t-test was performed.
    data.name
    a character string giving the name(s) of the data.

    See Also

    prop.test

    Examples

    require(graphics)

    t.test(1:10, y = c(7:20))      # P = .00001855
    t.test(1:10, y = c(7:20, 200)) # P = .1245 -- NOT significant anymore

    ## Classical example: Student's sleep data
    plot(extra ~ group, data = sleep)

    ## Traditional interface
    with(sleep, t.test(extra[group == 1], extra[group == 2]))

    ## Formula interface
    t.test(extra ~ group, data = sleep)
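
    The Value section lists the components of the returned "htest" object. As a short sketch, they can be extracted with $ once the result is stored. The name res is illustrative.

    ```r
    res <- t.test(extra ~ group, data = sleep)   # store the result instead of printing it
    class(res)        # "htest"
    res$statistic     # the t-statistic
    res$parameter     # the degrees of freedom
    res$p.value       # the p-value
    res$conf.int      # the confidence interval
    res$estimate      # the estimated group means
    ```

    Storing the result this way is convenient when the p-value or interval is needed in later calculations rather than just on screen.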

  • But we are still flipping a fair coin, so we should not reject the theory at all; doing so is an error. Soon we will call this the type I error. The 27% will be called the type I error probability α.

  • But there is also a downside to this. Let’s select a slightly unfair (p=0.6) coin. Now the coin is NOT fair, and we should reject the theory. But we are doing so only 46% of the time; the other 54% of the runs wrongly make the theory look OK. This mistake is called the type II error. The 54% is called the type II error probability β. The percentage of runs that correctly reject the theory is called the power of the test.
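
    The relationship between α, β and power can be sketched with exact binomial probabilities. The number of flips (n = 100) and the rejection rule (reject if fewer than 40 or more than 60 heads) are assumptions made for this illustration. They are not the setup behind the 27% and 46% quoted in these notes, so the numbers will differ.

    ```r
    n <- 100        # assumed number of flips per run
    lower <- 40     # assumed rule: reject "the coin is fair" if heads < 40 ...
    upper <- 60     # ... or heads > 60

    # probability of rejecting when the true probability of heads is p
    reject_prob <- function(p) {
      pbinom(lower - 1, n, p) + pbinom(upper, n, p, lower.tail = FALSE)
    }

    alpha <- reject_prob(0.5)   # type I error probability: fair coin, wrongly rejected
    power <- reject_prob(0.6)   # probability of correctly rejecting the unfair coin
    beta  <- 1 - power          # type II error probability: unfair coin, not rejected
    ```

    Widening the interval between lower and upper makes alpha smaller but beta larger, which is the trade-off the two cards describe.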

  • Because of this a hypothesis test is set up so the data can prove the theory to be false:

    Example 1: Null Hypothesis H0: the new treatment is NOT better than the old one.

    Example 2: Null Hypothesis H0: the theory of evolution is correct

    Example 3: Null Hypothesis H0: the coin is fair

    But NOT proving the theory is false is not the same as accepting the theory as true! That is why we say we fail to reject the null hypothesis instead of just saying we accept the null hypothesis.
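
    For Example 3, a minimal sketch of the wording in practice, using binom.test from the stats package. The data (60 heads in 100 flips) and the 5% cut-off are hypothetical choices for illustration only.

    ```r
    heads <- 60     # hypothetical number of heads observed
    flips <- 100    # hypothetical number of flips

    # H0: the coin is fair, i.e. the probability of heads is 0.5
    test <- binom.test(heads, flips, p = 0.5)

    # the two possible conclusions; note that neither one is "accept H0"
    decision <- if (test$p.value < 0.05) "reject H0" else "fail to reject H0"
    decision
    ```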

  • Paren {base} R Documentation

    Parentheses and Braces

    Description

    Open parenthesis, (, and open brace, {, are .Primitive functions in R.

    Effectively, ( is semantically equivalent to the identity function(x) x, whereas { is slightly more interesting, see examples.

    Usage

    ( ... )

    { ... }

    Value

    For (, the result of evaluating the argument. This has visibility set, so will auto-print if used at top-level.

    For {, the result of the last expression evaluated. This has the visibility of the last evaluation.

    References

    Becker, R. A., Chambers, J. M. and Wilks, A. R. (1988) The New S Language. Wadsworth & Brooks/Cole.

    See Also

    if, return, etc for other objects used in the R language itself.

    Syntax for operator precedence.

    Examples

    f <- get("(")
    e <- expression(3 + 2 * 4)
    identical(f(e), e)

    do <- get("{")
    do(x <- 3, y <- 2*x-3, 6-x-y); x; y

    ## note the differences
    (2+3)
    {2+3; 4+5}
    (invisible(2+3))
    {invisible(2+3)}

  • ggMarginal {ggExtra} R Documentation

    Add marginal density/histogram to ggplot2 scatterplots

    Description

    Create a ggplot2 scatterplot with marginal density plots (default) or histograms, or add the marginal plots to an existing scatterplot.

    Usage

    ggMarginal(p, data, x, y, type = c("density", "histogram", "boxplot"),
      margins = c("both", "x", "y"), size = 5, ..., xparams, yparams)

    Arguments

    p
    A ggplot2 scatterplot to add marginal plots to. If p is not provided, then all of data, x, and y must be provided.
    data
    The data.frame to use for creating the marginal plots. Optional if p is provided and the marginal plots are reflecting the same data.
    x
    The name of the variable along the x axis. Optional if p is provided and the x aesthetic is set in the main plot.
    y
    The name of the variable along the y axis. Optional if p is provided and the y aesthetic is set in the main plot.
    type
    What type of marginal plot to show. One of: [density, histogram, boxplot].
    margins
    Along which margins to show the plots. One of: [both, x, y].
    size
    Integer describing the relative size of the marginal plots compared to the main plot. A size of 5 means that the main plot is 5x wider and 5x taller than the marginal plots.
    ...
    Extra parameters to pass to the marginal plots. Any parameter that geom_line(), geom_histogram(), or geom_boxplot() accepts can be used. For example, colour = "red" can be used for any marginal plot type, and binwidth = 10 can be used for histograms.
    xparams
    List of extra parameters to use only for the marginal plot along the x axis.
    yparams
    List of extra parameters to use only for the marginal plot along the y axis.
    Value

    An object of class ggExtraPlot. This object can be printed to show the plots or saved using any of the typical image-saving functions (for example, using png() or pdf()).

    Note

    The grid and gtable packages are required for this function.

    Since the size parameter is used by ggMarginal, if you want to pass a size to the marginal plots, you cannot use the ... parameter. Instead, you must pass size to both xparams and yparams. For example, ggMarginal(p, size = 2) will change the size of the main vs marginal plot, while ggMarginal(p, xparams = list(size=2), yparams = list(size=2)) will make the density plot outline thicker.

    See Also

    Demo Shiny app

    Examples

    # basic usage
    p <- ggplot2::ggplot(mtcars, ggplot2::aes(wt, mpg)) + ggplot2::geom_point()
    ggMarginal(p)

    # using some parameters
    set.seed(30)
    df <- data.frame(x = rnorm(500, 50, 10), y = runif(500, 0, 50))
    p2 <- ggplot2::ggplot(df, ggplot2::aes(x, y)) + ggplot2::geom_point()
    ggMarginal(p2)
    ggMarginal(p2, type = "histogram")
    ggMarginal(p2, margins = "x")
    ggMarginal(p2, size = 2)
    ggMarginal(p2, colour = "red")
    ggMarginal(p2, colour = "red", xparams = list(colour = "blue", size = 3))
    ggMarginal(p2, type = "histogram", bins = 10)

    # specifying the data directly instead of providing a plot
    ggMarginal(data = df, x = "x", y = "y")

    # more examples showing how the marginal plots are properly aligned even when
    # the main plot axis/margins/size/etc are changed
    set.seed(30)
    df2 <- data.frame(x = c(rnorm(250, 50, 10), rnorm(250, 100, 10)),
                      y = runif(500, 0, 50))
    p2 <- ggplot2::ggplot(df2, ggplot2::aes(x, y)) + ggplot2::geom_point()
    ggMarginal(p2)

    p2 <- p2 + ggplot2::ggtitle("Random data") + ggplot2::theme_bw(30)
    ggMarginal(p2)

    p3 <- ggplot2::ggplot(df2, ggplot2::aes(log(x), y - 500)) + ggplot2::geom_point()
    ggMarginal(p3)

    p4 <- p3 + ggplot2::scale_x_continuous(limits = c(2, 6)) + ggplot2::theme_bw(50)
    ggMarginal(p4)

  • Vector Arithmetic

    R allows us to apply mathematical functions to a whole vector at once:

    x <- 1:10

    2*x

    x^2

    log(x)

    sum(x)

    y <- 21:30

    x+y

    x^2+y^2

    mean(x+y)
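
    The same operations with their results noted, as a small sketch. Arithmetic works element by element, while sum and mean reduce a vector to a single number.

    ```r
    x <- 1:10
    y <- 21:30

    2*x            # each element doubled: 2 4 6 ... 20
    x^2            # each element squared: 1 4 9 ... 100
    sum(x)         # a single number: 55
    x + y          # element by element: 22 24 26 ... 40
    mean(x + y)    # average of those sums: 31
    ```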

