{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "### I am going to make up the tech variable as a batch effect. For example, tech1 is measured by technician 1 and tech2 is measure by technician 2. And I am not interested in the effect of the tech variale, but would like to ignore or remove the effect of the tech variable. " ] }, { "cell_type": "code", "execution_count": 5, "metadata": {}, "outputs": [ { "data": { "text/html": [ "\n", "\n", "\n", "\t\n", "\t\n", "\n", "\n", "\t\n", "\t\n", "\t\n", "\t\n", "\t\n", "\t\n", "\n", "
A data.frame: 6 × 2
weightgroup
<dbl><fct>
14.17ctrl
25.58ctrl
35.18ctrl
46.11ctrl
54.50ctrl
64.61ctrl
\n" ], "text/latex": [ "A data.frame: 6 × 2\n", "\\begin{tabular}{r|ll}\n", " & weight & group\\\\\n", " & & \\\\\n", "\\hline\n", "\t1 & 4.17 & ctrl\\\\\n", "\t2 & 5.58 & ctrl\\\\\n", "\t3 & 5.18 & ctrl\\\\\n", "\t4 & 6.11 & ctrl\\\\\n", "\t5 & 4.50 & ctrl\\\\\n", "\t6 & 4.61 & ctrl\\\\\n", "\\end{tabular}\n" ], "text/markdown": [ "\n", "A data.frame: 6 × 2\n", "\n", "| | weight <dbl> | group <fct> |\n", "|---|---|---|\n", "| 1 | 4.17 | ctrl |\n", "| 2 | 5.58 | ctrl |\n", "| 3 | 5.18 | ctrl |\n", "| 4 | 6.11 | ctrl |\n", "| 5 | 4.50 | ctrl |\n", "| 6 | 4.61 | ctrl |\n", "\n" ], "text/plain": [ " weight group\n", "1 4.17 ctrl \n", "2 5.58 ctrl \n", "3 5.18 ctrl \n", "4 6.11 ctrl \n", "5 4.50 ctrl \n", "6 4.61 ctrl " ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "head(PlantGrowth)" ] }, { "cell_type": "code", "execution_count": 6, "metadata": {}, "outputs": [ { "data": { "text/html": [ "\n", "\n", "\n", "\t\n", "\t\n", "\n", "\n", "\t\n", "\t\n", "\t\n", "\t\n", "\t\n", "\t\n", "\n", "
A data.frame: 6 × 3
weightgrouptech
<dbl><fct><fct>
14.17ctrltech1
25.58ctrltech2
35.18ctrltech1
46.11ctrltech2
54.50ctrltech1
64.61ctrltech2
\n" ], "text/latex": [ "A data.frame: 6 × 3\n", "\\begin{tabular}{r|lll}\n", " & weight & group & tech\\\\\n", " & & & \\\\\n", "\\hline\n", "\t1 & 4.17 & ctrl & tech1\\\\\n", "\t2 & 5.58 & ctrl & tech2\\\\\n", "\t3 & 5.18 & ctrl & tech1\\\\\n", "\t4 & 6.11 & ctrl & tech2\\\\\n", "\t5 & 4.50 & ctrl & tech1\\\\\n", "\t6 & 4.61 & ctrl & tech2\\\\\n", "\\end{tabular}\n" ], "text/markdown": [ "\n", "A data.frame: 6 × 3\n", "\n", "| | weight <dbl> | group <fct> | tech <fct> |\n", "|---|---|---|---|\n", "| 1 | 4.17 | ctrl | tech1 |\n", "| 2 | 5.58 | ctrl | tech2 |\n", "| 3 | 5.18 | ctrl | tech1 |\n", "| 4 | 6.11 | ctrl | tech2 |\n", "| 5 | 4.50 | ctrl | tech1 |\n", "| 6 | 4.61 | ctrl | tech2 |\n", "\n" ], "text/plain": [ " weight group tech \n", "1 4.17 ctrl tech1\n", "2 5.58 ctrl tech2\n", "3 5.18 ctrl tech1\n", "4 6.11 ctrl tech2\n", "5 4.50 ctrl tech1\n", "6 4.61 ctrl tech2" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "tech <- rep(c('tech1', 'tech2'),nrow(PlantGrowth)/2)\n", "PlantGrowth$tech <- as.factor(tech)\n", "head(PlantGrowth)" ] }, { "cell_type": "code", "execution_count": 7, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "'data.frame':\t30 obs. of 3 variables:\n", " $ weight: num 4.17 5.58 5.18 6.11 4.5 4.61 5.17 4.53 5.33 5.14 ...\n", " $ group : Factor w/ 3 levels \"ctrl\",\"trt1\",..: 1 1 1 1 1 1 1 1 1 1 ...\n", " $ tech : Factor w/ 2 levels \"tech1\",\"tech2\": 1 2 1 2 1 2 1 2 1 2 ...\n" ] } ], "source": [ "str(PlantGrowth)" ] }, { "cell_type": "code", "execution_count": 8, "metadata": { "scrolled": true }, "outputs": [ { "data": { "text/plain": [ "\n", "Call:\n", "lm(formula = weight ~ group + tech, data = PlantGrowth)\n", "\n", "Residuals:\n", " Min 1Q Median 3Q Max \n", "-0.9710 -0.3765 -0.0330 0.2100 1.2600 \n", "\n", "Coefficients:\n", " Estimate Std. Error t value Pr(>|t|) \n", "(Intercept) 5.1410 0.2280 22.550 <2e-16 ***\n", "grouptrt1 -0.3710 0.2792 -1.329 0.1955 \n", "grouptrt2 0.4940 0.2792 1.769 0.0886 . \n", "techtech2 -0.2180 0.2280 -0.956 0.3478 \n", "---\n", "Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1\n", "\n", "Residual standard error: 0.6244 on 26 degrees of freedom\n", "Multiple R-squared: 0.2891,\tAdjusted R-squared: 0.2071 \n", "F-statistic: 3.525 on 3 and 26 DF, p-value: 0.02881\n" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "fit <- lm(weight ~ group + tech, data=PlantGrowth)\n", "summary(fit)" ] }, { "cell_type": "code", "execution_count": 20, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
(Intercept)
5.141
grouptrt1
-0.371
grouptrt2
0.494
techtech2
-0.218
\n" ], "text/latex": [ "\\begin{description*}\n", "\\item[(Intercept)] 5.141\n", "\\item[grouptrt1] -0.371\n", "\\item[grouptrt2] 0.494\n", "\\item[techtech2] -0.218\n", "\\end{description*}\n" ], "text/markdown": [ "(Intercept)\n", ": 5.141grouptrt1\n", ": -0.371grouptrt2\n", ": 0.494techtech2\n", ": -0.218\n", "\n" ], "text/plain": [ "(Intercept) grouptrt1 grouptrt2 techtech2 \n", " 5.141 -0.371 0.494 -0.218 " ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "coefficients(fit)" ] }, { "cell_type": "code", "execution_count": 9, "metadata": { "scrolled": true }, "outputs": [ { "data": { "text/plain": [ " group emmean SE df lower.CL upper.CL\n", " ctrl 5.03 0.197 26 4.63 5.44\n", " trt1 4.66 0.197 26 4.26 5.07\n", " trt2 5.53 0.197 26 5.12 5.93\n", "\n", "Results are averaged over the levels of: tech \n", "Confidence level used: 0.95 " ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "emmeans(fit, \"group\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### My question is, if I knew the coefficient of tech1, then ignoring both of tech(1,2) coefficients would be a way of estimating the effect of group? \n", "\n", "#### Is there a nice way of removing the unwanted effect? " ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [] } ], "metadata": { "kernelspec": { "display_name": "R", "language": "R", "name": "ir" }, "language_info": { "codemirror_mode": "r", "file_extension": ".r", "mimetype": "text/x-r-source", "name": "R", "pygments_lexer": "r", "version": "4.0.4" } }, "nbformat": 4, "nbformat_minor": 4 }