Course ID: QCB 508 / QCB 408
Title: Foundations of Statistical Genomics
Semester: 2022 Spring
Instructor: Dr. John D. Storey
Preface: This is a brief summary based on conversations during Yushi’s office hours in Spring 2022. Due to the pandemic, we held our office hours virtually this semester, and I was unable to give detailed proofs or answers without a blackboard. I have listed those details here in the hope that they help our friends learn something.
Mar 17, 2022
\(\underline{Understand\,\,distribution\,\,of\,\,a\,\,random\,\,variable\,\,(a\,\,simplified\,\,version)}\)
If my note on Mar 15, 2022 is too much, please check this simplified explanation and lemma.
\(\underline{Lemma}\). Let \((\Omega,\mathcal{F},\mathbb{P})\) be a probability space, \((E,\mathcal{E})\) a measurable space, \(X:\Omega\mapsto E\) a random variable, and \(A\) an event in \(\mathcal{E}\). Define the indicator function \[\begin{equation}\mathbb{1}_{A}=\mathbb{1}_{A}(X) = \mathbb{1}_{A}\circ X = \begin{cases} 1, \,\,\,\text{if } X\in A,\\0, \,\,\,\text{if } X\notin A. \end{cases} \end{equation}\] Then \[\begin{equation} \mathbb{P}(X\in A)=\mathbb{E}[\mathbb{1}_{A}]. \end{equation}\]
Proof. Intuitively \[\begin{equation} \mathbb{P}(X\in A) = \mathbb{P}(X\in A)\cdot 1 + \mathbb{P}(X\notin A)\cdot 0 = \mathbb{P}(\mathbb{1}_{A}=1)\cdot 1 + \mathbb{P}(\mathbb{1}_A=0)\cdot 0 = \sum_i i \cdot \mathbb{P}(\mathbb{1}_A=i) = \mathbb{E}[\mathbb{1}_{A}]. \end{equation}\]
\(\underline{Example\,\,(Distribution\,\,of\,\,random\,\,genotype)}.\) Let \(X\) be the random genotype in the random-allele-frequency model. We want to characterize the distribution of \(X\), but the only information we know is the conditional distribution of \(X\) given the individual-allele-frequency \(\pi\), written as \(X|\pi \sim \text{Binomial}(2,\pi)\). Here \(\pi\) itself follows some other distribution; a strong candidate is the Balding-Nichols distribution BN\((p,f)\). We can use the above lemma to derive the distribution of \(X\). There are three possible values of \(X\): 0, 1, or 2. For the event \(A_2=\{\omega\in \Omega: X(\omega)=2\}\), we have \[\begin{align} \mathbb{P}(X=2) &= \mathbb{P}(X\in A_2)= \mathbb{P}(X\in A_2)\cdot 1 + \mathbb{P}(X\notin A_2)\cdot 0 =\mathbb{E}[\mathbb{1}_{A_2}] = \mathbb{E}\big[\mathbb{E}[\mathbb{1}_{A_2}|\pi]\big] \nonumber \\ &= \mathbb{E}\big[\mathbb{E}[\mathbb{1}_{X=2}|\pi]\big] = \mathbb{E}[\pi^2] = (\mathbb{E}[\pi])^2 + \mathbb{V}(\pi). \end{align}\] The same strategy yields \(\mathbb{P}(X=0)\) and \(\mathbb{P}(X=1)\).
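To double-check this numerically, here is a minimal Monte Carlo sketch in Python (not part of the course notes). It assumes the usual Beta parameterization of BN\((p,f)\), namely Beta\(\big(p(1-f)/f,\,(1-p)(1-f)/f\big)\), which has mean \(p\) and variance \(fp(1-p)\); the values of \(p\) and \(f\) are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(0)
p, f, n = 0.3, 0.1, 1_000_000

# pi ~ BN(p, f), taken here as Beta(p(1-f)/f, (1-p)(1-f)/f):
# mean p, variance f*p*(1-p).
pi = rng.beta(p * (1 - f) / f, (1 - p) * (1 - f) / f, size=n)
x = rng.binomial(2, pi)  # X | pi ~ Binomial(2, pi)

print(np.mean(x == 2))          # empirical P(X = 2)
print(p**2 + f * p * (1 - p))   # (E[pi])^2 + V(pi)
```

Both printed values should agree up to Monte Carlo error.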
Mar 15, 2022
\(\underline{Understand\,\,distribution\,\,of\,\,a\,\,random\,\,variable\,\,by\,\,integrals}\)
\(\underline{Notations}\). Let \((\Omega,\mathcal{F},\mathbb{P})\) be a probability space and \((E,\mathcal{E})\) a measurable space. Expectations of a random variable \(X:\Omega\mapsto E\) can be interpreted as integrals with respect to \(\mathbb{P}\). Let \(\mu\) be the distribution of \(X\); we can interpret \(\mu\) as the image of \(\mathbb{P}\) under \(X\). To make this precise, let \(\mu=\mathbb{P}\circ X^{-1}\), where the mapping \(\mathbb{P}\circ X^{-1}:\mathcal{E}\mapsto \bar{\mathbb{R}}_{+}\) is defined by \[\begin{equation} \mathbb{P}\circ X^{-1}(A) = \mathbb{P}\big(X^{-1}(A)\big),\,\,\,\,\,\,A\in \mathcal{E}, \end{equation}\] which is a measure on \((E,\mathcal{E})\). For an event \(A\in \mathcal{E}\), we write the probability that \(X\) is in \(A\) as \[\begin{equation} \mu(A) = \mathbb{P}\circ X^{-1}(A) = \mathbb{P}\big(X^{-1}(A)\big) = \mathbb{P}(X\in A) = \mathbb{P}\{\omega \in \Omega:X(\omega) \in A\}, \end{equation}\] which is the image of \(\mathbb{P}\) under \(X\). Note that \(X^{-1}(A)\) denotes an event for every \(A\) in \(\mathcal{E}\), i.e., \(X^{-1}(A) = \{X\in A\} = \{\omega \in \Omega:X(\omega) \in A\}\). This also requires that \(X\) be measurable relative to \(\mathcal{F}\) and \(\mathcal{E}\).
\(\underline{Theorem}\). Let \(f:E\mapsto \bar{\mathbb{R}}\) be measurable relative to \(\mathcal{E}\) and the Borel \(\sigma\)-algebra on \(\bar{\mathbb{R}}\). We define the composition \(Y=f\circ X\), which is a random variable taking values in \(\bar{\mathbb{R}}\), by \[\begin{equation} Y(\omega) = f\circ X(\omega) = f\big(X(\omega)\big),\,\,\,\,\,\,\text{for}\,\,\,\omega \in \Omega. \end{equation}\] If \(f\) is positive \(\mathcal{E}\)-measurable, then \[\begin{equation} \mathbb{E}[f\circ X] = \int_{\Omega}\mathbb{P}(d\omega)f\big(X(\omega)\big) = \int_{\Omega} f\big(X(\omega)\big) d\mathbb{P} = \mathbb{P}(f\circ X) = (\mathbb{P}\circ X^{-1})(f) = \mu(f). \end{equation}\] Notice that \(\mathbb{P}(f\circ X) = (\mathbb{P}\circ X^{-1})(f)\) holds since the functional \(g(f)=\mathbb{P}(f\circ X)\), which is a mapping \(g: \mathcal{E}_{+}\mapsto \bar{\mathbb{R}}_{+}\), satisfies \(g(f)=\mu(f)\) for a unique measure \(\mu\) on \((E,\mathcal{E})\) according to the integral characterization theorem. More specifically, this unique measure \(\mu\) is \(\mathbb{P}\circ X^{-1}\), due to the fact that \(\mu(A)=g(\mathbb{1}_{A})=\mathbb{P}(\mathbb{1}_A\circ X)=\mathbb{P}\big(X^{-1}(A)\big)=\mathbb{P}\circ X^{-1}(A)\) for \(A\in \mathcal{E}\). Check the corollary below for the detailed definition of \(\mathbb{1}_{A}\circ X\).
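To make \(\mathbb{E}[f\circ X]=\mu(f)\) concrete, here is a minimal numerical sketch in Python for a discrete toy case (the support, the weights of \(\mu\), and the values of \(f\) are all arbitrary choices for illustration): the sample average of \(f\big(X(\omega)\big)\) approximates the integral of \(f\) against the distribution \(\mu\).

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy discrete random variable X with distribution mu on E = {0, 1, 2},
# and an arbitrary positive function f on E.
support = np.array([0, 1, 2])
mu = np.array([0.5, 0.3, 0.2])   # mu({0}), mu({1}), mu({2})
f = np.array([1.0, 4.0, 9.0])    # f(0), f(1), f(2)

x = rng.choice(support, size=1_000_000, p=mu)

print(np.mean(f[x]))   # E[f o X]: average of f(X(omega)) over draws from P
print(np.dot(mu, f))   # mu(f) = sum_x f(x) * mu({x})
```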
\(\underline{Corollary}\). Define the indicator function \[\begin{equation}\mathbb{1}_{A}(X) = \mathbb{1}_{A}\circ X = \begin{cases} 1, \,\,\,\text{if } X\in A,\\0, \,\,\,\text{if } X\notin A. \end{cases} \end{equation}\] Notice that \(\mathbb{1}_{A}\) is positive \(\mathcal{E}\)-measurable. Let \(f=\mathbb{1}_{A}\); then we write \(\mu\), the distribution of \(X\), as \[\begin{equation} \mu(A) = \mu (\mathbb{1}_{A}) = \mathbb{E}[\mathbb{1}_{A}\circ X], \end{equation}\] which is the integral understanding of the distribution of a random variable. A useful application of this corollary is to compute the p.m.f. or p.d.f., i.e., to characterize the distribution, of a random variable when we only know its conditional distribution given other variables. Let us take a look at the following example.
\(\underline{Example\,\,(Distribution\,\,of\,\,random\,\,genotype)}.\) Let \(X\) be the random genotype in the random-allele-frequency model. We want to characterize the distribution of \(X\), but the only information we know is the conditional distribution of \(X\) given the individual-allele-frequency \(\pi\), written as \(X|\pi \sim \text{Binomial}(2,\pi)\). Here \(\pi\) itself follows some other distribution; a strong candidate is the Balding-Nichols distribution BN\((p,f)\). We can use the above corollary to derive the distribution of \(X\). There are three possible values of \(X\): 0, 1, or 2. For the event \(A_0=\{\omega\in \Omega: X(\omega)=0\}\), we have \[\begin{align} \mathbb{P}(X=0) &= \mathbb{P}(X\in A_0)= \mathbb{P}(X\in A_0)\cdot 1 + \mathbb{P}(X\notin A_0)\cdot 0 =\mathbb{E}[\mathbb{1}_{A_0} \circ X] = \mathbb{E}\big[\mathbb{E}[\mathbb{1}_{A_0}\circ X|\pi]\big] \nonumber \\ &= \mathbb{E}\big[\mathbb{E}[\mathbb{1}_{X=0}|\pi]\big] = \mathbb{E}\big[(1-\pi)^2\big] = \cdots \end{align}\] The remaining steps follow from \(\mathbb{E}[Z^2]=(\mathbb{E}[Z])^2+\mathbb{V}(Z)\) with \(Z=1-\pi\). The same strategy yields \(\mathbb{P}(X=1)\) and \(\mathbb{P}(X=2)\).
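For completeness, carrying the same indicator argument through all three outcomes gives \[\begin{align} \mathbb{P}(X=0) &= \mathbb{E}\big[(1-\pi)^2\big] = (1-\mathbb{E}[\pi])^2 + \mathbb{V}(\pi), \nonumber \\ \mathbb{P}(X=1) &= \mathbb{E}\big[2\pi(1-\pi)\big] = 2\mathbb{E}[\pi]\big(1-\mathbb{E}[\pi]\big) - 2\mathbb{V}(\pi), \nonumber \\ \mathbb{P}(X=2) &= \mathbb{E}\big[\pi^{2}\big] = (\mathbb{E}[\pi])^2 + \mathbb{V}(\pi). \end{align}\] Note that the three probabilities sum to one for any distribution of \(\pi\); under BN\((p,f)\), with \(\mathbb{E}[\pi]=p\) and \(\mathbb{V}(\pi)=fp(1-p)\), they reduce to \((1-p)^2+fp(1-p)\), \(2p(1-p)(1-f)\), and \(p^2+fp(1-p)\).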
Feb 24, 2022
\(\underline{Conditional\,\,distribution\,\,to\,\,model\,\,allele\,\,evolution\,\,in\,\,a\,\,finite\,\,population}\)
Feb 22, 2022
\(\underline{Basics\,\,of\,\,the\,\,IBD\,\,model:\,\,decompose\,\,genotypes\,\,into\,\,alleles}\)
Let \(X\) and \(Y\) be two genotypes at the same locus, drawn randomly from a population under the identity-by-descent (IBD) model with allele frequency \(p\) and inbreeding coefficient \(f\). We are interested in evaluating the covariance between genotypes \(X\) and \(Y\). Recall that IBD is defined at the level of alleles, i.e., via the probability that two alleles are IBD. Hence a basic setup is to decompose \(X\) and \(Y\) to the allele level. Suppose \(X\) is composed of two alleles \(X_1\) and \(X_2\), and \(Y\) of two alleles \(Y_1\) and \(Y_2\), where \(X_1\) (and \(Y_1\)) comes from the maternal side and \(X_2\) (and \(Y_2\)) from the paternal side. We write \[X=X_1+X_2,\,\,\,\,\,\,Y=Y_1+Y_2,\] thereby deriving their covariance \[\begin{align} \text{Cov}(X,Y) &= \text{Cov}(X_1+X_2, Y_1+Y_2) \nonumber \\ &= \mathbb{E}[(X_1+X_2)(Y_1+Y_2)]-\mathbb{E}[X_1+X_2]\mathbb{E}[Y_1+Y_2] \nonumber \\ &= \mathbb{E}[X_1Y_1]+\mathbb{E}[X_1Y_2]+\mathbb{E}[X_2Y_1]+\mathbb{E}[X_2Y_2] - \mathbb{E}[X_1]\mathbb{E}[Y_1]-\mathbb{E}[X_1]\mathbb{E}[Y_2]-\mathbb{E}[X_2]\mathbb{E}[Y_1]-\mathbb{E}[X_2]\mathbb{E}[Y_2] \nonumber \\ &= (\mathbb{E}[X_1Y_1]-\mathbb{E}[X_1]\mathbb{E}[Y_1]) + (\mathbb{E}[X_1Y_2]-\mathbb{E}[X_1]\mathbb{E}[Y_2]) + (\mathbb{E}[X_2Y_1]-\mathbb{E}[X_2]\mathbb{E}[Y_1]) + (\mathbb{E}[X_2Y_2]-\mathbb{E}[X_2]\mathbb{E}[Y_2]) \nonumber \\ &= \text{Cov}(X_1,Y_1) + \text{Cov}(X_1,Y_2) + \text{Cov}(X_2,Y_1) + \text{Cov}(X_2,Y_2) = \sum_{j=1}^2 \sum_{i=1}^2 \text{Cov}(X_i,Y_j) \nonumber \\ &= 4\text{Cov}(X_1,Y_1). \end{align}\]
The last equality holds because all four covariance terms in the fifth line are equal: under this model the allele pairs \((X_i,Y_j)\) are exchangeable, so each pair has the same joint distribution.
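The identity can be checked numerically under one concrete model consistent with this exchangeability, namely the random-allele-frequency model from the Mar 15 note: given a shared \(\pi \sim\) BN\((p,f)\), all four alleles are i.i.d. Bernoulli\((\pi)\), so \(\text{Cov}(X_i,Y_j)=\mathbb{V}(\pi)=fp(1-p)\) for every pair and \(\text{Cov}(X,Y)=4fp(1-p)\). Below is a minimal Monte Carlo sketch in Python; the Beta parameterization of BN\((p,f)\) and the parameter values are assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
p, f, n = 0.3, 0.1, 1_000_000

# Shared pi ~ BN(p, f), taken as Beta(p(1-f)/f, (1-p)(1-f)/f):
# mean p, variance f*p*(1-p).
pi = rng.beta(p * (1 - f) / f, (1 - p) * (1 - f) / f, size=n)

# Given pi, the four alleles are i.i.d. Bernoulli(pi).
x1, x2, y1, y2 = (rng.binomial(1, pi) for _ in range(4))
x, y = x1 + x2, y1 + y2

print(np.cov(x, y)[0, 1])        # empirical Cov(X, Y)
print(4 * np.cov(x1, y1)[0, 1])  # 4 * empirical Cov(X1, Y1)
print(4 * f * p * (1 - p))       # theoretical value 4 * V(pi)
```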
Feb 15, 2022
\(\underline{Hints\,\,for\,\,the\,\,law\,\,of\,\,total\,\,variance}\)
Let us begin by proving a simple version of the law of total variance:
\[\begin{align} \mathbb{V}(X) &= \mathbb{E}\big[(X-\mathbb{E}[X])^2\big] \nonumber \\ &= \mathbb{E}\big[(X-\mathbb{E}[X|Y]+\mathbb{E}[X|Y]-\mathbb{E}[X])^2\big] \nonumber \\ &= \Big\{\mathbb{E}\big[(X-\mathbb{E}[X|Y])^2\big] + 2\mathbb{E}\big[(X-\mathbb{E}[X|Y])(\mathbb{E}[X|Y]-\mathbb{E}[X])\big] \Big\}+ \mathbb{E}\big[(\mathbb{E}[X|Y]-\mathbb{E}[X])^2\big] \nonumber \\ &= \Big\{\mathbb{E}[X^2]-\mathbb{E}[(\mathbb{E}[X|Y])^2]\Big\} + \mathbb{E}\big[(\mathbb{E}[X|Y]-\mathbb{E}[\mathbb{E}[X|Y]])^2\big] \nonumber \\ &= \mathbb{E}\big[\mathbb{E}[X^2|Y]-(\mathbb{E}[X|Y])^2\big] + \mathbb{V}(\mathbb{E}[X|Y]) \nonumber \\ &= \mathbb{E}[\mathbb{V}(X|Y)] + \mathbb{V}(\mathbb{E}[X|Y]). \end{align}\]
Line 1 (expanding the variance/covariance term by definition) and line 2 (a small trick to introduce the conditional term) may provide hints for proving the following equation
\[\begin{equation} \text{Cov}(X,Y)=\mathbb{E}[\text{Cov}(X,Y|Z)]+\text{Cov}(\mathbb{E}[X|Z],\mathbb{E}[Y|Z]). \end{equation}\]
More specifically, try to think about two questions: 1) what is the mathematical definition of \(\text{Cov}(X,Y)\)? 2) how can the conditional terms be introduced into the expanded version of \(\text{Cov}(X,Y)\)? Here are more details for deriving the first term in line 4 from line 3:
\[\begin{align} \mathbb{E}\big[(X-\mathbb{E}[X|Y])^2\big]&=\mathbb{E}[X^2]-2\mathbb{E}\big[X\mathbb{E}[X|Y]\big]+\mathbb{E}\big[(\mathbb{E}[X|Y])^2\big] \nonumber \\ 2\mathbb{E}\big[(X-\mathbb{E}[X|Y])(\mathbb{E}[X|Y]-\mathbb{E}[X])\big] &= 2\mathbb{E}\big[X\mathbb{E}[X|Y]\big]-2\mathbb{E}\big[X\mathbb{E}[X]\big]-2\mathbb{E}\big[(\mathbb{E}[X|Y])^2\big]+2\mathbb{E}\big[\mathbb{E}[X|Y]\mathbb{E}[X]\big] \nonumber \\ &= 2\mathbb{E}\big[X\mathbb{E}[X|Y]\big]-2(\mathbb{E}[X])^2-2\mathbb{E}\big[(\mathbb{E}[X|Y])^2\big]+2(\mathbb{E}[X])^2 \nonumber \\ &= 2\mathbb{E}\big[X\mathbb{E}[X|Y]\big]-2\mathbb{E}\big[(\mathbb{E}[X|Y])^2\big] \nonumber \\ \Rightarrow \mathbb{E}\big[(X-\mathbb{E}[X|Y])^2\big]+ 2\mathbb{E}\big[(X-\mathbb{E}[X|Y])(\mathbb{E}[X|Y]-\mathbb{E}[X])\big] &= \mathbb{E}[X^2]-\mathbb{E}\big[(\mathbb{E}[X|Y])^2\big]. \end{align}\]
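Without giving away the derivation, the target identity can at least be verified numerically. Here is a minimal Monte Carlo sketch in Python on a toy hierarchical model (all distributional choices are arbitrary illustrations): with \(Z\sim N(0,1)\), \(X=Z+\varepsilon_1\), and \(Y=Z+\varepsilon_2\) for independent standard normal noise, \(\mathbb{E}[\text{Cov}(X,Y|Z)]=0\) and \(\text{Cov}(\mathbb{E}[X|Z],\mathbb{E}[Y|Z])=\mathbb{V}(Z)=1\).

```python
import numpy as np

rng = np.random.default_rng(0)
n = 1_000_000

# Toy hierarchical model: X and Y share Z but have independent noise.
z = rng.standard_normal(n)
x = z + rng.standard_normal(n)
y = z + rng.standard_normal(n)

# Left-hand side: Cov(X, Y), which should be close to 1.
print(np.cov(x, y)[0, 1])

# Right-hand side: E[Cov(X, Y | Z)] + Cov(E[X|Z], E[Y|Z]).
# Given Z the noises are independent, so the first term is 0;
# E[X|Z] = E[Y|Z] = Z, so the second term is V(Z) = 1.
print(0.0 + np.cov(z, z)[0, 1])
```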
Course ID: STAT 215 / STAT 115 / BST 282
Title: Introduction to Computational Biology and Bioinformatics
Semester: 2019 Spring
Instructor: Dr. X. Shirley Liu
Course ID: BST 215
Title: Linear and Longitudinal Regression
Semester: 2018 Summer
Instructor: Dr. Garrett Fitzmaurice