- screen: stopped jobs / resume / kill. unix.stackexchange
- shell script - replace a character in a string: stackoverflow
July 3, 2015
Useful links for unix
June 18, 2015
Useful links for plots
R:
- ggplot2 tutorials: Edwin Chen's blog, CEB institute, Quick-R
- Box plot: ggplot2
- Violin plot: ggplot2
- Multiple subplots: sthda.com
Python:
- Plots: matplotlib
- Error in plot when DISPLAY is undefined: Stack Overflow
- Drawing a distribution line over a histogram: Stack Overflow
March 30, 2015
Tag SNP and Singleton SNP
Tag SNP:
A group of SNPs in a region of a genome may be in high linkage disequilibrium (LD). In such a case, one SNP, called a tag SNP, represents the whole group. Because genotyping every SNP is costly, often only the tag SNP, rather than all the SNPs in the group, is genotyped to find genetic variation that may be associated with a phenotype.
Singleton SNP:
Sometimes a tag SNP represents only itself, i.e., it is not in high LD with any other SNP in that region. Such a tag SNP is called a singleton SNP.
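To make the two definitions concrete, here is a minimal Python sketch. It is only an illustration under stated assumptions, not the method used in the references: LD is measured as the squared Pearson correlation (r^2) of genotype vectors coded 0/1/2, "high LD" means r^2 >= 0.8, tags are chosen greedily, and the genotype data and the function names (r_squared, ld_neighbours, pick_tag_snps) are made up for this example.

```python
from itertools import combinations


def r_squared(a, b):
    """Squared Pearson correlation between two genotype vectors (0/1/2 coding)."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 0.0  # monomorphic SNP: r^2 is undefined, treated here as no LD
    return cov * cov / (var_a * var_b)


def ld_neighbours(genotypes, threshold=0.8):
    """Map each SNP to the set of SNPs in high LD with it (itself included)."""
    neighbours = {s: {s} for s in genotypes}
    for s, t in combinations(genotypes, 2):
        if r_squared(genotypes[s], genotypes[t]) >= threshold:
            neighbours[s].add(t)
            neighbours[t].add(s)
    return neighbours


def pick_tag_snps(neighbours):
    """Greedy choice: repeatedly pick the SNP covering the most untagged SNPs."""
    untagged = set(neighbours)
    tags = {}
    while untagged:
        best = max(sorted(untagged), key=lambda s: len(neighbours[s] & untagged))
        tags[best] = neighbours[best] & untagged
        untagged -= tags[best]
    return tags


# Made-up genotypes for six individuals (assumption, for illustration only).
genotypes = {
    "snp1": [0, 1, 2, 0, 1, 2],
    "snp2": [0, 1, 2, 0, 1, 2],  # identical to snp1
    "snp3": [2, 1, 0, 2, 1, 0],  # mirror image of snp1, so r^2 is still 1
    "snp4": [0, 2, 0, 1, 1, 2],  # only weakly correlated with the others
}

neighbours = ld_neighbours(genotypes)
tags = pick_tag_snps(neighbours)
# A singleton SNP is in high LD with no SNP other than itself.
singletons = sorted(s for s, group in neighbours.items() if group == {s})

print("tag SNPs:", tags)
print("singleton SNPs:", singletons)
```

In this toy data, snp1 stands in for snp2 and snp3, while snp4 has no partner above the threshold and so ends up as a singleton.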
References:
1) Wikipedia, Tag SNP, accessed on 29 March 2015.
2) Xiayi Ke et al. (2008), Singleton SNPs in the human genome and implications for genome-wide association studies, European Journal of Human Genetics, 16, 506–515.
February 26, 2015
Scale parameter in probability distributions
A parameter s is called a scale parameter of a probability distribution when it satisfies the following property:
F(x|s,θ) = F(x/s | 1,θ)
Here, F is the cumulative distribution function (cdf) and θ represents other parameters in the distribution.
When the distribution has a probability density function (pdf), s must satisfy the following property:
f_s(x) = f(x/s)/s
Here, f is the pdf of the standard distribution (s = 1), and f_s is the pdf of the scaled distribution.
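As a quick numerical check of both properties, here is a small Python sketch for the normal distribution, where the scale parameter is the standard deviation. It uses only the standard library (statistics.NormalDist); the helper names scaled_pdf and scaled_cdf are made up for this example.

```python
from statistics import NormalDist

standard = NormalDist(mu=0.0, sigma=1.0)  # the standard distribution, s = 1


def scaled_pdf(x, s):
    """f_s(x) = f(x/s) / s, with f the standard normal pdf."""
    return standard.pdf(x / s) / s


def scaled_cdf(x, s):
    """F(x | s) = F(x/s | 1), with F the standard normal cdf."""
    return standard.cdf(x / s)


for s in (5, 2, 1, 0.5, 0.3):
    direct = NormalDist(mu=0.0, sigma=s)  # normal with standard deviation s
    for x in (-3.0, -0.5, 0.0, 1.2, 4.0):
        # Both scale-parameter identities should agree with the direct formulas.
        assert abs(scaled_pdf(x, s) - direct.pdf(x)) < 1e-12
        assert abs(scaled_cdf(x, s) - direct.cdf(x)) < 1e-12
    print(f"scale={s}: peak density f_s(0) = {scaled_pdf(0.0, s):.4f}")
```

The printed peak heights shrink as s grows, which is the same effect as the curves getting wider and flatter.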
Intuitively, a larger scale parameter means a wider spread of the distribution. Normal distributions with different scale parameters are shown in the following plot.
And here is the MATLAB code to generate the above plot.
scales = [5, 2, 1, 0.5, 0.3];
x = -10:.001:10;
for s = scales
    % pdf of the scaled distribution: f_s(x) = f(x/s)/s
    y = normpdf(x./s) ./ s;
    plot(x, y);
    hold on   % keep earlier curves on the same axes
end
legend('scale=5', 'scale=2', 'scale=1', 'scale=0.5', 'scale=0.3')
title('Standard normal distribution with different scale parameters')
February 22, 2015
Useful Bioinformatics Links
- Samtools:
  - Install: sudo apt-get install samtools
  - References
- Convert BAM file to Fastq:
  - Install picard: sudo apt-get install picard-tools
  - Use SamToFastq command.
  - Picard project page
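For the BAM to Fastq step, the SamToFastq call can be sketched as a small Python helper that only builds the command line and does not run it. The launcher name "picard-tools" is an assumption (the wrapper script name depends on how Picard was installed; with a plain jar it would be something like java -jar picard.jar), and the file names are placeholders. INPUT and FASTQ are SamToFastq's input and output arguments.

```python
import shlex


def sam_to_fastq_command(bam_path, fastq_path, launcher="picard-tools"):
    """Build (but do not run) a Picard SamToFastq command line.

    `launcher` is an assumption: adjust it to the wrapper or jar
    invocation that your Picard installation provides.
    """
    return [launcher, "SamToFastq", f"INPUT={bam_path}", f"FASTQ={fastq_path}"]


# Placeholder file names, for illustration only.
cmd = sam_to_fastq_command("sample.bam", "sample.fastq")
print(shlex.join(cmd))
```

The returned list can be passed to subprocess.run(cmd, check=True) once the launcher name has been checked on your system.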