July 20, 2014

Databases in Bioinformatics - 1

Databases are now an indispensable part of biology and, of course, of bioinformatics. Online bioinformatics databases boomed in the last decade, and it is impossible for one person to know all of them. Yet, there are some important databases every bioinformatician should know. Dr. Bob Lessick, Associate Director of the Center for Biotechnology Education at Johns Hopkins University, covered some of them in an online course, Bioinformatics: Life Sciences on Your Computer, at Coursera. This post summarizes the databases taught in the course.

1. PubMed (http://pubmed.gov):
PubMed is a free database of scientific publications (references and abstracts) on life sciences and biomedical topics. It is hosted by the US National Library of Medicine (NLM) at the National Institutes of Health (NIH). It contains more than 23 million citations from biomedical literature. You can do a free-text search as well as an advanced search. Following are some example queries.

  • Tan AC [author] (Tan AC [au]) : finds papers authored by persons whose last name is Tan and the initials are AC.
  • Tan AC [au] AND plos one [journal] : finds AC Tan's papers published in the PloS One journal.
  • RNAi [title] AND mello [au]  : finds papers authored by Mello, with RNAi in the title.
  • immunoglo* : finds papers with words starting with immunoglo. Note: * can be put only at the end of a term.
  • "last 10 days" [edat] AND nature [journal] : finds Nature papers entered into PubMed in the last 10 days.
  • 2014/02 [pdat] AND nature [journal] : finds Nature papers published in Feb 2014.
  • 2014/02:2014/03 [pdat] AND nature [journal] : finds Nature papers published between Feb 2014 and March 2014.
  • (DNA[title] OR RNA[title] ) AND 2014/02:2014/04[pdat] AND science[journal] : finds Science papers published between Feb 2014 and April 2014 which have either DNA or RNA in their titles.
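The queries above follow a regular pattern: a term, an optional field tag in square brackets, and Boolean operators between terms. As a minimal sketch (not part of the course material), the shell helpers below assemble such query strings. The function names tagged, pdat_range, and and_all are made up for illustration; they are not part of any PubMed tool.

```shell
# Hypothetical helpers (names are illustrative only).

# tagged TERM FIELD -> 'TERM [FIELD]', e.g. 'mello [au]'
tagged() (
    printf '%s [%s]\n' "$1" "$2"
)

# pdat_range FROM TO -> 'FROM:TO [pdat]' (publication date range)
pdat_range() (
    printf '%s:%s [pdat]\n' "$1" "$2"
)

# and_all TERM... -> all terms joined with ' AND '
and_all() (
    query=""
    for term in "$@"; do
        if [ -z "$query" ]; then
            query="$term"
        else
            query="$query AND $term"
        fi
    done
    printf '%s\n' "$query"
)

# Science papers from Feb 2014 to April 2014 with DNA or RNA in the title.
titles="($(tagged DNA title) OR $(tagged RNA title))"
and_all "$titles" "$(pdat_range 2014/02 2014/04)" "$(tagged science journal)"
# prints: (DNA [title] OR RNA [title]) AND 2014/02:2014/04 [pdat] AND science [journal]
```

The printed string can be pasted into the PubMed search box as it is.
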
2. MeSH
MeSH stands for Medical Subject Headings. It is a controlled vocabulary for biomedical fields. Different words may refer to the same thing; for example, P53 and TP53 both name the same gene. MeSH assigns one controlled term to such a concept. If you search PubMed with that term, you'll get all the papers related to it, no matter how it is spelled in the manuscript. If you search for p53 in the MeSH database, you'd get its controlled term, "Genes, p53". Now you can build PubMed queries like the ones below.
  • "Genes, p53" [MeSH]
  • "Genes, p53" [MeSH] AND nature [journal] AND 2013 [pdat]  
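Such a query can also be sent to PubMed from a terminal through NCBI's E-utilities esearch endpoint. The sketch below is an illustration, not something covered in the course: mesh_term and pubmed_search are hypothetical helper names, and it assumes curl is installed. The curl options -G and --data-urlencode take care of escaping the quotes, spaces, and brackets in the query.

```shell
# mesh_term HEADING -> '"HEADING" [MeSH]'
mesh_term() (
    printf '"%s" [MeSH]\n' "$1"
)

# pubmed_search QUERY: ask esearch for matching PubMed ids (XML output).
# Requires network access; it is only defined here, not called.
pubmed_search() (
    curl -sG 'https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi' \
        --data-urlencode 'db=pubmed' \
        --data-urlencode "term=$1"
)

query="$(mesh_term 'Genes, p53') AND nature [journal] AND 2013 [pdat]"
printf '%s\n' "$query"
# To run the search: pubmed_search "$query"
```
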
3. NCBI Nucleotide (http://www.ncbi.nlm.nih.gov/nucleotide)
Nucleotide is a database of sequences collected from different sources. You can find various kinds of information about genomes, genes, transcripts, etc.

The following figure shows the top portion of a BRCA1 transcript.
  • The LOCUS line contains several pieces of information: accession number, sequence length, molecule type (mRNA means it is a spliced RNA), topology (linear or circular), type of organism (PRI = primates), and last modification date.
  • Every time the sequence is changed, its version number is incremented (and it gets a new GI id).
  • Here, the sequence is from the organism Homo sapiens.
  • Publications about this sequence are listed in the reference section.
  • You can get the sequence in FASTA format by clicking on the "FASTA" link placed below the title. Note: the sequence is also shown in GenBank format at the end of this record page.
  • You can also see the version history and compare versions from the "Display Settings" menu.
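A FASTA record can also be handled on the command line. The following is a rough sketch under some assumptions: fetch_fasta uses NCBI's E-utilities efetch endpoint and needs curl and network access, while fasta_length simply counts the bases in whatever FASTA text it is given. Both function names are hypothetical, and the sample record is made-up toy data, not the BRCA1 transcript.

```shell
# fetch_fasta ACCESSION: download a nucleotide record in FASTA format.
# Requires network access; defined here but not called.
fetch_fasta() (
    curl -sG 'https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi' \
        --data-urlencode 'db=nuccore' \
        --data-urlencode "id=$1" \
        --data-urlencode 'rettype=fasta' \
        --data-urlencode 'retmode=text'
)

# fasta_length: read FASTA on stdin, print the total sequence length.
# Header lines (starting with '>') are skipped.
fasta_length() (
    awk '!/^>/ { n += length($0) } END { print n + 0 }'
)

# Toy record (made-up sequence) to show the counting.
printf '>toy_record example\nATGGATTTAT\nCTGCTC\n' | fasta_length
# prints: 16
```

With a real accession number, the two can be chained as: fetch_fasta ACCESSION | fasta_length
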
The sequence record page also contains features as shown below.

  • Each feature is followed by a location in the sequence. If you click on a feature title in the left column, the corresponding sequence will be highlighted.
  • Here, one exon is located at position 1 to 213.
  • CDS, the coding region including the start codon and stop codon, is an important feature. The translated amino acid sequence is also available here.
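
Feature locations are 1-based and inclusive, so an exon at 1 to 213 covers the first 213 bases of the sequence. As a small sketch of cutting such a region out of a FASTA file, the helper below joins the sequence lines and takes the requested range. The name fasta_subseq is hypothetical, the toy sequence is made up, and the sketch assumes a single record and a plain forward-strand range.

```shell
# fasta_subseq START END: read FASTA on stdin and print bases START..END
# (1-based, inclusive, as in the feature locations).
fasta_subseq() (
    awk -v s="$1" -v e="$2" '
        !/^>/ { seq = seq $0 }
        END   { print substr(seq, s, e - s + 1) }
    '
)

# Bases 9 to 12 of a made-up 16-base record.
printf '>toy_record example\nATGGATTTAT\nCTGCTC\n' | fasta_subseq 9 12
# prints: ATCT
```

For the exon mentioned above, the call would be fasta_subseq 1 213 on the downloaded FASTA file.
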

July 19, 2014

Prepare a workable Ubuntu desktop computer

I started using computers with the Windows operating system, beginning with the ancient Windows 95 back in 1999. I must admit that Windows is pretty easy to learn. The bad thing was that we used cracked versions, not licensed ones. Copyright violations were (and still are) very common in Bangladesh, a poor country, and I was no exception. But I felt guilty inside. I was not aware of Linux at the beginning. I was thrilled when I learned about Linux, a free, open-source operating system: I wouldn't have to violate the law anymore! But then I found, to my utter dismay, that the system was very, very complex. Installing the system, installing software, and basically everything else was far more complex than on Windows. The graphical interface was very poor; you had to write commands to run your programs; you had to memorize syntax. A scarcity of software made the situation worse. I could not play mp3 files even after trying for a day! Ultimately, I came back to Windows, along with the guilty feeling.

Now, Linux has improved very much. It is almost as easy to learn as Windows; sometimes it is even easier. Ubuntu, especially, has improved a great deal. In the meantime, I also got rid of my fear of the command line. The Internet has also made software easily available. A few days ago, I thought, why don't I start using Linux now? Although I now use a licensed copy of Windows, I still have a soft corner for the Linux philosophy: free and open source. So, I decided to move to Ubuntu.

Ubuntu is now mature enough. It is really easy to use these days. If you want to use it, maybe to avoid the guilty feeling of breaking the law, you just need an honest will and a good internet connection. That's it!

Here I note down a few instructions I used to prepare a workable Ubuntu desktop computer for myself.

Basic Installation

  1. Download Ubuntu. I downloaded the 32-bit version. [url]
  2. Create a bootable USB drive. (You may also create a DVD.) [url]
  3. Plug in the USB drive (or insert the DVD) and (re)start your computer.
  4. Set your computer to boot from USB (or DVD). You have to choose the appropriate boot order: the priority of USB (or DVD) has to be higher than that of the hard disk. [url]
  5. Install Ubuntu. [url]

Software Installations

You can easily install common software from the Ubuntu Software Center using its graphical user interface (GUI). However, sometimes you may need to write commands in a terminal (press Ctrl+Alt+T). Personally, I prefer the terminal to the GUI. So, here I note down the commands to write in the terminal.




  1. Avro. [url]
  2. Skype. 
    1. Download: [url]
    2. Install command: sudo dpkg -i skype-ubuntu-precise_4.2.0.11-1_i386.deb 
    3. Sound troubleshooting: [url]
    4. Background noise: System Settings > Sound > Input. Make sure your recording device is not amplified.
    5. To run skype with pulseaudio: PULSE_LATENCY_MSEC=60 skype
  3. Google Chrome 
    1. Download: [url]
    2. Install command: sudo dpkg -i google-chrome-stable_current_i386.deb
  4. Dictionary plug-in in Chrome [url]
  5. VLC media player [url]
    1. Install command: sudo apt-get install vlc 
  6. Git 
    1. sudo apt-get install git 
    2. git config --global user.name "User Name" 
    3. git config --global user.email "UserEmail@example.com" 
  7. Apache Web Server [url]
    1. sudo apt-get install apache2
    2. Check by opening http://localhost/ in a browser.
  8. PHP 
    1. sudo apt-get install php5 
    2. sudo apt-get install libapache2-mod-php5 
    3. To check installation, create a small php page (test.php) in /var/www/html
      <?php
      phpinfo();
      ?>
    4. Finally, open http://localhost/test.php in a web browser (Firefox/Chrome). See this for detailed configuration.
  9. LaTeX [url]
    1. TexStudio: sudo apt-get install texstudio 
    2. Extra packages: sudo apt-get install texlive-latex-extra 
  10. R. [url]
  11. RStudio.
    1. Download: [url]
    2. Install command: sudo dpkg -i rstudio-0.98.945-i386.deb
  12. TeamViewer [url]
    1. Download: wget http://download.teamviewer.com/download/teamviewer_linux.deb 
    2. sudo dpkg -i teamviewer_linux.deb 
  13. Copy [url]
    1. wget https://copy.com/install/linux/Copy.tgz 
    2. sudo tar -xvpzf Copy.tgz -C /etc 
    3. cd /etc/copy/ 
    4. ./x86/CopyAgent 
  14. BitTorrent
    1. Install command:  sudo apt-get install bittorrent
  15. MySQL:
    1. sudo apt-get install mysql-server
    2. sudo apt-get install mysql-workbench
  16. SQLite:
    1. sudo apt-get install sqlite
    2. sudo apt-get install sqlitebrowser 
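
The apt-get steps in the list above can be collected into one small script. This is only a sketch: the package list mirrors the names used in this post (they may differ on other Ubuntu releases), and is_installed and install_command are hypothetical helper names. The script prints the install command instead of running it, so nothing changes on your system until you run the printed line yourself.

```shell
# Packages installed with apt-get in the list above.
PACKAGES="vlc git apache2 php5 libapache2-mod-php5 texstudio texlive-latex-extra bittorrent mysql-server mysql-workbench sqlite sqlitebrowser"

# is_installed COMMAND: succeed if COMMAND is found on the PATH.
is_installed() (
    command -v "$1" >/dev/null 2>&1
)

# install_command PACKAGE...: print one apt-get line for all packages.
install_command() (
    printf 'sudo apt-get install -y %s\n' "$*"
)

# Report a few programs, then show the command that would install everything.
for cmd in git vlc R; do
    if is_installed "$cmd"; then
        echo "$cmd: found"
    else
        echo "$cmd: not found"
    fi
done

# $PACKAGES is left unquoted on purpose so that it splits into separate words.
install_command $PACKAGES
```

The -y option answers "yes" to apt-get's confirmation prompt; leave it out if you prefer to confirm each installation by hand.
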

Tips

  1. Shortcut to open a terminal: Ctrl + Alt + T 
  2. Shortcut to copy-paste in Terminal: Ctrl+Shift+C / Ctrl+Shift+V
  3. Shortcut to toggle desktop: Ctrl+Super+D 
  4. Keyboard shortcuts: [url]
  5. Install/Uninstall *.deb files. [url]
    1. sudo dpkg -i package_file.deb 
  6. Fix dependency error / installation error: 
    1. sudo apt-get -f install  
  7. Disable certificate authentication for wifi connection: 
    1. Open /etc/NetworkManager/system-connections/YOUR-CONNECTION file. 
    2. Edit one configuration: system-ca-certs=false 
  8. If you cannot edit a text file, you might need administrator permission. You can get it by opening gedit (the text editor) in admin mode and then editing the file.
    1. command: sudo gedit 
  9. Close an unresponsive program: [url]
    1. Write a command in terminal: xkill
    2. Your mouse pointer will turn into an 'x'; click on the unresponsive program window.
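
The wifi certificate edit from tip 7 can also be done with a small script instead of a text editor. The following is a sketch under assumptions: disable_system_ca_certs is a hypothetical helper name, and it only changes a system-ca-certs line that already exists in the file. The real connection file under /etc/NetworkManager/system-connections/ belongs to root, so there the function would have to run with administrator permission; the demonstration below works on a throwaway file with made-up contents.

```shell
# disable_system_ca_certs FILE: set an existing 'system-ca-certs=' line to false.
# The result is written back with cat, so the file keeps its owner and permissions.
disable_system_ca_certs() (
    file="$1"
    tmp=$(mktemp) || exit 1
    if sed 's/^system-ca-certs=.*/system-ca-certs=false/' "$file" > "$tmp"; then
        cat "$tmp" > "$file"
    fi
    rm -f "$tmp"
)

# Demonstration on a temporary file (contents are made up).
demo=$(mktemp)
printf '[802-1x]\neap=peap;\nsystem-ca-certs=true\n' > "$demo"
disable_system_ca_certs "$demo"
cat "$demo"
rm -f "$demo"
```
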