Bioinformatic languages

Unfortunately, you cannot do everything you might want directly on the Linux command line, and even the tasks you can do are not always efficient or easy. Fortunately, there are many programming languages you can use instead, although with so many available it can be hard to know which one to learn. Below is a list of programming languages commonly used in bioinformatics, each with a brief summary of its purpose and links to online resources that introduce the language.

awk

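awk is typically used as a data extraction and reporting tool. It reads text line by line, splits each line into fields, and applies pattern-action rules, which makes it powerful and versatile for working with tabular files.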
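To make "data extraction and reporting" concrete, the sketch below shows the kind of filtering that awk is usually used for, written in Python so that the examples on this page share one language. The table contents, the column layout and the function name extract_rows are assumptions made up for illustration. The comment shows a roughly equivalent awk one-liner, where genes.tsv is a hypothetical file name.

```python
import io

# Roughly equivalent awk one-liner (genes.tsv is a hypothetical file):
#   awk -F'\t' '$3 > 500 {print $1, $3}' genes.tsv


def extract_rows(lines, min_length):
    """Yield (name, length) for tab-separated rows whose third field exceeds min_length."""
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 3:
            continue  # skip rows that do not have three columns
        length = int(fields[2])
        if length > min_length:
            yield fields[0], length


if __name__ == "__main__":
    # Made-up example table: gene name, chromosome, length.
    table = io.StringIO(
        "geneA\tchr1\t1200\n"
        "geneB\tchr2\t300\n"
        "geneC\tchr1\t950\n"
    )
    for name, length in extract_rows(table, 500):
        print(name, length)
```

The awk version is much shorter, which is why awk is popular for quick one-off extractions on the command line.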

Tutorial:

http://www.grymoire.com/Unix/Awk.html

Manual:

https://www.gnu.org/software/gawk/manual/gawk.html

Python

Python is a general-purpose language used for software development and many other applications. It is favoured in bioinformatics because it is relatively easy to learn and handles strings well (e.g. DNA and protein sequences). There are also many Python packages that help with the analysis of biological and other scientific data.
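As a small illustration of the string handling mentioned above, here is a minimal sketch that uses only built-in Python string methods. The function names and the example sequence are assumptions chosen for illustration. For real analyses, a package such as BioPython provides tested versions of these operations.

```python
# Translation table mapping each DNA base to its complement.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def gc_content(seq: str) -> float:
    """Return the fraction of bases in seq that are G or C."""
    seq = seq.upper()
    if not seq:
        return 0.0  # avoid dividing by zero for an empty sequence
    return (seq.count("G") + seq.count("C")) / len(seq)


if __name__ == "__main__":
    dna = "ATGCGTTA"  # made-up example sequence
    print(reverse_complement(dna))  # prints TAACGCAT
    print(gc_content(dna))          # prints 0.375
```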

Python website:

https://www.python.org/

BioPython:

https://biopython.org/

Perl

Perl is similar to Python and is used for similar tasks. The main difference is the syntax, i.e. how the code looks.

Tutorial:

https://www.tutorialspoint.com/perl/perl_introduction.htm

Perl website:

https://www.perl.org/

BioPerl:

https://bioperl.org/

Python or Perl?

People generally choose to learn either Python or Perl. However, many bioinformatic tools are written in Perl and many others in Python. It is therefore good to know a little of both so you can debug other people's scripts, and to specialise in one so you can write your own. We suggest specialising in Python because it is easy to learn and increasingly popular in bioinformatics. There are also other up-and-coming programming languages that are gaining popularity in bioinformatics, so you may like to take a look at them.

Ruby

Ruby is a general-purpose programming language that is becoming more popular in bioinformatics because it is friendly to beginners.

Tutorial:

http://rubylearning.com/

Ruby website:

https://www.ruby-lang.org/en/

BioRuby:

http://bioruby.org/

Why learn Ruby?

http://www.bestprogramminglanguagefor.me/why-learn-ruby

Golang (AKA Go)

Go is another programming language that is relatively easy to learn, and it has a library, biogo, made specifically for bioinformatics.

Tutorial:

https://tour.golang.org/welcome/1

Go website:

https://golang.org/

biogo:

https://github.com/biogo/biogo

R

R is a very powerful programming language for statistical analysis and visualisation. Unfortunately it has a steep learning curve, and its documentation is often unclear. Even so, it is unlikely to be surpassed by another language any time soon, owing to its widespread use and the large number of useful, freely available packages for many purposes. We recommend using the RStudio IDE when working with R.

Tutorial:

https://swirlstats.com/

R website:

https://www.r-project.org/

CRAN website:

https://cran.r-project.org/

RStudio website:

https://www.rstudio.com/