cebercoto

First Steps

OK, so where do you start to do bioinformatics? In the only possible place: the Linux command line. What is Linux? If you have to ask that question, we are not off to a great start. Linux is a free, open-source operating system (OS) that anyone can download and install, replacing or alongside Windows (or macOS). Regarding macOS, it is also a Unix-family OS, as Linux is, but instead of free and open-source it is proprietary and closed-source, and this will be the only time I talk about it. Bioinformatics, or science in general, should not be done in an OS as closed and endogamic as macOS.


So you need a PC with Linux on it, and there are several ways to get one. You can remove Windows altogether and install Linux (a little radical for newbies). You can reformat your HDD to leave one partition for Windows and another for Linux (which also involves deleting Windows at some point). You can add an additional HDD (or, preferably, an SSD) and install Linux there, so you don’t have to touch your Windows installation at all. You can install a virtual machine in your Windows installation (with VirtualBox or VMware) and install Linux within it; this process can be tricky and usually requires a very powerful PC to run smoothly, and maybe I will do a post in the future on how to do this from the point of view of bioinformatics. Or, the best option: if you have Windows 10 you are in luck, since now (as of the date of writing this), if you are in the Insider Preview, you can run Linux natively within Windows with the so-called “Windows Subsystem for Linux”. I believe this feature will reach the stable Windows 10 release in a month or two from now, in case you want to wait. For this, just go to the Microsoft Store and search for “Linux”: you’ll get a big link to the Linux apps with the available flavours of Linux (as I am writing this, Ubuntu, openSUSE and SUSE Enterprise).


And what is a flavour (or a distribution) of Linux? It is the compilation or package of the OS that a group of people have put together. Some of them are free, some of them are paid (which is absurd to me). The most popular distributions include Debian, Ubuntu, Red Hat and Fedora. For beginners in bioinformatics, I would recommend Ubuntu, since it has everything, it is free, and it is the easiest of them to install and manage if you are a total n00b. Here is the link to Ubuntu for the Windows Subsystem for Linux in Windows 10. You just have to click install and BOOM, you have Ubuntu on your Windows 10 PC, running natively, fully featured and with access to 100% of the hardware in your box. It is impossible to make it easier.
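Once a distribution is installed, it can be handy to confirm which one you actually ended up with. The following is only a minimal sketch in Python, not part of any official tooling: it assumes the distribution ships the usual `/etc/os-release` file, and it guesses that you are inside the Windows Subsystem for Linux when the word "microsoft" appears in `/proc/version` (a common convention, not a guarantee). The function names `parse_os_release` and `running_under_wsl` are made up for this example.

```python
from pathlib import Path


def parse_os_release(text):
    """Parse the KEY=value lines of an os-release style file into a dict."""
    info = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines, comments, and anything that is not KEY=value.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Values are often wrapped in double or single quotes.
        info[key] = value.strip().strip("\"'")
    return info


def running_under_wsl(proc_version_text):
    """Heuristic (an assumption): WSL kernels mention 'microsoft' in /proc/version."""
    return "microsoft" in proc_version_text.lower()


if __name__ == "__main__":
    os_release = Path("/etc/os-release")
    proc_version = Path("/proc/version")
    if os_release.exists():
        info = parse_os_release(os_release.read_text())
        print("Distribution:", info.get("PRETTY_NAME", "unknown"))
    if proc_version.exists():
        print("Looks like WSL:", running_under_wsl(proc_version.read_text()))
```

Saved to a file and run with `python3` from the Linux command line, it prints the distribution name and whether the WSL heuristic matched.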


Another option is access to a High Performance Computing (HPC) system, which could be part of your university or business infrastructure. These are usually quite powerful, but in general a pain to use, for reasons that I will go into in another post, and definitely not the thing you want for learning. You're much better off learning on your own machine, with an Intel i5 (or preferably i7) and 16 GB (or preferably 32 GB) of RAM.
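If you are not sure what your own machine has, a quick check from inside Linux could look like the sketch below. It is just an illustration under a few assumptions: it reads `/proc/meminfo` (which exists on Linux, including WSL), the helper name `mem_total_gb` is made up for this example, and the 16 GB figure is simply the rule of thumb from this post, not a hard requirement.

```python
import os


def mem_total_gb(meminfo_text):
    """Return total RAM in GiB from the text of /proc/meminfo, or None if not found."""
    for line in meminfo_text.splitlines():
        if line.startswith("MemTotal:"):
            kib = int(line.split()[1])  # the kernel reports this value in kB (KiB)
            return kib / (1024 ** 2)
    return None


if __name__ == "__main__":
    # os.cpu_count() counts logical CPUs (threads), not physical cores.
    print("Logical CPUs:", os.cpu_count())
    try:
        with open("/proc/meminfo") as handle:
            ram = mem_total_gb(handle.read())
    except OSError:
        ram = None  # not on Linux, or /proc is unavailable
    if ram is None:
        print("Could not read the amount of RAM")
    else:
        # The reported total is a bit lower than the installed RAM,
        # because the kernel reserves some memory for itself.
        print(f"RAM: {ram:.1f} GiB")
        if ram < 15:
            print("Below the roughly 16 GB suggested in this post")
```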


One last semi-popular option these days for bioinformatics is to use a cloud computing service like Amazon’s or Google’s. These provide a lot of power and scalability, but they are an even bigger pain to use than HPCs and you pay for CPU time, so they are also not the best option for learning. I will probably make a post in the future comparing local servers, HPCs and cloud computing, with all the pros and cons of each so everyone can find their best solution, but that’s for when you know some shit about bioinformatics, so let’s not get ahead of ourselves.
