Learning to code for Bioinformatics by gwajnberg

![](https://images.ecency.com/DQmQeij544UMh4GFdVU2MATf7KJ5GmSsYB6cULMBR7bxpQF/copy_of_copy_of_copy_of_banner_pizza_news_1_.png)
*banner built in canvas*

<div class = "text-justify">
<p>&emsp; Hello all! In the last group meeting here in my lab, a co-worker mentioned that she was having some problems parsing outputs from bioinformatics tools. She had two outputs: a FASTA file (a plain text file that contains nucleotide sequences, each with an informative header) and a table (a TSV file) listing the sequence ids that are also in the FASTA file, plus genome ids from organisms. She was curating the table manually: finding the genome ids, checking the corresponding sequence ids, and then moving to the FASTA file to look up each sequence id. But we are talking about a FASTA file with thousands of sequences, and the table also had thousands of rows.</p>

![](https://images.ecency.com/DQmbp87hy9vER7Y6W36pTN26hrhYJGK7pKKWBY6UvwpgPZX/image.png)
*A hypothetical FASTA file (with short nucleotide sequences)*

![](https://images.ecency.com/DQme52x9Tu1yunHaATydgwz1JrUjWKAs1P7qByaXjPS9uyj/image.png)
*A hypothetical table with similar data*

<p>&emsp; She is a biologist who used to do PCR experiments and other wet lab work, so I volunteered to help make her life easier. You don't need a huge script to solve this: we made a list of the genome ids of interest, opened the two files, associated the FASTA records with the table rows, and filtered by the genome ids of interest. In the end we produced a FASTA file whose headers concatenate the sequence id and the genome id. Instead of hundreds or thousands of sequences, her final FASTA file had only a few dozen. Programming helps a lot when we deal with large amounts of data.</p>
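The steps above can be sketched in a few lines of Python. This is only a minimal illustration, not the script we actually wrote: the column names `sequence_id` and `genome_id`, the `|` separator in the new header, and the tiny example data are all assumptions made for the sake of the example.

```python
import csv
import io


def read_fasta(handle):
    """Yield (sequence_id, sequence) pairs from FASTA-formatted text."""
    seq_id, chunks = None, []
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if seq_id is not None:
                yield seq_id, "".join(chunks)
            # Keep only the first word of the header as the sequence id.
            parts = line[1:].split()
            seq_id, chunks = (parts[0] if parts else ""), []
        else:
            chunks.append(line)
    if seq_id is not None:
        yield seq_id, "".join(chunks)


def read_table(handle):
    """Map sequence id -> genome id from a TSV table with a header row.

    The column names are hypothetical; adapt them to the real table.
    """
    reader = csv.DictReader(handle, delimiter="\t")
    return {row["sequence_id"]: row["genome_id"] for row in reader}


def filter_fasta(fasta_handle, table_handle, genome_ids_of_interest):
    """Return FASTA text with only the sequences from the wanted genomes."""
    seq_to_genome = read_table(table_handle)
    wanted = set(genome_ids_of_interest)
    records = []
    for seq_id, sequence in read_fasta(fasta_handle):
        genome_id = seq_to_genome.get(seq_id)
        if genome_id in wanted:
            # New header: sequence id and genome id joined together.
            records.append(f">{seq_id}|{genome_id}\n{sequence}\n")
    return "".join(records)


# Made-up example data standing in for the two real files.
fasta_text = ">seq1 sample A\nACGT\nTTGA\n>seq2\nGGCC\n>seq3\nATAT\n"
table_text = (
    "sequence_id\tgenome_id\n"
    "seq1\tNC_001111.11\n"
    "seq2\tNC_002222.1\n"
    "seq3\tNC_001111.11\n"
)

result = filter_fasta(
    io.StringIO(fasta_text), io.StringIO(table_text), ["NC_001111.11"]
)
print(result)
```

With real data you would pass file handles from `open(...)` instead of `io.StringIO`, and write `result` to a new file.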

## How to start

<p>&emsp; My advice here is not to follow the exact path that I took. Everyone has a different path to learning programming, and for a biologist learning to code is very challenging! Why? Because biologists usually learn that nature is subjective and that everything has exceptions. Computers, however, are logical: they follow exact steps. For example, while we were coding, this co-worker didn't understand why, in a condition line (an if statement), NC_001111.11 was different from NC_001111.10. For her it is the same organism, and the only difference is the genome version. A computer doesn't think that way, right? It compares the two ids as text, and they differ; treating them as the same organism is the human interpretation of the data. If you laughed at this example, please don't! This mistake is pretty normal. Sometimes it takes time to understand that a computer isn't a human and doesn't think like a human.</p>
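To make the genome id example concrete, here is a small Python sketch. The helper `accession_without_version` is a hypothetical name introduced only for this illustration, and it assumes the version is the number after the last dot in the id.

```python
def accession_without_version(genome_id):
    """Return the id without its trailing '.version' suffix, if it has one."""
    base, dot, version = genome_id.rpartition(".")
    # Only strip the suffix when it really is a numeric version.
    return base if dot and version.isdigit() else genome_id


a = "NC_001111.11"
b = "NC_001111.10"

# The computer compares the text exactly, so the two ids are different.
exact_match = a == b

# To treat both versions as the same genome, we have to say so explicitly.
same_accession = accession_without_version(a) == accession_without_version(b)

print(exact_match, same_accession)  # prints: False True
```

The point is that the "same organism" rule lives in our heads until we write it down as a step the computer can follow.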

### R

<p>&emsp; I didn't start coding with R, but I think it is a good way to start building some logic for coding. Why? You can see the progress of your script line by line. If you type "1+1" and press Enter, the answer appears in the console. Likewise, if you type "sum = 1+1", press Enter, and then type "sum" and press Enter again, you get the same result. So it is easy to check the value of a variable, and there is no need to call the "print" function to see a result. I have been using R a lot. It is very good for playing with stats and graphical plots, and it has tons of functions to help you out. Of course, you can still write loops and conditions, like in any coding language. In addition, there is a cool IDE called RStudio, also available in the cloud for free (with some memory limitations, of course), where you can see your script, console, and outputs at the same time as you generate them.</p>

![](https://images.ecency.com/DQmczn6Qr8RuDKPRDNcXqRwCaZs1qd55EYfUzXFrQHmAB7x/image.png)
*screenshot of RStudio*

### Python

<p>&emsp; This is probably the most popular first language for biologists and people in similar careers, likely because there are lots of libraries available in the area. Biopython, for example, can read FASTA files and separate the headers from the sequences. I still think Python has fewer ready-to-go functions than R. There are good IDEs for Python: the one I use is called Spyder, but some people use PyCharm, for example. Python also has some interesting libraries for plots, like Matplotlib and Plotly.</p>

![](https://images.ecency.com/DQmTRtFtTKxFuhkr9RudJZk2UbsqnyeuAPZbyyRLi5v4EXF/image.png)
*screenshot from a random script that I have*

### Perl

<p>&emsp; I learned to code using Perl. Devs usually hate this language because it is so flexible: code still works even without indentation, and if you don't put "use strict" at the beginning, you don't even need to declare a variable before using it! Perl code can be very confusing depending on who wrote it. Still, it is an option for learning. It also doesn't have the tons of libraries that Python has, so we probably need more lines to write a program in Perl than in Python, but I love Perl for practicing a bit of regex syntax. I don't know any good IDE for Perl. I used to code in Emacs on a Linux system, where at least I could put some indentation in my code.</p>

### Ruby

<p>&emsp; I don't have much experience with Ruby, but it is similar to Perl: a very flexible language that is good for parsing large amounts of data. People usually write it in nano, Emacs, or similar text editors.</p>

## Conclusion

<p>&emsp; Whichever way you choose to learn to code, it will build experience that helps you learn another language when you need it. Think of coding languages like spoken languages: usually, once you have learned one or two, the third or fourth is easier. Here are some resources for learning:</p>

[Tutorial for learning R in biology sciences](http://compgenomr.github.io/book/)
[Biopython tutorial](https://www.tutorialspoint.com/biopython/index.htm)
[Python for beginners](https://www.python.org/about/gettingstarted/)
[Perl for beginners](https://www.perl.com/pub/2000/10/begperl1.html/)

<p>&emsp; I hope this article helps you on your path.</p>

![](https://images.ecency.com/DQmbniAPLZJ4yJsiNn4UMyZjRX1scfGyDNrVCiyy1Ytrj1n/copy_of_copy_of_copy_of_by_rantree.png)


</div>