Ten Rules For Development of Biological Databases by remlaps

### Introduction
Today I read the article, [Ten Simple Rules for Developing Public Biological Databases](http://journals.plos.org/ploscompbiol/article?id=10.1371/journal.pcbi.1005128) in the Open Access [PLOS Computational Biology Journal](http://journals.plos.org/ploscompbiol/).  The article is by Mohamed Helmy, Alexander Crits-Christoph, and Gary D. Bader.  It was published on November 10, 2016.

According to the authors, there are a large number of public biological databases with varying degrees of quality.  Some are sophisticated, user-friendly, stable, and professional.  Others are difficult to use, aging, and contain unreliable data.  In order to improve the quality of data available to biologists, the authors suggest 10 rules, which are detailed below.

[![computer-1294359_1280.png](https://s11.postimg.org/hsmc6jmdv/computer_1294359_1280.png)](https://postimg.org/image/u7946vdvz/)
*\[Image Source: Pixabay.com, License: CC0, Public Domain\]*

### Ten Rules:
#### 1. Don't reinvent the wheel.

Begin with a comprehensive literature review, in order to guarantee that your database is consistent with related utilities in the field.

#### 2. The three most important things in database development are data quality, data quality, and data quality.

Develop operating procedures and quality standards to make sure that your database has the highest possible quality of data.
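One way to turn a quality standard into something enforceable is an automated check that every record must pass before it is loaded. The sketch below is my own illustration, not something from the article: the `GeneRecord` fields, the `validate` function, and the allowed-base rule are all hypothetical stand-ins for whatever standards a real database would define.

```python
from dataclasses import dataclass
from typing import List

# Assumption: nucleotide sequences may only contain these letters.
VALID_BASES = set("ACGTN")


@dataclass
class GeneRecord:
    """Hypothetical record layout, used only for this illustration."""
    accession: str
    organism: str
    sequence: str


def validate(record: GeneRecord) -> List[str]:
    """Return a list of problems; an empty list means the record passed."""
    problems = []
    if not record.accession.strip():
        problems.append("missing accession")
    if not record.organism.strip():
        problems.append("missing organism")
    if not record.sequence:
        problems.append("empty sequence")
    elif set(record.sequence.upper()) - VALID_BASES:
        problems.append("sequence contains characters other than A, C, G, T, N")
    return problems


good = GeneRecord("X0001", "Escherichia coli", "ATGGCGTAA")
bad = GeneRecord("", "Escherichia coli", "ATGXCG")
print(validate(good))
print(validate(bad))
```

Returning a list of problems rather than raising on the first one lets a curator see everything wrong with a submission in a single pass.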

#### 3. Know your audience

Is your audience technically sophisticated enough to compose their own queries, or do they need a web interface?  What will they do with the data?  What tools and APIs will they need?

#### 4. Use modern technology

Technologies like HTML5, CSS3, and JavaScript are recommended.  So are reusable tools such as Twitter Bootstrap, JavaScript libraries, Shiny for R, and Django for Python, along with freely available database systems such as MySQL and NoSQL stores.  MongoDB and Apache Lucene are recommended for spreading databases across arrays of servers.

#### 5. Put yourself in your user's shoes

I can't say it better than the authors:
> The process of graphical user interface design should be heavily influenced by principles of consistent and appealing graphical design, information visualization, and user-specific needs (see Rule 4).

#### 6. Keep search simple and organized

Search should be quick and easy.  The output should be well organized.
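As a rough sketch of what "simple and organized" could look like, here is a single search box backed by one query, with results returned in a predictable order. This is my own example, not the authors': SQLite stands in for the real database, and the `genes` table, its toy rows, and the `search` function are hypothetical.

```python
import sqlite3


def build_demo_db() -> sqlite3.Connection:
    """Create a tiny in-memory table with toy rows, for illustration only."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE genes (symbol TEXT, organism TEXT, description TEXT)")
    conn.executemany(
        "INSERT INTO genes VALUES (?, ?, ?)",
        [
            ("TP53", "Homo sapiens", "tumor protein p53"),
            ("Brca1", "Mus musculus", "breast cancer 1"),
            ("BRCA1", "Homo sapiens", "DNA repair associated"),
        ],
    )
    conn.commit()
    return conn


def search(conn: sqlite3.Connection, term: str, limit: int = 20):
    """Match one term against symbol or description and return sorted rows.

    SQLite's LIKE ignores case for ASCII letters by default. The term is
    passed as a bound parameter, so user input never becomes part of the SQL.
    Note that % and _ inside the term still act as wildcards in this sketch.
    """
    pattern = f"%{term}%"
    return conn.execute(
        "SELECT symbol, organism, description FROM genes "
        "WHERE symbol LIKE ? OR description LIKE ? "
        "ORDER BY symbol COLLATE NOCASE, organism "
        "LIMIT ?",
        (pattern, pattern, limit),
    ).fetchall()


conn = build_demo_db()
for row in search(conn, "brca"):
    print(row)
```

The fixed sort order and the `limit` argument are what keep the output organized: the same query always returns the same rows in the same order, and a broad term cannot flood the page.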

#### 7. Give users data where they need it

Some users want to access the data online interactively; others want it offline in a spreadsheet or some other utility.  The database should be designed to deliver the data wherever users want it.
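To make that concrete, the same query result can be offered in more than one format: CSV for people who live in spreadsheets, and JSON for people writing scripts. The sketch below is my own and uses only the Python standard library; the `RECORDS` rows, the `FIELDS` list, and both function names are made up for the example.

```python
import csv
import io
import json

# Hypothetical query result; a real site would pull this from its database.
FIELDS = ["symbol", "organism", "protein_length"]
RECORDS = [
    {"symbol": "GENE_A", "organism": "Homo sapiens", "protein_length": 393},
    {"symbol": "GENE_B", "organism": "Mus musculus", "protein_length": 1812},
]


def to_csv(records) -> str:
    """Serialize records as CSV text that opens directly in a spreadsheet."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDS)
    writer.writeheader()
    writer.writerows(records)
    return buffer.getvalue()


def to_json(records) -> str:
    """Serialize the same records as JSON for scripts and other tools."""
    return json.dumps(records, indent=2)


print(to_csv(RECORDS))
print(to_json(RECORDS))
```

Keeping both exports as thin functions over one record list means the formats cannot drift apart: every download option shows exactly the data the web page shows.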

#### 8. Support open science

Publish your data model in a journal and your source code on GitHub.

#### 9. Tell the world

"If you build it they will come," is usually wrong.  In fact, the database needs to be widely promoted in order to attract users.  The authors provide these steps:
  1. Publish an article describing the database.
  2. Index your web site in search engines.
  3. Register your database in specialized online directories.
  4. Promote your database in scientific conferences and meetings.
  5. Monitor online user groups and inform them as appropriate.
  6. Actively use social media to attract users and keep them up to date.

#### 10. Maintain, update, or retire 

The authors provide a set of guidelines to describe what this means:
  1. Use professionally managed servers.
  2. Use virtualization technology.
  3. Make the database available for download/mirroring.
  4. Make regular backups.
  5. Make your URL institution-independent.
  6. Automate monitoring and testing of the system availability and functions.
  7. Provide a mechanism for bug reports.
  8. Choose free and popular development technologies.
  9. If it's outdated and can't be maintained, shut it down and create a public archive.
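Two of these guidelines, regular backups and automated testing of availability, are easy to script. The sketch below is my own and assumes a SQLite database purely for illustration; the `genes` table and the `backup_database` and `is_healthy` functions are hypothetical. It copies a live database to a file with `sqlite3.Connection.backup` and then runs a small smoke test against the copy.

```python
import os
import sqlite3
import tempfile


def backup_database(source: sqlite3.Connection, target_path: str) -> None:
    """Copy the live database into a file using SQLite's online backup API."""
    target = sqlite3.connect(target_path)
    try:
        source.backup(target)
    finally:
        target.close()


def is_healthy(path: str) -> bool:
    """Smoke test: the file passes an integrity check and has data in `genes`."""
    try:
        conn = sqlite3.connect(path)
        try:
            ok = conn.execute("PRAGMA integrity_check").fetchone()[0] == "ok"
            count = conn.execute("SELECT COUNT(*) FROM genes").fetchone()[0]
        finally:
            conn.close()
    except sqlite3.Error:
        return False
    return ok and count > 0


# Demo with a throwaway in-memory database standing in for the live one.
live = sqlite3.connect(":memory:")
live.execute("CREATE TABLE genes (symbol TEXT)")
live.execute("INSERT INTO genes VALUES ('TP53')")
live.commit()

with tempfile.TemporaryDirectory() as tmp:
    backup_path = os.path.join(tmp, "backup.sqlite3")
    backup_database(live, backup_path)
    backup_ok = is_healthy(backup_path)
    # A file with no `genes` table should fail the check.
    missing_ok = is_healthy(os.path.join(tmp, "missing.sqlite3"))

print(backup_ok, missing_ok)
```

In practice a scheduler such as cron would run both steps on a timer and alert someone when `is_healthy` returns `False`, since a backup that has never been tested is not much of a backup.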

### Conclusion
The rules are intended for biological databases, or "online libraries that contain structured information about living organisms," but really, they are good guidelines for many types of public databases.  For some reason, I still remember the software development life cycle from a college Systems Analysis textbook, circa 1989:  "Survey, study, define, select, acquire, design, construct, deliver, maintain."  Sadly, I have no idea what the textbook was, so I can't cite it (but I found it in a diagram [here](http://www.cs.toronto.edu/~jm/340S/PDF2/Intro2.pdf) on the last page).  It's nice to see that these rules from HCD fit loosely into that model.

---
**For more information**, please read [Ten Simple Rules for Developing Public Biological Databases](http://journals.plos.org/ploscompbiol/article?id=10.1371/journal.pcbi.1005128) by Mohamed Helmy, Alexander Crits-Christoph, and Gary D. Bader.

---
**About the Author:** @remlaps is an Information Technology professional with three decades of business experience working with telecommunications and computing technologies. He has a bachelor's degree in mathematics, a master's degree in computer science, and is currently completing a doctoral degree in information technology.
@plotbot2015 · (edited)
I find the same kinds of problems with all sorts of academic software.  Build it, publish it, and move on to the next grant application.  That's why open software development as part of a larger community is so so so important.
@steemstem ·
https://img1.steemit.com/0x0/http://oi63.tinypic.com/20ky92v.jpg

Thank you for taking the time to break this article down for us all and providing us with the opportunity to learn a bit more about how databases can be used in the hard sciences.

As a bonus, and in addition to resteeming for exposure, we are awarding you a small 5 Steem Power deposit as a thank you for creating quality STEM-related postings on Steemit. We hope you will continue to educate us all!


https://steemit.chat/channel/steemSTEM
@remlaps ·
What a pleasant surprise.  You're welcome, and thank you too!