<?xml version='1.0' encoding='UTF-8'?><?xml-stylesheet href="http://www.blogger.com/styles/atom.css" type="text/css"?><feed xmlns='http://www.w3.org/2005/Atom' xmlns:openSearch='http://a9.com/-/spec/opensearchrss/1.0/' xmlns:georss='http://www.georss.org/georss' xmlns:gd='http://schemas.google.com/g/2005' xmlns:thr='http://purl.org/syndication/thread/1.0'><id>tag:blogger.com,1999:blog-14515087</id><updated>2012-02-16T13:46:43.823-02:00</updated><category term='linux'/><category term='PerM'/><category term='schduler'/><category term='FastQ'/><category term='PBS'/><category term='mosaik'/><category term='sysadmin'/><category term='maui'/><category term='ion torrent'/><category term='cluster'/><category term='SAM'/><category term='torque'/><category term='conversion'/><category term='github'/><category term='BAM'/><category term='mapping'/><category term='open source'/><category term='brazil'/><category term='hadoop'/><category term='mappimg'/><category term='solid'/><category term='job'/><category term='life technologies'/><category term='python'/><category term='haskell'/><category term='TOP500'/><category term='bowtie'/><category term='parallel'/><category term='Fasta'/><category term='performance'/><title type='text'>Bioinfo BR</title><subtitle type='html'>Comments and tips on bioinformatics, SOLiD, Ion Torrent and Next Generation Sequencing</subtitle><link rel='http://schemas.google.com/g/2005#feed' type='application/atom+xml' href='http://bioinfo-br.blogspot.com/feeds/posts/default'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/14515087/posts/default?max-results=100'/><link rel='alternate' type='text/html' href='http://bioinfo-br.blogspot.com/'/><link rel='hub' href='http://pubsubhubbub.appspot.com/'/><author><name>Leonardo Varuzza</name><uri>https://profiles.google.com/106428020899659897437</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' 
src='//lh4.googleusercontent.com/-VfVeE_sAxds/AAAAAAAAAAI/AAAAAAAAAAA/ohSMK6AOy8g/s512-c/photo.jpg'/></author><generator version='7.00' uri='http://www.blogger.com'>Blogger</generator><openSearch:totalResults>58</openSearch:totalResults><openSearch:startIndex>1</openSearch:startIndex><openSearch:itemsPerPage>100</openSearch:itemsPerPage><entry><id>tag:blogger.com,1999:blog-14515087.post-2962872184837650032</id><published>2012-02-16T11:32:00.002-02:00</published><updated>2012-02-16T13:46:43.835-02:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='life technologies'/><category scheme='http://www.blogger.com/atom/ns#' term='job'/><title type='text'>Bioinformatician position at Life Technologies</title><content type='html'>Hi everyone,&lt;br /&gt;&lt;br /&gt;We are opening a bioinformatician position here at Life Technologies (formerly Applied Biosystems). The role provides bioinformatics support to users of the SOLiD, 5500 and Ion Torrent next-generation sequencers. 
Candidates with a graduate degree in bioinformatics and experience in next generation sequencing analysis are preferred; experience with Linux and proficiency in English are mandatory.&lt;br /&gt;&lt;br /&gt;Those interested can apply on the site:&lt;br /&gt;&lt;br /&gt;&lt;a href="http://www.lifetechnologies.com/br/en/home/about-us/careers.html" target="_blank"&gt;http://www.lifetechnologies.com/br/en/home/about-us/careers.html&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;Just go to the Job Search box and enter the code &lt;span lang="PT-BR"&gt;9501BR&lt;/span&gt;.&lt;br /&gt;&lt;br /&gt;PS: We have a second opening in Mexico; Spanish speakers who are interested in relocating can apply for position 9372BR.&lt;br /&gt;&lt;br /&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/14515087-2962872184837650032?l=bioinfo-br.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://bioinfo-br.blogspot.com/feeds/2962872184837650032/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=14515087&amp;postID=2962872184837650032' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/14515087/posts/default/2962872184837650032'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/14515087/posts/default/2962872184837650032'/><link rel='alternate' type='text/html' href='http://bioinfo-br.blogspot.com/2012/02/vaga-de-bioinformata-na-life.html' title='Bioinformatician position at Life Technologies'/><author><name>Leonardo Varuzza</name><uri>https://profiles.google.com/106428020899659897437</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' 
src='//lh4.googleusercontent.com/-VfVeE_sAxds/AAAAAAAAAAI/AAAAAAAAAAA/ohSMK6AOy8g/s512-c/photo.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-14515087.post-7836333555306003396</id><published>2012-02-03T17:19:00.000-02:00</published><updated>2012-02-03T17:36:58.738-02:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='open source'/><category scheme='http://www.blogger.com/atom/ns#' term='github'/><category scheme='http://www.blogger.com/atom/ns#' term='ion torrent'/><title type='text'>Ion Torrent: The Open Source Sequencer</title><content type='html'>&lt;div&gt;Starting with version 2.0, the PGM makes the source code of its analysis pipeline, mapper, Variant Caller and TorrentScout utility available on GitHub:&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Torrent Suite 2.0:&amp;nbsp;&lt;a class="jive-link-external-small" href="https://github.com/iontorrent/TS"&gt;http://github.com/iontorrent/TS&lt;/a&gt;&lt;/div&gt;&lt;div&gt;TMAP:&amp;nbsp;&lt;a class="jive-link-external-small" href="https://github.com/iontorrent/TMAP"&gt;http://github.com/iontorrent/TMAP&lt;/a&gt;&lt;/div&gt;&lt;div&gt;Torrent Scout:&amp;nbsp;&lt;a class="jive-link-external-small" href="https://github.com/iontorrent/TorrentScout"&gt;http://github.com/iontorrent/TorrentScout&lt;/a&gt;&lt;/div&gt;&lt;div&gt;Torrent Variant Caller Plugin:&amp;nbsp;&lt;a class="jive-link-external-small" href="https://github.com/iontorrent/Ion-Variant-Hunter"&gt;http://github.com/iontorrent/Ion-Variant-Hunter&lt;/a&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Now we really do have a completely open source sequencer :-)&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://bioinfo-br.blogspot.com/feeds/7836333555306003396/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=14515087&amp;postID=7836333555306003396' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/14515087/posts/default/7836333555306003396'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/14515087/posts/default/7836333555306003396'/><link rel='alternate' type='text/html' href='http://bioinfo-br.blogspot.com/2012/02/open-source-sequencer.html' title='Ion Torrent: The Open Source Sequencer'/><author><name>Leonardo Varuzza</name><uri>https://profiles.google.com/106428020899659897437</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' 
src='//lh4.googleusercontent.com/-VfVeE_sAxds/AAAAAAAAAAI/AAAAAAAAAAA/ohSMK6AOy8g/s512-c/photo.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-14515087.post-7524979415079249974</id><published>2011-12-12T16:40:00.002-02:00</published><updated>2011-12-12T16:40:46.116-02:00</updated><title type='text'>5500 Wild Fire</title><content type='html'>&lt;br /&gt;Video about the 5500W (aka Wild Fire), which will be released next year:&lt;br /&gt;&lt;br /&gt;&lt;iframe allowfullscreen="" frameborder="0" height="315" src="http://www.youtube.com/embed/4YKFYiPKI1I" width="420"&gt;&lt;/iframe&gt;</content><link rel='replies' type='application/atom+xml' 
href='http://bioinfo-br.blogspot.com/feeds/7524979415079249974/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=14515087&amp;postID=7524979415079249974' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/14515087/posts/default/7524979415079249974'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/14515087/posts/default/7524979415079249974'/><link rel='alternate' type='text/html' href='http://bioinfo-br.blogspot.com/2011/12/5500-wild-fire.html' title='5500 Wild Fire'/><author><name>Leonardo Varuzza</name><uri>https://profiles.google.com/106428020899659897437</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='//lh4.googleusercontent.com/-VfVeE_sAxds/AAAAAAAAAAI/AAAAAAAAAAA/ohSMK6AOy8g/s512-c/photo.jpg'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://img.youtube.com/vi/4YKFYiPKI1I/default.jpg' height='72' width='72'/><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-14515087.post-8637323206315590239</id><published>2011-11-25T14:52:00.001-02:00</published><updated>2011-11-25T14:52:09.145-02:00</updated><title type='text'>Simple SNP calling and annotation on the command line</title><content type='html'>This is a simple recipe for SNP calling and annotation:&lt;br /&gt;&lt;br /&gt;First, use samtools (http://samtools.sourceforge.net/) to call the SNPs:&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;samtools mpileup -uf ref.fa aln1.bam aln2.bam | bcftools view -vcg - &amp;gt; var.vcf&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;Then use snpEff (http://snpeff.sourceforge.net/) to annotate the VCF file:&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;java -jar snpEff.jar download hg19&lt;br /&gt;&lt;br /&gt;java -jar 
snpEff.jar hg19 var.vcf &amp;gt; annotation.tsv&lt;br /&gt;&lt;br /&gt;(If you are working with another genome, change hg19 to the name of that genome; the list of available genomes is on the snpEff site.)&lt;br /&gt;&lt;br /&gt;snpEff reports the name of the SNP, the changed residue, the codon, etc., which is very useful.&lt;br /&gt;&lt;br /&gt;I tested this with Ion Amplicon data; I think it should work with SOLiD too.&lt;br /&gt;&lt;br /&gt;</content><link rel='replies' type='application/atom+xml' href='http://bioinfo-br.blogspot.com/feeds/8637323206315590239/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=14515087&amp;postID=8637323206315590239' title='1 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/14515087/posts/default/8637323206315590239'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/14515087/posts/default/8637323206315590239'/><link rel='alternate' type='text/html' href='http://bioinfo-br.blogspot.com/2011/11/simple-snp-calling-and-annotation-in.html' title='Simple SNP calling and annotation on the command line'/><author><name>Leonardo Varuzza</name><uri>https://profiles.google.com/106428020899659897437</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='//lh4.googleusercontent.com/-VfVeE_sAxds/AAAAAAAAAAI/AAAAAAAAAAA/ohSMK6AOy8g/s512-c/photo.jpg'/></author><thr:total>1</thr:total></entry><entry><id>tag:blogger.com,1999:blog-14515087.post-5168046530106354410</id><published>2011-10-24T21:26:00.002-02:00</published><updated>2011-10-24T21:26:49.210-02:00</updated><title type='text'>Hybrid Assembly Pipeline with SOLiD</title><content type='html'>A tool was released today 
for combining the reads of a SOLiD mate-pair library with contigs generated by another technology (or even by SOLiD itself).&lt;br /&gt;&lt;br /&gt;The idea is to map the pairs onto the contigs, then build a de Bruijn graph of the possible contig orderings, correct the graph and generate a scaffold. Apparently this program gives much more accurate results than &lt;a href="http://www.biomedcentral.com/1471-2105/11/345"&gt;SOPRA&lt;/a&gt;.&lt;br /&gt;&lt;br /&gt;For anyone doing de novo assembly with SOLiD, it is a very useful tool.&lt;br /&gt;&lt;br /&gt;</content><link rel='replies' type='application/atom+xml' href='http://bioinfo-br.blogspot.com/feeds/5168046530106354410/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=14515087&amp;postID=5168046530106354410' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/14515087/posts/default/5168046530106354410'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/14515087/posts/default/5168046530106354410'/><link rel='alternate' type='text/html' href='http://bioinfo-br.blogspot.com/2011/10/hybrid-assembly-pipeline-with-solid.html' title='Hybrid Assembly Pipeline with SOLiD'/><author><name>Leonardo Varuzza</name><uri>https://profiles.google.com/106428020899659897437</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' 
src='//lh4.googleusercontent.com/-VfVeE_sAxds/AAAAAAAAAAI/AAAAAAAAAAA/ohSMK6AOy8g/s512-c/photo.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-14515087.post-8208083562203919363</id><published>2011-07-20T16:22:00.002-03:00</published><updated>2011-07-20T16:22:55.105-03:00</updated><title type='text'>The Ion Torrent paper</title><content type='html'>The paper describing the Ion Torrent has been published in Nature:&lt;br /&gt;&lt;br /&gt;&lt;a href="http://www.nature.com/nature/journal/v475/n7356/full/nature10242.html"&gt;http://www.nature.com/nature/journal/v475/n7356/full/nature10242.html&lt;/a&gt;</content><link rel='replies' type='application/atom+xml' href='http://bioinfo-br.blogspot.com/feeds/8208083562203919363/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=14515087&amp;postID=8208083562203919363' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/14515087/posts/default/8208083562203919363'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/14515087/posts/default/8208083562203919363'/><link rel='alternate' type='text/html' href='http://bioinfo-br.blogspot.com/2011/07/paper-do-ion-torrent.html' title='The Ion Torrent paper'/><author><name>Leonardo Varuzza</name><uri>https://profiles.google.com/106428020899659897437</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' 
src='//lh4.googleusercontent.com/-VfVeE_sAxds/AAAAAAAAAAI/AAAAAAAAAAA/ohSMK6AOy8g/s512-c/photo.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-14515087.post-7220850327689911965</id><published>2011-06-04T23:07:00.000-03:00</published><updated>2011-06-04T23:07:23.395-03:00</updated><title type='text'>Ion Torrent sequencia cepa de e. coli que está causando surto na Europa</title><content type='html'>Utilizando o Ion Torrent pesquisadores identificaram que a e. coli que está causando o surto na europa é a compinação de duas outras cepas mortais:&lt;br /&gt;&lt;br /&gt;&lt;a href="http://www.courant.com/health/connecticut/hc-sequencing-e-coli-0603-20110602,0,2978920.story"&gt;http://www.courant.com/health/connecticut/hc-sequencing-e-coli-0603-20110602,0,2978920.story&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;Matéria no jornal da band:&lt;br /&gt;&lt;br /&gt;&lt;a href="http://videos.band.com.br/Exibir/Alemanha-diz-que-bacteria-nao-esta-em-pepino-espanhol/2c9f94b630297d9901304d534290161b?channel=587"&gt;http://videos.band.com.br/Exibir/Alemanha-diz-que-bacteria-nao-esta-em-pepino-espanhol/2c9f94b630297d9901304d534290161b?channel=587&lt;/a&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/14515087-7220850327689911965?l=bioinfo-br.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://bioinfo-br.blogspot.com/feeds/7220850327689911965/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=14515087&amp;postID=7220850327689911965' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/14515087/posts/default/7220850327689911965'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/14515087/posts/default/7220850327689911965'/><link rel='alternate' type='text/html' 
href='http://bioinfo-br.blogspot.com/2011/06/ion-torrent-sequencia-cepa-de-e-coli.html' title='Ion Torrent sequencia cepa de e. coli que está causando surto na Europa'/><author><name>Leonardo Varuzza</name><uri>https://profiles.google.com/106428020899659897437</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='//lh4.googleusercontent.com/-VfVeE_sAxds/AAAAAAAAAAI/AAAAAAAAAAA/ohSMK6AOy8g/s512-c/photo.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-14515087.post-5765629237448699296</id><published>2011-05-26T11:03:00.000-03:00</published><updated>2011-05-26T11:03:56.120-03:00</updated><title type='text'>De novo assembly with the Ion Torrent</title><content type='html'>A great blog post on genome assembly using Ion Torrent data:&lt;br /&gt;&lt;br /&gt;&lt;a href="http://pathogenomics.bham.ac.uk/blog/2011/05/ion-torrent-data-blog-post-a-week-is-a-long-time-in-genomics/"&gt;http://pathogenomics.bham.ac.uk/blog/2011/05/ion-torrent-data-blog-post-a-week-is-a-long-time-in-genomics/&lt;/a&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/14515087-5765629237448699296?l=bioinfo-br.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://bioinfo-br.blogspot.com/feeds/5765629237448699296/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=14515087&amp;postID=5765629237448699296' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/14515087/posts/default/5765629237448699296'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/14515087/posts/default/5765629237448699296'/><link rel='alternate' type='text/html' href='http://bioinfo-br.blogspot.com/2011/05/de-novo-assembly-com-o-ion-torrent.html' 
title='De Novo assembly com o Ion Torrent'/><author><name>Leonardo Varuzza</name><uri>https://profiles.google.com/106428020899659897437</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='//lh4.googleusercontent.com/-VfVeE_sAxds/AAAAAAAAAAI/AAAAAAAAAAA/ohSMK6AOy8g/s512-c/photo.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-14515087.post-5374541170427754692</id><published>2011-04-20T11:51:00.000-03:00</published><updated>2011-04-20T11:51:22.834-03:00</updated><title type='text'>News about the Ion Torrent launch</title><content type='html'>On the Estadão blog:&lt;br /&gt;&lt;br /&gt;&lt;a href="http://blogs.estadao.com.br/link/projeto-genoma-para-todos/"&gt;http://blogs.estadao.com.br/link/projeto-genoma-para-todos/&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;In Valor Econômico:&lt;br /&gt;&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://lifetechnextgen.com.br/images/noticias/lifetech_valor19042011.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" height="310" src="http://lifetechnextgen.com.br/images/noticias/lifetech_valor19042011.jpg" width="320" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;&lt;br /&gt;In Info:&lt;br /&gt;&lt;br /&gt;&lt;a href="http://info.abril.com.br/noticias/ciencia/novo-chip-sequenciador-de-dna-chega-ao-brasil-18042011-28.shl"&gt;http://info.abril.com.br/noticias/ciencia/novo-chip-sequenciador-de-dna-chega-ao-brasil-18042011-28.shl&lt;/a&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/14515087-5374541170427754692?l=bioinfo-br.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://bioinfo-br.blogspot.com/feeds/5374541170427754692/comments/default' title='Post Comments'/><link rel='replies' type='text/html' 
href='http://www.blogger.com/comment.g?blogID=14515087&amp;postID=5374541170427754692' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/14515087/posts/default/5374541170427754692'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/14515087/posts/default/5374541170427754692'/><link rel='alternate' type='text/html' href='http://bioinfo-br.blogspot.com/2011/04/noticias-sobre-o-lancamento-do-ion.html' title='Notícias sobre o Lançamento do Ion Torrent'/><author><name>Leonardo Varuzza</name><uri>https://profiles.google.com/106428020899659897437</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='//lh4.googleusercontent.com/-VfVeE_sAxds/AAAAAAAAAAI/AAAAAAAAAAA/ohSMK6AOy8g/s512-c/photo.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-14515087.post-2154658624022002332</id><published>2011-04-02T20:54:00.000-03:00</published><updated>2011-04-02T20:54:49.726-03:00</updated><title type='text'>Photos of the SOLiD 5500 and the Ion Torrent</title><content type='html'>A few photos I took at PAG 2011.&lt;br /&gt;&lt;br /&gt;Fernando with the SOLiD 5500:&lt;br /&gt;&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://www.flickr.com/photos/varuzza/5583012859/#/photos/varuzza/5583012859/lightbox/" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" height="212" src="http://farm6.static.flickr.com/5133/5583012859_066b4b63c2.jpg" width="320" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;&lt;br /&gt;And the Ion Torrent:&lt;br /&gt;&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://www.flickr.com/photos/varuzza/5583600544/lightbox/" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" height="221" 
src="http://farm6.static.flickr.com/5230/5583600544_b1b4479be8.jpg" width="320" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/14515087-2154658624022002332?l=bioinfo-br.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://bioinfo-br.blogspot.com/feeds/2154658624022002332/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=14515087&amp;postID=2154658624022002332' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/14515087/posts/default/2154658624022002332'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/14515087/posts/default/2154658624022002332'/><link rel='alternate' type='text/html' href='http://bioinfo-br.blogspot.com/2011/04/fotos-do-solid-5500-e-do-ion-torrent.html' title='Fotos do SOLiD 5500 e do Ion Torrent'/><author><name>Leonardo Varuzza</name><uri>https://profiles.google.com/106428020899659897437</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='//lh4.googleusercontent.com/-VfVeE_sAxds/AAAAAAAAAAI/AAAAAAAAAAA/ohSMK6AOy8g/s512-c/photo.jpg'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://farm6.static.flickr.com/5133/5583012859_066b4b63c2_t.jpg' height='72' width='72'/><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-14515087.post-8769654813911664137</id><published>2011-04-01T23:48:00.000-03:00</published><updated>2011-04-01T23:48:03.087-03:00</updated><title type='text'>Multithreading in Velvet 1.1</title><content type='html'>Velvet 1.1.01 now supports multithreading via OpenMP, and according to the announcement memory usage has also improved [1].&lt;br /&gt;&lt;br /&gt;Version 1.1.02 followed shortly afterwards to 
fix some bugs in the previous release [2]. I tested this version with the DH10B library sequenced on SOLiD. I did not measure the run time, but I saw that all 8 cores of the test machine were actually being used and that velvetg ran noticeably faster.&lt;br /&gt;&lt;br /&gt;[1]&amp;nbsp;&lt;a href="http://listserver.ebi.ac.uk/pipermail/velvet-users/2011-March/001303.html"&gt;http://listserver.ebi.ac.uk/pipermail/velvet-users/2011-March/001303.html&lt;/a&gt;&lt;br /&gt;[2]&amp;nbsp;&lt;a href="http://listserver.ebi.ac.uk/pipermail/velvet-users/2011-March/001311.html"&gt;http://listserver.ebi.ac.uk/pipermail/velvet-users/2011-March/001311.html&lt;/a&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/14515087-8769654813911664137?l=bioinfo-br.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://bioinfo-br.blogspot.com/feeds/8769654813911664137/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=14515087&amp;postID=8769654813911664137' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/14515087/posts/default/8769654813911664137'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/14515087/posts/default/8769654813911664137'/><link rel='alternate' type='text/html' href='http://bioinfo-br.blogspot.com/2011/04/multithreading-no-velvet-11.html' title='Multithreading no Velvet 1.1'/><author><name>Leonardo Varuzza</name><uri>https://profiles.google.com/106428020899659897437</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' 
src='//lh4.googleusercontent.com/-VfVeE_sAxds/AAAAAAAAAAI/AAAAAAAAAAA/ohSMK6AOy8g/s512-c/photo.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-14515087.post-5354007598982414518</id><published>2011-03-10T11:29:00.000-03:00</published><updated>2011-03-10T11:29:40.014-03:00</updated><title type='text'>BAM file statistics</title><content type='html'>An easy way to extract some statistics from a BAM file is to use the command&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;samtools flagstat &amp;lt;file.bam&amp;gt;&lt;br /&gt;&lt;br /&gt;Another program that looks very interesting is samstat:&lt;br /&gt;&lt;br /&gt;&lt;a href="http://samstat.sourceforge.net/"&gt;http://samstat.sourceforge.net/&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;It generates an HTML report on the alignment.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/14515087-5354007598982414518?l=bioinfo-br.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://bioinfo-br.blogspot.com/feeds/5354007598982414518/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=14515087&amp;postID=5354007598982414518' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/14515087/posts/default/5354007598982414518'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/14515087/posts/default/5354007598982414518'/><link rel='alternate' type='text/html' href='http://bioinfo-br.blogspot.com/2011/03/estatistica-de-arquivo-bam.html' title='Estatística de arquivo BAM'/><author><name>Leonardo Varuzza</name><uri>https://profiles.google.com/106428020899659897437</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' 
src='//lh4.googleusercontent.com/-VfVeE_sAxds/AAAAAAAAAAI/AAAAAAAAAAA/ohSMK6AOy8g/s512-c/photo.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-14515087.post-528370231820802960</id><published>2011-02-25T08:18:00.002-03:00</published><updated>2011-02-25T08:18:51.517-03:00</updated><title type='text'>Ion Torrent sequences the genome of Intel's Gordon Moore</title><content type='html'>Ion Torrent announced the sequencing of the genome of Gordon Moore, co-founder of Intel and author of Moore's law:&lt;br /&gt;&lt;br /&gt;&lt;a href="http://www.genomeweb.com/sequencing/ion-torrent-sequences-intel-co-founders-genome-finds-more-uniform-coverage"&gt;http://www.genomeweb.com/sequencing/ion-torrent-sequences-intel-co-founders-genome-finds-more-uniform-coverage&lt;/a&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/14515087-528370231820802960?l=bioinfo-br.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://bioinfo-br.blogspot.com/feeds/528370231820802960/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=14515087&amp;postID=528370231820802960' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/14515087/posts/default/528370231820802960'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/14515087/posts/default/528370231820802960'/><link rel='alternate' type='text/html' href='http://bioinfo-br.blogspot.com/2011/02/ion-torrent-sequencia-o-genome-de.html' title='Ion Torrent Sequencia o Genome de Gordon Moore da Intel'/><author><name>Leonardo Varuzza</name><uri>https://profiles.google.com/106428020899659897437</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' 
src='//lh4.googleusercontent.com/-VfVeE_sAxds/AAAAAAAAAAI/AAAAAAAAAAA/ohSMK6AOy8g/s512-c/photo.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-14515087.post-4174256841605336346</id><published>2011-02-24T17:37:00.001-03:00</published><updated>2011-02-24T19:00:56.922-03:00</updated><title type='text'>Visualizing Velvet graphs</title><content type='html'>Velvet includes a script that converts the LastGraph file to the dot format. The problem is that the graphviz viewer is quite crude, but a dot file can also be opened in the free viewer Gephi:&lt;br /&gt;&lt;br /&gt;&lt;a href="http://gephi.org/"&gt;http://gephi.org/&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;The rendered graph is shown below:&lt;br /&gt;&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://1.bp.blogspot.com/-2KXrJlPoSrs/TWbBVvkJsHI/AAAAAAAAClY/98hZNRXA-U4/s1600/gephi.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" height="400" src="http://1.bp.blogspot.com/-2KXrJlPoSrs/TWbBVvkJsHI/AAAAAAAAClY/98hZNRXA-U4/s400/gephi.png" width="337" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/14515087-4174256841605336346?l=bioinfo-br.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://bioinfo-br.blogspot.com/feeds/4174256841605336346/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=14515087&amp;postID=4174256841605336346' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/14515087/posts/default/4174256841605336346'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/14515087/posts/default/4174256841605336346'/><link rel='alternate' 
type='text/html' href='http://bioinfo-br.blogspot.com/2011/02/visualizando-os-grafos-do-velvet.html' title='Visualizando os grafos do Velvet'/><author><name>Leonardo Varuzza</name><uri>https://profiles.google.com/106428020899659897437</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='//lh4.googleusercontent.com/-VfVeE_sAxds/AAAAAAAAAAI/AAAAAAAAAAA/ohSMK6AOy8g/s512-c/photo.jpg'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://1.bp.blogspot.com/-2KXrJlPoSrs/TWbBVvkJsHI/AAAAAAAAClY/98hZNRXA-U4/s72-c/gephi.png' height='72' width='72'/><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-14515087.post-3880364646017549159</id><published>2011-02-23T21:04:00.000-03:00</published><updated>2011-02-23T21:04:17.402-03:00</updated><title type='text'>1 Gbp chip for Ion Torrent announced for the second half of 2011</title><content type='html'>A new version of the Ion Torrent sequencing chip, the 318, was announced today. It can sequence up to 1 Gbp in reads of at least 300 bp.&lt;br /&gt;&lt;br /&gt;Further comments on this announcement at the Omics! 
Omics! blog:&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;a href="http://omicsomics.blogspot.com/2011/02/has-ion-torrent-taken-318-sized-lead.html"&gt;http://omicsomics.blogspot.com/2011/02/has-ion-torrent-taken-318-sized-lead.html&lt;/a&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/14515087-3880364646017549159?l=bioinfo-br.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://bioinfo-br.blogspot.com/feeds/3880364646017549159/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=14515087&amp;postID=3880364646017549159' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/14515087/posts/default/3880364646017549159'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/14515087/posts/default/3880364646017549159'/><link rel='alternate' type='text/html' href='http://bioinfo-br.blogspot.com/2011/02/chip-de-1gbp-para-ion-torrent-anunciado.html' title='Chip de 1Gbp para Ion Torrent anunciado para a segunda metade de 2011'/><author><name>Leonardo Varuzza</name><uri>https://profiles.google.com/106428020899659897437</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='//lh4.googleusercontent.com/-VfVeE_sAxds/AAAAAAAAAAI/AAAAAAAAAAA/ohSMK6AOy8g/s512-c/photo.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-14515087.post-2150246441487373754</id><published>2011-02-17T14:31:00.001-02:00</published><updated>2011-02-20T10:31:55.925-03:00</updated><title type='text'>Ion Community</title><content type='html'>There is a site for Ion Torrent users called Ion Community:&lt;br /&gt;&lt;br /&gt;&lt;a href="http://ioncommunity.iontorrent.com/"&gt;http://ioncommunity.iontorrent.com/&lt;/a&gt;&lt;div 
class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/14515087-2150246441487373754?l=bioinfo-br.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://bioinfo-br.blogspot.com/feeds/2150246441487373754/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=14515087&amp;postID=2150246441487373754' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/14515087/posts/default/2150246441487373754'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/14515087/posts/default/2150246441487373754'/><link rel='alternate' type='text/html' href='http://bioinfo-br.blogspot.com/2011/02/ion-community.html' title='Ion Community'/><author><name>Leonardo Varuzza</name><uri>https://profiles.google.com/106428020899659897437</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='//lh4.googleusercontent.com/-VfVeE_sAxds/AAAAAAAAAAI/AAAAAAAAAAA/ohSMK6AOy8g/s512-c/photo.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-14515087.post-2034281486683031180</id><published>2011-02-08T14:01:00.000-02:00</published><updated>2011-02-08T14:01:08.407-02:00</updated><title type='text'>Big Data, Big Problems</title><content type='html'>A very interesting article on computing infrastructure for analyzing large volumes of data:&lt;br /&gt;&lt;br /&gt;&lt;a href="http://cacm.acm.org/blogs/blog-cacm/103932-big-data-big-problems/fulltext"&gt;http://cacm.acm.org/blogs/blog-cacm/103932-big-data-big-problems/fulltext&lt;/a&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/14515087-2034281486683031180?l=bioinfo-br.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' 
type='application/atom+xml' href='http://bioinfo-br.blogspot.com/feeds/2034281486683031180/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=14515087&amp;postID=2034281486683031180' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/14515087/posts/default/2034281486683031180'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/14515087/posts/default/2034281486683031180'/><link rel='alternate' type='text/html' href='http://bioinfo-br.blogspot.com/2011/02/big-data-big-problems.html' title='Big Data, Big Problems'/><author><name>Leonardo Varuzza</name><uri>https://profiles.google.com/106428020899659897437</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='//lh4.googleusercontent.com/-VfVeE_sAxds/AAAAAAAAAAI/AAAAAAAAAAA/ohSMK6AOy8g/s512-c/photo.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-14515087.post-6196259458282122426</id><published>2011-01-16T11:16:00.000-02:00</published><updated>2011-01-16T11:16:00.189-02:00</updated><title type='text'>A genome for US$ 10,000</title><content type='html'>DNAVision, a company in Brussels, commercially offers human genome sequencing for US$ 10,000. 
The company recently acquired two SOLiD 5500xl and two SOLiD 4 instruments for this task.&lt;br /&gt;&lt;br /&gt;Source:&amp;nbsp; &lt;a href="http://www.genomeweb.com/sequencing/dnavision-offer-10k-human-genome-sequencing-services-purchases-four-solids"&gt;http://www.genomeweb.com/sequencing/dnavision-offer-10k-human-genome-sequencing-services-purchases-four-solids&lt;/a&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/14515087-6196259458282122426?l=bioinfo-br.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://bioinfo-br.blogspot.com/feeds/6196259458282122426/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=14515087&amp;postID=6196259458282122426' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/14515087/posts/default/6196259458282122426'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/14515087/posts/default/6196259458282122426'/><link rel='alternate' type='text/html' href='http://bioinfo-br.blogspot.com/2011/01/genoma-por-us-10mil.html' title='Genoma por US$ 10mil'/><author><name>Leonardo Varuzza</name><uri>https://profiles.google.com/106428020899659897437</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='//lh4.googleusercontent.com/-VfVeE_sAxds/AAAAAAAAAAI/AAAAAAAAAAA/ohSMK6AOy8g/s512-c/photo.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-14515087.post-5397591379584078023</id><published>2010-12-16T18:09:00.000-02:00</published><updated>2010-12-16T18:09:39.040-02:00</updated><title type='text'>SOLiD Paper: Identification of methylated regions with peak search based on Poisson model from massively parallel methylated DNA immunoprecipitation-sequencing data.</title><content 
type='html'>A paper analyzing the human methylome using an enrichment strategy for methylated regions:&lt;br /&gt;&lt;br /&gt;&lt;a href="http://www.ncbi.nlm.nih.gov/pubmed/20925052"&gt;"Identification of methylated regions with peak search based on Poisson model from massively parallel methylated DNA immunoprecipitation-sequencing data"&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;The authors found that a single SOLiD quad is enough to sequence the entire human methylome.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/14515087-5397591379584078023?l=bioinfo-br.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://bioinfo-br.blogspot.com/feeds/5397591379584078023/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=14515087&amp;postID=5397591379584078023' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/14515087/posts/default/5397591379584078023'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/14515087/posts/default/5397591379584078023'/><link rel='alternate' type='text/html' href='http://bioinfo-br.blogspot.com/2010/12/solid-paper-identification-of.html' title='SOLiD Paper: Identification of methylated regions with peak search based on Poisson model from massively parallel methylated DNA immunoprecipitation-sequencing data.'/><author><name>Leonardo Varuzza</name><uri>https://profiles.google.com/106428020899659897437</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' 
src='//lh4.googleusercontent.com/-VfVeE_sAxds/AAAAAAAAAAI/AAAAAAAAAAA/ohSMK6AOy8g/s512-c/photo.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-14515087.post-830002778471380640</id><published>2010-12-16T09:36:00.000-02:00</published><updated>2010-12-16T09:36:35.941-02:00</updated><title type='text'>SOLiD Paper: De novo mutations of SETBP1 cause Schinzel-Giedion syndrome</title><content type='html'>In the paper &lt;a href="http://www.nature.com/ng/journal/v42/n6/full/ng.581.html"&gt;"De novo mutations of SETBP1&amp;nbsp;cause Schinzel-Giedion syndrome"&lt;/a&gt;, the authors sequence the exomes of 4 patients and find 3 mutations in the SETBP1 gene related to the disease.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/14515087-830002778471380640?l=bioinfo-br.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://bioinfo-br.blogspot.com/feeds/830002778471380640/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=14515087&amp;postID=830002778471380640' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/14515087/posts/default/830002778471380640'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/14515087/posts/default/830002778471380640'/><link rel='alternate' type='text/html' href='http://bioinfo-br.blogspot.com/2010/12/solid-paper-de-novo-mutations-of-setbp1.html' title='SOLiD Paper: De novo mutations of SETBP1 cause Schinzel-Giedion syndrome'/><author><name>Leonardo Varuzza</name><uri>https://profiles.google.com/106428020899659897437</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' 
height='32' src='//lh4.googleusercontent.com/-VfVeE_sAxds/AAAAAAAAAAI/AAAAAAAAAAA/ohSMK6AOy8g/s512-c/photo.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-14515087.post-7758067033576726838</id><published>2010-12-06T16:11:00.000-02:00</published><updated>2010-12-06T16:11:11.135-02:00</updated><title type='text'>SOLiD Paper: A small-cell lung cancer genome with complex signatures of tobacco exposure</title><content type='html'>I am going to post here on the blog some key papers that use SOLiD in different applications, starting with the paper&lt;br /&gt;&lt;br /&gt;A small-cell lung cancer genome with complex signatures of tobacco exposure:&lt;br /&gt;&lt;a href="http://www.nature.com/nature/journal/v463/n7278/abs/nature08629.html"&gt;http://www.nature.com/nature/journal/v463/n7278/abs/nature08629.html&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;This work resequences immortalized normal and tumor lung cells. By comparing the mutations in the two samples, the authors detect the somatic mutations acquired by the individual.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/14515087-7758067033576726838?l=bioinfo-br.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://bioinfo-br.blogspot.com/feeds/7758067033576726838/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=14515087&amp;postID=7758067033576726838' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/14515087/posts/default/7758067033576726838'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/14515087/posts/default/7758067033576726838'/><link rel='alternate' type='text/html' href='http://bioinfo-br.blogspot.com/2010/12/solid-paper-small-cell-lung-cancer.html' 
title='SOLiD Paper: A small-cell lung cancer genome with complex signatures of tobacco exposure'/><author><name>Leonardo Varuzza</name><uri>https://profiles.google.com/106428020899659897437</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='//lh4.googleusercontent.com/-VfVeE_sAxds/AAAAAAAAAAI/AAAAAAAAAAA/ohSMK6AOy8g/s512-c/photo.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-14515087.post-7136525355422357489</id><published>2010-11-30T16:39:00.000-02:00</published><updated>2010-11-30T16:39:05.429-02:00</updated><title type='text'>A server with 1 TB of RAM for US$ 100,000</title><content type='html'>Dell's &lt;a href="http://www.dell.com/us/en/enterprise/servers/poweredge-r910/pd.aspx?refid=poweredge-r910"&gt;PowerEdge R910&lt;/a&gt; is a 4U server that takes up to 1 TB of RAM, an amount that seems well suited to genome assembly. The base price of the server with 128 GB of RAM is US$ 27,000, and upgrading the RAM to 1 TB adds US$ 62,000. 
With more disk and 8-core processors as well, the total comes to around US$ 100,000, which seems a feasible price for a server that in theory can assemble a mammalian genome.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/14515087-7136525355422357489?l=bioinfo-br.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://bioinfo-br.blogspot.com/feeds/7136525355422357489/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=14515087&amp;postID=7136525355422357489' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/14515087/posts/default/7136525355422357489'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/14515087/posts/default/7136525355422357489'/><link rel='alternate' type='text/html' href='http://bioinfo-br.blogspot.com/2010/11/servidor-com-1-tb-de-ram-por-us-100-mil.html' title='Servidor com 1 TB de RAM por US$ 100 mil'/><author><name>Leonardo Varuzza</name><uri>https://profiles.google.com/106428020899659897437</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='//lh4.googleusercontent.com/-VfVeE_sAxds/AAAAAAAAAAI/AAAAAAAAAAA/ohSMK6AOy8g/s512-c/photo.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-14515087.post-4434306891260186572</id><published>2010-11-24T16:44:00.000-02:00</published><updated>2010-11-24T16:44:47.279-02:00</updated><title type='text'>SAM format specification 1.3</title><content type='html'>Version 1.3 of the SAM/BAM format specification is out:&lt;br /&gt;&lt;br /&gt;&lt;a href="http://samtools.sourceforge.net/SAM-1.3.pdf"&gt;http://samtools.sourceforge.net/SAM-1.3.pdf&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;As I see it, they improved the formatting of the 
document (apparently switching from Word to LaTeX) and changed the nomenclature to handle reads with more than 2 tags (fragment libraries have 1 tag, mate-pair or paired-end libraries have 2, and libraries made with strobe sequencing on the PacBio have several tags).&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/14515087-4434306891260186572?l=bioinfo-br.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://bioinfo-br.blogspot.com/feeds/4434306891260186572/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=14515087&amp;postID=4434306891260186572' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/14515087/posts/default/4434306891260186572'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/14515087/posts/default/4434306891260186572'/><link rel='alternate' type='text/html' href='http://bioinfo-br.blogspot.com/2010/11/sam-format-specification-13.html' title='SAM format specification 1.3'/><author><name>Leonardo Varuzza</name><uri>https://profiles.google.com/106428020899659897437</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='//lh4.googleusercontent.com/-VfVeE_sAxds/AAAAAAAAAAI/AAAAAAAAAAA/ohSMK6AOy8g/s512-c/photo.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-14515087.post-966991754197808483</id><published>2010-11-19T23:06:00.000-02:00</published><updated>2010-11-19T23:06:25.057-02:00</updated><title type='text'>A de novo paradigm for mental retardation</title><content type='html'>In the paper&amp;nbsp;&lt;a href="http://www.nature.com/doifinder/10.1038/ng.712"&gt;A de novo paradigm for mental retardation&lt;/a&gt;, the authors use SOLiD data to sequence the exomes of 10 
trios (mother, father and child) in which the child has mental retardation. The paper looks for non-synonymous mutations not inherited from the parents that could be linked to the mental retardation.&lt;br /&gt;&lt;br /&gt;Reference: Vissers LELM, Ligt J de, Gilissen C, et al. &lt;a href="http://www.nature.com/doifinder/10.1038/ng.712"&gt;A de novo paradigm for mental retardation&lt;/a&gt;. &lt;i&gt;Nature Genetics&lt;/i&gt;. 2010;(November).&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/14515087-966991754197808483?l=bioinfo-br.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://bioinfo-br.blogspot.com/feeds/966991754197808483/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=14515087&amp;postID=966991754197808483' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/14515087/posts/default/966991754197808483'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/14515087/posts/default/966991754197808483'/><link rel='alternate' type='text/html' href='http://bioinfo-br.blogspot.com/2010/11/de-novo-paradigm-for-mental-retardation.html' title='A de novo paradigm for mental retardation'/><author><name>Leonardo Varuzza</name><uri>https://profiles.google.com/106428020899659897437</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='//lh4.googleusercontent.com/-VfVeE_sAxds/AAAAAAAAAAI/AAAAAAAAAAA/ohSMK6AOy8g/s512-c/photo.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-14515087.post-1760347661893708231</id><published>2010-11-19T18:36:00.000-02:00</published><updated>2010-11-19T18:36:24.485-02:00</updated><title type='text'>Autor do Montador de Genomas Cortex vai 
estar em Curitiba em Dezembro</title><content type='html'>Já faz um tempo eu ouço falar do montador Cortex, que seria um montador novo, sucessor do Velvet. &amp;nbsp;As únicas que eu consegui foram comentários de pessoas que viram a palestra do autor do programa Mario Cacamo.&lt;br /&gt;&lt;br /&gt;Felizmente dia 10 de Dezembro vai ter uma palestestra do Mario Cacamo em Curitiba:&lt;br /&gt;&lt;br /&gt;&lt;a href="http://www.bioinfo.ufpr.br/iwb/programme.php"&gt;http://www.bioinfo.ufpr.br/iwb/programme.php&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;Eu pretendo ir para descobrir mais detalhes diretamente.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/14515087-1760347661893708231?l=bioinfo-br.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://bioinfo-br.blogspot.com/feeds/1760347661893708231/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=14515087&amp;postID=1760347661893708231' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/14515087/posts/default/1760347661893708231'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/14515087/posts/default/1760347661893708231'/><link rel='alternate' type='text/html' href='http://bioinfo-br.blogspot.com/2010/11/autor-do-montador-de-genomas-cortex-vai.html' title='Autor do Montador de Genomas Cortex vai estar em Curitiba em Dezembro'/><author><name>Leonardo Varuzza</name><uri>https://profiles.google.com/106428020899659897437</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' 
src='//lh4.googleusercontent.com/-VfVeE_sAxds/AAAAAAAAAAI/AAAAAAAAAAA/ohSMK6AOy8g/s512-c/photo.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-14515087.post-7766401478051103862</id><published>2010-10-27T11:04:00.000-02:00</published><updated>2010-10-27T11:04:16.653-02:00</updated><title type='text'>Processing sequences with Hadoop</title><content type='html'>&lt;div style="font-family: inherit; text-align: justify;"&gt;The increasing volume of data generated by next-gen sequencing puts huge pressure on the computing infrastructure. We need bigger and more expensive storage solutions to hold the results of the experiments and the intermediate data of the analysis, and because all the data sit in a single system, we need more network bandwidth to cope with the all-to-one pattern of network access (we can have a 10 Gb connection to the storage, but what do we do if that is not enough?).&lt;/div&gt;&lt;div style="font-family: inherit; text-align: justify;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div style="font-family: inherit; text-align: justify;"&gt;Large internet sites such as Google and Facebook have a similar problem with the data from their users (or from the entire internet, in the case of Google). To solve the storage and processing problems, they created new solutions that scale storage and processing to really big clusters (hundreds or thousands of nodes). Google created the MapReduce framework and the Google File System (&lt;a href="http://labs.google.com/papers/mapreduce.html"&gt;http://labs.google.com/papers/mapreduce.html&lt;/a&gt; and &lt;a href="http://labs.google.com/papers/gfs.html"&gt;http://labs.google.com/papers/gfs.html&lt;/a&gt;), which are solutions for batch processing huge amounts of data. 
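&lt;/div&gt;&lt;div style="font-family: inherit; text-align: justify;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div style="font-family: inherit; text-align: justify;"&gt;To give a rough feel for the MapReduce model, here is a minimal single-process Python sketch. It is not Hadoop or Google code: the function names (map_phase, shuffle, reduce_phase) and the sample reads are made up for illustration, and a real framework would run the map and reduce steps on many nodes in parallel. The example counts bases across a few reads:&lt;/div&gt;&lt;pre&gt;
```python
from collections import defaultdict
from itertools import chain


def map_phase(read):
    """Map step: emit one (base, 1) pair for every base in a read."""
    return [(base, 1) for base in read]


def shuffle(pairs):
    """Shuffle step: group all emitted values by their key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce step: combine all the values collected for one key."""
    return key, sum(values)


# Hypothetical input; on a real cluster each read could sit on a different node.
reads = ["ACGT", "AACC", "GGTA"]

mapped = chain.from_iterable(map_phase(read) for read in reads)
counts = dict(reduce_phase(key, values) for key, values in shuffle(mapped).items())
print(counts)
```
&lt;/pre&gt;&lt;div style="font-family: inherit; text-align: justify;"&gt;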
Hadoop is an open source Java framework that implements both the distributed filesystem and the MapReduce framework, and it has been used by big internet players like Yahoo and Microsoft.&lt;/div&gt;&lt;div style="font-family: inherit; text-align: justify;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div style="font-family: inherit; text-align: justify;"&gt;In the bioinformatics space, some projects already use Hadoop to process next-gen data:&lt;/div&gt;&lt;div style="font-family: inherit; text-align: justify;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div style="font-family: inherit; text-align: justify;"&gt;Crossbow: &lt;a href="http://bowtie-bio.sourceforge.net/crossbow/index.shtml"&gt;http://bowtie-bio.sourceforge.net/crossbow/index.shtml&lt;/a&gt;&lt;/div&gt;&lt;div style="font-family: inherit; text-align: justify;"&gt;Myrna: &lt;a href="http://bowtie-bio.sourceforge.net/myrna/index.shtml"&gt;http://bowtie-bio.sourceforge.net/myrna/index.shtml&lt;/a&gt;&lt;/div&gt;&lt;div style="font-family: inherit; text-align: left;"&gt;CloudBurst: &lt;a href="http://sourceforge.net/apps/mediawiki/cloudburst-bio/index.php?title=CloudBurst"&gt;http://sourceforge.net/apps/mediawiki/cloudburst-bio/index.php?title=CloudBurst&lt;/a&gt;&lt;/div&gt;&lt;div style="font-family: inherit; text-align: justify;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div style="font-family: inherit; text-align: justify;"&gt;Anyone can learn more about Hadoop from the training videos by Cloudera (a Hadoop consulting company):&lt;/div&gt;&lt;div style="font-family: inherit; text-align: justify;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div style="font-family: inherit; text-align: justify;"&gt;&lt;a href="http://www.cloudera.com/resources/?type=Training"&gt;http://www.cloudera.com/resources/?type=Training&lt;/a&gt;&lt;/div&gt;&lt;div style="font-family: inherit; text-align: justify;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div style="font-family: inherit; text-align: justify;"&gt;I have been playing with Hadoop in my free time and I produced some code to deal with 
biological data; more specifically, I created an InputFormat for Fasta files and a FastatoFastq converter using Hadoop. The code is on GitHub:&lt;/div&gt;&lt;div style="font-family: inherit; text-align: justify;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div style="font-family: inherit; text-align: justify;"&gt;&lt;a href="http://github.com/lvaruzza/bioseq-hadoop"&gt;http://github.com/lvaruzza/bioseq-hadoop&lt;/a&gt;&lt;/div&gt;&lt;div style="font-family: inherit; text-align: justify;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div style="font-family: inherit; text-align: justify;"&gt;This is more of a self-study project, but anyone interested in Hadoop and distributed programming is invited to use this code.&lt;/div&gt;&lt;div style="font-family: inherit; text-align: justify;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div style="text-align: justify;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/14515087-7766401478051103862?l=bioinfo-br.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://bioinfo-br.blogspot.com/feeds/7766401478051103862/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=14515087&amp;postID=7766401478051103862' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/14515087/posts/default/7766401478051103862'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/14515087/posts/default/7766401478051103862'/><link rel='alternate' type='text/html' href='http://bioinfo-br.blogspot.com/2010/10/processing-sequences-with-hadoop.html' title='Processing sequences with Hadoop'/><author><name>Leonardo Varuzza</name><uri>https://profiles.google.com/106428020899659897437</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' 
src='//lh4.googleusercontent.com/-VfVeE_sAxds/AAAAAAAAAAI/AAAAAAAAAAA/ohSMK6AOy8g/s512-c/photo.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-14515087.post-5114222375113337579</id><published>2010-10-22T11:27:00.002-02:00</published><updated>2010-10-22T11:28:16.040-02:00</updated><title type='text'>Baylor College to buy two servers with 1 TB of RAM each</title><content type='html'>According to a GenomeWeb story, Baylor College will spend US$ 262,000 on two servers for genome assembly, each with 1 TB of RAM:&lt;br /&gt;&lt;br /&gt;&lt;a href="http://www.genomeweb.com/informatics/baylor-genome-sequencing-center-wins-262k-stimulus-funds-hpc-cluster"&gt;http://www.genomeweb.com/informatics/baylor-genome-sequencing-center-wins-262k-stimulus-funds-hpc-cluster&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;PS: With 20 sequencers, Baylor College is currently the largest site with SOLiD instruments installed.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/14515087-5114222375113337579?l=bioinfo-br.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://bioinfo-br.blogspot.com/feeds/5114222375113337579/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=14515087&amp;postID=5114222375113337579' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/14515087/posts/default/5114222375113337579'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/14515087/posts/default/5114222375113337579'/><link rel='alternate' type='text/html' href='http://bioinfo-br.blogspot.com/2010/10/baylor-college-vai-comprar-2dois.html' title='Baylor College vai comprar 2dois servidores com 1TB de RAM cada'/><author><name>Leonardo 
Varuzza</name><uri>https://profiles.google.com/106428020899659897437</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='//lh4.googleusercontent.com/-VfVeE_sAxds/AAAAAAAAAAI/AAAAAAAAAAA/ohSMK6AOy8g/s512-c/photo.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-14515087.post-4581854317749716069</id><published>2010-09-10T19:21:00.000-03:00</published><updated>2010-09-10T19:21:10.885-03:00</updated><title type='text'>The change in data generation caused by NextGen</title><content type='html'>I found a very interesting comment in an article by Lincoln Stein:&lt;br /&gt;&lt;br /&gt;"... the 1000 Genomes Project [25], which is cataloguing human genetic variation, deposited twice as much raw sequencing data into GenBank’s SRA division during the project’s first 6 months of operation as had been deposited into all of GenBank for the entire 30 years preceding (Paul Flicek, personal communication)."&lt;br /&gt;&lt;br /&gt;Source: &lt;a href="http://genomebiology.com/2010/11/5/207"&gt;http://genomebiology.com/2010/11/5/207&lt;/a&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/14515087-4581854317749716069?l=bioinfo-br.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://bioinfo-br.blogspot.com/feeds/4581854317749716069/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=14515087&amp;postID=4581854317749716069' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/14515087/posts/default/4581854317749716069'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/14515087/posts/default/4581854317749716069'/><link rel='alternate' type='text/html' 
href='http://bioinfo-br.blogspot.com/2010/09/mudanca-na-geracao-de-dados-causada.html' title='A mudança na geração de dados causada pelo NextGen'/><author><name>Leonardo Varuzza</name><uri>https://profiles.google.com/106428020899659897437</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='//lh4.googleusercontent.com/-VfVeE_sAxds/AAAAAAAAAAI/AAAAAAAAAAA/ohSMK6AOy8g/s512-c/photo.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-14515087.post-6183177922083314071</id><published>2010-07-19T16:33:00.000-03:00</published><updated>2010-07-19T16:33:40.151-03:00</updated><title type='text'>SOLiD Probes</title><content type='html'>&lt;span style="font-size: x-small;"&gt;The probe that reads the sequence on the SOLiD has the following structure&lt;/span&gt;&lt;span style="font-size: x-small;"&gt;:&lt;br /&gt;&lt;br /&gt;XX NNN ZZZ&lt;br /&gt;&lt;br /&gt;Here XX is the dibase that the fluorophore refers to, NNN are the so-called degenerate bases, and ZZZ are universal bases (&lt;a href="http://en.wikipedia.org/wiki/Inosine"&gt;inosines&lt;/a&gt;). &lt;br /&gt;&lt;br /&gt;I used to think the degenerate bases were some kind of modified base, but they are actually normal bases synthesized in all 1024 (4^5) possible combinations. The probes therefore anneal to any sequence, not because of a modification of the base but because of the way the probes are mixed; in other words, it is the mixture that is degenerate. 
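&lt;br /&gt;&lt;br /&gt;To make that count concrete, here is a small Python sketch (an illustration only, not part of any SOLiD software; the variable names are arbitrary) that enumerates the XX NNN part of the probe set, leaving out the universal ZZZ bases:&lt;/span&gt;&lt;pre&gt;
```python
from collections import Counter
from itertools import product

BASES = "ACGT"

# Every XX NNN combination: 2 interrogated bases followed by 3 degenerate bases.
probes = ["".join(p) for p in product(BASES, repeat=5)]
print(len(probes))  # 1024, i.e. 4**5

# Each of the 16 possible dibases XX occurs with all 64 (4**3) NNN variants.
per_dibase = Counter(probe[:2] for probe in probes)
print(len(per_dibase), set(per_dibase.values()))  # 16 {64}
```
&lt;/pre&gt;&lt;span style="font-size: x-small;"&gt;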
With this design, although we read only 2 bases per cycle, the probe's specificity covers five bases.&lt;/span&gt;&lt;br /&gt;&lt;span style="font-size: x-small;"&gt; &lt;/span&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/14515087-6183177922083314071?l=bioinfo-br.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://bioinfo-br.blogspot.com/feeds/6183177922083314071/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=14515087&amp;postID=6183177922083314071' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/14515087/posts/default/6183177922083314071'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/14515087/posts/default/6183177922083314071'/><link rel='alternate' type='text/html' href='http://bioinfo-br.blogspot.com/2010/07/solid-probes.html' title='SOLiD Probes'/><author><name>Leonardo Varuzza</name><uri>https://profiles.google.com/106428020899659897437</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='//lh4.googleusercontent.com/-VfVeE_sAxds/AAAAAAAAAAI/AAAAAAAAAAA/ohSMK6AOy8g/s512-c/photo.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-14515087.post-8673151856229923297</id><published>2010-07-19T14:22:00.000-03:00</published><updated>2010-07-19T14:22:13.722-03:00</updated><title type='text'>Mate Pair Paper</title><content type='html'>I was recently asked for the reference for the mate pair technique.
I found this article, which I believe is the original reference for the technique: &lt;br /&gt;&lt;br /&gt;&lt;br /&gt;Pairwise end sequencing: a unified approach to genomic mapping and sequencing.&lt;br /&gt;Roach et al, 1995: &lt;br /&gt;http://www.ncbi.nlm.nih.gov/pubmed/7601461&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/14515087-8673151856229923297?l=bioinfo-br.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://bioinfo-br.blogspot.com/feeds/8673151856229923297/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=14515087&amp;postID=8673151856229923297' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/14515087/posts/default/8673151856229923297'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/14515087/posts/default/8673151856229923297'/><link rel='alternate' type='text/html' href='http://bioinfo-br.blogspot.com/2010/07/mate-pair-paper.html' title='Mate Pair Paper'/><author><name>Leonardo Varuzza</name><uri>https://profiles.google.com/106428020899659897437</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='//lh4.googleusercontent.com/-VfVeE_sAxds/AAAAAAAAAAI/AAAAAAAAAAA/ohSMK6AOy8g/s512-c/photo.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-14515087.post-8443571262434933091</id><published>2010-07-06T11:44:00.000-03:00</published><updated>2010-07-06T11:44:19.748-03:00</updated><title type='text'>Introduction to Python for Scientists</title><content type='html'>A collection of Python tutorials focused on applications for scientists:&lt;br /&gt;&lt;br /&gt;http://software-carpentry.org/&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' 
src='https://blogger.googleusercontent.com/tracker/14515087-8443571262434933091?l=bioinfo-br.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://bioinfo-br.blogspot.com/feeds/8443571262434933091/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=14515087&amp;postID=8443571262434933091' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/14515087/posts/default/8443571262434933091'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/14515087/posts/default/8443571262434933091'/><link rel='alternate' type='text/html' href='http://bioinfo-br.blogspot.com/2010/07/introducao-ao-python-para-cientistas.html' title='Introdução ao Python para Cientistas'/><author><name>Leonardo Varuzza</name><uri>https://profiles.google.com/106428020899659897437</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='//lh4.googleusercontent.com/-VfVeE_sAxds/AAAAAAAAAAI/AAAAAAAAAAA/ohSMK6AOy8g/s512-c/photo.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-14515087.post-1755278480284214054</id><published>2010-07-06T11:36:00.002-03:00</published><updated>2010-07-06T11:36:43.010-03:00</updated><title type='text'>New version of the CLC Machine</title><content type='html'>&lt;span style="font-size: x-small;"&gt;CLC has released a new configuration of the CLC Machine:&lt;br /&gt;&lt;br /&gt;&lt;a href="http://www.clcmachine.com/system.php" target="_blank"&gt;http://www.clcmachine.com/system.php&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;This configuration seems better suited to SOLiD data analysis than the previous one: 12 cores, 48 GB of RAM and 8 TB of disk.&lt;/span&gt;&lt;div 
class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/14515087-1755278480284214054?l=bioinfo-br.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://bioinfo-br.blogspot.com/feeds/1755278480284214054/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=14515087&amp;postID=1755278480284214054' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/14515087/posts/default/1755278480284214054'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/14515087/posts/default/1755278480284214054'/><link rel='alternate' type='text/html' href='http://bioinfo-br.blogspot.com/2010/07/nova-versao-da-clc-machine.html' title='Nova versão da CLC Machine'/><author><name>Leonardo Varuzza</name><uri>https://profiles.google.com/106428020899659897437</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='//lh4.googleusercontent.com/-VfVeE_sAxds/AAAAAAAAAAI/AAAAAAAAAAA/ohSMK6AOy8g/s512-c/photo.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-14515087.post-7137741244826032761</id><published>2010-07-05T16:14:00.000-03:00</published><updated>2010-07-05T16:14:47.446-03:00</updated><title type='text'>Quality values on the SOLiD</title><content type='html'>A very interesting application note has come out on detecting polyclonal beads in a SOLiD run: &lt;br /&gt;&lt;br /&gt;&lt;a href="http://bioinformatics.oxfordjournals.org/cgi/content/abstract/26/6/849"&gt;http://bioinformatics.oxfordjournals.org/cgi/content/abstract/26/6/849&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;In normal reads the quality drops toward the 3' end, but in polyclonal reads the quality is poor along the whole read.
So filtering out reads with low quality at the start, for example in the first 10 bases, makes it possible to remove the polyclonal reads.&lt;br /&gt;&lt;br /&gt;Another interesting result from the paper is that most mapping errors are concentrated in bases with QV &amp;lt; 10.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/14515087-7137741244826032761?l=bioinfo-br.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://bioinfo-br.blogspot.com/feeds/7137741244826032761/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=14515087&amp;postID=7137741244826032761' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/14515087/posts/default/7137741244826032761'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/14515087/posts/default/7137741244826032761'/><link rel='alternate' type='text/html' href='http://bioinfo-br.blogspot.com/2010/07/valores-de-qualidade-no-solid.html' title='Valores de qualidade no SOLiD'/><author><name>Leonardo Varuzza</name><uri>https://profiles.google.com/106428020899659897437</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='//lh4.googleusercontent.com/-VfVeE_sAxds/AAAAAAAAAAI/AAAAAAAAAAA/ohSMK6AOy8g/s512-c/photo.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-14515087.post-7978830400579069211</id><published>2010-06-29T22:41:00.003-03:00</published><updated>2010-06-30T19:28:43.280-03:00</updated><title type='text'>OZZY on the SOLiD</title><content type='html'>Believe it or not, Ozzy Osbourne's genome is going to be sequenced on the SOLiD:&lt;br /&gt;&lt;br /&gt;&lt;a 
href="http://www.bizjournals.com/losangeles/stories/2010/06/28/daily5.html"&gt;http://www.bizjournals.com/losangeles/stories/2010/06/28/daily5.html&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;PS: The story only says that it will be done on a Life Tech sequencer, but I think it is safe to assume they mean the SOLiD.&lt;br /&gt;&lt;br /&gt;Update: the article below confirms that the sequencing will be done on the SOLiD 4:&lt;br /&gt;&lt;br /&gt;&lt;a href="http://www.bio-itworld.com/els/2010/06/30/ozzys-genome.html"&gt;http://www.bio-itworld.com/els/2010/06/30/ozzys-genome.html&lt;/a&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/14515087-7978830400579069211?l=bioinfo-br.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://bioinfo-br.blogspot.com/feeds/7978830400579069211/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=14515087&amp;postID=7978830400579069211' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/14515087/posts/default/7978830400579069211'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/14515087/posts/default/7978830400579069211'/><link rel='alternate' type='text/html' href='http://bioinfo-br.blogspot.com/2010/06/ozzy-no-solid.html' title='OZZY no SOLiD'/><author><name>Leonardo Varuzza</name><uri>https://profiles.google.com/106428020899659897437</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' 
src='//lh4.googleusercontent.com/-VfVeE_sAxds/AAAAAAAAAAI/AAAAAAAAAAA/ohSMK6AOy8g/s512-c/photo.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-14515087.post-1323448753886269383</id><published>2010-06-29T08:36:00.000-03:00</published><updated>2010-06-29T08:36:37.256-03:00</updated><title type='text'>Tutorial on ABySS</title><content type='html'>I found a tutorial on running the distributed assembler ABySS:&lt;br /&gt;&lt;br /&gt;http://ged.msu.edu/angus/tutorials/short-read-assembly.html&lt;br /&gt;&lt;br /&gt;It is much easier to have 512 GB or 1 TB of memory spread across a cluster than in a single machine, which is why I believe distributed Eulerian assemblers are the way forward for de novo assembly of eukaryotes with short reads.&lt;br /&gt;&lt;br /&gt;PS: ABySS works with SOLiD data in color space.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/14515087-1323448753886269383?l=bioinfo-br.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://bioinfo-br.blogspot.com/feeds/1323448753886269383/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=14515087&amp;postID=1323448753886269383' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/14515087/posts/default/1323448753886269383'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/14515087/posts/default/1323448753886269383'/><link rel='alternate' type='text/html' href='http://bioinfo-br.blogspot.com/2010/06/tutorial-sobre-o-abyss.html' title='Tutorial sobre o ABySS'/><author><name>Leonardo Varuzza</name><uri>https://profiles.google.com/106428020899659897437</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' 
src='//lh4.googleusercontent.com/-VfVeE_sAxds/AAAAAAAAAAI/AAAAAAAAAAA/ohSMK6AOy8g/s512-c/photo.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-14515087.post-1512195549468649883</id><published>2010-06-18T17:00:00.000-03:00</published><updated>2010-06-18T17:00:13.991-03:00</updated><title type='text'>99.94% accuracy</title><content type='html'>&lt;span class="Apple-style-span" style="font-family: monospace; font-size: 13px;"&gt;An accuracy of 99.94% means we expect 600 differences between the reads and the reference for every 1 million bases sequenced. One could argue that the difference between 99.90% and 99.94% is insignificant, but in a 100 Gb sequencing run it means the SOLiD will have 40 million fewer mismatches (0.04% of 100 billion bases), which is almost 9 E. coli genomes (about 4.6 Mb each).&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: monospace; font-size: 13px;"&gt;&lt;br /&gt;&lt;/span&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/14515087-1512195549468649883?l=bioinfo-br.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://bioinfo-br.blogspot.com/feeds/1512195549468649883/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=14515087&amp;postID=1512195549468649883' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/14515087/posts/default/1512195549468649883'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/14515087/posts/default/1512195549468649883'/><link rel='alternate' type='text/html' href='http://bioinfo-br.blogspot.com/2010/06/9994-de-acuracia.html' title='99.94% de Acurácia'/><author><name>Leonardo Varuzza</name><uri>https://profiles.google.com/106428020899659897437</uri><email>noreply@blogger.com</email><gd:image 
rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='//lh4.googleusercontent.com/-VfVeE_sAxds/AAAAAAAAAAI/AAAAAAAAAAA/ohSMK6AOy8g/s512-c/photo.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-14515087.post-1606994525456644195</id><published>2010-06-18T09:16:00.000-03:00</published><updated>2010-06-18T09:16:17.980-03:00</updated><title type='text'></title><content type='html'>On the ABySS mailing list I saw a post containing the following excerpt:&lt;br /&gt;&lt;blockquote&gt;... &lt;br /&gt;The haploid genome size of the species whose genome I'm trying to&lt;br /&gt;assemble is 3 billion bp. &amp;nbsp;My colleague doesn't want me to name the&lt;br /&gt;species so I need to honor his request. &amp;nbsp;Let's just call it "rb".&lt;br /&gt;&lt;br /&gt;The Illumina pe-libs are:&lt;br /&gt;&lt;br /&gt;&lt;div class="im"&gt;&lt;br /&gt;lib1: 10x @ 2x75bp&lt;br /&gt;lib2: "20x" @ 2x150bp&lt;br /&gt;&lt;br /&gt;&lt;/div&gt;lib1: 210 million read-pairs&lt;br /&gt;lib2: 200 million read-pairs&lt;br /&gt;&lt;div class="im"&gt;&lt;br /&gt;The fragment sizes of both lib1 and lib2 are "500-600" bp.&lt;br /&gt;&lt;br /&gt;&lt;/div&gt;Our experience with 2x150bp reads (an Illumina-supported trial)  was&lt;br /&gt;disappointing. &amp;nbsp;Your mileage may vary. &amp;nbsp;Thus, the lib2 spec "2x150bp"&lt;br /&gt;is nominal. 
&amp;nbsp;We don't expect much more than half that (2x75bp) to be&lt;br /&gt;usable.&lt;br /&gt;&amp;nbsp;...&lt;/blockquote&gt;This comment is very relevant to the question of read length versus accuracy: increasing the read length without increasing the accuracy makes no sense, because the added noise makes the assembly unfeasible (in the context of Eulerian assembly; the earlier algorithms have other problems).</content><link rel='replies' type='application/atom+xml' href='http://bioinfo-br.blogspot.com/feeds/1606994525456644195/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=14515087&amp;postID=1606994525456644195' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/14515087/posts/default/1606994525456644195'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/14515087/posts/default/1606994525456644195'/><link rel='alternate' type='text/html' href='http://bioinfo-br.blogspot.com/2010/06/lista-de-discussao-do-abyss-eu-vi-um.html' title=''/><author><name>Leonardo Varuzza</name><uri>https://profiles.google.com/106428020899659897437</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='//lh4.googleusercontent.com/-VfVeE_sAxds/AAAAAAAAAAI/AAAAAAAAAAA/ohSMK6AOy8g/s512-c/photo.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-14515087.post-2499038895511594768</id><published>2010-06-02T16:37:00.000-03:00</published><updated>2010-06-02T16:37:58.815-03:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='brazil'/><category scheme='http://www.blogger.com/atom/ns#' term='TOP500'/><title 
type='text'>Brazil in the TOP500</title><content type='html'>Brazil is well placed in the June 2010 TOP500 list, in 86th position:&lt;br /&gt;&lt;br /&gt;&lt;a href="http://www.top500.org/lists/2010/06"&gt;http://www.top500.org/lists/2010/06&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;With a SUN cluster installed at NACAD, UFRJ:&lt;br /&gt;&lt;br /&gt;&lt;a href="http://www.nacad.ufrj.br/"&gt;http://www.nacad.ufrj.br/&lt;/a&gt;</content><link rel='replies' type='application/atom+xml' href='http://bioinfo-br.blogspot.com/feeds/2499038895511594768/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=14515087&amp;postID=2499038895511594768' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/14515087/posts/default/2499038895511594768'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/14515087/posts/default/2499038895511594768'/><link rel='alternate' type='text/html' href='http://bioinfo-br.blogspot.com/2010/06/brasil-no-top500.html' title='Brasil no TOP500'/><author><name>Leonardo Varuzza</name><uri>https://profiles.google.com/106428020899659897437</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='//lh4.googleusercontent.com/-VfVeE_sAxds/AAAAAAAAAAI/AAAAAAAAAAA/ohSMK6AOy8g/s512-c/photo.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-14515087.post-1560949674926447143</id><published>2010-04-07T11:50:00.000-03:00</published><updated>2010-04-07T11:50:22.757-03:00</updated><title type='text'>Early History of GenBank</title><content type='html'>I found some very interesting texts about the early days of GenBank, when 
it was still at the Los Alamos National Laboratory and the NCBI did not even exist yet. The texts are an &lt;a href="http://www.lanl.gov/news/index.php/fuseaction/1663.article/d/200808/id/14273"&gt;interview on the LANL site&lt;/a&gt; and an &lt;a href="http://library.lanl.gov/cgi-bin/getfile?09-03.pdf"&gt;article written by Walter B. Goad&lt;/a&gt;.&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;In 1979 they already saw the importance of bioinformatics and data analysis for molecular biology; they even suggest a "remote services" approach to deal with the heterogeneity of the computers that existed at the time.</content><link rel='replies' type='application/atom+xml' href='http://bioinfo-br.blogspot.com/feeds/1560949674926447143/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=14515087&amp;postID=1560949674926447143' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/14515087/posts/default/1560949674926447143'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/14515087/posts/default/1560949674926447143'/><link rel='alternate' type='text/html' href='http://bioinfo-br.blogspot.com/2010/04/early-history-of-genbank.html' title='Early History of GenBank'/><author><name>Leonardo Varuzza</name><uri>https://profiles.google.com/106428020899659897437</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' 
src='//lh4.googleusercontent.com/-VfVeE_sAxds/AAAAAAAAAAI/AAAAAAAAAAA/ohSMK6AOy8g/s512-c/photo.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-14515087.post-28193080876070480</id><published>2010-03-31T16:16:00.000-03:00</published><updated>2010-03-31T16:16:31.068-03:00</updated><title type='text'>Generating a consensus from a SAM file</title><content type='html'>A very nice thing about the SAM format is the samtools utility, which supports several very useful processing steps, one of them being the generation of a consensus based on the alignment.&lt;br /&gt;&lt;br /&gt;First the SAM file has to be converted into a sorted BAM (I already covered this in the &lt;a href="http://bioinfo-br.blogspot.com/2010/01/utilizando-o-perm.html"&gt;post about PerM&lt;/a&gt;). Then the samtools pileup and samtools.pl pileup2fq commands are used together:&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;samtools pileup -cf &amp;lt;reference&amp;gt; &amp;lt;sorted bam file&amp;gt; | samtools.pl pileup2fq -D100 &amp;gt; consensus.fastq&lt;br /&gt;&lt;br /&gt;The -D100 option limits the coverage to at most 100 reads; adjust this parameter according to the coverage of your data.</content><link rel='replies' type='application/atom+xml' href='http://bioinfo-br.blogspot.com/feeds/28193080876070480/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=14515087&amp;postID=28193080876070480' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/14515087/posts/default/28193080876070480'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/14515087/posts/default/28193080876070480'/><link 
rel='alternate' type='text/html' href='http://bioinfo-br.blogspot.com/2010/03/gerando-consenso-partir-de-um-arquivo.html' title='Gerando consenso a partir de um arquivo SAM'/><author><name>Leonardo Varuzza</name><uri>https://profiles.google.com/106428020899659897437</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='//lh4.googleusercontent.com/-VfVeE_sAxds/AAAAAAAAAAI/AAAAAAAAAAA/ohSMK6AOy8g/s512-c/photo.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-14515087.post-8942256765895481891</id><published>2010-02-09T17:37:00.000-02:00</published><updated>2010-02-09T17:37:51.907-02:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='schduler'/><category scheme='http://www.blogger.com/atom/ns#' term='cluster'/><category scheme='http://www.blogger.com/atom/ns#' term='PBS'/><category scheme='http://www.blogger.com/atom/ns#' term='maui'/><title type='text'>Configuring Maui, an alternative to pbs_sched</title><content type='html'>Torque ships with a very simple scheduler, pbs_sched. &lt;a href="http://www.clusterresources.com/products/maui-cluster-scheduler.php/"&gt;Maui&lt;/a&gt; is a much more powerful scheduler, made by the same developers as Torque (they also have a more complete, paid solution called Moab).&lt;br /&gt;&lt;br /&gt;To install Maui you first need to download the source code &lt;a href="http://www.clusterresources.com/product/maui"&gt;from this address&lt;/a&gt; (you will need to create an account on the site before downloading the program). Then unpack the archive into some directory, for example /usr/src. 
Inside the source code directory type (as root):&lt;br /&gt;&lt;br /&gt;&lt;code class="prettyprint lang-sh"&gt;&lt;br /&gt;./configure&lt;br /&gt;make&lt;br /&gt;make install&lt;br /&gt;&lt;/code&gt;&lt;br /&gt;&lt;br /&gt;Then add the directory /usr/local/maui/bin to the PATH; on Debian/Ubuntu it is enough to edit the file /etc/environment. &lt;br /&gt;&lt;br /&gt;Before starting Maui you need to stop pbs_sched and remove it from the boot sequence (still as root):&lt;br /&gt;&lt;br /&gt;&lt;code class="prettyprint lang-sh"&gt;&lt;br /&gt;/etc/init.d/pbs_sched stop&lt;br /&gt;update-rc.d pbs_sched disable&lt;br /&gt;&lt;/code&gt;&lt;br /&gt;&lt;br /&gt;Edit the Maui configuration file in /usr/local/maui:&lt;br /&gt;&lt;br /&gt;1) Replace:&lt;br /&gt;&lt;br /&gt;RMCFG[TANGERINE] TYPE=PBS@RMNMHOST@&lt;br /&gt;&lt;br /&gt;with&lt;br /&gt;&lt;br /&gt;RMCFG[base] TYPE=PBS&lt;br /&gt;&lt;br /&gt;2) Add the line&lt;br /&gt;ADMIN3  all&lt;br /&gt;&lt;br /&gt;3) Add your user as ADMIN1, for example:&lt;br /&gt;&lt;br /&gt;ADMIN1 root varuzza&lt;br /&gt;&lt;br /&gt;After editing the file you need to set Maui up to start at boot. 
Since Maui does not ship with a ready-made init script, put &lt;a href="http://sites.google.com/site/varuzza/my-files/maui?attredirects=0&amp;d=1"&gt;this file&lt;/a&gt;, which I created from the PBS scripts, in the /etc/init.d directory.&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;Finally type &lt;br /&gt;&lt;br /&gt;&lt;code class="prettyprint lang-sh"&gt;&lt;br /&gt;/etc/init.d/maui start&lt;br /&gt;&lt;/code&gt;&lt;br /&gt;&lt;br /&gt;If everything goes well, typing the showq command should give you output similar to this:&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;ACTIVE JOBS--------------------&lt;br /&gt;JOBNAME            USERNAME      STATE  PROC   REMAINING            STARTTIME&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;     0 Active Jobs       0 of    2 Processors Active (0.00%)&lt;br /&gt;                         0 of    1 Nodes Active      (0.00%)&lt;br /&gt;&lt;br /&gt;IDLE JOBS----------------------&lt;br /&gt;JOBNAME            USERNAME      STATE  PROC     WCLIMIT            QUEUETIME&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;0 Idle Jobs&lt;br /&gt;&lt;br /&gt;BLOCKED JOBS----------------&lt;br /&gt;JOBNAME            USERNAME      STATE  PROC     WCLIMIT            QUEUETIME&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;Total Jobs: 0   Active Jobs: 0   Idle Jobs: 0   Blocked Jobs: 0&lt;br /&gt;&lt;/pre&gt;</content><link rel='replies' type='application/atom+xml' href='http://bioinfo-br.blogspot.com/feeds/8942256765895481891/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=14515087&amp;postID=8942256765895481891' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/14515087/posts/default/8942256765895481891'/><link rel='self' type='application/atom+xml' 
href='http://www.blogger.com/feeds/14515087/posts/default/8942256765895481891'/><link rel='alternate' type='text/html' href='http://bioinfo-br.blogspot.com/2010/02/configurando-o-maui-alternativa-ao.html' title='Configurando o Maui, alternativa ao pbs_sched'/><author><name>Leonardo Varuzza</name><uri>https://profiles.google.com/106428020899659897437</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='//lh4.googleusercontent.com/-VfVeE_sAxds/AAAAAAAAAAI/AAAAAAAAAAA/ohSMK6AOy8g/s512-c/photo.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-14515087.post-413853216106506596</id><published>2010-02-05T08:31:00.006-02:00</published><updated>2010-02-05T11:47:09.153-02:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='performance'/><category scheme='http://www.blogger.com/atom/ns#' term='conversion'/><category scheme='http://www.blogger.com/atom/ns#' term='haskell'/><category scheme='http://www.blogger.com/atom/ns#' term='Fasta'/><category scheme='http://www.blogger.com/atom/ns#' term='python'/><category scheme='http://www.blogger.com/atom/ns#' term='FastQ'/><title type='text'>Converting from FastQ to Fasta</title><content type='html'>Converting from FastQ to csfasta is easy; a single shell pipeline is enough:&lt;br /&gt;&lt;br /&gt;&lt;code class="prettyprint lang-sh"&gt;zcat SRR034220.fastq.gz |&lt;br /&gt;awk 'BEGIN{a=0}{if(a==1){print;a=0}}/^@/{print;a=1}' |&lt;br /&gt;sed 's/^@/&amp;gt;/' | gzip &amp;gt; SRR034220.csfasta.gz&lt;br /&gt;&lt;/code&gt;&lt;br /&gt;&lt;br /&gt;(Source: &lt;a href="http://stackoverflow.com/questions/1542306/converting-fastq-to-fasta-with-sed-awk"&gt;StackOverflow&lt;/a&gt;)&lt;br /&gt;&lt;br /&gt;Converting from FastQ to Qual, on the other hand, is more complicated, because the quality values have to be converted from the 
text format to the numeric format, and for that it is better to use a programming language. &lt;a href="http://www.python.org/"&gt;Python&lt;/a&gt; is an elegant programming language that is simple to program in, and the &lt;a href="http://biopython.org/wiki/Main_Page"&gt;BioPython&lt;/a&gt; library is quite sophisticated and has a ready-made function for this conversion. The short snippet below does the conversion we want:&lt;br /&gt;&lt;br /&gt;&lt;code class="prettyprint lang-py"&gt;&lt;br /&gt;from Bio import SeqIO&lt;br /&gt;print("Converting files...")&lt;br /&gt;count = SeqIO.convert("SRR034220.fastq", "fastq", "SRR034220.py.qual", "qual")&lt;br /&gt;print("Converted %i records" % count)&lt;br /&gt;&lt;/code&gt;&lt;br /&gt;&lt;br /&gt;Extremely simple, but sloooow. The script used 3 hours of CPU time to run. That is not very practical. &lt;br /&gt;&lt;br /&gt;Looking for an alternative I found the &lt;a href="http://blog.malde.org/index.php/the-haskell-bioinformatics-library/"&gt;BioHaskell&lt;/a&gt; library. &lt;a href="http://www.haskell.org/"&gt;Haskell&lt;/a&gt; is an extremely elegant and efficient compiled, strongly typed functional language, with performance close to C. 
The Haskell code is almost as compact as the Python code:&lt;br /&gt;&lt;br /&gt;&lt;code class="prettyprint lang-hs"&gt;&lt;br /&gt;module Main&lt;br /&gt;where&lt;br /&gt;&lt;br /&gt;import Bio.Sequence&lt;br /&gt;import System.Environment (getArgs)&lt;br /&gt;&lt;br /&gt;main = do&lt;br /&gt;&amp;nbsp;&amp;nbsp;[inputFile,outputQual] &amp;lt;- getArgs&amp;nbsp;&lt;/code&gt;&lt;br /&gt;&lt;code class="prettyprint lang-hs"&gt;&amp;nbsp;&amp;nbsp;seqs &amp;lt;- (readFastQ inputFile)&amp;nbsp;&lt;/code&gt;&lt;br /&gt;&lt;code class="prettyprint lang-hs"&gt;&amp;nbsp;&amp;nbsp;writeQual outputQual seqs  &lt;/code&gt;&lt;br /&gt;&lt;br /&gt;Yet the program's run time was 34 minutes, more than 5 times faster than the Python version. &lt;br /&gt;&lt;br /&gt;Performance is relative: many programs, user interfaces for example, spend more time waiting than actually using the CPU. In this case, though, the performance difference matters, because the Haskell version can run during lunch or a meeting, whereas the Python version takes the whole morning to run.</content><link rel='replies' type='application/atom+xml' href='http://bioinfo-br.blogspot.com/feeds/413853216106506596/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=14515087&amp;postID=413853216106506596' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/14515087/posts/default/413853216106506596'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/14515087/posts/default/413853216106506596'/><link rel='alternate' type='text/html' href='http://bioinfo-br.blogspot.com/2010/02/convertendo-de-fastq-para-fasta.html' title='Convertendo de 
FastQ para Fasta'/><author><name>Leonardo Varuzza</name><uri>https://profiles.google.com/106428020899659897437</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='//lh4.googleusercontent.com/-VfVeE_sAxds/AAAAAAAAAAI/AAAAAAAAAAA/ohSMK6AOy8g/s512-c/photo.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-14515087.post-5215999476500392726</id><published>2010-02-03T10:48:00.002-02:00</published><updated>2011-09-01T10:56:00.383-03:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='torque'/><category scheme='http://www.blogger.com/atom/ns#' term='linux'/><category scheme='http://www.blogger.com/atom/ns#' term='sysadmin'/><category scheme='http://www.blogger.com/atom/ns#' term='PBS'/><title type='text'>Installing PBS (Torque)</title><content type='html'>&lt;div&gt;&lt;a href="http://en.wikipedia.org/wiki/Portable_Batch_System"&gt;PBS (Portable Batch System)&lt;/a&gt; is a batch job execution system originally developed by NASA. There are three versions of PBS: PBS Pro, OpenPBS and &lt;a href="http://www.clusterresources.com/products/torque-resource-manager.php"&gt;Torque&lt;/a&gt;. Forget the other versions; the only free version that works is Torque (also, do not use the version packaged by Debian, as the packaging is full of bugs).&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Download the Torque source code here. 
Unpack it in a suitable place (for example, /usr/src) and run the following command:&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;./configure --with-rcp=scp --disable-gcc-warnings&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;On Debian the configure script cannot find Tk, so if you want to build Torque with the graphical tools use the following command instead:&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;./configure --with-rcp=scp --with-tk=/usr/lib/tk8.4/ --disable-gcc-warnings&lt;/span&gt;&lt;br /&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Once configured, just type:&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;make &amp;amp;&amp;amp; make install&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;After installing the system, the init scripts have to be placed in /etc/init.d so that Torque is started at boot. Inside the Torque source directory type:&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;cd contrib/init.d&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Next, copy the scripts xxx.pbs_mom, xxx.pbs_sched and xxx.pbs_server to /etc/init.d, where xxx can be debian, suse, or nothing for the Red Hat version of the script. 
Below is a command that copies and renames the files for the Debian/Ubuntu version:&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;for i in debian.pbs_*; do cp $i /etc/init.d/`echo $i | sed 's/debian.//'`; done&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Finally:&lt;br /&gt;&lt;br /&gt;&lt;div style="font-family: &amp;quot;Courier New&amp;quot;,Courier,monospace;"&gt;update-rc.d pbs_server defaults&lt;/div&gt;&lt;div style="font-family: &amp;quot;Courier New&amp;quot;,Courier,monospace;"&gt;&lt;/div&gt;&lt;div style="font-family: &amp;quot;Courier New&amp;quot;,Courier,monospace;"&gt;update-rc.d pbs_mom defaults&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div style="font-family: &amp;quot;Courier New&amp;quot;,Courier,monospace;"&gt;&lt;/div&gt;&lt;div style="font-family: &amp;quot;Courier New&amp;quot;,Courier,monospace;"&gt;update-rc.d pbs_sched defaults&lt;/div&gt;&lt;div&gt;&lt;br /&gt;That completes the Torque installation; the next step is to configure it. You need to edit two files. The first is /var/spool/torque/server_priv/nodes, which holds the list of nodes in the cluster; since I have only one node, I added the following line:&lt;br /&gt;&lt;br /&gt;&lt;div style="font-family: &amp;quot;Courier New&amp;quot;,Courier,monospace;"&gt;/var/spool/torque/server_priv/nodes:&lt;/div&gt;&lt;div style="font-family: &amp;quot;Courier New&amp;quot;,Courier,monospace;"&gt;tangerine&amp;nbsp; np=2&lt;/div&gt;&lt;br /&gt;The first column holds the node name (it is important that the hosts file on every machine in the cluster contains this name and that it matches the node's hostname). 
&lt;br /&gt;This file only needs to be configured on the head node; since I have a single computer, my head node is also the compute node.&lt;br /&gt;&lt;br /&gt;Next, pbs_mom is configured on each of the compute nodes; again, in my case that is the same computer. Edit the following file:&lt;br /&gt;&lt;br /&gt;&lt;div style="font-family: &amp;quot;Courier New&amp;quot;,Courier,monospace;"&gt;/var/spool/torque/mom_priv/config:&lt;/div&gt;&lt;div style="font-family: &amp;quot;Courier New&amp;quot;,Courier,monospace;"&gt;$pbsserver&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; tangerine&lt;br /&gt;$logevent&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; 255 &lt;/div&gt;&lt;br /&gt;pbsserver is the name of the head node, and I took the logevent value from the Torque documentation. With the configuration done, the daemons can be started:&lt;br /&gt;&lt;br /&gt;&lt;div style="font-family: &amp;quot;Courier New&amp;quot;,Courier,monospace;"&gt;/etc/init.d/pbs_server start&lt;/div&gt;&lt;div style="font-family: &amp;quot;Courier New&amp;quot;,Courier,monospace;"&gt;&lt;/div&gt;&lt;div style="font-family: &amp;quot;Courier New&amp;quot;,Courier,monospace;"&gt;/etc/init.d/pbs_mom start&lt;/div&gt;&lt;div style="font-family: &amp;quot;Courier New&amp;quot;,Courier,monospace;"&gt;&lt;/div&gt;&lt;div style="font-family: &amp;quot;Courier New&amp;quot;,Courier,monospace;"&gt;/etc/init.d/pbs_sched start&lt;/div&gt;&lt;br /&gt;(You only need to start the daemons by hand this first time, since we have already configured them to start at boot)&lt;br /&gt;&lt;br /&gt;Check that the daemons are running with the command:&lt;br /&gt;&lt;br /&gt;&lt;div style="font-family: &amp;quot;Courier New&amp;quot;,Courier,monospace;"&gt;ps awx | grep pbs&lt;/div&gt;&lt;br /&gt;If you do not see the three daemons running, something went wrong; look at the daemon logs in /var/spool/torque/server_logs and /var/spool/torque/mom_logs.&lt;br /&gt;&lt;br /&gt;With the daemons running we can 
do the last step, which is creating the execution queues. Below is the recipe for creating a queue called secondary:&lt;br /&gt;&lt;br /&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;qmgr -c "set server scheduling=true"&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;qmgr -c "create queue secondary queue_type=execution"&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;qmgr -c "set queue secondary started=true"&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;qmgr -c "set queue secondary enabled=true"&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;qmgr -c "set queue secondary resources_default.nodes=1"&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;qmgr -c "set queue secondary resources_default.walltime=1000:00:00"&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;qmgr -c "set server default_queue=secondary"&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family: 'Courier New', Courier, monospace;"&gt;qmgr -c "set server resources_available.nodect = 20" &lt;br /&gt;&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;The Corona pipelines use two queues: the first, called secondary, which we have just created, and a second one called tracking, which is created with the following recipe:&lt;br /&gt;&lt;br /&gt;&lt;div style="font-family: &amp;quot;Courier New&amp;quot;,Courier,monospace;"&gt;qmgr -c "create queue tracking queue_type=execution"&lt;br /&gt;qmgr -c "set queue tracking started=true"&lt;br /&gt;qmgr -c "set queue tracking enabled=true"&lt;br /&gt;qmgr -c "set queue tracking resources_default.nodes=1"&lt;br /&gt;qmgr -c "set queue tracking resources_default.walltime=1000:00:00"&lt;/div&gt;&lt;br /&gt;Done!!!!&lt;br /&gt;&lt;br /&gt;We can now submit jobs to Torque, such as 
this one:&lt;br /&gt;&lt;br /&gt;echo echo echo | qsub&lt;br /&gt;&lt;br /&gt;The job's execution can be followed with the qstat command, although in this case the job usually runs instantly. When the job finishes, two files are created in the current directory, STDIN.o??? and STDIN.e???, where ??? is the job id. If these files are not created, there is a problem with Torque.&lt;br /&gt;&lt;br /&gt;One of the most common problems when running jobs on Torque is the lack of passwordless ssh access: ssh must be able to log in automatically so that the results can be copied back, but that is a subject for a future post.&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;/div&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://bioinfo-br.blogspot.com/feeds/5215999476500392726/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=14515087&amp;postID=5215999476500392726' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/14515087/posts/default/5215999476500392726'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/14515087/posts/default/5215999476500392726'/><link rel='alternate' type='text/html' href='http://bioinfo-br.blogspot.com/2010/02/instalando-o-pbs-torque.html' title='Instalando o PBS (torque)'/><author><name>Leonardo Varuzza</name><uri>https://profiles.google.com/106428020899659897437</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' 
src='//lh4.googleusercontent.com/-VfVeE_sAxds/AAAAAAAAAAI/AAAAAAAAAAA/ohSMK6AOy8g/s512-c/photo.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-14515087.post-5340757627663616759</id><published>2010-02-02T19:31:00.006-02:00</published><updated>2010-02-05T11:48:04.118-02:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='mapping'/><category scheme='http://www.blogger.com/atom/ns#' term='solid'/><category scheme='http://www.blogger.com/atom/ns#' term='mosaik'/><title type='text'>Using MosaikAligner</title><content type='html'>&lt;a href="http://bioinformatics.bc.edu/marthlab/Mosaik"&gt;Mosaik&lt;/a&gt; is a mapping program with a very good interface. One characteristic of Mosaik is that it works with binary files in its own format, so both the reference sequence and the reads have to be pre-processed.&lt;br /&gt;&lt;br /&gt;There is a bug in the processing of color-space sequences, so the patch described in this &lt;a href="http://code.google.com/p/mosaik-aligner/issues/detail?id=9"&gt;bug report&lt;/a&gt; must be applied before using the program.&lt;br /&gt;&lt;br /&gt;To process the reference use the following command:&lt;br /&gt;&lt;br /&gt;&lt;pre class="prettyprint lang-sh"&gt;MosaikBuild -fr fruitfly.fa -oa fruitfly.ref.bs.dat&lt;/pre&gt;&lt;br /&gt;This generates the reference in base space. 
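&lt;br /&gt;&lt;br /&gt;As background for the base-space and color-space references used in this post: SOLiD reads are reported in color space, where each color (0 to 3) encodes a pair of adjacent bases. The Python sketch below illustrates the usual two-base encoding. The function names and the example sequence are made up for illustration, it assumes a non-empty, uppercase, ACGT-only sequence, and it is not what MosaikBuild runs internally.&lt;br /&gt;&lt;br /&gt;&lt;pre class="prettyprint lang-py"&gt;
```python
# Two-bit code per base; the color of a dinucleotide is the XOR of the two codes.
BASE_CODE = {"A": 0, "C": 1, "G": 2, "T": 3}
CODE_BASE = "ACGT"


def encode_color_space(bases):
    """Return the leading base followed by one color per pair of adjacent bases."""
    colors = [str(BASE_CODE[a] ^ BASE_CODE[b]) for a, b in zip(bases, bases[1:])]
    return bases[0] + "".join(colors)


def decode_color_space(cs):
    """Rebuild the base sequence from a leading base and its colors."""
    bases = [cs[0]]
    for color in cs[1:]:
        # Each base is recovered from the previous base and the current color.
        bases.append(CODE_BASE[BASE_CODE[bases[-1]] ^ int(color)])
    return "".join(bases)


if __name__ == "__main__":
    encoded = encode_color_space("ATGGCA")
    print(encoded)
    print(decode_color_space(encoded))
```
&lt;/pre&gt;On the example sequence ATGGCA the encoder gives A31031, and decoding that string returns the original bases. Because each decoded base depends on the previous one, a single wrong color corrupts every base after it, which is one reason the reads are aligned against a color-space reference rather than being translated to bases first.&lt;br /&gt;&lt;br /&gt;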
You also need to generate the reference in color space, using the command:&lt;br /&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New',Courier,monospace;"&gt;MosaikBuild -fr fruitfly.fa -cs -oa fruitfly.ref.cs.dat&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;Finally, the reads are processed with the command:&lt;br /&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New',Courier,monospace;"&gt;MosaikBuild -q SRR034220.fastq.gz -out SRR034220.dat -st solid&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;There is one more, optional, pre-processing step, because Mosaik can look up the seeds using either a hash table or what they call a Jump Database. To generate the Jump Database use the following command:&lt;br /&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New',Courier,monospace;"&gt;MosaikJump -ia fruitfly.ref.cs.dat -out fruitfly.ref.cs_15.dat -hs 15 -mem 6&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;The -mem option is very important: if it is not set, the program will use all of the system's memory. In my case I gave the program 6 GB. Now we can move on to the alignment:&lt;br /&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New',Courier,monospace;"&gt;MosaikAligner -in SRR034220.dat -out SRR034220_align.dat -ibs fruitfly.ref.bs.dat -ia fruitfly.ref.cs.dat -hs 15 -mm 4 -j fruitfly.ref.cs_15.dat -p 2&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;Note that we are using both the color-space and the base-space references. The hash size (-hs 15) has to match the value passed to MosaikJump, and we are accepting up to 4 mismatches (-mm 4). I am also using the cores I have (-p 2). 
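&lt;br /&gt;&lt;br /&gt;To see why the hash size given to MosaikAligner has to agree with the one used by MosaikJump, consider a generic seed index keyed on fixed-length words. The Python sketch below is only an illustration under that assumption: the function names and toy sequences are hypothetical, and this is not Mosaik's actual data structure. An index built from 4-letter words can only answer queries for 4-letter words.&lt;br /&gt;&lt;br /&gt;&lt;pre class="prettyprint lang-py"&gt;
```python
from collections import defaultdict


def build_seed_index(reference, k):
    """Map every k-letter word of the reference to the positions where it starts."""
    index = defaultdict(list)
    for pos in range(len(reference) - k + 1):
        index[reference[pos:pos + k]].append(pos)
    return index


def seed_hits(index, read, k):
    """Look up each k-letter word of the read; return (read_offset, reference_position) pairs."""
    hits = []
    for offset in range(len(read) - k + 1):
        for pos in index.get(read[offset:offset + k], []):
            hits.append((offset, pos))
    return hits


if __name__ == "__main__":
    index = build_seed_index("ACGTACGTTAGC", 4)
    print(seed_hits(index, "CGTTAG", 4))  # same word size as the index
    print(seed_hits(index, "CGTTAG", 5))  # different word size: nothing can match
```
&lt;/pre&gt;With the toy data, looking up the read with a word size of 4 finds three seeds, while a word size of 5 finds none, because no key in the index has that length.&lt;br /&gt;&lt;br /&gt;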
The problem is that with this command I can align only 90 reads/s, so aligning the whole dataset would take more than 200 hours. I therefore ended up using the -bw 19 option to enable the banded Smith-Waterman algorithm. The manual suggests 13 for 36-base reads; I arrived at 19 by simple proportion (rule of three). I also used the -mhp 100 parameter to limit the number of positions per seed. The final command was therefore:&lt;br /&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New',Courier,monospace;"&gt;MosaikAligner -in SRR034220.dat -out SRR034220_align.dat -ibs fruitfly.ref.bs.dat -ia fruitfly.ref.cs.dat -hs 15 -mm 4 -j fruitfly.ref.cs_15.dat -p 2 &amp;nbsp;-bw 19 -mhp 100&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;With this command I can align around 260 reads/s, a good improvement. After a while, however, the rate drops to 40 reads/s. I believe that having only 8 GB of RAM is limiting the program's performance.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/14515087-5340757627663616759?l=bioinfo-br.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://bioinfo-br.blogspot.com/feeds/5340757627663616759/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=14515087&amp;postID=5340757627663616759' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/14515087/posts/default/5340757627663616759'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/14515087/posts/default/5340757627663616759'/><link rel='alternate' type='text/html' href='http://bioinfo-br.blogspot.com/2010/02/utilizando-o-mosaikaligner.html' title='Utilizando O MosaikAligner'/><author><name>Leonardo 
Varuzza</name><uri>https://profiles.google.com/106428020899659897437</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='//lh4.googleusercontent.com/-VfVeE_sAxds/AAAAAAAAAAI/AAAAAAAAAAA/ohSMK6AOy8g/s512-c/photo.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-14515087.post-6003227220358551597</id><published>2010-02-01T12:01:00.009-02:00</published><updated>2010-03-10T16:52:11.512-03:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='mappimg'/><category scheme='http://www.blogger.com/atom/ns#' term='SAM'/><category scheme='http://www.blogger.com/atom/ns#' term='solid'/><category scheme='http://www.blogger.com/atom/ns#' term='bowtie'/><title type='text'>Mapping Transcriptome Reads with Bowtie</title><content type='html'>Several SOLiD datasets are already available in NCBI's &lt;a bitly="BITLY_PROCESSED" href="http://www.ncbi.nlm.nih.gov/sra"&gt;SRA (Sequence Read Archive)&lt;/a&gt; database. To test bowtie I picked a D. melanogaster transcriptome dataset, &lt;a bitly="BITLY_PROCESSED" href="http://www.ncbi.nlm.nih.gov/sra/?db=sra&amp;amp;term=SRX015641&amp;amp;report=full"&gt;SRX015641&lt;/a&gt;. This file contains the result of a single run, yet it is still 4.2 GB (the NCBI site lets you use a plugin called &lt;a bitly="BITLY_PROCESSED" href="http://www.asperasoft.com/download/sw/connect/AsperaConnect"&gt;Aspera Connect&lt;/a&gt; to make the download easier).&lt;br /&gt;&lt;br /&gt;The Drosophila genome can be downloaded from &lt;a bitly="BITLY_PROCESSED" href="http://www.fruitfly.org/sequence/download.html"&gt;BDGP&lt;/a&gt;. 
The genome comes in several files, so before aligning you need to join all the parts into a single FASTA file using good old cat:&lt;br /&gt;&lt;br /&gt;&lt;code class="prettyprint lang-sh"&gt;&lt;br /&gt;cat na*.RELEASE5 &amp;gt; fruitfly.fasta&lt;/code&gt;&lt;br /&gt;&lt;br /&gt;Next, the genome is indexed with the bowtie-build command. Assuming bowtie is in your path, type:&lt;br /&gt;&lt;br /&gt;&lt;code class="prettyprint lang-sh"&gt;bowtie-build -C fruitfly.fasta&amp;nbsp; fruitfly&lt;/code&gt;&lt;br /&gt;&lt;br /&gt;The -C option is very important: it tells the program to build a color-space index. This process takes a few minutes. Finally, just run bowtie; since the download is already in FASTQ format, no conversion is needed. Because the file is compressed, we can decompress it in parallel with the alignment using the following command:&lt;br /&gt;&lt;br /&gt;&lt;code class="prettyprint lang-sh"&gt;&lt;br /&gt;zcat SRR034220.fastq.gz |&amp;nbsp;&lt;br /&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; bowtie -S -C -p 2 fruitfly - |&lt;br /&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; gzip &amp;gt; SRR034220.sam.gz&lt;/code&gt;&lt;br /&gt;&lt;br /&gt;The command must be typed on a single line. The -S option asks the program to write the alignment in SAM format, the -C option says that we are using color space, and -p 2 asks the program to use two threads (which is the number of cores I have). 
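&lt;br /&gt;&lt;br /&gt;If you prefer to launch the pipeline from a script, the command above can be assembled programmatically. The following is a minimal Python sketch under the assumption that zcat, bowtie and gzip are all in the path; the helper name bowtie_pipeline is hypothetical, and it uses only the bowtie options shown in this post.&lt;br /&gt;&lt;br /&gt;

```python
import shlex


def bowtie_pipeline(reads_gz, index, out_sam_gz, threads=2):
    """Build the zcat | bowtie | gzip pipeline as one shell command string.

    Hypothetical helper: it only assembles the command, it does not run it.
    """
    stages = [
        ["zcat", reads_gz],
        # -S: SAM output, -C: color space, -p: threads, "-": reads from STDIN
        ["bowtie", "-S", "-C", "-p", str(threads), index, "-"],
        ["gzip"],
    ]
    quoted = [" ".join(shlex.quote(arg) for arg in stage) for stage in stages]
    return " | ".join(quoted) + " > " + shlex.quote(out_sam_gz)


print(bowtie_pipeline("SRR034220.fastq.gz", "fruitfly", "SRR034220.sam.gz"))
```

The resulting string can be passed to subprocess.run(cmd, shell=True, check=True). Building it as a single line also avoids the line-break problem mentioned above.&lt;br /&gt;&lt;br /&gt;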
In this form, two pipes are used to decompress the reads and compress the SAM file; using gzip greatly reduces disk usage and I/O. Note the - after the reference name: it tells bowtie to read the reads from STDIN, that is, from the output of zcat.&lt;br /&gt;&lt;br /&gt;Alternatively, you can decompress the fastq file (you will need 16 GB of disk space for that) and then run the mapping with this command:&lt;br /&gt;&lt;br /&gt;&lt;code class="prettyprint lang-sh"&gt;&lt;br /&gt;bowtie -S -C -p 2 fruitfly SRR034220.fastq |&lt;br /&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; gzip &amp;gt; SRR034220.sam.gz&lt;/code&gt;&lt;br /&gt;&lt;br /&gt;bowtie has the --mm option, which makes the program use memory-mapped I/O; however, for the input sizes that SOLiD generates, this option exceeds the 4 GB limit and causes bowtie to crash with a segmentation fault.&lt;br /&gt;&lt;br /&gt;The mapping took 1h45 (wall time) and mapped 30% of the reads.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/14515087-6003227220358551597?l=bioinfo-br.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://bioinfo-br.blogspot.com/feeds/6003227220358551597/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=14515087&amp;postID=6003227220358551597' title='1 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/14515087/posts/default/6003227220358551597'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/14515087/posts/default/6003227220358551597'/><link rel='alternate' type='text/html' href='http://bioinfo-br.blogspot.com/2010/02/mapeando-reads-de-transcriptoma-com.html' title='Mapeando reads de Transcriptoma com Bowtie'/><author><name>Leonardo 
Varuzza</name><uri>https://profiles.google.com/106428020899659897437</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='//lh4.googleusercontent.com/-VfVeE_sAxds/AAAAAAAAAAI/AAAAAAAAAAA/ohSMK6AOy8g/s512-c/photo.jpg'/></author><thr:total>1</thr:total></entry><entry><id>tag:blogger.com,1999:blog-14515087.post-4035017490508346747</id><published>2010-01-29T00:26:00.000-02:00</published><updated>2010-01-29T00:26:53.701-02:00</updated><title type='text'>Mapping SOLiD reads with more programs</title><content type='html'>I managed to map the reads from the E. coli dataset with Mosaik, with bowtie and with PerM (details in the previous post).&amp;nbsp; bowtie worked without problems, and Mosaik worked after applying the patch indicated in this &lt;a href="http://code.google.com/p/mosaik-aligner/issues/detail?id=9"&gt;bug report&lt;/a&gt;. One problem common to all the programs was identifying mate-pairs: the mapping itself worked, but all of them had trouble when matching the mates. I need to find out what is wrong.&amp;nbsp; &lt;br /&gt;&lt;br /&gt;I tried to use bfast, but I found it rather tedious: you have to specify manually, one by one, which discontinuous-word patterns you want to use. Maybe I will take another look at it later.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/14515087-4035017490508346747?l=bioinfo-br.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://bioinfo-br.blogspot.com/feeds/4035017490508346747/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=14515087&amp;postID=4035017490508346747' title='0 Comments'/><link rel='edit' type='application/atom+xml' 
href='http://www.blogger.com/feeds/14515087/posts/default/4035017490508346747'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/14515087/posts/default/4035017490508346747'/><link rel='alternate' type='text/html' href='http://bioinfo-br.blogspot.com/2010/01/mapeando-reads-do-solid-com-mais.html' title='Mapeando reads do SOLiD com mais programas'/><author><name>Leonardo Varuzza</name><uri>https://profiles.google.com/106428020899659897437</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='//lh4.googleusercontent.com/-VfVeE_sAxds/AAAAAAAAAAI/AAAAAAAAAAA/ohSMK6AOy8g/s512-c/photo.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-14515087.post-7381041637617147474</id><published>2010-01-28T13:57:00.002-02:00</published><updated>2010-02-01T21:47:13.154-02:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='PerM'/><category scheme='http://www.blogger.com/atom/ns#' term='BAM'/><category scheme='http://www.blogger.com/atom/ns#' term='SAM'/><category scheme='http://www.blogger.com/atom/ns#' term='mapping'/><category scheme='http://www.blogger.com/atom/ns#' term='solid'/><title type='text'>Using PerM</title><content type='html'>&lt;a href="http://code.google.com/p/perm/"&gt;PerM&lt;/a&gt; is a fairly fast short-read aligner that handles both color-space and base-space sequences. Among its advantages are SAM output and support for multiple processors.&lt;br /&gt;&lt;br /&gt;To map a mate-pair dataset, for example the E. 
coli sample dataset available on the SOLiD software tools site, &lt;a href="http://solidsoftwaretools.com/gf/project/ecoli2x50/"&gt;ecoli2x50&lt;/a&gt;,&amp;nbsp; just use the following command:&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;PerM DH10B_WithDup_FinalEdit_validated.fasta &lt;br /&gt;          -1 Rosalind_20080729_2_Chris5_F3.csfasta &lt;br /&gt;          -2 Rosalind_20080729_2_Chris5_R3.csfasta &lt;br /&gt;          -o mates.sam &lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;I am assuming that PerM is in your path and that the current directory is the one containing the reads (the line breaks should not be typed; they were added only for readability).&lt;br /&gt;&lt;br /&gt;That is all it takes to run the alignment. &lt;a href="http://www.broadinstitute.org/igv/"&gt;IGV&lt;/a&gt; is a great option for viewing the result, but to open the file in that program you first need to convert the SAM file into an indexed BAM. This is done with samtools, in the following steps:&lt;br /&gt;&lt;br /&gt;&lt;ol&gt;&lt;li&gt;Convert the file from SAM to BAM: &lt;br /&gt;&lt;br /&gt;&lt;span style="font-family: &amp;quot;Courier New&amp;quot;,Courier,monospace;"&gt;samtools view -bt DH10B_WithDup_FinalEdit_validated.fasta mates.sam &amp;gt; mates.bam &lt;br /&gt;&lt;/span&gt;&lt;/li&gt;&lt;li&gt;Sort the BAM file:&lt;br /&gt;&lt;br /&gt;&lt;span style="font-family: &amp;quot;Courier New&amp;quot;,Courier,monospace;"&gt;samtools sort [-m MEM] mates.bam mates.sorted&lt;/span&gt;&lt;br style="font-family: &amp;quot;Courier New&amp;quot;,Courier,monospace;" /&gt; &lt;br /&gt;It is important to adjust the value of MEM so that you do not have to wait a few hours for the reads to be sorted. 
This value must be given in bytes.&lt;br /&gt;&lt;/li&gt;&lt;li&gt;Index the BAM file:&lt;br /&gt;&lt;br /&gt;&lt;span style="font-family: &amp;quot;Courier New&amp;quot;,Courier,monospace;"&gt;samtools&amp;nbsp; index&amp;nbsp; mates.sorted.bam&amp;nbsp;&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;This last step generates the file mates.sorted.bam.bai. &lt;br /&gt;&lt;/li&gt;&lt;/ol&gt;&lt;br /&gt;Once the conversion and indexing are done, open IGV. The first time you view the E. coli genome you will need to import it: go to the menu &lt;span style="font-family: &amp;quot;Courier New&amp;quot;,Courier,monospace;"&gt;File/Import Genome...&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;After importing the genome, go to File/Load From File... and load the file mates.sorted.bam.&lt;br /&gt;&lt;br /&gt;Done! If you zoom in, you will be able to see all the alignments, as in the figure below:&lt;br /&gt;&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://1.bp.blogspot.com/_1_1yOlN_87I/S2GzcHrVzHI/AAAAAAAAChY/K3M3rJ47-0k/s1600-h/IGV-ecoli.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" src="http://1.bp.blogspot.com/_1_1yOlN_87I/S2GzcHrVzHI/AAAAAAAAChY/K3M3rJ47-0k/s320/IGV-ecoli.png" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;&lt;br /&gt;The alignment takes around 15 minutes on my Athlon X2 3000+. 
Converting from SAM to BAM and indexing take longer than the alignment itself.&lt;br /&gt;&lt;br /&gt;The only problem is in resolving the mate-pairs: PerM manages to pair the mates for very few sequences.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/14515087-7381041637617147474?l=bioinfo-br.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://bioinfo-br.blogspot.com/feeds/7381041637617147474/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=14515087&amp;postID=7381041637617147474' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/14515087/posts/default/7381041637617147474'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/14515087/posts/default/7381041637617147474'/><link rel='alternate' type='text/html' href='http://bioinfo-br.blogspot.com/2010/01/utilizando-o-perm.html' title='Utilizando o PerM'/><author><name>Leonardo Varuzza</name><uri>https://profiles.google.com/106428020899659897437</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='//lh4.googleusercontent.com/-VfVeE_sAxds/AAAAAAAAAAI/AAAAAAAAAAA/ohSMK6AOy8g/s512-c/photo.jpg'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://1.bp.blogspot.com/_1_1yOlN_87I/S2GzcHrVzHI/AAAAAAAAChY/K3M3rJ47-0k/s72-c/IGV-ecoli.png' height='72' width='72'/><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-14515087.post-3422993230662512920</id><published>2009-12-17T15:20:00.000-02:00</published><updated>2009-12-17T15:20:52.549-02:00</updated><title type='text'>De Novo Accessory Tools</title><content type='html'>A new version of the utilities for &lt;i&gt;de novo&lt;/i&gt; assembly with SOLiD 
reads has been released. &amp;nbsp;The new version is available at:&lt;br /&gt;&lt;br /&gt;&lt;a href="http://solidsoftwaretools.com/gf/project/denovotools/"&gt;http://solidsoftwaretools.com/gf/project/denovotools/&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;PS: The old version is still available on the site (http://solidsoftwaretools.com/gf/project/denovo/), so check which version you are downloading.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/14515087-3422993230662512920?l=bioinfo-br.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://bioinfo-br.blogspot.com/feeds/3422993230662512920/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=14515087&amp;postID=3422993230662512920' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/14515087/posts/default/3422993230662512920'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/14515087/posts/default/3422993230662512920'/><link rel='alternate' type='text/html' href='http://bioinfo-br.blogspot.com/2009/12/de-novo-acessory-tools.html' title='De Novo Acessory Tools'/><author><name>Leonardo Varuzza</name><uri>https://profiles.google.com/106428020899659897437</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='//lh4.googleusercontent.com/-VfVeE_sAxds/AAAAAAAAAAI/AAAAAAAAAAA/ohSMK6AOy8g/s512-c/photo.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-14515087.post-6467831345795050872</id><published>2009-12-17T12:04:00.000-02:00</published><updated>2009-12-17T12:04:51.001-02:00</updated><title type='text'>Assembly Visualization</title><content type='html'>I found a very good assembly viewer called Tablet. 
The program is written in Java and has been tested on Mac, Windows and Linux. The user interface is quite pleasant, and I found the performance very good on the small example available on the program's page.&lt;br /&gt;&lt;br /&gt;The program's page is:&lt;br /&gt;&lt;br /&gt;&lt;a href="http://bioinf.scri.ac.uk/tablet/index.shtml"&gt;http://bioinf.scri.ac.uk/tablet/index.shtml&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;The program is described in the paper &lt;a href="http://bioinformatics.oxfordjournals.org/cgi/content/abstract/btp666"&gt;Tablet – Next Generation Sequence Assembly Visualization&lt;/a&gt;.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/14515087-6467831345795050872?l=bioinfo-br.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://bioinfo-br.blogspot.com/feeds/6467831345795050872/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=14515087&amp;postID=6467831345795050872' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/14515087/posts/default/6467831345795050872'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/14515087/posts/default/6467831345795050872'/><link rel='alternate' type='text/html' href='http://bioinfo-br.blogspot.com/2009/12/visualizacao-de-montagens.html' title='Visualização de Montagens'/><author><name>Leonardo Varuzza</name><uri>https://profiles.google.com/106428020899659897437</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' 
src='//lh4.googleusercontent.com/-VfVeE_sAxds/AAAAAAAAAAI/AAAAAAAAAAA/ohSMK6AOy8g/s512-c/photo.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-14515087.post-7171313287157976439</id><published>2009-12-16T15:40:00.004-02:00</published><updated>2009-12-16T15:45:00.650-02:00</updated><title type='text'>Velvet memory usage</title><content type='html'>The limiting factor for running Velvet is the amount of memory available on the computer. According to &lt;a href="http://seqanswers.com/forums/showthread.php?t=2101"&gt;this thread&lt;/a&gt; on the &lt;a href="http://seqanswers.com/"&gt;SEQAnswers&lt;/a&gt; forum, the formula for estimating memory usage in kB is:&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;Ram required for velvetg = -109635 + 18977*ReadSize &lt;br /&gt;    + 86326*GenomeSize + 233353*NumReads - 51092*K&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;Where:&lt;br /&gt;&lt;br /&gt;&lt;ul&gt;&lt;li&gt; ReadSize is in bases&lt;/li&gt;&lt;li&gt; GenomeSize is in megabases&lt;/li&gt;&lt;li&gt; NumReads is in millions of reads&lt;/li&gt;&lt;li&gt; and &lt;b&gt;K&lt;/b&gt; is the k-mer size used by Velvet&lt;/li&gt;&lt;/ul&gt;&lt;br /&gt;To make life easier for Velvet users (and to learn how to use Google App Engine), I wrote a small application that does this calculation. 
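&lt;br /&gt;&lt;br /&gt;The same calculation is easy to reproduce locally. Below is a minimal Python sketch of the formula quoted above; the function name velvetg_ram_kb and the example inputs are hypothetical, and this is not the code behind the web application.&lt;br /&gt;&lt;br /&gt;

```python
def velvetg_ram_kb(read_size, genome_size_mb, num_reads_millions, k):
    """Estimated velvetg memory use in kB, following the forum formula.

    read_size: read length in bases
    genome_size_mb: genome size in megabases
    num_reads_millions: number of reads, in millions
    k: k-mer size used by Velvet
    """
    return (
        -109635
        + 18977 * read_size
        + 86326 * genome_size_mb
        + 233353 * num_reads_millions
        - 51092 * k
    )


# Hypothetical inputs: 36-base reads, 5 Mb genome, 10 million reads, k = 21.
ram_kb = velvetg_ram_kb(36, 5, 10, 21)
print(ram_kb)                        # 2265765
print(round(ram_kb / 1024 ** 2, 2))  # 2.16 (GB)
```

Since the formula is an empirical fit reported in the forum thread, treat the result as a rough estimate rather than a guarantee.&lt;br /&gt;&lt;br /&gt;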
The address is:&lt;br /&gt;&lt;br /&gt;&lt;a href="http://denovoutils.appspot.com/"&gt;http://denovoutils.appspot.com/&lt;/a&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/14515087-7171313287157976439?l=bioinfo-br.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://bioinfo-br.blogspot.com/feeds/7171313287157976439/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=14515087&amp;postID=7171313287157976439' title='1 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/14515087/posts/default/7171313287157976439'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/14515087/posts/default/7171313287157976439'/><link rel='alternate' type='text/html' href='http://bioinfo-br.blogspot.com/2009/12/uso-de-memoria-pelo-velvet.html' title='Uso de memória pelo Velvet'/><author><name>Leonardo Varuzza</name><uri>https://profiles.google.com/106428020899659897437</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='//lh4.googleusercontent.com/-VfVeE_sAxds/AAAAAAAAAAI/AAAAAAAAAAA/ohSMK6AOy8g/s512-c/photo.jpg'/></author><thr:total>1</thr:total></entry><entry><id>tag:blogger.com,1999:blog-14515087.post-8966828211458459951</id><published>2009-12-07T11:51:00.000-02:00</published><updated>2009-12-07T11:51:05.812-02:00</updated><title type='text'>Next-gen Map</title><content type='html'>I found this site with a very nice map listing next-generation sequencers:&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;a href="http://pathogenomics.bham.ac.uk/hts/"&gt;http://pathogenomics.bham.ac.uk/hts/&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;The list is quite out of date for Brazil, but users themselves can update it by clicking 
directly on the map.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/14515087-8966828211458459951?l=bioinfo-br.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://bioinfo-br.blogspot.com/feeds/8966828211458459951/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=14515087&amp;postID=8966828211458459951' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/14515087/posts/default/8966828211458459951'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/14515087/posts/default/8966828211458459951'/><link rel='alternate' type='text/html' href='http://bioinfo-br.blogspot.com/2009/12/next-gen-map.html' title='Next-gen Map'/><author><name>Leonardo Varuzza</name><uri>https://profiles.google.com/106428020899659897437</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='//lh4.googleusercontent.com/-VfVeE_sAxds/AAAAAAAAAAI/AAAAAAAAAAA/ohSMK6AOy8g/s512-c/photo.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-14515087.post-5885681148791542418</id><published>2009-12-02T17:22:00.003-02:00</published><updated>2010-02-05T11:49:25.551-02:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='cluster'/><category scheme='http://www.blogger.com/atom/ns#' term='parallel'/><category scheme='http://www.blogger.com/atom/ns#' term='hadoop'/><title type='text'>Hadoop for bioinformatics</title><content type='html'>&lt;object height="300" width="400"&gt;&lt;param name="allowfullscreen" value="true" /&gt;&lt;param name="allowscriptaccess" value="always" /&gt;&lt;param name="movie" 
value="http://vimeo.com/moogaloop.swf?clip_id=7351342&amp;amp;server=vimeo.com&amp;amp;show_title=1&amp;amp;show_byline=1&amp;amp;show_portrait=0&amp;amp;color=&amp;amp;fullscreen=1" /&gt;&lt;embed src="http://vimeo.com/moogaloop.swf?clip_id=7351342&amp;amp;server=vimeo.com&amp;amp;show_title=1&amp;amp;show_byline=1&amp;amp;show_portrait=0&amp;amp;color=&amp;amp;fullscreen=1" type="application/x-shockwave-flash" allowfullscreen="true" allowscriptaccess="always" width="400" height="300"&gt;&lt;/embed&gt;&lt;/object&gt;&lt;br /&gt;&lt;a href="http://vimeo.com/7351342"&gt;Hadoop for Bioinfomatics - Deepak Singh&lt;/a&gt; from &lt;a href="http://vimeo.com/cloudera"&gt;Cloudera&lt;/a&gt; on &lt;a href="http://vimeo.com/"&gt;Vimeo&lt;/a&gt;.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/14515087-5885681148791542418?l=bioinfo-br.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://bioinfo-br.blogspot.com/feeds/5885681148791542418/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=14515087&amp;postID=5885681148791542418' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/14515087/posts/default/5885681148791542418'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/14515087/posts/default/5885681148791542418'/><link rel='alternate' type='text/html' href='http://bioinfo-br.blogspot.com/2009/12/hadoop-for-bioinformatics.html' title='Hadoop for bioinformatics'/><author><name>Leonardo Varuzza</name><uri>https://profiles.google.com/106428020899659897437</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' 
src='//lh4.googleusercontent.com/-VfVeE_sAxds/AAAAAAAAAAI/AAAAAAAAAAA/ohSMK6AOy8g/s512-c/photo.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-14515087.post-4446314315651217024</id><published>2009-11-09T11:10:00.000-02:00</published><updated>2009-11-09T11:10:52.570-02:00</updated><title type='text'>Tutorial: Mapping Reads Using Hadoop and EC2</title><content type='html'>Today a &lt;a href="http://biodivertido.blogspot.com/2009/11/automated-informatics-pipelines-public.html"&gt;tutorial&lt;/a&gt; was posted on the Biodivertido blog about installing and using the &lt;a href="http://sourceforge.net/apps/mediawiki/cloudburst-bio/index.php?title=CloudBurst"&gt;CloudBurst&lt;/a&gt; mapping program on an Amazon EC2 (Elastic Compute Cloud) instance.&lt;br /&gt;&lt;br /&gt;Hadoop, cloud computing and Amazon EC2 are three things that I believe will be very important for users of next-generation sequencers.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/14515087-4446314315651217024?l=bioinfo-br.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://bioinfo-br.blogspot.com/feeds/4446314315651217024/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=14515087&amp;postID=4446314315651217024' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/14515087/posts/default/4446314315651217024'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/14515087/posts/default/4446314315651217024'/><link rel='alternate' type='text/html' href='http://bioinfo-br.blogspot.com/2009/11/tutorial-mapeando-reads-utilizando.html' title='Tutorial: Mapeando Reads utilizando Hadoop e EC2'/><author><name>Leonardo 
Varuzza</name><uri>https://profiles.google.com/106428020899659897437</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='//lh4.googleusercontent.com/-VfVeE_sAxds/AAAAAAAAAAI/AAAAAAAAAAA/ohSMK6AOy8g/s512-c/photo.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-14515087.post-3776795539910173688</id><published>2009-10-29T12:02:00.002-02:00</published><updated>2009-11-25T11:28:02.992-02:00</updated><title type='text'>De Novo Assembly</title><content type='html'>The paper &lt;a href="http://www.pnas.org/content/98/17/9748.abstract"&gt;An Eulerian path approach to DNA fragment assembly&lt;/a&gt; proposed a new approach to sequence assembly that enables de novo assembly of small genomes from SOLiD or Solexa reads. &lt;br /&gt;&lt;br /&gt;Below is a list of short-read de novo assemblers, several of which implement this method:&lt;br /&gt;&lt;br /&gt;&lt;table&gt;&lt;tbody&gt;&lt;tr&gt;   &lt;td&gt;Velvet&lt;br /&gt;&lt;/td&gt;   &lt;td&gt;&lt;a href="http://www.ebi.ac.uk/%7Ezerbino/velvet/"&gt;Site&lt;/a&gt;&lt;br /&gt;&lt;/td&gt;   &lt;td&gt;&lt;a href="http://genome.cshlp.org/content/early/2008/03/18/gr.074492.107.abstract"&gt;Paper&lt;/a&gt;&lt;br /&gt;&lt;/td&gt; &lt;/tr&gt;&lt;tr&gt;   &lt;td&gt;Edena&lt;br /&gt;&lt;/td&gt;   &lt;td&gt;&lt;a href="http://www.genomic.ch/edena.php"&gt;Site&lt;/a&gt;&lt;br /&gt;&lt;/td&gt;   &lt;td&gt;&lt;a href="http://genome.cshlp.org/content/early/2008/04/03/gr.072033.107"&gt;Paper&lt;/a&gt;&lt;br /&gt;&lt;/td&gt; &lt;/tr&gt;&lt;tr&gt;   &lt;td&gt;EULER-SR&lt;br /&gt;&lt;/td&gt;   &lt;td&gt;&lt;a href="http://euler-assembler.ucsd.edu/portal/"&gt;Site&lt;/a&gt;&lt;br /&gt;&lt;/td&gt;   &lt;td&gt;&lt;a href="http://genome.cshlp.org/content/18/2/324.abstract"&gt;Paper&lt;/a&gt;&lt;br /&gt;&lt;/td&gt; &lt;/tr&gt;&lt;tr&gt;   &lt;td&gt;SSAKE&lt;br /&gt;&lt;/td&gt;   &lt;td&gt;&lt;a 
href="http://www.bcgsc.ca/platform/bioinfo/software/ssake"&gt;Site&lt;/a&gt;&lt;br /&gt;&lt;/td&gt;   &lt;td&gt;&lt;a href="http://bioinformatics.oxfordjournals.org/cgi/content/abstract/23/4/500"&gt;Paper&lt;/a&gt;&lt;br /&gt;&lt;/td&gt; &lt;/tr&gt;&lt;tr&gt;   &lt;td&gt;ADiR&lt;br /&gt;&lt;/td&gt;   &lt;td&gt;&lt;a href="http://solidsoftwaretools.com/gf/project/adir/"&gt;Site&lt;/a&gt;&lt;br /&gt;&lt;/td&gt; &lt;/tr&gt;&lt;tr&gt;   &lt;td&gt;ALLPATHS2&lt;br /&gt;&lt;/td&gt;   &lt;td&gt;&lt;a href="http://www.broadinstitute.org/science/programs/genome-biology/crd"&gt;Site&lt;/a&gt;&lt;br /&gt;&lt;/td&gt;   &lt;td&gt;&lt;a href="http://genomebiology.com/2009/10/10/R103"&gt;Paper&lt;/a&gt;&lt;br /&gt;&lt;/td&gt; &lt;/tr&gt;&lt;tr&gt;   &lt;td&gt;ALLPATHS&lt;br /&gt;&lt;/td&gt;   &lt;td&gt;&lt;a href="http://www.broadinstitute.org/science/programs/genome-biology/crd"&gt;Site&lt;/a&gt;&lt;br /&gt;&lt;/td&gt;   &lt;td&gt;&lt;a href="http://genome.cshlp.org/content/18/5/810.full"&gt;Paper&lt;/a&gt;&lt;br /&gt;&lt;/td&gt; &lt;/tr&gt;&lt;tr&gt;   &lt;td&gt;QSRA&lt;br /&gt;&lt;/td&gt;   &lt;td&gt;&lt;a href="http://qsra.cgrb.oregonstate.edu/"&gt;Site&lt;/a&gt;&lt;br /&gt;&lt;/td&gt;   &lt;td&gt;&lt;a href="http://www.biomedcentral.com/1471-2105/10/69"&gt;Paper&lt;/a&gt;&lt;br /&gt;&lt;/td&gt; &lt;/tr&gt;&lt;tr&gt;   &lt;td&gt;VCAKE&lt;br /&gt;&lt;/td&gt;   &lt;td&gt;&lt;a href="http://sourceforge.net/projects/vcake/"&gt;Site&lt;/a&gt;&lt;br /&gt;&lt;/td&gt;   &lt;td&gt;&lt;a href="http://bioinformatics.oxfordjournals.org/cgi/content/full/23/21/2942"&gt;Paper&lt;/a&gt;&lt;br /&gt;&lt;/td&gt; &lt;/tr&gt;&lt;tr&gt;&lt;td valign="top"&gt;MIRA&lt;br /&gt;&lt;/td&gt;&lt;td valign="top"&gt;&lt;a href="http://chevreux.org/projects_mira.html"&gt;Site&lt;/a&gt;&lt;br /&gt;&lt;/td&gt;&lt;td valign="top"&gt;&lt;br /&gt;&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td valign="top"&gt;Abyss&lt;br /&gt;&lt;/td&gt;&lt;td valign="top"&gt;&lt;a 
href="http://www.bcgsc.ca/platform/bioinfo/software/abyss"&gt;Site&lt;/a&gt;&lt;br /&gt;&lt;/td&gt;&lt;td valign="top"&gt;&lt;a href="http://genome.cshlp.org/content/19/6/1117"&gt;Paper1&lt;/a&gt;, &lt;a href="http://bioinformatics.oxfordjournals.org/cgi/content/abstract/btp367"&gt;Paper2&lt;/a&gt;&lt;br /&gt;&lt;/td&gt;&lt;/tr&gt;&lt;/tbody&gt; &lt;/table&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/14515087-3776795539910173688?l=bioinfo-br.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://bioinfo-br.blogspot.com/feeds/3776795539910173688/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=14515087&amp;postID=3776795539910173688' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/14515087/posts/default/3776795539910173688'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/14515087/posts/default/3776795539910173688'/><link rel='alternate' type='text/html' href='http://bioinfo-br.blogspot.com/2009/10/de-novo-assembly.html' title='De Novo Assembly'/><author><name>Leonardo Varuzza</name><uri>https://profiles.google.com/106428020899659897437</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='//lh4.googleusercontent.com/-VfVeE_sAxds/AAAAAAAAAAI/AAAAAAAAAAA/ohSMK6AOy8g/s512-c/photo.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-14515087.post-6162018668916770959</id><published>2009-10-27T12:37:00.000-02:00</published><updated>2009-10-27T12:41:56.832-02:00</updated><title type='text'>Hadoop in bioinformatics</title><content type='html'>&lt;a href="http://hadoop.apache.org/"&gt;Hadoop&lt;/a&gt; combines an open-source Java framework for distributed processing with a
distributed filesystem, both modeled on Google's &lt;a href="http://labs.google.com/papers/mapreduce.html"&gt;MapReduce&lt;/a&gt; and GFS. These technologies make it possible to write programs that scale to clusters of tens of thousands of computers. Hadoop is currently the architecture behind Yahoo's search system.&lt;br /&gt;&lt;br /&gt;The sharp increase in the volume of data generated by next-generation sequencers creates new challenges for bioinformatics. Hadoop is a good fit for many of the problems involved in analyzing this new sequencing data. The Cloudera blog published &lt;a href="http://www.cloudera.com/blog/2009/10/15/analyzing-human-genomes-with-hadoop/"&gt;an article&lt;/a&gt; about &lt;a href="http://bowtie-bio.sourceforge.net/crossbow/index.shtml"&gt;Crossbow&lt;/a&gt;, a pipeline for resequencing.&lt;br /&gt;&lt;br /&gt;Besides scaling well, Hadoop is supported by the &lt;a href="http://aws.amazon.com/ec2/"&gt;Amazon EC2&lt;/a&gt; cloud computing service.
With cloud computing, heavy analyses can be run without a large investment in hardware: an on-demand cluster is created in Amazon's data center and billed per CPU hour used.&lt;br /&gt;&lt;br /&gt;I expect more Hadoop-based pipelines to appear in the future.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/14515087-6162018668916770959?l=bioinfo-br.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://bioinfo-br.blogspot.com/feeds/6162018668916770959/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=14515087&amp;postID=6162018668916770959' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/14515087/posts/default/6162018668916770959'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/14515087/posts/default/6162018668916770959'/><link rel='alternate' type='text/html' href='http://bioinfo-br.blogspot.com/2009/10/hadoop-na-bioinformatica.html' title='Hadoop in bioinformatics'/><author><name>Leonardo Varuzza</name><uri>https://profiles.google.com/106428020899659897437</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='//lh4.googleusercontent.com/-VfVeE_sAxds/AAAAAAAAAAI/AAAAAAAAAAA/ohSMK6AOy8g/s512-c/photo.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-14515087.post-5348861346644997039</id><published>2009-10-27T08:35:00.011-02:00</published><updated>2010-01-28T14:47:52.792-02:00</updated><title type='text'>List of programs for mapping short sequences</title><content type='html'>The new generation of high-throughput sequencers produces an enormous number of reads that are much shorter than those from the Sanger method.
Aligning hundreds of millions of short sequences is a major computational challenge, so a number of programs are being developed for this task.&lt;br /&gt;&lt;br /&gt;Below is a list of programs I have collected:&lt;br /&gt;&lt;br /&gt;&lt;table&gt;&lt;thead&gt;&lt;tr&gt; &lt;th&gt;Name&lt;br /&gt;&lt;/th&gt; &lt;th&gt;Site&lt;br /&gt;&lt;/th&gt; &lt;th&gt;Paper&lt;br /&gt;&lt;/th&gt;&lt;td valign="top"&gt;Read space (bs = base space, cs = color space)&lt;br /&gt;&lt;/td&gt; &lt;/tr&gt;&lt;/thead&gt;&lt;tbody&gt;&lt;tr&gt; &lt;td&gt;SOAP&lt;br /&gt;&lt;/td&gt; &lt;td&gt;&lt;a href="http://soap.genomics.org.cn/"&gt;http://soap.genomics.org.cn/&lt;/a&gt;&lt;br /&gt;&lt;/td&gt; &lt;td&gt;&lt;a href="http://bioinformatics.oxfordjournals.org/cgi/content/abstract/25/15/1966"&gt;SOAP2: an improved ultrafast tool for short read alignment&lt;/a&gt;&lt;br /&gt;&lt;/td&gt;&lt;td valign="top"&gt;&lt;br /&gt;&lt;/td&gt; &lt;/tr&gt;&lt;tr&gt; &lt;td&gt;MapReads&lt;br /&gt;&lt;/td&gt; &lt;td&gt;&lt;a href="http://solidsoftwaretools.com/gf/project/mapreads/"&gt;http://solidsoftwaretools.com/&lt;/a&gt;&lt;br /&gt;&lt;/td&gt; &lt;td&gt;&lt;br /&gt;&lt;/td&gt;&lt;td valign="top"&gt;cs&lt;br /&gt;&lt;/td&gt; &lt;/tr&gt;&lt;tr&gt; &lt;td&gt;SHRiMP&lt;br /&gt;&lt;/td&gt; &lt;td&gt;&lt;a href="http://compbio.cs.toronto.edu/shrimp/"&gt;http://compbio.cs.toronto.edu/shrimp/&lt;/a&gt;&lt;br /&gt;&lt;/td&gt; &lt;td&gt;&lt;a href="http://www.ploscompbiol.org/article/info:doi/10.1371/journal.pcbi.1000386"&gt;SHRiMP: Accurate Mapping of Short Color-space Reads&lt;/a&gt;&lt;br /&gt;&lt;/td&gt;&lt;td valign="top"&gt;bs, cs&lt;br /&gt;&lt;/td&gt; &lt;/tr&gt;&lt;tr&gt;&lt;td valign="top"&gt;PerM&lt;br /&gt;&lt;/td&gt;&lt;td valign="top"&gt;&lt;a href="http://code.google.com/p/perm/"&gt;http://code.google.com/p/perm/&lt;/a&gt;&lt;br /&gt;&lt;/td&gt;&lt;td valign="top"&gt;&lt;a href="http://bioinformatics.oxfordjournals.org/cgi/content/abstract/btp486"&gt;PerM: efficient mapping of short sequencing reads with periodic full sensitive
spaced seeds&lt;/a&gt;&lt;br /&gt;&lt;/td&gt;&lt;td valign="top"&gt;bs,cs&lt;br /&gt;&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td valign="top"&gt;Mosaik&lt;br /&gt;&lt;/td&gt;&lt;td valign="top"&gt;&lt;a href="http://code.google.com/p/mosaik-aligner/"&gt;http://code.google.com/p/mosaik-aligner/&lt;/a&gt;&lt;br /&gt;&lt;/td&gt;&lt;td valign="top"&gt;&lt;br /&gt;&lt;/td&gt;&lt;td valign="top"&gt;cs,bs&lt;br /&gt;&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td valign="top"&gt;bwa&lt;br /&gt;&lt;/td&gt;&lt;td valign="top"&gt;&lt;a href="http://bio-bwa.sourceforge.net/"&gt;http://bio-bwa.sourceforge.net/&lt;/a&gt;&lt;br /&gt;&lt;/td&gt;&lt;td valign="top"&gt;&lt;a href="http://bioinformatics.oxfordjournals.org/cgi/content/full/25/14/1754"&gt;Fast and accurate short read alignment    with Burrows-Wheeler Transform&lt;/a&gt;&lt;br /&gt;&lt;/td&gt;&lt;td valign="top"&gt;cs,bs&lt;br /&gt;&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt; &lt;td&gt;maq&lt;br /&gt;&lt;/td&gt; &lt;td&gt;&lt;a href="http://maq.sourceforge.net/index.shtml"&gt;http://maq.sourceforge.net/index.shtml&lt;/a&gt;&lt;br /&gt;&lt;/td&gt; &lt;td&gt;&lt;br /&gt;&lt;/td&gt;&lt;td valign="top"&gt;&lt;br /&gt;&lt;/td&gt; &lt;/tr&gt;&lt;tr&gt; &lt;td&gt;bowtie&lt;br /&gt;&lt;/td&gt; &lt;td&gt;&lt;a href="http://bowtie-bio.sourceforge.net/index.shtml"&gt;http://bowtie-bio.sourceforge.net/index.shtml&lt;/a&gt;&lt;br /&gt;&lt;/td&gt; &lt;td&gt;&lt;a href="http://genomebiology.com/2009/10/3/R25"&gt;Ultrafast and memory-efficient alignment of short DNA sequences to the human genome&lt;/a&gt;&lt;br /&gt;&lt;/td&gt;&lt;td valign="top"&gt;bs,cs&lt;br /&gt;&lt;/td&gt; &lt;/tr&gt;&lt;tr&gt; &lt;td&gt;SOCS&lt;br /&gt;&lt;/td&gt;  &lt;td&gt;&lt;a href="http://socs.biology.gatech.edu/"&gt;http://socs.biology.gatech.edu/&lt;/a&gt;&lt;br /&gt;&lt;/td&gt;  &lt;td&gt;&lt;a href="http://bioinformatics.oxfordjournals.org/cgi/content/abstract/btn512"&gt;Efficient mapping of Applied Biosystems SOLiD sequence data to a reference genome for functional genomic 
applications&lt;/a&gt;&lt;br /&gt;&lt;/td&gt;&lt;td valign="top"&gt;cs&lt;br /&gt;&lt;/td&gt; &lt;/tr&gt;&lt;tr&gt; &lt;td&gt;bfast&lt;br /&gt;&lt;/td&gt;  &lt;td&gt;&lt;a href="https://secure.genome.ucla.edu/index.php/BFAST"&gt;https://secure.genome.ucla.edu/index.php/BFAST&lt;/a&gt;&lt;br /&gt;&lt;/td&gt;  &lt;td&gt;&lt;a href="http://www.plosone.org/article/info%3Adoi%2F10.1371%2Fjournal.pone.0007767"&gt;BFAST: An Alignment Tool for Large Scale Genome Resequencing&lt;/a&gt;&lt;br /&gt;&lt;/td&gt;&lt;td valign="top"&gt;cs,bs&lt;br /&gt;&lt;/td&gt; &lt;/tr&gt;&lt;tr&gt;&lt;td valign="top"&gt;RMAP&lt;br /&gt;&lt;/td&gt;&lt;td valign="top"&gt;&lt;a href="http://rulai.cshl.edu/rmap/"&gt;http://rulai.cshl.edu/rmap/&lt;/a&gt;&lt;br /&gt;&lt;/td&gt;&lt;td valign="top"&gt;&lt;a href="http://www.biomedcentral.com/1471-2105/9/128"&gt;Using quality scores and longer reads improves accuracy of Solexa read mapping&lt;/a&gt;&lt;br /&gt;&lt;/td&gt;&lt;td valign="top"&gt;&lt;br /&gt;&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td valign="top"&gt;CloudBurst&lt;br /&gt;&lt;/td&gt;&lt;td valign="top"&gt;&lt;a href="http://sourceforge.net/apps/mediawiki/cloudburst-bio/index.php?title=CloudBurst"&gt;http://sourceforge.net/apps/mediawiki/cloudburst-bio/index.php?title=CloudBurst&lt;/a&gt;&lt;br /&gt;&lt;/td&gt;&lt;td valign="top"&gt;&lt;br /&gt;&lt;/td&gt;&lt;td valign="top"&gt;&lt;br /&gt;&lt;/td&gt;&lt;/tr&gt;&lt;/tbody&gt;   &lt;/table&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/14515087-5348861346644997039?l=bioinfo-br.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://bioinfo-br.blogspot.com/feeds/5348861346644997039/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=14515087&amp;postID=5348861346644997039' title='0 Comments'/><link rel='edit' type='application/atom+xml' 
href='http://www.blogger.com/feeds/14515087/posts/default/5348861346644997039'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/14515087/posts/default/5348861346644997039'/><link rel='alternate' type='text/html' href='http://bioinfo-br.blogspot.com/2009/10/lista-de-programas-para-mapear-short.html' title='List of programs for mapping short sequences'/><author><name>Leonardo Varuzza</name><uri>https://profiles.google.com/106428020899659897437</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='//lh4.googleusercontent.com/-VfVeE_sAxds/AAAAAAAAAAI/AAAAAAAAAAA/ohSMK6AOy8g/s512-c/photo.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-14515087.post-112317947428418909</id><published>2005-08-04T15:11:00.000-03:00</published><updated>2005-08-04T15:41:53.093-03:00</updated><title type='text'>Extracting SAGE tags</title><content type='html'>I am evaluating some programs that extract SAGE tags from sequences. The process consists of extracting the ditags located between CATG sites. So far I have looked at two programs, SAGEparser and tagcalling. SAGEparser has a simple scheme for correcting the number of ditags based on the frequency of their monotags. tagcalling, in turn, uses an EM (expectation-maximization) procedure that corrects the tag counts while accounting for the possibility of mutations.
The second program seems the better of the two to me.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/14515087-112317947428418909?l=bioinfo-br.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://bioinfo-br.blogspot.com/feeds/112317947428418909/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=14515087&amp;postID=112317947428418909' title='3 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/14515087/posts/default/112317947428418909'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/14515087/posts/default/112317947428418909'/><link rel='alternate' type='text/html' href='http://bioinfo-br.blogspot.com/2005/08/extrao-de-tags-de-sage.html' title='Extracting SAGE tags'/><author><name>Leonardo Varuzza</name><uri>https://profiles.google.com/106428020899659897437</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='32' src='//lh4.googleusercontent.com/-VfVeE_sAxds/AAAAAAAAAAI/AAAAAAAAAAA/ohSMK6AOy8g/s512-c/photo.jpg'/></author><thr:total>3</thr:total></entry><entry><id>tag:blogger.com,1999:blog-14515087.post-112143399385593887</id><published>2005-07-15T10:22:00.000-03:00</published><updated>2005-07-15T10:26:33.856-03:00</updated><title type='text'>Beware of CAP3</title><content type='html'>If you are going to use cap3, check which version of the program you have. Run cap3 and see whether it prints the following line:&lt;br /&gt;&lt;br /&gt;VersionDate: 04/15/05&lt;br /&gt;&lt;br /&gt;The old version of cap3 prints no version information at all. That old version has a bug on Linux; I do not know whether it also occurs on other platforms.
If you assemble a sequence and choose to mask the ends with N, the old cap3 will generate a very large number of singlets; the new version does not.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/14515087-112143399385593887?l=bioinfo-br.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://bioinfo-br.blogspot.com/feeds/112143399385593887/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=14515087&amp;postID=112143399385593887' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/14515087/posts/default/112143399385593887'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/14515087/posts/default/112143399385593887'/><link rel='alternate' type='text/html' href='http://bioinfo-br.blogspot.com/2005/07/cuidado-com-o-cap3.html' title='Beware of CAP3'/><author><name>Jonny Marafo</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry></feed>
