All topics - Topic: tophat
1 (1 page in total)
G***G
Posts: 16778
1
From thread: Biology board - tophat's junctions.bed
After we get junctions.bed from TopHat, can we map the junctions
to the exons and introns of transcripts?
Which tools allow us to do this?
Thank you.
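One way to approach the question above is sketched below. It assumes junctions.bed follows TopHat's usual BED12 layout, where each junction is two blocks (the read overhangs on either side) and the score column is the number of spanning alignments, so the intron lies between the two blocks. The function names, the sample line and the exon coordinates are all hypothetical, made up for illustration.

```python
def junction_to_intron(bed_line):
    """Turn one BED12 junction line into (chrom, intron_start, intron_end, strand, n_reads).

    Coordinates are 0-based and half-open, as in BED.
    """
    fields = bed_line.rstrip("\n").split("\t")
    chrom, start, end = fields[0], int(fields[1]), int(fields[2])
    n_reads, strand = int(fields[4]), fields[5]
    block_sizes = [int(x) for x in fields[10].split(",") if x]
    # The intron starts where the left overhang ends and stops where the right one begins.
    intron_start = start + block_sizes[0]
    intron_end = end - block_sizes[-1]
    return chrom, intron_start, intron_end, strand, n_reads


def transcript_introns(exons):
    """Given one transcript's exons as (start, end) pairs, return its introns as a set."""
    ordered = sorted(exons)
    return {(left_end, right_start)
            for (_, left_end), (right_start, _) in zip(ordered, ordered[1:])}


def classify_junction(junction, exons):
    """Label a junction 'annotated' if it matches an intron of the transcript, else 'novel'."""
    _, intron_start, intron_end, _, _ = junction
    if (intron_start, intron_end) in transcript_introns(exons):
        return "annotated"
    return "novel"


# Hypothetical junction line and exon coordinates, for illustration only.
line = "chr1\t100\t300\tJUNC00000001\t12\t+\t100\t300\t255,0,0\t2\t20,30\t0,170"
exons = [(50, 120), (270, 400)]
junction = junction_to_intron(line)
print(junction, classify_junction(junction, exons))
```

When reading a real file, lines beginning with `track` would need to be skipped first, and the exon coordinates would come from a GTF/GFF annotation converted to 0-based half-open intervals.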
j*p
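Posts: 411
2
I do pure data analysis in a wet lab. For NGS data analysis, here is a brief
introduction to some tools I have used and found useful. It is a bit scattered;
treat it as a starting point, and corrections are welcome.
Next-gen sequencing (NGS) and the 3rd-gen sequencing now under development will
be used more and more widely in biological research. Believe it or not, I do.
First, experimental costs are falling ($1k whole-genome sequencing is coming),
so more and more labs can do it. Second, it yields far more data and
information than low-throughput experiments, revealing much that could not be
seen before. Third, sequencers themselves are steadily becoming more accurate,
so the intrinsic experimental error rate is dropping. Fourth, mature algorithms
allow many experimentally introduced errors to be filtered out by analysis
after the data are produced. By library preparation, NGS divides mainly into
DNA-seq and RNA-seq.
DNA-seq is usually used as ChIP-seq to study transcription factor (TF)-DNA
bi...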
B*M
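Posts: 1418
4
I really like this thread and hope the discussion stays lively and on topic~~
What I want to ask: if you come from a wet lab and want to start learning Linux
and programming, at least enough to run bowtie, tophat and the like, are there
any good introductory materials you would recommend?
Also, bowtie and tophat only run on Linux or Mac. We tried Windows in class and
got stuck partway through because some package could not be found...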
s********n
Posts: 248
6
From thread: Biology board - Looking for an introduction to RNA-sequencing
http://gtac.wustl.edu/services/sequencing/analysis-pipelines.ph
RNA-Seq
TopHat – aligns raw sequence reads to the reference genome. TopHat maps across splice junctions.
Cufflinks – assembles the transcripts, estimates their abundance and tests for differential expression in RNA-Seq samples. Currently gene abundance is restricted to annotated genes and transcripts.
j***x
Posts: 1469
7
I processed the data with TopHat and Cufflinks. The reference is:
Nature Protocols Vol. 7 No. 3 | 2012:
Differential gene and transcript expression analysis of RNA-seq
experiments with TopHat and Cufflinks.
The final step, CummeRbund, needs an R environment. I installed R, but when it
calls gfortran it cannot find a folder. I have installed gfortran, the latest
version.
The error message is as follows:
[jokex@localhost download]$ R
/usr/lib64/R/bin/exec/R: error while loading shared libraries: libgfortran.so.1: cannot open shared object file: No such file or directory
[jokex@localhost download]$ su --
Password:
[root@localhost download]# R
/usr...
a******k
Posts: 1190
8
From thread: Biology board - How to process RNA-Seq
I have to say, I have had a very bad experience with Trapnell's pipeline.
People complain that TopHat is very slow. If you have only tens of millions of
reads, you can go with TopHat, but with a huge dataset like mine it runs
forever (I admit that I have a billion reads). STAR is recommended by a lot of
people, and I also found it very good. It finished the job in one hour on
our supercluster.
Cufflinks is also very slow. For my dataset, it has been running for one week
(16 parallel jobs), and likely anoth...
l**********1
Posts: 5204
9
From thread: Biology board - How to process RNA-Seq
Please check:
i) Trapnell C et al. (2012).
Differential gene and transcript expression analysis of RNA-seq experiments
with TopHat and Cufflinks.
Nat Protoc 7: 562–578.
ii) Li H et al. (2009).
The Sequence Alignment/Map format and SAMtools.
Bioinformatics 25: 2078–2079.
plus
Weikard R et al. (2013).
Identification of novel transcripts and noncoding RNAs in bovine skin by
deep next generation sequencing.
BMC Genomics 14: 789. [Epub ahead of print]
http://www.ncbi.nlm.nih.gov/pubmed/24225384
c...
l**********1
Posts: 5204
10
From thread: Biology board - How to process RNA-Seq
If the Chinese forum cannot give you an answer,
how about asking on the TopHat English-language Google Group?
https://groups.google.com/forum/#!topic/tuxedo-tools-users/HQkjCNXx2-Y
https://groups.google.com/forum/#!forum/tuxedo-tools-users
from
http://tophat.cbcb.umd.edu/igenomes.shtml
M*P
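Posts: 6456
11
I met Steven Salzberg in October. He said the next versions of TopHat and
Cufflinks have finished review. Apparently some reviewers were not very
satisfied, but the method is sound and both speed and accuracy are improved, so
it should be published soon.
Your work that relies on tophat/cufflinks will soon be out of date.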
r*****q
Posts: 216
12
Uh, when a new TopHat comes out, of course I will update my software too. Are
you saying I can't use the new TopHat?

Posts: 1
13
From thread: Biology board - How to align reads containing indels?
It's an interesting question. To the best of my knowledge, no RNA-seq
alignment tool has been designed to tolerate CRISPR-mediated indels to date.
I'm not an RNA-seq expert, so it's possible that some are on the way. You
can also email Luca Pinello, the author of CRISPResso. He might know more,
and he's a really nice guy.
Here I'm trying to think about a solution. There are two situations, and I
don't know which one is your case.
a) The sample for RNA-seq was derived from a single mutant clone. I...
b*******m
Posts: 3
14
Hiring unit:
Garmire Group (PI starting 09/01/2012, the postdoc position available 09/01
or later)
University of Hawaii Cancer Research Center
Job description:
Located on the beautiful sea shore of Honolulu, Hawaii, overlooking the
Pacific Ocean, the University of Hawaii Cancer Center (UHCC) is one of only
66 research organizations in the country designated by the National Cancer
Institute. Its mission is to focus on key cancers that impact the
multi-ethnic population of Hawaii, as well as wor...
t******l
Posts: 2135
15
Thanks. TopHat it is, then. It was my first impression too; I went cross-eyed comparing all the options.
b****b
发帖数: 656
16
来自主题: Programming版 - 问个docker做pipeline的基础问题
如果讨厌CWL的复杂,可以看看Script of Scripts ( http://vatlab.github.io/SOS/ ),用Python,支持Docker,remote execution。唯一的问题是还在beta。
SoS 的最大优点是提供一个从交互分析到批量执行都可以使用的平台,script的可读性
非常强,适合于需要经常修改的bioinformatics pipeline。Docker方面用起来也很简
单,具体就是有什么script,本地可以run,加上 docker_image=name 的option就可以
在docker中执行。我推荐你用SoS写pipeline,根据需要把其中几步放docker中去执行(
诸如说tophat,用python2,不用docker装起来很麻烦)。以后需要在cluster上run了,
只需要几个小的改动就可以了。
入门可以看看 http://vatlab.github.io/SOS/doc/presentations/SoS_BCB_Jan23_2017/index.html , 不过哪个讲的简单,没有提docker。

and
A*****n
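Posts: 243
17
Google RNA-seq.
Usable software includes ERANGE, TopHat, etc.,
or commercial options such as CLC Genomics Workbench.
But it is best to have some knowledge of sequence analysis yourself.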
s*********x
Posts: 1923
18
From thread: Biology board - Help with RNA-seq result analysis
Use TopHat to map, and Cufflinks to compute RPKM for gene expression.
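For readers unfamiliar with the unit, RPKM (reads per kilobase of transcript per million mapped reads) can be sketched as below. This is only the textbook definition; Cufflinks itself estimates abundance with a statistical model, so its reported values will not in general equal this simple ratio. The function name and the numbers are made up for illustration.

```python
def rpkm(reads_in_gene, gene_length_bp, total_mapped_reads):
    """Reads per kilobase of transcript per million mapped reads.

    Equivalent to: reads / (length in kb) / (total mapped reads in millions).
    """
    if gene_length_bp <= 0 or total_mapped_reads <= 0:
        raise ValueError("gene length and total mapped reads must be positive")
    return reads_in_gene * 1e9 / (gene_length_bp * total_mapped_reads)


# Illustrative numbers: 500 reads on a 2 kb gene in a library of 10 million mapped reads.
print(rpkm(500, 2000, 10_000_000))
```

The normalization by length and by library size is what lets genes of different lengths, and samples of different sequencing depth, be compared on one scale.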
e****e
Posts: 3450
19
From thread: Biology board - Help with RNA-seq result analysis
How many cores does your laptop have?
For fastx and bowtie, 50M reads is barely OK; if you run tophat, it's not
enough...
e*****t
Posts: 642
20
From thread: Biology board - Seeking advice on bioinformatics career planning~~~
Oh hell, there is so much software available. I used to use maq; it's quite
reliable, but it's slow...
Now many people use bowtie. It's fast and memory-efficient because of its
special algorithm; you can even run it on a PC. For RNA-seq, some people use
tophat if you want to align across splice junctions...
n******7
Posts: 12463
21
From thread: Biology board - Seeking advice on bioinformatics career planning~~~
Perhaps because tophat and cufflinks were developed on top of bowtie?
n******7
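Posts: 12463
22
For those without money, which aligner is good for processing 454 data, given that spliced alignment is also needed?
tophat-bowtie is said to perform poorly on long reads.
bwa-sw cannot produce spliced alignments...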
B*M
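Posts: 1418
23
Galaxy doesn't feel very smooth to use. I can use tophat to get accepted hits and splice junctions.
But then cufflinks just won't produce any results, whatever I try!
Before that, I also had to run the raw data through the groomer...
And I was still using a small sample.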
j*p
Posts: 411
24
What do you mean by "cufflinks just won't produce any results"?
Cufflinks should be able to take output files directly from TopHat. Also,
you may want to install the latest version of Cufflinks; the developers have
significantly improved transcript prediction, and it allows the user to supply
a reference transcriptome in .gtf format.
j*p
Posts: 411
25
GenePattern probably has better RNA-seq pipelines. The people who develop
TopHat and Scripture are both with the Broad now.
n******7
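Posts: 12463
30
To answer my own question:
In my experience, gmap is very good for aligning 454 sequences.
bowtie-tophat gave many very strange results, and subsequent processing with cufflinks lost many exons.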
z*********8
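Posts: 1203
32
I recommend CLC Genomics Workbench for RNA-seq. It suits people who don't have
the energy to learn a programming language. Mapping is fast, but the mapping
algorithm is not one of those everyone has heard of (bowtie, bwa, soap, tophat,
etc.); it is their own algorithm. The upside is that it is very, very fast!
Their interface is also very user friendly. After mapping, you can use the R
package DESeq to call differential expression. You can also use their built-in
method, though it may not be as well liked as DESeq.
However, the software costs 5000 a year, so not every lab can buy it.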
t*******e
Posts: 119
34
bowtie, tophat
j***x
Posts: 1469
35
From thread: Biology board - paper help!
http://www.nature.com/nprot/journal/v7/n3/full/nprot.2012.016.h
Nature Protocols 7, 562–578 (2012)
Differential gene and transcript expression analysis of RNA-seq experiments
with TopHat and Cufflinks
please send to l************[email protected]
Thank you very much!
b*******m
Posts: 3
36
[The following text is reposted from the Postdoc board]
Sender: bioinform (ngs), Board: Postdoc
Title: postdoc position available in bioinformatics
Posted from: BBS 未名空间站 (Fri Jul 20 00:44:34 2012, US Eastern)
Hiring unit:
Garmire Group (PI starting 09/01/2012, the postdoc position available 09/01
or later)
University of Hawaii Cancer Research Center
Job description:
Located on the beautiful sea shore of Honolulu, Hawaii, overlooking the
Pacific Ocean, the University of Hawaii Cancer Center (UHCC) is one of only
66 research organizations in the coun...
n******7
Posts: 12463
37
From thread: Biology board - RNA-seq mapping tools
What platform?
If it's Solexa, then tophat --> cufflinks, I suppose.
j*p
Posts: 411
38
From thread: Biology board - RNA-seq mapping tools
If it's Illumina, and if you are only interested in the expression of
annotated transcripts, just try:
(1) map to the genome using tophat
(2) run cufflinks to get FPKM
FPKM estimation at the transcript level is not as good as at the gene level.
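The two steps above could be scripted roughly as follows. This is a sketch, not a tested pipeline: it only builds the command lines, relying on the `-p` (threads), `-G` (reference GTF) and `-o` (output directory) options of tophat and cufflinks and on TopHat writing `accepted_hits.bam` into its output directory. The function name and every path are hypothetical placeholders.

```python
def tuxedo_commands(index_prefix, reads, gtf, out_dir, threads=4):
    """Build the tophat and cufflinks command lines for annotated-transcript quantification.

    reads is a list with one FASTQ path (single-end) or two (paired-end).
    Returns the two commands as argument lists; nothing is executed here.
    """
    tophat_out = f"{out_dir}/tophat"
    tophat = ["tophat", "-p", str(threads), "-G", gtf, "-o", tophat_out,
              index_prefix, *reads]
    # With -G, cufflinks quantifies the annotated transcripts rather than assembling new ones.
    cufflinks = ["cufflinks", "-p", str(threads), "-G", gtf,
                 "-o", f"{out_dir}/cufflinks", f"{tophat_out}/accepted_hits.bam"]
    return tophat, cufflinks


# Hypothetical paths, for illustration only.
tophat_cmd, cufflinks_cmd = tuxedo_commands(
    "indexes/hg19", ["sample_R1.fastq", "sample_R2.fastq"], "genes.gtf", "results")
print(" ".join(tophat_cmd))
print(" ".join(cufflinks_cmd))
```

Each list can be handed to `subprocess.run(cmd, check=True)` once the tools are installed; keeping command construction separate from execution makes the pipeline easy to inspect before launching a long job.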
x*****d
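Posts: 704
39
From thread: Biology board - Contributing an SNP/Indel calling pipeline

Thanks a lot for the pointers! I have been using the hg19 and mm9 references from the TopHat homepage. I'll try it next time.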
c********e
Posts: 598
40
From thread: Biology board - truth about RNAseq vs Microarray

Which pipeline did you use? TopHat or STAR? DESeq or Cufflinks/Cuffdiff?
z*********8
Posts: 1203
41
From thread: Biology board - truth about RNAseq vs Microarray
I used TopHat and Cufflinks for mapping (on DNAnexus) and DESeq for analyzing
differentially expressed genes. DESeq is very nice for beginners like me. I
don't know any R, but I was able to adapt the example script to my case and
get the results that I wanted. For pathway analysis, I used
IPA and MetaCore.
x***u
Posts: 297
42
From thread: Biology board - How to process RNA-Seq
TopHat is slow but the result is OK. Cufflinks has been reported to have issues.
Trinity + PASA can do transcript reconstruction:
http://pasa.sourceforge.net/

It is more or less the only software that claims to reconstruct transcripts
with a reference transcriptome.
a***e
Posts: 1010
43
From thread: Biology board - NGS data analysis workflow
your sample --> company --> FQ or FA file
--> blat or bowtie or TopHat to align --> (.sam, .bam file)
--> Samtools or GATK to call variants --> .vcf file (excel file)
--> igvtools or genome browser to visualize
Alternatively, CLC is said to be able to replace the last three steps.
S*******e
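Posts: 94
44
From thread: Biology board - How to detect long noncoding RNA
Many long ncRNAs are themselves fairly short and expressed at low levels, so
they are hard to detect. My worry with 150 bp paired-end is that the library
fragments end up around 300 bp, so some lncRNAs could easily be fragmented and
lost. My very personal suggestion is therefore that 100-nt SE is enough. The
key is to sequence deeper and be careful in the downstream analysis.
As for exon-exon junctions, they don't matter that much. The long ncRNA genes
themselves are a hot topic, but their splice variants don't feel like an
especially hot one. Working on them requires follow-up, and at that point there
is a lot of hassle and wasted effort. Just my personal opinion.
Then there is using a strand-specific library followed by the Cufflinks suite
to assemble lncRNAs. If they are intergenic, that's fine. If they are
antisense, then, as someone mentioned above, there are "many tricky spots",
starting right from the tophat step.
Another point is poly-A selection versus ribosomal RNA depletion. These two
techniques actually answer somewhat different questions. If there is no special
consideration, then I personally think that identifying lncRNAs with poly-A
selection...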
x*****d
Posts: 704
G***G
Posts: 16778
46
From thread: Biology board - tophat's junctions.bed
Thank you.
Can we export the results in IGV?
r******0
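Posts: 357
47
Running bowtie (tophat) and cufflinks is only the very first step.
The real difficulty lies in the statistical and algorithmic models that come
afterwards. For example, how do you pick genes of interest so that experimental
validation has a reasonably good success rate?
How do you work out, amid a lot of noise, which genes took part in which processes to produce the result?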
f*****h
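Posts: 228
48
From thread: Biology board - Analysis of non-strand-specific RNA-seq data
Experts on this board: I want to analyze some non-strand-specific RNA-seq data.
What should I watch out for when using Bowtie, TopHat and Cufflinks? Also, how
do these programs distinguish strand-specific from non-strand-specific data? Do
they align to both the forward and reverse strands? Or is it only taken into
account during differential expression analysis? I'm a beginner; thanks for any guidance.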
c*********r
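Posts: 1312
49
From thread: Biology board - Analysis of non-strand-specific RNA-seq data
http://www.nature.com/nprot/journal/v7/n3/full/nprot.2012.016.h
I'm no expert; I have also only just started doing NGS data analysis. The link
above is a very good Bowtie/TopHat/Cufflinks pipeline. You can use it as a
reference; it is not too hard to follow.
As I understand it, non-strand-specific data is basically no problem; most
RNA-seq data these days seems to be non-strand-specific by default. It is
usually strand-specific data that needs some extra options during analysis.
Let's keep in touch. Happy data mining!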
d****7
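Posts: 109
50
From thread: Biology board - Analysis of non-strand-specific RNA-seq data
Non-strand-specific needs no discussion; it is the simplest case. Just follow
the protocol and there is no problem. The problem lies with strand-specific:
1. Whether the library is strand-specific or not does not affect mapping,
because whatever protocol is used, the reads count whichever strand they map
to. Strand information is only used later, when computing gene expression.
Exactly how reads are counted to genes depends on your strand-specific library
prep protocol.
2. Taking tophat/bowtie as an example, there is a --library-type option at
mapping time. Whichever value you choose does not affect mapping, but be
careful: for the subsequent cufflinks/cuffdiff, htseq-count and the like, this
--library-type must be set correctly, because it affects the counting for gene
expression.
3. How to choose --library-type? In general, if the reads sequenced first (R1
reads) are from the first-strand (first st...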
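The point in item 2 about keeping the strand setting consistent downstream can be sketched as a small lookup. It assumes the commonly used correspondence between TopHat/Cufflinks `--library-type` values and htseq-count's `-s`/`--stranded` option (which is named differently from `--library-type`); verify it against your own library prep before relying on it. The function name and file names are hypothetical.

```python
# Assumed correspondence between TopHat/Cufflinks library types and htseq-count -s values.
HTSEQ_STRANDED = {
    "fr-unstranded": "no",        # non-strand-specific library
    "fr-firststrand": "reverse",  # e.g. dUTP-style preps: R1 is antisense to the transcript
    "fr-secondstrand": "yes",     # R1 has the same orientation as the transcript
}


def htseq_count_command(library_type, bam, gtf):
    """Build an htseq-count command whose strand setting matches the library type."""
    if library_type not in HTSEQ_STRANDED:
        raise ValueError(f"unknown library type: {library_type}")
    return ["htseq-count", "-f", "bam", "-s", HTSEQ_STRANDED[library_type], bam, gtf]


# Hypothetical file names, for illustration only.
print(" ".join(htseq_count_command("fr-firststrand", "accepted_hits.bam", "genes.gtf")))
```

Deriving the counting flag from the same library-type value used at the mapping step removes one place where the two settings can silently drift apart.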