All topics - Topic: batches
z****e
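Posts: 54598
1

rdd
A stream is data that arrives continuously, without interruption.
A batch is data with known boundaries.
Spark's streaming is really just micro-batch;
in essence it is still batch, not streaming.
Streaming means each record is processed as it arrives, and only one at a time.
That is true streaming; anything that falls short of this is pseudo-streaming.
Micro-batch, as the name suggests, does not work that way.
The advantage of streaming is obvious: low latency, so the system can react quickly.
The downside is just as obvious: it needs more resources.
And over a long span, say when processing chunks, the total
time still comes out lower with batch.
Personally I don't think streaming is well suited to persistence work,
especially for data on a file system or in a db; batch is enough there.
Streaming is for data that must be processed and answered within a short time,
mainly data coming in from the web, such as video,
or tweets, or, say, a udp socket listening directly on a port.
For these a streaming api makes good sense and can improve the customer experience.
They also have a third api, the table api, which... Read full post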
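To make the distinction in the post above concrete, here is a minimal sketch in plain Python. It does not use Spark or Flink; the function names `process_streaming` and `micro_batches` are made up for illustration, and integers stand in for incoming records.

```python
from typing import Callable, Iterable, Iterator, List


def process_streaming(records: Iterable[int], handle: Callable[[int], None]) -> None:
    """Record-at-a-time processing: each record is handled as soon as it arrives."""
    for record in records:
        handle(record)


def micro_batches(records: Iterable[int], size: int) -> Iterator[List[int]]:
    """Micro-batch processing: buffer records and hand them over in small bounded groups."""
    buffer: List[int] = []
    for record in records:
        buffer.append(record)
        if len(buffer) == size:
            yield buffer
            buffer = []
    if buffer:  # flush the last, possibly smaller, group
        yield buffer
```

In the first function a record never waits for its neighbours; in the second, the first record of each group waits until the group fills up, which is the latency cost the post attributes to micro-batch.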
c*********d
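Posts: 9770
2
[The following text is reposted from the Mod_CHN_Hist board]
From: chinabbsdad (张果老他爹), Board: Mod_CHN_Hist
Subject: Zhang Chunqiao: On Exercising All-Round Dictatorship Over the Bourgeoisie (Chinese-English parallel text)
Posted at: BBS 未名空间站 (Thu Nov 21 06:51:57 2013, US Eastern)
Zhang Chunqiao: On Exercising All-Round Dictatorship Over
the Bourgeoisie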
无产阶级专政问题,是长期以来马克思主义同修正主义斗争的焦点。列宁说“只有承认
阶级斗争、同时也承认无产阶级专政的人,才是马克思主义者。”
THE question of the dictatorship of the proletariat has long been the focus
of the struggle between Marxism and revisionism. Lenin said, “Only he is a
Marxist who extends the recognition of the class struggle t... Read full post
t***y
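Posts: 1246
3
A typical iPhone Serial number: 7T727XYZWH8
Using the above as an example
digits 1 to 3 (7T7) = Year / Batch code / manufacture ID
digits 4 to 5 (27) = Week of manufacture
digits 6 to 8 (XYZ) = Unique identifier part of the S/N
digits 9 to 11 (A4S) = 16GB model
In a bit I'll summarize the 4.01 and 4.02 serials posted on the board.
We need to see which version a given batch within the same week has; that whole batch is probably all one version. That should be
easier to summarize. Then you can verify it tonight.
Basically, look at the week first, then the batch code.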
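A small sketch of the field layout described above, assuming the 11-character format and the character positions given in the post. The function name and the dictionary keys are my own labels, not anything official.

```python
def parse_serial(serial: str) -> dict:
    """Split an 11-character serial into the four fields listed in the post."""
    if len(serial) != 11:
        raise ValueError("expected an 11-character serial number")
    return {
        "year_batch_factory": serial[0:3],  # characters 1 to 3
        "week": serial[3:5],                # characters 4 to 5
        "unique_id": serial[5:8],           # characters 6 to 8
        "model": serial[8:11],              # characters 9 to 11
    }


fields = parse_serial("7T727XYZWH8")
```

Grouping serials by `week` first and then by `year_batch_factory` follows the order suggested in the post: week first, then batch code.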
e********s
Posts: 119
4
Just got an email saying the formula is fine.

Dear ×××××,
With the recent media attention to infant formula, we want to make sure that
you—as a member of the Enfamil® Family Beginnings™ program—
know that trust in Enfamil products is well placed. We are completely
confident that Enfamil PREMIUM® Newborn formula and all the other
products we manufacture are safe.
That's because every single batch of Enfamil formula undergoes approximately
2,300 individual quality and safety tests before being shipped fro... Read full post
r*f
Posts: 39119
5
From topic: NextGeneration board - Is something wrong with Enfamil?
Updated: Statement from Mead Johnson on Enfamil PREMIUM® Newborn Safety
Last Updated: December 23, 2011; 5:00 p.m. CT
For over 100 years, Mead Johnson has been trusted by parents and
healthcare professionals around the world to provide nutrition for their
infants and children. That trust is earned based on the knowledge
that every product we manufacture is safe and of the highest quality.
All of our infant formula products sold around the world undergo more
than 2,300 quality tests before they a... Read full post
a******n
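Posts: 206
6
Disclaimer first: I'm not taking sides, either pump or dump btc. I just read an article I thought was good
and am sharing it here to hear what the experts think.
Link: https://blog.chain.com/a-letter-to-jamie-dimon-de89d417cb80?from=singlemessage&isappinstalled=0
The original is long. Summary below.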
Cryptocurrencies (which I prefer to call crypto assets) are a new asset
class that enable decentralized applications
Decentralized applications enable services we already have today, like
payments, storage, or computing, but without a central operator of those
services
This software model is useful to people who need c... Read full post
G****s
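Posts: 13
7
From topic: Immigration board - My take on scanning case numbers after filing EB1B directly with PP
I've learned a lot on this board and want to give something back. People here, including the expert genegun, have asked several
questions about PP; my recent 140 petition may offer some useful information.
Background: a perennial biology postdoc, about 20 papers and 20 conference abstracts, around 150 independent
citations. One pending patent and several university invention disclosures. Early this year I decided to stop procrastinating
and get the green card done, so I asked the university to file EB1b, and the university assigned a lawyer from Fragoman. Writing the recommendation letters and preparing materials took
about a month and a half. I got 9 recommendation letters, from four countries. Claimed four criteria: publication,
contribution, judgment, and media report. The lawyer's work was very disappointing; I won't go into details,
because there were too many problems. Following genegun's analysis, I had wanted to wait two weeks after the petition was delivered before adding PP, to avoid the
notorious IO #1172, but the lawyer said that would cost an extra $500. That didn't seem worth it, so I filed with PP directly.
FedEx delivered it on Friday 4/6; on 4/9 I received an email notice saying a response would come within 15 calendar days. I had not
cared much about the green card, but once it was actually filed I started getting anxious. I don't know how to use a case-number scanner, but what we biologists are best at... Read full post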
f*1
Posts: 837
8
People online have also complained about the Donic Acuda S1 bubbling.
Quotes from the web:
“I would suggest trying to order the rubber from a different vendor if you
want to keep it because it may be a bad batch or something of that nature as
well.”
“...being a Tibhar agent myself I've seen some batches of rubber which have
consistently bubbled while other batches of the same rubber have been fine.”
“Cons - Commercial DHS rubbers are always different, from each batch to the
next, but nothing to do about that.”
f*1
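Posts: 837
9
Grafting two posts together:
黑哥's idea is quite novel. It may well have something to do with vacuum packaging. To be sure, though, the PhDs on this board would have to publish
a paper, "On Vacuum's Effects on Adhesives".
Are you saying your new H3Neo was already domed before you opened it? That doesn't sound right. Mine are all flat, both the ones
bought from megaspin.net and the ones brought from shanghai-sports.com. They are still flat after sitting for half a year.
People online have also complained about the Donic Acuda S1 bubbling.
Quotes from the web: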
“I would suggest trying to order the rubber from a different vendor if you
want to keep it because it may be a bad batch or something of that nature as
well.”
“...being a Tibhar agent myself I've seen some batches of rubber which have
consistently bubbled while other batches of the same rubber... Read full post
p*****y
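Posts: 1049
10
From topic: Xibei board - Actually, about the earned income mentioned last time
I'm not being silly, I just don't trust them. I studied chemical engineering, so I have some idea what the batch behind a lot number
means. If bugs were found in this batch, is the next batch really free of contamination? Who are they kidding? It may not be the same
batch, but the raw materials could still have come from the same warehouse. There are plenty of routes for cross-contamination.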
l*s
Posts: 783
11
☆─────────────────────────────────────☆
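runPython (凸-.-) wrote on (Mon Oct 15 00:29:15 2012, US Eastern):
Still undecided:
as a language, C# is stronger than Java;
as a framework, ASP's MVC is easier to use,
but Java's open-source ecosystem is already very strong,
big companies all use it, and the job prospects are very good.
My impression is that java and j2EE are used by mid-size and large companies,
while C# and ASP are used by small and mid-size companies or large non-IT companies.
You can see that salaries do differ somewhat, on average.
JAVA is slightly higher.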
☆─────────────────────────────────────☆
a9 (嗯) wrote on (Mon Oct 15 08:33:29 2012, US Eastern):
Many big e-commerce companies use .net

☆─────────────────────────────────────☆
NeverLearn (24K golden bear) wrote on (Mon Oct 15 11:06:35 2012, US Eastern):
Java is paid high simply b/c it's c... Read full post
d****n
Posts: 1637
12
From topic: Programming board - SGE qsub
The -r option allows users to control whether the submitted job will be
rerun if the controlling batch node fails during execution of the batch
job. The -r option likewise allows users to indicate whether or not the
batch job is eligible to be rerun by the qrerun utility. Some jobs cannot
be correctly rerun because of changes they make in the state of
databases or other aspects of their environment. This volume of
IEEE Std 1003.1-2001 speci... Read full post
j******4
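Posts: 6090
13
From topic: Unix board - A basic question from a newbie
For example, in folder A I have files *.batch.new, *.batch.old, *.batch.annot, and folder B
has files with the same names.
Now I want to append the last two lines (tail -2) of every file in folder A to the corresponding file in folder B.
How can I do this? I think I should use a loop, but I don't know the exact syntax...
Say my current path is folder B:
for name in \ls /A/*.batch.new; do
tail -2 $name.annot >> (name)."annot"
tail -2 $name.old >> (name)."old"
tail -2 $name.new >> (name)."new"
done
This statement is definitely wrong, but it should be roughly like this. I use the vi editor and bash. Not sure whether I've made myself
clear; I'd appreciate a pointer or two from the experts, thanks.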
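One way to do what the question describes, sketched in Python rather than bash. This is only an illustration under assumptions: the function name `append_tails` is made up, files are matched by name with the pattern `*.batch.*`, and files in A with no counterpart in B are skipped.

```python
from pathlib import Path


def append_tails(src_dir, dst_dir, n=2, pattern="*.batch.*"):
    """Append the last n lines of each matching file in src_dir
    to the file with the same name in dst_dir."""
    updated = []
    for src in sorted(Path(src_dir).glob(pattern)):
        dst = Path(dst_dir) / src.name
        if not dst.is_file():
            continue  # no counterpart in the destination folder
        tail = src.read_text().splitlines(keepends=True)[-n:]
        with dst.open("a") as out:  # "a" appends instead of overwriting
            out.writelines(tail)
        updated.append(src.name)
    return updated
```

Calling `append_tails("/A", ".")` from inside folder B would cover the .new, .old and .annot files in one pass, since the loop pairs files by their full name instead of rebuilding each name from a prefix.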
d*2
Posts: 2053
14
From topic: DataSciences board - Impala v Hive
http://vision.cloudera.com/impala-v-hive/
by Mike Olson
December 22, 2013
We introduced Cloudera Impala more than a year ago. It was a good launch for
us — it made our platform better in ways that mattered to our customers,
and it’s allowed us to win business that was previously unavailable because
earlier products simply couldn’t tackle interactive SQL workloads.
As a side effect, though, that launch ignited fierce competition among
vendors for SQL market share in the Apache Hadoop ecosystem, w... Read full post
o******1
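Posts: 1046
15
From topic: DataSciences board - About Stochastic Gradient Descent
If there were a method guaranteed to converge to the global minimum, every other numerical method would be worthless.
Like batch or mini-batch GD, SGD can only converge to a
local minimum unless the function is convex. SGD and mini-batch GD also require the rate to decrease with each step and tend to 0; otherwise the iterates just
circle around the local minimum.
My understanding is that SGD and mini-batch GD are measures of last resort for when the number of samples is too large to fit in memory and SD
cannot be computed. Provided the rate is chosen well, only SD guarantees that every step improves the objective, while the other two only
guarantee moving toward the optimum on the whole. If SD is computationally feasible, there is no need for the other two. Ng's Coursera course
covers all three algorithms; you can have a look there.
Of course, I'm self-taught in this as well. Corrections from experts are welcome.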
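A toy illustration of the point about the rate having to shrink toward 0. This is a sketch under simple assumptions, not anything from the course mentioned above: the objective is f(w) = 0.5 * E[(w - x)^2], whose minimizer is the mean of the samples, and the names `sgd_mean`, `w_decay` and `w_const` are made up.

```python
import random


def sgd_mean(samples, lr_schedule, w0=0.0):
    """Plain SGD on f(w) = 0.5 * E[(w - x)^2]; the per-sample gradient is (w - x)."""
    w = w0
    for t, x in enumerate(samples, start=1):
        w -= lr_schedule(t) * (w - x)
    return w


rng = random.Random(0)
data = [rng.gauss(3.0, 1.0) for _ in range(10_000)]
true_mean = sum(data) / len(data)

w_decay = sgd_mean(data, lambda t: 1.0 / t)  # rate decreases with each step and tends to 0
w_const = sgd_mean(data, lambda t: 0.5)      # rate stays fixed
```

With the 1/t schedule each update reduces to a running average, so the iterate settles on the sample mean. With the fixed rate every new sample pulls the iterate halfway toward itself, so it keeps moving around the minimum instead of settling.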
c*********d
Posts: 9770
17
Zhang Chunqiao: On Exercising All-Round Dictatorship Over
the Bourgeoisie
无产阶级专政问题,是长期以来马克思主义同修正主义斗争的焦点。列宁说“只有承认
阶级斗争、同时也承认无产阶级专政的人,才是马克思主义者。”
THE question of the dictatorship of the proletariat has long been the focus
of the struggle between Marxism and revisionism. Lenin said, “Only he is a
Marxist who extends the recognition of the class struggle to the recognition
of the dictatorship of the proletariat.” And it is precisely to enable us
to go by Marxism and not revisionism in both theory an... Read full post
D***s
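Posts: 5613
18
This Chinese report plays too many word games and is vague about what happened. I read an English report; the gist is that this batch of
test kits was defective, with accuracy below 30% (Academician Wang Chen has also said the accuracy of nucleic acid test kits is 30%-50%),
so it was returned. The defective kits were not the ones covered by the large contract with China (432 M euros);
they came from a domestic Spanish supplier, which had sourced them from China.
Madrid, Mar 26 (efe-epa).- Spain's government on Thursday said a batch of
faulty Covid-19 testing kits it had been forced to return had been acquired
through a domestic supplier and not directly from an unlicensed Chinese
company as the Asian nation's embassy said.
Spain's efforts this week to roll out 640,000 rapid testing kits, mainly
made by co... Read full post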
l****z
Posts: 29846
19
From topic: USANews board - "Death by Bureaucracy"
by Rick aka Mr. Brutally Honest
I stumbled across this story on Facebook this morning:
After a two and a half year legal battle, 15 tons of cheese made and
aged near Mountain View was hauled to a dump. To fans of
natural foods, it is monumental waste and over-regulation. To Missouri's
Milk Board, it's merely protecting public health.
"I see the destruction of what my wife and I and family have worked to
build," said Joseph Dixon, owner of Morningland Dairy.
Dixon an... Read full post
l******a
Posts: 3803
20
Judicial Watch Uncovers New Batch of Hillary Clinton Emails
AUGUST 09, 2016
Huma Abedin Emails Show Clinton Foundation Donor Demands on State Department
(Washington DC) – Judicial Watch today released 296 pages of State
Department records, of which 44 email exchanges were not previously turned
over to the State Department, bringing the known total to date to 171 of new
Clinton emails (not part of the 55,000 pages of emails that Clinton turned
over to the State Department).... Read full post
d*w
Posts: 384
21
From topic: HiFi board - Please recommend a DAC
in the sub-$100 category, Topping D1 mark 2 is pretty good. the dynamic
range of its headphone output is relatively narrow compared to more
expensive DAC/headphone amplifiers.
in the sub-$200 category, Audioengine D1 is pretty good.
in the sub-$500, NuForce HDP is good. it is discontinued. somehow some
distributors still claim they have new batches. I tried some of "the new
batches," they are not as good as older batches.
b********g
Posts: 43
22
From topic: JobHunting board - Got every question right and still no offer?
I was on fire that day.. they were mostly common questions..
1. two sum, sorted and unsorted version
2. reverse K group linked list
3. binary search tree(lowest common ancestor), extend to binary tree case
4. find unnecessary classes to compile a class in a package .
5. reverse a sentence word by word "I am joe" ---> I ma eoj
6. permutation of a string, I wrote the recursive way first and
next_permutation implementation after that.
7. intersection of two sorted array
8. rotate of a matrix, NxN case, I use transpose, NxM case, I use repl... Read full post
p*****3
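Posts: 488
23
From topic: JobHunting board - Asking for help with an FB design question.
Requests hit the web servers; each web server batch-processes its log in real time and keeps
a simple aggregation map in memory. Once the batch size is reached, it writes the old aggregation map to
a sharded key value store. Batch write is generally supported, so one request is sent per shard. The key
value
store should ideally support a merge operation; even if it doesn't, the chance of a race condition is small.
No need to explain how a key value store works inside; knowing how to use one is enough.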
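The flow in the post above could be sketched as follows. Everything here is hypothetical: `ShardedStore` is an in-memory stand-in for a real sharded key value store, `batch_merge` plays the role of a batch write with merge semantics, and sharding by CRC32 of the key is just one simple choice.

```python
import zlib
from collections import Counter, defaultdict


class ShardedStore:
    """Hypothetical in-memory stand-in for a sharded key value store with a merge operation."""

    def __init__(self, num_shards):
        self.shards = [Counter() for _ in range(num_shards)]
        self.requests = 0

    def shard_of(self, key):
        return zlib.crc32(key.encode()) % len(self.shards)

    def batch_merge(self, shard_id, counts):
        self.requests += 1                    # one request per shard per flush
        self.shards[shard_id].update(counts)  # Counter.update adds to existing counts


class BatchAggregator:
    """Keeps an in-memory aggregation map and flushes it once batch_size events were seen."""

    def __init__(self, store, batch_size):
        self.store = store
        self.batch_size = batch_size
        self.counts = Counter()
        self.seen = 0

    def record(self, key):
        self.counts[key] += 1
        self.seen += 1
        if self.seen >= self.batch_size:
            self.flush()

    def flush(self):
        # Swap in a fresh map first, then write the old one out grouped by shard.
        old, self.counts, self.seen = self.counts, Counter(), 0
        per_shard = defaultdict(dict)
        for key, n in old.items():
            per_shard[self.store.shard_of(key)][key] = n
        for shard_id, counts in per_shard.items():
            self.store.batch_merge(shard_id, counts)
```

Because the store merges (adds) instead of overwriting, two web servers flushing counts for the same key do not clobber each other, which is why the post prefers a store with a merge operation.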
h******b
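Posts: 312
24
From topic: StartUp board - Batch-deleting contacts on iPhone (repost)
[The following text is reposted from the Apple board]
From: happyzhb (NotionInMotion), Board: Apple
Subject: Batch-deleting contacts on iPhone
Posted at: BBS 未名空间站 (Wed Dec 22 15:55:23 2010, US Eastern)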
http://itunes.apple.com/app/contactdel/id399148990?mt=8
ContactDel provides many ways to delete your contacts in batch on iPhone and
iPod Touch. Use ContactDel to save time when cleaning up your address book.
1. Delete All: The easiest way to delete all contacts on your iPhone and
iPod Touch, just push the button, the App does the rest.
2. Delete by Name: list your c... Read full post
h*******y
Posts: 1563
25
IE/Firefox/Maxthon/Chrome: single: Save As; Batch: same-topic view + Save As
Cterm/Sterm, etc.: single: F4/Shift+F; Batch: tool: batch download (I like this
a lot, you can set many different criteria: same author, same topic, article
range from xxxxx to xxxxxx, or date range, or a combination)
S*******i
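Posts: 508
26
From topic: EB23 board - [Also asking for advice] EB2 denied after audit
Even worse than yesterday's post. My original PD was in 2012 and it dragged on for a long time. I mentioned the reason in a post a couple of days ago,
but to repeat: in May 2013 we were notified of an audit, and the company was asked to supply 2 pro forma documents. I wrote to push the lawyer,
who said it was a random audit and not a problem. It turns out, from checking online, that the materials were only sent out on the day of the deadline.
Last week I was told I had been denied. The lawyer says the other side claims it never received the materials and that this is a technical issue. He argued,
but it didn't work, so now we have to appeal. As it stands a result takes 3-plus years; the upside is that until there is a result I can keep extending my H1
-B. I'm a little over 4 years into H1-B, which expires in August 2017.
The old lawyer dusted himself off and left, and a new lawyer came in. In the video conference and the follow-up emails he never said a word or wrote a
line; it is always his assistant replying to my emails... Now, while the old Perm is under appeal, they are preparing a new Perm for me,
which must differ from the previous job description by at least 50%.
I've been at my current company for 3 years; before that, a 2-year master's in China, a 3-year master's in the US, then 1 year at my previous company.
The minimum requirement on the first Perm was master + 1 year experience.
This one is to be written as master + 3 years experience. They say that my... Read full post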
m****s
Posts: 7397
27
From topic: NewJersey board - twin, leisure reading
Ningbo Smart Pharmaceutical… Or Ningbo Dumb?
By Ed Silverman // April 5th, 2011 // 9:41 am
For those wondering about potential problems with active pharmaceutical
ingredient makers in China, consider the case of Ningbo Smart Pharmaceutical.
The company was cited in a March 30 warning letter from the FDA for various
problems - alleging impurity testing was done when, in fact, it was not;
failing to test all API lots shipped to customers and tossing all sorts of
data that should have been kept.
In ... Read full post
a****r
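Posts: 154
28
From topic: Football board - A post to celebrate my 3 year anniversary
This weekend I'm flying to the Bay Area again to see the bosses, and to celebrate our 3-year wedding anniversary.
Three years went by so fast (indeed, the kid is already that big). I can't forget the scenes from 3 years ago, now distant, now near.
Sitting alone in front of the xbox late at night and thinking of the past, my heart surges, yet I have no words....
I can't forget the drive to the wedding venue. It was a Sunday afternoon, but serious business came first,
so I could only listen to the NFL games on the car radio. At the time my number one RB, Martin, was scoring one
TD every week, very consistent. But I lacked a good number two RB. Holmes, who had left the Ravens
frustrated and whom I considered a sleeper pick at the draft, had the yards
but just could not get into the end zone. Maybe getting married brought extra luck, because good news kept coming the whole way:
first a Holmes TD, then a TD from Charlie Batch, filling in for McNabb (bye). Then
another Holmes TD, another Batch TD, a third Holmes TD, a third Batch TD. I was so excited I kept
yelling in the car. It was probably taken as eagerness for the wedding, and earned me
quite a few extra points with both sets of parents.
(From then on Holmes was unstoppable and began his run as the NFL's TD king.)
I solemnly declare that this is not the only impression my wedding left on me.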
i**********p
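Posts: 1341
29
From topic: Pingpong board - Let's argue over one more rule.
There is another option:
play the matches in batches of 3 tables.
Once a 3-table batch has started, all 3 matches must be played to the end, and all results count.
If it is already 5:1 or 6:0 after the second batch, the last batch is not played.
That satisfies the respect you mentioned, and there are no tricky cases.
It won't take much extra time either.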
n******d
Posts: 244
30
From topic: gardening board - What to do with more tomatoes than you can eat?
I dry it at 190 degree for about 12 hours, then sun dry it for another two
days. I think if I cut it thinner, it should take much less time. I am
drying my third batch now.
Two more pictures:
1. The second batch ready to be dried;
2. The first batch ready to be packed.
p********e
Posts: 16048
31
From topic: PhotoGear board - 5000ED is overpriced
V ED can do batch scan , six in a set.
when doing batch scan, the speed is not important

V ED can't batch scan
5000ED is also a bit faster
R******d
Posts: 5739
32
From topic: Joke board - Asking the academic crowd to judge whether this is real

FDA would ban such herbal products in the first place if they are deemed to
be dangerous. herbs absorb heavy metals from soils, therefore the levels of
heavy metals differ from batch to batch depending on the environment.
FDA can't ban all of them unless they have heavy metal content data, and
they can't test all batches. so they ask vendors to put a warning label on
and let the consumers take their own risk.
T*******y
Posts: 6523
33
The revised translation is pasted below.
Also, I just noticed that the name of the American professor who helped
register SOSCEF might have had a typo in "s".
鲍敏琪访谈录 used McCombs J.B.
吴雅楠访谈录 used McComb J.B.
I have followed whatever is in the Chinese version in the translation, and
perhaps nobody would really notice this, as I haven't noticed it until now,
but I want to point it out just in case.
=============================================
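鲍敏琪访谈录 (Interview with Bao Minqi)
OCEF taking the path of professionalization has always been my wish
Interviewer: 唐棠 (Teresa Tang)
Interview date: 20... Read full post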
h******b
Posts: 312
34
From topic: Apple board - Batch-deleting contacts on iPhone
http://itunes.apple.com/app/contactdel/id399148990?mt=8
ContactDel provides many ways to delete your contacts in batch on iPhone and
iPod Touch. Use ContactDel to save time when cleaning up your address book.
1. Delete All: The easiest way to delete all contacts on your iPhone and
iPod Touch, just push the button, the App does the rest.
2. Delete by Name: list your contacts by names, check-mark in batch and
delete in batch.
3. Restore: you can restore major information of previous deleted contac... Read full post
v*****r
Posts: 1119
35
From topic: Database board - Programming experts, what is the most efficient way to do this?
If the backend is Oracle and the client loads data over jdbc, the most efficient approach should be jdbc
batch operation + bind variable, which basically matches the performance of pl/sql bulk insert.
Your option 2 seems to mean this, but it doesn't mention bind variable. You can batch 100 record inserts
without bind variables, but the only thing you save comparing with "no
batch" is a little bit less network traffic for sending requests; there is no real
performance gain. Only when bind variables are used as well can oracle bind all 100 input
values, then execute them as one run inside sql engine. A bulk size of 100 is on the
low side; I usually start at 500. When benchmarking, you should... Read full post
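The post is about Oracle and jdbc; as a rough analogue only, here is the same idea (one parameterized statement, values bound per row, rows sent in batches) using Python's built-in sqlite3 module. The table name `records`, the function name `bulk_insert` and the default batch size of 500 are assumptions made for the example, and SQLite will not show the network savings described above.

```python
import sqlite3


def bulk_insert(conn, rows, batch_size=500):
    """Insert rows in batches through a single parameterized statement."""
    sql = "INSERT INTO records (id, payload) VALUES (?, ?)"  # ? marks are bind placeholders
    inserted = 0
    for start in range(0, len(rows), batch_size):
        chunk = rows[start:start + batch_size]
        conn.executemany(sql, chunk)  # same statement text, values bound for each row
        inserted += len(chunk)
    conn.commit()
    return inserted


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE records (id INTEGER PRIMARY KEY, payload TEXT)")
rows = [(i, f"row-{i}") for i in range(1200)]
bulk_insert(conn, rows)
```

The contrast the post draws is with building 100 separate SQL strings with the values pasted in: that still batches the round trips, but the database has to parse each statement instead of reusing one.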
c********l
Posts: 8138
36
From topic: Hardware board - Damn, the bitcoin price broke $100!!
Way too expensive
“Avalon” Batch 3
Batch Status: Not Open
Unit Price: 75 Bitcoins
Batch Size: 600 Units
Orders Opened On: To be Announced
Payment Processor: Bitpay
Shipping Time: May 5th, 2013
o***i
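Posts: 603
37
From topic: Java board - Looking for ideas
This runTask needs to run a series of subTasks. The subTasks must execute in order, but if the run is interrupted
there is no need to roll back.
Start/stop is already hard, and for pause/resume I don't have a good approach yet.
Can Spring batch be used here?
I see Spring batch has a spring batch admin, but it hasn't been updated in a long time.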
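Independent of Spring batch, one simple way to get pause/resume for ordered subtasks without rollback is to remember the index of the next subtask and only check the pause flag between subtasks. A minimal sketch, with the class name `SequentialRunner` and its methods made up for illustration:

```python
class SequentialRunner:
    """Runs subtasks in order; pausing takes effect between subtasks, with no rollback."""

    def __init__(self, subtasks):
        self.subtasks = list(subtasks)
        self.next_index = 0   # checkpoint: first subtask that has not finished yet
        self.paused = False

    def pause(self):
        self.paused = True

    def run(self):
        """Start or resume. Returns True once every subtask has completed."""
        self.paused = False
        while self.next_index < len(self.subtasks) and not self.paused:
            self.subtasks[self.next_index]()
            self.next_index += 1
        return self.next_index == len(self.subtasks)
```

Calling `run()` again after a pause continues from the stored index, so finished subtasks are never repeated. Persisting `next_index` somewhere durable would extend the same idea to resuming after a restart.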
p***c
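Posts: 94
38
I recently got a task: after the software crashes from a fault, dump what is in memory and organize the relevant
useful data in it so that the software can restart.
It is generally thought that dumped memory contents can be read from linux system files such as dev/mem and dev/kmem,
but since the software is for others to use, users usually won't have administrator privileges. So after a lot of effort I
finally found a way to dump this into an external file. But the dump is located by process_ID,
and it is further split into batches: one process has a dozen or so batches, and each batch produces one file. But
they are all binary and I don't know the format, so I don't know where to start analyzing them.
Could any expert give me some pointers?
Thanks.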
a********d
Posts: 491
39
if you want to run your 10 programs in Unix, then write a simple shell script:
batch.ksh
#!/usr/bin/ksh
# compile every .c file in the current directory into its own executable
for fname in *.c
do
    gcc -o "${fname%.c}" "$fname"
done
then you can just run batch.ksh in the background:
nohup ./batch.ksh &
baozi please ......
c******o
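Posts: 1277
40
From topic: Programming board - Tried spark, nothing special
spark is not a competitor to hadoop.
It is a replacement for mapreduce. Our stack is hdfs+spark+aws s3, and we may replace
hdfs with Cassandra.
For us, moving from hadoop (the previous BI system) to spark has many benefits:
1. unified system => it becomes a real pipeline, easy to program, modern, and
reliable, less maintenance.
2. much much faster (really, really fast for most BI use cases). What BI cares about most
is recent data; even with historical data, the analysis concentrates on a particular period. In any case it tested very fast.
3. uniform way to do stream/interactive/batch/sql/ML/graph calculation. Much of
what you build in interactive/batch can be used directly in stream. The common pattern is to try something in interactive
and, once it works, turn it into batch/stream for continuous monitoring.
For a large-scale data... Read full post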
w***g
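Posts: 5958
41
From topic: Programming board - Everything is already in memory and it still takes 40+ seconds
Other people who write open source have a company/school paying their salary; I'm not getting paid, so why would I get involved.
Waiting for free wheels is the way to go. My labor is cheap, but implementing this would still cost $10k; otherwise
ask goodbug whether he would do it.
pLSA and LDA need to be implemented with gibbs sampling/variational methods, and the current way to parallelize them
is mini batch. The problem is that once the batch gets large the convergence rate drops, while if the batch is not large there is no way to exploit
parallel computation. I think even that SGD on spark is probably a stretch.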
z****e
Posts: 54598
42
February 22, 2015 Nicole Hemsoth
If you haven’t heard of Flink until now, get ready for the deluge. As one
of a stream of Apache incubator-to-top-level projects turned commercial
effort, the data processing engine’s promise is to deliver near-real time
handling of data analytics in a much faster, more condensed, and memory-
aware way than Hadoop or its in-memory predecessor, Spark, could do.
What really captured our attention, however, was the claim by Data Artisans,
the company behind Flin... Read full post
z****e
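Posts: 54598
43
Split by datasources:
the stream api is generally used for data arriving over the network,
such as kafka, video, or etl.
These are the heavy users of streaming; paired with reactive,
the data can be processed and answered fairly quickly.
The batch and table apis are generally used for data read from disk,
especially data on disks that your own system controls.
That kind is handled with batch or table.
table targets data sources that are well structured and need high precision;
because the precision is high, the requirements are correspondingly high, so requiring index and transaction
is normal.
batch targets data sources that are loosely structured, where the precision requirement can be relaxed accordingly,
such as web search: google generally just returns the most similar pages
and does not guarantee one hundred percent accuracy. Often the first result is not what you want,
and occasionally you have to go through several pages to find it, or cannot find it at all.
In such cases you can only approximate; perfection does not exist.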
w***g
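Posts: 5958
44
An FCN's input does not need a fixed size; at least the caffe and tensorflow I have used both support adjusting the size automatically.
You only need the sizes within each batch to be consistent. I always use batch size = 1.
For example
X = tf.placeholder(tf.float32, shape=(None, None, None, 3), name="images")
Y = tf.placeholder(tf.float32, shape=(None, None, None, 1), name="labels")
Only the channel count is fixed; the batch size and image size are adjusted dynamically each iteration.
Your problem is not about the CNN capturing global information. An FCN is itself one big convolution, i.e. it is local.
Your problem is that a typical network's receptive field is > 64, i.e. larger than your input size.
So the model you train will expect white borders. If this model is applied directly to the full image,
the positions in the middle have no white border and will differ systematically from the training examples.
If you take the test imag... Read full post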
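To check the receptive-field claim for a given network, the field of a stack of conv/pool layers can be computed with the usual recurrence: each layer adds (kernel - 1) times the current stride product. The sketch below is not the poster's network; the layer list is an assumed VGG-16-style stack (3x3 convolutions with stride 1, 2x2 pooling with stride 2), and `receptive_field` is a made-up helper name.

```python
def receptive_field(layers):
    """layers: sequence of (kernel_size, stride) pairs, input side first.
    Returns the receptive field, in input pixels, of one output unit."""
    field, jump = 1, 1  # jump = distance in input pixels between adjacent units
    for kernel, stride in layers:
        field += (kernel - 1) * jump
        jump *= stride
    return field


conv, pool = (3, 1), (2, 2)
# Assumed VGG-16-style stack: blocks of 2, 2, 3, 3, 3 convolutions, each followed by a pool.
vgg16_like = ([conv] * 2 + [pool]) * 2 + ([conv] * 3 + [pool]) * 3
rf = receptive_field(vgg16_like)
```

For this assumed stack the result is well above 64, which matches the post's point: with 64-pixel training crops, every output unit sees beyond the image border during training.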
q******s
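Posts: 26
45
I often need to run a series of programs back to back, so I put the commands in a batch file and submit it. But lately
someone else in my group has been running a very large job, and my batch commands always end up in the wait queue after submission.
Is there any way to make my jobs run at the same time as his?
I know that if I skip batch job submit and just input the commands one by one it works, but that is too inconvenient.
Thanks in advance.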
s******s
Posts: 13035
46
From topic: Biology board - Are there statistics on tumor CNA?
Do you mean magetab?
For example, the SNP6 data is here: https://tcga-data.nci.nih.gov/tcgafiles/ftp_auth/distro_ftpusers/anonymous/tumor/luad/cgcc/broad.mit.edu/genome_wide_snp_6/snp/
The individual CNA files are here: https://tcga-data.nci.nih.gov/tcgafiles/ftp_auth/distro_ftpusers/anonymous/tumor/luad/cgcc/broad.mit.edu/genome_wide_snp_6/snp/broad.mit.edu_LUAD.Genome_Wide_SNP_6.Level_3.84.2012.0/BASIC_p_TCGASNP_219_221_223_N_GenomeWideSNP_6_F06_1148642.nocnv_hg19.seg.txt
The mage-tab SDRF is here: https://tcga-data.nci.nih.gov/tcgafiles/ftp_auth/distro... Read full post
q******s
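Posts: 26
47
[The following text is reposted from the Unix board; original below]
From: Qingerus (我的新年愿望一个个来吧~~~), Board: Unix
Subject: Asking the experts about running programs on a unix server.
Posted at: Unknown Space - 未名空间 (Thu Mar 25 16:12:18 2004) WWW-POST
I often need to run a series of programs back to back, so I put the commands in a batch file and submit it. But lately
someone else in my group has been running a very large job, and my batch commands always end up in the wait queue after submission.
Is there any way to make my jobs run at the same time as his?
I know that if I skip batch job submit and just input the commands one by one it works, but that is too inconvenient.
Thanks in advance.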
p******g
Posts: 92
48
From topic: Macromolecules board - **Akron polymer - continued**
i'm anxiously waiting to hear from them. they said if they don't make any
offer to the first batch of candidates, they then would come to us who are
still on the waiting list. any chance that they don't make any offer to the
first batch? then how many candidates will they select from the second batch?
l*u
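Posts: 2090
49
From topic: Pharmaceutical board - cGMP question
Before starting cGMP production,
how many Validation Batches are usually needed?
If all of these Validation Batches meet the requirements,
can the Validation Batch product be sold as cGMP product?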