All walks of life, as seen from MITBBS

topics

All topics - Topic: reinforce
1 2 3 4 5 6 7 8 9 10 Next Last (10 pages total)
w*******d
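Posts: 59
1
There's a saying that "a half-full bucket makes the most noise"... Today I
finally saw it in action... Someone actually dared to call Sutton and Barto's
book hopelessly outdated =_=... So should Turing's paper go in the trash too?
MCTS as an algorithm and RL are two separate concepts. When the value and
policy functions in RL are held fixed, the mathematical model reduces to a
Markov decision process in the statistical sense. Optimal decision-making in
an MDP is an NP-hard problem, so the MCTS algorithm can be used to approximate
the optimal choice at each step of the search.
When value and policy are unknown, you have to learn these two functions from
the data the MDP keeps feeding back; this process is called reinforcement
learning. In other words, I could skip MCTS for finding the best decision at
each step and use some other search method, and the process would still be
reinforcement learning... I'm not sure whether that makes it clear...
Google's biggest contribution was using deep belief nets to model these two
functions, plus MCTS search. Google calls this deep reinforcement learning.
In other words, I could use a random forest to mode...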
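The distinction drawn in the post above (planning when the MDP is known, learning when it is not) can be sketched on a toy problem. This is only an illustrative sketch: the 4-state chain, the reward of 1 at the last state, the hyperparameters, and every function name are assumptions made up here, not anything from the thread or from AlphaGo.

```python
import random

# Hypothetical 4-state chain: states 0..3, actions 0 (left) and 1 (right).
# Reaching state 3 pays reward 1 and ends the episode.
N_STATES, ACTIONS, GAMMA = 4, (0, 1), 0.9

def step(state, action):
    """Environment dynamics: returns (next_state, reward, done)."""
    nxt = min(state + 1, N_STATES - 1) if action == 1 else max(state - 1, 0)
    done = nxt == N_STATES - 1
    return nxt, (1.0 if done else 0.0), done

def value_iteration(sweeps=100):
    """Planning: the model `step` is known, so values are backed up directly."""
    v = [0.0] * N_STATES
    for _ in range(sweeps):
        for s in range(N_STATES - 1):  # the terminal state keeps value 0
            v[s] = max(r + GAMMA * (0.0 if d else v[n])
                       for n, r, d in (step(s, a) for a in ACTIONS))
    return v

def q_learning(episodes=2000, alpha=0.5, epsilon=0.2, seed=0):
    """Learning: the agent only sees sampled transitions, never the model."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        s, done = 0, False
        while not done:
            if rng.random() < epsilon:          # explore
                a = rng.choice(ACTIONS)
            else:                               # exploit current estimate
                a = max(ACTIONS, key=lambda x: q[s][x])
            n, r, done = step(s, a)
            target = r + (0.0 if done else GAMMA * max(q[n]))
            q[s][a] += alpha * (target - q[s][a])
            s = n
    return q

def greedy_policy(q):
    """Best action in each non-terminal state according to q."""
    return [max(ACTIONS, key=lambda a: q[s][a]) for s in range(N_STATES - 1)]
```

Both routes should end up preferring "right" in every state; the difference is only whether the transition model is consulted directly or discovered through trial and error.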
z*****3
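Posts: 1793
2
Reinforcement learning is not online learning. Those of us in the field
generally treat reinforcement learning as a problem, or a framework. It is
generally used to solve problems. Online and batch methods only came about as
ways to solve the RL problem.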
w*******y
Posts: 60932
3
New: Perfection 11 Piece Knife Set With Cutting Board, Stainless Steel With
Reinforced Rivets, 13.5 By 17.5 Wood Cutting Board.
retail price: $69.99
you save: $60.00(86%)
Sale price: $9.99
Link:
http://home.dailysteals.com/
Product Features:
Knives are made of stainless steel with reinforced rivets that extend all
the way to the butt for a full tang.
Wooden cutting board measures 13.5 by 17.5.
Sharpener allows for a quick, easy sharpen whenever your knives may need it.
What You Get:
Chopper
Ch
d*****u
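Posts: 17243
4
From thread: Military board - The general idea is to use reinforcement
The general idea is to use reinforcement learning,
working backward to determine the reward of each move in every state.
No rules were entered by hand.
The trained system solves 100% of Rubik's Cube scrambles, in about 30 moves on average.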
a******9
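Posts: 20431
5
From thread: Military board - The general idea is to use reinforcement
Damn, I thought this had been solved long ago. Bounded decision problems with
discrete choices like this one can all be solved with an MDP.

:The general idea is to use reinforcement learning,
:working backward to determine the reward of each move in every state.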

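Posts: 1
6
From thread: Military board - The general idea is to use reinforcement
Tell us how your wife feels when she stands next to white girls
盹盹盹
[Quoting daigaku (๑۩۞۩๑):]
:The general idea is to use reinforcement learning,
:working backward to determine the reward of each move in every state.
:No rules were entered by hand.
:The trained system solves 100% of Rubik's Cube scrambles, in about 30 moves on average.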
d*****u
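Posts: 17243
7
From thread: Military board - The general idea is to use reinforcement
They probably did not find the shortest paths.
Reinforcement learning does not perform global optimization, so it is normal
that it fails to find the optimal path within limited training time, but it
does solve the cube.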
n****y
Posts: 819
8
From thread: Living board - Urgent question: basement wall reinforcement
The seller disclosure lists "basement wall reinforcement". Does this count as
a foundation problem, and how serious is it? Many thanks!
w***c
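Posts: 709
9
I am editing a book, "Fiber-Reinforced Composites".
It has about 15 chapters, but so far only 2 are full text; the rest are still
abstracts. I expect to receive all the full texts around December.
If you are interested and have a background in composite materials, please
send me a site message.
Please include your email, name, and address so I can send an invitation; a
brief résumé would be even better.
Many thanks!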
a***m
Posts: 5037
10
This is also an important part
We trained the neural networks on 30 million moves from games played by
human experts, until it could predict the human move 57 percent of the time
(the previous record before AlphaGo was 44 percent).
But our goal is to beat the best human players, not just mimic them. To do
this, AlphaGo learned to discover new strategies for itself, by playing
thousands of games between its neural networks, and adjusting the
connections using a trial-and-error process known as reinforcement learning.
Of...
a***m
Posts: 5037
11
Our Nature paper published on 28th January 2016, describes the technical
details behind a new approach to computer Go that combines Monte-Carlo tree
search with deep neural networks that have been trained by supervised
learning, from human expert games, and by reinforcement learning from games
of self-play.
This sentence alone shows that MCT and RL are two separate concepts.
O**l
Posts: 12923
12
MCT is not Monte Carlo
reinforcement learning is online learning to begin with
O**l
Posts: 12923
13
Ridiculous.
Reinforcement learning is a very broad category;
UCB-guided Monte Carlo tree search is one method within it.
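For readers unfamiliar with the "UCB guided" part mentioned above, here is a minimal sketch of the standard UCB1 selection rule that such a tree search applies when choosing which child node to descend into: average reward plus an exploration bonus that shrinks as a child is visited more. The function name, argument layout, and the default constant sqrt(2) are assumptions for illustration; real MCTS implementations differ in detail.

```python
import math

def ucb1_select(counts, total_rewards, c=math.sqrt(2)):
    """Return the index of the child with the highest UCB1 score.

    counts[i]        -- how many times child i has been visited
    total_rewards[i] -- sum of rewards observed through child i
    """
    # Any unvisited child is tried first, so every count is positive below.
    for i, n in enumerate(counts):
        if n == 0:
            return i
    total = sum(counts)

    def score(i):
        mean = total_rewards[i] / counts[i]                  # exploitation
        bonus = c * math.sqrt(math.log(total) / counts[i])   # exploration
        return mean + bonus

    return max(range(len(counts)), key=score)
```

With equal visit counts the child with the better average wins; a rarely visited child can still be picked because its bonus term is large.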
z*****3
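Posts: 1793
14
Of the authors of the material you gave, Alan Fern, Dan Klein, Subbarao
Kambhampati, Raj Rao, Lisa Torrey, Dan Weld,
I have met two. They are not authoritative enough.
But the authors of the material I gave you, Richard S. Sutton and Andrew G.
Barto, are leading authorities in RL.
And the book I recommended to you,
Reinforcement Learning: An Introduction
https://webdocs.cs.ualberta.ca/~sutton/book/ebook/the-book.html
is the standard RL textbook.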
w*******d
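Posts: 59
15
=_= I already said in my third paragraph that in RL the policy and value have
to be learned... The second paragraph was only telling you that, when these
two are fixed, the mathematical model is called an MDP...
Honestly, I must have too much time on my hands to be replying to someone who
has trouble even reading...
I no longer care about people who won't even listen... You probably have no
background whatsoever in RL or decision science, have read almost none of
Google's papers on deep reinforcement learning and deep learning... and don't
know about their latest progress at NIPS last year and this year...
I just hope other readers aren't misled. Barto's book is definitely a classic
in RL. Anyone interested should read it; it gives you a solid foundation and
keeps you from going off the rails right from the start...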
I**********s
Posts: 441
16
From thread: CS board - Reinforcement Learning
Is anyone doing a PhD in reinforcement learning? How is this area? What are
the important open problems?

Posts: 1
17
From thread: Military board - Let the data speak: how deep the Yangtze is
Naval History and Heritage Command
South Dakota I (Armored Cruiser No. 9)
1902...
a*********p
Posts: 717
18
From thread: Hardware board - ThinkPad case materials
T400s/T410s/X301/X200s
Display cover: Carbon-fiber reinforced plastic(CFRP)(top), glass-fiber
reinforced plastic(GFRP) (side walls);
Base: Magnesium alloy
T510/W510
Display cover: Glass-fiber reinforced plastic (GFRP)
Base: Carbon-fiber reinforced plastic(CFRP)
X200/X200T
Magnesium alloy
R400/T400/T500/W500
Top: Super-Elastic PolyCarbonate (SEPC);
Bottom: Carbon-Fiber Reinforced Plastic(CFRP)
T410
Top: Acrylonitrile-Butadiene-Styrene (ABS) plastic;
Bottom: Carbon-Fiber Reinforced Plastic(CFRP)
W
f**d
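Posts: 768
19
From thread: Neuroscience board - eBook: From computer to brain
This is an excellent book on computational neuroscience. The full text is
copied here (figures and equations are missing) for anyone interested in
reading it.
If needed, I can share the PDF file (for personal study only, no commercial use).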
From Computer to Brain
William W. Lytton
From Computer to Brain
Foundations of Computational Neuroscience
Springer
William W. Lytton, M.D.
Associate Professor, State University of New York, Downstate, Brooklyn, NY
Visiting Associate Professor, University of Wisconsin, Madison
Visiting Associate Professor, Polytechnic University, Brooklyn, NY
Staff Neurologist, Kings County Hospital, Brooklyn, NY
In From Computer to Brain: ...

Posts: 1
20
Spoiler alert: China may lose
https://www.reddit.com/r/taiwan/comments/87i2lk/scenario_the_fourth_taiwan_
strait_crisis_of_2020/
2018 (March onward) – America and China initiate a series of tit-for-tat
trade restrictions and tariffs, gradually resulting in economic losses and
stock market drops totaling in the hundreds of billions
Mid-2019 – Somewhat overdue, the next global recession begins, mostly due
to bad debts and a round of mass layoffs originating in the Chinese economy.
Stock market crashes and so...
S*******n
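Posts: 1721
21
[ The following text is reposted from the AutismCare club ]
Posted by: Seattlian (Seattleite), Board: AutismCare
Title: The 6th Autism and Asperger's Conference: the therapy installment
Posted at: BBS 未名空间站 (Sat Nov 5 12:49:22 2011, US Eastern)
My overall impression is that this conference is very commercial: the speakers
are mainly there to sell books, equipment, and services. Outside the hall is
an exhibition selling books, equipment, medication, and services. Below are
some notes I took live at the venue.
I wish I had made an audio or video recording. Recordings are sold at http://www.usautism.org/tv/ : $99 for the whole conference, $10 per session.
1) Keynote by Raun K. Kaufman: The Autism Revolution: Breakthrough Strategies for Parents
www.autismtreatment.com Home of the Son-Rise program 1974
His main aim was to advertise their Son-Rise Program while selling books and
DVDs. One idea he mentioned is rather interesting: in the early stage of
intervention, join the autistic child in whatever wild thing they are doing.
As an example, he said that when he was little he would often spin plates at
home by himself for hours on end, and his mother would spin them with him....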
s**u
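Posts: 1436
22
From thread: Hardware board - So disappointed about my new T410
I don't see anything wrong with the T series models without the "s".
Around me there are a T400, a T500, last year's MBP, and Dell's E4300 and E6500.
The T series is still much better than Dell's business machines. The MBP
looks pretty, but it isn't light.
And I think the T series is no longer aimed at portability the way it used to
be; if you really need to work on the move, get a Ts or an X.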
T400: Top: Super-Elastic PolyCarbonate (SEPC); Bottom: Carbon-Fiber
Reinforced Plastic
T410: Top: Acrylonitrile-Butadiene-Styrene (ABS) plastic; Bottom:
Carbon-Fiber Reinforced Plastic
T500: Top: Super-Elastic PolyCarbonate (SEPC); Bottom: Carbon-Fiber
Reinforced Plastic
T510: Top: Glass-fiber reinforced plastic; Base: Carbon-fiber
reinforced plastic
T510 / W51
K**********n
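Posts: 1197
23

There is really no getting through to these liberal-arts and biology wsn who
call themselves science people. Let me give them what the Westerners
themselves say; it all comes from academic circles, with a pile of
references, one of them a report from the famous RAND Corporation.
Also, the US and Australia have passed laws restricting the use of brainwave
scanners; scanning one's own citizens with one is already a federal felony.
The hard part, though, is that you cannot prove who received your brain
signals, so having the law is the same as not having it.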
Remote Mind Control Technology

Reprinted from SECRET AND SUPPRESSED: BANNED IDEAS AND HIDDEN
HISTORY, edited by Jim Keith, $12.95, available from
1-800-680-INET.
There had been an ongoing controversy over health effects of electromagnetic
fields (EMF) for years (e.g., extremely low frequency radiation and the
Navy's Project Sea...
s***n
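Posts: 1280
24
From thread: Parenting board - 10-year-old autistic child arrested for kicking an aide
Thanks for sharing. I agree with most of your points, for instance that early
intervention matters a lot for children with high-functioning autism.
I also agree with your point that "for autistic children who don't understand
many things, positive reinforcement is sometimes far more useful than
discipline". But I want to stress the other side: for those same children,
punishment is sometimes more direct and effective than positive
reinforcement, and harmless.
Take a child who attacks other children. Positive reinforcement would say:
stop attacking and I will give you a reward. Then the child stops. To some
people, the child was rewarded for stopping the attack; but in the child's
eyes, it was attacking other children that ultimately earned the reward. This
kind of positive reinforcement may well encourage the child's aggression. By
contrast, punishing the child at the moment of the attack, whether the
punishment is positive or negative, leads the child to connect the
punishment/"harm" he suffers with the aggressive behavior that harms others,
helps him build a kind of empathy mechanism, and he will then rein in the
behavior that ends up hurting himself.
For autistic children, positive/negative reinforcement, positive/negative pu...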
a*********p
Posts: 717
25
From thread: shopping board - ThinkPad case materials (repost)
[ The following text is reposted from the Hardware board ]
Posted by: afreeshrimp (shrimp), Board: Hardware
Title: ThinkPad case materials
Posted at: BBS 未名空间站 (Thu Feb 11 14:42:28 2010, US Eastern)
T400s/T410s/X301/X200s
Display cover: Carbon-fiber reinforced plastic(CFRP)(top), glass-fiber
reinforced plastic(GFRP) (side walls);
Base: Magnesium alloy
T510/W510
Display cover: Glass-fiber reinforced plastic (GFRP)
Base: Carbon-fiber reinforced plastic(CFRP)
X200/X200T
Magnesium alloy
R400/T400/T500/W500
Top: Super-Elastic PolyCarbonate (SEPC);
Bottom: Carbon-Fibe
a*****g
Posts: 19398
26
Deep Learning Isn’t a Dangerous Magic Genie. It’s Just Math
Deep learning is rapidly ‘eating’ artificial intelligence. But let’s not
mistake this ascendant form of artificial intelligence for anything more
than it really is. The famous author Arthur C. Clarke wrote, “Any
sufficiently advanced technology is indistinguishable from magic.” And deep
learning is certainly an advanced technology—it can identify objects and
faces in photos, recognize spoken words, translate from one language to
another...
a*******o
Posts: 699
27
From thread: WaterWorld board - water
How to SUCCESSFULLY teach a baby to sleep - 3rd ed.
Group Owner
BabeGirlMom · Pass a Note!
Posted 02/11/2009
http://community.babycenter.com/post/a5417415/how_to_successful
1) WHY?
Sleep training is NOT a fix-all-solution to all sleep problems. If your baby
is not sleeping well, there are usually good reasons why.
The most common reasons are:
1) Overtiredness. All sleep problems are at least partly due to
overtiredness. Some problems may have been initiated by illness, teething, or
phases; but be...
m***r
Posts: 359
28
From thread: Programming board - Python Daily, February 2015 thread
Python Daily 2015-02-26
Produced by @好东西传送门; back issues at
http://py.memect.com
To subscribe: send a blank email to [email protected]
with the subject: 订阅Python日报
A nicer HTML version:
http://py.memect.com/archive/2015-02-26/short.html
1) [Python reinforcement learning library] by @爱可可-爱生活
Keywords: library, data science, Andrew Ng, code, machine learning
[Open source] reinforce: a "plug-and-play" reinforcement learning library for
Python. GitHub: [1]. Its implementation is based on Andrew Ng's notes [2] and
on another article about implementing reinforcement learning, "Reinforcement
Learning" [3]
[1] https://github.com/nathanepstein/reinforce
[2] http://pan.baidu.com/s/1eQcUvdC
[3] http://nathanepstein.github.io/...
a*****g
Posts: 19398
29
Deep Learning Isn’t a Dangerous Magic Genie. It’s Just Math
Deep learning is rapidly ‘eating’ artificial intelligence. But let’s not
mistake this ascendant form of artificial intelligence for anything more
than it really is. The famous author Arthur C. Clarke wrote, “Any
sufficiently advanced technology is indistinguishable from magic.” And deep
learning is certainly an advanced technology—it can identify objects and
faces in photos, recognize spoken words, translate from one language to
another...
b*******e
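Posts: 217
30
From thread: Programming board - Wang Yin: Why I don't care about artificial intelligence
Reinforcement learning is so far the closest to what you have in mind. Take a
look at deep reinforcement learning, deep Q-learning, policy-gradient-based
learning, and so on.
ML falls into three broad categories: supervised learning, unsupervised
learning, and reinforcement learning. Reinforcement learning sits between
supervised and unsupervised learning.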
p*****m
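Posts: 7030
31
From thread: Biology board - A friend successfully changed careers
I still partly agree with this. A subjective judgment of "mediocre" and the
objectively existing standards of judgment go hand in hand; to some extent
that is why I spoke of self reinforcement.
Put simply, we should first admit that every profession, scientists included,
has a sizable share of the mediocre (this is no different because of the role
of chance in basic research; at most you can say that scientists have a good
chance of escaping their own mediocrity by luck). The very fact that these
mediocre people exist shows that people have reason enough to stay in a
profession that keeps them mediocre. The reason can be good money, an easy
life, more potential opportunities, or even frequent good meals; anything
goes. What I meant in my earlier post is that these factors are relatively
weak in the profession of scientist, so convincing yourself that you are not
mediocre, or that your profession by its nature cannot be trivial, or that
you have a good chance of escaping mediocrity, becomes an important reason.
That is what I mean by self reinforcement.
The opinions you gave actually provide good evidence for this. I am not
saying that you are mediocre and narcissistic, or that every scientist who is
proud of the profession is a narcissistic mediocrity; please don't
misunderstand me. What I am discussing is a general trend,...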
K**********n
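Posts: 1197
32
From thread: EE board - Does anyone on this board understand this technology?
Original foreign-language material, rare and scarce in academia: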
Remote Mind Control Technology

Reprinted from SECRET AND SUPPRESSED: BANNED IDEAS AND HIDDEN
HISTORY, edited by Jim Keith, $12.95, available from
1-800-680-INET.
There had been an ongoing controversy over health effects of electromagnetic
fields (EMF) for years (e.g., extremely low frequency radiation and the
Navy's Project Seafarer; emissions of high power lines and video display
terminals; radar and other military and industrial sources of radio
frequencies and micr...
a*****g
Posts: 19398
33
Deep Learning Isn’t a Dangerous Magic Genie. It’s Just Math
Deep learning is rapidly ‘eating’ artificial intelligence. But let’s not
mistake this ascendant form of artificial intelligence for anything more
than it really is. The famous author Arthur C. Clarke wrote, “Any
sufficiently advanced technology is indistinguishable from magic.” And deep
learning is certainly an advanced technology—it can identify objects and
faces in photos, recognize spoken words, translate from one language to
another...
R****a
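Posts: 6858
34
From thread: Military board - US foreign policy: strategic retrenchment
US foreign policy: strategic retrenchment
Source: ognc, 2013-02-18 07:58:39
Machine-translated, so make do with it. The original follows.
The Case for a Less Activist Foreign Policy
By Barry R. Posen, January/February 2013, Foreign Affairs
Despite a decade of costly and indecisive warfare and mounting fiscal
pressures, the long-standing consensus among American policymakers about U.S.
grand strategy has remained intact. As the presidential campaign made clear,
Republicans and Democrats may argue over foreign policy at the margins, but
they agree on this: the United States should dominate the world militarily,
economically, and politically, as it has since the final years of the Cold
War, a strategy of liberal hegemony. The country, they hold, needs to
preserve its massive lead in the global balance of power, consolidate its
economic preeminence, enlarge the community of market democracies, and
maintain its influence in the international institutions it helped create.
To this end, the U.S. government has expanded its sprawling Cold War-era
network of security commitments and military bases. It has reinforced its
existing alliances, adding new members to NATO and strengthening its security
agreement with Japan. In the Persian Gulf, it has sought to protect the flow
of oil with a full array of air, sea, and ground forces, a goal that consumes
at least...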
S*********g
Posts: 24893
35
https://petitions.whitehouse.gov/petition/deport-xue-gang%E8%96%9B%E5%88%9A-
li-hanlin%E6%9D%8E%E5%90%AB%E7%90%B3-who-involved-poison-murder-case-
reinforce-background-checks-pharma/vvsq0tsC
Deport Xue Gang(薛刚) / Li Hanlin(李含琳) Who involved in a poison murder
case, Reinforce Background Checks In Pharma. Ind.
Deport all lying Chinese people.
In 1995, Zhu Ling was poisoned by Thallium salt at Tsinghua University in
China. The case was closed abruptly due to the political influence of a
suspect's family. Some peopl...
p****s
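Posts: 3184
36
For clarity, following the U.S. Army's official war history
http://www.history.army.mil/books/korea/truce/ch17.htm
pages 393-396, the fighting in late March at the Old Baldy hilltop held by the
Colombian battalion went roughly as follows:
Deployment of the 31st Regiment, U.S. 7th Division, one day before the battle:
From left to right lay Hill xx, Old Baldy, and Pork Chop Hill, held from left
to right by the 2nd, 4th (Colombian battalion), and 3rd Battalions. The 1st
Battalion was there to plug gaps in the positions of the 2nd, 4th, and 3rd
Battalions.
On the evening of March 23, the Volunteer Army's 423rd Regiment took the
surface positions on Old Baldy. The Colombian battalion suffered 20%
casualties, and the survivors were pinned down in the underground tunnels and
bunkers.
From the morning of March 24 to the evening of March 25, Companies A, B, and
C of the 1st Battalion, U.S. 31st Regiment counterattacked the surface
positions on Old Baldy, losing 300 men killed, wounded, or missing without
retaking them. They arranged for the U.S. Air Force to carry out large-scale
bombing on the morning of the 26th.
At some point between the evening of the 25th and the morning of the 26th,
before the Air Force bombing, the Americans concluded that the Volunteer Army
had abandoned the surface positions on Old Baldy to avoid the air raid (they
detected no activity there). The Colombians pinned down in the tunnels and
bunkers under Old Baldy took the chance to escape back to the line then held
by U.S. forces (UNC) on the morning of the 26th.
Original text:
The Old Baldy-Porkchop area was held by the 31st Infantry...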
d*b
Posts: 21830
37
From thread: Military board - Feels like the US has suddenly gone impotent
U.S. patrol sought to avoid provocation, not reinforce China island claim:
officials
WASHINGTON | By Andrea Shalal and David Brunnstrom
Subi reef, located in the disputed Spratly Islands in the South China Sea,
is shown in this handout Center for Strategic and International Studies (
CSIS) Asia Maritime Transparency Initiative satellite image taken September
3, 2015 and released to Reuters October 27, 2015. REUTERS/CSIS Asia Maritime
Transparency Initiative/DigitalGlobe/Handout via Reuters
Subi ...
G*******n
Posts: 6889
38
http://der-fuehrer.org/reden/deutsch/Weisungen/1942-08-18.htm
Führer Headquarters, 18 Aug. 1942
The Führer
OKW/WFSt/Op. No. 002821/42 g.K.
Top Secret (Geheime Kommandosache)
30 copies
Copy No. 24
Directive No. 46 for the Conduct of the War
Guidelines for intensified action against banditry in the
East
A) General
I.) Banditry in the East has in recent months reached a scale that is no
longer tolerable and threatens to become a serious danger to
the supply of ...
l****z
Posts: 29846
39
President George W. Bush wasn’t unintelligent. In fact, he was brilliant…
Joshua Riddle
March 25, 2014 11:08 pm
Keith Hennessey, professor at Stanford Business School, wanted to set the
record straight about President Bush’s intellect. He writes:
I teach a class at Stanford Business School titled “Financial Crises in the
U.S. and Europe.” During one class session while explaining the events of
September 2008, I kept referring to the efforts of the threesome of Hank
Paulson, Ben Bernanke, and Tim...
f*******e
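Posts: 3433
40
One paper. Most of them just carry the advisor's name, and they do nothing to
show an ability to do independent research. I know a professor in the US who
publishes more than five papers a year in Nature sub-journals.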
22. Lei Fan, Houlong L. Zhuang, Kaihang Zhang, Valentino R. Cooper, Qi Li,
and Yingying Lu, Chloride-Reinforced Carbon Nanofiber Host as Effective
Polysulfide Traps in Lithium–Sulfur Batteries, Adv. Sci. 2016, 1600175
21. Qi Li, Juner Chen, Lei Fan, Xueqian Kong, Yingying Lu, Progress in
electrolytes for rechargeable Li-based batteries and beyond, Green Energy &
Environment 1 (2016) 18-42
20. Zheng Liang, Dingchang Lin, Jie Zhao, Zhenda L...
s****y
Posts: 3416
41
From thread: Family board - When two people don't click, even half a sentence is too much
Good that you are checking out psychology.
Shaping is somewhat different from what the films you looked at called
positive reinforcement.
Shaping is probably more gradual.
For instance, if you want your husband to cook, you have to reinforce him
step by step.
Just as an example, you give him attention once he walks around the area
in the kitchen.
And when he starts to touch the counter, you give him a hug.
Give him praise once he "happens" to use the stove.
Before you know it, he will be cooking a...
b****5
Posts: 25
42
This is the foundation inspection report I received:
Inspection Overview:
This inspection included a visual observation of the perimeter foundation.
It should also be noted that
this inspection consisted of a surface inspection only. No detailed
inspection or testing of underlying soils
and geologic conditions was performed for this inspection. Subsurface
conditions can be quite different
from those suggested by surface environment. The original construction
drawings, soils report and
building permit were not av...
w*******d
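Posts: 59
43
Monte Carlo tree search only brought computer Go up to about amateur 6-dan
level (ZenGo, for example). Combining it with deep reinforcement learning
then lifted amateur 6-dan to professional 9-dan. Compared with earlier work,
AlphaGo replaced the decision functions and evaluation functions in
reinforcement learning with neural networks, whereas previously most of these
functions were linear. The space of nonlinear functions is much larger and
the evaluation function much more accurate, so it is bound to beat the linear
version. But people used to stick with linear functions rather than neural
nets because earlier theoretical work showed that nonlinear functions have
convergence problems, and many who tried failed. David Silver and colleagues
were probably the first to succeed, using a lot of tricks and hacky methods
to get the algorithm to converge, and that is what made it so impressive. The
gap between Go and chess is not just a matter of computing power: with the
same strategy IBM used, the growth in computing power still could not keep up
with the growth in complexity from chess to Go. The results of both Monte
Carlo tree search and deep reinforcement learning were crucial.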
u*******r
Posts: 2855
44
From thread: EB23 board - Giving up a green card has a waiting queue
of coz I agree with you that tax is super critical and Mei Di will be able
to reinforce this in US
the question is can it reinforce its policies in other countries in the
world? If so, Mei Di is still the only super power in the world ah...
No other country can reinforce its policies like Mei Di does...
p****x
Posts: 1346
45
SAN FRANCISCO (Reuters) – California will experience unthinkable damage
when the next powerful quake strikes, probably within 30 years, even though
the state prides itself on being on the leading edge of earthquake science.
Modern skyscrapers built to the state's now-rigorous building codes might
ride out the big jolt that experts say is all but inevitable, but the
surviving buildings will tower over a carpet of rubble from older structures
that have collapsed.
Hot desert winds could fan fires t...
d*2
Posts: 2053
46
http://news.yahoo.com/s/ap/20110202/ap_on_re_mi_ea/ml_egypt
By HADEEL AL-SHALCHI, Associated Press Hadeel Al-shalchi, Associated Press
– 10 mins ago
CAIRO – Supporters of President Hosni Mubarak charged into Cairo's central
square on horses and camels brandishing whips while others rained firebombs
from rooftops in what appeared to be an orchestrated assault against
protesters trying to topple Egypt's leader of 30 years. Three people died
and 600 were injured.
The protesters accused Mubarak's re...
t**********g
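Posts: 3388
47
[ The following text is reposted from the Military board ]
Posted by: StephenKing (金博士), Board: Military
Title: White House petition: deport Xue Gang, Li Hanlin, and all lying Chinese people
Posted at: BBS 未名空间站 (Sun May 19 11:28:49 2013, US Eastern)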
https://petitions.whitehouse.gov/petition/deport-xue-gang%E8%96%9B%E5%88%9A-
li-hanlin%E6%9D%8E%E5%90%AB%E7%90%B3-who-involved-poison-murder-case-
reinforce-background-checks-pharma/vvsq0tsC
Deport Xue Gang(薛刚) / Li Hanlin(李含琳) Who involved in a poison murder
case, Reinforce Background Checks In Pharma. Ind.
Deport all lying Chinese people.
In 1995, Zhu Ling was poisoned by Th...
c******a
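Posts: 4400
48
From thread: SanFrancisco board - This is the most effective way to oppose AA; the white folks get it
Everyone should grasp the central idea: at bottom this is not about Asian
American interests, it is that AA hurts the very people it is meant to help
(just as we think). That's great. Try not to steer things toward SCA5. In
California this is the only way you can put it.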
The Painful Truth About Affirmative Action
Richard Sander and Stuart Taylor Jr. Oct 2 2012, 10:30 AM ET
Why racial preferences in college admissions hurt minority students -- and
shroud the education system in dishonesty.
Affirmative action in university admissions started in the late 1960s as a
noble effort to jump-start racial integration and foster equal opport...
h*******g
Posts: 2201
49
I think it is a perfect example of incompetent law reinforcement personal.
They can't reinforce the law on criminals to guarantee the society security,
so all they can do is to reinforce the law against good citizens to
guarantee less guns on the street. Sad.