Is the paper for the StarCraft work out yet? - Programming board - 未名存档





c*******v (posts: 2599)	1
Last time, the OpenAI people said their Dota bot was genetic algorithm + RL:
“At the beginning it is worth noting that OpenAI’s artificial intelligence learns to play with itself. All the strategies noted by the researchers are the result of many hours of sessions, during which two independent instances are fighting each other. One of them is still learning and the other one is blocked. When the learning bot achieves an advantage, it is cloned and the researchers continue the process. The genetic algorithms work underneath all the time, which on the basis of the results achieved determine which behaviours bring the intended effect, and which are meaningless and translate into a failure. In the following video, OpenAI employees present strategies that their artificial intelligence used when playing with real Dota 2 players”
Has anything been published about DeepMind's approach yet?
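The loop in the quote (one instance learning, one frozen, clone on advantage) can be sketched roughly as below. This is only a toy illustration and not OpenAI's implementation: the scalar "skill", the logistic match model, the random hill-climbing step that stands in for real RL, and every name and threshold here are assumptions made up for the example.

```python
import math
import random


def win_probability(skill_a, skill_b):
    """Toy match model (assumption): chance that A beats B, logistic in the skill gap."""
    return 1.0 / (1.0 + math.exp(-(skill_a - skill_b)))


def win_rate(skill_a, skill_b, games, rng):
    """Play `games` simulated matches and return A's fraction of wins."""
    p = win_probability(skill_a, skill_b)
    wins = sum(1 for _ in range(games) if rng.random() < p)
    return wins / games


def self_play(rounds=200, games=200, clone_threshold=0.6, seed=0):
    rng = random.Random(seed)
    learner = 0.0  # the instance that is still learning
    frozen = 0.0   # the "blocked" instance: a fixed snapshot
    clones = 0
    for _ in range(rounds):
        # Stand-in for the learning step (not real RL): try a random
        # perturbation and keep it only if it scores better against
        # the frozen opponent.
        candidate = learner + rng.gauss(0.0, 0.3)
        if win_rate(candidate, frozen, games, rng) > win_rate(learner, frozen, games, rng):
            learner = candidate
        # Once the learner shows a clear advantage, clone it as the new opponent.
        if win_rate(learner, frozen, games, rng) >= clone_threshold:
            frozen = learner
            clones += 1
    return learner, frozen, clones


if __name__ == "__main__":
    learner, frozen, clones = self_play()
    print(f"learner={learner:.2f} frozen={frozen:.2f} clones={clones}")
```

The frozen copy gives the learner a stationary target; replacing it only after a clear win-rate margin keeps the opponent from changing on noise alone.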
C*****l (posts: 1)	2
https://deepmind.com/blog/alphastar-mastering-real-time-strategy-game-starcraft-ii/
It's on DeepMind's own site, but without many implementation details. The compute cost is enormous: they ran a league of different agents, with 16 TPUs behind each agent. This version may still have flaws.

【 c*******v wrote: 】
: Last time, the OpenAI people said their Dota bot was genetic algorithm + RL:
: “At the beginning it is worth noting that OpenAI’s artificial intelligence
: learns to play with itself. All the strategies noted by the researchers are
: the result of many hours of sessions, during which two independent
: instances are fighting each other. One of them is still learning and the
: other one is blocked. When the learning bot achieves an advantage, it is
: cloned and the researchers continue the process. The genetic algorithms work
: underneath all the time, which on the basis of the results achieved
: determine which behaviours bring the intended effect, and which are
: meaningless and translate into a failure. In the following video, OpenAI
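C*****l (posts: 1)	3
DeepMind put a short paper on arXiv. It mentions that the outer loop is a Lamarckian algorithm, which amounts to a genetic-algorithm-style training process. Each agent itself is trained with backpropagation (BP), and agents pick up the winner's network weights and hyperparameters. There are not many other details, and the neural network architecture is not disclosed.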
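A rough sketch of the two-level scheme described in this post: gradient descent inside each agent, with an outer loop in which the worst agent inherits the winner's trained weights and hyperparameters. Inheriting learned weights is what makes it "Lamarckian". Everything concrete here is an assumption for illustration: the one-dimensional quadratic loss, the single learning-rate hyperparameter, the mutation factors, and all names. It says nothing about DeepMind's actual setup, which the post notes is undisclosed.

```python
import random


def loss(w):
    """Toy objective (assumption) standing in for 'how badly the agent plays'."""
    return (w - 3.0) ** 2


def grad(w):
    """Gradient of the toy loss with respect to the weight."""
    return 2.0 * (w - 3.0)


def train_population(pop_size=6, generations=30, inner_steps=10, seed=0):
    rng = random.Random(seed)
    # Each agent carries its own weight and its own hyperparameter (learning rate).
    population = [
        {"w": rng.uniform(-5.0, 5.0), "lr": 10 ** rng.uniform(-2.0, -1.0)}
        for _ in range(pop_size)
    ]
    for _ in range(generations):
        # Inner loop: every agent learns by ordinary gradient descent.
        for agent in population:
            for _ in range(inner_steps):
                agent["w"] -= agent["lr"] * grad(agent["w"])
        # Outer loop: rank agents by loss, then let the worst inherit from the best.
        population.sort(key=lambda a: loss(a["w"]))
        winner, loser = population[0], population[-1]
        loser["w"] = winner["w"]  # inherit the winner's learned weight
        loser["lr"] = winner["lr"] * rng.choice([0.8, 1.2])  # inherit and mutate the hyperparameter
    # Return the agents ordered from best to worst.
    return sorted(population, key=lambda a: loss(a["w"]))


if __name__ == "__main__":
    for agent in train_population():
        print(f"w={agent['w']:.4f} lr={agent['lr']:.4f} loss={loss(agent['w']):.2e}")
```

Because the loser starts from the winner's weights instead of from scratch, a hyperparameter mutation is judged on how well it continues training, which is the pattern usually called population-based training.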
