Ran into an expert today - Programming board - 未名存档 (Weiming Archive)


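c***d  Posts: 996	1
First, watch this video: http://www.youtube.com/watch?v=hEqQMLSXQlY

The replication strategy of a system he leads looked to me like it overlaps with HDFS, so I asked him why he didn't just build it on top of HDFS. His explanation went roughly like this:

HDFS metadata management is in fact a single point of failure. A backup namenode does help, but that is not the root of the problem. The root of the problem is that replication information in distributed storage should not be resolved by lookup; it should come from a reasonably robust hash function.

Wouldn't that make the system inelastic? That is a real concern. The key is that the system itself needs a controller that runs the elastic logic and always has an up-to-date view of the whole system. When a node joins or fails, the elastic logic controller computes the ideal distribution for the current state and starts moving blocks. Moving a block is copy and delete, and while moves are in progress the system serves in a near-optimal state. Failures and additions never stop, so these moves never stop either, and the traffic design should account for them from the start. The gateway node gets the current block availability information from the elastic logic controller and partitions requests according to the predesigned hash function.

Clearly neither the gateway nodes nor the storage nodes are single points of failure. Could the elastic logic controller become one? No, because its only job is to collect information from the storage nodes and publish it to the gateway nodes, so it is stateless too. The only things that hold state are the actual storage nodes...

Why isn't Hadoop designed this way? Because GFS also has a metadata node that does lookup. And why does GFS use central metadata management? Because it has to present a virtual file system interface.

I think what he said makes a lot of sense. I read the EC2 failure report a couple of days ago, and EC2's recovery seems to be designed this way too, though its traffic design has problems. I can't remember how Cassandra is designed; I'll take another look later.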
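A minimal sketch of "hash function instead of lookup", to make the idea in this post concrete. The post does not say which hash function the system uses; rendezvous (highest-random-weight) hashing is swapped in here purely as an illustration, and the names `score`, `replicas_for`, and the node and block IDs are hypothetical. The gateway needs only the list of live nodes (which, per the post, would come from the elastic logic controller) and can compute replica locations itself, with no metadata table.

```python
import hashlib


def score(node: str, block_id: str) -> int:
    """Deterministic pseudo-random weight for a (node, block) pair."""
    digest = hashlib.sha256(f"{node}|{block_id}".encode()).digest()
    return int.from_bytes(digest[:8], "big")


def replicas_for(block_id: str, live_nodes, k: int = 3):
    """Return the k live nodes with the highest score for this block.

    Assumption: rendezvous hashing stands in for the unspecified
    "robust hash function" from the post.
    """
    ranked = sorted(live_nodes, key=lambda n: score(n, block_id), reverse=True)
    return ranked[:k]


# Hypothetical cluster state, as a gateway might receive it from the controller.
live = ["node-a", "node-b", "node-c", "node-d", "node-e"]
before = replicas_for("block-42", live)

# Simulate the failure of the top-ranked replica holder.
failed = before[0]
after = replicas_for("block-42", [n for n in live if n != failed])

print("before:", before)
print("after: ", after)
```

Because each node's score for a block does not depend on the other nodes, removing one replica holder leaves the other two in place and promotes exactly one new node, so only one copy of the block has to be created.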
r*********r  Posts: 3195	2
Nice, nice. This area is still pretty hot. Everyone relies heavily on EC2 these days, yet people don't understand EC2's design well enough.
c***d  Posts: 996	3
(In reply to c***d, post 1)

I went back and read the aws 65648 report, and it cracked me up:

"Two factors caused the situation in this EBS cluster to degrade further during the early part of the event. ... ... There was also a race condition in the code on the EBS nodes that, with a very low probability, caused them to fail when they were concurrently closing a large number of requests for replication. In a normally operating EBS cluster, this issue would result in very few, if any, node crashes; however, during this re-mirroring storm, the volume of connection attempts was extremely high, so it began triggering this issue more frequently. Nodes began to fail as a result of the bug, resulting in more volumes left needing to re-mirror. This created more “stuck” volumes and added more requests to the re-mirroring storm."

Then, at 5:30am:

"... ... As more EBS nodes continued to fail because of the race condition described above, the volume of such negotiations with the EBS control plane increased. ... ..."

And finally:

"Finally, we have identified the source of the race condition that led to EBS node failure. We have a fix and will be testing it and deploying it to our clusters in the next couple of weeks."

A bug is a bug. All the long-winded hedging only makes me less confident in them. Add the API access control issue from earlier, and EC2's development process looks a bit ad hoc too :-P

Overall, EBS looks about the same as Cassandra's gossip-based replication. I wonder what further considerations go into choosing between this kind of peer-ring design and a hierarchical one.
z***e  Posts: 5393	4
(In reply to c***d, post 1)

A single point is simple to implement, and when there is a bug it is easy to find, right? If you want elasticity, then every time a new node joins you have to recompute everything and do the copy & delete as well. If a bug shows up there, it probably won't be fixed in a week; more like a month...
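To put a rough number on the "recompute, then copy & delete" cost raised here, the following is a small sketch of what a controller's move plan could look like when one node joins. It is not the system from post 1: it assumes one replica per block and rendezvous hashing for placement, and `owner`, `plan_moves`, and the node and block names are hypothetical.

```python
import hashlib


def owner(block_id: str, nodes) -> str:
    """Single-replica placement: the highest-scoring node owns the block."""
    def score(node: str) -> int:
        digest = hashlib.sha256(f"{node}|{block_id}".encode()).digest()
        return int.from_bytes(digest[:8], "big")
    return max(nodes, key=score)


def plan_moves(block_ids, old_nodes, new_nodes):
    """List (block, source, destination) for every block whose owner changes."""
    moves = []
    for block_id in block_ids:
        src = owner(block_id, old_nodes)
        dst = owner(block_id, new_nodes)
        if src != dst:
            # A move means: copy the block to dst, then delete it from src.
            moves.append((block_id, src, dst))
    return moves


blocks = [f"block-{i}" for i in range(10_000)]
old = [f"node-{i}" for i in range(9)]
new = old + ["node-9"]  # one node joins a nine-node cluster

moves = plan_moves(blocks, old, new)
moved_fraction = len(moves) / len(blocks)
print(f"{len(moves)} of {len(blocks)} blocks move ({moved_fraction:.1%})")
```

Under these assumptions every planned move targets the new node, and the share of blocks that move is close to 1/10, so the recomputation is cheap and the copy traffic is bounded. Whether that makes the bugs any easier to find is a separate question.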
