Question: k-means clustering, decision tree, or stratified sampling? - Statistics board - 未名存档 (MITBBS archive)

This page is an excerpt and archive of the corresponding thread on 未名空间 (MITBBS).


s********r posts: 297	1 Hi everyone, I'd like to ask for help with a problem. Thanks in advance! My boss asked me to run an A/B test, but there is no control group or test group. All we have is a pre-test population group and a post-test population group. Because only a small fraction of people were tested, the plan is to use the post-test population group as the test group and the pre-test population group as the control group. To make sure we are comparing apples with apples, my boss wants me to trim the pre-test population group along different attributes/dimensions, such as income level, age group, and activity engagement, so that it resembles the post-test population group. In the end, the trimmed pre-test group has to be similar to the post-test population group on every attribute and pass a t-test. In other words, a t-test should show no statistically significant difference between the trimmed pre-test group and the post-test population group on any attribute. I looked at the two population groups, and their distributions differ a lot on every attribute. How should I trim the pre-test group? Should I use k-means clustering, a decision tree, or stratified sampling? Thanks, everyone!
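The balance check described in the question (a t-test on each attribute between the trimmed pre-test group and the post-test group) could be sketched roughly as follows. This is only an illustration under stated assumptions: the field names `age` and `income` and the toy values are made up, records are assumed to be plain dicts with numeric attributes, and the p-value uses a normal approximation to Welch's t statistic, which is only reasonable for fairly large groups.

```python
import math
from statistics import NormalDist, mean, variance


def welch_z_test(a, b):
    """Welch t statistic for two samples, with a two-sided p-value
    taken from the normal approximation (large-sample assumption)."""
    se = math.sqrt(variance(a) / len(a) + variance(b) / len(b))
    t = (mean(a) - mean(b)) / se
    p = 2.0 * (1.0 - NormalDist().cdf(abs(t)))
    return t, p


def balance_report(pre, post, attributes, alpha=0.05):
    """Return {attribute: (t, p, balanced)} for two lists of dict records.

    `balanced` is True when the test does NOT reject equal means at `alpha`.
    """
    report = {}
    for attr in attributes:
        t, p = welch_z_test([r[attr] for r in pre], [r[attr] for r in post])
        report[attr] = (t, p, p >= alpha)
    return report


# Hypothetical toy data: field names and values are invented for illustration.
pre_group = [{"age": a, "income": i} for a, i in
             [(25, 40), (31, 52), (38, 61), (44, 70), (52, 83), (59, 95)]]
post_group = [{"age": a, "income": i} for a, i in
              [(27, 43), (33, 50), (36, 64), (45, 68), (50, 85), (58, 92)]]

report = balance_report(pre_group, post_group, ["age", "income"])
for attr, (t, p, ok) in report.items():
    print(f"{attr}: t={t:.3f}, p={p:.3f}, balanced={ok}")
```

Running the report after each trimming attempt shows which attributes still differ. A non-significant result only means the test found no evidence of a difference in means; it does not prove the two distributions are the same.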
h*****m posts: 955	2 Match on the attributes.
s********r posts: 297	3 Thank you very much for the reply above. Could you be more specific? Thanks a lot!
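s**e posts: 294	4 I did this a long time ago for epi data. As I recall, the idea was to match cases and controls: write a SAS macro that, for each case, finds one or more essentially identical people in the control data. For example, sex, race, and smoking history have to be exactly the same, while a continuous variable like age is allowed to differ by two or three years. Done that way, the final t-test is not a problem. It should be even easier in SQL. Just one approach for your reference.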
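The matching idea in this reply (exact agreement on categorical attributes, a tolerance of a few years on age) might look something like the sketch below. It is not the SAS macro the poster mentions, just a minimal greedy version in Python. The function name `match_controls`, the field names, and the toy records are all hypothetical. In the original question, the "cases" would be the post-test group and the "controls" the pre-test group.

```python
def match_controls(cases, controls, exact_keys, caliper_key, caliper, ratio=1):
    """Greedy matching without replacement.

    For each case, pick up to `ratio` unused controls that agree exactly on
    `exact_keys` and differ by at most `caliper` on `caliper_key`,
    preferring the closest ones. Returns {case_index: [control_indices]}.
    """
    used = set()
    matches = {}
    for ci, case in enumerate(cases):
        candidates = [
            (abs(ctrl[caliper_key] - case[caliper_key]), idx)
            for idx, ctrl in enumerate(controls)
            if idx not in used
            and all(ctrl[k] == case[k] for k in exact_keys)
            and abs(ctrl[caliper_key] - case[caliper_key]) <= caliper
        ]
        candidates.sort()  # closest control first, ties broken by index
        chosen = [idx for _, idx in candidates[:ratio]]
        used.update(chosen)
        matches[ci] = chosen
    return matches


# Hypothetical toy records, invented for illustration.
cases = [
    {"sex": "F", "smoker": False, "age": 34},
    {"sex": "M", "smoker": True, "age": 51},
    {"sex": "F", "smoker": True, "age": 40},
]
controls = [
    {"sex": "F", "smoker": False, "age": 36},
    {"sex": "F", "smoker": False, "age": 33},
    {"sex": "M", "smoker": True, "age": 58},
    {"sex": "M", "smoker": True, "age": 49},
    {"sex": "M", "smoker": False, "age": 51},
]

matches = match_controls(cases, controls, ["sex", "smoker"], "age", caliper=3)
trimmed_controls = [controls[i] for ids in matches.values() for i in ids]
print(matches)
```

Cases with no eligible control end up with an empty list, so in practice they would be dropped from the comparison too, or the tolerance widened. Greedy matching depends on the order of the cases; that is a simplification of this sketch.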
h*****m posts: 955	5 Try 'matchit' (from the MatchIt package) in R. For SAS, I think a macro will be needed to do it well. SAS may also have some simple procedures, but I think R does this well.
s********r posts: 297	6 Thanks to everyone above for the replies.
