Page 3 - Roundup of discussions on "stratified" - 话题女王

m*******1
Posts: 328

From thread: MedicalCareer board - 2015 and step score.

How many people are unmatched? Have you taken a look at their credentials?
They did not report in Medi at all. I am pretty sure that the mean and SD
in matched group are higher than those of the unmatched. Of course, score
is not everything. Someone with score<200 can match with a stroke of luck.
But the vast majority with average scores (<210-220) will end up with
nothing. Of course, it is simply not fair to look at proportions of the
unmatched IMGs only (60% at this time). The unmatched shou...

m*******1
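Posts: 328

From thread: MedicalCareer board - AAMC Concerned about Reports of Unmatched Students !!!

In fact, there are many weak students among US med students. Med school admissions in particular
are not very fair from the start: to a large extent, candidates are stratified by ethnicity. But
society is relatively fair. If you can't cut it, you can't cut it, and those who got in through
preferential treatment are eliminated at this level.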

p*x
Posts: 260

From thread: Quant board - A question related to stocks

For equity prices, Brownian motion by no means produces good results.
A possible improvement would be a Normal Inverse Gaussian process coupled with
the Inverse Gaussian Bridge technique.
In terms of MC simulation, you could further improve your results
with the stratified sampling method.
Good luck.
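
As a rough illustration of the stratified sampling idea for MC simulation (a minimal sketch, not the poster's code): split (0, 1) into n equal-probability strata, draw one uniform inside each, and map it through the inverse normal CDF. The function name `stratified_normals` and the stdlib-only approach are assumptions made for this example.

```python
import random
from statistics import NormalDist


def stratified_normals(n, rng):
    """Return n standard-normal draws, one from each of n equal-probability strata."""
    std_normal = NormalDist()
    draws = []
    for i in range(n):
        # Uniform draw restricted to the i-th stratum [i/n, (i+1)/n).
        u = (i + rng.random()) / n
        # Keep u strictly inside (0, 1), where the inverse CDF is defined.
        u = min(max(u, 1e-12), 1.0 - 1e-12)
        draws.append(std_normal.inv_cdf(u))
    return draws


rng = random.Random(0)
z = stratified_normals(1000, rng)
sample_mean = sum(z) / len(z)
```

Because every stratum is hit exactly once, the draws cover the distribution evenly, so the sample mean of `z` is typically much closer to 0 than with 1000 plain random draws.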

x********o
Posts: 519

From thread: Quant board - JP 2nd round phone interview

antithetic, control,
stratified sampling,
what's the fourth one?

s********r
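Posts: 529

From thread: Quant board - Simulating geometric Brownian motion with a long horizon and high volatility

I recently ran into a fairly tricky problem simulating geometric Brownian motion. I wonder whether
anyone on the board has a good solution and could give me some pointers. The problem is as follows:
I need to simulate the dynamics of a stock. The process itself is not difficult; it is a simple lognormal process
\dfrac{dS_t}{S_t} = \sigma\, dW_t, \quad S_0 = 1
But in the actual implementation I ran into many problems. When I set the simulation horizon long, say 25 years,
and the volatility high, say around 0.9, the simulated results are poor: the number of simulations has to
reach the order of 10^5 or even 10^6 before the sample mean has a chance of getting close to 1, the theoretical
mean. With even slightly fewer simulations, the sample mean is badly off.
I thought of using stratified sampling to improve the simulation. So far I have only used the most naive
equal-split grouping, and the effect is not great either; the results are not much different from plain sampling.
The only improvement I can think of now is a pilot run to try to optimize the strata, but I feel this
would not bring much progress. Does anyone have a good method to get the estimate close to 1 with
relatively few simulations?
Thanks a lot, everyone!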
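
A short sketch of why the sample mean converges so slowly here. For this driftless process, S_T = exp(-sigma^2 T / 2 + sigma W_T) is lognormal with E[S_T] = 1 and Var[S_T] = exp(sigma^2 T) - 1. The helper names below are hypothetical and only spell out that arithmetic; they are not a fix for the simulation.

```python
import math


def gbm_terminal_std(sigma, horizon):
    """Standard deviation of S_T for dS/S = sigma dW, S_0 = 1 (so E[S_T] = 1)."""
    # S_T is lognormal with Var[S_T] = exp(sigma^2 * T) - 1.
    return math.sqrt(math.expm1(sigma ** 2 * horizon))


def paths_needed(sigma, horizon, target_se):
    """Plain Monte Carlo paths needed for the sample mean to reach a given standard error."""
    variance = math.expm1(sigma ** 2 * horizon)
    return math.ceil(variance / target_se ** 2)


sigma, horizon = 0.9, 25.0
std_terminal = gbm_terminal_std(sigma, horizon)
median_terminal = math.exp(-0.5 * sigma ** 2 * horizon)
n_for_se_0_1 = paths_needed(sigma, horizon, 0.1)
```

With sigma = 0.9 and T = 25, sigma^2 T = 20.25, so the terminal standard deviation is on the order of 10^4 while the median is on the order of 10^-5. The mean is carried by rare, very large paths, which is consistent with the slow convergence described in the post.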

A*******s
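Posts: 3942

From thread: Quant board - A question about using the EM algorithm to handle missing data

For missing imputation, you first have to check whether the missingness is "not observable" or "not applicable".
If it is the latter, missing imputation is not well defined; better to do a stratified
analysis.
Of course, in practice many people just go ahead regardless, and as long as it fits the data you can't really call it wrong.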

j*****e
Posts: 182

From thread: Statistics board - [Collection] how to randomly draw 10% sample from a data set?

Suppose your data has 1000 observations and you want to draw a sample of 100. Use
the following SAS code:
proc surveyselect data=dataname method=urs SAMPSIZE=100 rep=1 out=sample
seed=1594 outhits;
run;
Note that method=urs samples with replacement; method=srs draws without replacement.
There are other methods to randomly select observations (with or without
replacement, stratified sampling, clustered sampling, PPS sampling, etc.).
Read the SAS help for more detail.

e******o
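Posts: 644

From thread: Statistics board - Experts, please help with a very small statistics question

This is simply drawing 10 samples from 1-200 without replacement. The R code is:
x <- 1:200
y <- sample(x, size = 10, replace = FALSE, prob = NULL)
Representativeness is not really an issue for these 10 components, unless the population is clearly stratified;
in that case you need a stratified sample.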
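
For the stratified case, a minimal sketch of proportional stratified sampling without replacement. The split of the 200 components into two batches and the names `stratified_sample`, `batch_A`, and `batch_B` are hypothetical, chosen only to illustrate the idea.

```python
import random


def stratified_sample(strata, total_size, rng):
    """Draw without replacement from each stratum, proportional to stratum size.

    `strata` maps a stratum label to the list of unit ids in that stratum.
    Rounding may make the total differ slightly from `total_size`.
    """
    population = sum(len(units) for units in strata.values())
    sample = {}
    for label, units in strata.items():
        k = round(total_size * len(units) / population)
        sample[label] = rng.sample(units, k)
    return sample


# Hypothetical split of components 1..200 into two batches.
strata = {"batch_A": list(range(1, 121)), "batch_B": list(range(121, 201))}
picked = stratified_sample(strata, 10, random.Random(1))
```

Here 6 of the 10 units come from the larger batch and 4 from the smaller one, matching their 120:80 share of the population.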

l*******n
Posts: 19

From thread: Statistics board - Does this still count as a randomized sample?

"Random sample" and "Randomized design" are different concepts.
If there is no treatment intervention, you have an observational study.
Your samples (old or not) are still random.
If there is a treatment intervention, the subgroup design (old people) may
not be randomized even if your overall design is randomized, when your design
was not stratified by age (with old as a stratum).

b*******r
Posts: 152

From thread: Statistics board - Let's talk about a few ways to randomly shuffle data

or maybe stratified.

h********e
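Posts: 15

From thread: Statistics board - What to do with a boss like this

I'm not in statistics, but I have to use a lot of it. I'm not sure this is the right place to post; if not,
please help move it to the right board.
I'm still in school, and my advisor turns out to be very difficult. She has a new idea every day, never trusts
the facts, and wants to get the results she wants through stratification. Under her torment I have done all
kinds of stratification and analyzed the data more than ten times, and no matter how she tries, she cannot get
the result she wants. A few times I thought what she said was not quite accurate and voiced my own views,
and she got very angry and was very harsh for a while afterwards. I had planned to just hold out until
graduation, but recently she wanted a 3-way interaction. As expected, the result is consistent with stratifying
first and then fitting a 2-way interaction. But she simply refuses to believe this result and keeps asking me
to confirm it again. In fact she has no idea how to do a 3-way interaction, because when I showed her the
results earlier she asked whether I had used some other software and why she could not reproduce it (in
Stata). I asked her why she is fixated on such a result, since 3-way interactions are rare and hard to
interpret. She did not answer and just told me to do what she asked first, but I have already done everything
I can. And in fact the data anal
...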

l*********s
Posts: 5409

From thread: Statistics board - What to do with a boss like this

There are some possible options, such as time-dependent covariates, a
stratified proportional hazards model, or a discrete-time logistic model.
Ask the statistics students/professors at your school; usually they would
not mind doing some consulting work in exchange for authorship.

w***4
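Posts: 1205

From thread: Statistics board - How to set up svyset in Stata

I have never worked with survey data before and would like to ask the experts here how to set up svyset.
The problem is as follows:
This is a four-stage stratified random sample: first counties are selected, then
townships, then villages, and finally individuals.
For counties, there are 70 in total. After excluding two remote ones, 15 are selected from the remaining 68:
the 68 counties are ranked by per-capita income, and then one is drawn every 4 counties. The starting point
is generated at random, and sampling starts from the 3rd county.
The 15 selected counties contain 337 townships in total. These 337 townships are ranked by per-capita income,
and 32 townships are drawn from them.
The 32 selected townships contain 500 villages in total. 60 villages are drawn using a method similar to the above.
These 60 villages contain 6700 eligible subjects in total. 20 are then drawn within each village as the
final survey respondents.
How should svyset be set up in Stata for this?
Also, in the second wave of the survey, some respondents could no longer be surveyed for various reasons.
When analyzing the second-wave data, does svyset need to be set up again?...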
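
As a side calculation (not the svyset answer itself), here is a rough sketch of how a design weight could be derived from the stage counts in the post. It assumes equal-probability selection within each stage, which the post does not confirm; the function name `design_weight` and the village with 100 eligible people are hypothetical.

```python
def design_weight(stage_fractions):
    """Inverse of the overall inclusion probability across sampling stages."""
    probability = 1.0
    for selected, available in stage_fractions:
        probability *= selected / available
    return 1.0 / probability


# Stage counts quoted in the post; the last stage uses a hypothetical
# village with 100 eligible people, of whom 20 are drawn.
stages = [(15, 68), (32, 337), (60, 500), (20, 100)]
weight = design_weight(stages)
```

Under these assumptions each respondent in such a village would stand for roughly two thousand people; a village with a different number of eligible people would get a different weight.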

h*****o
Posts: 47

From thread: Statistics board - StatXact question

I am looking at the manual for StatXact6. On Page 331, it talks about using
two procedures for s 2 by 2 tables. The title of the chapter is Stratified 2
by 2 contingency tables.

y*****n
Posts: 5016

From thread: Statistics board - Urgent help with an interview question!

Yes and no; it depends on the existing model data (previous campaign data).
If the previous campaign data is believed to be a random sample of the
population or a stratified sample with known weights, then you probably don't
have to worry about reject inference…

s*r
Posts: 2757

From thread: Statistics board - main effect not significant, interaction significant

stratified analysis

F****n
Posts: 3271

From thread: Statistics board - help! a q about sampling--------- Thanks

You need to sample each state if you want to capture state-based variation.
Just treat it as a stratified sampling. You can over-sample states with
small number of cases.
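
A minimal sketch of oversampling small strata, assuming proportional allocation with a minimum count per state. The state names, case counts, and the floor of 30 are hypothetical; the weights show how the oversampling can be undone when estimating overall quantities.

```python
import math


def allocate_with_floor(stratum_sizes, total_sample, floor):
    """Proportional allocation, but every stratum gets at least `floor` units."""
    population = sum(stratum_sizes.values())
    allocation = {}
    for label, size in stratum_sizes.items():
        proportional = math.floor(total_sample * size / population)
        # Oversample small strata up to the floor, never beyond the stratum size.
        allocation[label] = min(size, max(floor, proportional))
    return allocation


def stratum_weights(stratum_sizes, allocation):
    """Weight = stratum size / sampled count, undoing the oversampling in estimates."""
    return {label: stratum_sizes[label] / allocation[label] for label in allocation}


# Hypothetical case counts per state.
sizes = {"large_state": 9000, "mid_state": 900, "small_state": 100}
alloc = allocate_with_floor(sizes, 500, 30)
weights = stratum_weights(sizes, alloc)
```

Note that the floor pushes the total above the nominal 500; units from the oversampled small state carry a smaller weight than units from the large state.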

m******2
Posts: 564

From thread: Statistics board - Asking the experts: SVM and RF with a few hundred samples and tens of thousands of variables

Stratified Sampling

m****t
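Posts: 754

From thread: Statistics board - How do I use a sampling weight variable in linear regression?

A question for everyone:
The survey data has over a thousand respondents, and each respondent has a sampling weight variable. All I
know is that in a stratified random sample, each respondent's probability of being selected is different, and that
sampling weight = 1/probability.
So when I run a linear regression in statistical software, how should this sampling weight variable
be used?
(1) First run the linear regression with the other independent variables, without the
sampling weight variable, and after obtaining the linear regression equation, add the
sampling weight variable.
(2) Include the sampling weight variable when running the linear regression.
I feel it should be (1), but I don't know how to add the sampling weight variable after obtaining the
linear regression equation...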
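
One common approach, sketched here under assumptions, is to use the sampling weight as a weight in the fit itself (weighted least squares) rather than as an extra covariate. The function name and the toy data are hypothetical, and the sketch only gives point estimates; standard errors for survey data generally need design-based methods that this does not cover.

```python
def weighted_linear_fit(x, y, w):
    """Weighted least squares for y = intercept + slope * x with weights w."""
    total = sum(w)
    x_bar = sum(wi * xi for wi, xi in zip(w, x)) / total
    y_bar = sum(wi * yi for wi, yi in zip(w, y)) / total
    s_xy = sum(wi * (xi - x_bar) * (yi - y_bar) for wi, xi, yi in zip(w, x, y))
    s_xx = sum(wi * (xi - x_bar) ** 2 for wi, xi in zip(w, x))
    slope = s_xy / s_xx
    intercept = y_bar - slope * x_bar
    return intercept, slope


# Hypothetical data: the last respondent represents many more people.
x = [1.0, 2.0, 3.0, 4.0]
y = [2.0, 4.1, 5.9, 12.0]
w = [1.0, 1.0, 1.0, 10.0]
intercept_w, slope_w = weighted_linear_fit(x, y, w)
intercept_u, slope_u = weighted_linear_fit(x, y, [1.0] * 4)
```

Comparing `slope_w` with `slope_u` shows how much the heavily weighted respondent pulls the fitted line relative to an unweighted fit.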

k*******a
Posts: 772

From thread: Statistics board - Baozi award: event in survival analysis

So what are you interested in?
You may try stratified analysis.

x*******i
Posts: 1237

From thread: Statistics board - PROC FREQ

Stratified analysis using PROC FREQ can indicate the potential for effect
modification (interaction) with information obtained from the Breslow-Day
Homogeneity of OR test.
True or False?

s**f
Posts: 365

From thread: Statistics board - A question about survey weights

Please give me some pointers. Is there any easy-to-understand, practical material you could recommend?

w*******9
Posts: 1433

From thread: Statistics board - A question about statistical modeling.

Stratified regression: add an interaction term between smoke and the
indicator I_{weight>200}---250 if you like.

w*******9
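Posts: 1433

From thread: Statistics board - Methods for analyzing how data structure changes over time

If you really have to give your boss something: if the side effect has a quantitative measure (e.g., how much
blood pressure rises), you can do a stratified regression with the intercept set as a function of time and see
how well it fits. If the side effect is just an indicator, then apart from plotting case counts against time by
sex or race, I don't see much else to do. For example, if the plot shows that side-effect cases among Black
patients increase year over year while those among White patients decrease, that would be a meaningful pattern.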

h***x
Posts: 586

From thread: Statistics board - A question about sampling with SAS

Create a new variable to count the number of patients in a hospital, then
use proc surveyselect with stratified sampling.

h***x
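Posts: 586

From thread: Statistics board - How to select a sample from a large population with SAS

I haven't done this myself; it's just an idea. Use the surveyselect that zhongdianshi mentioned, with stratified
sampling and weights. Bin age into intervals and create a weight variable: the closer to 20, the larger the weight.
For example, age:
19-21: 10
16-18: 8
22-24: 8
25-27: 7
...
>50: 1
...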
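
A rough sketch of the weighting idea outside SAS. It draws distinct units with probability favouring larger weights, using Efraimidis-Spirakis random keys (a technique swapped in for illustration, not what surveyselect does internally). Only the brackets listed in the post are filled in; the fallback weight of 4 for the elided brackets is a placeholder assumption, as are the function names and ages.

```python
import random


def age_weight(age):
    """Selection weight by age bracket; only brackets listed in the post are filled in."""
    if 19 <= age <= 21:
        return 10
    if 16 <= age <= 18 or 22 <= age <= 24:
        return 8
    if 25 <= age <= 27:
        return 7
    if age > 50:
        return 1
    # Brackets elided in the post: this fallback value is a placeholder assumption.
    return 4


def weighted_sample_without_replacement(ages, k, rng):
    """Pick k distinct indices, favouring larger weights (Efraimidis-Spirakis keys)."""
    keyed = [(rng.random() ** (1.0 / age_weight(a)), i) for i, a in enumerate(ages)]
    keyed.sort(reverse=True)
    return [i for _, i in keyed[:k]]


ages = [17, 20, 20, 23, 26, 35, 55, 60, 19, 45]
chosen = weighted_sample_without_replacement(ages, 3, random.Random(7))
```

People near age 20 get keys closer to 1 on average, so they are more likely to land among the top k.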

k*******a
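Posts: 772

From thread: Statistics board - Question: how to estimate the variance of two groups of samples when only part of one group was tested?

This is stratified sampling.
You can refer to a book on sampling design; they generally cover how to do the estimation.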

o******6
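Posts: 538

From thread: Statistics board - Working for a boss who doesn't really understand statistics or design, won't accept new things, and always thinks she is right

When I graduated, for family reasons I didn't look for a job and stayed on to work at my current school. The
group also had a statistician, an Ivy League graduate with many years of experience, whom the boss trusted a
lot; for the past few years the important projects were all done by this colleague. This year the boss ran out
of money and the colleague left, so I started taking over his projects, and my painful days began.
For example, when I started the statistical analysis for a new project, after learning about the project from
the RA I could see at a glance that it was a crossover design with repeated measures. Because the boss wanted
me to use the method my former colleague had used, I told her what the design was, that we needed to test the
carryover effect, that the period effect needed to go into the model, and so on. She gave me no chance to
explain and angrily said that if we don't work out, blah blah. She is under a lot of pressure now, cannot
afford to support dozens of people, and is mean to everyone under her. Perhaps for her it is enough to have
results, the simpler the model the better, and whether it is correct is another matter. Later I naively argued
with her a few more times, for example suggesting that if the response is Poisson distributed, use GEE (make
inference about the population..) or GLMM (make inference abo...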

a****g
Posts: 8131

From thread: Statistics board - logistic regression on 3 billion records (reposted)

If you do sampling, would this be random sampling or stratified sampling?
Just curious.
Thanks.

m******4
Posts: 79

From thread: Statistics board - Urgent: about to register for courses, thanks everyone

Is this course important to take? Sampling Techniques. Content: Theory of probability
sampling designs. Unrestricted random sampling. Stratified sampling. Cluster
sampling. Multistage or subsampling. Ratio estimates.

h**********1
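Posts: 155

From thread: Statistics board - Asking for onsite advice

Yeah, I just finished an onsite too, also for a survey sampling role.
They asked a lot of questions; the one I remember most clearly was stratified sampling vs. cluster sampling.
Also: use one line to explain survey methodology.
They also asked about quality assurance.
OP, take a good look at the job description~~~
And prepare some behavioural questions too. I was asked a lot of those. Without preparation it would
have been pretty embarrassing.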

a********y
Posts: 474

From thread: Statistics board - Discussion: significance comparison for non-independent samples

Suppose it's survey data with sampling weights.
treatment: A (0,1), B (0,1)
sex: male; female
age groups: young, middle, old
Patients may use A, or B, or both A and B at the same time, so the
independent-samples assumption does not hold.
Question: is treatment A significantly more adopted than treatment B
(overall, stratified by sex, age, etc.)?
Can I use a dependent t-test?
Thanks!!

T*******I
Posts: 5138

From thread: Statistics board - Discussion: significance comparison for non-independent samples

The "independence" of a sample should mean that each individual (here, a
patient) in the sample is independent of all others. So your sample can be
treated as three independent groups:
Use A only
Use B only
Use A and B
You can use a simple ANOVA with the interaction A*B. I believe the model can
be
Effect = groups (including A, B and A+B) + A*B + adjusters (age, gender)
or stratified models by categorized age group, gender, etc.
At least, the effect models established here can help you to find
...

p********6
Posts: 1339

From thread: Statistics board - Discussion: significance comparison for non-independent samples

A t-test compares the means of a continuous variable in two groups.
What you want to compare are proportions (of treatment A and B), so a
t-test is not appropriate. The test you want is McNemar's test. Search for
McNemar's test and stratified McNemar's test for more details.
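
A minimal sketch of the unstratified McNemar chi-square test on paired binary data, without continuity correction and ignoring sampling weights (both simplifications). The function name and the counts are hypothetical.

```python
import math


def mcnemar_test(only_a, only_b):
    """McNemar chi-square test (no continuity correction) on discordant pairs.

    only_a: patients using A but not B; only_b: patients using B but not A.
    Returns (statistic, p_value) with 1 degree of freedom.
    """
    statistic = (only_a - only_b) ** 2 / (only_a + only_b)
    # Survival function of chi-square with 1 df: P(X > s) = erfc(sqrt(s / 2)).
    p_value = math.erfc(math.sqrt(statistic / 2.0))
    return statistic, p_value


# Hypothetical counts of discordant patients.
stat, p = mcnemar_test(40, 20)
```

Only the discordant patients (those using exactly one of the two treatments) enter the statistic; patients using both or neither carry no information about which treatment is adopted more.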

k*******a
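Posts: 772

From thread: Statistics board - A sample size question in survival analysis

I think this can be viewed as a stratified Cox PH model,
treating each pair as a stratum.
In that case, I think the sample size should be the same as before, because the information provided by each
pair is the same as that provided by two ordinary individuals.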

A*******s
Posts: 3942

From thread: Statistics board - How do you usually handle missing data?

Missing data analysis is a huge topic and you can find tons of literature
discussing it. Before jumping to any fancy techniques for missing imputation,
I think the very first step is to ask two questions.
The first question is: are the data really missing, meaning there are indeed
true values but we just don't observe them, or are they actually not
applicable, meaning there is no valid value at all?
If the answer is the latter, then you cannot well define a random variable
on those 'Not Applicab...

A*****a
Posts: 1091

From thread: Statistics board - Question for Stratify sampling.

I looked at the full example; I think he just picked the size arbitrarily to illustrate how to use the strata statement.

A*******s
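Posts: 3942

From thread: Statistics board - Posting a very hard interview question

Maybe I wasn't detailed enough.
The simplest AG model for recurrent events assumes the correlation is fully explained by time-dependent
covariates (if that doesn't work, move on to other options such as frailty/stratified...).
Frailty/random effects can be used to model subject-specific effects. Compared with a model without
frailty/random effects, bias is reduced. But compared with fixed effects, bias increases
and variance decreases.
What is your understanding?
By the way, one question: ...
For an internal predictor, in practice it is still treated as time independent. Setting aside fair
lending requirements, what we usually want to predict is "if I see that a person is unmarried today, what is
the probability that he goes bankrupt within a year". What we use is really the observed value at a timespot,
completely time independent...
External ones are easy; the simplest is to build a separate time series model to predict it and then plug ...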

s***1
Posts: 343

From thread: Statistics board - Help! How to get two CDFs on the same plot in SAS

Does anyone know how to do this?
CDF plots, stratified by groups, on the same plot
Know how to do this in R, but have no idea how to make it in SAS
Thanks!!

D**u
Posts: 288

From thread: Statistics board - Modeling question for rare events

1st choice:
oversampling/stratified sampling, adding weights to the observations; both SAS
and R can do this easily.
2nd choice:
prefer a simple negative binomial model over any zero-inflated model, to
avoid dealing with a mixture model.
Of course, you can always try zero-inflated models.
A good explanation is here:
http://www.statisticalhorizons.com/zero-inflated-models
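
A minimal sketch of the first choice, assuming a binary event indicator: keep every rare event, keep a random fraction of non-events, and weight the kept non-events by the inverse of that fraction. The function name, the 1% event rate, and the keep rate of 0.1 are hypothetical.

```python
import random


def downsample_non_events(labels, keep_rate, rng):
    """Keep every event (label 1) and a random fraction of non-events (label 0).

    Returns (index, weight) pairs; kept non-events get weight 1 / keep_rate so
    weighted totals still reflect the original data.
    """
    kept = []
    for i, label in enumerate(labels):
        if label == 1:
            kept.append((i, 1.0))
        elif rng.random() < keep_rate:
            kept.append((i, 1.0 / keep_rate))
    return kept


# Hypothetical data: 1% event rate.
labels = [1] * 50 + [0] * 4950
kept = downsample_non_events(labels, 0.1, random.Random(3))
weighted_total = sum(weight for _, weight in kept)
```

The weights would then be passed to whatever weighted model-fitting routine is used, so that the fitted event rate is not inflated by the oversampling.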

c*****l
Posts: 1493

From thread: Statistics board - A question about LOGISTIC REGRESSION, urgent

How about using stratified sampling?
Though I don't quite understand why this is needed.

a**j
Posts: 60

From thread: Statistics board - An interview question; asking the board for help.

Poisson regression or logistic regression
stratified by levels

q********n
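Posts: 355

From thread: Statistics board - Help with a sampling problem

To survey the quality of highways within the state, spot checks are done with 100 meters as one unit. This is a
typical population proportion problem. The required sample size can be derived from the population size, the
pass rate, the confidence level, and the precision. If the pass rate is very high, fewer samples are needed; as
it falls toward 0.5, more are needed, and a pass rate of 0.5 requires the most samples.
1. In practice each sample unit has multiple inspection items, such as pavement, guardrails, signs, reflectors,
etc., and each inspection item has its own pass rate. Moreover, the inspection items differ from one road
segment to another. For example, asphalt and concrete pavement differ, and bridges, culverts, tunnels and so on
differ again. So the selected sample must ensure sufficient counts for every possible inspection item. Is there
a sampling method designed for this situation?
2. On whether to partition into regions (stratified sampling), which I asked about a few days ago: if the pass
rate varies little across regions (I compared them with a paired t-test), is stratification then unnecessary?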
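
A short sketch of the sample size calculation for a single proportion, using the normal approximation n0 = z^2 p (1 - p) / e^2 with an optional finite population correction. The function name, the margin of 0.05, and the population of 10000 units are assumptions for illustration.

```python
import math
from statistics import NormalDist


def sample_size_for_proportion(p, margin, confidence=0.95, population=None):
    """Units needed to estimate a proportion p within +/- margin."""
    z = NormalDist().inv_cdf(1.0 - (1.0 - confidence) / 2.0)
    n0 = z ** 2 * p * (1.0 - p) / margin ** 2
    if population is not None:
        # Finite population correction.
        n0 = n0 / (1.0 + (n0 - 1.0) / population)
    return math.ceil(n0)


n_half = sample_size_for_proportion(0.5, 0.05)
n_high = sample_size_for_proportion(0.9, 0.05)
n_finite = sample_size_for_proportion(0.5, 0.05, population=10000)
```

Because p (1 - p) peaks at p = 0.5, `n_half` is the largest requirement for a given margin; with several inspection items per unit, the item whose pass rate is closest to 0.5 would drive the overall sample size.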

A*******s
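Posts: 3942

From thread: Statistics board - cluster effect in case control study

The advantage of case-control is that you can use stratified/conditional logistic regression.
The cluster/stratum effect becomes a nuisance parameter.
Sampling weights have no effect,
because they appear in both the numerator and the denominator of the conditional likelihood and cancel out.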

t********6
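Posts: 43

From thread: Statistics board - cluster effect in case control study

Does stratified/conditional logistic regression refer to the matched case? The problem is that
my data is not matched; the ratio of cases to controls differs in every cluster, so the
confounding by cluster cannot be removed.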

F8
Posts: 348

From thread: Statistics board - How to make up for the sample distribution?

not recommended
this is a sort of stratified PPS problem
use weighted estimation to get an unbiased estimate

l*******o
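Posts: 71

From thread: Statistics board - Seeking advice: is there a future for a stay-at-home mom self-funding a statistics master's?

I want to switch careers and study statistics this fall. I'd like those with experience to give me some advice
and encouragement and point me in the right direction. Right now I'm quite lost and don't know whether my
choice is right. Recently I've read many posts saying it's not easy to find a job with a master's in
statistics, which has left me in a cold sweat. First, a self-introduction: I'm 31 and graduated in 2009 from an
ordinary university in China with a 2E major. I've been in the US for 5 years as a full-time homemaker, and my
baby is almost 2. My husband and I are waiting for our green card priority date, which will probably become
current in about two years, so I want to study something now and find a job then. My current thinking is that
as long as I can find a job after graduating, that's enough. I'm really afraid of spending a lot of my
husband's money and still ending up a housewife. Oh, I forgot to mention: my English is average. I've been
raising the baby entirely in Chinese. My questions fall mainly into two areas.
1. Which direction in statistics has more job opportunities? I see some people study biostatistics and go to
pharma, and some are at banks and insurance companies. Some say pharma work experience is hard to transfer
when changing jobs. So in which direction does work experience become more valuable the longer you have it?
2. The school's curriculum lists many courses. Could experienced folks help me see which courses are most
practical? Some of them are probably PhD courses, and I don't know whether I can take them. I have removed
some courses I think I don't need or cannot take. The master's is just 10 courses.
Course list:
8001. Probability ...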

A*******s
Posts: 3942

From thread: Statistics board - repeated event survival data

Add one more: the shared frailty model.
These are nothing more than censored-data versions of the following models:
fixed effect
stratified model
robust sandwich
random effect
