Page 6 - Digest of discussions about glm - 话题女王


s*r
Posts: 2757

From thread: Statistics board - [Help] SAS Proc mixed

PROC GLM and PROC MIXED have different requirements.

r**********9
Posts: 174

From thread: Statistics board - multivariate glm

Does anyone know of an R package or function for fitting a multivariate
generalized linear model? Thanks!

B******5
Posts: 4676

From thread: Statistics board - multivariate glm

Multivariate in what sense?

s**c
Posts: 1247

From thread: Statistics board - SAS vs. Stata

It depends.
For survival analysis, I find Stata easier to use.
For linear regression and GLM, SAS is easier to use.

o******6
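Posts: 538

From thread: Statistics board - [Compilation] Help: an R beginner asking a rather silly question

☆─────────────────────────────────────☆
sweetandlow (Pepper) wrote on (Wed Mar 18 23:30:41 2009):
I have just started using R and don't understand much yet, so all I can do is
copy examples by rote. Please don't laugh. Could you take a look at this? It is
urgent.
Today I ran into a problem. I think it should be possible to get all the answers
in one go, but I really don't know how, so each time I change one number and
print X again, which is far too tedious. Please show me how to do this.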
...
fm <- glm(cbind(Mim, Total-Mim) ~ Age+ I(Age^2)+I(Age^3), mim, family=
binomial)
tfct <- function(x) predict(fm, newdata=data.frame(Age=x)) - zxx
zxx <- log(0.1/(1-0.1))
uniroot(tfct, range(mim$Age))$root ->X1
X1
zxx <- log(0.2/(1-0.2))
uniroot(tfct, range(mim$Age))
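One way to avoid editing zxx by hand is to wrap the root search in a function of the target probability and apply it to a vector of probabilities. The sketch below is only an illustration: the mim data frame is simulated here because the original data were not posted, and the helper name age_at and the list of probabilities are assumptions.

```r
# Simulated stand-in for the poster's `mim` data (columns Age, Mim, Total).
set.seed(1)
mim <- data.frame(Age = seq(8, 20, by = 0.5), Total = 200)
mim$Mim <- rbinom(nrow(mim), mim$Total, plogis(-12 + 0.9 * mim$Age))

fm <- glm(cbind(Mim, Total - Mim) ~ Age + I(Age^2) + I(Age^3),
          data = mim, family = binomial)

# Hypothetical helper: age at which the fitted probability equals p.
age_at <- function(p) {
  target <- qlogis(p)  # same as log(p / (1 - p))
  f <- function(x) predict(fm, newdata = data.frame(Age = x)) - target
  uniroot(f, range(mim$Age))$root
}

probs <- c(0.1, 0.2, 0.5)
X <- sapply(probs, age_at)  # one root per target probability
print(data.frame(p = probs, Age = X))
```

uniroot needs the function to change sign over the search interval, so a target probability outside the fitted range will raise an error.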

d******e
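Posts: 7844

From thread: Statistics board - Does LOGISTIC REGRESSION require the DATA to be normally distributed?

No, a normal distribution is not required.
The core question in the various linear models and GLMs is whether the
predictors are linearly related to the response (or to a transformation of the
response).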

h******e
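Posts: 1791

From thread: Statistics board - A SAS question.

What does the ESTIMATE statement in procedures such as GLM and MIXED actually
compute, and how should it be interpreted? How does it differ from the result of
the LSMEANS statement? Thanks.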

d******e
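Posts: 551

From thread: Statistics board - My two most recent interview experiences

The original poster is clearly from academia. In industry what matters most is
1) the data and 2) explaining the model to people who don't understand
statistics. Those dazzling methods simply cannot be explained to decision
makers. Besides, the final results are often no better than what GLM or TreeNet
gives.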

A*******s
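Posts: 3942

From thread: Statistics board - My two most recent interview experiences

TreeNet is already a pretty dazzling method, isn't it? Breiman has called it the
best off-the-shelf classifier. As long as the boss accepts it, it really is hard
to build something that beats it.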


s*********e
Posts: 1051

From thread: Statistics board - My two most recent interview experiences

It depends. On the West Coast, machine learning methods, e.g. ensembles, are
very popular at dot-coms.
Some of what they use is even fancier than you might expect, such as Hadoop.


w**********y
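Posts: 1691

From thread: Statistics board - Is everyone genuinely interested in statistics, or just in it to make a living?

It seems that slightly more people are interested in the "money prospects" after all..
What I call "financial statistics" is the statistics that Wall Street can
actually use, for example statistical arbitrage and high frequency trading. I
have also heard that many hedge funds use machine learning heavily. Reportedly,
in the early 1990s neural networks were even used to build trading systems.
Andrew Lo at MIT has used pattern recognition for technical analysis (TA) of
stocks.
However, I have no close friends working at hedge funds, so I don't know much.
I only know that some firms, such as ITG and SIG, use it quite a lot, though
they probably don't count as hedge funds.
There are also scattered pieces of statistics in use: for example, some people
use sequential sampling or MCMC for model calibration, and the much-maligned
copula is used for credit derivative pricing. Monte Carlo simulation is an
indispensable tool on Wall Street, but people who never studied statistics know
it too...
As for systematically using statistical models such as GLM or FDA, I have not
seen it... Of course, if you go to an insurance company or do risk
management, statistics does get used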

P******e
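Posts: 75

From thread: Statistics board - [Help] Three-factor ANOVA on unbalanced literature data

We collected from the literature a large amount of data on the content of 18
amino acids in milk. Each data point is a mean reported in a paper. The milk in
different papers was collected at different times and came from different places.
After organizing the data, we want to see whether the different factors have an
effect.
We consider three factors: time, region, and term. The data are unbalanced.
The table below shows two of the factors. Each cell gives the number of
observations. As you can see, it is very unbalanced.
         Region1  Region2  Region3  Region4
time1       .        6        2        1
time2       .        7        2        3
time3       1        8        7        5
time4       1       11        4        5
time5       3       10        .        2
We have a few questions:
1. Since each data point is the mean from one paper, can we use ANOVA? The data
do not look very normally distributed. Do we need to check the assumptions?
2. Can we run a three-way ANOVA on this unbalanced data?
3. With proc GLM, Model AA1 AA2=tim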

P******e
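Posts: 75

From thread: Statistics board - [Help] Three-factor ANOVA on unbalanced literature data

Thank you very much for your suggestion. With three factors, a three-way
analysis leaves very few data points in each cell, so fitting interactions
seems too thin. Would it be acceptable to fit only proc GLM, Model y1 y2=time
region term?
Sorry, I deleted the post by accident just now and have reposted it.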

P******e
Posts: 75

From thread: Statistics board - [Help] Three-factor ANOVA on unbalanced literature data

Thanks, I ran it.
I simply used
PROC GLM
Model Y1 Y2 = time region term
Is it OK to leave out the interaction?
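For comparison, a main-effects-only model on unbalanced data can be sketched in R as follows. This is an illustration under assumptions: the data are simulated, the factor levels for term are invented, and only one response (y1) is fitted. With unbalanced cells the sequential table from anova() depends on the order of the terms, whereas drop1() tests each factor adjusted for the other two, which is the kind of test usually wanted here.

```r
# Simulated unbalanced data; all names and levels are illustrative only.
set.seed(2)
dat <- data.frame(
  time   = factor(sample(paste0("time", 1:5), 80, replace = TRUE)),
  region = factor(sample(paste0("Region", 1:4), 80, replace = TRUE,
                         prob = c(0.1, 0.5, 0.2, 0.2))),
  term   = factor(sample(c("early", "late"), 80, replace = TRUE))
)
dat$y1 <- rnorm(80, mean = 10 + as.integer(dat$region))

fit <- lm(y1 ~ time + region + term, data = dat)

# F-test for each factor given the other two (order-independent).
tab <- drop1(fit, test = "F")
print(tab)
```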

P******e
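Posts: 75

From thread: Statistics board - [Help] Three-factor ANOVA on unbalanced literature data

We do not have the raw data from the papers. Our data are the means extracted
from each paper and pooled together. They are grouped by three factors, each
with different levels. Each factor has roughly 100 data points. Each data point
is a mean listed in one paper.
Please look at the table above. The cells have unbalanced numbers of data
points, and some are missing data points.
Can SAS proc GLM be used?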

P******e
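Posts: 75

From thread: Statistics board - [Help] Three-factor ANOVA on unbalanced literature data

An observational meta-analysis would screen out a lot of papers, which does not
seem to serve our purpose.
We want to know whether these three factors have a significant effect on the
data, and we hope to test variance.
If proc GLM cannot be used, is there another approach?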

a********s
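Posts: 188

From thread: Statistics board - Applications and development of statistics in the insurance industry (Casualty & Property)

Let's talk about the statistical methods or models you know of or have worked
with in the insurance industry (Casualty and Property). Which statistical models
are used? What do you predict for the future?
The one I hear about most is GLM.
As for prospects, oloolo's post mentioned that "the insurance industry is
gradually shifting from traditional actuarial practice to relying on modern
statistical models to refine business processes and product categories", which
I find very encouraging.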

s*r
Posts: 2757

From thread: Statistics board - What is a trend test?

you are looking for the thing mentioned by DaShagen
google "trend test proc glm contrast"


A*******s
Posts: 3942

From thread: Statistics board - How to decide when to use time series and when to use linear regression?

Yep, but in most prediction cases many data mining methods are better at
capturing non-linearity. GLM is better for explanation, though.


a********l
Posts: 40

From thread: Statistics board - Can an IF statement be used in PROC REG?

why not use proc glm
class

p***r
Posts: 920

From thread: Statistics board - How to output the confidence interval of the odds ratio from logistic regression (glm) in R

I'm being lazy and don't want to google it myself, haha. Waiting for an expert.

a***r
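Posts: 420

From thread: Statistics board - How to output the confidence interval of the odds ratio from logistic regression (glm) in R

I'm no expert, but I happen to know:
exp(coefficients)
There does not seem to be a command that outputs it directly. If there is one,
I would like to see it.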

d*****u
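Posts: 111

From thread: Statistics board - How to output the confidence interval of the odds ratio from logistic regression (glm) in R

I'm fairly sure there is no direct output. You have to compute it yourself from
the coefficients.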

p***r
Posts: 920

From thread: Statistics board - How to output the confidence interval of the odds ratio from logistic regression (glm) in R

There seems to be a misunderstanding. What I want is the confidence interval of
the OR.
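A common way to get the interval is to compute a confidence interval on the log-odds scale and exponentiate both ends. The sketch below uses simulated data (all variable names are illustrative) and confint.default(), which gives Wald intervals; confint(fit) would give profile-likelihood intervals instead, and the two can differ in small samples.

```r
# Simulated data; names are illustrative only.
set.seed(42)
n <- 500
x <- rnorm(n)
y <- rbinom(n, 1, plogis(-0.5 + 0.8 * x))
fit <- glm(y ~ x, family = binomial)

est <- coef(fit)                           # log odds ratios
ci  <- confint.default(fit, level = 0.95)  # Wald interval on the log-odds scale
or_table <- exp(cbind(OR = est, ci))       # back-transform to the odds-ratio scale
print(or_table)
```

The row for the intercept is the baseline odds, not an odds ratio, and is usually dropped when reporting.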

H******e
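Posts: 333

From thread: Statistics board - A humble question about one-way repeated ANOVA

I have not studied this before, but something I am working on now needs this
analysis.
I have spent a long time looking things up, and the books I have on hand do not
cover it much.
I would like to ask: are there any assumptions to check before doing a one-way
repeated ANOVA?
Or can I just use the proc ANOVA statement in SAS directly?
Online, some sources say that for the assumptions you first need to run
Mauchly's sphericity test.
But they do not explain how to interpret the result of this test, or else the
test goes with using proc GLM to do the one-way repeated ANOVA.
I am a beginner, this is my first semester, and it was my instructor who told me
to use this.
I am very confused about this now and hope the experts here can help.
Thanks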

w*********y
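Posts: 7895

From thread: Statistics board - A humble question about one-way repeated ANOVA

I only half know this. As I understand it, before running any TEST you should
check whether the data meet that TEST's ASSUMPTIONS. If your data do not satisfy
MAUCHLY'S SPHERICITY TEST, then you should indeed use PROC GLM.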

d******r
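Posts: 107

From thread: Statistics board - Another question about a Travelers interview, sigh

The hiring manager is interviewing me for a predictive modeling intern position.
He is in the pricing/cost & reserving department. I expect he will ask about SAS
and GLM. I only used SAS for homework in a class last year and am not familiar
with the details, but if I can consult the help I can learn quickly and solve
problems. I have the Base certification. I do all my research in R. Could
someone with experience share some advice? I'm really anxious.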

j*****e
Posts: 182

From thread: Statistics board - What trend test can be used on this data set?

You could run the analysis in GLM or MIXED and use a contrast (something like
1 -2 1, I don't remember clearly) to test the linear trend. This is a
response surface approach.
Or, if you prefer a nonparametric test, search for the keyword "ordered
alternative". A whole bunch of tests have been developed for this, and there
is no definite "best" test for this situation.
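For three equally spaced levels, the standard orthogonal polynomial contrasts are proportional to -1 0 1 for the linear trend and 1 -2 1 for the quadratic term, so the coefficients recalled above look like the quadratic ones. A small R check, with simulated data and an invented factor named dose:

```r
# Orthogonal polynomial contrasts for 3 equally spaced levels.
cp <- contr.poly(3)
linear    <- cp[, ".L"]  # proportional to -1  0  1
quadratic <- cp[, ".Q"]  # proportional to  1 -2  1
print(round(cp, 3))

# Ordered factors use these contrasts by default in lm().
set.seed(5)
dose <- factor(rep(c("low", "mid", "high"), each = 20),
               levels = c("low", "mid", "high"), ordered = TRUE)
y <- 1 + 0.5 * as.integer(dose) + rnorm(60)
fit <- lm(y ~ dose)
print(coef(summary(fit))["dose.L", ])  # t-test of the linear trend
```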

s******a
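Posts: 184

From thread: Statistics board - residual deviance and dispersion parameter

In some materials I have seen that, after fitting a model with GLM in R, the
residual deviance is divided by the dispersion parameter. What does the
resulting quantity represent?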
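The quantity described is usually called the scaled deviance. When the model fits, it is often compared with a chi-square distribution on the residual degrees of freedom, so it serves as a rough goodness-of-fit measure. The sketch below, on simulated Gamma data with illustrative names, shows where each piece lives in a fitted glm object.

```r
# Simulated Gamma response with mean exp(1 + 0.5 * x) and dispersion 1/2.
set.seed(7)
n <- 200
x <- runif(n)
mu <- exp(1 + 0.5 * x)
y <- rgamma(n, shape = 2, rate = 2 / mu)
fit <- glm(y ~ x, family = Gamma(link = "log"))

phi_pearson  <- summary(fit)$dispersion            # Pearson-based estimate used by summary()
phi_deviance <- deviance(fit) / df.residual(fit)   # deviance-based alternative
scaled_deviance <- deviance(fit) / phi_pearson     # residual deviance / dispersion

print(c(phi_pearson = phi_pearson, phi_deviance = phi_deviance,
        scaled_deviance = scaled_deviance, df = df.residual(fit)))
```

For binomial and Poisson families the dispersion is fixed at 1, so the scaled deviance equals the residual deviance.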

D******n
Posts: 2836

From thread: Statistics board - An R question

glm(y~.)


o****o
Posts: 8077

From thread: Statistics board - An R question

Shouldn't it be
glm(y~x[,1:p])
instead?
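Both forms can work, depending on how the predictors are stored. The following sketch (simulated data, illustrative names, binomial family chosen only for the example) fits the same model from a matrix and from a data frame with the dot shorthand.

```r
set.seed(3)
p <- 4
x <- matrix(rnorm(100 * p), ncol = p)
y <- rbinom(100, 1, plogis(x[, 1] - x[, 2]))

# Predictors held in a matrix: put the matrix on the right-hand side.
fit_matrix <- glm(y ~ x[, 1:p], family = binomial)

# Predictors held in a data frame: "." means every column except the response.
dat <- data.frame(y = y, x)  # columns y, X1, ..., X4
fit_dot <- glm(y ~ ., data = dat, family = binomial)

print(cbind(matrix_form = unname(coef(fit_matrix)),
            dot_form    = unname(coef(fit_dot))))
```

The dot form needs the data argument; without it R has nothing to expand the dot against.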


m*******7
Posts: 244

From thread: Statistics board - GLM and GEE

What is the difference between Generalized linear model and Generalized
estimating equation?

p**********l
Posts: 1160

From thread: Statistics board - GLM and GEE

When we learned it in a consulting class, GEE was used to analyze correlated data consisting of measurements taken on clusters that are made up of members or subunits, for example longitudinal discrete data.
http://en.wikipedia.org/wiki/Generalized_estimating_equation
Generalized estimating equation
In statistics, a generalized estimating equation (GEE) is used to fit the
parameters of a generalized linear model where unknown correlation is
present.
The GEE allows for correlation wit

w******a
Posts: 25

From thread: Statistics board - How should I analyze the relationship between "number of colors used" and "diagnosis"?

> mylogit<-glm(DIAGNOSIS~Color_used,family=binomial(link="logit"), na.action=na.pass)
> summary(mylogit)
Coefficients:
            Estimate Std. Error z value Pr(>|z|)
(Intercept) -1.57289    1.13605  -1.385    0.166
Color_used   0.09416    0.18558   0.507    0.612
The table of coefficients shows that Color_used is statistically
non-significant.

T*******I
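Posts: 5138

From thread: Statistics board - Urgent: which method should I use???

One more point:
When building the logistic model, besides the known variables you can also
include a few interaction terms, for example chemical_1*chemical_2,
chemical_1*flow velocity, and chemical_2*flow velocity.
Likewise, when building a GLM, the interaction terms can be defined as follows:
if chemical_1 is the dependent variable, the interaction terms can include
chemical_2*fault, chemical_2*flow velocity, and flow velocity*fault.
With interaction terms in both kinds of model, you can not only assess and
adjust each independent variable's own association with the dependent variable,
but also see how much the association between the interacting variables
contributes to the dependent variable, which indirectly indicates that the
relationships among the interacting variables exist in the observed system as a
whole.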


p********0
Posts: 186

From thread: Statistics board - dynamic formula construction in R

Hi,
In SQL Server, I can construct a dynamic SQL statement and execute it,
like sqlstatement ='select '+variable_name+'from '+table_name, then exec(sqlstatement).
In R I tried to specify text = paste('logitdd <- glm( Y ~ ', 'Xs', variable_name, '..).
Then eval(expression(text)).
There is no error, but logitdd did not get assigned. If I copy the
text and execute it at the command line, everything works. Does anyone know how
to do it?
Thanks in advance
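The likely reason nothing is assigned is that expression(text) wraps the symbol text, so evaluating it just returns the string. Building a formula object is usually simpler than building code as text. A minimal sketch, with simulated data and illustrative names (dat, variable_names, logitdd2):

```r
set.seed(11)
dat <- data.frame(Y = rbinom(80, 1, 0.5), X1 = rnorm(80), X2 = rnorm(80))
variable_names <- c("X1", "X2")  # predictor names chosen at run time

# Preferred: build the formula, not the whole call.
f <- reformulate(variable_names, response = "Y")  # Y ~ X1 + X2
logitdd <- glm(f, data = dat, family = binomial)

# If the call must be a string, parse it before evaluating it.
text <- paste0("logitdd2 <- glm(Y ~ ",
               paste(variable_names, collapse = " + "),
               ", data = dat, family = binomial)")
eval(parse(text = text))          # runs the assignment
print(eval(expression(text)))     # only returns the string itself
```

as.formula(paste("Y ~", paste(variable_names, collapse = " + "))) is an equivalent way to build f.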

n******r
Posts: 4

From thread: Statistics board - [OPENING] Statistical Analyst

The Statistical Analyst is responsible for producing and reviewing forecasts
. The Analyst uses statistical forecasting methods (GLM, and Longitudinal
Analysis; develop linear and non-linear deterministic forecasting models;
perform residual analysis and run autocorrelations tests); interpret and
present results from technical data analyses to clients using non-technical
language producing forecasts that support demand planning decisions and
utilizes established analytical methods to develop dem

B******y
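Posts: 9065

From thread: Statistics board - Which SAS procs are commonly used at CROs?

PROC REPORT, MEANS, FREQ, GLM and LOGISTIC, together with variations on the same
theme such as UNIVARIATE, MIXED and GENMOD, cover more than 90% of the work. The
remaining 10% is PROC LIFETEST, PHREG, POWER, PLAN and so on.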

h******e
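Posts: 1791

From thread: Statistics board - A statistics question.

There are two groups of patients with the same disease but at different stages
(status). The measured variables are heart length (length) and volume (volume).
I need to check whether heart length and volume are correlated. This is what I
did:
proc glm;
class status;
model length = volume status;
run;
Please give me some feedback, thanks.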

d*******o
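Posts: 493

From thread: Statistics board - A statistics question.

status is the outcome and the others are the causes, right? To see whether there is an association, look at the significance of the interaction term.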
proc glm;
class status;
model status = volume length volume*length;
run;
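For comparison, the first poster's model (length on volume, adjusting for status) can be sketched in R, along with a check of whether the length-volume slope differs between the two status groups. The data are simulated and the names len, volume and status are illustrative; this is one possible reading of the question, not the only valid analysis.

```r
set.seed(13)
status <- factor(rep(c("early", "late"), each = 30))
volume <- rnorm(60, mean = 100, sd = 15)
len <- 5 + 0.04 * volume + 0.5 * (status == "late") + rnorm(60, sd = 0.4)

fit_main <- lm(len ~ volume + status)  # common slope, adjusted for status
fit_int  <- lm(len ~ volume * status)  # slope allowed to differ by status

print(coef(summary(fit_main))["volume", ])  # association of length with volume
cmp <- anova(fit_main, fit_int)             # does the slope differ by status?
print(cmp)
```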

A*******s
Posts: 3942

From thread: Statistics board - credit card risk management, looking for advice

If your resume doesn't show any background in banking/finance, I doubt HMs
would ask you technical questions specific to credit risk management. I
would review GLM, data mining and SAS, based on my limited interview
experience (twice for risk modeling positions). Just my 2 cents.

s**********y
Posts: 38

From thread: Statistics board - WHAT IS CART?

Thank you very much! Are there other software packages? The company requires
GLM, LOGISTIC REGRESSION, and CART. I know a little CHAID. Is CART similar to
CHAID?

R******o
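Posts: 83

From thread: Statistics board - What does a SAS/statistical modeler in finance need to know??

A recruiter at a software company is looking for a contractor. We have spoken by
phone twice, and he knows that my past work was mostly clinical studies/health
care research. He asked whether I had experience with more complex modeling and
said that their work needs advanced statistical models for forecasting in
finance. I feel that what I have done is all fairly simple (GLM, Logistic,
Survival, etc...), so I am probably not a fit, but I am curious what advanced
knowledge this area requires. If anyone on the board works in this area and uses
SAS, I would appreciate your advice.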

s*****n
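Posts: 2174

From thread: Statistics board - How to randomly pick 2 numbers from 1, 2, 3, 4, 5?

It is really just a loop with a conditional inside it. In R it takes only a dozen or so lines.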
data <- read.table(...)
result <- data.frame(try = 1:1000, output = NA, case = NA)
for (i in 1:1000){
  data1 <- data[sample(100000, 10000), ]
  data2 <- data[sample(100000, 10000), ]
  if (mean(data1$var1) > 0){
    fit1 <- lm(...)
    result$output[i] <- functionA(data2, fit1$parameter_a)
    result$case[i] <- "A"
  } else {
    fit2 <- glm(...)
    result$output[i] <- functionB(data2, fit2$parameter_b)
    result$case[i] <- "B"
  }
}
hist(result$output[

p*****o
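Posts: 543

From thread: Statistics board - How do I build a model when all variables are ordinal data? Urgent, thanks.

Then first try creating DUMMY VARIABLES for all of them and run a REGRESSION to
see what you get.
Or just use a GLM.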

x*******i
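Posts: 1590

From thread: Statistics board - Does anyone know what is going on here? proc genmod.

Everyone, my apologies: I really do not have a statistics background but am
doing some statistical work, so I don't understand some of the terminology.
When I hit a problem I google around and then copy what I find.
I want to test whether 6 interaction (b*a) coefficients differ. I found this
online:
http://www.ats.ucla.edu/stat/sas/faq/compreg3.htm
It contains this SAS code:
"PROC GLM DATA=htwt2 ;
  CLASS age ;
  MODEL weight = age height age*height / SOLUTION ;
  CONTRAST 'test equal slopes' age*height 1 -1 0,
                               age*height 0 1 -1 ;
RUN;"
The example tests Ho: B1 = B2 = B3.
I want to test Ho: B1 = B2 = B3 = B4 = B5 = B6, so I imitated its contrast code.
This is a contrast statement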
CONTRAST 'test equa
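Testing that six slopes are equal takes five independent contrast rows (one fewer than the number of groups). An equivalent check in R, sketched here with simulated data and illustrative names, is the F-test comparing a common-slope model with a separate-slopes model:

```r
# Simulated data: 6 groups that share the same slope by construction.
set.seed(9)
g <- factor(rep(1:6, each = 15))
height <- rnorm(90, mean = 60, sd = 5)
weight <- 2 * height + 5 * as.integer(g) + rnorm(90, sd = 4)

common   <- lm(weight ~ g + height)  # one slope for all groups
separate <- lm(weight ~ g * height)  # one slope per group

# F-test of H0: B1 = B2 = ... = B6, with 5 numerator degrees of freedom.
cmp <- anova(common, separate)
print(cmp)
```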

s*r
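Posts: 2757

From thread: Statistics board - Has anyone done a PROPENSITY SCORE SIMULATION?

Why not do quantile stratification:
after splitting the data into groups, run within each group
proc glm ;
model y=grp x1 x2 x3;
run;
and see whether the average of the grp effect is closer to the true value than
the estimate from running the same model on all the data.
This kind of comparison seems fairer to me.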
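The suggested comparison can be sketched in R as a single simulated data set. Everything here is an assumption made for illustration: the data-generating model, the true grp effect of 1, the use of five strata, and all variable names. A real simulation study would repeat this over many replicates.

```r
set.seed(17)
n <- 2000
x1 <- rnorm(n); x2 <- rnorm(n); x3 <- rnorm(n)
grp <- rbinom(n, 1, plogis(0.5 * x1 - 0.5 * x2))  # treatment depends on covariates
y <- 1 * grp + x1 + x2 + x3 + rnorm(n)            # true grp effect is 1
dat <- data.frame(y, grp, x1, x2, x3)

# Propensity score and quintile strata.
ps <- fitted(glm(grp ~ x1 + x2 + x3, family = binomial, data = dat))
stratum <- cut(ps, quantile(ps, 0:5 / 5), include.lowest = TRUE)

# Same outcome model within each stratum, then average the grp effects.
by_stratum <- sapply(split(dat, stratum), function(d)
  coef(lm(y ~ grp + x1 + x2 + x3, data = d))["grp"])
stratified <- mean(by_stratum)
overall <- coef(lm(y ~ grp + x1 + x2 + x3, data = dat))["grp"]

print(c(stratified = stratified, overall = unname(overall)))
```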

c*******o
Posts: 8869

From thread: Statistics board - Question: can R-square be used to assess how good a GLM model is?

Use MSE.

A*******s
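Posts: 3942

From thread: Statistics board - Question: can R-square be used to assess how good a GLM model is?

Then use the p-value of the goodness-of-fit test statistic, although I don't
think it means much.
Check whether SAS has a pseudo R-square for the gamma distribution. If it does,
that would work too.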

w********o
Posts: 1621

From thread: Statistics board - Question: can R-square be used to assess how good a GLM model is?

Use the likelihood. Compare it to the likelihood of a null model. The bigger
the likelihood is, the better the model fits. This follows the rationale
underlying AIC and BIC for mixed models.
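In R, the comparison with a null model can be sketched as below for a Gamma GLM. The data are simulated and the names are illustrative. The deviance-based pseudo R-square shown is one of several definitions, so treat it as an example rather than a standard.

```r
set.seed(21)
n <- 300
x <- runif(n)
y <- rgamma(n, shape = 3, rate = 3 / exp(0.5 + x))  # Gamma with mean exp(0.5 + x)

fit  <- glm(y ~ x, family = Gamma(link = "log"))
null <- update(fit, . ~ 1)  # intercept-only model

print(anova(null, fit, test = "Chisq"))            # deviance comparison with the null model
print(c(AIC_null = AIC(null), AIC_fit = AIC(fit)))  # smaller is better

# Share of the null deviance explained by the model.
pseudo_r2 <- 1 - deviance(fit) / fit$null.deviance
print(pseudo_r2)
```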
