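z****h Posts: 203 | 1 This one is about donations. There is historical data with predictor variables covering various information about each donor, plus the
amount each person donated (possibly large, possibly small, most likely 0). I first answered that I would use a linear
regression model to model the relationship between donation amount and the predictors, but the interviewer said: what if
99% of those donors donated 0 and only 1% donated anything, with amounts that vary widely? Then the model would have serious
problems. What approach can solve this?
I have recently been asked this kind of question by two companies: one was APPLE, the other a bank I interviewed with today. |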
k****n Posts: 165 | 2 For regression, a transformation of the response variable is usually
required if its distribution is skewed. If 99% are 0, a common approach is
to split the model into two parts: 1) predict whether an observation will donate, and 2) if
yes to 1), predict how much that person will donate. |
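A minimal sketch of the two-part idea described above, on simulated data. Everything here is an assumption for illustration and not from the thread: the single predictor `x`, the simulation coefficients, the log transform of positive amounts, and the log-normal correction `exp(s2 / 2)` used when combining the two parts as E[amount | x] = P(donate | x) * E[amount | donate, x]. It uses only the standard library; in practice one would more likely use an existing statistics package.

```python
import math
import random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def fit_logistic(xs, ys, iters=25):
    """Part 1: P(donate | x) = sigmoid(b0 + b1*x), fitted by Newton's method."""
    b0 = b1 = 0.0
    for _ in range(iters):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for x, y in zip(xs, ys):
            p = sigmoid(b0 + b1 * x)
            w = p * (1.0 - p)
            g0 += y - p            # gradient of the log-likelihood
            g1 += (y - p) * x
            h00 += w               # negative Hessian entries
            h01 += w * x
            h11 += w * x * x
        det = h00 * h11 - h01 * h01
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    return b0, b1

def fit_ols(xs, ys):
    """Part 2: simple least squares y = a0 + a1*x, plus residual variance."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    a1 = sxy / sxx
    a0 = my - a1 * mx
    s2 = sum((y - a0 - a1 * x) ** 2 for x, y in zip(xs, ys)) / (n - 2)
    return a0, a1, s2

# Simulated donors (hypothetical): roughly 1% donate, amounts are skewed.
random.seed(0)
n = 20000
xs = [random.gauss(0, 1) for _ in range(n)]
amounts = []
for x in xs:
    if random.random() < sigmoid(-5.0 + 1.0 * x):
        amounts.append(math.exp(3.0 + 0.5 * x + random.gauss(0, 0.5)))
    else:
        amounts.append(0.0)

# Part 1 uses every donor; part 2 uses only those who donated.
donated = [1 if a > 0 else 0 for a in amounts]
b0, b1 = fit_logistic(xs, donated)
pos = [(x, math.log(a)) for x, a in zip(xs, amounts) if a > 0]
a0, a1, s2 = fit_ols([x for x, _ in pos], [y for _, y in pos])

def expected_donation(x):
    """P(donate | x) times E[amount | donate, x] under a log-normal assumption."""
    return sigmoid(b0 + b1 * x) * math.exp(a0 + a1 * x + s2 / 2.0)

print("logistic part:", b0, b1)
print("amount part (log scale):", a0, a1, s2)
print("expected donation at x = 0:", expected_donation(0.0))
```

The two parts are fitted separately, so the predictors that drive whether someone donates can differ from those that drive how much they give.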
z****h Posts: 203 | 3 Thanks. Are you saying I should fit a logistic regression model first, then
fit a linear model on the nonzero part? That is, fit two models separately?
[Quoting k****n's post]
: For regression, a transformation of the response variable is usually
: required if its distribution is skewed. If 99% are 0, a common approach is
: to split the model into 1) predict whether an observation will donate, 2) if
: yes to 1), how much will he donate.
|
k****n Posts: 165 | 4 Yes. Google "zero-inflated regression" for more details.
BTW, I'm wondering what kind of position you were interviewing for at APPLE.
This is a very statistical question. |
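s********0 Posts: 2625 | 5 What you are describing is the two-stage method. It is fairly common because it is simple and easy to understand.
For something a bit more sophisticated, you can google the censored regression model and the tobit model. I know Stata
has a ready-made package for it; I'm not sure about SAS/R.
[Quoting z****h's post]
: Thanks. Are you saying I should use logistic regression model first, then
: use linear model on the non 0% part? Fitting two models separately?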
|
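For reference, a sketch of the standard Tobit setup with left-censoring at 0, written for the donation problem. The notation (latent amount y_i^*, coefficients beta, error scale sigma) is assumed here and does not come from the thread.

```latex
% Latent donation y_i^* is observed only when it is positive.
y_i^* = x_i^\top \beta + \varepsilon_i, \qquad
\varepsilon_i \sim \mathcal{N}(0, \sigma^2), \qquad
y_i = \max(0,\, y_i^*)

% Log-likelihood: censored observations contribute a probability,
% uncensored observations contribute a normal density.
\ell(\beta, \sigma)
  = \sum_{i:\, y_i = 0} \log \Phi\!\left( -\frac{x_i^\top \beta}{\sigma} \right)
  + \sum_{i:\, y_i > 0} \left[ \log \phi\!\left( \frac{y_i - x_i^\top \beta}{\sigma} \right) - \log \sigma \right]
```

Here \Phi and \phi are the standard normal CDF and PDF. Unlike the two-stage method, this model uses the same coefficients \beta for both whether someone donates and how much they donate.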
z****h Posts: 203 | 6 The Apple position is Statistician/Algorithm Analys
[Quoting k****n's post]
: Yes. Google zero inflated regression for more details.
: BTW, I'm wondering what kind of position were you interviewing for APPLE.
: This is a very statistical question.
|