H*******r Posts: 98 | 1 one is...
data select;
set original;
where var1 in (value1 value2 value3 value4 value5 value6 value7 value8)
or
var1 between minvalue and maxvalue or
var2 in (number1 number2 number3) or
var3 in (number1 number2 number3) or
var4 in (number1 number2 number3) or
var5 in (number1 number2 number3) or
var6 in (number1 number2 number3) or
var7 in (number1 number2 number3) or
var8 in (number1 number2 number3) or
... |
|
y**i Posts: 1050 | 2 How can I get that number?
Then I could write
Do i=1 to rank (var2) over (partition by var1) order by var1 var2;
Is this OK? Will the whole expression "rank (var2) over (partition by var1) order by
var1 var2" give me a number?
Actually, I found that I also have VAR3. I need to know the number of levels of VAR3
conditional on VAR2, and of VAR2 conditional on VAR1.
Can you help me?
Thank you very much. |
|
y**i Posts: 1050 | 3 Thank you.
I changed my data; it actually has 3 VARs.
I have VAR3. How can I count the levels of VAR3 conditional on VAR2 and VAR1?
Thanks. |
|
y**i Posts: 1050 | 4 proc sort data nodupkey;
by var1 var2;
run;
proc sql NUMBER;
select VAR1, VAR2, COUNT(DISTINCT VAR3) AS LEVEL
FROM ONE
GROUP BY VAR1, VAR2
ORDER BY VAR1, VAR2;
QUIT;
Is this OK?
Can I do this in a data step, not in proc sql?
Var1
group |
|
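a**t Posts: 57 | 5 A newbie here who finally landed a PHONE INTERVIEW. Afterwards they gave me a test to
complete and send back. Could the experts help with the answers? Starting from question 9 is
enough; others may find it useful too.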
The test below is designed to gauge your level of SAS experience. Feel free
to reference any sources of help (SAS documentation, etc.) you would
normally use in the course of developing code. We are looking for your
general knowledge of concepts, so please DO NOT spend too much time trying
to perfect minor details of syntax. We understand that writing code without
the chance to run it to discover errors can be difficult.
The ... |
|
a**w Posts: 60 | 6 It can be done in proc sql, but the code looks clumsy:
proc sql;
create table a as
select *,
sum(P1) as sum_P1,
sum(P2) as sum_P2,
sum(P3) as sum_P3,
sum(P4) as sum_P4,
.
.
.
sum(P98) as sum_P98,
sum(P99) as sum_P99,
sum(P100) as sum_P100,
sum(Q) as sum_Q,
calculated sum_P1/calculated sum_Q as F1,
calculated sum_P2/calculated sum_Q as F2,
calculated sum_P3/calculated sum_Q as F3,
calculated sum_P4/calculated sum_Q as F4,
.
.
.
calculated sum_P98/calculated sum_Q as F98,
calculated sum_P99/calculated sum_Q as F99,
calculated... |
|
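w********p Posts: 948 | 7 The question continues, and so do the baozi (rewards)
#include <iostream>
using namespace std;
class Base
{
public:
Base(){ cout<<"Constructor: Base"<<endl; }
~Base(){ cout<<"Destructor : Base"<<endl; }
};
class Derived: public Base
{
public:
Derived(){ cout<<"Constructor: Derived"<<endl; }
~Derived(){ cout<<"Destructor : Derived"<<endl; }
};
int main()
{
Derived Var1;       // default-constructed
Base Var2(Var1);    // copy-constructed from the Base part of Var1
Derived Var3(Var1); // copy-constructed from Var1
return 0;
}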
output:
Constructor: Base
Constructor: Derived
Destructor : Derived
Destructor : Base
Destructor : Base
Destructor : Derived
Destructor : Base
Please explain. |
|
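l*******3 Posts: 1074 | 8 MySQL database:
select something from table where varchar1 like "%var1%" and varchar2 = "var2" and varchar3 = "var3";
The table has about a million records, and each query takes several seconds. I have already
created single-column indexes on varchar1, varchar2, and varchar3, as well as multi-column
indexes (varchar1 + varchar2 + varchar3, and varchar2 + varchar3), with no noticeable improvement.
Thanks for any pointers! |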
|
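B*********L Posts: 700 | 9 Thanks.
It seems there really is no particularly concise way. I'll stop banging my head against it;
for now I have verbosely set up 4 variables, in (@var1,@var2,@var3,@var4), and am making do with that. |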
|
l*****u Posts: 12114 | 10 export VAR1=1
export VAR2=2
export VAR3=3
Now I want to write a script so that echo_var 2 outputs 2.
That is, it should output the value of VAR$1, using a shell function. How can I do that? |
|
x**m Posts: 941 | 11 The most direct way is to use case.
For something fancier, I tried variable substitution, but none of these seem to work. I need an expert's advice.
#!/bin/bash
VAR1=11
VAR2=22
VAR3=33
var=$1
v=VAR$var
echo ${$v}
echo ${VAR$var}
echo ${VAR{var}}
echo ${VAR{`echo $var`}} |
|
x**m Posts: 941 | 12 Thanks, that seems to work. Is there a simpler way, though?
#!/bin/bash
VAR1=11
VAR2=22
VAR3=33
var=$1
echo $(eval "echo \$$(echo VAR${var})") |
|
v*****r Posts: 1119 | 13 Simple, use eval
#!/bin/bash
VAR1=11
VAR2=22
VAR3=33
eval echo \$VAR$1 |
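For the echo_var question in this thread, here is a minimal sketch that wraps the eval approach in a function and checks the argument first, since eval runs whatever text it is handed. The name echo_var comes from the question; the digits-only check and the usage message are my own assumptions, not something from the posts. In bash specifically, indirect expansion (name=VAR$1; echo "${!name}") should avoid eval altogether.

```shell
#!/bin/sh
VAR1=1
VAR2=2
VAR3=3

# echo_var N: print the value of the variable named VAR<N>.
echo_var() {
    # Assumption: N must be all digits; anything else is rejected
    # so that arbitrary text never reaches eval.
    case "$1" in
        ''|*[!0-9]*)
            echo "usage: echo_var N" >&2
            return 1
            ;;
    esac
    # After the shell expands $1, eval sees: printf '%s\n' "${VAR2}"
    eval "printf '%s\n' \"\${VAR$1}\""
}
```

With the values above, echo_var 2 prints 2. The validation is what makes this safer than a bare eval on a positional parameter.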
|
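b********1 Posts: 291 | 14 You are all smart people; could someone write an example for me? Say it is a one-to-many join.
Suppose a's variables are var1, var2, var3
Suppose b's variables are var4, var5, var6
Suppose var1 and var4 are in a primary key / foreign key relationship.
Scala, Python, or RDD Spark, any of them will do.
I plan to learn whichever one I understand first.
I have zero programming background and only know how to write queries.
Put plainly, all I want is create table c as
select a.*, b.var5, b.var6
from a
join b
on a.var1=b.var4
and then download c into Excel to take a look.
I have read tutorials online for ages, and they are all over the place.
I just don't get it: why is such a simple thing so hard to do on Hadoop? |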
|
a**n Posts: 313 | 15 In Bourne shell..
#############################
#if your file is like "a b c d"
#!/bin/sh
while read var1 var2 var3 var4
do
: #the command you want
done < filename
############################## |
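A self-contained sketch of the loop above, assuming a whitespace-separated file with four fields per line. The helper name print_ends and the sample data are made up for illustration; read -r is my addition so that backslashes in the input are kept literally.

```shell
#!/bin/sh
# print_ends FILE: for each line "a b c d", print the first and last field.
# (Hypothetical helper; the four-field layout is an assumption.)
print_ends() {
    while read -r var1 var2 var3 var4
    do
        # var4 receives the rest of the line if there are more than four fields
        printf '%s %s\n' "$var1" "$var4"
    done < "$1"
}

# Sample input written to a temporary file
tmpfile=$(mktemp)
printf 'a b c d\ne f g h\n' > "$tmpfile"
print_ends "$tmpfile"
rm -f "$tmpfile"
```

Because read assigns any leftover words to the last variable, a line with more than four fields ends up with the extras in var4 rather than being dropped.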
|
a*********n Posts: 1331 | 16 This is my first time using the Heckman two-step to correct selection bias. I am using Stata 12, but I keep
getting an error. Can you help me see what is wrong?
error message "Dependent variable never censored because of selection: model
would simplify to OLS regression"
In my case, half of the cases are missing dependent variable information (outcome), so
it is a truncated situation. How should the dependent variable be coded in this case? Mine is 0=no,
1=yes, 9=truncated missing outcome. Is this coding right? Should I simply recode category 9
to . (missing)?
Also, my program is below; what exactly is wrong?
heckman outcome iv1 iv2 iv3, select(var1 var2 var3) twostep |
|
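w*****1 Posts: 473 | 17 [The following text is reposted from the Statistics board]
From: wz99331 (dotti), Board: Statistics
Subject: Question about proc transpose
Posted at: BBS 未名空间站 (Wed Oct 25 15:16:10 2017, US Eastern)
I want to use proc transpose to convert long data to wide data, but after the conversion the column names
become var1, var2 var3 var4...., rather than the original probe_id. I used prefix=probe_id,
and the column names became probe_id1, probe_id2..., rather than the original PROBE_ID values. I want the
column names after conversion to be ILMN_1762337, ILMN_2055271......
Below is part of the long data. From the third variable on are the sample names, and the data below them are gene
expression levels. There are several hundred samples and several hundred thousand probes in total.
PROBE_ID SYMBOL 5117-H471Fwk12-... |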
|
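s*****n Posts: 2174 | 19 1. names(data)[1] <- "newname" will do. If you don't like using a numeric index, you can also write
names(data)[names(data)=="var1"] <- "newname" or
names(data) <- gsub("var1", "newname", names(data)); either works.
2. What you describe has a condition: the BY variables must be the same. Consider a merge among data1, data2,
and data3, where data1 and data2 are matched on var1 and var2, while data1 and
data3 are matched on var3. In this kind of more complex merge, the BY variables between
each pair of datasets are not fixed. It is hard to define one function that handles multiple datasets unless
the function itself takes a great many parameters.
3. Besides SAS, does any other language have this notion of the "most recent data" that you mention?
Is it the most recently assigned (written) one, or the last one accessed (read)? For example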
data3 <- merge(data1, data2)
print(data2 |
|
b*******g Posts: 513 | 20 filename aa "path.\XX.csv";
data _null_;
set dataset1;
file aa dlm=",";
put var1 var2 var3...;
run;
maybe, this will help.
Good Luck! |
|
s****y Posts: 21 | 21 proc means data=...;
var ......;
output out=dataname sum(var1 var2 var3)=P01 P02 P03;
Run; |
|
a***r Posts: 420 | 22 I misunderstood; here is another rough attempt to get the ball rolling:
data a(keep=a);
input A $ 15. B C $;
datalines;
11/asdsd/890.00 89 gh
123/yuu/8.9 89 ji
;
run;
data a;
set a;
file "e:\temp.txt";
put a;
run;
data b;
infile "e:\temp.txt" dlm='/';
input var1 var2 $ var3;
run; |
|
a********s Posts: 188 | 23 library(reshape)
x <- data.frame(id=1:2, var1 = 1:2, var2=3:4, var3=5:6)
melt(x, id=c("id"))
This should meet the requirements of your example. |
|
w*****y Posts: 130 | 24 For example,
var1 var2 var3
a miss 1.1
a 5 miss
a 6 miss
b 5 0
b 5 0
b 7 miss
.
.
.
If variable 3 is greater than zero (1.1), then variable 3 should be set to 1.1 for all observations in that group.
Thanks! |
|
l******0 Posts: 313 | 25 Hello,
When I am doing logit regression using SAS, what is the difference between
creating dummy variables and using the CLASS statement for categorical data (say,
if I have 4 to 5 categories)?
When I am creating dummy variables, should I always use this command to set
all the variables to 0:
if var>. then do;
var1=0;
var2=0;
var3=0;
...
What does the "." mean? Why not 0 itself?
Thank you very much. |
|
p*****o Posts: 543 | 26 I have two datasets, 1 and 2, where 2 is a subset of 1. How can I use PROC SQL to create a new data
set = DATASET1 - DATASET2...?
DATASET1 has 10 variables (VAR1,VAR2,...,VAR10), and DATASET2 has three variables (VAR1,VAR2,
VAR3, corresponding to those in DATASET1).
I tried
PROC SQL;
SELECT * FROM DATASET1 EXCEPT SELECT VAR1 FROM DATASET2;
Is it the case that EXCEPT can only end up selecting one variable? (PROC SQL;
SELECT VAR1 FROM DATASET1 EXCEPT SELECT VAR1 FROM DATASET2;) |
|
f*******e Posts: 51 | 27 try this:
data countmiss;
input var1 var2 var3 var4 var5 var6;
cards;
0 0 0 0 0 1.2
0 0 0 0 5.8 4.7
58.8 0 0 30 0 33.3
100 0 0 100 0 66.6
;
run;
data _null_;
set countmiss;
array var(*) var1-var6;
call symput("list"||strip(_N_),"");
do i=1 to dim(var);
if var(i)>0 and var(i) <50 then do;
call symput("list"||strip(_N_),symget("list"||strip(_N_))||" "||"var"||strip(i));
end;
end;
run;
%put &list1 &list2 &list3 &list4; |
|
z****n Posts: 67 | 28 Thanks to the poster above for the pointer; changing the code above to the following makes it run!
data countmiss;
input var1 var2 var3 var4 var5 var6;
cards;
0 0 0 0 0 1.2
0 0 0 0 5.8 4.7
58.8 0 0 30 0 33.3
100 0 0 100 0 66.6
;
run;
data _null_;
set countmiss;
array v(*) var1-var6;
call symput("list"||compress(_N_),"");
do i=1 to dim(v);
if 0< v(i) <50 then do;
call symput("list"||compress(_N_),left(trim(symget("list"||compress(_N_))||" "||"var"||compress(i))));
end;
end;
run;
%put list1=&list1 ;
%put list2=&list2 ;
%put list3=&list3 ;
%put list4=&lis |
|
Y****a Posts: 243 | 29 transpose has a 'by' statement
proc transpose data=yourdatename out=outdataname;
var var1 - var3;
by Seller Year;
run;
proc transpose data=outdataname out=newdataname;
var col1;
by Seller;
run;
try something like this, you may need to drop a few variables |
|
b*t Posts: 489 | 30 In a data set, I want to combine two character variables into one.
For example, var1 = "a", var2="b", and I want to create a new variable
var3="ab". This is similar to the "B1&B2" command, but I just don't know
how to implement this in SAS.
Any suggestion is highly appreciated! |
|
s*******2 Posts: 791 | 33 proc means data=test noprint;
output out = summary(drop = _:)
mean(var1 var2) =
n(var1) =
median(var1 var3) =
mode(var1 var5) / autoname;
run; |
|
k*******a Posts: 772 | 34 Not sure about the data step, but it is easy with sql
proc sql;
create table new as
select var1,var2,var4,var3
from a;
quit; |
|
w*****e Posts: 806 | 35 DATA RENEW;
RETAIN VAR1 VAR2 VAR3 VAR4;
SET OLD;
RUN; |
|
y******6 Posts: 47 | 36 I have a question about mixed models too. For example, Var1 is location
(cities), Var2 is treatment, Var3 is Time (season), and Var4 is Year (2008, 2009,
2010). I want to make comparisons to see whether 2008, 2009 and 2010 have the
same mean or not within each treatment.
proc mixed data= method=type3;
where treatment='Treatment1';
class Location Time Year;
Model counts=time year time*year;
random location;
run;
Type 3 tests of fixed effects: year, p-value<0.05
While if I change the random from location to ... |
|
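p***r Posts: 920 | 37 purpose: I want to build a macro that calls display in a loop, showing different survey answers as text,
and then manually enter yes/no based on their content, for later data analysis.
problem: the whole macro runs; the only problem is that the display window does not show the
survey answer correctly when called in the loop. (I write all the survey answers out to a series of
macro variables.)
Any solution or suggestion for improving the code is appreciated.
The code is here: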
################################
data surveydata;
input x $40.;
cards;
This is programe is useless
I dont think so
Maybe its usefull
Not very much
;run;
%macro survey;
data _null_;
set surveydat... |
|
p***r Posts: 920 | 38 there are 4 columns in the table
CLASS WGT1 VAR2 VAR3 |
|
c******d Posts: 98 | 39 I think he needs the following:
keep var1 var2 var3;
* var is the variable you want to keep in output |
|
f*********8 Posts: 165 | 40 sample1 sample2 sample3....sample100
var1
var2
var3
.
.
.
var1000
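I want to quantile normalize these 100 samples to get a reference distribution. Later, when there are
new samples, I will use this reference distribution for quantile normalization.
May I ask the experts on this board how to produce this reference distribution? Could you give an R code example?
Many thanks. |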
|
S********a Posts: 359 | 41 model y=var1 var2 var3 /solution;
sorry for misleading you. |
|
e********6 Posts: 24 | 42 First sort by var3.
Use retain to keep the previous value, compare, and output when the condition is met. |
|
x*******u Posts: 500 | 43 my data:
var1 var2 var3 var4
2 4 6 7
4 9 7 6
5 2 1 1
How can I get a new variable whose value is the name of whichever of var1-var4 has the largest value?
The result should be
newvar
var4
var2
var1
Thanks |
|
d******9 Posts: 404 | 44 I used an array to do it, same results:
proc format;
value position
1='A'
2='B'
3='C'
4='D';
run;
data E(drop=I) ;
set A;
array X(4) A B C D;
do I=1 to 4;
if X(I)= max(A, B, C, D) then Position=I;
end;
Max_Var=put(position, position.);
run;
However, what if the values have ties? say:
var1 var2 var3 var4
9 9 3 4
9 2 9 8
9 9 9 9
???????? |
|
o****o Posts: 8077 | 45 data _xxx;
input var1 var2 var3 var4;
cards;
2 4 6 7
4 9 7 6
5 2 1 1
7 3 7 3
;
run;
proc transpose data=_xxx out=_xxx2;
run;
proc means data=_xxx2 noprint;
var col1-col4;
output out=_xxx3(keep=v1-v4)
maxid(col1(_name_)
col2(_name_)
col3(_name_)
col4(_NAME_))= v1-v4/autoname;
run;
proc transpose data=_xxx3 out=_xxx3t;
var v1-v4;
run;
d... |
|
x*******u Posts: 500 | 46 I tried it, and it is still the same. Could you post your code so I can take a look? Thanks.
Also, if I ignore the format of the data and compute a=(hour*60+min)*60+sec;
the error message is:
NOTE: Invalid numeric data, hour='.1.2.' , at line 663 column 8.
var1= 1 0 / 3 0 / 2 0 0 9 VAR2= 1 2 : 2 1 : 2 9 P M VAR3= 0 . 0 0 VAR4=
6 8 . 0 month= 1 0 day= 3 0 year= 2 0 0 9 AMPM= P M hour= 1 2 min= 2 1
sec= 2 9 var6= 1 2 : 2 1 : 2 9 a=. _ERROR_=1 _N_=2 |
|
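g****8 Posts: 2828 | 47 I was asking what kind of vars you have; the example you gave doesn't look right to me.
If these are frequencies, why are the totals of var1 and var2 different?
For example, do you have var1 var2 var3 gender and are tabulating by gender? |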
|
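a*****3 Posts: 601 | 48 What is x window??
Anyway, I think SAS has built a black box, and the documentation is unclear too. Take the simplest thing, format/informat:
what the difference between the two is and how they are implemented internally is left vague in the documentation.
I read it several times and still didn't get it. Here is the simplest possible question: how do you define
numeric/character informat/format = ? I'll offer 20 weibi (forum currency) to whoever can find the definition in the documentation.
Or take sorting: the mechanics are unclear as well. For example, once the sort is done, is a flag set in the dataset,
or is something done in the PDV? And then what are First. and Last. generated from?
Especially in a case like by var1 var2 var3 followed by a reference to First.Var2, the documentation says
nothing at all.
And then there are the datasets generated through ODS combined with various procs: what variables they contain and what the
naming rules are. I haven't studied the documentation on that yet, but I expect it is a mess too.
In short, this black box may be good, but it is just too black. |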
|
h********o Posts: 103 | 49 You can use an array like this:
==================================
data test;
input var1 var3 var2 var10 var6;
array temp(5) var1 -- var6;
array logvar(5);
do i = 1 to dim(temp);
logvar(i) = log(temp(i));
end;
drop i var1 -- var6;
cards;
1 2 3 4 5
; |
|
z**********i Posts: 88 | 50 I am trying to run linear regression with the factor as the dependent
variable. This factor was obtained from factor analysis.
________________________________________________________________________
1. There are 10 variables var1 through var10, each variable has 7 categories
. Below is the format:
1="Disagree very much"
2="Disagree"
3="Somewhat disagree"
4="Neutral"
5="Somewhat agree"
6="Agree"
7="Agree very much";
These 10 variables are highly correlated.
2. Factor analysis comes up with 3 fac... |
|