- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Hi,
I have a Variable_ID and 8 quarters of data. This data is the output of a proc compare and I have created a varlist based on required quarters. i need the number of non missing and max for the varlist. I have attached the data and the desired output.
I have tried in two different ways but none of them giving me the correct results. It is considering all the columns.
Code1:
data want; |
set date; |
Count=N(Varlist); |
Max=MAX(Varlist); |
Run; |
Code2:
Proc Sql; |
Select count(variable_id) INTO: count from date; quit; |
Data_null_; |
call symput('count1',left(count)); |
run; |
Proc Sql; |
Select varlist INTO: list1 :list&count. from date; quit; |
%macro dlt; |
data want; |
set date; |
%do I = 1 to &count1.; |
count=N(of &&list&I.); |
max=Max(of &&list&I.); |
%end; |
%mend; |
%dlt; |
Am i missing something or is there any other way to achieve the desired output? please help
Accepted Solutions
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
So you want to use the VARLIST as the names of the variables to check?
You can use the VVALUEX() function to find the value of variable whose name is contained in the string argument to the function.
You can use the INPUT() function to convert the result back into a number.
data expect;
input Variable_ID $ Count Max ;
cards;
1aw 2 3
5td 1 1
t6v 1 9
n4a 5 6
p24 1 2
;
data have;
infile cards truncover;
input variable_id $ _2014Q1-_2014Q4 _2015Q1-_2015Q4 Varlist $100.;
cards;
1aw 3 . 2 . 5 . . . _2014Q1,_2014Q2,_2014Q3,_2014Q4
5td . . . 1 . 8 . . _2014Q1,_2014Q2,_2014Q3,_2014Q4,_2015Q1
t6v 9 . . . . . . . _2014Q1,_2014Q2,_2014Q3,_2014Q4
n4a 2 6 5 4 4 . . . _2014Q1,_2014Q2,_2014Q3,_2014Q4,_2015Q1,_2015Q2,_2015Q3,_2015Q4
p24 . . . . . . . 2 _2015Q1,_2015Q2,_2015Q3,_2015Q4
;
data want;
set have ;
do i=1 to countw(varlist,',');
value=input(vvaluex(scan(varlist,i,',')),??32.);
if not missing(value) then do;
count=sum(count,1);
max=max(max,value);
end;
end;
drop value;
run;
proc compare data=want compare=expect;
run;
The COMPARE Procedure Comparison of WORK.WANT with WORK.EXPECT (Method=EXACT) Observation Summary Observation Base Compare First Obs 1 1 Last Obs 5 5 Number of Observations in Common: 5. Total Number of Observations Read from WORK.WANT: 5. Total Number of Observations Read from WORK.EXPECT: 5. Number of Observations with Some Compared Variables Unequal: 0. Number of Observations with All Compared Variables Equal: 5. NOTE: No unequal values were found. All values compared are exactly equal.
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Sorry i don't understand what you are trying to do. Please post code using the window opened by the running man icon and post data as data step with datalines.
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Data date;
Input variable_id $3. @5 _2014Q1 @7 _2014Q2 @9 _2014Q3 @11 _2014Q4 @13 _2015Q1 @15 _2015Q2 @17 _2015Q3 @19 _2015Q4 @21 Varlist $100.;
Datalines;
1aw 3 . 2 . 5 . . . _2014Q1,_2014Q2,_2014Q3,_2014Q4
5td . . . 1 . 8 . . _2014Q1,_2014Q2,_2014Q3,_2014Q4,_2015Q1
t6v 9 . . . . . . . _2014Q1,_2014Q2,_2014Q3,_2014Q4
n4a 2 6 5 4 4 . . . _2014Q1,_2014Q2,_2014Q3,_2014Q4,_2015Q1,_2015Q2,_2015Q3,_2015Q4
p24 . . . . . . . 2 _2015Q1,_2015Q2,_2015Q3,_2015Q4
;
run;
/**Code1**/
data want;
set date;
Count=N(Varlist);
Max=MAX(Varlist);
Run;
/*code 2*/
Proc Sql;
Select count(variable_id) INTO: count from date; quit;
Data_null_;
call symput('count1',left(count));
run;
Proc Sql;
Select varlist INTO: list1 :list&count. from date; quit;
%macro dlt;
data want;
set date;
%do I = 1 to &count1.;
count=N(of &&list&I.);
max=Max(of &&list&I.);
%end;
%mend;
%dlt;
/*Desired output*/
Variable_ID Count Max
1aw 2 3
5td 1 1
t6v 1 9
n4a 5 6
p24 1 2
as requested i have posted using insert SAS code option.
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Given that you have already created the data set DATE, I think this is what you are trying to accomplish:
data want;
set date;
array dates {8} _201: ;
count = 0;
do k=1 to 8;
if dates{k} > . and index(varlist, strip(vname(dates{k} ) ) ) then do;
count + 1;
maxval = max(maxval, dates{k});
end;
end;
drop k;
run;
It's untested code, but looks about right. Feel free to ask if you have questions.
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
@SAS_prep wrote:
it says ERROR: Too many variables defined for the dimension(s) specified for the array dates.
Do you have other variable names that also start with _201 ? Do you want to include them in the array? Or not?
Note that there is no need to include a dimension when you have given a list of variable names. SAS can count for you.
Some people use a * where the dimension is listed, but you don't even need to do that. Just use the array name and the variable list.
array dates _201: ;
You only need to specify the dimension(s) when for special situations. Like:
- to automatically name the variables by adding numeric suffix to the array name.
- Or create a temporary array.
- Or you want to use non-standard indexes.
- Either multiple dimensions
- or non-standard lower/upper bounds on the indexes.
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
So you want to use the VARLIST as the names of the variables to check?
You can use the VVALUEX() function to find the value of variable whose name is contained in the string argument to the function.
You can use the INPUT() function to convert the result back into a number.
data expect;
input Variable_ID $ Count Max ;
cards;
1aw 2 3
5td 1 1
t6v 1 9
n4a 5 6
p24 1 2
;
data have;
infile cards truncover;
input variable_id $ _2014Q1-_2014Q4 _2015Q1-_2015Q4 Varlist $100.;
cards;
1aw 3 . 2 . 5 . . . _2014Q1,_2014Q2,_2014Q3,_2014Q4
5td . . . 1 . 8 . . _2014Q1,_2014Q2,_2014Q3,_2014Q4,_2015Q1
t6v 9 . . . . . . . _2014Q1,_2014Q2,_2014Q3,_2014Q4
n4a 2 6 5 4 4 . . . _2014Q1,_2014Q2,_2014Q3,_2014Q4,_2015Q1,_2015Q2,_2015Q3,_2015Q4
p24 . . . . . . . 2 _2015Q1,_2015Q2,_2015Q3,_2015Q4
;
data want;
set have ;
do i=1 to countw(varlist,',');
value=input(vvaluex(scan(varlist,i,',')),??32.);
if not missing(value) then do;
count=sum(count,1);
max=max(max,value);
end;
end;
drop value;
run;
proc compare data=want compare=expect;
run;
The COMPARE Procedure Comparison of WORK.WANT with WORK.EXPECT (Method=EXACT) Observation Summary Observation Base Compare First Obs 1 1 Last Obs 5 5 Number of Observations in Common: 5. Total Number of Observations Read from WORK.WANT: 5. Total Number of Observations Read from WORK.EXPECT: 5. Number of Observations with Some Compared Variables Unequal: 0. Number of Observations with All Compared Variables Equal: 5. NOTE: No unequal values were found. All values compared are exactly equal.
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
maxvar = vname(v[whichn(max, of v(*))]); will work?
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
@SAS_prep wrote:
how do i get variable name for the column max?
maxvar = vname(v[whichn(max, of v(*))]); will work?
That should work. But it will search for matching max value over the whole array. That is it will NOT be limited to just the variables listed in the value of VARLIST for this particular observation.
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
_2014Q1 _2014Q2 _2014Q3 _2014Q4 _2015Q1
210.48 82.16 1148.04 10.34 12.77
the max value for the above is 1148.04 when i run the code it is giving 210.48
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Does below code give you what you're after?
Data have;
infile datalines truncover;
Input variable_id $3. @5 _2014Q1 @7 _2014Q2 @9 _2014Q3 @11 _2014Q4 @13 _2015Q1 @15 _2015Q2 @17 _2015Q3 @19 _2015Q4 @21 Varlist $100.;
Datalines;
1aw 3 . 2 . 5 . . . _2014Q1,_2014Q2,_2014Q3,_2014Q4
5td . . . 1 . 8 . . _2014Q1,_2014Q2,_2014Q3,_2014Q4,_2015Q1
t6v 9 . . . . . . . _2014Q1,_2014Q2,_2014Q3,_2014Q4
n4a 2 6 5 4 4 . . . _2014Q1,_2014Q2,_2014Q3,_2014Q4,_2015Q1,_2015Q2,_2015Q3,_2015Q4
p24 . . . . . . . 2 _2015Q1,_2015Q2,_2015Q3,_2015Q4
;
run;
data want(drop=_i);
set have;
length count_vars 8 max_val 8 max_varName $32;
array dtvars {*} _201:;
count_vars=countw(varlist,',');
do _i=1 to dim(dtvars);
if find(varlist,vname(dtvars[_i]),'i')>0 then
do;
if max_val<dtvars[_i] then
do;
max_val=dtvars[_i];
max_varName=vname(dtvars[_i]);
end;
end;
end;
run;
proc print data=want;
run;
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Code1: data want; set date; Count=N(Varlist); Max=MAX(Varlist); Run;
Instead of the above, try with the following:
data want; set date; count = n(of _:); max = max(of _:); *put count = max =; run;