Hi,
I'm trying to output the minimum aval record and then a maximum aval record into the existing dataset.
For example lets say i have this set of data:
subjid paramcd aval
1 ALT 34
1 ALT 39
1 ALT 28
1 ALT 28
1 ALT 40
1 ALT 40
1 APT 15
1 APT 15
1 APT 15
1 APT 9
1 APT 9
1 APT 15
1 APT 12
1 ABC 9
1 ABC 9
1 ABC 9
1 ABC 9
I want to create the following dataset as:
subjid paramcd aval dtype
1 ALT 28
1 ALT 28
1 ALT 34
1 ALT 39
1 ALT 40
1 ALT 40
1 ALT 28 MINIMUM
1 ALT 40 MAXIMUM
1 APT 9
1 APT 9
1 APT 12
1 APT 15
1 APT 15
1 APT 15
1 APT 15
1 APT 9 MINIMUM
1 ALT 9 MAXIMUM
1 ABC 9
1 ABC 9
1 ABC 9
1 ABC 9
1 ABC 9 MINIMUM
1 ABC 9 MAXIMUM
This is the current code i am using. First i'm sorting the dataset with
proc sort data=have;
by subjid paramcd aval;
run;
then doing this
data want;
set have;
by subjid paramcd;
if first.paramcd then DTYPE="MINIMUM"; output;
if last.paramcd then DTYPE="MAXIMUM"; output;
run;
However, it keeps giving me two Minimums as the first record gets minimum and then the second record that is being outputted as minimum. As I only want 2 new records that just state minimum and maximum and that's it.
Thank you in advance.
Please try this
data have;
input Row SUBJID $ VISIT $ AVISIT $ ADY PARAMCD $ AVAL ;
cards;
1 1001 Week_1 Week_1 6 SUPINE 130
2 1001 Week_2 Week_2 12 SUPINE 133
3 1001 Week_3 Week_3 18 SUPINE 131
4 1001 Week_1 Week_1 6 ALT 8
5 1001 Week_2 Week_2 12 ALT 8
6 1001 Week_3 Week_3 18 ALT 8
7 1001 Week_1 Week_1 6 APC 5
;
data want;
do _n_=1 by 1 until(last.PARAMCD);
set have;
by SUBJID PARAMCD notsorted;
if first.PARAMCD or aval<_min then do;
_minv=visit;
_min=aval;
end;
if aval>=_max then do;
_maxv=visit;
_max=aval;
end;
output;
end;
array m _maxv _minv;
do _n_=1 to _n_;
set have;
if visit in m then do;
dtype=ifc(_minv=visit,'min','max');
output;
end;
end;
if _n_=2 then do;
dtype='max';
output;
end;
drop _:;
run;
Hi @Dregerator Are you sure about this?
1 APT 9 MINIMUM
1 ALT 9 MAXIMUM
That doesn't make sense to me.
data have;
input subjid paramcd $ aval;
cards;
1 ALT 34
1 ALT 39
1 ALT 28
1 ALT 28
1 ALT 40
1 ALT 40
1 APT 15
1 APT 15
1 APT 15
1 APT 9
1 APT 9
1 APT 15
1 APT 12
1 ABC 9
1 ABC 9
1 ABC 9
1 ABC 9
;
data want;
do until(last.paramcd);
set have ;
by subjid paramcd notsorted;
_min=min(_min,aval);
_max=max(_max,aval);
output;
end;
aval=_min;
do dtype='min','max';
output;
aval=_max;
end;
drop _:;
run;
subjid | paramcd | aval | dtype |
---|---|---|---|
1 | ALT | 34 | |
1 | ALT | 39 | |
1 | ALT | 28 | |
1 | ALT | 28 | |
1 | ALT | 40 | |
1 | ALT | 40 | |
1 | ALT | 28 | min |
1 | ALT | 40 | max |
1 | APT | 15 | |
1 | APT | 15 | |
1 | APT | 15 | |
1 | APT | 9 | |
1 | APT | 9 | |
1 | APT | 15 | |
1 | APT | 12 | |
1 | APT | 9 | min |
1 | APT | 15 | max |
1 | ABC | 9 | |
1 | ABC | 9 | |
1 | ABC | 9 | |
1 | ABC | 9 | |
1 | ABC | 9 | min |
1 | ABC | 9 | max |
This is exactly what I want to do and I believe this is my fault of making an incomplete dataset as I tried to simplify what i wanted. So your code actually does what I want but, it's not copying the entire row...for example I have 49 different variables in this dataset, however, when you're outputting the minimum value, it's copying everything that is in the maximum row:
For example: lets say I have this:
Row SUBJID VISIT AVISIT ADY PARAMCD AVAL
1 1001 Week_1 Week_1 6 SUPINE 130
2 1001 Week_2 Week_2 12 SUPINE 133
3 1001 Week_3 Week_3 18 SUPINE 131
4 1001 Week_1 Week_1 6 ALT 8
5 1001 Week_2 Week_2 12 ALT 8
6 1001 Week_3 Week_3 18 ALT 8
7 1001 Week_1 Week_1 6 APC 5
I want this:
Row SUBJID VISIT AVISIT ADY PARAMCD AVAL DTYPE
1 1001 Week_1 Week_1 6 SUPINE 130
2 1001 Week_2 Week_2 12 SUPINE 133
3 1001 Week_3 Week_3 18 SUPINE 131
4 1001 Week_1 Week_1 6 SUPINE 130 MINIMUM
5 1001 Week_2 Week_2 12 SUPINE 133 MAXIMUM
6 1001 Week_1 Week_1 6 ALT 8
7 1001 Week_2 Week_2 12 ALT 8
8 1001 Week_3 Week_3 18 ALT 8
9 1001 Week_1 Week_1 6 ALT 8 MINIMUM
10 1001 Week_3 Week_3 12 ALT 8 MAXIMUM
11 1001 Week_1 Week_1 6 APC 5
12 1001 Week_1 Week_1 6 APC 5 MINIMUM
13 1001 Week_1 Week_1 6 APC 5 MAXIMUM
The current code due to my lack of providing more context to the data creates the minimum line using all the data that is in the maximum line except for the aval value.
So your code is creating this:
Row SUBJID VISIT AVISIT ADY PARAMCD AVAL DTYPE
1 1001 Week_1 Week_1 6 SUPINE 130
2 1001 Week_2 Week_2 12 SUPINE 133
3 1001 Week_3 Week_3 18 SUPINE 131
4 1001 Week_2 Week_2 12 SUPINE 130 MINIMUM
5 1001 Week_2 Week_2 12 SUPINE 133 MAXIMUM
Please try this
data have;
input Row SUBJID $ VISIT $ AVISIT $ ADY PARAMCD $ AVAL ;
cards;
1 1001 Week_1 Week_1 6 SUPINE 130
2 1001 Week_2 Week_2 12 SUPINE 133
3 1001 Week_3 Week_3 18 SUPINE 131
4 1001 Week_1 Week_1 6 ALT 8
5 1001 Week_2 Week_2 12 ALT 8
6 1001 Week_3 Week_3 18 ALT 8
7 1001 Week_1 Week_1 6 APC 5
;
data want;
do _n_=1 by 1 until(last.PARAMCD);
set have;
by SUBJID PARAMCD notsorted;
if first.PARAMCD or aval<_min then do;
_minv=visit;
_min=aval;
end;
if aval>=_max then do;
_maxv=visit;
_max=aval;
end;
output;
end;
array m _maxv _minv;
do _n_=1 to _n_;
set have;
if visit in m then do;
dtype=ifc(_minv=visit,'min','max');
output;
end;
end;
if _n_=2 then do;
dtype='max';
output;
end;
drop _:;
run;
It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.