BookmarkSubscribeRSS Feed
deleted_user
Not applicable
My table contains 8 variables: ID, VAR1-VAR7.
VAR5 is a character variable and the others are numerical.
Each ID have several records. I want to have only one record for each ID in the following way:

VAR1 = MIN(VAR1)
MIN(VAR3) should be assigned to a new variable, lets say AVAR3=MIN(VAR3).

If VAR4=1 or VAR4=2 in ANY of ID:s records, then I should find the MAX(VAR3) among those. That value should be assigned to a new variable, lets say BVAR3=MAX(VAR3) and VAR2, VAR4, VAR6 and VAR7 should be kept from the same record.

If VAR4~=1 and VAR4~=2 in ALL of ID:s records, then I should find the MAX(VAR2) among those. The assignment to BVAR3 then should be BVAR3=VAR3 from that record and VAR2, VAR4, VAR6 and VAR7 should be kept from the same record.

If ANY of ID:s VAR5='A' or 'B' then I would like to assign the value 1 to the new numerical variable AVAR5, otherwise AVAR5 should have the value 0.

Thanks for any help, with this rather complicated problem.
3 REPLIES 3
CurtisMack
Fluorite | Level 6
Without sample data I couldn't write test this, so consider it pseudo code.

proc sort data = mytable;
by ID;
run;
proc sql;
create table Var3ORVar2 as
select ID,MAX(VAR4 = 1 or VAR4 = 2) as VAR4_1OR2,
max(var3) as BVAR3,
max(var2) as MaxVar2,
(VAR5 in ('A','B') as AVAR5
from mytable
group by ID;
quit;
data results(keep = ID BVAR3 AVAR5 VAR2 VAR4 VAR6 VAR7);
merge mytable Var3ORVar2;
by ID;
if VAR4_1OR2 then do;
if var3 = MaxVar3 then output;
end;
else do;
BVAR3 = var3;
if var2 = MaxVar2 then output;
end;
run;

It could also be done in a single pass through the data using arrays or hashes, but I think the two step approach is best in this case.

Curtis
deleted_user
Not applicable
I don't understand why you have MAX(VAR4=1 or VAR4=2).

VAR4 is a numerical variabel that can have any integer value. Depending on if VAR4 is in {1,2} in any of ID:s records or not, different actions should be taken.

If VAR4 had been a character variable with possible values {q,w,e,r,t,y} and if depending on if VAR4 had been in {q,w} or not, you should have made different selections, how would you then replace MAX? The answer to that question could possibly clarify your use of MAX(VAR4=1 or VAR4=2).

The VAR1=MIN(VAR1) is missing. Perhaps i should introduce a new variable AVAR1=MIN(VAR1).

I would also like to sum the variable VAR7 from records where VAR3 le BVAR3; BVAR3 assigned depending on the value of VAR4, as described in my first message.

Thanks for your help.
deleted_user
Not applicable
Due to the complexity of your conditions, I would recommend you NOT use proc sql, but use a data step.

proc sort data=table; by id;
data condensed;
set table;
by id;

retain min_var1 min_var3 max_var3 flag rem2a rem4a rem6a rem7a max_var2 rem3 rem2b rem4b rem6b rem7b avar5;

if first.id then do;
min_var1 = var1;
min_var3 = var3;
max_var3 = var3;
flag = 0;
rem2 = var2;
rem4 = var4;
rem6 = var6;
rem7 = var7;
max_var2 = var2;
rem3 = var3;
avar5 = 0;
end;

if var1 < min_var1 then min_var1 = var1;
if var3 < min_var3 then min_var3 = var3;

if var4 = 1 or var4 = 2 then do;
flag=1;
if var3 > max_var3 then max_var3 = var3;
rem2a = var2;
rem4a = var4;
rem6a = var6;
rem7a = var7;
end;

if var2 > max_var2 then do;
rem3 = var3;
rem2b = var2;
rem4b = var4;
rem6b = var6;
rem7b = var7;
end;

if var5 = 'A' or var5 = 'B' then avar5 = 1;

if last.id then do;
var1 = min_var1;
avar3 = min_var3;
if flag = 1 then do;
bvar3 = max_var3;
var2 = rem2a;
var4 = rem4a;
var6 = rem6a;
var7 = rem7a;
output;
end;
if flag = 0 then do;
bvar3 = rem3;
var2 = rem2b;
var4 = rem4b;
var6 = rem6b;
var7 = rem7b;
output;
end;
end;
drop min_var1 min_var3 max_var3 rem2a rem4a rem6a rem7a max_var2 rem3 rem2b rem4b rem6b rem7b;
run;

The beauty of SAS is that you have so many ways to work with your data. You have the choice of using SQL when it is appropriate -- for joins, selects, database access, etc. -- and data steps when it is appropriate -- reading files, parsing, complicated data selections and summaries like this case, data groupings difficult to implement in SQL, etc.

"Use the right tool for the job" it's easier, quicker and the results are soooooo much better.

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

What is Bayesian Analysis?

Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 3 replies
  • 683 views
  • 0 likes
  • 2 in conversation