BookmarkSubscribeRSS Feed
☑ This topic is solved. Need further help from the community? Please sign in and ask a new question.
SAS-questioner
Obsidian | Level 7

Hi, the situation is little bit complicated, so the title might not describe it well. I have two dataset with same subjects, one dataset is cleaned with one row and one unique subject ID and group variables, like below:

ID

Group 1

Group 2

Group 3

1

1

0

0

2

0

1

0

3

0

0

1

4

0

0

1

and other one is not cleaned with multiple same IDs with a condition variable, like below:

ID

Condition

1

Long strings example 1

2

Short string example 2

3

Long strings example 1

3

Long strings example 1

3

Short strings example 2

4

Long strings example 1

4

Short strings example 2

Original condition contains really long strings so I used "long strings example 1" and "short string example  2" as examples. Generally, if the ID is in the group 1, this subject is "long strings example 1" only; if the ID in the group 2, this subject is "short strings example 2" only; if the ID in the group 3, this subject has both "long strings example 1" and "short strings example 2" several times. 

 

Here is what I want to do, I only want two groups: group 1 and group 2. Group 3 here just is used to mark this ID has both "long" and "short" several times(but I don't know how many times for each "long" and "short"). So if the ID has 2 or more "long strings example 1", I want to divide it into group 1. Here, because ID 3 has more than 2 "long"s, the ID 3 should be divided into group 1. And if ID has one "short strings example 2" and the "long strings example 1" is less than 2, then this ID is belongs to group 2. Here, because ID 4 has only 1 "long", and one "short", it belongs to group 2. So, I need to work on the two datasets together.

 

Eventually, the dataset should look like belows:

ID

Group 1

Group 2

Group 3

Final group

1

1

0

0

1

2

0

1

0

0

3

0

0

1

1

4

0

0

1

0

 

In the final group, 1 is group 1, and 0 is group 2.

 

Could anyone helps me with it? Hopefully, I made myself clearly. Thank you!

1 ACCEPTED SOLUTION

Accepted Solutions
Ksharp
Super User

Plz Make a date step for your data.

Nobody would like to type it for you .

and Nobody would like to answer this kind of question.

 

data have;
infile cards truncover;
input #1 ID #2 Condition $40. ;
cards;
1
Long strings example 1
2
Short string example 2
3
Long strings example 1
3
Long strings example 1
3
Short strings example 2
4
Long strings example 1
4
Short strings example 2
;



data short long;
 set have;
 if condition=:'Long' then output long;
 if condition=:'Short' then output short;
run;
data temp;
ina=0;inb=0;
 merge long(in=ina) short(in=inb);
 by id;
long=ina;
short=inb;
run;
proc summary data=temp;
by id;
var long short;
output out=temp2(drop=_:) sum=;
run;
data want;
 set temp2;
 if long>short then final_group=1;
  else final_group=0;
keep id final_group;
run;

 

 

View solution in original post

4 REPLIES 4
Ksharp
Super User

Plz Make a date step for your data.

Nobody would like to type it for you .

and Nobody would like to answer this kind of question.

 

data have;
infile cards truncover;
input #1 ID #2 Condition $40. ;
cards;
1
Long strings example 1
2
Short string example 2
3
Long strings example 1
3
Long strings example 1
3
Short strings example 2
4
Long strings example 1
4
Short strings example 2
;



data short long;
 set have;
 if condition=:'Long' then output long;
 if condition=:'Short' then output short;
run;
data temp;
ina=0;inb=0;
 merge long(in=ina) short(in=inb);
 by id;
long=ina;
short=inb;
run;
proc summary data=temp;
by id;
var long short;
output out=temp2(drop=_:) sum=;
run;
data want;
 set temp2;
 if long>short then final_group=1;
  else final_group=0;
keep id final_group;
run;

 

 

SAS-questioner
Obsidian | Level 7

Work as charm, thank you! And I will make a data step later in the future if I have similar questions.

FreelanceReinh
Jade | Level 19

Hi @SAS-questioner,

 

Here's another suggestion (using abbreviated "long" and "short" strings).

/* Create sample data for demonstration */

data have;
input id cond $ @@;
cards;
1 L 2 S 3 L 3 L 3 S 4 L 4 S 5 X
;

/* Aggregate on ID level and assign IDs to groups */

data want;
do until(last.id);
  set have;
  by id;
  L=sum(L,cond='L');
  S=sum(S,cond='S');
end;
group1 =  L & ~S;
group2 = ~L &  S;
group3 =  L &  S;
Fgroup = group1 | L>=2;
if sum(of g:)=0 then Fgroup=.;
keep id f: g:;
run;
Tom
Super User Tom
Super User

Is the question how to make the first dataset from the second?

Do you know the conditions to test for in advance?  If so the coding is simple.

data have;
  input ID Condition $40.;
cards;
1 Long strings example 1
2 Short strings example 2
3 Long strings example 1
3 Long strings example 1
3 Short strings example 2
4 Long strings example 1
4 Short strings example 2
;

data want;
  do until(last.id);
    set have;
    by id;
    if condition='Long strings example 1' then group1=1;
    else if condition='Short strings example 2' then group2=1;
  end;
  group3= (group1 and group2);
  group1=group1 and not group3;
  group2=group2 and not group3;
  drop condition;
run;

SAS Innovate 2025: Save the Date

 SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!

Save the date!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 4 replies
  • 632 views
  • 1 like
  • 4 in conversation