BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
P5C768
Obsidian | Level 7

Trying to convert a table in this format:

 

ID ON_OFF CATEGORY Group
1 OFF P A
1 ON G A
2 OFF C A
2 OFF C B
3 OFF O A
3 OFF R B
4 OFF C A
4 ON G B
5 OFF S A
5 OFF R A
5 ON G C
6 ON S A
6 ON S B
6 ON R A

 

To a single record per id table in this format:

 

ID ON_OFF CATEGORY Group
1 OFF_ON P_G A
2 OFF C A_B
3 OFF O_R A_B
4 OFF_ON C_G A_B
5 OFF_ON S_R_G A_A_C
6 ON S_R A_B_A

 

Note that in some cases, I just need the unique instances of the variable (such as with On_Off) but in other cases I need every instance (such as with Group).  I have been trying first./last. processing with by statements, but I can only get that to work with one variable at a time.  Is there a way to do this for N variables within the same data step?  Any insights are greatly appreciated!

 

Data Want;

	set Have;
	length cat source $500; 
	retain cat source;
	by ID notsorted category on_off;

	if first.ID then call missing(cat);
	if first.category then cat=catx('_',cat,category);

	if first.ID then call missing(source);
	if first.on_off then source=catx('_',source,on_off);

	if last.;

run;
1 ACCEPTED SOLUTION

Accepted Solutions
SuryaKiran
Meteorite | Level 14

Just DO UNTIL(last.ID) will be required to achieve this

Data have;
infile datalines dlm=',' dsd missover;
input ID	ON_OFF :$8.	CATEGORY :$8.	Group:$8.;
datalines;
1,OFF,P,A
1,ON,G,A
2,OFF,C,A
2,OFF,C,B
3,OFF,O,A
3,OFF,R,B
4,OFF,C,A
4,ON,G,B
5,OFF,S,A
5,OFF,R,A
5,ON,G,C
6,ON,S,A
6,ON,S,B
6,ON,R,A
;
run;

proc sort data=have;
by id;
run;

data want(keep=id ON_OFF CATEGORY  Group);
Format ON_OFF CATEGORY  Group $15.;
do until(last.id);
set have(rename=( ON_OFF=ON_OFF_ CATEGORY=CATEGORY_ Group=Group_));
by id;
ON_OFF=tranwrd(Strip(strip(ON_OFF)||' '||strip(ON_OFF_))," ","_");
CATEGORY=tranwrd(strip(strip(CATEGORY)||' '||strip(CATEGORY_))," ","_");
Group=tranwrd(strip(strip(Group)||' '||strip(Group_))," ","_");

end;
run;

 

Alternatively,

data test(drop=category_ Group_ ON_OFF_);
Format ON_OFF CATEGORY  Group $15.;
retain  on_off category group;
set have(rename=( ON_OFF=ON_OFF_ CATEGORY=CATEGORY_ Group=Group_));
by id;
if first.id then do;
			ON_OFF=" ";
			CATEGORY=" ";
			Group=" ";
			end;
on_off=catx("_",on_off_,on_off);
CATEGORY=catx("_",CATEGORY,CATEGORY_);
Group=catx('_',Group,Group_);
if last.id;
run;

 

Thanks,
Suryakiran

View solution in original post

5 REPLIES 5
SuryaKiran
Meteorite | Level 14

Just DO UNTIL(last.ID) will be required to achieve this

Data have;
infile datalines dlm=',' dsd missover;
input ID	ON_OFF :$8.	CATEGORY :$8.	Group:$8.;
datalines;
1,OFF,P,A
1,ON,G,A
2,OFF,C,A
2,OFF,C,B
3,OFF,O,A
3,OFF,R,B
4,OFF,C,A
4,ON,G,B
5,OFF,S,A
5,OFF,R,A
5,ON,G,C
6,ON,S,A
6,ON,S,B
6,ON,R,A
;
run;

proc sort data=have;
by id;
run;

data want(keep=id ON_OFF CATEGORY  Group);
Format ON_OFF CATEGORY  Group $15.;
do until(last.id);
set have(rename=( ON_OFF=ON_OFF_ CATEGORY=CATEGORY_ Group=Group_));
by id;
ON_OFF=tranwrd(Strip(strip(ON_OFF)||' '||strip(ON_OFF_))," ","_");
CATEGORY=tranwrd(strip(strip(CATEGORY)||' '||strip(CATEGORY_))," ","_");
Group=tranwrd(strip(strip(Group)||' '||strip(Group_))," ","_");

end;
run;

 

Alternatively,

data test(drop=category_ Group_ ON_OFF_);
Format ON_OFF CATEGORY  Group $15.;
retain  on_off category group;
set have(rename=( ON_OFF=ON_OFF_ CATEGORY=CATEGORY_ Group=Group_));
by id;
if first.id then do;
			ON_OFF=" ";
			CATEGORY=" ";
			Group=" ";
			end;
on_off=catx("_",on_off_,on_off);
CATEGORY=catx("_",CATEGORY,CATEGORY_);
Group=catx('_',Group,Group_);
if last.id;
run;

 

Thanks,
Suryakiran
Ksharp
Super User
Data have;
infile datalines dlm=',' dsd missover;
input ID	ON_OFF :$8.	CATEGORY :$8.	Group:$8.;
datalines;
1,OFF,P,A
1,ON,G,A
2,OFF,C,A
2,OFF,C,B
3,OFF,O,A
3,OFF,R,B
4,OFF,C,A
4,ON,G,B
5,OFF,S,A
5,OFF,R,A
5,ON,G,C
6,ON,S,A
6,ON,S,B
6,ON,R,A
;
run;
data want;
 array x1{999} $ 100 _temporary_;
 array x2{999} $ 100 _temporary_;
 array x3{999} $ 100 _temporary_;
 call missing(of x1{*} x2{*} x3{*});
 i1=0;i2=0;i3=0;
 do until(last.id);
    set have;
    by id;
    if ON_OFF not in x1 then do; i1+1;x1{i1}=ON_OFF;end;
	if CATEGORY not in x2 then do; i2+1;x2{i2}=CATEGORY;end;
    i3+1;x3{i3}=Group;
 end;
new_ON_OFF=catx('_',of x1{*});
new_CATEGORY=catx('_',of x2{*});
new_Group=catx('_',of x3{*});
drop ON_OFF CATEGORY Group i1 i2 i3;
run;
Jagadishkatam
Amethyst | Level 16

Alternatively you could try the below code

 


data want;
length new_on_off new_CATEGORY new_group $100;
set have;
by id ;
retain new_on_off new_CATEGORY new_group;
if first.id then new_on_off='';
if first.id then new_CATEGORY='';
if first.id then new_group='';
if on_off ne '' and new_on_off ne on_off then new_on_off=catx('_',new_on_off,on_off);
if CATEGORY ne '' and new_CATEGORY ne CATEGORY then new_CATEGORY=catx('_',new_CATEGORY,CATEGORY);
if Group ne '' and new_group ne Group then new_group=catx('_',new_group,Group);
if last.id;
keep new_on_off new_CATEGORY new_group;
run;

 

 

Thanks,
Jag
Tom
Super User Tom
Super User

For the cases where you just need the unique values do you need them in any order?  The order the first appear?  Lexical order?

You can use @Ksharp 's trick of temporary arrays to support all of your combinations.  Then if you did want the distinct values ordered you could add a line to use CALL SORTC() to sort the distinct values.  So for example to list the CATEGORY values in lexical order and the ON_OFF values in whatever order they appear you program might look like this:

 

data want;
  array x1(999) $8 _temporary_;
  array x2(999) $8 _temporary_;
  array x3(999) $8 _temporary_;
  call missing(of x1(*) x2(*) x3(*));
  do i=1 by 1 until(last.id);
    set have;
    by id;
    if ON_OFF not in x1 then x1(i)=ON_OFF ;
    if CATEGORY not in x2 then x2(i)=CATEGORY ;
    x3(i)=Group;
  end;
  length new_ON_OFF new_CATEGORY new_GROUP $200;
  call sortc(of x2(*));
  new_ON_OFF=catx('_',of x1(*));
  new_CATEGORY=catx('_',of x2(*));
  new_Group=catx('_',of x3(*));
  drop ON_OFF CATEGORY Group i ;
run;
jdwaterman91
Obsidian | Level 7
Data Want;
Set Have;
Retain On_Off2 Cat2 Group2 Count;
By Id Notsorted;

Groups = Lag(Group);

If First.Id Then
        Do;
                On_Off2 = On_Off;
                Cat2 = Category;
                Group2 = Group;
                Count = 1;
        End;

Else
        Do;
                If Group ^= Groups Then Count + 1;
                If On_Off ^= On_Off2 Then On_Off2 = Catx('_', On_Off2, On_Off);
                If Category ^= Cat2  Then Cat2 = Catx('_', Cat2, Category);
                Group2 = Catx('_', Group2, Group);
        End;

If Last.Id;

If Count = 1 Then Group2 = Substr(Group2, 1, 1);

Keep ID On_Off2 Cat2 Group2;
Rename On_Off2 = On_Off Cat2 = Category Group2 = Group;

Run;

sas-innovate-2024.png

Available on demand!

Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 5 replies
  • 3282 views
  • 0 likes
  • 6 in conversation