BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
alagiejatta
Calcite | Level 5

I have a data set like this

ID  vr              bd                fx                  gh                  
A111/2/1999. 6/4/2012       8/1/2021
A123/5/2021...
A133/5/2000..4/17/1988
A1410/24/195612/2/19671/6/2001.
A157/1/199611/7/1990   .7/21/2001

 

I wanted a table like this. Varaibles are selected based dates values- earliest date will fill slit1 and latest date fill slot 4. Missing dates are ignored.

ID  vr              bd                fx                  gh                  yzslot1slot2slot3slot4
A111/2/1999. 8/1/20219/23/2019vrghyz 
A123/5/2021...3/5/2000yzvr  
A133/5/2000..4/17/1988.ghvr  
A1410/24/195612/2/19671/6/2001..fxbdvr 
A157/1/199611/7/1990   .7/21/20016/10/2000bdvryzgh

 

 

1 ACCEPTED SOLUTION

Accepted Solutions
Ksharp
Super User
data have;
infile cards expandtabs;
input ID	$ ( vr              	bd               	 fx                  	gh) (:mmddyy12.); 
format vr bd fx gh mmddyy10.;
cards;                 
A11	1/2/1999	.	 6/4/2012       	8/1/2021
A12	3/5/2021	.	.	.
A13	3/5/2000	.	.	4/17/1988
A14	10/24/1956	12/2/1967	1/6/2001	.
A15	7/1/1996	11/7/1990   	.	7/21/2001
;


data want;
if _n_=1 then do;
 length date 8 vname $ 40;
 declare hash h(multidata:'y',ordered:'y');
 declare hiter hi('h');
 h.definekey('date');
 h.definedata('vname');
 h.definedone();
end;
 set have;
 array x{*}  vr bd  fx gh;
 array y{*} $ 40 slot1-slot4;
 do i=1 to dim(x);
   date=x{i};vname=vname(x{i});
   if not missing(date) then h.add();
 end;
 do i=1 by 1 while(hi.next()=0);
   y{i}=vname;
 end;
 h.clear();
 drop i date vname;
run;

proc print;run;

View solution in original post

11 REPLIES 11
Reeza
Super User
What happens if someone has a test in all 5 categories?
Do you need exactly 5 or does this need to scale?
What are you planning to do with this later on?

I'm leaning very strongly to suggesting whatever you're doing is likely easier using a long format such as:

ID TEST DATE
A11 VR 1/2/1999
A11 GH 8/1/2021
A12 VR 3/5/2021
A13 VR 3/5/2000
....
alagiejatta
Calcite | Level 5

If someone has test in all five then all the slots will be filled. I am sorry I forget to include slot5 in the output sample

PaigeMiller
Diamond | Level 26

@alagiejatta wrote:

If someone has test in all five then all the slots will be filled. I am sorry I forget to include slot5 in the output sample


So, @Reeza asked "What are you planning to do with this later on?" I don't see an answer. After you re-arrange the data, what analysis or reporting are you going to do? Knowing this information can really help devise the best method of handling this data.

--
Paige Miller
alagiejatta
Calcite | Level 5
So the final product is to show a sequence of event. I will submit a file the shows which event occur first, second, third etc by ID. For A11 the final answer is VR-GH-YZ
PaigeMiller
Diamond | Level 26

The long data set as suggested by @Reeza will do that, and should be much easier to program.

--
Paige Miller
Kurt_Bremser
Super User

@alagiejatta wrote:
So the final product is to show a sequence of event. I will submit a file the shows which event occur first, second, third etc by ID. For A11 the final answer is VR-GH-YZ

So you actually do not need the wide layout, you can use the long dataset after sorting.

Reeza
Super User

Convert it to a long structure as I suggested, possibly it started out that way?

 

Either way, look at the options here:

https://gist.github.com/statgeek/d583cfa992bf56da51d435165b07e96a

Kurt_Bremser
Super User

See here the whole code, from converting to the long layout to a report presenting the data in wide form:

data have;
infile datalines dlm="09"x dsd truncover;
input ID $ (vr bd fx gh) (:mmddyy10.);
format vr bd fx gh yymmdd10.;
datalines;
A11	1/2/1999	.	 6/4/2012       	8/1/2021
A12	3/5/2021	.	.	.
A13	3/5/2000	.	.	4/17/1988
A14	10/24/1956	12/2/1967	1/6/2001	.
A15	7/1/1996	11/7/1990   	.	7/21/2001
;

proc transpose
  data=have
  out=long (
   rename=(col1=date)
   where=(date ne .)
  )
;
by ID;
var vr--gh;
run;

proc sort data=long;
by id date;
run;

data want;
set long;
by id;
if first.id
then ct = 1;
else ct + 1;
slot = cats("slot",ct);
drop ct;
run;

proc report data=want;
column id _name_,slot n;
define id / group;
define _name_ / "" display;
define n / noprint;
define slot / "" across;
run;
Ksharp
Super User
data have;
infile cards expandtabs;
input ID	$ ( vr              	bd               	 fx                  	gh) (:mmddyy12.); 
format vr bd fx gh mmddyy10.;
cards;                 
A11	1/2/1999	.	 6/4/2012       	8/1/2021
A12	3/5/2021	.	.	.
A13	3/5/2000	.	.	4/17/1988
A14	10/24/1956	12/2/1967	1/6/2001	.
A15	7/1/1996	11/7/1990   	.	7/21/2001
;


data want;
if _n_=1 then do;
 length date 8 vname $ 40;
 declare hash h(multidata:'y',ordered:'y');
 declare hiter hi('h');
 h.definekey('date');
 h.definedata('vname');
 h.definedone();
end;
 set have;
 array x{*}  vr bd  fx gh;
 array y{*} $ 40 slot1-slot4;
 do i=1 to dim(x);
   date=x{i};vname=vname(x{i});
   if not missing(date) then h.add();
 end;
 do i=1 by 1 while(hi.next()=0);
   y{i}=vname;
 end;
 h.clear();
 drop i date vname;
run;

proc print;run;
FreelanceReinh
Jade | Level 19

Hello @alagiejatta,

 

Is it possible that duplicate dates occur for the same ID as in my sample data below for ID A99?

data have;
input ID $ (vr bd fx gh yz)(:mmddyy.);
format vr--yz mmddyy10.;
cards;
A11 1/2/1999 . 6/4/2012 8/1/2021 9/23/2019
A12 3/5/2021 . . . 3/5/2000
A13 3/5/2000 . . 4/17/1988 .
A14 10/24/1956 12/2/1967 1/6/2001 . .
A15 7/1/1996 11/7/1990 . 7/21/2001 6/10/2000
A99 9/23/2019 3/5/2000 4/17/1988 3/5/2000 1/2/1999
;

If not, you could populate the "slot" variables like this:

data want;
set have;
array _d[*] vr--yz;
array slot[5] $32;
do _n_=1 to n(of _d[*]);
  slot[_n_]=vname(_d[whichn(smallest(_n_, of _d[*]), of _d[*])]);
end;
run;

(Only the name of the first of two or more variables containing the same date would occur in the "slots:" slot3=slot4='bd' for ID A99 in the example above.)

 

That said, I agree with the others who recommend a "long" data structure.

hackathon24-white-horiz.png

The 2025 SAS Hackathon has begun!

It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.

Latest Updates

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 11 replies
  • 2036 views
  • 0 likes
  • 6 in conversation