BookmarkSubscribeRSS Feed
RandyStan
Fluorite | Level 6
Dear All My data is as follows Date ST IDa IDb Cat JUN012018 500 A AA I JUN022018 500 A AA I JUN032018 500 A AA I JUN012018 501 A AA JUN022018 501 A AA JUN032018 501 A AA JUN012018 502 A AA I JUN012018 502 A AA I JUN012018 500 A AAB JUN022018 500 A AAB JUN032018 500 A AAB JUN012018 501 A AAB D JUN022018 501 A AAB D JUN032018 501 A AAB D JUN012018 502 A AAB JUN012018 502 A AAB and so on The data set should look like this Date ST IDa IDb Cat JUN012018 500 A AA I JUN022018 500 A AA I JUN032018 500 A AA I JUN012018 501 A AA I JUN022018 501 A AA I JUN032018 501 A AA I JUN012018 502 A AA I JUN012018 502 A AA I JUN012018 500 A AAB D JUN022018 500 A AAB D JUN032018 500 A AAB D JUN012018 501 A AAB D JUN022018 501 A AAB D JUN032018 501 A AAB D JUN012018 502 A AAB D JUN012018 502 A AAB D Thanks Stan
11 REPLIES 11
RandyStan
Fluorite | Level 6

Putting this question again because the formatting seems to be messed up

 

Dear All:
My data set is as follows

Date                            ST                        IDa                         IDb              Cat
JUN012018               500                          A                           AA                I
JUN022018               500                          A                           AA                I
JUN032018               500                          A                           AA                I

JUN012018                501                         A                           AA 

JUN022018               501                          A                           AA

JUN032018                501                         A                           AA

JUN012018                502                         A                           AA                 I

JUN012018                502                          A                           AA                I

JUN012018               500                          A                           AAB               
JUN022018               500                          A                           AAB                
JUN032018               500                          A                           AAB                

JUN012018                501                         A                           AAB              D

JUN022018               501                          A                           AAB              D

JUN032018                501                         A                           AAB              D

JUN012018                502                         A                           AAB                 

JUN012018                502                          A                           AAB                

 

 

and so on

The data set should look like this

 

 

Date                            ST                        IDa                         IDb              Cat
JUN012018               500                          A                           AA                I
JUN022018               500                          A                           AA                I
JUN032018               500                          A                           AA                I

JUN012018                501                         A                           AA                I

JUN022018               501                          A                           AA                I

JUN032018                501                         A                           AA                I

JUN012018                502                         A                           AA                 I

JUN012018                502                          A                           AA                I

JUN012018               500                          A                           AAB             D              
JUN022018               500                          A                           AAB             D            
JUN032018               500                          A                           AAB             D   

JUN012018                501                         A                           AAB              D

JUN022018               501                          A                           AAB              D

JUN032018                501                         A                           AAB              D

JUN012018                502                         A                           AAB              D      

JUN012018                502                          A                           AAB             D

 

 

Thanks 

Stan          

  

pau13rown
Lapis Lazuli | Level 10

hat's the difference between the two?

RandyStan
Fluorite | Level 6

No difference.  I saw that the formatting was messed up, so I posted it again.  OI hope that the data and the question is clearer now.

Reeza
Super User

The start and end look the same to me. 

 

Here’s  some instructions on posting sample data. Also, don’t edit the info in the code block in the main editor, that’s messes up the line breaks for some reason. 

 


@RandyStan wrote:

No difference.  I saw that the formatting was messed up, so I posted it again.  OI hope that the data and the question is clearer now.


 

RandyStan
Fluorite | Level 6

Please check the CAT column.  In the first data set the variables are missing.  In the second, the data set I want, they column CAT is populated.

   Thanks so much

Reeza
Super User

Given the way the data is formatted it’s really hard to see that or what the logic is. 

 


@RandyStan wrote:

Please check the CAT column.  In the first data set the variables are missing.  In the second, the data set I want, they column CAT is populated.

   Thanks so much


 

Patrick
Opal | Level 21

@RandyStan

Below picture created by combining your have and want data.

Please explain the logic how you'd populate "CAT_WANT".

Capture.JPG

 

Below how you'd best post sample data if you want to motivate people to post answers with working code.

data have;
  infile datalines dlm=' ' truncover;
  format Date date9.;
  input _dt1 $1-3 _dt2 $4-5 _dt3 $6-9 (ST IDa IDb Cat_Have) ($);
  date=input(cats(_dt2,_dt1,_dt3),date9.);
  drop _dt:;
  datalines;
JUN012018 500 A AA I
JUN022018 500 A AA I
JUN032018 500 A AA I
JUN012018 501 A AA 
JUN022018 501 A AA
JUN032018 501 A AA
JUN012018 502 A AA I
JUN012018 502 A AA I
JUN012018 500 A AAB 
JUN022018 500 A AAB 
JUN032018 500 A AAB 
JUN012018 501 A AAB D
JUN022018 501 A AAB D
JUN032018 501 A AAB D
JUN012018 502 A AAB 
JUN012018 502 A AAB 
;
run;

 

PGStats
Opal | Level 21

Use a double DO UNTIL() loop.The first loop records the last non-missing Cat value in the IDb-group, the second loop forces that value where Cat is missing :

 

data have;
length Date $10 ST 8 IDa $1 IDb $3 Cat $1;
infile datalines truncover;
input Date ST IDa IDb Cat;
datalines;
JUN012018 500 A AA I
JUN022018 500 A AA I
JUN032018 500 A AA I
JUN012018 501 A AA 
JUN022018 501 A AA
JUN032018 501 A AA
JUN012018 502 A AA I
JUN012018 502 A AA I
JUN012018 500 A AAB 
JUN022018 500 A AAB 
JUN032018 500 A AAB 
JUN012018 501 A AAB D
JUN022018 501 A AAB D
JUN032018 501 A AAB D
JUN012018 502 A AAB 
JUN012018 502 A AAB 
;

data want;
do until(last.IDb);
    set have; by IDb notsorted;
    if not missing(Cat) then groupCat = Cat;
    end;
do until(last.IDb);
    set have; by IDb notsorted;
    if missing(Cat) then Cat = groupCat;
    output;
    end;
drop groupCat;
run;

proc print data=want noobs; run;

 

 

PG
mkeintz
PROC Star

This problem is very amenable to a self merge, where the first instance of HAVE has all the records and variables.  The second instance has only the BY-variable(s) (IDB in your case) and the new variable of interest (cat_want, renamed from cat_have), and it only has records with non-blank values of cat_want:

 

data want;
  merge have
        have (keep=idb cat_have rename=(cat_have=cat_want) where=(cat_want^=' '));
  by idb;
run;
--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------
novinosrin
Tourmaline | Level 20
data have;
length Date $10 ST 8 IDa $1 IDb $3 Cat $1;
infile datalines truncover;
input Date ST IDa IDb Cat;
datalines;
JUN012018 500 A AA I
JUN022018 500 A AA I
JUN032018 500 A AA I
JUN012018 501 A AA 
JUN022018 501 A AA
JUN032018 501 A AA
JUN012018 502 A AA I
JUN012018 502 A AA I
JUN012018 500 A AAB 
JUN022018 500 A AAB 
JUN032018 500 A AAB 
JUN012018 501 A AAB D
JUN022018 501 A AAB D
JUN032018 501 A AAB D
JUN012018 502 A AAB 
JUN012018 502 A AAB 
;


proc sql;
create table want(drop=cat) as
select *,max(cat) as cat1
from have
group by idb
order by idb,st,date;
quit;

novinosrin
Tourmaline | Level 20
data have;
infile datalines truncover;
input Date : $10. ST  IDa $ IDb $ Cat $;
datalines;
JUN012018 500 A AA I
JUN022018 500 A AA I
JUN032018 500 A AA I
JUN012018 501 A AA 
JUN022018 501 A AA
JUN032018 501 A AA
JUN012018 502 A AA I
JUN012018 502 A AA I
JUN012018 500 A AAB 
JUN022018 500 A AAB 
JUN032018 500 A AAB 
JUN012018 501 A AAB D
JUN022018 501 A AAB D
JUN032018 501 A AAB D
JUN012018 502 A AAB 
JUN012018 502 A AAB 
;

data want;
set have;
if (_n_ = 1) then do;
if 0 then set have(keep=idb cat);
    declare hash h(dataset: "have(keep=idb cat where=(not missing(cat))", duplicate: "r");
    h.definekey('idb');
    h.definedata('cat');
    h.definedone();
 end;
 rc=h.find();
 drop rc;
 run;

sas-innovate-2024.png

Don't miss out on SAS Innovate - Register now for the FREE Livestream!

Can't make it to Vegas? No problem! Watch our general sessions LIVE or on-demand starting April 17th. Hear from SAS execs, best-selling author Adam Grant, Hot Ones host Sean Evans, top tech journalist Kara Swisher, AI expert Cassie Kozyrkov, and the mind-blowing dance crew iLuminate! Plus, get access to over 20 breakout sessions.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 11 replies
  • 1190 views
  • 1 like
  • 7 in conversation