Hi.
I am trying to scan data columns for sites for each ID. I am looking for the following record (where there is no blank) and for that data value to be placed into a new column (Most Recent). I am having difficulty trying to get SAS to bypass the record rows where there is a blank and to pick the next row containing data. Can anyone shed some light?
The first table shows the data the second table shows how I would like the results .
DATA Once;
INPUT ID $ visit1 site1 ;
CARDS;
Inn342 null null
Inn342 Anc Sha
Inn342 Drr Loc
Pan322
Pan322 Cli Dha
Pan322 Omn Sha
;
RUN;
Just use a period as a placeholder in the data line for missing value (numeric or character).
data once;
input ID $ visit1 $ site1 $ ;
CARDS;
Inn342 . .
Inn342 Anc Sha
Inn342 Drr Loc
Pan322 . .
Pan322 Cli Dha
Pan322 Omn Sha
;
To move the data backwards in time you need make two passes thru the data.
You could do it with two separate SET statements.
data want;
do until(last.id);
set once;
by id;
length first_site $8;
first_site=coalescec(first_site,site1);
end;
do until(last.id);
set once;
by id;
output;
end;
run;
Or you could merge it with itself.
data want;
merge once
once(keep=id site1 rename=(site1=xxx) where=(not missing(xxx)))
;
by id;
if first.id then first_site=xxx;
retain first_site;
drop xxx;
run;
Your data step does not work because you try to read character data into numeric variables. Please fix that, either by making the variables character or providing the necessary code to convert the strings to numbers.
Once we know how your dataset really looks, we can help you.
@SannaSanna wrote:
Hi.
I am trying to scan data columns for sites for each ID. I am looking for the following record (where there is no blank) and for that data value to be placed into a new column (Most Recent). I am having difficulty trying to get SAS to bypass the record rows where there is a blank and to pick the next row containing data. Can anyone shed some light?
The first table shows the data the second table shows how I would like the results .
DATA Once;
INPUT ID $ visit1 site1 ;
CARDS;
Inn342 null null
Inn342 Anc Sha
Inn342 Drr Loc
Pan322
Pan322 Cli Dha
Pan322 Omn Sha
;
RUN;
Is the text string 'null' in the first row of Inn342 the same as the text string ' ' (empty string) in the first row of Pan322?
I was trying to insert blank data for the first row for the two IDs. The spreadsheet shows blank cell/empty/no data for Inn342 and Pan322 for their most recent record. I want to pick the next row containing data. Sorry- I was not able to insert the blank row for each customer in my cards statement. Can you still help?
Are we talking about a SAS dataset here, or an Excel spreadsheet?
If it is a SAS dataset, which type are visit1 and site1 (character or numeric)?
So then this DATA step will create your data:
data have;
infile datalines dsd truncover;
input ID $ visit1 $ site1 $;
datalines;
Inn342,,
Inn342,Anc,Sha
Inn342,Drr,Loc
Pan322,,
Pan322,Cli,Dha
Pan322,Omn,Sha
;
?
Just use a period as a placeholder in the data line for missing value (numeric or character).
data once;
input ID $ visit1 $ site1 $ ;
CARDS;
Inn342 . .
Inn342 Anc Sha
Inn342 Drr Loc
Pan322 . .
Pan322 Cli Dha
Pan322 Omn Sha
;
To move the data backwards in time you need make two passes thru the data.
You could do it with two separate SET statements.
data want;
do until(last.id);
set once;
by id;
length first_site $8;
first_site=coalescec(first_site,site1);
end;
do until(last.id);
set once;
by id;
output;
end;
run;
Or you could merge it with itself.
data want;
merge once
once(keep=id site1 rename=(site1=xxx) where=(not missing(xxx)))
;
by id;
if first.id then first_site=xxx;
retain first_site;
drop xxx;
run;
Tom,
I used this code:
data want;
do until(last.id);
set once;
by id;
length first_site $8;
first_site=coalescec(first_site,site1);
end;
do until(last.id);
set once;
by id;
output;
end;
run;
I was wondering how I could incorporate a county code for the patient to get the same output by county? The table output would look like this: patient Inn342 visited two counties with most recent in 01 is Sha and in county 19 is Dha.
updated code to include county and blanks. (thank you)
data once;
input ID $ county $ visit1 $ site1 $ ;
CARDS;
Inn342 01 . .
Inn342 01 Anc Sha
Inn342 01 Drr Loc
Pan322 19 . .
Pan322 19 Cli Dha
Pan322 19 Omn Sha
; run;
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.