Hi,
need to hlep on data reformat as below.
i/p and o/p as below Excel.
SSID | Screening_Number | Date_of_Birth | date_screened | Date_Randomized | Date_Screen_failed | |
158018 | 840238004 | 23Dec1947 | 11Apr2016 | 11May2016 | ||
158018 | 840238004 | 23Dec1947 | 13Jul2016 | 27Jul2016 | ||
158018 | 840238019 | 23Dec1947 | 11Apr2016 | 11May2016 | ||
158018 | 840238019 | 23Dec1947 | 13Jul2016 | 27Jul2016 | ||
o/p | ||||||
SSID | Screening_Number | Date_of_Birth | date_screened | Date_Screen_failed | Date_Rescreened | Date_selected |
158018 | 840238019 | 23Dec1947 | 11Apr2016 | 13Jul2016 | 11May2016 | 27Jul2016 |
Thanks
It is not very clear what are the rules to get the desired output.
I seems you want one ouput record per SSID with:
- the last screening_number of SSID
- date_of_birth (may be from the last observation of SSID)
- the first date_screened of SSID
- assign last date_screened of SSID to date_screened_failed
- save the max(original date_screened_failed) to a new variable: Date_Rescreened
- save the last or max(date_randomized) to a new variable: Date_selected
You can do it using either SQL or RETAIN in a data step;
Try make your own code and attach it (or the log) if you enter difficulties.
You really need to describe, possibly in excrutiating detail, how Date_Screen_failed ends up with the value 13Jul2016 in the output when that is not the value for that variable in the input and the other values associated with 13Jul2016 do not indicate failure in any way that I can see.
i want max Screen_number for every SSID and if the min Date_Screened as Date_Screend dates take the Date_Screnned and MAX Date_Screened as Date_Rescreened. date screend failed & Date_Randimized.
Thanks.
Test next code, and change it in case I did not understand exactly what you mean:
Proc SQL;
create table want as select
SSID,
max(date_bitrh) as date_birth,
max(screen_number) as screen_number,
min(date_screened) as date_screened,
max(date_screened) as date_screened_failed,
max(date_screened_failed) as Date_Rescreened,
max(date_randomized) as Date_Selected
FROM have
GROUP by SSID;
Quit;
data have; infile cards expandtabs truncover; input (SSID Screening_Number Date_of_Birth date_screened) (:$20.) (Date_Randomized Date_Screen_failed ) (:date9.); new=coalesce(Date_Randomized,Date_Screen_failed); format Date_Randomized Date_Screen_failed new date9.; cards; 158018 840238004 23Dec1947 11Apr2016 . 11May2016 158018 840238004 23Dec1947 13Jul2016 27Jul2016 . 158018 840238019 23Dec1947 11Apr2016 . 11May2016 158018 840238019 23Dec1947 13Jul2016 27Jul2016 . ; run; proc sql; select max(n) into : n from (select count(*) as n from have group by ssid ,Screening_Number,Date_of_Birth); quit; proc summary data=have nway; class ssid Screening_Number Date_of_Birth; output out=want idgroup(out[&n] (date_screened new)=); run;
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.