BookmarkSubscribeRSS Feed
Dhana18
Obsidian | Level 7

I am trying to merge five dofferent files with event id. Eventid are sorted transposed except visit dataset that has 2105 records.  After merging it should be 2105 observations, but it shows 2857. if i merge all four datasets except treat dataset it shows 2105 so i think there is something wrong in treat dataset, i could not tell. 

/*SORTING TREATMENT DATA BEFORE TRANSPOSING BY EVENT_ID*/
PROC SORT DATA=STDTELMT.TREAT18_SUBSET_1;
BY EVENT_ID DATETX;
RUN;

/*TRANSPOSING TREATMENT DATSET*/
proc transpose data=STDTELMT.TREAT18_SUBSET_1 out=STDTELMT.Treat18_TRANSPOSED;
by EVENT_ID DATETX;
ID RX;
var RX;
run;

 

NOTE: There were 2105 observations read from the data set STDTELMT.VISIT18_MERGED.
NOTE: There were 1894 observations read from the data set STDTELMT.LAB18_TRANSPOSED.
NOTE: There were 1075 observations read from the data set STDTELMT.SYMP18_SUBSET_TRANSPOSED.
NOTE: There were 752 observations read from the data set STDTELMT.TREAT_COUNT.
NOTE: There were 1948 observations read from the data set STDTELMT.TRANSPOSED_RISK_RENMD.
NOTE: The data set STDTELMT.MERGED has 2857 observations and 110 variables.
NOTE: DATA statement used (Total process time):
real time 0.24 seconds
cpu time 0.12 seconds

Where did I do wrong?

4 REPLIES 4
Reeza
Super User

Check your Treat data set and see if you have duplicate IDs.

 


@Dhana18 wrote:

I am trying to merge five dofferent files with event id. Eventid are sorted transposed except visit dataset that has 2105 records.  After merging it should be 2105 observations, but it shows 2857. if i merge all four datasets except treat dataset it shows 2105 so i think there is something wrong in treat dataset, i could not tell. 

/*SORTING TREATMENT DATA BEFORE TRANSPOSING BY EVENT_ID*/
PROC SORT DATA=STDTELMT.TREAT18_SUBSET_1;
BY EVENT_ID DATETX;
RUN;

/*TRANSPOSING TREATMENT DATSET*/
proc transpose data=STDTELMT.TREAT18_SUBSET_1 out=STDTELMT.Treat18_TRANSPOSED;
by EVENT_ID DATETX;
ID RX;
var RX;
run;

 

NOTE: There were 2105 observations read from the data set STDTELMT.VISIT18_MERGED.
NOTE: There were 1894 observations read from the data set STDTELMT.LAB18_TRANSPOSED.
NOTE: There were 1075 observations read from the data set STDTELMT.SYMP18_SUBSET_TRANSPOSED.
NOTE: There were 752 observations read from the data set STDTELMT.TREAT_COUNT.
NOTE: There were 1948 observations read from the data set STDTELMT.TRANSPOSED_RISK_RENMD.
NOTE: The data set STDTELMT.MERGED has 2857 observations and 110 variables.
NOTE: DATA statement used (Total process time):
real time 0.24 seconds
cpu time 0.12 seconds

Where did I do wrong?


 

Dhana18
Obsidian | Level 7

Hi Reeza,

I did order freq to find out if there are any duplicate ids and there are no duplicate ids

proc freq data=STDTELMT.Treat18_TRANSPOSED order=freq;
table event_id;
run;

the visit data set has 2105 observations, treat data set has 752 observation with common variable event_id. I sorted both data set by event_id and tried to merge and they are not merged by event id instead treat data set is stacked on top of visit data set. Could you please tell me what did do wrong?

 

Sorting visit data by event_id:

proc sort data=STDTELMT.VISIT18_MERGED;
by event_id;
run;

 

/*SORTING TREATMENT DATA BEFORE TRANSPOSING BY EVENT_ID*/
PROC SORT DATA=RX_MERGED;;
BY EVENT_ID;
RUN;
/*REMOVING DUPLICATES*/
data STDTELMT.TREAT_COUNT;
set RX_MERGED;
by event_id;
t_f=first.event_id;
t_l=last.event_id;
if t_f=1;
run;
/*TRANSPOSING TREATMENT DATSET*/
proc transpose data=STDTELMT.TREAT_COUNT out=STDTELMT.Treat18_TRANSPOSED;
by EVENT_ID ;
ID RX;
var RX;
run;

merging two (visit and treat) data sets

DATA STDTELMT.MERGED ;
MERGE STDTELMT.VISIT18_MERGED STDTELMT.Treat18_TRANSPOSED;
by event_id;
RUN;

the log says :

NOTE: There were 2105 observations read from the data set STDTELMT.VISIT18_MERGED.
NOTE: There were 752 observations read from the data set STDTELMT.TREAT18_TRANSPOSED.
NOTE: The data set STDTELMT.MERGED has 2857 observations and 41 variables.
NOTE: DATA statement used (Total process time):
real time 0.13 seconds
cpu time 0.01 seconds

 

Reeza
Super User
Your EVENT_ID's don't match for some reason then. That's why you're ending up with new records for all. You'll need to explicitly check why they don't match, possibly different lengths, trailing blanks or something along those lines.
Dhana18
Obsidian | Level 7
Thank you Reeza. Yes I figured that out. I think it was because treat dataset was csv files and visit was an excel. The csv file was saved in excel and imported as visit dataset, and when I imported treatment date was not numeric and so I converted that into numeric and into date format then it worked just fine.

hackathon24-white-horiz.png

The 2025 SAS Hackathon has begun!

It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.

Latest Updates

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 4 replies
  • 1223 views
  • 0 likes
  • 2 in conversation