About fengyuwuzu

fengyuwuzu · 02-17-2016

Thanks. Sorting first is indeed necessary for the data step. I think the data step automatically keeps all other variables if I do not specify keep or drop. However, if I use the other method, i.e. proc sql, how do I keep the other variables while selecting the distinct combinations of ID and date?
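
One way to keep every variable while retaining a single row per ID/date combination is PROC SORT with the NODUPKEY option. This is a swapped-in technique rather than a PROC SQL answer, sketched under the assumption that any one row per combination is acceptable. The data set names are taken from the question in the same thread.

```sas
/* Keep all variables, but only the first row found for each ID/date pair */
proc sort data=my_old_data out=new_table nodupkey;
    by ID date;
run;
```

In PROC SQL, `select distinct *` compares every column, so rows that share ID and date but differ in any other variable would all be kept.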

fengyuwuzu · 02-17-2016

I think you need to sort by the BY variables first.

fengyuwuzu · 02-17-2016

Hello, I want to select distinct cases based on two variables, i.e. distinct combinations of the two variables. One way is to use a data step, but do I need a sort step before it?

    data new_data;
        set my_old_data;
        by ID date;
        if first.date;
    run;

Another way is to use proc sql, but how do I keep all the other variables?

    proc sql;
        create table new_table as
        select distinct ID, date
        from my_old_data;
    quit;

Thanks
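
A hedged sketch of the data step route: BY-group processing expects the input to be sorted by the BY variables, and `first.date` on its own marks the first row of each ID/date combination because it is also set whenever ID changes. The name `sorted_data` is a hypothetical placeholder.

```sas
/* BY-group processing requires the input to be sorted by the BY variables */
proc sort data=my_old_data out=sorted_data;
    by ID date;
run;

data new_data;
    set sorted_data;
    by ID date;
    /* first.date is 1 on the first row of each ID/date combination */
    if first.date;
run;
```

Testing `first.ID and first.date` instead would keep only the first row of each ID, since both flags are 1 together only when ID changes.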

fengyuwuzu · 02-12-2016

Yes, removing "else" works. If I keep "else" but change "_n_=1" to "_n_=0", it also works.

fengyuwuzu · 02-12-2016

Dear all, I have now solved the problem. Because there is no header in the csv files, I changed

    if _n_ eq 1 or eov then do;

to

    if _n_ eq 0 or eov then do;

and the first row of each file is now read in correctly. Thank you for all your help. Every time I ask a question here I learn something extra.
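
For files without header rows, one possible simplification is to drop the header-skipping branch altogether, so that every record is read as data. The sketch below assumes a reduced three-column layout (VAR1 to VAR3, with the informats shown elsewhere in this thread) standing in for the full 28 columns. It refreshes the file name on every record instead of relying on EOV=.

```sas
data import_all;
    /* FILENAME= variable is not written to the output, so copy it */
    length filename txt_file_name $256;
    infile "D:\data\Transaction2014*.csv" filename=filename
           dsd truncover lrecl=32767;
    informat VAR1 yymmdd10. VAR2 $40. VAR3 best32.;
    format   VAR1 yymmdd10. VAR2 $40. VAR3 best12.;
    input @;                                  /* load the record and hold the line */
    txt_file_name = scan(filename, -1, "\");  /* name of the file being read */
    input VAR1 VAR2 $ VAR3;                   /* every record is data, so always read it */
run;
```

Because the name is assigned on every iteration, neither RETAIN nor the `_n_`/EOV test is needed in this variant.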

fengyuwuzu · 02-12-2016

None of my data files has headers; the first row is already data. When I include delete in the if block, the first row of each file is dropped from the merged file.

fengyuwuzu · 02-12-2016

Without the "delete;", the first row of every file is blank: all values are missing. With "delete;" in the do block, the first row of every file is skipped, so the final merged file has 364 fewer rows than the merged file produced without "delete;".

    data import_all;
    *make sure variables to store file name are long enough;
    length filename txt_file_name $256;
    *keep file name from record to record;
    retain txt_file_name;
    *Use wildcard in input;
    infile "D:\data\Transaction2014*.csv" eov=eov filename=filename truncover delimiter = ',' MISSOVER DSD lrecl=32767;
    informat VAR1 yymmdd10. ;
    informat VAR2 $40. ;
    .... /* to save space */
    informat VAR26 $4. ;
    informat VAR27 $19. ;
    informat VAR28 anydtdtm40. ;
    format VAR1 yymmdd10. ;
    format VAR2 $40. ;
    .... /* to save space */
    format VAR26 $4. ;
    format VAR27 $19. ;
    format VAR28 datetime. ;
    *Input first record and hold line;
    input@;
    *Check if this is the first record or the first record in a new file;
    *If it is, replace the filename with the new file name and move to next line;
    if _n_ eq 1 or eov then do;
        txt_file_name = scan(filename, -1, "\");
        eov=0;
        delete; /* with or without it, the first row will be either missing values or skipped */
    end;
    *Otherwise go to the import step and read the files;
    else do;
        input /*Place input code here;*/
        VAR1 VAR2 $
        .... /* to save space */
        VAR26 $ VAR27 $ VAR28 ;
    end;
    run;

I found the problem. The author says elsewhere that

    if _n_ eq 1 or eov then do;
        txt_file_name = scan(filename, -1, "\");
        eov=0;
    end;

assumes that each file has column headers and uses the EOV option to account for them. In my case, no file has column headers. How should I change the code accordingly?

fengyuwuzu · 02-12-2016

Thank you. I removed the extra input and still got the file names as the first column, but this time there is no column shift problem. However, in the first row of the first file every value is missing except the first column, the file name. The other rows were read in correctly.

fengyuwuzu · 02-11-2016

I used code like this:

    data import_all;
    *make sure variables to store file name are long enough;
    length filename txt_file_name $256;
    *keep file name from record to record;
    retain txt_file_name;
    *Use wildcard in input;
    infile "D:\data\Transaction2014*.csv" eov=eov filename=filename truncover delimiter = ',' MISSOVER DSD lrecl=32767;
    informat VAR1 yymmdd10. ;
    informat VAR2 $40. ;
    informat VAR3 best32. ;
    informat VAR4 best32. ;
    informat VAR5 best32. ;
    informat VAR6 best32. ;
    informat VAR7 $19. ;
    informat VAR8 $19. ;
    informat VAR9 $19. ;
    informat VAR10 $8. ;
    informat VAR11 $80. ;
    informat VAR12 $19. ;
    informat VAR13 $9. ;
    informat VAR14 $13. ;
    informat VAR15 $19. ;
    informat VAR16 $7. ;
    informat VAR17 $19. ;
    informat VAR18 $7. ;
    informat VAR19 $19. ;
    informat VAR20 $8. ;
    informat VAR21 $19. ;
    informat VAR22 $8. ;
    informat VAR23 $19. ;
    informat VAR24 $10. ;
    informat VAR25 $13. ;
    informat VAR26 $4. ;
    informat VAR27 $19. ;
    informat VAR28 anydtdtm40. ;
    format VAR1 yymmdd10. ;
    format VAR2 $40. ;
    format VAR3 best12. ;
    format VAR4 best12. ;
    format VAR5 best12. ;
    format VAR6 best12. ;
    format VAR7 $19. ;
    format VAR8 $19. ;
    format VAR9 $19. ;
    format VAR10 $8. ;
    format VAR11 $80. ;
    format VAR12 $19. ;
    format VAR13 $9. ;
    format VAR14 $13. ;
    format VAR15 $19. ;
    format VAR16 $7. ;
    format VAR17 $19. ;
    format VAR18 $7. ;
    format VAR19 $19. ;
    format VAR20 $8. ;
    format VAR21 $19. ;
    format VAR22 $8. ;
    format VAR23 $19. ;
    format VAR24 $10. ;
    format VAR25 $13. ;
    format VAR26 $4. ;
    format VAR27 $19. ;
    format VAR28 datetime. ;
    *Input first record and hold line;
    input@;
    *Check if this is the first record or the first record in a new file;
    *If it is, replace the filename with the new file name and move to next line;
    if _n_ eq 1 or eov then do;
        txt_file_name = scan(filename, -1, "\");
        eov=0;
    end;
    *Otherwise go to the import step and read the files;
    else do;
        input /*Place input code here;*/
        input VAR1 VAR2 $ VAR3 VAR4 VAR5 VAR6 VAR7 $ VAR8 $ VAR9 $ VAR10 $
              VAR11 $ VAR12 $ VAR13 $ VAR14 $ VAR15 $ VAR16 $ VAR17 $ VAR18 $
              VAR19 $ VAR20 $ VAR21 $ VAR22 $ VAR23 $ VAR24 $ VAR25 $ VAR26 $
              VAR27 $ VAR28 ;
    end;
    run;

However, in the import_all data set, the first column holds the file names under the column name "txt_file_name", and this messed up the other columns because their formats changed. How did this happen?

fengyuwuzu · 02-11-2016

They have the same data structure. I want to append (more data rows, keep the same columns).

fengyuwuzu · 02-11-2016

The variable names are stored in another csv file that has only one row. There are 28 variables (I only showed 9 above to save space).

fengyuwuzu · 02-11-2016

The log has code like this:

    informat VAR1 yymmdd10. ;
    informat VAR2 $40. ;
    informat VAR3 best32. ;
    informat VAR4 best32. ;
    informat VAR5 best32. ;
    informat VAR6 best32. ;
    informat VAR7 $19. ;
    informat VAR8 $19. ;
    informat VAR9 $19. ;
    format VAR1 yymmdd10. ;
    format VAR2 $40. ;
    format VAR3 best12. ;
    format VAR4 best12. ;
    format VAR5 best12. ;
    format VAR6 best12. ;
    format VAR7 $19. ;
    format VAR8 $19. ;
    format VAR9 $19. ;
    input VAR1 VAR2 $ VAR3 VAR4 VAR5 VAR6 VAR7 $ VAR8 $ VAR9 $ ;

Shall I replace VAR1-9 only in the input statement, or in all of them (informat, format, input)? Thanks

fengyuwuzu · 02-11-2016

My csv file has no header info. Data starts from row 1. When I imported it into SAS, the variables were named var1, var2, var3, etc. I want to change the names to their real variable names. I can edit the CSV file, insert a line above the data rows, and enter the variable names there. This runs fine. Is there another way that leaves the original csv file untouched and adds the variable names during import (a data step using infile)? Thanks
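
In a data step, the variable names come from the INPUT statement rather than from the file, so one option is to write the real names there and leave the CSV untouched. The sketch below is illustrative only: the path `myfile.csv`, the data set name `transactions`, and the names `trans_date`, `account`, and `amount` are hypothetical stand-ins for the first three columns.

```sas
data transactions;                                  /* hypothetical output name */
    infile "myfile.csv" dsd truncover lrecl=32767;  /* hypothetical path; file has no header row */
    informat trans_date yymmdd10. account $40. amount best32.;
    format   trans_date yymmdd10. account $40. amount best12.;
    /* the names used here become the column names of the data set */
    input trans_date account $ amount;
run;
```

The same names have to appear consistently in the INFORMAT, FORMAT, and INPUT statements, since each statement refers to variables by name.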

fengyuwuzu · 02-11-2016

Thank you. Actually I have a number of questions about the code. I am confused by the "filename=filename" in:

    infile "Path\*.txt" eov=eov filename=filename truncover;

Also, is "txt_file_name = scan(filename, -1, "\");" meant to get the next file? Where do I give the first file name?
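
A small sketch may help show what these options do. In `filename=fname`, the left side is the INFILE option and the right side is a variable that SAS fills with the physical name of the file currently being read; the wildcard in the path selects all the files, so no first file name is given separately. The names `fname`, `new_file`, and `short_name` are hypothetical, and `Path\*.txt` is the placeholder from the question.

```sas
data _null_;
    length fname $256;                  /* long enough to hold a full path */
    infile "Path\*.txt" filename=fname eov=new_file truncover;
    input;                              /* read one record; SAS updates fname */
    short_name = scan(fname, -1, "\");  /* last "\"-delimited piece of the path */
    /* new_file becomes 1 on the first record of the second and later files */
    if _n_ = 1 or new_file then put "Now reading: " short_name;
    new_file = 0;                       /* SAS does not reset the EOV= variable */
run;
```

So the `scan` call does not fetch the next file; it only extracts the file name from the path that SAS has already stored in the FILENAME= variable.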

fengyuwuzu · 02-11-2016

I have 364 files, all with the same data structure, and the file names are like

    commontext-2015-01-01.csv
    .......
    commontext-2015-12-30.csv

I want to import and merge them into a single file. I guess I need to import all of them first, and then I can merge them, right? Can anyone give me some hints or direct me to an example? Thank you very much!

I found some code by searching:

    data import_all;
    *make sure variables to store file name are long enough;
    length filename txt_file_name $256;
    *keep file name from record to record;
    retain txt_file_name;
    *Use wildcard in input;
    infile "Path\*.txt" eov=eov filename=filename truncover;
    *Input first record and hold line;
    input@;
    *Check if this is the first record or the first record in a new file;
    *If it is, replace the filename with the new file name and move to next line;
    if _n_ eq 1 or eov then do;
        txt_file_name = scan(filename, -1, "\");
        eov=0;
    end;
    *Otherwise go to the import step and read the files;
    else input /* Place input code here */ ;
    run;

I will start from here.

Date Last Visited: 08-14-2024 10:40 PM

calculated estimated survival rate at a certain time point (say, 18 mo...

Stratified analysis in non ITT population

Re: Proc import: how to add back blank values due to the same values a...

Proc import: how to add back blank values due to the same values as pr...

Re: Puzzled: Hazard ratio is 0 when both arms have events

Puzzled: Hazard ratio is 0 when both arms have events

Re: median survival not reached in proc lifetest when there are only 2...

median survival not reached in proc lifetest when there are only 2 pat...

Re: count events in one dataset between two visits in another dataset

Re: separate a string into two variables

ERROR: Some character data was lost during transcoding

hide the time and date on the rtf generated by ODS

proc sort and dupout: how to get the pairs of duplicates

converting numeric to character by put statement: how to avoid "." for...

Re: LOCF with visit missing in the data set

Re: Selecting distinct Combinations of two variables

Re: Selecting distinct Combinations

Selecting distinct Combinations of two variables

Re: how to import and merge many csv files with the same struture and ...

Re: Provided Headers vs. Provided Dataset without Headers

Provided Headers vs. Provided Dataset without Headers

how to import and merge many csv files with the same struture and file...