I have 365 tab delimitered csv files, all have the same data structure (same variable names) and the file names are like
exp20180101.csv
.......
exp20181231.csv
The data files contains both numeric variables and character variables, the observations of character variables are of different length but all are less than 50 characters.
I want to import and merge them into a single file. I searched this community and find a similar question. I use the codes they provide,but the infiled character variables are not correct.
data import_all;
*make sure variables to store file name are long enough;
length filename txt_file_name $256;
*keep file name from record to record;
retain txt_file_name;
*Use wildcard in input;
infile "F:\source data\*.csv" eov=eov filename=filename truncover dlm='09'x;
*Input first record and hold line;
input@;
*Check if this is the first record or the first record in a new file;
*If it is, replace the filename with the new file name and move to next line;
if _n_ eq 1 or eov then do;
txt_file_name = scan(filename, -1, "\");
eov=0;
end;
*Otherwise go to the import step and read the files;
input x1-x5 (x6-x25) (:$50.) x26-x45;
run;
can anyone give me some hints about what's wrong with my codes?
Thank you very much!
You forgot the DSD option on the INFILE statement. That will allow it to read empty values properly.
Normally SAS uses space as a delimiter. And normally space delimited files use extra spaces to make the columns line up.
Name Age Height
Alfred 14 69.0
Alice 13 56.5
Barbara 13 65.3
So SAS will treat 2 or more spaces in a row as the same on one space in terms or indicating the end/start of a word to read.
But with a real delimited file you use adjacent delimiters to indicate when a value is missing.
Name,Age,Height
Alfred,14,69.0
Alice,,56.5
Please post the first 5 (or so) lines of one of the infiles. Copy/paste into a window opened with the {i} button.
Thanks for your help! KurtBremser
"719024869 20170101 201701 2017 2017.0027 AGR FARMER AGR CAN CANADA CAN 1 061 061 06 2 6.4 12 6 12 -2.8393711501275 2 Minnesota" " United States US USMN 45.7326 -93.9196 MN 1 Mexico MX MX 23 -102 MX 2 Minnesota" " United States US USMN 45.7326 -93.9196 MN 20180101 http://www.kansas.com/news/business/article192357844.html" "719024870 20170101 201701 2017 2017.0027 AGR FARMER AGR CAN CANADA CAN 1 061 061 06 2 6.4 36 6 36 -2.8393711501275 1 Mexico MX MX 23 -102 MX 1 Canada CA CA 60 -96 CA 1 Mexico MX MX 23 -102 MX 20180101 http://www.kansas.com/news/business/article192357844.html" "719024871 20170101 201701 2017 2017.0027 AGR FARMER AGR CAN CANADA CAN 1 061 061 06 2 6.4 12 6 12 -2.8393711501275 1 Mexico MX MX 23 -102 MX 1 Mexico MX MX 23 -102 MX 1 Mexico MX MX 23 -102 MX 20180101 http://www.kansas.com/news/business/article192357844.html" "719024872 20170101 201701 2017 2017.0027 AGR FARMER AGR MEX MEXICO MEX 1 061 061 06 2 6.4 12 6 12 -2.8393711501275 2 Minnesota" " United States US USMN 45.7326 -93.9196 MN 2 Minnesota" " United States US USMN 45.7326 -93.9196 MN 2 Minnesota" " United States US USMN 45.7326 -93.9196 MN 20180101 http://www.kansas.com/news/business/article192357844.html" "719024873 20170101 201701 2017 2017.0027 AGR FARMER AGR MEX MEXICO MEX 1 061 061 06 2 6.4 48 6 48 -2.8393711501275 1 Mexico MX MX 23 -102 MX 1 Mexico MX MX 23 -102 MX 1 Mexico MX MX 23 -102 MX 20180101 http://www.kansas.com/news/business/article192357844.html" "719024874 20170101 201701 2017 2017.0027 USA UNITED STATES USA USAGOV THE WHITE HOUSE USA GOV 1 130 130 13 3 -4.4 14 2 14 -4.7522003466853 3 White House" District of Columbia " United States US USDC 38.8951 -77.0364 531871 3 White House" District of Columbia " United States US USDC 38.8951 -77.0364 531871 3 White House" District of Columbia " United States US USDC 38.8951 -77.0364 531871 20180101 http://www.independent.co.uk/news/world/americas/president-donald-trump-white-house-first-year-inauguration-federal-bureaucracy-barack-obama-a8135921.html" "719024875 20171202 201712 2017 2017.9096 USA MISSOURI USA 1 043 043 04 1 2.8 36 6 36 -2.56410256410256 0 2 Missouri" " United States US USMO 38.4623 -92.302 MO 1 Mexico MX MX 23 -102 MX 20180101 http://www.TucsonNewsNow.com/story/37167033/dog-missing-for-a-year-turns-up-4-states-away" "719024876 20171202 201712 2017 2017.9096 COP POLICE COP AUS MELBOURNE AUS 0 130 130 13 3 -4.4 10 1 10 -9.765625 4 Brunswick West" Victoria " Australia AS AS07 -37.7833 144.933 -1562136 4 Melbourne" Victoria " Australia AS AS07 -37.8167 144.967 -1586844 4 Brunswick West" Victoria " Australia AS AS07 -37.7833 144.933 -1562136 20180101 http://www.heraldsun.com.au/news/victoria/three-men-burnt-while-using-illegal-fireworks-to-bring-in-2018/news-story/3f19242cc5fda5181931965d81fc1626?nk=0a819496d71c943b20ea7f7092959275-1514764002" "719024877 20171202 201712 2017 2017.9096 COP POLICE COP AUS MELBOURNE AUS 0 130 130 13 3 -4.4 46 4 46 -10.2798874684749 4 Melbourne" Victoria " Australia AS AS07 -37.8167 144.967 -1586844 4 Melbourne" Victoria " Australia AS AS07 -37.8167 144.967 -1586844 4 Melbourne" Victoria " Australia AS AS07 -37.8167 144.967 -1586844 20180101 http://www.heraldsun.com.au/news/victoria/three-men-burnt-while-using-illegal-fireworks-to-bring-in-2018/news-story/3f19242cc5fda5181931965d81fc1626?nk=0a819496d71c943b20ea7f7092959275-1514764002" "719024878 20171202 201712 2017 2017.9096 COP POLICE COP AUS MELBOURNE AUS 0 173 173 17 4 -5.0 10 1 10 -9.765625 4 Brunswick West" Victoria " Australia AS AS07 -37.7833 144.933 -1562136 4 Melbourne" Victoria " Australia AS AS07 -37.8167 144.967 -1586844 4 Melbourne" Victoria " Australia AS AS07 -37.8167 144.967 -1586844 20180101 http://www.heraldsun.com.au/news/victoria/three-men-burnt-while-using-illegal-fireworks-to-bring-in-2018/news-story/3f19242cc5fda5181931965d81fc1626?nk=0a819496d71c943b20ea7f7092959275-1514764002" "719024879 20171202 201712 2017 2017.9096 COP POLICE COP AUS MELBOURNE AUS 0 173 173 17 4 -5.0 46 4 46 -10.2798874684749 4 Melbourne" Victoria " Australia AS AS07 -37.8167 144.967 -1586844 4 Melbourne" Victoria " Australia AS AS07 -37.8167 144.967 -1586844 4 Melbourne" Victoria " Australia AS AS07 -37.8167 144.967 -1586844 20180101 http://www.heraldsun.com.au/news/victoria/three-men-burnt-while-using-illegal-fireworks-to-bring-in-2018/news-story/3f19242cc5fda5181931965d81fc1626?nk=0a819496d71c943b20ea7f7092959275-1514764002" "719024880 20171202 201712 2017 2017.9096 USA MISSOURI USA 1 042 042 04 1 1.9 36 6 36 -2.56410256410256 2 Missouri" " United States US USMO 38.4623 -92.302 MO 0 1 Mexico MX MX 23 -102 MX 20180101 http://www.TucsonNewsNow.com/story/37167033/dog-missing-for-a-year-turns-up-4-states-away" "719024881 20171225 201712 2017 2017.9726 RUS RUSSIA RUS 1 042 042 04 1 1.9 2 1 2 -6.41711229946524 0 3 Allegheny County" Pennsylvania " United States US USPA 40.6253 -80.1251 1213657 3 Allegheny County" Pennsylvania " United States US USPA 40.6253 -80.1251 1213657 20180101 http://www.njherald.com/article/20171231/AP/312319872" "719024882 20171225 201712 2017 2017.9726 GOV SECURITY COUNCIL GOV PRK PYONGYANG PRK 0 163 163 16 4 -8.0 2 1 2 -3.2258064516129 1 Russia RS RS 60 100 RS 4 Pyongyang" P'yongyang-si " North Korea KN KN12 39.0194 125.755 -183459 1 Russia RS RS 60 100 RS 20180101 https://www.channelnewsasia.com/news/world/after-trump-criticism--china-denies-selling-oil-illicitly-to-north-korea-9819368" "719024883 20171231 201712 2017 2017.9890 GOV PRESIDENT GOV 1 020 020 02 1 3.0 6 1 6 -2.40963855421687 0 4 Ibadan" Oyo " Nigeria NI NI32 7.38778 3.89639 -2010458 4 Ibadan" Oyo " Nigeria NI NI32 7.38778 3.89639 -2010458 20180101 https://blueprint.ng/2018-enough-of-blame-game-asuu-tells-buhari/"
data have;
input fname $80.;
infile dummy filevar=fname end=last dsd truncover dlm='09'x;
do while(not last);
input x1-x5 (x6-x25) (:$50.) x26-x45;
output;
end;
cards;
F:\source data\exp20180101.csv
.......
F:\source data\exp20181231.csv
;
Thanks Ksharp!
Your codes works! The only inconvenience is the long cards.
You forgot the DSD option on the INFILE statement. That will allow it to read empty values properly.
Normally SAS uses space as a delimiter. And normally space delimited files use extra spaces to make the columns line up.
Name Age Height
Alfred 14 69.0
Alice 13 56.5
Barbara 13 65.3
So SAS will treat 2 or more spaces in a row as the same on one space in terms or indicating the end/start of a word to read.
But with a real delimited file you use adjacent delimiters to indicate when a value is missing.
Name,Age,Height
Alfred,14,69.0
Alice,,56.5
Thanks a lot! Tom, you're so helpful!
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.