BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
Phil_NZ
Barite | Level 11

Hi all SAS Users,

 

Today I want to do find how many distinct value of currency (curcdd)( a character variable)of a country. Because the dataset is large so I split them into many small files in one folder as in the picture below:

Phil_NZ_0-1616461163164.png

For example, my code successfully run for one file to retrieve the distinct currency for Romania (LOC='ROU') in file keepvar_1999_2001 is:

libname tests 'E:\Harv 27th_Feb_2021\data_cut';

proc sql;
   create table _1999_2001 as
   select distinct curcdd
   from tests.keepvar_1999_2001
   where LOC='ROU';
quit;

I am wondering how to code to get the distinct value of curcdd for all files in this folder for Romania (LOC='ROU'). 

 

I thought it would relate to the code below but I do not know how to merge them together:

proc contents data=tests._ALL_ out=contents;
run;
proc sql noprint;
	select distinct catx('.',libname,memname)
	into :dslist separated by ' '
	from contents
	;
quit;

 

 

Warm regards.

 

One further question is how to do the same things (distinct value of currency ) for LOC in array ('AUS','AUT','BEL','CAN','CYP','DNK','FIN','FRA','DEU',
'GRC','HKG','IRL','ISR','ITA','JPN','LUX','NLD','NZL','NOR','PRT',
'SGP','KOR','ESP','SWE','CHE','TWN','GBR','USA','ARG','BRA','CHL',
'CHN','COL','HRV','CZE','EGY','EST','HUN','IND','IDN','KEN','LTU',
'MYS','MEX','MAR','PAK','PER','PHL','POL','ROU','SVN','ZAF','LKA',
'THA','TUR','VEN','LVA','SVK','BGR') for all files in this folder.

 

 

 

 

Thank you for your help, have a fabulous and productive day! I am a novice today, but someday when I accumulate enough knowledge, I can help others in my capacity.
1 ACCEPTED SOLUTION

Accepted Solutions
Kurt_Bremser
Super User

If you want a table containing the distinct currency/location combinations for all datasets in your library, do this:

proc sql noprint;
select memname !! "(keep=curcdd loc)" into :dsnames separated by " "
from dictionary.tables
where libname = "TESTS";
quit;

data all_ds / view=all_ds;
set &dsnames.;
run;

data _null_;
set all_ds end=done;
if _n_ = 1
then do;
  declare hash out ();
  out.definekey("curcdd","loc");
  out.definedata("curcdd","loc");
  out.definedone();
end;
if out.find() ne 0 then rc = out.add();
if done then rc = out.output(dataset:"want");
run;

The first step retrieves all datasets in the library, the second creates the data step view, and the third builds a hash object with all distinct combinations and writes them out.

 

 

Edit: added a closing bracket for the keep= option

View solution in original post

5 REPLIES 5
Phil_NZ
Barite | Level 11

Hi @Kurt_Bremser 

Thank you for your suggestion. Originally, this file is 2.8TB in total, so the person who first deal with this file split it into many small files for uploading and downloading. The thing is, because the hard disk in our school is just around 120GB, so we need to do things on that way. 

 

The way he did is: for example, companies has daily observations from 1/1/1999 to 31/02/2021, so the file keep_var_1999_2001 will contain daily observation of all firms, file keep_var_2020_2021 will contain daily observation of all firms in 2020 and 2021.

 

The way I concatenate the dataset is:


proc contents data=tests._ALL_ out=contents;
run;

proc sql ;
create table variables as 
  select upcase(name) as name
          , sum(type=1) as n_num
         ,  sum(type=2) as n_char
         , max(length) as max_length
 from contents
  group by 1
;
quit;



proc sql noprint;
	select distinct catx('.',libname,memname)
	into :dslist separated by ' '
	from contents
	;
quit;

data ex_non_tradi / view=ex_non_tradi;
/*ex_non_tradi is the concatenated file*/
	length CIK $40 CSHOC 8 DATADATE 8 DOLLAR_VOL 8 EX_RATE 8
		   GVIIDKEY $200 ISIN $48 LOC $12 LOG_RAW_RETURN 8 PRCCD 8
		   PRCCD_ABS_ 8 PRCHD 8 PRCLD 8 RAW_RETURN_ABS 8 SEDOL $28 
		   AJEXDI 8 LAGAJEXDI 8 LAGTRFD 8 TRFD 8 CUSIP $36
	;
	set &dslist.;
run;

Please let me know if my explanation is still ambiguous.

Warmest regards.

Thank you for your help, have a fabulous and productive day! I am a novice today, but someday when I accumulate enough knowledge, I can help others in my capacity.
Kurt_Bremser
Super User

If you want a table containing the distinct currency/location combinations for all datasets in your library, do this:

proc sql noprint;
select memname !! "(keep=curcdd loc)" into :dsnames separated by " "
from dictionary.tables
where libname = "TESTS";
quit;

data all_ds / view=all_ds;
set &dsnames.;
run;

data _null_;
set all_ds end=done;
if _n_ = 1
then do;
  declare hash out ();
  out.definekey("curcdd","loc");
  out.definedata("curcdd","loc");
  out.definedone();
end;
if out.find() ne 0 then rc = out.add();
if done then rc = out.output(dataset:"want");
run;

The first step retrieves all datasets in the library, the second creates the data step view, and the third builds a hash object with all distinct combinations and writes them out.

 

 

Edit: added a closing bracket for the keep= option

Phil_NZ
Barite | Level 11

Hi @Kurt_Bremser 

Thank you very much for your help.

I do not know what happened when I tried to run the second set of code:

data all_ds / view=all_ds;
set &dsnames.;
run;

It is the log:

34         data all_ds / view=all_ds;
35         set &dsnames.;
ERROR: File WORK.KEEPVAR_1999_2001.DATA does not exist.
ERROR: File WORK.KEEPVAR_1999_2010_US.DATA does not exist.
ERROR: File WORK.KEEPVAR_2002_2004.DATA does not exist.
ERROR: File WORK.KEEPVAR_2005_2007.DATA does not exist.
ERROR: File WORK.KEEPVAR_2008_2010.DATA does not exist.
ERROR: File WORK.KEEPVAR_2011_2013.DATA does not exist.
ERROR: File WORK.KEEPVAR_2011_2020_US.DATA does not exist.
ERROR: File WORK.KEEPVAR_2014_2016.DATA does not exist.
ERROR: File WORK.KEEPVAR_2017_2019.DATA does not exist.
ERROR: File WORK.KEEPVAR_2020_2021.DATA does not exist.

Could you please let me know how to sort it out? It is quite strange to me because these datasets totally exist in this folder.

Warm regards.

Thank you for your help, have a fabulous and productive day! I am a novice today, but someday when I accumulate enough knowledge, I can help others in my capacity.
Kurt_Bremser
Super User

My bad. Change the SQL to

proc sql noprint;
select catx(".",libname,memname) !! "(keep=curcdd loc)" into :dsnames separated by " "
from dictionary.tables
where libname = "TESTS";
quit;

Forgot to add the library to the dataset identifier.

sas-innovate-2024.png

Available on demand!

Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 5 replies
  • 526 views
  • 3 likes
  • 2 in conversation