BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
Navnath_Gajare
Calcite | Level 5

Hi Experts,

Hope you are doing well and safe in this Covid19 Pandamic. 

I want to read following csv files and create dataset. Please help me.

file:

student_id,mark1,mark2,mark3
1A,58,52,
85,2B,78,
43,,1C,58,98,65,4A,
87,99,95,5D,45,,43,6S,
23,
45,,7F,32,56,45,
1D,
32,45,67,2C,45,67,33,6G,43
37,87

 

 

Required Output:

student_idmark1mark2mark3
1A585285
2B78430
1C589865
4A879995
5D45043
6S23450

and so on.... for missing marks put 0.

please help to solve

Thanks in Advance

1 ACCEPTED SOLUTION

Accepted Solutions
novinosrin
Tourmaline | Level 20

This one avoids looping thus helps understanding so much more easier

data want;
 infile '\\corp\sites\RIB1001\HLSCreditRisk\NaveenS\file.csv' firstobs=2 dsd end=l;
 length student_id $2;
 array t(3) mark1-mark3;
 retain student_id t;
 input @;
 if char(_infile_,lengthn(_infile_))=',' then 
  _infile_=substr(_infile_,1,findc(_infile_,',','b')-1);
 input _temp $ @@;
 if _n_>1 then n+1;
 _temp2=mod(n,4);
 if _temp2=0 then student_id=_temp;
 else t(_temp2)=ifn(_temp>' ',input(_temp,32.),0);
 if nmiss(of t(*))=0 ;
 output;
 call missing(of t(*));
 keep student_id mark:;
run;
student_id mark1 mark2 mark3
1A 58 52 85
2B 78 43 0
1C 58 98 65
4A 87 99 95
5D 45 0 43
6S 23 45 0
7F 32 56 45
1D 32 45 67
2C 45 67 33
6G 43 37 87

 

View solution in original post

7 REPLIES 7
Reeza
Super User
data want;
infile cards flowover;
input student_id $2. mark1 mark2 mark3 @@;
cards;
1A,58,52,
85,2B,78,
43,,1C,58,98,65,4A,
87,99,95,5D,45,,43,6S,
23,
45,,7F,32,56,45,
1D,
32,45,67,2C,45,67,33,6G,43
37,87
;;;;

@Navnath_Gajare wrote:

Hi Experts,

Hope you are doing well and safe in this Covid19 Pandamic. 

I want to read following csv files and create dataset. Please help me.

file:

student_id,mark1,mark2,mark3
1A,58,52,
85,2B,78,
43,,1C,58,98,65,4A,
87,99,95,5D,45,,43,6S,
23,
45,,7F,32,56,45,
1D,
32,45,67,2C,45,67,33,6G,43
37,87

 

 

Required Output:

student_id mark1 mark2 mark3
1A 58 52 85
2B 78 43 0
1C 58 98 65
4A 87 99 95
5D 45 0 43
6S 23 45 0

and so on.... for missing marks put 0.

please help to solve

Thanks in Advance


 

Navnath_Gajare
Calcite | Level 5

Thanks for reply but not getting required output

Reeza
Super User
data want;
infile cards flowover dsd;
length student_id $2.;
input student_id  mark1 mark2 mark3 @@;
cards;
1A,58,52,
85,2B,78,
43,,1C,58,98,65,4A,
87,99,95,5D,45,,43,6S,
23,
45,,7F,32,56,45,
1D,
32,45,67,2C,45,67,33,6G,43
37,87
;;;;
;;;;

 

This gets you closer. What EOL characters do you have in your file?

 

ballardw
Super User

Were these files ever processed by spreadsheet software?

Did people make manual entries or edits?

 

The real question is "why are there irregular placement of end of line characters?"

Very few systems do that unless people get involved and manually do stuff. If there have been manual edits it may be possible to get the data before such intervention took place and have something that is cleaner to read.

novinosrin
Tourmaline | Level 20

Hi @Navnath_Gajare  It's a simple solution once @Reeza  gave us the right trigger i.e. Flowover. The rest is a breeze making flowover flow through the loop.  Have fun!


data want;
 infile '\\corp\sites\RIB1001\HLSCreditRisk\NaveenS\file.csv' firstobs=2 dsd end=l;
 length student_id $2;
 array t(3) mark1-mark3;
 do n=0 by 1 until(l and temp2=3);
  input @;
  if char(_infile_,lengthn(_infile_))=',' then 
  _infile_=substr(_infile_,1,findc(_infile_,',','b')-1);
  input _temp $ @;
  _temp2=mod(n,4);
  if _temp2=0 then student_id=_temp;
  else t(_temp2)=ifn(_temp>' ',input(_temp,32.),0);
  if nmiss(of t(*))=0 then do;
   output;
   call missing(of t(*));
  end;
 end;
 keep student_id mark:;
run;
student_id mark1 mark2 mark3
1A 58 52 85
2B 78 43 0
1C 58 98 65
4A 87 99 95
5D 45 0 43
6S 23 45 0
7F 32 56 45
1D 32 45 67
2C 45 67 33
6G 43 37 87
novinosrin
Tourmaline | Level 20

This one avoids looping thus helps understanding so much more easier

data want;
 infile '\\corp\sites\RIB1001\HLSCreditRisk\NaveenS\file.csv' firstobs=2 dsd end=l;
 length student_id $2;
 array t(3) mark1-mark3;
 retain student_id t;
 input @;
 if char(_infile_,lengthn(_infile_))=',' then 
  _infile_=substr(_infile_,1,findc(_infile_,',','b')-1);
 input _temp $ @@;
 if _n_>1 then n+1;
 _temp2=mod(n,4);
 if _temp2=0 then student_id=_temp;
 else t(_temp2)=ifn(_temp>' ',input(_temp,32.),0);
 if nmiss(of t(*))=0 ;
 output;
 call missing(of t(*));
 keep student_id mark:;
run;
student_id mark1 mark2 mark3
1A 58 52 85
2B 78 43 0
1C 58 98 65
4A 87 99 95
5D 45 0 43
6S 23 45 0
7F 32 56 45
1D 32 45 67
2C 45 67 33
6G 43 37 87

 

Navnath_Gajare
Calcite | Level 5
Thanks for your help.

hackathon24-white-horiz.png

2025 SAS Hackathon: There is still time!

Good news: We've extended SAS Hackathon registration until Sept. 12, so you still have time to be part of our biggest event yet – our five-year anniversary!

Register Now

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 7 replies
  • 1666 views
  • 4 likes
  • 4 in conversation