Algorithm Share: computer programming

Showing posts with label computer programming. Show all posts

Thursday, January 1, 2009

Parsing words

Suppose, a user inputs the line in Code 1 (without the quotations).

Code 1. Sample input by user.

“  My name           is   Earl ”

And you need to parse the words “My”, “name”, “is” and “Earl”. How do you do that? Note that the spaces and/or tabs between words, or before or after each word may be zero or more spaces and/or tabs. This just makes parsing more complicated, doesn’t it?

People have different ways of doing things. Let me present my way in Code 2.

Code 2. Sample code for parsing words.
001 #include "stdio.h"
002
003 int main() {
004 __char line[100], ctmp[100];
005 __int i, j;
006
007 __fgets(line, 100, stdin);
008
009 __for (i = 0; 1; i += 1) {
010 ____if (line[i] == '\n' line[i] == 0) {
011 ______break;
012 ____}
013
014 ____if (line[i] == ' ' line[i] == '\t') {
015 ______continue;
016 ____}
017
018 ____for (j = 0; 1 ; i += 1, j += 1) {
019 ______if ( line[i] == ' '
020 _________ line[i] == '\t'
021 _________ line[i] == '\n'
022 _________ line[i] == 0
023 ______) {
024 ________break;
025 ______}
026 ______ctmp[j] = line[i];
027 ____}
028 ____ctmp[j] = 0;
029
030 ____printf("B%sE\n", ctmp);
031 __}
032 }

Of course, there are easier ways like I could use strtok or other more sophisticated methods available. But somehow, I never learned how to use them, had a hard time trying to configure them, and eventually, I ended up just using the code in Code 2 after tiring frustrations one after the other.

The code in Code 2 uses the fgets function in line 007 to get an input line from the user. The for loop in line 009 is then used to process the input character by character.

The for loop starts by checking if the current character it is processing is an end character which can either be a carriage return or a null character (line 010). If the current character is an end character, the loop breaks and processing of the input line is finished.

If the loop determines that the current character is not an end character, it does a another check to see if the current character being processed is a white space which could be a literal space character or a tab character (line 014). If it is a white space character, the loop restarts with the next character.

So, what checks 010 and 014 do is to find a character that is not an end character and at the same time, not a white space also. If a character passes these two initial checks, then we know that we have come across a character that is the beginning of a word because the character is neither an end character or a white space.

The loop then enters into a second for loop (line 018). The second for loop first checks if the character is an end character or a white space (line 019). If it is, it means that we have reached the end of a word and so we break out of the second for loop to break further processing of the current word being parsed out.

In the second for loop, if the current character passes the check of line 019, meaning that the current character is a valid character for a word – valid meaning that it could be alphanumeric or any other character as long as it is not an end character or a white space, then the current character is added into the current word being formed (ctmp).

So the second for loop keeps looping, concatenating one character after another until the end of the word is detected through a character that is either an end character or a white space. A null character is then added to the end of the word to signify the end of the character string (line 028).

After we have formed a word, we can pretty much do anything we want to do with the word parsed out from the input line. In Code 2, we simply print it (line 030).

So we go back to the first for loop and repeat the process all over again until all characters in the input line are processed.

Figure 1 shows a sample output of Code 2 when it is ran with the input of Code 1.

Figure 1. Output when Code 2 is inputted with the input of Code 1.

Monday, December 29, 2008

UVa problem 10851: sample code


001 #include "stdio.h"
002 #include "string"
003 #include "map"
004 #include "iostream"
005 
006 using namespace std;
007 
008 int main() {
009   char   line[100]
010        , ctmp[100]
011        , dtmp[100]
012        , input[10][100]
013        ;
014   int   num_msg
015       , i
016       , ii
017       , j
018       , k
019       , b
020       , c
021       , m
022       , n
023       , p
024       , msg_len;
025       ;
026 
027   map imap;
028   map::iterator iter;
029 
030   for (c = 0; c < 256; c += 1) {
031     ctmp[0] = '/';
032     ctmp[9] = '/';
033 
034     for (i = 1; i < 9; i += 1) {
035       ii = i - 1;
036 
037       b = c >> ii;
038       b &= 1;
039 
040       ctmp[i] = b == 0 ? '/' : '\\';
041     }
042 
043     ctmp[10] = 0;
044 
045     imap.insert(pair(ctmp, c));
046   }
047 
048   fgets(line, 100, stdin);
049   sscanf(line, "%d\n", &num_msg);
050 
051   for (i = 0; i < num_msg; ++i) {
052     for (j = 0; j < 10; ++j) {
053       fgets(line, 100, stdin);
054       sscanf(line, "%s\n", &input[j]);
055     }
056 
057     fgets(line, 100, stdin);
058 
059     msg_len = strlen(input[0]) - 2;
060 
061     for (m = 1, p = 0; m <= msg_len; m += 1, p += 1) {
062       for (n = 0; n < 10; n += 1) {
063         ctmp[n] = input[n][m];
064       }
065       ctmp[n] = 0;
066 
067       iter = imap.find(ctmp);
068       dtmp[p] = iter->second;
069     }
070     dtmp[p] = 0;
071 
072     printf("%s\n", dtmp);
073   }
074 
075   return 0;
076 }

Sunday, December 28, 2008

UVa Online Judge

The UVa Online Judge (http://icpcres.ecs.baylor.edu/onlinejudge/) is a website that archives thousands of problems used for computer programming competitions. It also provides a facility for validating code submissions to the problems published. Hence, it offers an excellent venue for testing algorithms.

Each problem in the archive is tagged with an ID number for easy identification. Refer to Figure 1.

Figure 1. Problems listing in the archive.

In Figure 1, the problem “Moscow time” has an ID number of 505.

Codes can be submitted for online validation can be written in C, Java, C++ or Pascal. Figure 2 shows the form for online submission.

Figure 2. Form for online code submission.

To search for a problem, one can opt to use the Google search bar as shown in Figure 3.

Figure 3. Google bar for searching problems.

Simply type in the problem ID in the Google bar. Be sure to tick the icpcres.ecs.baylor.edu button.

Welcome!

Welcome to Algorithm Share!

This blog is about sharing algorithms with others.

As for now, most of the problems to which algorithms will be discussed will come from the UVa Online Judge: http://icpcres.ecs.baylor.edu/onlinejudge/

I encourage you to post comments if you have algorithms to share as well. Most of the algorithms I will share will be my own ideas, but I believe that there are a thousand more algorithms out there that are more optimized than my own for a given problem. So, please do share your algorithms for me to learn as well.