Character Frequency Graph Problem II

To analyze language, you need a reasonably large sample. In the first Character Frequency Graph Problem, you hard-coded the text in an array. To keep data entry from becoming onerous, we used a very short text, O Canada. In this follow-up problem, you will read the text from a file, one line at a time.

Because the program handles the text one line at a time, it can process very large texts. In this problem, you will tally the letter frequencies in Lewis Carroll's Alice's Adventures in Wonderland. The file, which is downloadable using this link, originally came from Project Gutenberg, a repository of over 20,000 free books. The file is 153 KB long.
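
As a rough illustration of reading one line at a time, here is a minimal sketch rather than a model solution. The file name alice.txt, the type TFreq, and the procedure TallyLine are assumptions made for the example; your program should ask the user for the file name instead. The sketch also assumes that no line is longer than the string type can hold.

```pascal
program TallyLetters;
{ Sketch only: read a text file one line at a time and tally A..Z. }
{ 'alice.txt', TFreq and TallyLine are assumed names, not part of the problem. }

type
  TFreq = array['A'..'Z'] of LongInt;

{ Add the letters found in one line to the running totals. }
procedure TallyLine(const Line: string; var Freq: TFreq);
var
  i: Integer;
  ch: Char;
begin
  for i := 1 to Length(Line) do
  begin
    ch := UpCase(Line[i]);             { count 'a' and 'A' together }
    if (ch >= 'A') and (ch <= 'Z') then
      Inc(Freq[ch]);                   { anything that is not a letter is skipped }
  end;
end;

var
  InFile: Text;
  Line: string;
  Freq: TFreq;
  ch: Char;
begin
  for ch := 'A' to 'Z' do
    Freq[ch] := 0;

  Assign(InFile, 'alice.txt');         { assumed file name }
  Reset(InFile);                       { stops with a run-time error if the file is missing }
  while not Eof(InFile) do
  begin
    ReadLn(InFile, Line);              { only the current line is held in memory }
    TallyLine(Line, Freq);
  end;
  Close(InFile);

  for ch := 'A' to 'Z' do
    WriteLn(ch, ': ', Freq[ch]);
end.
```

Because only the current line and the 26 counters are kept, memory use does not grow with the length of the file.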

Use the following pseudocode to guide your program development.

  1. Initialize frequency array
  2. Introduce program
  3. Get file name from user
  4. Process file
  5. Find largest frequency
  6. Determine ratio to be used for bar graph
  7. Print stats
  8. Print graph
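
Steps 5 through 8 could be sketched as follows. This is one possible approach under stated assumptions, not the required solution: the constant Sample stands in for the file so that the sketch runs on its own, and MaxBarWidth, the asterisk bars, and the choice to always scale the longest bar to the full width are all assumptions.

```pascal
program ScaledGraph;
{ Sketch of steps 5-8. Sample, MaxBarWidth and the layout are assumptions. }

const
  Sample = 'O Canada! Our home and native land!';  { stand-in for the file }
  MaxBarWidth = 50;      { assumed length, in characters, of the longest bar }

var
  Freq: array['A'..'Z'] of LongInt;
  ch: Char;
  i, j, BarLen: Integer;
  Largest: LongInt;
  Ratio: Real;
begin
  { Steps 1 and 4, reduced to a single in-memory string }
  for ch := 'A' to 'Z' do
    Freq[ch] := 0;
  for i := 1 to Length(Sample) do
  begin
    ch := UpCase(Sample[i]);
    if (ch >= 'A') and (ch <= 'Z') then
      Inc(Freq[ch]);
  end;

  { Step 5: find the largest frequency }
  Largest := 0;
  for ch := 'A' to 'Z' do
    if Freq[ch] > Largest then
      Largest := Freq[ch];

  { Step 6: ratio that maps the largest count onto MaxBarWidth }
  if Largest > 0 then
    Ratio := MaxBarWidth / Largest
  else
    Ratio := 0;                        { no letters at all: every bar is empty }

  { Steps 7 and 8: print each count, then its scaled bar }
  for ch := 'A' to 'Z' do
  begin
    Write(ch, Freq[ch]:8, ' ');
    BarLen := Round(Freq[ch] * Ratio);
    for j := 1 to BarLen do
      Write('*');
    WriteLn;
  end;
end.
```

Scaling by a ratio keeps the longest bar within the screen width no matter how large the counts become, which matters once the input is a whole book rather than a few lines.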

Notes

Compare these frequencies with the point values on the letter tiles used in the game Scrabble.

  1. What surprises do you find?
  2. Why do you think there are discrepancies?

Compare the two graphs you have generated: the one for O Canada from the first problem and the one for Alice's Adventures in Wonderland.

What surprises do you find?


© DFStermole 2008
Created 25 Mar 08
Modified 30 Mar 08