Basic Text Analysis Concepts, using Ruby
We will use Ruby to explore the basic ideas behind how text documents are parsed and analyzed for tasks such as auto-tagging, document categorization, and document similarity matching.
This class assumes knowledge of Ruby basics: methods, arrays and hashes, regular expressions, writing simple classes, and reading files. It does not require any prior knowledge of statistics or text mining.
The code for this class is in the text_statistics folder of this GitHub repo.
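
To give a rough sense of the level of Ruby assumed, here is a minimal sketch that counts word frequencies using only a regular expression and a hash. It is not taken from the class code; the method names `tokenize` and `word_counts` and the sample sentence are made up for illustration.

```ruby
# Hypothetical helper (not from the class repo): split text into
# lowercase word tokens using a regular expression.
def tokenize(text)
  text.downcase.scan(/[a-z']+/)
end

# Hypothetical helper: count how often each token occurs,
# using a hash whose missing keys default to 0.
def word_counts(text)
  counts = Hash.new(0)
  tokenize(text).each { |word| counts[word] += 1 }
  counts
end

sample = "The cat sat on the mat. The mat was flat."
counts = word_counts(sample)

# Sort by descending count, then alphabetically, and keep the top three.
top = counts.sort_by { |word, count| [-count, word] }.first(3)
top.each { |word, count| puts "#{word}: #{count}" }
```

If this reads comfortably, the Ruby prerequisites are probably covered. Note that the simple pattern here only matches ASCII letters and apostrophes, which is an assumption made to keep the sketch short.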