==Data Preprocessing==
For data preprocessing, we adopt the same standard as the [http://ai.stanford.edu/~amaas/data/sentiment/ IMDB] dataset.
# '''To general users:''' your input (usually a single ".txt" file containing many examples) is split into a training set (80% by default) and a test set (20% by default). The target labels you want to predict become the sub-folder names. The description of each example is written to its own ".txt" file, and you can choose the name of that file. To process your own dataset, you need to specify the file name, the expected columns, the content index and the label index.
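
The split described above can be sketched roughly as follows. This is only an illustration, not the project's actual preprocessing script: the function name <code>split_into_folders</code>, the tab delimiter, the reading of "expected columns" as a column count, the fixed random seed, and the use of a running index as the file name are all assumptions made for the example.

```python
import csv
import random
from pathlib import Path


def split_into_folders(input_path, output_dir, expected_columns,
                       content_index, label_index,
                       delimiter="\t", train_fraction=0.8, seed=0):
    """Write each example to <output_dir>/<train|test>/<label>/<i>.txt.

    Hypothetical helper: the parameter names mirror the settings listed
    above, but the delimiter and file naming are assumptions.
    """
    with open(input_path, newline="", encoding="utf-8") as handle:
        reader = csv.reader(handle, delimiter=delimiter, quoting=csv.QUOTE_NONE)
        # Keep only lines with the expected number of columns.
        rows = [row for row in reader if len(row) == expected_columns]

    # Shuffle reproducibly, then cut at the training fraction (80% by default).
    random.Random(seed).shuffle(rows)
    n_train = int(len(rows) * train_fraction)

    counts = {"train": 0, "test": 0}
    for i, row in enumerate(rows):
        split = "train" if i < n_train else "test"
        # The label becomes the sub-folder name.
        label_dir = Path(output_dir) / split / row[label_index]
        label_dir.mkdir(parents=True, exist_ok=True)
        # Each example goes into its own ".txt" file.
        (label_dir / f"{i}.txt").write_text(row[content_index], encoding="utf-8")
        counts[split] += 1
    return counts
```

For a three-column file with the text in column 1 and the label in column 2, a call might look like <code>split_into_folders("data.txt", "out", 3, 1, 2)</code>, which returns the number of examples written to each split.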