Text Analysis
- Sentiment Analysis
- Analyzing sentiment: whether a document is positive or negative, or which feelings it expresses
- Data: reviews (of movies, dramas, … any digital content)
Use an RNN.
Order is important in sentiment analysis: text carries temporal/sequential information.
The data passes through the hidden layer one item at a time, in serial order. Once an item has produced its output, that output is fed in together with the next item, and this step repeats over the whole sequence.
The sequence is the input (the X features).
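The serial processing described above can be sketched in a few lines of plain Python. This is only a minimal illustration, assuming a single hidden layer with a tanh activation; the function name `rnn_forward` and the toy weights are hypothetical and are not taken from any library or from the notes.

```python
import math

def rnn_forward(inputs, w_x, w_h, b):
    """Run a single-layer simple RNN over a sequence, one step at a time.

    inputs: list of input vectors, one per time step
    w_x:    hidden_size x input_size weight matrix (nested lists)
    w_h:    hidden_size x hidden_size weight matrix (nested lists)
    b:      list of hidden_size biases
    Returns the list of hidden states, one per time step.
    """
    hidden_size = len(b)
    h = [0.0] * hidden_size          # initial hidden state
    states = []
    for x in inputs:                 # strictly in sequence order
        new_h = []
        for i in range(hidden_size):
            total = b[i]
            total += sum(w_x[i][j] * x[j] for j in range(len(x)))
            total += sum(w_h[i][k] * h[k] for k in range(hidden_size))
            new_h.append(math.tanh(total))   # non-linear activation
        h = new_h                    # this output feeds the next step
        states.append(h)
    return states

# Hypothetical toy weights and a three-step input sequence.
w_x = [[0.5, -0.3], [0.1, 0.8]]
w_h = [[0.2, 0.0], [-0.4, 0.6]]
b = [0.0, 0.1]
sequence = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]

states = rnn_forward(sequence, w_x, w_h, b)
reversed_states = rnn_forward(sequence[::-1], w_x, w_h, b)
```

Feeding the same items in reverse order gives a different final hidden state, which is one way to see that the model uses the order of the sequence.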
When preprocessing the words, the unused (unnecessary) words are removed first; the remaining words are then encoded with one-hot encoding and vectorized so that the vectors reflect the relations between words. The model used on these vectors is an LSTM.
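The preprocessing steps described above (dropping unnecessary words, then one-hot encoding) might look roughly like this. It is a sketch under assumptions: the stop-word list, the whitespace tokenizer, and the helper names are all hypothetical. One-hot vectors on their own do not capture relations between words; that part is usually learned by a later vectorization step, which is not shown here.

```python
# Hypothetical stop-word list; real projects usually use a larger one.
STOP_WORDS = {"the", "a", "is", "was", "this", "and"}

def preprocess(text):
    """Lowercase, split on whitespace, and drop stop words."""
    tokens = text.lower().split()
    return [t for t in tokens if t not in STOP_WORDS]

def build_vocab(token_lists):
    """Assign each distinct token an index in order of first appearance."""
    vocab = {}
    for tokens in token_lists:
        for token in tokens:
            if token not in vocab:
                vocab[token] = len(vocab)
    return vocab

def one_hot(token, vocab):
    """Return a vector of zeros with a single 1 at the token's index."""
    vector = [0] * len(vocab)
    vector[vocab[token]] = 1
    return vector

reviews = ["This movie was great", "The drama is boring and slow"]
cleaned = [preprocess(r) for r in reviews]
vocab = build_vocab(cleaned)
encoded = [[one_hot(t, vocab) for t in tokens] for tokens in cleaned]
```

Each review becomes a sequence of one-hot vectors, one per remaining word, and the two sequences have different lengths.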
pad_sequence: the sentences do not all have the same length, so the vacant positions are padded with a fixed number (generally 0) until every sequence has the same length.
In the simple RNN case, a long document passes through many recurrent steps, so the relation between distant words gradually weakens and model performance suffers. Padding does not address this; it is the weakness the LSTM is meant to reduce.
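The padding step can be illustrated with a small hand-written helper. This is a sketch of the idea only, not the library function itself: `pad_with_zeros` is a hypothetical name, the word indices are made up, and padding at the end is an assumption (library helpers may pad at the front instead).

```python
def pad_with_zeros(sequences, max_len=None, pad_value=0):
    """Pad (or truncate) integer sequences to one common length.

    If max_len is None, the length of the longest sequence is used.
    Padding is appended at the end of each sequence.
    """
    if max_len is None:
        max_len = max(len(seq) for seq in sequences)
    padded = []
    for seq in sequences:
        trimmed = seq[:max_len]                       # cut if too long
        filler = [pad_value] * (max_len - len(trimmed))
        padded.append(trimmed + filler)               # fill if too short
    return padded

# Hypothetical integer-encoded reviews; index 0 is reserved for padding.
encoded_reviews = [[4, 7], [3, 9, 2, 5], [8]]
padded = pad_with_zeros(encoded_reviews)
```

After padding, every row has the same length, so the batch can be stored as a rectangular array and fed to the model.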
RNN and LSTM are non-linear models: each step applies a non-linear activation function such as tanh or sigmoid.