2018 IEEE International Conference on Data Mining Workshops (ICDMW), pp. 191-198, Nov. 17, 2018
The 1st International Workshop on Cross-disciplinary Data Exchange and Collaboration (CDEC) in 2018 IEEE International Conference on Data Mining Workshops (ICDMW)
We propose a method to select and rank stocks related to a given theme. The method has two stages: obtaining related words, and selecting related stocks based on those words. First, given a theme word, the method selects highly similar words using an ensemble of word2vec models, and then modifies each similarity based on the results of matching the words against company information, including investor relations documents and homepages. Second, the top-10 similar words are matched against the company data, and sentences related to the theme are extracted from each company's data. Each company's final similarity is then calculated as the sum of the modified similarities of the related words found in its extracted sentences. Finally, the top-n related stocks are selected according to this final similarity. Targeting Japanese documents, companies, and stocks, we achieved 0.49 accuracy (precision, recall, and F1-value), which is better than random selection. In addition, by comparing the results with those obtained using a completely different theme, we verified that the proposed method works correctly and can filter related stocks effectively.
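The two stages described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation, and several parts are assumptions made only for illustration: each word2vec model is stood in for by a plain dictionary of precomputed similarities, the ensemble is taken to be a simple average (a word missing from a model contributes 0), words are matched to sentences by substring search rather than by Japanese tokenization, and each sentence-word match is counted once. The similarity modification step is omitted because the abstract does not give its formula. All function names and the toy data are hypothetical.

```python
from collections import defaultdict


def ensemble_similarity(models, theme):
    """Average each candidate word's similarity to the theme over all models.

    Assumption: each model is a dict {theme: {word: similarity}} standing in
    for a trained word2vec model; averaging is one possible ensemble rule.
    """
    totals = defaultdict(float)
    for model in models:
        for word, sim in model.get(theme, {}).items():
            totals[word] += sim
    return {word: total / len(models) for word, total in totals.items()}


def top_related_words(similarity, k=10):
    """Keep the k words most similar to the theme (top-10 in the abstract)."""
    ranked = sorted(similarity.items(), key=lambda item: item[1], reverse=True)
    return dict(ranked[:k])


def company_score(sentences, related):
    """Sum the similarities of related words found in a company's sentences.

    Assumption: substring matching, one count per (sentence, word) match.
    """
    score = 0.0
    for sentence in sentences:
        for word, sim in related.items():
            if word in sentence:
                score += sim
    return score


def select_stocks(companies, related, n):
    """Rank companies by score and return the names of the top n."""
    scores = {name: company_score(s, related) for name, s in companies.items()}
    return sorted(scores, key=scores.get, reverse=True)[:n]


# Toy data (hypothetical): two "models" and three companies.
models = [
    {"ai": {"machine learning": 0.8, "robot": 0.6}},
    {"ai": {"machine learning": 0.6, "cloud": 0.4}},
]
companies = {
    "A": ["We develop machine learning software.", "Our robot line grew."],
    "B": ["We sell cloud storage."],
    "C": ["We run restaurants."],
}

related = top_related_words(ensemble_similarity(models, "ai"), k=10)
ranking = select_stocks(companies, related, n=2)
print(ranking)
```

In this toy example, company A mentions two related words and so outranks B, which mentions one, while C matches none and is filtered out by the top-n cut.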
Mutual fund; Stocks; Text mining
@inproceedings{Hirano2018-cdec1,
  title={{Selection of Related Stocks using Financial Text Mining}},
  author={Masanori HIRANO and Hiroki SAKAJI and Shoko KIMURA and Kiyoshi IZUMI and Hiroyasu MATSUSHIMA and Shintaro NAGAO and Atsuo KATO},
  booktitle={2018 IEEE International Conference on Data Mining Workshops (ICDMW)},
  issn={2375-9259},
  isbn={978-1-5386-9288-2},
  pages={191-198},
  doi={10.1109/ICDMW.2018.00036},
  year={2018}
}