Text-based industry classification by Autoencoder

  • Kyounghun Bae, Hanyang University
  • Daejin Kim, Ulsan National Institute of Science and Technology
  • Rocku Oh, Ulsan National Institute of Science and Technology
Industry classification has long been a crucial issue in financial analysis, yet classical industry classification systems have several limitations. Several studies have tried to overcome these limitations by using the text that firms use to describe their business processes and products. In this paper, we propose an industry classification methodology based on firms' business descriptions, using an autoencoder to reduce the dimensionality of the text representation and thereby avoid the high-dimensionality problem in vector space. The paper makes two main contributions. First, we overcome the limitation of the cosine similarity measure when the word vector is large and highly sparse by reducing the dimension of the word vector with the autoencoder. Second, we are able to visualize the relative industry relations among firms based on the lower-dimensional information extracted from the business description text. The relative locations describe industry-level relationships as well as the positions of individual firms that were subject to conflicting assignments under the classical classification scheme.
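
To illustrate the idea of reducing sparse word vectors before comparing firms, the following is a minimal sketch, not the authors' implementation. It assumes bag-of-words counts as the word vectors, a single-hidden-layer autoencoder (tanh encoder, linear decoder) trained by plain gradient descent in NumPy, and a two-dimensional code. The four business descriptions, the function names, and all hyperparameters are hypothetical and chosen only for illustration.

```python
import numpy as np


def bag_of_words(docs):
    """Build a dense count matrix (documents x vocabulary) from raw text."""
    vocab = sorted({w for d in docs for w in d.lower().split()})
    index = {w: i for i, w in enumerate(vocab)}
    X = np.zeros((len(docs), len(vocab)))
    for row, doc in enumerate(docs):
        for w in doc.lower().split():
            X[row, index[w]] += 1.0
    return X, vocab


def train_autoencoder(X, code_dim=2, epochs=3000, lr=0.01, seed=0):
    """Fit a tanh-encoder / linear-decoder autoencoder by gradient descent.

    Hyperparameters are illustrative assumptions, not values from the paper.
    """
    rng = np.random.default_rng(seed)
    n, d = X.shape
    W_enc = rng.normal(0.0, 0.1, (d, code_dim))
    W_dec = rng.normal(0.0, 0.1, (code_dim, d))
    losses = []
    for _ in range(epochs):
        H = np.tanh(X @ W_enc)          # low-dimensional code
        err = H @ W_dec - X             # reconstruction error
        losses.append(float(np.mean(err ** 2)))
        # Gradients of the squared error, up to a constant factor.
        grad_dec = H.T @ err / n
        grad_enc = X.T @ ((err @ W_dec.T) * (1.0 - H ** 2)) / n
        W_enc -= lr * grad_enc
        W_dec -= lr * grad_dec
    return W_enc, W_dec, losses


def encode(X, W_enc):
    """Map word-count vectors to their low-dimensional codes."""
    return np.tanh(X @ W_enc)


def cosine_similarity_matrix(Z):
    """Pairwise cosine similarity between the rows of Z."""
    norms = np.linalg.norm(Z, axis=1, keepdims=True)
    Zn = Z / np.clip(norms, 1e-12, None)
    return Zn @ Zn.T


# Hypothetical business descriptions (made up for this sketch).
docs = [
    "bank loans deposits mortgage lending",
    "bank credit cards lending deposits",
    "software cloud platform data services",
    "cloud software analytics data platform",
]

X, vocab = bag_of_words(docs)
W_enc, W_dec, losses = train_autoencoder(X)
Z = encode(X, W_enc)                    # one 2-D point per firm
S = cosine_similarity_matrix(Z)         # firm-to-firm similarity in code space
print(np.round(S, 2))
```

Because the code is two-dimensional, the rows of `Z` can be plotted directly as firm positions, which is one way to read the visualization described in the abstract. The abstract does not specify the paper's actual architecture or training setup, so this sketch should not be taken as a reproduction of it.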
