Reddit and the WallStreetBets subreddit have become a very hot topic in the capital
markets since the beginning of 2021. The discussions on these forums show the
potential to influence the stock market. This project aims to build a model that forecasts
market movement from the rich text data on Reddit. Specifically, I
have explored sentence embeddings, document embeddings, CNN-based models,
and sentiment analysis methods to leverage the text of posts and comments
for market forecasting. This project has tested and compared several
types of model architectures. So far, the results show that the model can
slightly improve on the naive forecasting method.