In this report, we discuss several approaches to data pre-processing and feature engineering for a text classification task. We start with an overview of the classification task, the model used, and the given baseline implementation in Section 2. We then iterate on that baseline, following the project documentation, by using TF-IDF for token weighting to achieve better accuracy, as detailed in Section 3. Next, we present our approaches to feature extraction and pre-processing, such as BPE [2] and Word2Vec [1], in Section 4. We discuss the accuracy and other performance metrics of these approaches in Section 5 and conclude the report in Section 6.