Palak Gupta👋
Turning data into insights with my Strategic Data Analysis
Portfolio Project 10:
The Text Classifier Model project involved building a machine learning system capable of categorizing text into predefined categories. The aim was to automate text classification tasks such as news categorization, product review tagging, sentiment analysis, or topic detection, using natural language processing (NLP) and supervised learning techniques.
Research: The project started by exploring real-world applications of text classification across domains, including content moderation, customer feedback analysis, and news curation. A labeled dataset was selected based on the use case (e.g., news articles categorized by topic, or product reviews classified by sentiment).
Information Architecture: Text data was preprocessed with steps such as lowercasing, punctuation removal, stop word filtering, tokenization, and lemmatization. Features were extracted using TF-IDF vectorization, and for deeper models, word embeddings such as Word2Vec or GloVe were considered.
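The preprocessing and TF-IDF steps above can be sketched in plain Python. This is a minimal illustration, not the project's actual code: the stop word list, the sample sentences, and the function names `preprocess` and `tfidf` are all assumptions, lemmatization is left out because it needs an external library such as NLTK or spaCy, and the weighting shown is one common smoothed TF-IDF variant with L2 normalization.

```python
import math
import string
from collections import Counter

# Tiny illustrative stop word list (an assumption; a real project would
# normally use a fuller list from an NLP library).
STOP_WORDS = {"a", "an", "the", "is", "are", "of", "to", "in", "and", "on", "for"}


def preprocess(text):
    """Lowercase, remove punctuation, tokenize on whitespace, drop stop words."""
    text = text.lower()
    text = text.translate(str.maketrans("", "", string.punctuation))
    return [tok for tok in text.split() if tok not in STOP_WORDS]


def tfidf(docs):
    """Return one {term: weight} dict per document, L2-normalized."""
    tokenized = [preprocess(d) for d in docs]
    n_docs = len(tokenized)
    # Document frequency: in how many documents each term appears.
    doc_freq = Counter(term for toks in tokenized for term in set(toks))
    vectors = []
    for toks in tokenized:
        counts = Counter(toks)
        # Smoothed IDF: rarer terms receive larger weights.
        weights = {
            term: count * (math.log((1 + n_docs) / (1 + doc_freq[term])) + 1)
            for term, count in counts.items()
        }
        norm = math.sqrt(sum(w * w for w in weights.values())) or 1.0
        vectors.append({term: w / norm for term, w in weights.items()})
    return vectors


# Hypothetical sample documents, invented for illustration.
docs = [
    "The market rallied on strong earnings.",
    "The team won the final match!",
    "Earnings season lifts the market.",
]
vectors = tfidf(docs)
```

In the first document, "rallied" ends up with a higher weight than "market" because "market" also appears in the third document, which is the behavior that makes TF-IDF useful for separating topics.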
Wireframing and Prototyping: The system was designed to accept raw text as input and return the predicted category. A prototype was developed using Streamlit to simulate real-time classification. Several algorithms (Logistic Regression, Random Forest, SVM, and LSTM for sequential data) were trained and evaluated for performance comparison.
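A comparison of the classical models could look roughly like the sketch below, assuming scikit-learn is available. The toy texts and labels are invented for illustration and are far too small to give meaningful scores, the hyperparameters are assumptions rather than the project's settings, and the LSTM is omitted because it would need a deep learning framework.

```python
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, f1_score
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC

# Hypothetical toy data; a real run would use the labeled dataset.
train_texts = [
    "Stocks climbed after the quarterly earnings report",
    "The central bank raised interest rates again",
    "Investors worry about inflation and bond yields",
    "The striker scored twice in the cup final",
    "The home team won the championship match",
    "The coach praised the goalkeeper after the game",
]
train_labels = ["business", "business", "business", "sports", "sports", "sports"]
test_texts = [
    "Earnings and interest rates moved the stocks",
    "The team scored in the final match",
]
test_labels = ["business", "sports"]

# Candidate classifiers, each wrapped in the same TF-IDF pipeline below.
candidates = {
    "logistic_regression": LogisticRegression(max_iter=1000),
    "random_forest": RandomForestClassifier(n_estimators=100, random_state=0),
    "linear_svm": LinearSVC(),
}

results = {}
for name, clf in candidates.items():
    model = make_pipeline(TfidfVectorizer(lowercase=True, stop_words="english"), clf)
    model.fit(train_texts, train_labels)
    predicted = model.predict(test_texts)
    results[name] = {
        "accuracy": accuracy_score(test_labels, predicted),
        "macro_f1": f1_score(test_labels, predicted, average="macro"),
    }

for name, scores in results.items():
    print(f"{name}: accuracy={scores['accuracy']:.2f}, macro_f1={scores['macro_f1']:.2f}")
```

Wrapping the vectorizer and classifier in one pipeline means the fitted object accepts raw text directly, which is convenient when a front end such as Streamlit passes user input straight to `predict`.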
The text classifier achieved over 90% accuracy and a macro F1-score of 0.88 on the selected dataset. It effectively categorized input text into topics like business, politics, tech, and sports (for news classification) or positive/neutral/negative (for sentiment tasks). The model was integrated into a user-friendly Streamlit interface, making it accessible for non-technical users. This project demonstrated how text classification can streamline decision-making, enhance customer insights, and power intelligent content systems.
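Because the results report both accuracy and a macro F1-score, a short sketch of how the two differ may help. The labels and predictions below are hypothetical, not the project's data, and the helper names `accuracy` and `macro_f1` are illustrative. Macro F1 averages the per-class F1 scores with equal weight, so it drops sharply when a minority class is predicted poorly even if overall accuracy looks good.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def macro_f1(y_true, y_pred):
    """Unweighted mean of the per-class F1 scores."""
    labels = sorted(set(y_true) | set(y_pred))
    scores = []
    for label in labels:
        tp = sum(t == label and p == label for t, p in zip(y_true, y_pred))
        fp = sum(t != label and p == label for t, p in zip(y_true, y_pred))
        fn = sum(t == label and p != label for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        # F1 = 2*TP / (2*TP + FP + FN); defined as 0 when the class never occurs.
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Hypothetical imbalanced example: the model always predicts "positive".
y_true = ["positive"] * 8 + ["negative"] * 2
y_pred = ["positive"] * 10

acc = accuracy(y_true, y_pred)   # 8 of 10 correct
f1 = macro_f1(y_true, y_pred)    # mean of 16/18 and 0
print(f"accuracy={acc:.2f}, macro_f1={f1:.2f}")
```

Here accuracy is 0.80 while macro F1 is about 0.44, which is why reporting both gives a fuller picture on imbalanced data.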