PHASE 3 · Chapter 15

Sentiment Analysis

Detecting whether a text expresses positive or negative emotion.

Introduction

When you read a single product review, you can tell whether the user is happy or angry. But a site like Amazon receives an enormous number of reviews every day, far more than people could ever finish reading. This is where sentiment analysis comes in: it detects the emotion in text automatically.

Concept

Sentiment analysis is a special form of text classification in which the categories are emotions or opinions, usually Positive, Negative, and Neutral. Some advanced systems can also detect fine-grained emotions (joy, anger, sadness, fear). Aspect-based sentiment analysis extracts the opinion about a specific feature (camera, battery).

Simple Explanation

Consider a restaurant review: 'The food was good but the service was very slow.' Overall it is mixed, but it is positive on the 'food' aspect and negative on the 'service' aspect. Sentiment analysis can work at several levels, from word-level patterns (good, bad, love, hate, terrible) up to context (not good = negative).
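
The word-level idea and the "not good" case above can be illustrated with a minimal sketch. The lexicon, the negator list, and the function name lexicon_score are assumptions made up for this example, not part of any library; real rule-based tools use much larger lexicons and more careful rules.

```python
# Tiny hand-written lexicon; the words and weights are illustrative assumptions.
LEXICON = {"good": 1, "great": 1, "love": 1,
           "bad": -1, "hate": -1, "terrible": -1, "slow": -1}
NEGATORS = {"not", "never", "no"}


def lexicon_score(text: str) -> int:
    """Sum word polarities, flipping a word that directly follows a negator."""
    score = 0
    negate = False
    for token in text.lower().split():
        word = token.strip(".,!?'\"")
        if word in NEGATORS:
            negate = True
            continue
        polarity = LEXICON.get(word, 0)
        score += -polarity if negate else polarity
        negate = False  # a negator only affects the very next word
    return score


print(lexicon_score("The food was good but the service was very slow"))
print(lexicon_score("not good"))
print(lexicon_score("not bad"))
```

A score above zero reads as positive, below zero as negative, and zero as mixed or neutral. The restaurant sentence scores zero overall, which is exactly why aspect-level analysis is needed to see the positive 'food' and negative 'service' opinions separately.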

Real-World Uses

Product review summarization on Amazon and Daraz.
Brand monitoring on Twitter/X: public opinion about a company.
Election prediction from social media conversations.
Stock market prediction from financial news sentiment.
Customer support: automatically detecting angry customers and prioritizing them.

Step-by-Step Breakdown

Step 1: Collect labeled reviews

IMDB or Amazon reviews that come with positive/negative labels.

Step 2: Preprocess

Lowercase the text, remove punctuation, and filter stopwords.
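
A minimal sketch of these three preprocessing operations, using only the standard library. The stopword list here is a small assumption for illustration. It deliberately keeps "not", because some standard stopword lists (NLTK's English list, for example) include negation words, and dropping them would turn "not good" into "good".

```python
import string

# A deliberately small stopword list (an assumption for illustration).
# Negation words such as "not" are kept on purpose.
STOPWORDS = {"the", "a", "an", "is", "it", "was", "this", "and", "of", "to"}


def preprocess(text: str) -> list[str]:
    """Lowercase, strip ASCII punctuation, split on whitespace, drop stopwords."""
    lowered = text.lower()
    no_punct = lowered.translate(str.maketrans("", "", string.punctuation))
    return [tok for tok in no_punct.split() if tok not in STOPWORDS]


print(preprocess("This phone is NOT good!"))
```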

Step 3: Vectorize

Use TF-IDF or pre-trained embeddings.
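
To make the TF-IDF option concrete, here is a small from-scratch sketch. It assumes the plain textbook definition (term frequency times log of inverse document frequency); libraries such as scikit-learn apply smoothed and normalized variants, so their numbers will differ. The function name tfidf and the sample documents are made up for this example.

```python
import math
from collections import Counter


def tfidf(docs: list[list[str]]) -> list[dict[str, float]]:
    """Return one {term: weight} dict per tokenized document.

    Assumes tf = count / document length and idf = log(N / document frequency).
    """
    n_docs = len(docs)
    # Number of documents in which each term appears at least once.
    doc_freq = Counter(term for doc in docs for term in set(doc))
    vectors = []
    for doc in docs:
        counts = Counter(doc)
        vectors.append({
            term: (count / len(doc)) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return vectors


docs = [["great", "camera"], ["terrible", "camera"], ["great", "battery"]]
vectors = tfidf(docs)
print(vectors[1])
```

In the second document, "terrible" gets a higher weight than "camera" because it appears in fewer documents, which is the property that makes TF-IDF useful for highlighting distinctive words.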

Step 4: Train a model

Train a classifier, or use a pre-trained sentiment model.
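
As one possible illustration of the "classifier" route, the sketch below trains a multinomial Naive Bayes model on token counts. Naive Bayes is chosen here only because it fits in a few lines; the chapter does not prescribe it, and logistic regression or a fine-tuned transformer are equally valid choices. The four training examples are invented, and a real model needs thousands of labeled reviews.

```python
import math
from collections import Counter, defaultdict


def train_naive_bayes(examples):
    """examples: list of (tokens, label). Returns the counts needed for prediction."""
    word_counts = defaultdict(Counter)  # label -> word -> count
    label_counts = Counter()            # label -> number of documents
    vocab = set()
    for tokens, label in examples:
        label_counts[label] += 1
        word_counts[label].update(tokens)
        vocab.update(tokens)
    return word_counts, label_counts, vocab


def predict(tokens, model):
    """Pick the label with the highest log-probability (add-one smoothing)."""
    word_counts, label_counts, vocab = model
    total_docs = sum(label_counts.values())
    best_label, best_logp = None, -math.inf
    for label in label_counts:
        logp = math.log(label_counts[label] / total_docs)
        total_words = sum(word_counts[label].values())
        for tok in tokens:
            if tok not in vocab:
                continue  # ignore words never seen during training
            logp += math.log((word_counts[label][tok] + 1)
                             / (total_words + len(vocab)))
        if logp > best_logp:
            best_label, best_logp = label, logp
    return best_label


train = [
    (["amazing", "camera"], "positive"),
    (["love", "battery"], "positive"),
    (["worst", "purchase"], "negative"),
    (["broke", "waste"], "negative"),
]
model = train_naive_bayes(train)
print(predict(["love", "camera"], model))
print(predict(["waste", "purchase"], model))
```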

Step 5: Use in production

Predict the sentiment of new reviews.

Python Code

from transformers import pipeline

# Load the library's default pre-trained sentiment model (downloaded on first use).
sentiment = pipeline("sentiment-analysis")

reviews = [
    "This phone has an amazing camera and great battery life!",
    "Worst purchase ever. Broke within a week. Total waste of money.",
    "It is okay, nothing special but does the job.",
    "Absolutely love it! Best decision I made this year.",
]

for review in reviews:
    # The pipeline returns a list with one dict per input text; take the first.
    result = sentiment(review)[0]
    print(f"Text: {review}")
    print(f"  -> {result['label']} (score: {result['score']:.4f})\n")

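Explanation

HuggingFace's pipeline() function loads a pre-trained sentiment model (default: distilbert-base-uncased-finetuned-sst-2-english). For each text it returns a POSITIVE or NEGATIVE label together with a confidence score. This model is binary, so a lukewarm review such as the third example still receives POSITIVE or NEGATIVE rather than a neutral label. No training is needed, so it works out of the box; evaluate it on your own data before relying on it in production.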
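
Because the default model only returns POSITIVE or NEGATIVE, one rough workaround for getting a third class is to treat low-confidence predictions as NEUTRAL. The sketch below is a heuristic, not a feature of the library: the function name to_three_class and the 0.75 threshold are assumptions, low confidence is not the same thing as true neutrality, and the sample dictionaries are hand-written stand-ins rather than real model output.

```python
def to_three_class(result: dict, threshold: float = 0.75) -> str:
    """Map a binary {'label', 'score'} result to POSITIVE / NEGATIVE / NEUTRAL.

    Predictions whose score is below `threshold` are reported as NEUTRAL.
    """
    if result["score"] < threshold:
        return "NEUTRAL"
    return result["label"]


# Hand-written example inputs in the same shape the pipeline returns.
print(to_three_class({"label": "POSITIVE", "score": 0.99}))
print(to_three_class({"label": "NEGATIVE", "score": 0.60}))
```

With the pipeline from the code above, this could be applied as to_three_class(sentiment(review)[0]). A model trained with an explicit neutral class is usually the more reliable option.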

Common Mistakes

Sarcasm is hard to detect: for 'Oh great, another bug!', the model may predict positive.
Negation handling: 'not bad' is actually positive, but a simple model treats it as negative.
Domain mismatch: a model trained on movie reviews may not work well on product reviews.
Using an English model for Bangla: a multilingual or Bangla-specific model is needed.

Exercises

Train your own sentiment classifier on the IMDB dataset.
Compare against rule-based sentiment using VADER (NLTK).
Apply multilingual BERT to Bangla reviews.
Aspect-based: for 'the camera is good, the battery is bad', give the two aspects separate scores.

Mini Project

Product Review Sentiment Dashboard

A Python script that reads product reviews from a CSV file, predicts their sentiment with the HuggingFace pipeline, and produces a final report showing how many reviews are positive, negative, or neutral, with the top 5 most negative reviews highlighted.
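
One possible starting skeleton for this project is sketched below. Everything in it is an assumption for illustration: the 'review' column name, the build_report function, and the stub_classify stand-in, which replaces the real model so the sketch runs without downloading anything.

```python
import csv
import io
from collections import Counter


def build_report(csv_file, classify, top_n=5):
    """Read reviews from a CSV file object and summarize predicted sentiment.

    `classify` is any callable that returns {'label': str, 'score': float}
    for a text. The 'review' column name is an assumption.
    """
    rows = []
    for row in csv.DictReader(csv_file):
        result = classify(row["review"])
        rows.append((row["review"], result["label"], result["score"]))
    counts = Counter(label for _, label, _ in rows)
    # Most confidently negative reviews first.
    negatives = sorted((r for r in rows if r[1] == "NEGATIVE"),
                       key=lambda r: r[2], reverse=True)
    return counts, [text for text, _, _ in negatives[:top_n]]


def stub_classify(text):
    """Hypothetical stand-in for a real model, used only to run the sketch."""
    lowered = text.lower()
    if "worst" in lowered:
        return {"label": "NEGATIVE", "score": 0.99}
    if "bad" in lowered:
        return {"label": "NEGATIVE", "score": 0.80}
    return {"label": "POSITIVE", "score": 0.90}


sample = io.StringIO("review\nGreat phone\nWorst purchase ever\nBad battery\n")
counts, worst = build_report(sample, stub_classify)
print(dict(counts))
print(worst)
```

To use the real model, pass lambda text: sentiment(text)[0] as classify, where sentiment is the pipeline object from the Python code section, and open the CSV with open(path, newline="", encoding="utf-8"). Since the default model has no neutral label, the neutral count needs either a three-class model or a thresholding heuristic.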

Summary

Sentiment analysis = automatically detecting the emotion in text.
Positive / Negative / Neutral: the three basic classes.
Rule-based (VADER) vs ML (logistic regression) vs Transformer (BERT).
HuggingFace pipeline = sentiment prediction with zero training.
Real-world uses: brand monitoring, review summaries, social listening.
