AI Scholarium

NLP Tools

Experiment with tokenisation, simple sentiment rules, and text classification sandboxes. Learn how AI systems interpret and structure language.

Tokenisation Playground

Paste any English text below. This tool will:
• Lowercase the text
• Strip basic punctuation
• Split into whitespace-delimited tokens
• Compute token count and simple frequency statistics
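The four steps above can be sketched in a few lines of Python. This is an illustrative sketch, not the page's actual implementation: the function names `tokenize` and `token_stats` are hypothetical, and "basic punctuation" is assumed to mean the ASCII characters in `string.punctuation`.

```python
import string
from collections import Counter


def tokenize(text: str) -> list[str]:
    """Lowercase, strip ASCII punctuation, split on whitespace."""
    lowered = text.lower()
    # Assumption: "basic punctuation" = string.punctuation (ASCII only).
    stripped = lowered.translate(str.maketrans("", "", string.punctuation))
    return stripped.split()


def token_stats(tokens: list[str]) -> dict:
    """Token count, unique count, and the three most frequent tokens."""
    counts = Counter(tokens)
    return {
        "tokens": len(tokens),
        "unique": len(counts),
        "top": counts.most_common(3),
    }


if __name__ == "__main__":
    toks = tokenize("Hello, world! Hello again.")
    print(toks)
    print(token_stats(toks))
```

Note that stripping punctuation this way also removes apostrophes, so "don't" becomes the single token "dont". Real tokenisers usually handle such cases with more care.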

Tokenised Output:

Tokens: — | Unique: —

Sentiment Analyzer (Rule-Based)

This is a simple lexical sentiment model: it counts positive and negative words using a handcrafted dictionary, then labels the text as Positive, Negative or Neutral.
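A lexical model of this kind might look like the sketch below. The word lists, the function name `analyze_sentiment`, and the labelling rule (the sign of positive count minus negative count) are assumptions made for illustration; the tool's actual dictionary and thresholds are not shown on this page.

```python
import string

# Hypothetical mini-dictionaries; the tool's real word lists are not shown.
POSITIVE = {"good", "great", "love", "excellent", "happy"}
NEGATIVE = {"bad", "terrible", "hate", "poor", "sad"}


def analyze_sentiment(text: str) -> dict:
    """Count dictionary hits and label the text by the sign of the net score."""
    cleaned = text.lower().translate(str.maketrans("", "", string.punctuation))
    tokens = cleaned.split()
    pos_hits = [t for t in tokens if t in POSITIVE]
    neg_hits = [t for t in tokens if t in NEGATIVE]
    net = len(pos_hits) - len(neg_hits)
    # Assumed rule: positive net -> Positive, negative net -> Negative, else Neutral.
    if net > 0:
        label = "Positive"
    elif net < 0:
        label = "Negative"
    else:
        label = "Neutral"
    return {
        "label": label,
        "positive": len(pos_hits),
        "negative": len(neg_hits),
        "net": net,
        "matched": pos_hits + neg_hits,
    }


if __name__ == "__main__":
    print(analyze_sentiment("I love this great phone, but the battery is bad."))
```

Because it only counts words, a model like this ignores negation and context: "not good" still scores as positive.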

Overall Sentiment: —

Positive score: 0
Negative score: 0
Net score: 0

Matched Words:

Simple Text Classifier

This rule-based classifier guesses a topic label (Tech, Health, or Finance) by matching keywords in the text. It is a simplified illustration of the idea behind text categorisation.
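Keyword matching for topic labels can be sketched as follows. The keyword sets, the `classify` function, and the "Unknown" fallback for texts with no matches are all hypothetical; only the three labels come from the description above.

```python
import string

# Hypothetical keyword lists; only the three labels come from the page.
KEYWORDS = {
    "Tech": {"software", "computer", "ai", "algorithm", "data"},
    "Health": {"doctor", "medicine", "hospital", "exercise", "diet"},
    "Finance": {"bank", "stock", "investment", "loan", "market"},
}


def classify(text: str) -> tuple[str, dict[str, int]]:
    """Return the label with the most keyword hits, plus the per-label scores."""
    cleaned = text.lower().translate(str.maketrans("", "", string.punctuation))
    tokens = cleaned.split()
    scores = {
        label: sum(1 for t in tokens if t in words)
        for label, words in KEYWORDS.items()
    }
    # On a tie, max() keeps the first label in dictionary order.
    best = max(scores, key=scores.get)
    if scores[best] == 0:
        # Assumed fallback when no keyword matches at all.
        return "Unknown", scores
    return best, scores


if __name__ == "__main__":
    print(classify("The bank raised its loan rates as the stock market fell."))
```

Statistical classifiers replace the hand-picked keyword sets with weights learned from labelled examples, but the input and output have the same shape: text in, category out.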

Predicted Category:

Word Similarity Sandbox

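This demo assigns a random 3-dimensional vector to each word and computes the cosine similarity between the two vectors. Because the vectors are random, the score carries no real semantic meaning; the demo only shows conceptually how similarity between embeddings is computed.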

Similarity Score: —

Vector 1: —

Vector 2: —