AI and Product Management

This part is really just me sharing what I have picked up from different YouTube videos, webinars, courses, and blogs.

We are not here to do a deep dive into machine learning or AI theory, and honestly, I am not a scientist either. This blog is all about the stuff a product manager actually needs to know about AI.

The AI Landscape

Artificial Intelligence (AI)
Software that mimics human smarts: spotting patterns, predicting outcomes, or generating fresh content.

Generative AI
The buzziest slice of AI; it creates net-new text, code, images, or audio instead of just analysing old data. ChatGPT, Midjourney, and GitHub Copilot all live here.

Machine-Learning Basics Every PM Should Name-Drop

If you're a PM diving into AI, here’s a quick rundown of some terms you’ll want to know:

Neural Networks: Think of these as a simplified version of the human brain. They're made up of small interconnected nodes (neurons) that process data, spot patterns, and make predictions.

Transformers: These are a special type of neural network built to handle sequential data, like words in a sentence. They use something called “self-attention,” which lets them figure out which words matter most to each other, even if they're far apart.

LLMs (Large Language Models): These are big models built on the transformer architecture. They're trained on huge text datasets, which lets them pick up context, produce human-like text, and respond naturally. Training usually happens in stages, for example large-scale pre-training on raw text followed by fine-tuning on curated examples and human feedback, and that combination is what makes them so good at language tasks.

How Neural Networks Work (60-Second Version)

  1. Input → Features
    Each data point (pixel, word, etc.) is turned into numbers.

  2. Layers & Weights
    Numbers flow through stacked layers; weights decide which paths matter.

  3. Activation Functions
    Fancy math (ReLU, Tanh) lets the net learn complex, non-linear stuff.

  4. Loss Function & Back-Propagation
    The network guesses, compares its guess against the truth, then tweaks its weights to improve next time.
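
To make the four steps above a bit more concrete, here is a minimal sketch in plain Python. It is a toy, not how real frameworks do it: the function name `train_tiny_net`, the network size, the learning rate, and the tiny OR dataset are all assumptions picked purely for illustration.

```python
import math
import random

def train_tiny_net(data, hidden=3, epochs=2000, lr=0.1, seed=0):
    """Train a one-hidden-layer network with plain gradient descent.

    Returns (predict_function, loss_history).
    """
    rng = random.Random(seed)
    n_in = len(data[0][0])
    # Step 2: layers & weights. Weights start as small random numbers.
    w1 = [[rng.uniform(-1, 1) for _ in range(n_in)] for _ in range(hidden)]
    b1 = [0.0] * hidden
    w2 = [rng.uniform(-1, 1) for _ in range(hidden)]
    b2 = 0.0

    def forward(x):
        # Step 3: tanh is the non-linear activation on the hidden layer.
        h = [math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
             for row, b in zip(w1, b1)]
        y = sum(w * hi for w, hi in zip(w2, h)) + b2
        return h, y

    history = []
    for _ in range(epochs):
        total = 0.0
        for x, target in data:  # Step 1: inputs are already numbers here.
            h, y = forward(x)
            err = y - target    # Step 4: compare the guess against the truth...
            total += err * err  # ...using squared error as the loss.
            # Back-propagation: nudge every weight against its gradient.
            for j in range(hidden):
                grad_h = err * w2[j] * (1 - h[j] ** 2)  # uses w2[j] before its update
                w2[j] -= lr * err * h[j]
                for i in range(n_in):
                    w1[j][i] -= lr * grad_h * x[i]
                b1[j] -= lr * grad_h
            b2 -= lr * err
        history.append(total / len(data))
    return (lambda x: forward(x)[1]), history

# Toy dataset (an assumption for illustration): logical OR of two inputs.
OR_DATA = [([0, 0], 0), ([0, 1], 1), ([1, 0], 1), ([1, 1], 1)]
predict, history = train_tiny_net(OR_DATA)
print("loss at start:", round(history[0], 4), "loss at end:", round(history[-1], 4))
```

The thing to watch is the loss: it should shrink as the weights get tweaked over and over, which is all "training" really means here.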

If you're intrigued, you can experiment with the TensorFlow Playground.

Transformers have completely changed the AI game, especially when it comes to handling sequences like text and speech. Since the core idea works similarly across formats, let's stick with text examples for simplicity.

Imagine we're translating the sentence "My name is Anubhav Singhmaar" from English into Italian.

Step 1: Tokenization

The first thing a model does is chop the sentence into tiny units called tokens. Most examples you see online say each word is a token, but that is not the full story. In real life, tokens can be full words, parts of words, or sometimes just a single character. It all depends on how the tokenizer decides to break things down. For example, our sentence could be split into: “My”, “name”, “is”, “An”, “ub”, “hav”, “Singh”, “maar”.
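
Here is a rough sketch of how a subword tokenizer can end up with word pieces. It uses a simple "grab the longest known piece first" rule, which is only one of several approaches real tokenizers use. The `tokenize` function and the `VOCAB` set are hypothetical, and the vocabulary is hand-picked so the result matches the split used in this post.

```python
def tokenize(text, vocab):
    """Greedy longest-match subword tokenizer (toy version)."""
    tokens = []
    for word in text.split():
        start = 0
        while start < len(word):
            # Try the longest piece first, then shrink until one is known.
            for end in range(len(word), start, -1):
                piece = word[start:end]
                # A single character is always accepted as a fallback.
                if piece in vocab or end - start == 1:
                    tokens.append(piece)
                    start = end
                    break
    return tokens

# Hypothetical vocabulary, chosen only to reproduce the split in this post.
VOCAB = {"My", "name", "is", "An", "ub", "hav", "Singh", "maar"}

print(tokenize("My name is Anubhav Singhmaar", VOCAB))
# ['My', 'name', 'is', 'An', 'ub', 'hav', 'Singh', 'maar']
```

Common words stay whole because they are in the vocabulary, while a rarer name gets broken into smaller pieces the tokenizer does know.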

Step 2: Positional Encoding

Transformers look at all the tokens at once, so they need another way to know which token comes first, second, and so on. To solve this, they give each token a little tag that marks its position in the sentence:

  • #1 “My”

  • #2 “name”

  • #3 “is”

  • #4 “An”

  • #5 “ub”

  • #6 “hav”

  • #7 “Singh”

  • #8 “maar”

This way, even though the model processes all the tokens at the same time, it still knows the order they appear in: which one comes first, which one follows, and so on.
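
In practice the "tag" is a small list of numbers rather than a literal #1 or #2. One well-known scheme is the sine and cosine pattern from the original transformer design; many newer models use other schemes, so treat this as one example, not the rule. The function name `positional_encoding` and the tiny size `d_model=4` are assumptions for illustration.

```python
import math

def positional_encoding(position, d_model):
    """Sinusoidal position vector for one token position (0-based)."""
    vec = []
    for i in range(d_model):
        # Each pair of dimensions (even, odd) shares one frequency.
        angle = position / (10000 ** ((i - i % 2) / d_model))
        vec.append(math.sin(angle) if i % 2 == 0 else math.cos(angle))
    return vec

tokens = ["My", "name", "is", "An", "ub", "hav", "Singh", "maar"]
for pos, tok in enumerate(tokens):
    tag = positional_encoding(pos, d_model=4)
    print(pos + 1, tok, [round(v, 3) for v in tag])
```

Every position gets its own distinct pattern of numbers, and the model adds that pattern to the token's own numbers so that order is baked in.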

Step 3: Encoder
Once the tokens have their positional tags, they move into the encoder. Think of the encoder as a multi-layered processor that takes all these tagged tokens and transforms them into something the model can understand.

The big idea here is something called self-attention. With self-attention, every token can pay attention to every other token in the sentence. Basically, each word can check out all the other words to understand the full context, even words that are far apart from each other.

After self-attention does its thing, the results pass through another step called a feedforward neural network. This helps the model pick up more detailed patterns by applying complex, non-linear transformations.
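
For the curious, here is a stripped-down sketch of the self-attention idea in plain Python. One big simplification, and it is my assumption rather than how production models work: real transformers first pass each token through learned "query", "key", and "value" weights, while this toy uses the token vectors directly. The vectors themselves are made up.

```python
import math

def softmax(scores):
    """Turn raw scores into positive weights that sum to 1."""
    top = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def self_attention(vectors):
    """Return (outputs, weights) for a list of equal-length token vectors."""
    d = len(vectors[0])
    outputs, weights = [], []
    for query in vectors:
        # Score this token against every token (including itself).
        scores = [sum(q * k for q, k in zip(query, key)) / math.sqrt(d)
                  for key in vectors]
        row = softmax(scores)
        # The new vector is a weighted mix of all token vectors.
        mixed = [sum(w * vec[j] for w, vec in zip(row, vectors))
                 for j in range(d)]
        weights.append(row)
        outputs.append(mixed)
    return outputs, weights

# Made-up 2-number vectors standing in for three tokens.
toy_vectors = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
outputs, weights = self_attention(toy_vectors)
for row in weights:
    print([round(w, 2) for w in row])
```

Each printed row shows how much one token "listens" to every token in the sentence, and each row adds up to 1.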

Step 4: Decoder
After the encoder finishes processing, the decoder takes over. Its job is to turn the encoder's output into the final translated text. It does this step by step, one token at a time, using self-attention over the tokens it has produced so far, attention over the encoder's output, and another feedforward network to generate each token.

There's also a clever twist here called masked attention. Masked attention ensures the model can only look at the tokens it has already created, not future ones. So, as the model builds a sentence, it only uses words it has already translated, keeping each new token accurate and relevant to what came before.
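
Masking is easy to picture in code. In this small sketch, each position simply gets a weight of zero for anything that comes after it. Real implementations usually get the same effect by setting the future scores to negative infinity before the softmax step. The function name `causal_attention_weights` and the toy vectors are assumptions for illustration.

```python
import math

def causal_attention_weights(vectors):
    """Attention weights where position i can only see positions 0..i."""
    d = len(vectors[0])
    rows = []
    for i, query in enumerate(vectors):
        visible = vectors[: i + 1]  # hide everything after position i
        scores = [sum(q * k for q, k in zip(query, key)) / math.sqrt(d)
                  for key in visible]
        top = max(scores)
        exps = [math.exp(s - top) for s in scores]
        total = sum(exps)
        # Future positions get exactly zero weight.
        row = [e / total for e in exps] + [0.0] * (len(vectors) - i - 1)
        rows.append(row)
    return rows

# Made-up vectors standing in for three already-generated tokens.
toy_vectors = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
for row in causal_attention_weights(toy_vectors):
    print([round(w, 2) for w in row])
```

The first token can only look at itself, the second at the first two, and so on, which gives the weights a triangular shape.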

You might have heard that transformers and the human brain share some similarities because transformers use neural networks inspired by how our brains work. But let’s take a look at how they actually differ.

[Image: comparison of transformers and the human brain]

Anubhav Singhmaar 
