Good Data vs Bad Data 🗑️

The Golden Rule of AI

💡 "Garbage in, garbage out" 🗑️

If you feed an AI bad data, you'll get a bad AI. Simple as that!

😂 What Happens with Bad Data?

Story 1: The Racist Chatbot

In 2016, Microsoft released a chatbot called Tay on Twitter. It learned from Twitter conversations.

Within 16 hours, users had taught it to say horrible things. Microsoft shut it down.

The problem? Poor-quality, unfiltered training data!
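One simple defence is to check messages before an AI is allowed to learn from them. Here is a tiny Python sketch of that idea. The blocklist words and the `is_safe` and `filter_training_data` helpers are made up for this example. This is not how Microsoft built or fixed Tay, and real filters are far smarter than a word list.

```python
# Hypothetical blocklist: placeholder words standing in for harmful terms.
BLOCKLIST = {"badword", "insult"}

def is_safe(message: str) -> bool:
    """Return True if no blocklisted word appears in the message."""
    # Lower-case each word and strip simple punctuation before comparing.
    words = {w.strip(".,!?").lower() for w in message.split()}
    return words.isdisjoint(BLOCKLIST)

def filter_training_data(messages: list[str]) -> list[str]:
    """Keep only the messages that pass the safety check."""
    return [m for m in messages if is_safe(m)]

raw_messages = [
    "Hello, how are you?",
    "You are an insult!",
    "I love puppies",
    "That is a BADWORD.",
]
clean_messages = filter_training_data(raw_messages)
print(clean_messages)  # ['Hello, how are you?', 'I love puppies']
```

Only the two friendly messages survive, so those are the only ones the AI would learn from.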

Story 2: The Confused Self-Driving Car

Picture a self-driving car trained mostly on US roads. Tested in the UK (where people drive on the LEFT), it would get very confused.

The problem? Not enough diverse data!

Story 3: The Hiring AI

Amazon built an AI to screen job applications. Most of the past applications it learned from came from men, so the AI learned to prefer male candidates and to downgrade applications that mentioned the word "women's". Amazon eventually scrapped the tool.

The problem? Biased data!
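You can often spot this kind of bias just by counting before you train anything. The sketch below uses a made-up list of past applications (the numbers are invented for illustration and are not Amazon's real data), and the helper names `group_share` and `hire_rate` are assumptions for this example.

```python
from collections import Counter

# Invented toy data: (applicant_group, was_hired). Not real figures.
past_applications = (
    [("men", True)] * 60 + [("men", False)] * 20
    + [("women", True)] * 5 + [("women", False)] * 15
)

def group_share(records):
    """Fraction of all records that belong to each group."""
    counts = Counter(group for group, _ in records)
    total = sum(counts.values())
    return {group: n / total for group, n in counts.items()}

def hire_rate(records):
    """Fraction of each group's records that ended in a hire."""
    totals, hires = Counter(), Counter()
    for group, hired in records:
        totals[group] += 1
        hires[group] += hired  # True counts as 1, False as 0
    return {group: hires[group] / totals[group] for group in totals}

shares = group_share(past_applications)
rates = hire_rate(past_applications)
print(shares)
print(rates)
```

In this toy data, men make up 80% of the examples and were hired three times as often as women. An AI trained to copy these past decisions would copy that pattern too.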

What Makes Data GOOD?

| Quality | What it means |
|---------|--------------|
| ✅ Accurate | Information is correct |
| ✅ Complete | No important bits missing |
| ✅ Diverse | Represents all different cases |
| ✅ Recent | Not outdated or old |
| ✅ Relevant | Actually useful for the AI's task |
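Some of these qualities can be checked with a few lines of code. The sketch below looks only at "Complete" and "Recent". It is an illustration, not a standard tool: the record fields (`label`, `year`), the `find_problems` helper and the cut-off year are all assumptions made up for this example.

```python
# Toy dataset: each record describes one training example (fields are hypothetical).
records = [
    {"label": "dog", "year": 2023},
    {"label": "cat", "year": 2024},
    {"label": None, "year": 2022},   # incomplete: the label is missing
    {"label": "dog", "year": 2009},  # possibly outdated
]

def find_problems(rows, oldest_ok_year):
    """Return (row_index, problem) pairs for incomplete or old rows."""
    problems = []
    for i, row in enumerate(rows):
        if row.get("label") is None:
            problems.append((i, "missing label"))
        if row.get("year") is None or row["year"] < oldest_ok_year:
            problems.append((i, "too old or undated"))
    return problems

problems = find_problems(records, oldest_ok_year=2015)
print(problems)  # [(2, 'missing label'), (3, 'too old or undated')]
```

Checks like "Accurate" and "Relevant" are harder to automate and usually need a person to look at the data.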

🎮 Real Example: Training a Dog Detector

Bad data 🗑️: Only photos of golden retrievers on sunny days

Fails at: pugs, black dogs, dogs in snow

Good data ✅: Photos of 200+ breeds, all ages, all weathers, all angles

Works for: far more dogs in far more situations!
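A quick way to catch the "only golden retrievers" problem is to count how many photos you have of each breed. This is a minimal sketch: the breed lists are invented, and the `is_lopsided` helper and its 50% threshold are assumptions chosen just for this example.

```python
from collections import Counter

def breed_share(labels):
    """Fraction of photos that belong to each breed."""
    counts = Counter(labels)
    total = len(labels)
    return {breed: n / total for breed, n in counts.items()}

def is_lopsided(labels, max_share=0.5):
    """True if any single breed makes up more than max_share of the photos."""
    return any(share > max_share for share in breed_share(labels).values())

# Invented example datasets: one dominated by a single breed, one balanced.
bad_dataset = ["golden retriever"] * 98 + ["pug"] * 2
good_dataset = ["golden retriever", "pug", "poodle", "husky"] * 25

print(is_lopsided(bad_dataset))   # True
print(is_lopsided(good_dataset))  # False
```

The same counting trick works for weather, lighting or camera angle, as long as each photo is labelled with that information.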

💡 How Much Data Do Big AIs Need?

ChatGPT: GPT-3, the model family it grew out of, was trained on about 570 GB of filtered web text plus other sources (about 1.3 million books worth!) 📚
Google Photos: Trained on billions of labelled images
Spotify: Analyses 100 million songs
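Is "570 GB is about 1.3 million books" a sensible comparison? A little arithmetic gives a rough sanity check. The sketch assumes decimal gigabytes, plain text at about one byte per character, and roughly six characters per word including the space. All three are assumptions for this example, not figures from the source.

```python
GB = 1000**3  # bytes in a gigabyte (decimal convention, an assumption)

dataset_bytes = 570 * GB
books = 1_300_000

# Average size of one "book" if the text were split evenly.
bytes_per_book = dataset_bytes / books
print(round(bytes_per_book / 1000), "KB per book")  # 438 KB per book

# Assumed: ~1 byte per character and ~6 characters per word (with the space).
words_per_book = bytes_per_book / 6
print(round(words_per_book), "words per book")
```

That comes out to tens of thousands of words per book, which is about the length of a novel, so the comparison is in the right ballpark.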

Data collection is one of the most important (and expensive!) parts of building AI.

Quick check