What is a classification problem? Could you give an example?

Mathew Farmer
AI ethics consultant and policy advisor.

Okay, no problem.


What is a Classification Problem?

Hey, glad to chat about this. Don't be intimidated by the name "classification problem"; it's actually much simpler than it sounds.

Simply put, a classification problem is about teaching computers to answer "multiple-choice questions."

You can imagine it as an intelligent "sorter." You tell it beforehand that there are several fixed bins (i.e., "categories"), and then you give it an item and ask it to determine which bin it should go into.

For example, when you teach a child about fruits, you point to an apple and tell them: "This is an apple," and point to a banana and tell them: "This is a banana." After a period of training, if you show them a new apple, they'll be able to recognize it themselves: "Oh, this is an apple!"

In machine learning, the process is similar:

  1. We first prepare a large amount of pre-classified data (e.g., ten thousand images, each labeled as "cat" or "dog").
  2. Then, we use this data to "train" an algorithmic model, allowing it to learn the characteristics of each category (e.g., cats usually have pointed ears and long whiskers, while dogs have different nose and face shapes).
  3. After training, if you give it a brand new image it hasn't seen before, it can tell you, based on its learned knowledge, whether this image is "more likely to be a cat" or "more likely to be a dog."
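To make the three steps concrete, here is a minimal sketch in plain Python. It is not how real image classifiers work: the two numeric features (ear pointiness and snout length) and the four training examples are made up for illustration, and the "model" is a simple nearest-centroid rule standing in for whatever algorithm you would actually train.

```python
import math
from collections import defaultdict


def train(examples):
    """Step 2: learn one centroid (average feature vector) per category.

    `examples` is a list of (features, label) pairs.
    """
    sums = {}
    counts = defaultdict(int)
    for features, label in examples:
        if label not in sums:
            sums[label] = [0.0] * len(features)
        for i, value in enumerate(features):
            sums[label][i] += value
        counts[label] += 1
    return {
        label: [s / counts[label] for s in total]
        for label, total in sums.items()
    }


def predict(centroids, features):
    """Step 3: return the label whose centroid is closest to the new item."""
    return min(
        centroids,
        key=lambda label: math.dist(centroids[label], features),
    )


# Step 1: pre-classified data. The features are hypothetical:
# [ear_pointiness, snout_length], each scaled to the range 0..1.
training_data = [
    ([0.9, 0.2], "cat"),
    ([0.8, 0.3], "cat"),
    ([0.3, 0.8], "dog"),
    ([0.2, 0.9], "dog"),
]

model = train(training_data)

# A new, unseen item: fairly pointed ears, fairly short snout.
print(predict(model, [0.7, 0.4]))
```

Whatever you feed it, the answer can only ever be one of the labels it saw during training, which is exactly the "fixed bins" idea.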

This is a typical classification process. The key point is that the answer is always one of a fixed set of options, not a continuous numerical value. Predicting a continuous value, such as tomorrow's temperature being 25.3 degrees, is a different kind of task called a "regression problem."
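One way to see the difference is to look at what each kind of function is allowed to return. The sketch below is purely illustrative: the function names, the temperature thresholds, and the "yesterday plus trend" formula are assumptions made for the example, not real forecasting methods.

```python
LABELS = ("cold", "mild", "hot")  # the fixed set of options


def classify_day(temperature_c: float) -> str:
    """Classification: the output is always one of LABELS."""
    if temperature_c < 10:
        return "cold"
    if temperature_c < 25:
        return "mild"
    return "hot"


def predict_temperature(yesterday_c: float, trend_c: float) -> float:
    """Regression: the output can be any number on a continuous scale."""
    return yesterday_c + trend_c


print(classify_day(27.0))              # one of the three labels
print(predict_temperature(24.0, 1.5))  # a number, not a category
```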

Some Real-Life Examples

Classification problems are ubiquitous in our daily lives; you probably rely on systems that solve them every day without even realizing it:

  • Spam Email Filtering
    • This is the most classic example. Your email system automatically determines whether an email is "spam" or "legitimate." This is a typical binary classification problem (two options).
  • Face Recognition Unlock
    • Your phone camera captures your face, and the system determines if this face belongs to the "owner." This is also a binary classification problem.
  • News Channel Classification
    • When you open a news app, articles are automatically categorized into different channels like "Sports," "Finance," "Entertainment," "Technology," etc. This is a multi-class classification problem (multiple options).
  • Medical Diagnosis Assistance
    • A doctor inputs a medical image (e.g., a CT scan) into an AI system, and the system analyzes it to make a judgment, such as whether a tumor is "benign" or "malignant."
  • Fraudulent Credit Card Transaction Detection
    • Banking systems use information such as your spending habits and each transaction's location and amount to instantly determine whether the transaction is "normal" or "potentially fraudulent."
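The spam and news examples above differ only in how many bins there are. Here is a rough sketch of both in Python. Note the big simplification: the keyword lists and the "two spam words" threshold are hypothetical hand-written rules, whereas a real system would learn its rules from labeled examples.

```python
# Binary classification: exactly two possible answers.
SPAM_WORDS = {"winner", "free", "prize"}  # hypothetical keyword list


def classify_email(text: str) -> str:
    words = set(text.lower().split())
    # Assumed rule: two or more spam keywords means "spam".
    return "spam" if len(words & SPAM_WORDS) >= 2 else "legitimate"


# Multi-class classification: more than two possible answers.
CHANNEL_KEYWORDS = {  # hypothetical keyword lists per channel
    "Sports": {"match", "goal", "team"},
    "Finance": {"stock", "market", "bank"},
    "Technology": {"chip", "software", "phone"},
}


def classify_article(text: str) -> str:
    words = set(text.lower().split())
    # Pick the channel whose keywords overlap most with the article.
    return max(
        CHANNEL_KEYWORDS,
        key=lambda channel: len(words & CHANNEL_KEYWORDS[channel]),
    )


print(classify_email("You are a winner claim your free prize"))
print(classify_article("The team scored a late goal in the match"))
```

In both cases the structure is the same: take an input, return exactly one label from a list that was fixed in advance.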

Core Idea

To summarize, the core of a classification problem is: given input data, predict which one of a predefined set of discrete class labels it belongs to.

Hope this explanation makes it easier for you to understand!