Last updated: June 22, 2026, 7:49 AM

Elements of AI Course

Part of learning is making mistakes and learning from them:

Chapter 1.

How should we define AI

AI in self-driving cars: Humans will move to a supervisory role.
AI in Content Recommendation: Filter bubbles, echo chambers, troll factories, fake news and new forms of propaganda.
AI in Media Processing: Seeing on a display is no longer believing.

What is and isn't AI:

AI - Autonomy and Adaptivity:
Autonomy: Perform tasks in complex environments without constant guidance by a user.
Adaptivity: Improve performance by learning from experience.

Suitcase words: Terms that carry a whole bunch of different meanings that come along even if we intend only one of them. Using suitcase words increases the risk of misinterpretation. For example, saying a computer system "understands" images because it can segment an image into distinct objects such as cars, pedestrians, buildings, and roads suggests more than it delivers: if a person is wearing a T-shirt with a photo of a road printed on it, it's still not okay to drive over that person.

Different AI systems cannot be compared on a single axis or dimension in terms of their intelligence. AI intelligence is narrow. Being able to solve one problem tells us nothing about the ability to solve another, different problem.

AI is a scientific discipline like mathematics or biology. AI is a collection of concepts, problems, and methods for solving them. "AI" is not a countable noun.

It's better to talk about AIness instead of whether something is AI or not. Say an AI method instead of an AI.

II

Related fields

CS>AI>Machine Learning>Deep Learning
Machine learning: Systems that improve their performance in a given task with more and more experience or data.
The "depth" of deep learning refers to the complexity of the mathematical model. The increased computing power of modern computers has allowed researchers to increase this complexity to levels that appear not only quantitatively but also qualitatively different.

Science often involves a number of progressively more specialized subfields, subfields of subfields, and so on. This enables researchers to zoom in on a particular topic, keep up with the ever-increasing amount of knowledge, produce new knowledge, and correct old knowledge.

Data Science: Covers several sub-disciplines that include machine learning and statistics, and certain aspects of computer science including algorithms, data storage, and web application development. It also requires an understanding of the domain in which it is applied, including its basic assumptions and constraints.

Robotics: Building and programming robots so that they can operate in complex, real-world scenarios. Robotics is the ultimate challenge of AI.
Many robotics-related AI problems are best approached by machine learning, which makes machine learning a central branch of AI for robotics.

Robot: A machine comprising sensors and actuators (which act on the environment) that can be programmed to perform a sequence of actions.

Any kind of vehicle with some level of autonomy, sensors, and actuators counts as robotics. However, software-based solutions such as customer service chatbots aren't counted as (real) robotics, even if they are called "software robots".

Taxonomy: A scheme for classifying many things that may be special cases of one another (think of concentric circles).

III

Philosophy of AI

Turing Test: A machine passes if a human interrogator, chatting in writing, can't distinguish it from a real human.

A few chatbots have already passed this test to some extent. One criticism is that the computer doesn't actually have to be intelligent but just behaves like a human or "appears" intelligent.

Chinese Room argument by John Searle: A non-Chinese-speaking person is locked in a room with a manual on how to respond to Chinese phrases, and does so.
The person inside (the algorithm) doesn't actually know Chinese, even if the person outside the room gets the impression that they do.

Even if a machine behaves in an intelligent manner, for example by passing the Turing test, it doesn't follow that it is conscious or has a mind in the way that a human has.

A self-driving car is an example of an element of intelligence (driving a car). The Chinese Room argument suggests this isn't really intelligent thinking; it just looks like it.
The automated car doesn't see or understand its environment, and it doesn't know how to drive safely, in the way a human being does.

According to Searle this means that the intelligent behavior of the system is fundamentally different from actually being intelligent.

Strong vs. Weak AI: Being intelligent vs. acting intelligently. Strong AI is a mind that is genuinely intelligent and self-conscious. Weak AI is what we actually have, namely systems that exhibit intelligent behaviors despite being "mere" computers.

Just as the philosophy of science has little bearing on day-to-day scientific research, the philosophy of AI has little bearing on practical AI work.

AGI (Artificial General Intelligence): A machine that can handle any intellectual task, whereas narrow AI handles just one task.

AGI has been all but abandoned by AI researchers.

Chapter 2.

AI problem Solving

Search algorithms don't feel cool but they can be used to solve tasks that require intelligence like navigation and playing chess.

I. Search and problem solving

  • State Space: Set of possible situations (states).
  • Transitions: Possible Moves between one state and another.
    Note: Sequence of multiple transitions is a path.
  • Costs: Transitions differ in cost, and we prefer cheaper ones. Cost isn't always measured in money.
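
The three ingredients above (states, transitions, costs) can be sketched as a small cheapest-path search. This is only an illustration: the place names and costs in TRANSITIONS are made up, and the uniform-cost (Dijkstra-style) search is one common choice, not necessarily the method the course has in mind.

```python
import heapq

# Hypothetical state space: each key is a state, each entry a transition and its cost.
TRANSITIONS = {
    "home": {"station": 4, "park": 2},
    "park": {"station": 1, "office": 7},
    "station": {"office": 3},
    "office": {},
}

def cheapest_path(start, goal, transitions):
    """Return (total_cost, path) for the cheapest path, or None if the goal is unreachable."""
    frontier = [(0, start, [start])]  # priority queue ordered by cost so far
    visited = set()
    while frontier:
        cost, state, path = heapq.heappop(frontier)
        if state == goal:
            return cost, path
        if state in visited:
            continue  # already expanded via a cheaper path
        visited.add(state)
        for next_state, step_cost in transitions[state].items():
            if next_state not in visited:
                heapq.heappush(frontier, (cost + step_cost, next_state, path + [next_state]))
    return None

print(cheapest_path("home", "office", TRANSITIONS))
# (6, ['home', 'park', 'station', 'office'])
```

The direct-looking route home, station, office costs 7, so the search prefers the longer but cheaper path through the park.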

II. Solving problems with AI

  • "Anything that can be computed(=calculated using either numbers or symbols) can be automated."
    - Alan Turing's insight
  • Any intelligence can be broken down into small steps so that each step is so "mechanical" that it can be written down as a computer program.
    - John McCarthy - Father of AI.
    That statement is still a conjecture, which means we can't really prove it to be true.
  • McCarthy wanted to bypass Searle's Chinese Room: intelligence is intelligence even if the system that implements it is just a computer that mechanically follows a program.
  • Games provided a convenient restricted domain that could be formalized easily. That's why games and search became central in AI research.

III. Search and games
  • The different states of a game are represented by nodes in a game tree.
  • If the next level has only a single outcome, we can pull that outcome up a level. Sounds pretty logical.
  • Likewise, if all the outcomes at the deeper level are the same, we can give the current level that same outcome.
  • Be careful about whose turn it is, Min's or Max's. That determines the value assigned to a node: it's not a 50-50 average, because Min wants to minimize and Max wants to maximize.
  • One of the assumptions is that both players choose what is best for them, and that what is best for one is worst for the other (a so-called "zero-sum game").
  • The Minimax algorithm:
    A minimax algorithm[5] is a recursive algorithm for choosing the next move in an n-player game, usually a two-player game. A value is associated with each position or state of the game. This value is computed by means of a position evaluation function and it indicates how good it would be for a player to reach that position. The player then makes the move that maximizes the minimum value of the position resulting from the opponent's possible following moves. If it is A's turn to move, A gives a value to each of their legal moves.
  • If we can afford to explore only a small part of the game tree, we need a way to stop the minimax algorithm before reaching an end node. This is achieved with a heuristic (useful although not optimal) evaluation function.
  • The limitations of plain search:
    • The number of states in even moderately complex real-world problems grows out of hand, and we can't find a solution by exhaustive search ("brute force") or even by using clever heuristics.
    • Transitions are not always deterministic. There are factors outside our control that are often unknown to us.
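
The minimax idea from the notes above can be sketched in a few lines. This is a minimal version under assumptions of my own: the game tree is a hypothetical nested list, leaves are outcomes scored from Max's point of view (+1 Max wins, 0 draw, -1 Min wins), and the whole tree is small enough to explore without a heuristic cut-off.

```python
def minimax(node, maximizing):
    """Value of a game-tree node; leaves are numbers scored from Max's point of view."""
    if isinstance(node, (int, float)):
        return node  # end node: the outcome is known
    child_values = [minimax(child, not maximizing) for child in node]
    # Max picks the highest child value, Min the lowest.
    return max(child_values) if maximizing else min(child_values)

# Hypothetical tree: Max moves at the root, Min at the next level.
tree = [
    [+1, -1],      # Min picks -1
    [0, [+1, 0]],  # Max would pick +1 in the inner node, so Min picks 0
    [-1, -1],      # both outcomes are the same, so the value is -1
]

print(minimax(tree, maximizing=True))  # 0
```

With Max to move at the root, the best Max can guarantee is a draw (0), by choosing the second branch.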

Chapter 3.

Real world AI

The reason why modern AI methods actually work in the real world now is the ability to deal with uncertainty.

    I. Odds and probability
  • Noise: Inherent errors in sensor data.
  • Fuzzy logic was for a while a contender for the best approach to handling uncertain and imprecise information. However, probability turned out to be the best approach for reasoning under uncertainty.
    Currently almost all AI applications are based to some degree on probabilities.
  • Probability: Ability to think of uncertainty as a thing that can be quantified at least in principle.
  • It is usually not possible to draw conclusions about whether a particular number was right or wrong based on a single observation.
  • Uncertainty is not beyond the scope of rational thinking and discussion, and probability provides a systematic way of doing just that.
  • In gambling, odds are usually given from the bookmaker's point of view, so bookmaker's odds of 3:1 mean your chances of winning are 1:3 (one win for every three losses).
  • It has been found that people make more mistakes with percentages than natural frequencies or odds.
  • Odds 1:5 mean you'd have to play the game 1+5=6 times to get one win on average.
    The probability 20% means that you'd have to play the game five times to get one win on average.
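
The conversion between odds and probability in the last two notes can be checked with a short sketch. The function names are my own, and exact fractions are used only to avoid rounding noise.

```python
from fractions import Fraction

def odds_to_probability(wins, losses):
    """Odds wins:losses -> probability of a win."""
    return Fraction(wins, wins + losses)

def probability_to_odds(p):
    """Probability of a win -> odds as a (wins, losses) pair in lowest terms."""
    p = Fraction(p)
    return p.numerator, p.denominator - p.numerator

print(odds_to_probability(1, 5))                # 1/6
print(1 / odds_to_probability(1, 5))            # 6 games per win on average
print(probability_to_odds(Fraction(20, 100)))   # (1, 4)
print(1 / Fraction(20, 100))                    # 5 games per win on average
```

So odds of 1:5 correspond to a probability of 1/6, while a probability of 20% corresponds to odds of 1:4.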
    II. The Bayes rule
  • posterior odds = likelihood ratio x prior odds
  • Purpose of the formula is to update the odds when new information becomes available, to obtain the posterior("post") odds.
  • Likelihood Ratio: Probability of the observation if the event of interest occurs, divided by the probability of the observation if it does not.
  • Base Rate Fallacy: Our intuition is not well geared towards weighing different pieces of evidence. This is true especially when the pieces of evidence conflict with each other.
    Our brain tends to choose one of these pieces of evidence and ignore the other. It is typically the low base rate that is ignored.
    Knowing the Bayes rule is the best cure against it.
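
A small worked example of the odds form of the Bayes rule, showing how a low base rate keeps the posterior modest even after a positive observation. The numbers are illustrative assumptions (a condition affecting 5 in 100, a test that flags 80% of true cases and 10% of non-cases), not figures taken from these notes.

```python
from fractions import Fraction

def posterior_odds(prior_odds, p_obs_given_event, p_obs_given_no_event):
    """posterior odds = likelihood ratio x prior odds"""
    likelihood_ratio = Fraction(p_obs_given_event) / Fraction(p_obs_given_no_event)
    return likelihood_ratio * prior_odds

prior = Fraction(5, 95)  # assumed base rate: 5 with the condition for every 95 without
post = posterior_odds(prior, Fraction(80, 100), Fraction(10, 100))

print(post)               # 8/19, i.e. posterior odds of 8:19
print(post / (1 + post))  # 8/27, a probability of roughly 30%
```

Even though the test is right 80% of the time for true cases, the posterior probability is only about 30%, because the low base rate still dominates.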
    III. Naive Bayes classification
  • The Bayes classifier is a machine learning technique to classify objects into two or more classes.
    The classifier is trained by analyzing a set of training data, for which the correct classes are given.
    The naive Bayes classifier can be used to determine the probabilities of the classes given a number of different observations.
  • spam(or "junk email") vs. ham("legitimate message")
  • naive: Using spam as an example, the dependencies between words and the order of the words are ignored. That is, each word can be processed independently.
  • "All models are wrong, but some are useful"
    - George E.P. Box
  • One problem with estimating probabilities directly from counts is that zero counts give zero estimates, which can make a likelihood ratio zero or an undefined 0/0. Instead, use a small lower bound like 1/1000000.
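
The notes above can be combined into a toy naive Bayes spam filter. Everything specific here is an assumption for illustration: the six training messages are invented, the prior odds are taken as 1:1, and the helper names are my own.

```python
from collections import Counter

# Hypothetical toy training data; a real filter learns from thousands of messages.
SPAM = ["win million now", "win prize now", "million prize"]
HAM = ["meeting now", "lunch meeting today", "project report today"]

FLOOR = 1 / 1_000_000  # small lower bound so that no estimate is exactly zero

def word_probabilities(messages):
    """Return a function giving the estimated probability of a word in this class."""
    counts = Counter(word for message in messages for word in message.split())
    total = sum(counts.values())

    def probability(word):
        return max(counts[word] / total, FLOOR)  # unseen words get the floor

    return probability

p_word_spam = word_probabilities(SPAM)
p_word_ham = word_probabilities(HAM)

def spam_odds(message, prior_odds=1.0):
    """Posterior odds spam:ham, treating each word independently (the 'naive' part)."""
    odds = prior_odds
    for word in message.split():
        odds *= p_word_spam(word) / p_word_ham(word)  # one likelihood ratio per word
    return odds

print(spam_odds("win million"))    # far above 1: looks like spam
print(spam_odds("meeting today"))  # far below 1: looks like ham
print(spam_odds("now"))            # 2.0: "now" is twice as common in the spam set
```

Odds above 1 favor spam and odds below 1 favor ham. Without the lower bound, a word seen in only one class would produce a division by zero or force the odds to exactly zero.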

Chapter 4.

Machine Learning

Learning is a key element of intelligence.

    I. The types of machine learning
  • MNIST Dataset: Modified National Institute of Standards and Technology.
  • The roots of machine learning are in statistics: Extracting knowledge from data.
  • Three types of machine learning:
    1. Supervised learning: Predict the correct output based on input data.
    2. Unsupervised learning: There is no correct output. The task is to discover the structure of the data.
    3. Reinforcement learning: Feedback about good or bad choices is available with some delay.
  • Supervised Learning.
  • Caveat: Be careful when applying machine learning algorithms.
    Avoid big mistakes by splitting your data set into two parts: training data and test data.
  • A model might be very good on the training data, but that is no proof that it generalizes to other data. This is what the test data is for.
  • Overfitting: Trying to be too smart, or an ego problem where you fail to admit you might be wrong.
    It is like adding new rules to fit each new set of data: the fit looks fine on this iteration but gets worse on the next.
  • Machine learning methods are prone to overfitting because they can try a huge number of different "rules" until they find one that fits the training data perfectly. In particular, methods that are very flexible and can adapt to almost any pattern in the data can overfit unless the amount of data is enormous.
    This is also why neural networks can require massive amounts of data before they produce reliable predictions.
  • Learning to avoid overfitting and choose a model that is not too restricted, nor too flexible, is one of the most essential skills of a data scientist.
  • Unsupervised Learning
  • In unsupervised learning there is no correct answer for the model to fit.
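
The training/test caveat in the supervised learning notes above can be illustrated with a deliberately flexible model. This is a sketch under assumptions: the data set is invented (hours studied vs. passed, following "pass if hours >= 10" except for two noisy training examples at 5 and 16), and a one-nearest-neighbor rule stands in for "a method that can fit any pattern".

```python
# Hypothetical data: (hours studied, passed?).
TRAIN = [(1, 0), (4, 0), (5, 1), (8, 0), (12, 1), (15, 1), (16, 0), (19, 1)]
TEST = [(0, 0), (2, 0), (3, 0), (6, 0), (7, 0), (9, 0),
        (11, 1), (13, 1), (14, 1), (17, 1), (18, 1), (20, 1)]

def predict(x, training_data):
    """A very flexible model: copy the label of the closest training example."""
    nearest_x, nearest_label = min(training_data, key=lambda item: abs(item[0] - x))
    return nearest_label

def accuracy(data, training_data):
    """Fraction of examples in data that the model labels correctly."""
    correct = sum(predict(x, training_data) == label for x, label in data)
    return correct / len(data)

print("training accuracy:", accuracy(TRAIN, TRAIN))  # 1.0
print("test accuracy:", accuracy(TEST, TRAIN))       # 10 of 12 correct
```

The model reproduces the training data perfectly, noise included, but the memorized noise costs it two mistakes on the held-out test data. Only the test score reveals that.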
    II. The nearest neighbor classifier
    III. Regression

Chapter 5.

Neural Networks

    I. Neural Networks basics
    II. How neural networks are built
    III. Advanced neural network techniques

Chapter 6.

Implications

    I. About predicting the future
    II. The societal implications of AI
    III. Summary