Building an AI Agent from Scratch

Ima Miri
4 min readOct 7, 2024

--

In this blog, we’ll dive into AI agents — smart assistants that understand and act based on conversations, gaining popularity in 2024 due to models like GPT-4.

The video tutorial of this article is available here.

What is an AI Agent?

An AI agent is like a knowledgeable friend that listens to your questions and takes action based on them. With advanced models, these agents offer natural, helpful responses, revolutionizing industries by handling repetitive tasks, increasing efficiency, and seamlessly integrating with modern platforms.

Why AI Agents?

In 2024, AI agents are more accessible and customizable. Businesses can now easily build solutions that meet specific needs, such as automating customer service, managing lead generation, or providing instant information. These capabilities make AI agents indispensable for improving business processes and engagement.

Building an AI Agent

In this tutorial, we built an AI agent using Python, the OpenAI API, and a structured loop of Thought, Action, Pause, Response. Here’s the simplified flow:

  • Thought: The agent processes the query.
  • Action: It executes an action, like finding a city’s population.
  • PAUSE: The action is completed.
  • Response: The agent provides the final result.

We also used a simple Python script, which helps break tasks into manageable steps, integrating external data sources.

The Code

Here’s an overview of the code we used:

  • We initialized the OpenAI API and created an agent class.
prompt = """
You will follow a loop of Thought, Action, PAUSE and Response. The goal is to get an answer at the end.
Thought: Think about how to answer the question.
Action: Use one of the available actions to get information - then return PAUSE.
Response: After the action is complete, respond with the result.

Available actions:
lookup_capital: Finds the capital city of a country. Example: lookup_capital: France, Result: Returns the capital city (e.g., "Paris").
find_population: Looks up the population of a country or city. Example: find_population: Tokyo, Result: Returns the population (e.g., "14 million").

Example conversation:

Question: What is the population of capital city of France?

Thought: I should find the capital city of France.
Action: lookup_capital: Paris
PAUSE

Then you'll find the population of the capital city:

Response: The population of Paris is approximately 2.1 million people.

Finally, respond:

Answer: The population of Paris is approximately 2.1 million people.
""".strip()

2. The agent uses predefined functions like lookup_capital (finding a city) and find_population (returning population data).

def lookup_capital(country):
capitals = {
"France": "Paris",
"Germany": "Berlin",
"Japan": "Tokyo",
"USA": "Washington, D.C.",
"Canada": "Ottawa",
}
return capitals.get(country, "Capital not found for {}".format(country))

def find_population(city):
populations = {
"Tokyo": "14 million",
"New York": "8.3 million",
"London": "9 million",
"Paris": "2.1 million",
"Sydney": "5.3 million",
}
return populations.get(city, "Population data not found for {}".format(city))

known_actions = {
"find_population": find_population,
"lookup_capital": lookup_capital
}

action_re = re.compile('^Action: (\w+): (.*)$')

3. The conversation flows in a loop until the agent reaches a conclusion or stops when no further questions are posed.

Example:

You ask, “What’s the population of Tokyo?” The agent first identifies the capital city and then returns, “The population of Tokyo is approximately 14 million.” It continues until the user exits.

Sample output

Welcome to the Custom AI agent! Type ‘exit’ to end the conversation.

You: What’s the population of the capital city of Japan?

AI: Thought: I need to find the capital city of Japan first.

Action: lookup_capital: Japan

PAUSE AI: Response: The capital city of Japan is Tokyo.

Thought: Now I need to find the population of Tokyo.

Action: find_population: Tokyo

PAUSE

AI: Response: The population of Tokyo is approximately 14 million people.

Answer: The population of Tokyo is approximately 14 million people.

If you have any more questions or need further assistance, feel free to ask.

You: What’s the population of capital city of France?

AI: Thought: I need to find the capital city of France first.

Action: lookup_capital: France

PAUSE

AI: Response: The capital city of France is Paris.

Thought: Now I need to find the population of Paris.

Action: find_population: Paris

PAUSE

AI: Response: The population of Paris is approximately 2.1 million people.

Answer: The population of Paris is approximately 2.1 million people.

If you have any more questions or need further assistance, feel free to ask.

You: exit

Goodbye!

Benefits and Future Potential

As AI agents evolve, they will become even more advanced and useful in various sectors, automating complex tasks, improving customer service, and driving business innovation. By leveraging platforms like Python and OpenAI, creating your AI agent is now more accessible than ever.

--

--