AI Agents: Enabling Greater Automation And Autonomy For End-Users
How AI Agents Work And Support End-Users
The last five quarters have seen a rapid evolution of Generative AI. New approaches and techniques for getting the best results when using this technology have evolved quickly—from prompting to fine-tuning, retrieval-augmented generation, and knowledge graphs. Most of these concepts existed for some time before the current AI hype, and they build upon and improve each other. One concept has the potential to be a real game-changer and not just an incremental improvement: agents.
Reason enough to look at them in more detail over the coming weeks and posts…
Increase Automation And Autonomy With Agents
Until a few years ago, software developers had to explicitly describe a program's logic and what it should do. With the emergence of machine learning, applications have been able to detect patterns in data, learn from this information, and apply it to new data they process. However, AI-driven automation and autonomy were limited to narrowly defined tasks, such as recommending products that are frequently bought together or forecasting demand for your products. In these examples, a model is constrained to one task or predicting one kind of information.
Agents allow developers to take this idea of automation and autonomy several steps further. Agents promise to automate broadly defined tasks and goals, such as “Search for the cheapest flight from New York to Frankfurt on May 15,” instead of narrowly defined tasks (like above).
This means that although the objective a user wants to reach is clear, the exact steps to complete this objective are not explicitly defined. An agent can look up information from external data sources and complete the task. Generally speaking, agents process signals from their environment and interact with it. But how do they do that?
Agents typically comprise four components¹:
Core: The instructions and persona that govern the agent’s operation and behavior.
Example: [Prompt]: “You are a friendly and helpful customer service representative.”
Planning: Decompose a problem into several sub-questions that can be answered in parallel.
Example: Break the problem into smaller, manageable pieces. How would you execute the task?
Tools: Available resources, tools, and access to sites to obtain additional information.
Example: You have access to the following sites […].
Memory: Store information about the task, the steps to pursue it, interim results, and the history of user interactions over time.
Example: Store the history of the last conversations.
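The four components above can be sketched as a toy Python class. Everything here is illustrative, not a real agent framework: the naive "and"-splitting planner stands in for LLM-based planning, and the keyword-based tool routing stands in for LLM-based tool selection.

```python
from dataclasses import dataclass, field

@dataclass
class Agent:
    """Toy agent illustrating the four components: core, planning, tools, memory."""
    core: str                                    # Core: persona / system instructions
    tools: dict = field(default_factory=dict)    # Tools: callable resources by name
    memory: list = field(default_factory=list)   # Memory: task steps and interim results

    def plan(self, goal: str) -> list[str]:
        # Planning (toy): split a compound goal into sub-questions.
        return [part.strip() for part in goal.split(" and ")]

    def run(self, goal: str) -> list[str]:
        answers = []
        for subtask in self.plan(goal):
            # Toy routing: pick the first tool whose name appears in the subtask.
            tool = next((fn for name, fn in self.tools.items() if name in subtask), None)
            answer = tool(subtask) if tool else f"no tool for: {subtask}"
            self.memory.append((subtask, answer))  # Memory: store interim results
            answers.append(answer)
        return answers

agent = Agent(
    core="You are a friendly and helpful customer service representative.",
    tools={"flight": lambda q: "cheapest flight found", "weather": lambda q: "sunny"},
)
results = agent.run("find a flight to Frankfurt and check the weather there")
print(results)
```

In a real agent, the planner and the tool router would each be an LLM call, and the tools would wrap actual APIs; the structure, however, stays the same.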
Best-Suited Problems For Agents
Agents can perform relatively complex tasks under uncertainty. That means a user wants their application to perform a task but doesn’t (need to) provide step-by-step instructions on achieving that objective. What characterizes these tasks is that they require more than a single, straightforward step. The agent must understand and decompose the request (reason about what to do), split it into multiple subtasks (handled by so-called experts), and get and assemble the answers to complete the original objective.
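The flight-search example illustrates this split-and-assemble pattern: sub-questions run in parallel and the agent re-aggregates the answers. A minimal sketch, assuming three hypothetical price sources with made-up values standing in for real API calls:

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical price sources; values are invented for illustration only.
SOURCES = {"airline_a": 420, "airline_b": 395, "aggregator_c": 410}

def lookup(source: str) -> tuple[str, int]:
    # Stand-in for an external API request answering one sub-question.
    return source, SOURCES[source]

# Each sub-question ("What does source X charge?") is answered in parallel.
with ThreadPoolExecutor() as pool:
    quotes = list(pool.map(lookup, SOURCES))

# Re-aggregate the partial answers into the final result: the cheapest quote.
cheapest = min(quotes, key=lambda q: q[1])
print(cheapest)  # ('airline_b', 395)
```

The user only stated the objective ("find the cheapest flight"); which sources to query and how to combine their answers is decided by the agent, not spelled out step by step.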
The first wave of agents will be a big technological accomplishment as AI reasons over the best course of action. The scope will likely be limited, and lower-risk scenarios (e.g., information retrieval) will be prioritized over more complex and riskier ones (e.g., negotiation).
Based on the adoption of new technologies in the past, business users will likely be skeptical of agents unless they can track and trace what an agent is doing (and why). Depending upon the industry and scenario, it might even be impossible for compliance reasons to use agents.
Expected Challenges Where Agents Fall Short (For Now)
Although all of that sounds very promising, agents will hit roadblocks. The agent design uses LLMs to understand and reason, RAG to add external data, and APIs to communicate with external websites and services. Tasks within an ecosystem of systems (where APIs are available) will most likely be a breeze: security and authorization can be applied, and agents can be integrated into software natively. But just as we have seen with Robotic Process Automation (RPA), agents will likely hit a roadblock as soon as tasks span multiple systems and APIs are unavailable.
IT organizations will fall back on UI-level automation. But even if that’s the case, how can Generative AI help in this scenario?
Stay tuned. More thoughts on that to come…
Summary
Agents are the latest implementation of increasingly autonomous software. Instead of the end-user defining the individual steps to achieving an outcome, the agent breaks down the goal into subtasks that it then executes and re-aggregates to form the solution.
This new technological advancement promises to further increase the level of automation and autonomy of applications, including their scope and usefulness for the end-user.
Become an AI Leader
Join my bi-weekly live stream and podcast for leaders and hands-on practitioners. Each episode features a different guest who shares their AI journey and actionable insights. Learn from your peers how you can lead artificial intelligence, generative AI & automation in business with confidence.
Join us live
April 16 - T. Scott Clendaniel, VP & AI Instructor at Analytics-Edge, will share how you can use Generative AI to improve the user experience.
April 30 - Elizabeth Adams, Leader, Responsible AI, will share findings from her research on increasing employee engagement for responsible AI.
May 14 - Randy Bean, Founder of Data & AI Leadership Exchange, will join us to discuss how you can move beyond quick-win use cases for Generative AI.
May 28 - Philippe Rambach, Chief AI Officer at Schneider Electric, will discuss how AI leadership can drive sustainability and energy efficiency in manufacturing.
Watch the latest episodes or listen to the podcast
Follow me on LinkedIn for daily posts about how you can lead AI in business with confidence. Activate notifications (🔔) and never miss an update.
Together, let’s turn HYPE into OUTCOME. 👍🏻
—Andreas
¹ NVIDIA. Introduction to LLM Agents. Published: 30 November 2023. Last accessed: 09 April 2024. https://developer.nvidia.com/blog/introduction-to-llm-agents/