AI Agents: Making RPA The Cool Kid On The Block Again?
How RPA’s Comeback Will Again Propel The Automation Of IT Landscapes — Powered By Multi-Modal AI Models
2024 has been off to a great start for Generative AI. From new hardware to new models, new opportunities, and new challenges, the first quarter has had a lot to offer; the last five quarters have been a remarkable journey between hype and hope.
After all, AI was headed towards its next winter in the fall of 2022. Back then, it seemed like Robotic Process Automation (RPA) would be the new winner in the enterprise technology stack. Many IT departments and industry observers saw the benefit of automating tasks across siloed systems and brittle integrations with a technology that doesn’t rely on clean data and accurate predictions. Instead of entering the next AI ice age, the industry hit the turbo and went into overdrive, and Generative AI has become the cool kid on the block. But how long will it last?
Enterprise landscapes and IT organizations have not radically changed their approach to adopting innovation within the last 18 months. They are still a fabric of historically grown systems, tools, and processes that are fragmented and challenging to adapt quickly. That is one of the reasons why the media already sees the first signs of the AI hype deflating [1]. But recent advancements in AI offer a glimpse of what’s to come first: agents.
Between Bots and Agents for Process Automation
Agents are software components that can perform relatively complex tasks under uncertainty. That means a user wants their application to perform a task but doesn’t (need to) provide step-by-step instructions for achieving that objective. What characterizes these tasks is that they require more than a single straightforward step. The agent must understand and dissect the request (reason about what to do), split it into multiple subtasks and hand them to specialized components (so-called experts), and then collect and assemble the answers to complete the original objective. Whether a user directly instructs an agent or the application uses agents behind the scenes is secondary.
RPA promised to automate tasks seamlessly using the APIs that applications expose. The challenge is that not every function available via the UI is also available via an API. That’s why IT organizations have had to resort to bots that automate tasks via the UI.
While still emerging, agents will likely hit similar roadblocks when tasks span multiple systems from multiple vendors and APIs are unavailable or limited in their functional scope. IT organizations will likely fall back on UI-level automation yet again. But even if that’s the case, how can Generative AI help?
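To make the reason, split, and assemble loop more tangible, here is a minimal Python sketch. Everything in it is an assumption for illustration: the planner is hard-coded rather than backed by a language model, and the names `plan`, `EXPERTS`, and `run_agent` are hypothetical, not part of any product or framework.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Subtask:
    expert: str   # name of the expert that should handle this subtask
    payload: str  # what the expert is asked to do


def plan(request: str) -> List[Subtask]:
    """Hypothetical planner. A real agent would ask a language model to reason
    about the request; here the plan is hard-coded for illustration."""
    return [
        Subtask("lookup", f"find records for: {request}"),
        Subtask("summarize", f"summarize findings for: {request}"),
    ]


# Hypothetical experts: plain functions standing in for models or tools.
EXPERTS: Dict[str, Callable[[str], str]] = {
    "lookup": lambda payload: f"[lookup done] {payload}",
    "summarize": lambda payload: f"[summary done] {payload}",
}


def run_agent(request: str) -> str:
    """Plan the request, dispatch each subtask, and assemble the answers."""
    answers = []
    for subtask in plan(request):
        expert = EXPERTS[subtask.expert]
        answers.append(expert(subtask.payload))
    return "\n".join(answers)


if __name__ == "__main__":
    print(run_agent("overdue invoices for customer 4711"))
```

The user only states the objective. Which experts run, and in what order, is decided inside the agent.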
The Symbiotic Relationship Between Bots and Agents in the Future
It will be interesting to follow the next evolution of process automation. There are two ways this could play out.
Extending RPA with Generative AI
RPA continues to combine API- and UI-level automation. It’s the status quo. In the future, developers can incorporate AI and agents at key decision points to further reduce the need for manual decision-making.
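One way to picture an AI-assisted decision point is a step that lets a model decide when it is confident and hands the case to a person otherwise. The Python sketch below is an assumption-laden illustration: the confidence threshold, the `Decision` type, and `stub_model` are all hypothetical and stand in for whatever model and business rules an organization would actually use.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Decision:
    action: str        # e.g. "approve" or "reject"
    confidence: float  # 0.0 to 1.0, as reported by the (hypothetical) model


def decide_or_escalate(
    case: dict,
    model: Callable[[dict], Decision],
    threshold: float = 0.8,  # assumed cut-off, to be tuned per process
) -> str:
    """Let the model decide when it is confident; otherwise hand off to a person."""
    decision = model(case)
    if decision.confidence >= threshold:
        return decision.action
    return "escalate_to_human"


def stub_model(case: dict) -> Decision:
    """Stand-in for an AI call: small invoices are easy, large ones are uncertain."""
    if case["amount"] < 1000:
        return Decision("approve", 0.95)
    return Decision("approve", 0.55)


if __name__ == "__main__":
    print(decide_or_escalate({"amount": 200}, stub_model))
    print(decide_or_escalate({"amount": 5000}, stub_model))
```

The rest of the bot’s workflow stays as it is. Only the step that used to wait for a manual decision changes.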
Thanks to Large Language Models under the hood, document processing capabilities can already process many more layouts and languages with less technical effort than just two years ago. Agents will allow developers to extend automation to situations where the path to completing a task or process is less structured.
The challenge with using RPA is rarely just setting up bots for UI-level automation. It’s dealing with the ripple effects when applications are updated, workflows are changed, and systems are migrated — and the UI looks different or includes different fields, for example.
That’s where multimodal Generative AI models come into play. Beyond text-based input and output, multimodal models can analyze an image and create further instructions. For example, they could process a screenshot of the application that should be automated, determine which input needs to be entered into what field, and improve overall automation and bot maintenance without explicitly programming the bot.
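As a rough sketch of that idea in Python, the bot below matches input values to fields by their visible label instead of a hard-coded position, so a moved field does not break the automation. The function `stub_vision_model` is a placeholder for a multimodal model call. Its return format and the coordinates are invented for this example and do not reflect any specific vendor’s API.

```python
from typing import Callable, Dict, List

# A detected field: the visible label and where it sits on the screenshot.
Field = Dict[str, object]


def stub_vision_model(screenshot: bytes) -> List[Field]:
    """Stand-in for a multimodal model call. A real one would receive the
    image and return the fields it sees; the values here are made up."""
    return [
        {"label": "Customer name", "x": 120, "y": 80},
        {"label": "Invoice no.", "x": 120, "y": 140},
    ]


def map_inputs_to_fields(
    screenshot: bytes,
    inputs: Dict[str, str],
    detect: Callable[[bytes], List[Field]] = stub_vision_model,
) -> List[dict]:
    """Match each input value to a field by label instead of a fixed position."""
    by_label = {str(f["label"]).lower(): f for f in detect(screenshot)}
    steps = []
    for label, value in inputs.items():
        field = by_label.get(label.lower())
        if field is None:
            raise LookupError(f"No field labelled {label!r} on this screen")
        steps.append({"x": field["x"], "y": field["y"], "text": value})
    return steps


if __name__ == "__main__":
    print(map_inputs_to_fields(b"", {"Invoice no.": "INV-1"}))
```

If an update moves the invoice field, the next screenshot yields new coordinates and the mapping follows without anyone editing the bot.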
Enabling AI Agents to Interact with UIs
Agents that execute tasks across multiple systems via APIs will eventually hit decision points in a workflow, too. To unfold their full potential for dealing with uncertain situations, agents must also interact with UIs; after all, not every application has an API.
Agents can help by working through the following steps:
1. Take a screenshot of the application
2. Based on previously seen applications and UIs, identify relevant UI fields
3. Understand what data needs to be entered
4. Generate code for the RPA bot using the identified fields
5. Enter data (using a bot)
6. Complete workflow
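The steps above can be sketched end to end in Python. This is a simplified illustration under stated assumptions: `StubBot` only records actions instead of driving a real UI, `take_screenshot` and `identify_fields` return canned values where a real setup would capture the screen and call a multimodal model, and the generated "code" is reduced to a list of simple instructions.

```python
from typing import Dict, List, Tuple


class StubBot:
    """Hypothetical RPA bot that only records what it was asked to do."""

    def __init__(self) -> None:
        self.log: List[str] = []

    def type_into(self, field_id: str, text: str) -> None:
        self.log.append(f"type {field_id}={text}")

    def click(self, field_id: str) -> None:
        self.log.append(f"click {field_id}")


def take_screenshot() -> bytes:
    # Step 1: placeholder for a real screen capture.
    return b"fake-image-bytes"


def identify_fields(screenshot: bytes) -> List[str]:
    # Step 2: a multimodal model would infer these from the image.
    return ["vendor", "amount", "submit"]


def decide_values(fields: List[str], record: Dict[str, str]) -> Dict[str, str]:
    # Step 3: keep only the data that has a matching field on screen.
    return {f: record[f] for f in fields if f in record}


def generate_bot_steps(values: Dict[str, str]) -> List[Tuple[str, str, str]]:
    # Step 4: instructions for the bot as (action, field, text) tuples.
    steps = [("type", field, text) for field, text in values.items()]
    steps.append(("click", "submit", ""))
    return steps


def run_workflow(record: Dict[str, str], bot: StubBot) -> List[str]:
    fields = identify_fields(take_screenshot())
    steps = generate_bot_steps(decide_values(fields, record))
    for action, field, text in steps:  # Steps 5 and 6: enter data, then submit.
        if action == "type":
            bot.type_into(field, text)
        else:
            bot.click(field)
    return bot.log


if __name__ == "__main__":
    print(run_workflow({"vendor": "ACME", "amount": "100"}, StubBot()))
```

Keeping the agent’s part (steps 1 to 4) separate from the bot’s part (steps 5 and 6) mirrors the division of labor described here: the agent decides what to do, and the bot carries it out.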
While the industry is currently focusing on Generative AI and building agents, enhancing process automation with these new capabilities will be the next frontier as Generative AI matures and adoption increases. The key to further increasing automation in the enterprise lies in combining capabilities that fill current gaps.
Summary
Generative AI and its foundation models have fueled businesses’ interest in driving automation to new heights. RPA was once the dominant technology for automating processes and workflows at the UI layer. Recently, it has been overshadowed by Generative AI. As new AI capabilities, such as agents, enter the technology landscape, familiar challenges in driving end-to-end process automation will likely resurface. RPA can re-emerge as an improved set of capabilities that extends toward agents and multimodal models and bridges the gaps left by current limitations.
What do you think? Will RPA be the ‘cool kid on the block’ again?
Together, let’s turn HYPE into OUTCOME. 👍🏻
—Andreas
[1] The Washington Post. The AI hype bubble is deflating. Now comes the hard part. Published: 18 April 2024. Last accessed: 20 April 2024. https://www.washingtonpost.com/technology/2024/04/18/ai-bubble-hype-dying-money/