I recently met with Dr. Nick Fuller, Vice President, Distributed Cloud, at IBM Research for a discussion about IBM’s long-range plans and strategy for artificial intelligence and machine learning at the edge.
Dr. Fuller is responsible for providing AI and platform–based innovation for enterprise digital transformation spanning edge computing and distributed cloud management. He is an IBM Master Inventor with over 75 patents and co-author of 75 technical publications. Dr. Fuller obtained his Bachelor of Science in Physics and Math from Morehouse College and his PhD in Applied Physics from Columbia University.
Edge In, not Cloud Out
In general, Dr. Fuller told me that IBM is focused on developing an “edge in” position versus a “cloud out” position with data, AI, and Kubernetes-based platform technologies to scale hub and spoke deployments of edge applications.
A hub plays the role of a central control plane used for orchestrating the deployment and management of edge applications in a number of connected spoke locations such as a factory floor or a retail branch, where data is generated or locally aggregated for processing.
“Cloud out” refers to the paradigm where cloud service providers are extending their cloud architecture out to edge locations. In contrast, “edge in” refers to a provider-agnostic architecture that is cloud-independent and treats the data-plane as a first-class citizen.
IBM’s overall architectural principle is scalability, repeatability, and full stack solution management that allows everything to be managed using a single unified control plane.
IBM’s Red Hat platform and infrastructure strategy anchors the application stack with a unified, scalable, and managed OpenShift-based control plane equipped with a high-performance storage appliance and self-healing system capabilities (inclusive of semi-autonomous operations).
IBM’s strategy also includes several in-progress platform-level technologies for scalable data, AI/ML runtimes, accelerator libraries for Day-2 AI operations, and scalability for the enterprise.
It is an important to mention that IBM is designing its edge platforms with labor cost and technical workforce in mind. Data scientists with PhDs are in high demand, making them difficult to find and expensive to hire once you find them. IBM is designing its edge system capabilities and processes so that domain experts rather than PhDs can deploy new AI models and manage Day-2 operations.
Why edge is important
Advances in computing and storage have made it possible for AI to process mountains of accumulated data to provide solutions. By bringing AI closer to the source of data, edge computing is faster and more efficient than cloud. While Cloud data accounts for 60% of the world’s data today, vast amounts of new data is being created at the edge, including industrial applications, traffic cameras, and order management systems, all of which can be processed at the edge in a fast and timely manner.
Public cloud and edge computing differ in capacity, technology, and management. An advantage of edge is that data is processed and analyzed at / near its collection point at the edge. In the case of cloud, data must be transferred from a local device and into the cloud for analytics and then transferred back to the edge again. Moving data through the network consumes capacity and adds latency to the process. It’s easy to see why executing a transaction at the edge reduces latency and eliminates unnecessary load on the network.
Increased privacy is another benefit of processing data at the edge. Analyzing data where it originates limits the risk of a security breach. Most of the communications between the edge and the cloud is then confined to such things as reporting, data summaries, and AI models, without ever exposing the raw data.
IBM at the Edge
In our discussion, Dr. Fuller provided a few examples to illustrate how IBM plans to provide new and seamless edge solutions for existing enterprise problems.
Example #1 – McDonald’s drive-thru
Dr. Fuller’s first example centered around Quick Service Restaurant’s (QSR) problem of drive-thru order taking. Last year, IBM acquired an automated order-taking system from McDonald’s. As part of the acquisition, IBM and McDonald’s established a partnership to perfect voice ordering methods using AI. Drive-thru orders are a significant percentage of total QSR orders for McDonald’s and other QSR chains.
McDonald’s and other QSR restaurants would like every order to be processed as quickly and accurately as possible. For that reason, McDonald’s conducted trials at ten Chicago restaurants using an edge-based AI ordering system with NLP (Natural Language Processing) to convert spoken orders into a digital format. It was found that AI had the potential to reduce ordering errors and processing time significantly. Since McDonald’s sells almost 7 million hamburgers daily, shaving a minute or two off each order represents a significant opportunity to address labor shortages and increase customer satisfaction.
Example #2 – Boston Dynamics and Spot the agile mobile robot
According to an earlier IBM survey, many manufacturers have already implemented AI-driven robotics with autonomous decision-making capability. The study also indicated that over 80 percent of companies believe AI can help improve future business operations. However, some companies expressed concern about the limited mobility of edge devices and sensors.
To develop a mobile edge solution, IBM teamed up with Boston Dynamics. The partnership created an agile mobile robot using IBM Research and IBM Sustainability Software AI technology. The device can analyze visual sensor readings in hazardous and challenging industrial environments such as manufacturing plants, warehouses, electrical grids, waste treatment plants and other hazardous environments. The value proposition that Boston Dynamics brought to the partnership was Spot the agile mobile robot, a walking, sensing, and actuation platform. Like all edge applications, the robot’s wireless mobility uses self-contained AI/ML that doesn’t require access to cloud data. It uses cameras to read analog devices, visually monitor fire extinguishers, and conduct a visual inspection of human workers to determine if required safety equipment is being worn.
IBM was able to show up to a 10X speedup by automating some manual tasks, such as converting the detection of a problem into an immediate work order in IBM Maximo to correct it. A fast automated response was not only more efficient, but it also improved the safety posture and risk management for these facilities. Similarly, some factories need to thermally monitor equipment to identify any unexpected hot spots that may show up over time, indicative of a potential failure.
IBM is working with National Grid, an energy company, to develop a mobile solution using Spot, the agile mobile robot, for image analysis of transformers and thermal connectors. As shown in the above graphic, Spot also monitored connectors on both flat surfaces and 3D surfaces. IBM was able to show that Spot could detect excessive heat build-up in small connectors, potentially avoiding unsafe conditions or costly outages. This AI/ML edge application can produce faster response times when an issue is detected, which is why IBM believes significant gains are possible by automating the entire process.
IBM market opportunities
Drive-thru orders and mobile robots are just a few examples of the millions of potential AI applications that exist at the edge and are driven by several billion connected devices.
Edge computing is an essential part of enterprise digital transformation. Enterprises seek ways to demonstrate the feasibility of solving business problems using AI/ML and analytics at the edge. However, once a proof of concept has been successfully demonstrated, it is a common problem for a company to struggle with scalability, data governance, and full-stack solution management.
Challenges with scaling
“Determining entry points for AI at the edge is not the difficult part,” Dr. Fuller said. “Scale is the real issue.”
Scaling edge models is complicated because there are so many edge locations with large amounts of diverse content and a high device density. Because large amounts of data are required for training, data gravity is a potential problem. Further, in many scenarios, vast amounts of data are generated quickly, leading to potential data storage and orchestration challenges. AI Models are also rarely “finished.” Monitoring and retraining of models are necessary to keep up with changes the environment.
Through IBM Research, IBM is addressing the many challenges of building an all-encompassing edge architecture and horizontally scalable data and AI technologies. IBM has a wealth of edge capabilities and an architecture to create the appropriate platform for each application.
IBM AI entry points at the edge
IBM sees Edge Computing as a $200 billion market by 2025. Dr. Fuller and his organization have identified four key market entry points for developing and expanding IBM’s edge compute strategy. In order of size, IBM believes its priority edge markets to be intelligent factories (Industry 4.0), telcos, retail automation, and connected vehicles.
IBM and its Red Hat portfolio already have an established presence in each market segment, particularly in intelligent operations and telco. Red Hat is also active in the connected vehicles space.
There have been three prior industrial revolutions, beginning in the 1700s up to our current in-progress fourth revolution, Industry 4.0, that promotes a digital transformation.
Manufacturing is the fastest growing and the largest of IBM’s four entry markets. In this segment, AI at the edge can improve quality control, production optimization, asset management, and supply chain logistics. IBM believes there are opportunities to achieve a 4x speed up in implementing edge-based AI solutions for manufacturing operations.
For its Industry 4.0 use case development, IBM, through product, development, research and consulting teams, is working with a major automotive OEM. The partnership has established the following joint objectives:
- Increase automation and scalability across dozens of plants using 100s of AI / ML models. This client has already seen value in applying AI/ML models for manufacturing applications. IBM Research is helping with re-training models and implementing new ones in an edge environment to help scale even more efficiently. Edge offers faster inference and low latency, allowing AI to be deployed in a wider variety of manufacturing operations requiring instant solutions.
- Dramatically reduce the time required to onboard new models. This will allow training and inference to be done faster and allow large models to be deployed much more quickly. The quicker an AI model can be deployed in production; the quicker the time-to-value and the return-on-investment (ROI).
- Accelerate deployment of new inspections by reducing the labeling effort and iterations needed to produce a production-ready model via data summarization. Selecting small data sets for annotation means manually examining thousands of images, this is a time-consuming process that will result in – labeling of redundant data. Using ML-based automation for data summarization will accelerate the process and produce better model performance.
- Enable Day-2 AI operations to help with data lifecycle automation and governance, model creation, reduce production errors, and provide detection of out-of-distribution data to help determine if a model’s inference is accurate. IBM believes this will allow models to be created faster without data scientists.
Maximo Application Suite
IBM’s Maximo Application Suite plays an important part in implementing large manufacturers’ current and future IBM edge solutions. Maximo is an integrated public or private cloud platform that uses AI, IoT, and analytics to optimize performance, extend asset lifecycles and reduce operational downtime and costs. IBM is working with several large manufacturing clients currently using Maximo to develop edge use cases, and even uses it within its own Manufacturing.
IBM has research underway to develop a more efficient method of handling life cycle management of large models that require immense amounts of data. Day 2 AI operations tasks can sometimes be more complex than initial model training, deployment, and scaling. Retraining at the edge is difficult because resources are typically limited.
Once a model is trained and deployed, it is important to monitor it for drift caused by changes in data distributions or anything that might cause a model to deviate from original requirements. Inaccuracies can adversely affect model ROI.
Day-2 AI Operations (retraining and scaling)
Day-2 AI operations consist of continual updates to AI models and applications to keep up with changes in data distributions, changes in the environment, a drop in model performance, availability of new data, and/or new regulations.
IBM recognizes the advantages of performing Day-2 AI Operations, which includes scaling and retraining at the edge. It appears that IBM is the only company with an architecture equipped to effectively handle Day-2 AI operations. That is a significant competitive advantage for IBM.
A company using an architecture that requires data to be moved from the edge back into the cloud for Day-2 related work will be unable to support many factory AI/ML applications because of the sheer number of AI/ML models to support (100s to 1000s).
“There is a huge proliferation of data at the edge that exists in multiple spokes,” Dr. Fuller said. “However, all that data isn’t needed to retrain a model. It is possible to cluster data into groups and then use sampling techniques to retrain the model. There is much value in federated learning from our point of view.”
Federated learning is a promising training solution being researched by IBM and others. It preserves privacy by using a collaboration of edge devices to train models without sharing the data with other entities. It is a good framework to use when resources are limited.
Dealing with limited resources at the edge is a challenge. IBM’s edge architecture accommodates the need to ensure resource budgets for AI applications are met, especially when deploying multiple applications and multiple models across edge locations. For that reason, IBM developed a method to deploy data and AI applications to scale Day-2 AI operations utilizing hub and spokes.
The graphic above shows the current status quo methods of performing Day-2 operations using centralized applications and a centralized data plane compared to the more efficient managed hub and spoke method with distributed applications and a distributed data plane. The hub allows it all to be managed from a single pane of glass.
Data Fabric Extensions to Hub and Spokes
IBM uses hub and spoke as a model to extend its data fabric. The model should not be thought of in the context of a traditional hub and spoke. IBM’s hub provides centralized capabilities to manage clusters and create multiples hubs that can be aggregated to a higher level. This architecture has four important data management capabilities.
- First, models running in unattended environments must be monitored. From an operational standpoint, detecting when a model’s effectiveness has significantly degraded and if corrective action is needed is critical.
- Secondly, in a hub and spoke model, data is being generated and collected in many locations creating a need for data life cycle management. Working with large enterprise clients, IBM is building unique capabilities to manage the data plane across the hub and spoke estate – optimized to meet data lifecycle, regulatory & compliance as well as local resource requirements. Automation determines which input data should be selected and labeled for retraining purposes and used to further improve the model. Identification is also made for atypical data that is judged worthy of human attention.
- The third issue relates to AI pipeline compression and adaptation. As mentioned earlier, edge resources are limited and highly heterogeneous. While a cloud-based model might have a few hundred million parameters or more, edge models can’t afford such resource extravagance because of resource limitations. To reduce the edge compute footprint, model compression can reduce the number of parameters. As an example, it could be reduced from several hundred million to a few million.
- Lastly, suppose a scenario exists where data is produced at multiple spokes but cannot leave those spokes for compliance reasons. In that case, IBM Federated Learning allows learning across heterogeneous data in multiple spokes. Users can discover, curate, categorize and share data assets, data sets, analytical models, and their relationships with other organization members.
In addition to AI deployments, the hub and spoke architecture and the previously mentioned capabilities can be employed more generally to tackle challenges faced by many enterprises in consistently managing an abundance of devices within and across their enterprise locations. Management of the software delivery lifecycle or addressing security vulnerabilities across a vast estate are a case in point.
Multicloud and Edge platform
In the context of its strategy, IBM sees edge and distributed cloud as an extension of its hybrid cloud platform built around Red Hat OpenShift. One of the newer and more useful options created by the Red Hat development team is the Single Node OpenShift (SNO), a compact version of OpenShift that fits on a single server. It is suitable for addressing locations that are still servers but come in a single node, not clustered, deployment type.
For smaller footprints such as industrial PCs or computer vision boards (for example NVidia Jetson Xavier), Red Hat is working on a project which builds an even smaller version of OpenShift, called MicroShift, that provides full application deployment and Kubernetes management capabilities. It is packaged so that it can be used for edge device type deployments.
Overall, IBM and Red Hat have developed a full complement of options to address a large spectrum of deployments across different edge locations and footprints, ranging from containers to management of full-blown Kubernetes applications from MicroShift to OpenShift and IBM Edge Application Manager.
Much is still in the research stage. IBM’s objective is to achieve greater consistency in terms of how locations and application lifecycle is managed.
First, Red Hat plans to introduce hierarchical layers of management with Red Hat Advanced Cluster Management (RHACM), to scale by two to three orders of magnitude the number of edge locations managed by this product. Additionally, securing edge locations is a major focus. Red Hat is continuously expanding platform security features, for example by recently including Integrity Measurement Architecture in Red Hat Enterprise Linux, or by adding Integrity Shield to protect policies in Red Hat Advanced Cluster Management (RHACM).
Red Hat is partnering with IBM Research to advance technologies that will permit it to protect platform integrity and the integrity of client workloads through the entire software supply chains. In addition, IBM Research is working with Red Hat on analytic capabilities to identify and remediate vulnerabilities and other security risks in code and configurations.
Telco network intelligence and slice management with AL/ML
Communication service providers (CSPs) such as telcos are key enablers of 5G at the edge. 5G benefits for these providers include:
- Reduced operating costs
- Improved efficiency
- Increased distribution and density
- Lower latency
The end-to-end 5G network comprises the Radio Access Network (RAN), transport, and core domains. Network slicing in 5G is an architecture that enables multiple virtual and independent end-to-end logical networks with different characteristics such as low latency or high bandwidth, to be supported on the same physical network. This is implemented using cloud-native technology enablers such as software defined networking (SDN), virtualization, and multi-access edge computing. Slicing offers necessary flexibility by allowing the creation of specific applications, unique services, and defined user groups or networks.
An important aspect of enabling AI at the edge requires IBM to provide CSPs with the capability to deploy and manage applications across various enterprise locations, possibly spanning multiple end-to-end network slices, using a single pane of glass.
5G network slicing and slice management
Network slices are an essential part of IBM’s edge infrastructure that must be automated, orchestrated and optimized according to 5G standards. IBM’s strategy is to leverage AI/ML to efficiently manage, scale, and optimize the slice quality of service, measured in terms of bandwidth, latency, or other metrics.
5G and AI/ML at the edge also represent a significant opportunity for CSPs to move beyond traditional cellular services and capture new sources of revenue with new services.
Communications service providers need management and control of 5G network slicing enabled with AI-powered automation.
Dr. Fuller sees a variety of opportunities in this area. “When it comes to applying AI and ML on the network, you can detect things like intrusion detection and malicious actors,” he said. “You can also determine the best way to route traffic to an end user. Automating 5G functions that run on the network using IBM network automation software also serves as an entry point.”
In IBM’s current telecom trial, IBM Research is spearheading the development of a range of capabilities targeted for the IBM Cloud Pak for Network Automation product using AI and automation to orchestrate, operate and optimize multivendor network functions and services that include:
- End-to-end 5G network slice management with planning & design, automation & orchestration, and operations & assurance
- Network Data and AI Function (NWDAF) that collects data for slice monitoring from 5G Core network functions, performs network analytics, and provides insights to authorized data consumers.
- Improved operational efficiency and reduced cost
Future leverage of these capabilities by existing IBM Clients that use the Cloud Pak for Network Automation (e.g., DISH) can offer further differentiation for CSPs.
5G radio access
Open radio access networks (O-RANs) are expected to significantly impact telco 5G wireless edge applications by allowing a greater variety of units to access the system. The O-RAN concept separates the DU (Distributed Units) and CU (Centralized Unit) from a Baseband Unit in 4G and connects them with open interfaces.
O-RAN system is more flexible. It uses AI to establish connections made via open interfaces that optimize the category of a device by analyzing information about its prior use. Like other edge models, the O-RAN architecture provides an opportunity for continuous monitoring, verification, analysis, and optimization of AI models.
The IBM-telco collaboration is expected to advance O-RAN interfaces and workflows. Areas currently under development are:
- Multi-modal (RF level + network-level) analytics (AI/ML) for wireless communication with high-speed ingest of 5G data
- Capability to learn patterns of metric and log data across CUs and DUs in RF analytics
- Utilization of the antenna control plane to optimize throughput
- Primitives for forecasting, anomaly detection and root cause analysis using ML
- Opportunity of value-added functions for O-RAN
IBM Cloud and Infrastructure
The cornerstone for the delivery of IBM’s edge solutions as a service is IBM Cloud Satellite. It presents a consistent cloud-ready, cloud-native operational view with OpenShift and IBM Cloud PaaS services at the edge. In addition, IBM integrated hardware and software Edge systems will provide RHACM – based management of the platform when clients or third parties have existing managed as a service models. It is essential to note that in either case this is done within a single control plane for hubs and spokes that helps optimize execution and management from any cloud to the edge in the hub and spoke model.
IBM’s focus on “edge in” means it can provide the infrastructure through things like the example shown above for software defined storage for federated namespace data lake that surrounds other hyperscaler clouds. Additionally, IBM is exploring integrated full stack edge storage appliances based on hyperconverged infrastructure (HCI), such as the Spectrum Fusion HCI, for enterprise edge deployments.
As mentioned earlier, data gravity is one of the main driving factors of edge deployments. IBM has designed its infrastructure to meet those data gravity requirements, not just for the existing hub and spoke topology but also for a future spoke-to-spoke topology where peer-to-peer data sharing becomes imperative (as illustrated with the wealth of examples provided in this article).
Edge is a distributed computing model. One of its main advantages is that computing, and data storage and processing is close to where data is created. Without the need to move data to the cloud for processing, real-time application of analytics and AI capabilities provides immediate solutions and drives business value.
IBM’s goal is not to move the entirety of its cloud infrastructure to the edge. That has little value and would simply function as a hub to spoke model operating on actions and configurations dictated by the hub.
IBM’s architecture will provide the edge with autonomy to determine where data should reside and from where the control plane should be exercised.
Equally important, IBM foresees this architecture evolving into a decentralized model capable of edge-to-edge interactions. IBM has no firm designs for this as yet. However, the plan is to make the edge infrastructure and platform a first-class citizen instead of relying on the cloud to drive what happens at the edge.
Developing a complete and comprehensive AI/ML edge architecture – and in fact, an entire ecosystem – is a massive undertaking. IBM faces many known and unknown challenges that must be solved before it can achieve success.
However, IBM is one of the few companies with the necessary partners and the technical and financial resources to undertake and successfully implement a project of this magnitude and complexity.
It is reassuring that IBM has a plan and that its plan is sound.
Paul Smith-Goodson is Vice President and Principal Analyst for quantum computing, artificial intelligence and space at Moor Insights and Strategy. You can follow him on Twitter for more current information on quantum, AI, and space.
Note: Moor Insights & Strategy writers and editors may have contributed to this article.
Moor Insights & Strategy, like all research and tech industry analyst firms, provides or has provided paid services to technology companies. These services include research, analysis, advising, consulting, benchmarking, acquisition matchmaking, and speaking sponsorships. The company has had or currently has paid business relationships with 8×8, Accenture, A10 Networks, Advanced Micro Devices, Amazon, Amazon Web Services, Ambient Scientific, Anuta Networks, Applied Brain Research, Applied Micro, Apstra, Arm, Aruba Networks (now HPE), Atom Computing, AT&T, Aura, Automation Anywhere, AWS, A-10 Strategies, Bitfusion, Blaize, Box, Broadcom, C3.AI, Calix, Campfire, Cisco Systems, Clear Software, Cloudera, Clumio, Cognitive Systems, CompuCom, Cradlepoint, CyberArk, Dell, Dell EMC, Dell Technologies, Diablo Technologies, Dialogue Group, Digital Optics, Dreamium Labs, D-Wave, Echelon, Ericsson, Extreme Networks, Five9, Flex, Foundries.io, Foxconn, Frame (now VMware), Fujitsu, Gen Z Consortium, Glue Networks, GlobalFoundries, Revolve (now Google), Google Cloud, Graphcore, Groq, Hiregenics, Hotwire Global, HP Inc., Hewlett Packard Enterprise, Honeywell, Huawei Technologies, IBM, Infinidat, Infosys, Inseego, IonQ, IonVR, Inseego, Infosys, Infiot, Intel, Interdigital, Jabil Circuit, Keysight, Konica Minolta, Lattice Semiconductor, Lenovo, Linux Foundation, Lightbits Labs, LogicMonitor, Luminar, MapBox, Marvell Technology, Mavenir, Marseille Inc, Mayfair Equity, Meraki (Cisco), Merck KGaA, Mesophere, Micron Technology, Microsoft, MiTEL, Mojo Networks, MongoDB, MulteFire Alliance, National Instruments, Neat, NetApp, Nightwatch, NOKIA (Alcatel-Lucent), Nortek, Novumind, NVIDIA, Nutanix, Nuvia (now Qualcomm), onsemi, ONUG, OpenStack Foundation, Oracle, Palo Alto Networks, Panasas, Peraso, Pexip, Pixelworks, Plume Design, PlusAI, Poly (formerly Plantronics), Portworx, Pure Storage, Qualcomm, Quantinuum, Rackspace, Rambus, Rayvolt E-Bikes, Red Hat, Renesas, Residio, Samsung Electronics, Samsung Semi, SAP, SAS, Scale Computing, Schneider Electric, SiFive, Silver Peak (now Aruba-HPE), SkyWorks, SONY Optical Storage, Splunk, Springpath (now Cisco), Spirent, Splunk, Sprint (now T-Mobile), Stratus Technologies, Symantec, Synaptics, Syniverse, Synopsys, Tanium, Telesign,TE Connectivity, TensTorrent, Tobii Technology, Teradata,T-Mobile, Treasure Data, Twitter, Unity Technologies, UiPath, Verizon Communications, VAST Data, Ventana Micro Systems, Vidyo, VMware, Wave Computing, Wellsmith, Xilinx, Zayo, Zebra, Zededa, Zendesk, Zoho, Zoom, and Zscaler. Moor Insights & Strategy founder, CEO, and Chief Analyst Patrick Moorhead is an investor in dMY Technology Group Inc. VI, Dreamium Labs, Groq, Luminar Technologies, MemryX, and Movandi.