This chapter introduces the background, development history, and typical applications of edge learning. It also outlines the main challenges that edge learning faces in terms of data, communication, and computation.
In this chapter, we first provide convergence results for the stochastic gradient descent (SGD) methods that are commonly used to solve machine learning problems. Then, we introduce advanced training algorithms, including momentum SGD, hyper-parameter-based algorithms, and optimization algorithms for deep learning models. Finally, we give theoretical frameworks for dealing with the stale gradients incurred by the asynchronous parallel (ASP) and stale synchronous parallel (SSP) modes.
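As a rough illustration of the momentum SGD update mentioned above, the following is a minimal sketch in plain Python, assuming the common heavy-ball form (velocity `v` accumulates gradients, then the parameter moves against `v`). The toy objective, the function names `momentum_sgd` and `stochastic_grad`, and all hyper-parameter values are assumptions made for this example, not taken from the chapter.

```python
import random

def momentum_sgd(grad_fn, w0, lr=0.1, beta=0.9, steps=200):
    """Heavy-ball momentum: v <- beta * v + g, then w <- w - lr * v."""
    w, v = w0, 0.0
    for _ in range(steps):
        g = grad_fn(w)       # (possibly stochastic) gradient at the current point
        v = beta * v + g     # exponentially weighted accumulation of gradients
        w = w - lr * v       # step against the accumulated direction
    return w

# Hypothetical toy problem: minimise the mean of 0.5 * (w - x_i)^2 over samples x_i.
# Its minimiser is the sample mean.
random.seed(0)
samples = [random.gauss(3.0, 1.0) for _ in range(1000)]

def stochastic_grad(w, batch_size=32):
    """Mini-batch gradient estimate of the toy objective."""
    batch = random.sample(samples, batch_size)
    return sum(w - x for x in batch) / batch_size

w_star = momentum_sgd(stochastic_grad, w0=0.0, lr=0.01, beta=0.9, steps=2000)
```

With a noisy gradient, `w_star` ends up close to the sample mean rather than exactly on it. The residual fluctuation depends on the step size and batch size.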
This chapter first focuses on model compression and hardware acceleration for edge learning. It covers many aspects, including learning algorithms, learning-oriented communication, distributed machine learning with hardware adaptation, TEE-based privacy protection, and joint algorithm-hardware optimization. The essential objective is to implement an integrated algorithm-hardware platform that optimizes the implementation of emerging machine learning algorithms, fully exploits the potential of modern computation hardware, and promotes novel intelligent applications for sophisticated services. Then, we introduce straggler-tolerance schemes that prevent faulty nodes from seriously degrading overall training performance and that make adequate use of the computation power of slow nodes. Finally, we introduce computation acceleration technologies for inference at the edge.
How to build a healthy ecosystem for the sustainable development of edge learning is a crucial issue. This chapter first introduces incentive mechanisms that motivate edge nodes to contribute to model training. Specifically, for the parameter server architecture, we introduce a deep reinforcement learning (DRL)-based incentive mechanism that determines the optimal pricing strategy for the parameter server and the optimal training strategies for edge nodes. Finally, we discuss future directions.
Big data and AI are enabling technologies for smart decision-making, automation, and resource optimization. Together, these technologies carry intelligent services from concept to practical application. It is widely recognized that intelligent services support the strategic development of emerging industries while enriching people’s lifestyles and making daily life more convenient and efficient. This chapter introduces the popular programming frameworks for Edge Learning. Then, we give some examples of emerging intelligent applications at the edge, e.g., smart health, self-driving, smart surveillance, and smart transportation.
Edge learning enables the training of large-scale machine learning models on big datasets by implementing data parallelism across multiple nodes. However, the iterative interaction among learning nodes, together with the considerable quantity of data communicated in each interaction, yields huge communication overhead, which greatly hinders the scalability of Edge Learning. In this chapter, we introduce the mainstream approaches to communication-efficient edge training, including compressing communication data, reducing the synchronization frequency, overlapping computation and communication, and optimizing the transmission network. Specifically, we propose two hybrid mechanisms for communication-efficient Edge Learning. The first is QOSP, which integrates gradient quantization for communication compression with overlap synchronization parallel for simultaneous computation and communication. The second improves communication efficiency during the aggregation of client-side updates by quantizing the gradients and exploiting the inherent superposition of radio frequency signals. Finally, we discuss the future directions of communication-efficient edge learning.
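To make the idea of gradient quantization concrete, here is a minimal sketch of one generic scheme: unbiased stochastic rounding onto a small number of uniform levels. It is not the QOSP mechanism itself, whose details are not given in this abstract. The function names `quantize` and `dequantize`, the max-abs scaling, and the choice of four levels are assumptions made for illustration only.

```python
import random

def quantize(grad, levels=4, rng=random):
    """Map each entry to a signed integer code in [-levels, levels].

    Entries are scaled by the largest absolute value and rounded up or down
    at random, so that the expected dequantized value equals the input.
    """
    scale = max(abs(g) for g in grad)
    if scale == 0.0:
        return scale, [0] * len(grad)
    codes = []
    for g in grad:
        scaled = abs(g) / scale * levels   # lies in [0, levels]
        lower = int(scaled)                # floor, since scaled >= 0
        # Round up with probability equal to the fractional part (unbiased).
        q = lower + (1 if rng.random() < scaled - lower else 0)
        codes.append(q if g >= 0 else -q)
    return scale, codes

def dequantize(scale, codes, levels=4):
    """Reconstruct an approximate gradient from the scale and integer codes."""
    return [scale * c / levels for c in codes]

random.seed(1)
grad = [0.8, -0.31, 0.05, -1.2, 0.0]
scale, codes = quantize(grad)
approx = dequantize(scale, codes)
```

Only `scale` and the small integer `codes` would need to be transmitted, which is where the saving comes from. The price is extra variance in the gradient estimate, bounded per entry by `scale / levels`.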
In a cloud-edge environment, data are generated by different types of devices, and these devices have various computation capabilities and storage sizes. It is unrealistic to execute all tasks in the cloud; instead, offloading some of the work to edge servers that are close to end-users is more reasonable. Edge Learning is a powerful paradigm for big data analytics in the cloud-edge environment. Edge Learning exploits pervasive data generated not only by user devices but also by other sensing devices, as well as data stored in cloud and edge servers (e.g., data from social networks). Moreover, Edge Learning leverages various computing entities (all devices with computing capabilities, ranging from the cloud and edge servers to various edge devices) in an efficient, reliable, and robust manner.
In this chapter, we first introduce the deep learning models that are widely used in Edge Learning. Then, we introduce the basic machine learning algorithms, architectures, and synchronization modes for Edge Learning.
Conventional distributed machine learning manages the training data in a centralized mode without considering privacy and security problems during training or inference. With the rapid development and wide deployment of artificial intelligence technology, privacy protection has gained more and more attention. Moreover, Edge Learning participants are usually small devices (e.g., smartphones, sensors) that have weak defenses and can be easily compromised by attacks. In this chapter, we first introduce security guarantee mechanisms in Edge Learning, including defense methods against data-oriented attacks and model-oriented attacks. Then, we summarize the mainstream methods of privacy protection, including differential privacy, secure multi-party computation, and homomorphic encryption. Finally, we discuss future directions in this field.
With the growth of model complexity and computational overhead, modern ML applications are usually handled by distributed systems, where the training procedure is conducted in parallel. In data parallelism, the dataset is partitioned across workers; in model parallelism, the model itself is partitioned across workers.
In this chapter, we present the details of these two schemes. Moreover, considering recent research that handles distributed training via multiple primitives, we also discuss extensions of training parallelism, i.e., learning frameworks and efficient synchronization mechanisms over the hierarchical architecture.
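The data-parallel scheme can be sketched in a few lines of plain Python. The example below assumes a one-parameter linear model, a round-robin split of the dataset, and synchronous aggregation weighted by shard size. The names `partition`, `local_gradient`, and `data_parallel_step` are hypothetical and chosen for this illustration.

```python
def local_gradient(w, shard):
    """Gradient of the mean of 0.5 * (w * x - y)^2 over one worker's shard."""
    return sum((w * x - y) * x for x, y in shard) / len(shard)

def partition(data, num_workers):
    """Round-robin split of the dataset into one shard per worker."""
    return [data[i::num_workers] for i in range(num_workers)]

def aggregate(w, shards):
    """Combine per-worker gradients, weighting each by its shard size."""
    total = sum(len(s) for s in shards)
    return sum(local_gradient(w, s) * len(s) for s in shards) / total

def data_parallel_step(w, shards, lr):
    """One synchronous step: every worker contributes before w is updated."""
    return w - lr * aggregate(w, shards)

# Hypothetical dataset generated by y = 2x, split across four workers.
data = [(float(x), 2.0 * x) for x in range(1, 9)]
shards = partition(data, num_workers=4)

w = 0.0
for _ in range(200):
    w = data_parallel_step(w, shards, lr=0.01)
```

Because the aggregation is weighted by shard size, the combined gradient equals the gradient computed on the whole dataset, so the parallel run follows the same trajectory as single-node gradient descent.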
In Edge Learning, training data are typically not independent and identically distributed (non-IID), so applying the same learning strategy to all workers is inefficient. In this chapter, we introduce federated learning, where the training data are always non-IID due to data isolation. Then, we summarize enabling technologies for efficient training with non-IID data. We also propose a reinforcement learning-based method that takes the non-IID property and resource constraints into consideration and adjusts the hyper-parameters to accelerate loss descent.
Data assimilation is a hugely important mathematical technique, relevant in fields as diverse as geophysics, data science, and neuroscience. This modern book provides an authoritative treatment of the field as it relates to several scientific disciplines, with a particular emphasis on recent developments from machine learning and its role in the optimisation of data assimilation. Underlying theory from statistical physics, such as path integrals and Monte Carlo methods, is developed in the text as a basis for data assimilation, and the author then explores examples from current multidisciplinary research such as the modelling of shallow water systems, ocean dynamics, and neuronal dynamics in the avian brain. The theory of data assimilation and machine learning is introduced in an accessible and unified manner, and the book is suitable for undergraduate and graduate students from science and engineering without specialized experience of statistical physics.
Discover this multi-disciplinary and insightful work, which integrates machine learning, edge computing, and big data. Presents the basics of training machine learning models, key challenges and issues, as well as comprehensive techniques including edge learning algorithms, and system design issues. Describes architectures, frameworks, and key technologies for learning performance, security, and privacy, as well as incentive issues in training/inference at the network edge. Intended to stimulate fruitful discussions, inspire further research ideas, and inform readers from both academia and industry backgrounds. Essential reading for experienced researchers and developers, or for those who are just entering the field.
In the past decade or so, (deep) neural networks have captured people’s imagination through their empirical success in learning problems involving real-world high-dimensional data such as images, speech, and text [LBH15]. Nevertheless, there is quite a bit of mystery as to how deep networks achieve such striking results. Modern deep networks are typically designed through trial and error.
In the previous theoretical Part I of the book, we showed that under fairly broad conditions on the number of measurements needed, many important classes of structured signals can be recovered via computationally tractable optimization problems, such as ℓ1 minimization for recovering sparse signals and nuclear norm minimization for recovering low-rank matrices.