Optimized Model Transmission
20230148106 · 2023-05-11
Abstract
The invention relates to a method for operating a policy control entity in a cellular network, the method comprising: determining a quality of service parameter for a data packet session in which one trained model from a plurality of trained models is downloaded to a mobile entity, determining at least one capacity parameter of the mobile entity, determining a network transmission parameter of the cellular network, determining said one trained model from the plurality of different trained models based on a dataset which maps different capacity parameters and transmission capabilities to the plurality of trained models, on the determined capacity parameter, and on the network transmission parameter, determining routing information indicating where said one trained model is accessible for a transmission to the mobile entity, transmitting the routing information to a session management entity configured to manage the data packet session in the cellular network.
Claims
1-24. (canceled)
25. A method for operating a policy control entity in a cellular network, the method comprising: determining a quality of service parameter for a data packet session in which one trained model from a plurality of trained models is downloaded to a mobile entity, wherein the plurality of different trained models differ from one another by a data size and a number of features used by the corresponding model to carry out a certain task; determining at least one capacity parameter of the mobile entity describing a processing capacity of the mobile entity; determining a network transmission parameter of the cellular network describing transmission capabilities of the cellular network for transmitting the one trained model; determining said one trained model from the plurality of different trained models based on a dataset which maps different capacity parameters and transmission capabilities to the plurality of trained models, on the determined capacity parameter, and on the network transmission parameter; determining routing information indicating where said one trained model is accessible for a transmission to the mobile entity; and transmitting the routing information to a session management entity configured to manage the data packet session in the cellular network.
26. The method according to claim 25, wherein determining the quality of service parameter and capacity parameter comprises: transmitting a first request to the session management entity requesting the quality of service parameter and the capacity parameter from the session management entity; and receiving a response to the first request from the session management entity, the response comprising the quality of service parameter and the capacity parameter.
27. The method according to claim 25, wherein determining the routing information comprises determining address information at an application entity at which said one trained model can be accessed for a download to the mobile entity.
28. The method according to claim 25, wherein: the plurality of trained models differ from one another by the number of features and by an amount of compression of the features with which the corresponding features of the plurality of trained models are transmitted through the cellular network; and the dataset indicates the compression parameter in dependence on the network transmission parameter and the capacity parameter.
29. The method according to claim 28, wherein: different importance levels in at least some of the plurality of trained models are associated with different features; and the dataset indicates that features with a higher importance level are to be transmitted to the mobile entity with a lower compression compared to the features with a lower importance level.
30. The method according to claim 28, wherein: in each of the plurality of trained models, the features are weighted with a corresponding weighting factor; and the dataset indicates the compression for the weighting factors in dependence on the capacity parameter and/or the transmission capabilities.
31. The method according to claim 25, wherein in a bootstrapping phase, before said one trained model is determined, the defined data set is received and stored by the policy control entity such that the defined data set is accessible to the policy control entity.
32. The method according to claim 31, wherein: the defined data set is received from the application entity providing the plurality of trained models; the routing information is additionally received from the application entity; and each of the plurality of trained models is accessible at the application entity.
33. A method for operating a user plane entity configured to handle a data packet session in a cellular network in which one trained model from a plurality of trained models is downloaded to a mobile entity, wherein the plurality of different trained models differ from one another by a data size and a number of features used by the corresponding model to carry out a certain task, the method comprising: receiving a handling request to handle the data packet session for transmitting said one trained model to the mobile entity, the request comprising routing information where said one trained model is accessible at an application entity for a transmission to the mobile entity; receiving a download request from the mobile entity requesting transmission of said one trained model to the mobile entity; and transmitting a second request to the application entity requesting transmission of said one trained model based on the received routing information.
34. The method according to claim 33, wherein the handling request is received from a session management entity and includes as routing information a network address and a directory where said one trained model is accessible at the application entity.
35. A policy control entity comprising a memory and at least one processing circuit, the memory containing instructions executable by said at least one processing circuit, wherein the policy control entity is configured to: determine a quality of service parameter for a data packet session in which one trained model from a plurality of trained models is downloaded to a mobile entity, wherein the plurality of different trained models differ from one another by a data size and a number of features used by the corresponding model to carry out a certain task; determine at least one capacity parameter of the mobile entity describing a processing capacity of the mobile entity; determine a network transmission parameter of the cellular network describing transmission capabilities of the cellular network for transmitting the one trained model; determine said one trained model from the plurality of different trained models based on a dataset which maps different capacity parameters and transmission capabilities to the plurality of trained models, on the determined capacity parameter, and on the network transmission parameter; determine routing information indicating where said one trained model is accessible for a transmission to the mobile entity; and transmit the routing information to a session management entity configured to manage the data packet session in the cellular network.
36. The policy control entity according to claim 35, wherein to determine the quality of service parameter and capacity parameter, the policy control entity is further operative to: transmit a first request to the session management entity requesting the quality of service parameter and the capacity parameter from the session management entity; and receive a response to the first request from the session management entity, the response comprising the quality of service parameter and the capacity parameter.
37. The policy control entity according to claim 35, wherein to determine the routing information, the policy control entity is operative to determine an address information at an application entity at which said one trained model can be accessed for a download to the mobile entity.
38. The policy control entity according to claim 35, wherein: the plurality of trained models differ from one another by the number of features and by an amount of compression of the features with which the corresponding features of the plurality of trained models are transmitted through the cellular network; and the dataset indicates the compression parameter in dependence on the network transmission parameter and the capacity parameter.
39. The policy control entity according to claim 38, wherein: different importance levels in at least some of the plurality of trained models are given to the different features; and the dataset indicates that features with a higher importance level are to be transmitted to the mobile entity with a lower compression compared to the features with a lower importance level.
40. The policy control entity according to claim 38, wherein: in each of the trained models, the features are weighted with a corresponding weighting factor; and the dataset indicates the compression for the weighting factors in dependence on the capacity parameter and/or the transmission capabilities.
41. The policy control entity according to claim 35, wherein in a bootstrapping phase, before said one trained model is determined, the policy control entity is operative to receive the defined data set and to store the defined data set such that it is accessible to the policy control entity.
42. The policy control entity according to claim 41, further configured to: receive the defined data set from the application entity providing the plurality of trained models; and receive the routing information from the application entity, where each of the plurality of trained models is accessible at the application entity.
43. A user plane entity configured to handle data packet sessions in a cellular network in which one trained model from a plurality of trained models is downloaded to a mobile entity, wherein the plurality of different trained models differ from one another by a data size and a number of features used by the corresponding model to carry out a certain task, the user plane entity comprising a memory and at least one processing circuit, the memory comprising instructions executable by said at least one processing circuit, wherein the user plane entity is configured to: receive a handling request to handle the data packet session for transmitting said one trained model to the mobile entity, the handling request comprising routing information where said one trained model is accessible at an application entity for a transmission to the mobile entity; receive a download request from the mobile entity requesting transmission of said one trained model to the mobile entity; and transmit a second request to the application entity requesting transmission of said one trained model based on the received routing information.
44. The user plane entity according to claim 43, wherein the user plane entity is further configured to receive the handling request from a session management entity which includes as routing information a network address and a directory where said one trained model is accessible at the application entity.
Description
BRIEF DESCRIPTION OF THE DRAWINGS
[0039] The foregoing and additional features and effects of the application will become apparent from the following detailed description when read in conjunction with the accompanying drawings in which like reference numerals refer to like elements.
DETAILED DESCRIPTION OF EMBODIMENTS
[0052] In the following, embodiments of the invention will be described in detail with reference to the accompanying drawings. It is to be understood that the following description of embodiments is not to be taken in a limiting sense. The scope of the invention is not intended to be limited by the embodiments described hereinafter or by the drawings, which are to be illustrative only.
[0053] The drawings are to be regarded as being schematic representations, and elements illustrated in the drawings are not necessarily shown to scale. Rather, the various elements are represented such that their function and general purpose become apparent to a person skilled in the art. Any connection or coupling between functional blocks, devices, components of physical or functional units shown in the drawings and described hereinafter may also be implemented by an indirect connection or coupling. A coupling between components may be established over a wired or wireless connection. Functional blocks may be implemented in hardware, software, firmware, or a combination thereof.
[0054] Within the context of the present application, the term “mobile entity” or “user equipment/UE” refers to a device used, for instance, by a user for his or her personal communication. It can be a telephone type of device, cellular telephone, mobile station, cordless phone or personal digital assistant type of device like a laptop, notebook, notepad or tablet equipped with a wireless data connection. It can also be an embedded device like a microcontroller inside a vehicle etc. The UE may also be associated with non-humans like animals, plants or machines. The UE may be equipped with a subscriber identity module, SIM, comprising unique identities associated with the user using the UE. The presence of the SIM within a UE customizes the UE uniquely with a subscription of the user.
[0055] As will be explained below, the transfer of the model and of the model weights is improved by progressively transferring the feature data of a model and the model weights at different resolutions, which are adapted according to the network channel capacity and the capacity of the mobile entity receiving the model. By way of example, integers can be transferred without conversion or compression by transferring selected bits directly.
[0056] An operating scenario for the present application is as follows:
[0057] A trained model should be downloaded to a mobile entity, UE, such as UE60 shown in
[0058] As will be explained below, the network decides which of the trained models is selected for the download to the UE. The network has the authority to dynamically decide the resolution of the model, the features used, and the resolution of the weights used in the model, depending on the channel capacity and the capacity of the mobile entity at that time.
[0059] The application discussed below comprises three main parts:
[0060] In the first part, a data set is created which describes the importance level and the transfer level mappings. This data set is also called IT map hereinafter. The data set is a predefined data set which describes, based on the importance level of a feature of a model, on the network status and on the UE capacity, how the data of the model is transmitted to the UE, especially whether it is transmitted without compression or with which level of compression. The data set is determined in advance and stored in the network, for example at the policy control entity, in the form of rules or policies. The data set can be dynamically adapted or may be fixed.
[0061] This data set is then shared in a bootstrapping phase. A communication protocol between the application servers providing the different trained models and the cellular network is used, and the data set is then stored in the policy control function, PCF, or policy control entity in the form of rules or policies.
[0062] At runtime, PDU session information and the UE capacity are exchanged, and a suitable model is selected using the data set, the transport status of the cellular network and the capacity or capability of the UE. The result is the selection of one of a plurality of trained models for a deep learning application carried out at the UE. The selection of the trained model decides how many features are transmitted to the UE and at which compression rate the data are transmitted.
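By way of illustration only, the runtime selection described above can be sketched as a rule lookup. The rule fields, the threshold values, the model identifiers and the normalization of the UE capacity to a value between 0 and 1 are all assumptions made for this sketch; the application does not specify the layout of the IT map.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ItMapRule:
    """One hypothetical IT-map entry: minimum conditions mapped to a model variant."""
    min_ue_capacity: float      # assumed: normalized processing capacity, 0..1
    min_throughput_mbps: float  # assumed: network transmission parameter
    model_id: str               # hypothetical identifier of a trained model
    num_features: int           # number of features used by that model
    bits_per_value: int         # transfer resolution (compression level)


# Illustrative rules, ordered from most to least demanding.
IT_MAP = [
    ItMapRule(0.8, 50.0, "model_full", 128, 32),
    ItMapRule(0.5, 10.0, "model_medium", 64, 16),
    ItMapRule(0.0, 0.0, "model_small", 16, 8),
]


def select_model(ue_capacity: float, throughput_mbps: float) -> ItMapRule:
    """Return the first rule whose two thresholds are both satisfied."""
    for rule in IT_MAP:
        if (ue_capacity >= rule.min_ue_capacity
                and throughput_mbps >= rule.min_throughput_mbps):
            return rule
    raise LookupError("no IT-map rule matches the given parameters")
```

With these illustrative rules, a capable UE on a slow link still falls through to the smallest model, since both the capacity threshold and the throughput threshold of a rule must be met.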
[0063] It is possible to first identify the most important parts of the model and the most important features used by the model, and to use this knowledge to transfer the more important parts or features with a lower compression or a higher resolution compared to parts or features having a lower importance. Furthermore, when the models are transferred for inference, the more important weights used in the model may be transferred with a higher resolution such as 32 or 64 bits, whereas the less important weights or features are transferred with a lower resolution such as 8 or 16 bits, 4 bits or even binary.
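A minimal sketch of such an importance-dependent resolution is given below. It assumes three importance levels with illustrative bit widths and uses min-max scaling to integers as the reduced-resolution encoding; the level names, the bit widths and all function names are hypothetical.

```python
# Assumed mapping from importance level to transfer resolution in bits.
BITS_BY_IMPORTANCE = {"high": 32, "medium": 8, "low": 4}


def quantize(values, bits):
    """Min-max scale floats to integer codes in [0, 2**bits - 1]."""
    lo, hi = min(values), max(values)
    levels = (1 << bits) - 1
    span = (hi - lo) or 1.0  # avoid division by zero for constant input
    codes = [round((v - lo) / span * levels) for v in values]
    return codes, lo, hi


def dequantize(codes, lo, hi, bits):
    """Map integer codes back to approximate float values."""
    levels = (1 << bits) - 1
    return [lo + c / levels * (hi - lo) for c in codes]


def encode_by_importance(weights, importance):
    """Group weights by importance level and quantize each group separately."""
    groups = {}
    for weight, level in zip(weights, importance):
        groups.setdefault(level, []).append(weight)
    encoded = {}
    for level, values in groups.items():
        bits = BITS_BY_IMPORTANCE[level]
        codes, lo, hi = quantize(values, bits)
        encoded[level] = (codes, lo, hi, bits)
    return encoded
```

In this sketch the reconstruction error of a group is bounded by half a quantization step, so weights marked as highly important are recovered almost exactly while less important weights tolerate a coarser approximation.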
[0064] For integers, sending at a reduced resolution is easier, as only the important bytes or bits may be sent, and the receiving side, the UE, can reconstruct the integer by simply filling the rest of the bytes or bits with zeros. When transferring the weights or the features, there is a predefined way to match the order of transmission with the ones that are transmitted. An importance level may be assigned to each feature for the data transfer and to each weight for the model transfer. An importance level can define the total number of digits in a value and how many of them should be transmitted when a specific transfer level is specified. A min-max scaler for integers can be used to reduce the number of rules that are needed, though it increases the predefined meta information used in the IT map. The transmission procedure for floats is more complex. One solution is to min-max scale the floats to integers and then transmit them as integers. Another solution is based on the separation of the bits. The sign and the exponent parts of a float may be more important or mandatory, while the fraction part may be separated according to the specified importance level, and the separate parts may be transferred according to the importance level.
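The two bit-level approaches can be illustrated as follows. The sketch assumes unsigned integers of a known bit width and IEEE 754 single-precision floats (1 sign bit, 8 exponent bits, 23 fraction bits); the function names are hypothetical, and the number of fraction bits to keep would in practice follow from the importance level.

```python
import struct


def truncate_int(value: int, total_bits: int, keep_bits: int) -> int:
    """Keep only the keep_bits most significant bits of an unsigned integer."""
    return value >> (total_bits - keep_bits)


def restore_int(sent: int, total_bits: int, keep_bits: int) -> int:
    """Rebuild the integer at the receiver by zero-filling the dropped bits."""
    return sent << (total_bits - keep_bits)


def truncate_float32(value: float, fraction_bits: int) -> int:
    """Keep sign, exponent and the top fraction_bits of a float32 bit pattern."""
    (raw,) = struct.unpack(">I", struct.pack(">f", value))
    return raw >> (23 - fraction_bits)


def restore_float32(sent: int, fraction_bits: int) -> float:
    """Rebuild a float32 by zero-filling the dropped low fraction bits."""
    raw = sent << (23 - fraction_bits)
    (value,) = struct.unpack(">f", struct.pack(">I", raw))
    return value
```

Because the sign and exponent are always kept in this sketch, the restored float has the correct sign and order of magnitude, and only its precision depends on how many fraction bits were transmitted.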
[0067] As shown in
[0068] In the following, the bootstrapping phase is discussed in more detail.
[0070] In step S11, a request to access the control plane data is sent to the Network Exposure Function, NEF, which, in step S12, informs the authorized application function, AF, of the access to the cellular network, so that, in step S13, the fact that access is granted to the control plane is transmitted to the application server. In step S14, the AF grants the application server access to the control plane and the PCF is exposed to the external application server, so that, in step S15, the data set is transmitted to the PCF. In addition to the data set or IT map, routing information is transmitted to the PCF indicating where the different compressed models can be accessed, including e.g. the application server IP address, a port number and the directory where each of the models can be accessed at the application server. In step S16, the PCF stores the IT map and the routing information. The PCF can store the IT map as its policy along with the routing information about where to access the different models, using address information such as the IP address of the application server, the port number and any directory where the different models are stored.
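The storage carried out in steps S15 and S16 can be sketched as a simple in-memory store. The class names, the field names and the example address (taken from a range reserved for documentation) are assumptions for illustration and do not represent a standardized PCF interface.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class RoutingInfo:
    """Where one trained model can be accessed at the application server."""
    ip_address: str
    port: int
    directory: str


@dataclass
class PolicyStore:
    """Hypothetical in-memory stand-in for the policy storage of the PCF."""
    it_map: dict = field(default_factory=dict)
    routing: dict = field(default_factory=dict)

    def bootstrap(self, it_map: dict, routing: dict) -> None:
        """Steps S15 and S16: receive and store the IT map and routing info."""
        self.it_map = dict(it_map)
        self.routing = dict(routing)

    def routing_for(self, model_id: str) -> RoutingInfo:
        """Look up where the given model is stored at the application server."""
        return self.routing[model_id]


# Example use with purely illustrative values.
store = PolicyStore()
store.bootstrap(
    {"low_capacity": "model_small"},
    {"model_small": RoutingInfo("192.0.2.10", 8443, "/models/small")},
)
```

Keeping the routing information keyed by model identifier lets the policy chosen at runtime be resolved directly into the address, port and directory of the selected model.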
[0072] The application servers like Facebook are required to authorize themselves through the application function using the NEF as shown in
[0074] In step S21, a request is sent from the UE to the AMF to establish a PDU session. The PDU session establishment request is made by the UE for a data exchange between the access network, AN, and the core network. The request can include quality of service parameters such as the QoS class identifier, QCI (4G/LTE), or 5QI (5G), requirements like latency, throughput or other network parameters, and can furthermore include the UE capacity like the processing and storage capacity, the battery status, etc. In step S22, the AMF selects the SMF and transmits the parameters it has received from the UE. In the core network, the SMF is responsible for requesting the policy information, the network slice instance and the UPF selection for the application, so that the UE is able to communicate with the application server, which may be provided in the cloud. In step S23, the SMF requests from the PCF the policy information needed to create the UPF instance. In step S24, the PCF requests the quality of service parameters and the UE capabilities, including the battery status, from the SMF. This information is needed in order to choose a suitable policy according to the current capacity parameter of the UE and the network status. In step S25, the SMF transmits the requested information including the quality of service parameter, the UE capability, the battery status, etc. to the PCF. In step S26, the PCF chooses a suitable policy at runtime based on the received quality of service parameters and the UE capacity using the IT map. Furthermore, it also retrieves the routing information about the application server, with the address including e.g. a port number and directory, for the chosen policy. Accordingly, in step S26, the PCF selects one of the models from the plurality of models provided on the application server.
In step S27, the PCF sends the routing information such as the IP address, the port and the directory using a particular template, so that the session management function is informed where the selected model is stored at the application server. In step S28, the SMF creates the UPF instance according to the received template for the user plane traffic. It also sends to the UPF the routing information for the application server where the models are stored and where the chosen model can be found. Accordingly, in step S28, the information where the selected model can be accessed at the application server is transmitted to the user plane entity handling the traffic. In step S29, a PDU session is established between the UE and the SMF for the application, e.g. using an Nsmf-PDU session service. In step S30, the UE sends a request for the model to the UPF to download the model and the corresponding weights through the user plane session. In step S31, the UPF reroutes the request to the correct address at the application server, by way of example using the information as received from the SMF in step S28. Finally, in step S32, the application server sends the requested selected model to the UE.
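The handling of steps S28, S30 and S31 at the user plane entity can be sketched as follows. The class and method names are hypothetical, and the use of an HTTP URL to address the model is an assumption; the application only states that an IP address, a port number and a directory are used.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RoutingInfo:
    """Address of the selected model at the application server."""
    ip_address: str
    port: int
    directory: str


class UserPlaneEntity:
    """Hypothetical stand-in for the UPF handling the model download."""

    def __init__(self) -> None:
        self._routes = {}

    def handle_session(self, session_id: str, routing: RoutingInfo) -> None:
        """Step S28: store the routing information received from the SMF."""
        self._routes[session_id] = routing

    def reroute_download(self, session_id: str) -> str:
        """Steps S30 and S31: map a UE download request to the model address."""
        routing = self._routes[session_id]
        path = routing.directory.strip("/")
        # Assumed URL scheme; the transport protocol is not specified.
        return f"http://{routing.ip_address}:{routing.port}/{path}"
```

The UE does not need to know where the selected model is stored in this sketch: it only issues a download request within its session, and the user plane entity resolves the target from the routing information supplied in step S28.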
[0076] As far as the UPF or user plane entity is concerned, some of the steps carried out in the message exchange in
[0081] From the above, some general conclusions can be drawn.
[0082] As far as the policy control entity is concerned, when the quality of service parameter is determined by the policy control entity, it is possible to transmit a first request to the session management entity requesting the quality of service parameter and the capacity parameter from the session management entity. Furthermore, a response is received to this request from the session management entity and the response comprises the quality of service parameter and capacity parameter. This was discussed above in connection with steps S24 and S25.
[0083] When the routing information is determined, it is possible to determine address information at the application entity at which said one trained model can be accessed for the download to the mobile entity. The address information may comprise an IP address or a port number where the model can be accessed at the application server. The plurality of different trained models may differ from one another by the number of features and by the amount of compression with which the corresponding features of the plurality of trained models are transmitted through the cellular network.
[0084] The data set or IT map may indicate the compression parameter in dependence on the network transmission parameter and the capacity parameter of the mobile entity. It is possible, in at least some of the plurality of trained models, to give different importance levels to the different features, in which case the data set or IT map indicates that features with a higher importance level are to be transmitted to the mobile entity with a lower compression compared to the features with a lower importance level. Furthermore, it is possible that, in each of the models, the features are weighted with a corresponding weighting factor and the data set furthermore indicates the compression for the weighting factors in dependence on the capacity parameter and/or in dependence on the transmission capabilities.
[0085] Furthermore, in a bootstrapping phase, before the trained model is determined, the defined data set is received and stored by the policy control entity such that it is accessible to the policy control entity.
[0086] As discussed in connection with
[0087] As far as the user plane entity is concerned, the handling request may be received from the session management entity and includes, as routing information, the network address and a directory where the selected trained model is accessible at the application entity.
[0088] The above-discussed solution has several advantages.
[0089] One advantage is that the model will be available for inference within a shorter time. In case the mobile entity running the model is not very fast and the use case requires real-time data, it might be advantageous to run a simpler model. With less data, the data transfer and feature extraction will probably be faster, and the model itself might also be able to do the inference faster.
[0090] In particular, training of the model might benefit from this, since the amount of data needed for training might be large and the data transport time high when the training is carried out at the UE; however, the training may also be done at the server. Training could take hours or days on large amounts of data, and could be faster when smaller amounts of data are used for training at the UE. During use, transmitting only the most important bytes of the model might help to reduce the data size to be transmitted. When the capacity allows, more data could be sent and used accordingly.
[0091] The application could also be applied to provide individual UEs with deep learning models fitted to their own capabilities, instead of deploying models to different UEs with no network awareness. The decision on the accuracy of the deployed model could be moved into the network rather than being made at the application or cloud server. The application is furthermore suitable for UEs with varying computing capability, for UEs running in a battery saving mode, or when multiple applications are running concurrently.