METHOD FOR DECODING CODEWORD IN WIRELESS COMMUNICATION SYSTEM AND TERMINAL THEREFOR
20220385406 · 2022-12-01
Inventors
Cpc classification
H04L1/002
ELECTRICITY
H04L25/067
ELECTRICITY
International classification
Abstract
Disclosed is a method by which a terminal decodes a codeword in a wireless communication system. Specifically, the method may comprise: receiving a plurality of codewords; and decoding the plurality of codewords on the basis of successive interference cancellation (SIC). The SIC may be performed on the basis of a decoding policy for decoding the plurality of codewords, and the decoding policy may be determined by a neural network trained on the basis of a state and a reward related to the plurality of codewords.
Claims
1. A method of decoding a codeword by a user equipment (UE) in a wireless communication system, the method comprising: receiving a plurality of codewords; and decoding the plurality of codewords based on successive interference cancellation (SIC), wherein the SIC is performed based on a decoding policy for decoding the plurality of codewords, and wherein the decoding policy is determined by a neural network trained based on a state and a reward related to the plurality of codewords.
2. The method of claim 1, wherein the state includes channel quality of each of a first codeword and a second codeword, and the reward includes whether decoding of each of the first codeword and the second codeword is success.
3. The method of claim 1, wherein the decoding policy includes i) an order of decoding the codewords and ii) whether each codeword is combined with a log likelihood ratio (LLR) value calculated in previous transmission of each codeword, stored in a hybrid automatic repeat and request (HARQ) buffer, and wherein the neural network is trained based on decoding results of the codewords based on the decoding policy.
4. The method of claim 1, wherein the state further includes an interference relationship in a time domain and a frequency domain of the codewords, and wherein the neural network is trained based further on the interference relationship.
5. The method of claim 1, further comprising managing a hybrid automatic repeat and request (HARQ) buffer using log likelihood ratio (LLR) values calculated for the respective codewords upon failing to decode the plurality of codewords based on the decoding policy.
6. The method of claim 5, wherein the managing the HARQ buffer includes i) adding the LLR values calculated for the respective codewords to previous LLR values stored in the HARQ buffer, (ii) replacing the previous LLR values stored in the HARQ buffer with the LLR values calculated for the respective codewords, or (iii) dropping the LLR values calculated for the respective codewords.
7. The method of claim 6, wherein the managing the HARQ buffer includes adding only LLR values having a threshold value or more among the LLR values calculated for the respective codewords to the previous LLR values stored in the HARQ buffer.
8. A user equipment (UE) for decoding a codeword in a wireless communication system, the UE comprising: a transceiver configured to receive a plurality of codewords; and a processor configured to decode the plurality of codewords based on successive interference cancellation (SIC), wherein the processor performs the SIC based on a decoding policy for decoding the plurality of codewords, and determine the decoding policy through a neural network trained based on a state and a reward related to the plurality of codewords.
9. The UE of claim 8, wherein the state includes channel quality of each of a first codeword and a second codeword, and the reward includes whether decoding of each of the first codeword and the second codeword is success.
10. The UE of claim 8, wherein the decoding policy includes i) an order of decoding the codewords and ii) whether each codeword is combined with a log likelihood ratio (LLR) value calculated in previous transmission of each codeword, stored in a hybrid automatic repeat and request (HARQ) buffer, and wherein the neural network is trained based on decoding results of the codewords based on the decoding policy.
11. The UE of claim 8, wherein the state further includes an interference relationship in a time domain and a frequency domain of the codewords, and wherein the processor trains the neural network based further on the interference relationship.
12. The UE of claim 8, wherein, upon failing to decode the plurality of codewords based on the decoding policy, the processor manages a hybrid automatic repeat and request (HARQ) buffer using log likelihood ratio (LLR) values calculated for the respective codewords.
13. The UE of claim 12, wherein the processor i) adds the LLR values calculated for the respective codewords to previous LLR values stored in the HARQ buffer, (ii) replaces the previous LLR values stored in the HARQ buffer with the LLR values calculated for the respective codewords, or (iii) drops the LLR values calculated for the respective codewords.
14. The UE of claim 13, wherein the processor adds only LLR values having a threshold value or more among the LLR values calculated for the respective codewords to the previous LLR values stored in the HARQ buffer.
Description
DESCRIPTION OF DRAWINGS
[0020] The accompanying drawings, which are included to provide a further understanding of the present disclosure and are incorporated in and constitute a part of this application, illustrate embodiments of the present disclosure and together with the description serve to explain the principle of the present disclosure. In the drawings:
[0021]
[0022]
[0023]
[0024]
[0025]
[0026]
[0027]
[0028]
[0029]
[0030]
[0031]
BEST MODE FOR CARRYING OUT THE DISCLOSURE
[0032] The embodiments of the present disclosure described hereinbelow are combinations of elements and features of the present disclosure. The elements or features may be considered selective unless otherwise mentioned. Each element or feature may be practiced without being combined with other elements or features. Further, an embodiment of the present disclosure may be constructed by combining parts of the elements and/or features. Operation orders described in embodiments of the present disclosure may be rearranged. Some constructions or features of any one embodiment may be included in another embodiment and may be replaced with corresponding constructions or features of another embodiment.
[0033] In the embodiments of the present disclosure, a description is made, centering on a data transmission and reception relationship between a base station (BS) and a user equipment (UE). The BS is a terminal node of a network, which communicates directly with a UE. In some cases, a specific operation described as performed by the BS may be performed by an upper node of the BS.
[0034] Namely, it is apparent that, in a network comprised of a plurality of network nodes including a BS, various operations performed for communication with a UE may be performed by the BS or network nodes other than the BS. The term ‘BS’ may be replaced with the term ‘fixed station’, ‘Node B’, ‘evolved Node B (eNode B or eNB)’, ‘Access Point (AP)’, etc. The term ‘relay’ may be replaced with the term ‘relay node (RN)’ or ‘relay station (RS)’. The term ‘terminal’ may be replaced with the term ‘UE’, ‘mobile station (MS)’, ‘mobile subscriber station (MSS)’, ‘subscriber station (SS)’, etc. The term “cell”, as used herein, may be applied to transmission and reception points such as a base station (eNB), a sector, a remote radio head (RRH), and a relay, and may also be extensively used by a specific transmission/reception point to distinguish between component carriers.
[0035] Specific terms used for the embodiments of the present disclosure are provided to help the understanding of the present disclosure. These specific terms may be replaced with other terms within the scope and spirit of the present disclosure.
[0036] In some cases, to prevent the concept of the present disclosure from being ambiguous, structures and apparatuses of the known art will be omitted, or will be shown in the form of a block diagram based on main functions of each structure and apparatus. Also, wherever possible, the same reference numbers will be used throughout the drawings and the specification to refer to the same or like parts.
[0037] The embodiments of the present disclosure may be supported by standard documents disclosed for at least one of wireless access systems, Institute of Electrical and Electronics Engineers (IEEE) 802, 3rd Generation Partnership Project (3GPP), 3GPP long term evolution (3GPP LTE), LTE-advanced (LTE-A), and 3GPP2. Steps or parts that are not described to clarify the technical features of the present disclosure may be supported by those documents. Further, all terms as set forth herein may be explained by the standard documents.
[0038] Techniques described herein may be used in various wireless access systems such as code division multiple access (CDMA), frequency division multiple access (FDMA), time division multiple access (TDMA), orthogonal frequency division multiple access (OFDMA), single carrier-frequency division multiple access (SC-FDMA), etc. CDMA may be implemented as a radio technology such as universal terrestrial radio access (UTRA) or CDMA2000. TDMA may be implemented as a radio technology such as global system for mobile communications (GSM)/general packet radio service (GPRS)/Enhanced Data Rates for GSM Evolution (EDGE). OFDMA may be implemented as a radio technology such as IEEE 802.11 (Wi-Fi), IEEE 802.16 (WiMAX), IEEE 802.20, evolved-UTRA (E-UTRA) etc. UTRA is a part of universal mobile telecommunications system (UMTS). 3GPP LTE is a part of Evolved UMTS (E-UMTS) using E-UTRA. 3GPP LTE employs OFDMA for downlink and SC-FDMA for uplink. LTE-A is an evolution of 3GPP LTE. WiMAX may be described by the IEEE 802.16e standard (wireless metropolitan area network (WirelessMAN)-OFDMA Reference System) and the IEEE 802.16m standard (WirelessMAN-OFDMA Advanced System). For clarity, this application focuses on the 3GPP LTE and LTE-A systems. However, the technical features of the present disclosure are not limited thereto.
[0039] LTE/LTE-A Resource Structure/Channel
[0040] With reference to
[0041] In a cellular orthogonal frequency division multiplexing (OFDM) wireless packet communication system, uplink and/or downlink data packets are transmitted in subframes. One subframe is defined as a predetermined time period including a plurality of OFDM symbols. The 3GPP LTE standard supports a type-1 radio frame structure applicable to frequency division duplex (FDD) and a type-2 radio frame structure applicable to time division duplex (TDD).
[0042]
[0043] The number of OFDM symbols in one slot may vary depending on a cyclic prefix (CP) configuration. There are two types of CPs: extended CP and normal CP. In the case of the normal CP, one slot includes 7 OFDM symbols. In the case of the extended CP, the length of one OFDM symbol is increased and thus the number of OFDM symbols in a slot is smaller than in the case of the normal CP. Thus, when the extended CP is used, 6 OFDM symbols, for example, may be included in one slot. If the channel state becomes poor, for example, during fast movement of a UE, the extended CP may be used to further decrease inter-symbol interference (ISI).
[0044] In the case of the normal CP, one subframe includes 14 OFDM symbols because one slot includes 7 OFDM symbols. The first two or three OFDM symbols of each subframe may be allocated to a physical downlink control channel (PDCCH) and the other OFDM symbols may be allocated to a physical downlink shared channel (PDSCH).
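As a quick arithmetic check of the symbol counts above, the following minimal sketch assumes two slots per subframe (the usual LTE type-1 value, which is not restated in this paragraph). The function and constant names are illustrative only.

```python
# Illustrative arithmetic only; all names here are hypothetical.
SLOTS_PER_SUBFRAME = 2  # assumption: LTE type-1 frame, two slots per subframe
SYMBOLS_PER_SLOT = {"normal": 7, "extended": 6}  # values stated in the text


def symbols_per_subframe(cp: str) -> int:
    """Return the number of OFDM symbols in one subframe for a CP type."""
    return SLOTS_PER_SUBFRAME * SYMBOLS_PER_SLOT[cp]


def pdsch_symbols(cp: str, pdcch_symbols: int) -> int:
    """Return the symbols left for the PDSCH after the leading PDCCH symbols."""
    return symbols_per_subframe(cp) - pdcch_symbols
```

With the normal CP and three PDCCH symbols, for instance, 11 of the 14 symbols remain for the PDSCH.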
[0045]
[0046] The above-described radio frame structures are purely exemplary and thus it is to be noted that the number of subframes in a radio frame, the number of slots in a subframe, or the number of symbols in a slot may vary.
[0047]
[0048]
[0049]
[0050] Reference Signal (RS)
[0051] In a wireless communication system, a packet is transmitted on a radio channel. In view of the nature of the radio channel, the packet may be distorted during the transmission. To receive the signal successfully, a receiver should compensate for the distortion of the received signal using channel information. Generally, to enable the receiver to acquire the channel information, a transmitter transmits a signal known to both the transmitter and the receiver, and the receiver acquires knowledge of the channel information based on the distortion of the signal received on the radio channel. This signal is called a pilot signal or an RS.
[0052] In the case of data transmission and reception through multiple antennas, knowledge of channel states between transmission (Tx) antennas and reception (Rx) antennas is required for successful signal reception. Accordingly, an RS should be transmitted through each Tx antenna.
[0053] RSs may be divided into downlink RSs and uplink RSs. In the current LTE system, the uplink RSs include:
[0054] i) Demodulation-reference signal (DM-RS) used for channel estimation for coherent demodulation of information delivered on a PUSCH and a PUCCH; and
[0055] ii) Sounding reference signal (SRS) used for an eNB or a network to measure the quality of an uplink channel in a different frequency.
[0056] The downlink RSs are categorized into:
[0057] i) Cell-specific reference signal (CRS) shared among all UEs of a cell;
[0058] ii) UE-specific RS dedicated to a specific UE;
[0059] iii) DM-RS used for coherent demodulation of a PDSCH, when the PDSCH is transmitted;
[0060] iv) Channel state information-reference signal (CSI-RS) carrying CSI when downlink DM-RSs are transmitted;
[0061] v) Multimedia broadcast single frequency network (MBSFN) RS used for coherent demodulation of a signal transmitted in MBSFN mode; and
[0062] vi) Positioning RS used to estimate geographical position information about a UE.
[0063] RSs may also be divided into two types according to their purposes: RS for channel information acquisition and RS for data demodulation. Since its purpose lies in that a UE acquires downlink channel information, the former should be transmitted in a broad band and received even by a UE that does not receive downlink data in a specific subframe. This RS is also used in a situation like handover. The latter is an RS that an eNB transmits along with downlink data in specific resources. A UE may demodulate the data by measuring a channel using the RS. This RS should be transmitted in a data transmission area.
[0064] Modeling of Multiple-Input Multiple-Output (MIMO) System
[0065]
[0066] As shown in
$R_i = \min(N_T, N_R)$ [Equation 1]
[0067] For instance, in a MIMO communication system that uses four Tx antennas and four Rx antennas, a transmission rate up to four times higher than that of a single-antenna system may theoretically be obtained.
[0068] In order to explain a communication method in a MIMO system in detail, mathematical modeling may be represented as follows. It is assumed that there are $N_T$ Tx antennas and $N_R$ Rx antennas.
[0069] Regarding a transmitted signal, if there are $N_T$ Tx antennas, the maximum number of pieces of information that may be transmitted is $N_T$. Hence, the transmission information may be represented as shown in Equation 2.
$s = [s_1, s_2, \ldots, s_{N_T}]^T$ [Equation 2]
[0070] Meanwhile, transmit powers may be set different from each other for individual pieces of transmission information $s_1, s_2, \ldots, s_{N_T}$. If the transmit powers are set to $P_1, P_2, \ldots, P_{N_T}$, respectively, the transmission information with adjusted transmit powers may be represented as Equation 3.
$\hat{s} = [\hat{s}_1, \hat{s}_2, \ldots, \hat{s}_{N_T}]^T = [P_1 s_1, P_2 s_2, \ldots, P_{N_T} s_{N_T}]^T$ [Equation 3]
[0071] In addition, $\hat{s}$ may be represented as Equation 4 using the diagonal matrix $P$ of the transmission power.
$\hat{s} = \mathrm{diag}(P_1, P_2, \ldots, P_{N_T})\, s = P s$ [Equation 4]
[0072] Assuming a case of configuring $N_T$ transmitted signals $x_1, x_2, \ldots, x_{N_T}$ by applying a weight matrix $W$ to the power-adjusted information vector $\hat{s}$, the transmitted signals may be represented as Equation 5.
$x = [x_1, x_2, \ldots, x_{N_T}]^T = W \hat{s} = W P s$ [Equation 5]
[0073] In Equation 5, $w_{ij}$, the element in the $i$-th row and $j$-th column of $W$, denotes a weight between the $i$-th Tx antenna and the $j$-th information. $W$ is also called a precoding matrix.
[0074] If the $N_R$ Rx antennas are present, respective received signals $y_1, y_2, \ldots, y_{N_R}$ of the antennas may be expressed as follows.
$y = [y_1, y_2, \ldots, y_{N_R}]^T$ [Equation 6]
[0075] If channels are modeled in the MIMO wireless communication system, the channels may be distinguished according to Tx/Rx antenna indexes. A channel from the Tx antenna $j$ to the Rx antenna $i$ is denoted by $h_{ij}$. In $h_{ij}$, it is noted that the index of the Rx antenna precedes the index of the Tx antenna.
[0076]
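$h_i^T = [h_{i1}, h_{i2}, \ldots, h_{iN_T}]$ [Equation 7]
[0077] Accordingly, all channels from the $N_T$ Tx antennas to the $N_R$ Rx antennas may be expressed as follows.
$H = \begin{bmatrix} h_1^T \\ h_2^T \\ \vdots \\ h_{N_R}^T \end{bmatrix} = \begin{bmatrix} h_{11} & h_{12} & \cdots & h_{1N_T} \\ h_{21} & h_{22} & \cdots & h_{2N_T} \\ \vdots & \vdots & \ddots & \vdots \\ h_{N_R 1} & h_{N_R 2} & \cdots & h_{N_R N_T} \end{bmatrix}$ [Equation 8]
[0078] Additive white Gaussian noise (AWGN) is added to the actual channels after the signals pass through the channel matrix $H$. The AWGN $n_1, n_2, \ldots, n_{N_R}$ added to the respective $N_R$ Rx antennas may be expressed as follows.
$n = [n_1, n_2, \ldots, n_{N_R}]^T$ [Equation 9]
[0079] Through the above-described mathematical modeling, the received signals may be expressed as follows.
$y = H x + n$ [Equation 10]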
[0080] Meanwhile, the number of rows and columns of the channel matrix $H$ indicating the channel state is determined by the number of Tx and Rx antennas. The number of rows of the channel matrix $H$ is equal to the number $N_R$ of Rx antennas and the number of columns thereof is equal to the number $N_T$ of Tx antennas. That is, the channel matrix $H$ is an $N_R \times N_T$ matrix.
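The signal model described above (information scaled by the transmit powers, weighted by the precoding matrix W, passed through the N_R×N_T channel matrix H, and corrupted by additive noise) can be sketched in a few lines. This is a minimal illustration under simplifying assumptions: real-valued quantities stand in for the complex baseband values, and the helper names `matvec` and `received_signal` are hypothetical.

```python
from typing import List

Vector = List[float]
Matrix = List[List[float]]  # list of rows


def matvec(m: Matrix, v: Vector) -> Vector:
    """Multiply a matrix (given as a list of rows) by a column vector."""
    return [sum(a * b for a, b in zip(row, v)) for row in m]


def received_signal(h: Matrix, w: Matrix, powers: Vector,
                    s: Vector, n: Vector) -> Vector:
    """Compute y = H W P s + n, where P = diag(powers)."""
    s_hat = [p * x for p, x in zip(powers, s)]  # power-adjusted information, P s
    x = matvec(w, s_hat)                        # precoded transmit signals, W P s
    hx = matvec(h, x)                           # signals after the channel, H x
    return [a + b for a, b in zip(hx, n)]       # add the noise vector n
```

For example, with N_T = 2 and N_R = 3, `h` has three rows of two entries each, and the returned vector has one entry per Rx antenna.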
[0081] The rank of a matrix is defined as the smaller of the number of mutually independent rows and the number of mutually independent columns. Accordingly, the rank of the matrix is not greater than the number of rows or columns. The rank $\mathrm{rank}(H)$ of the channel matrix $H$ is restricted as follows.
$\mathrm{rank}(H) \le \min(N_T, N_R)$ [Equation 11]
[0082] Additionally, the rank of a matrix may also be defined as the number of non-zero eigenvalues when the matrix is eigenvalue-decomposed. Similarly, the rank of a matrix may be defined as the number of non-zero singular values when the matrix is singular-value-decomposed. Accordingly, the physical meaning of the rank of a channel matrix may be the maximum number of channels through which different pieces of information may be transmitted.
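The bound on the rank can be checked numerically. The sketch below counts independent rows by Gaussian elimination over exact rationals. It is only an illustration: it assumes small integer-valued matrices (an actual channel matrix is complex-valued), and `matrix_rank` is a hypothetical helper name.

```python
from fractions import Fraction
from typing import List


def matrix_rank(rows: List[List[int]]) -> int:
    """Return the rank of a matrix using exact Gaussian elimination."""
    m = [[Fraction(x) for x in row] for row in rows]
    n_rows = len(m)
    n_cols = len(m[0]) if m else 0
    rank = 0
    for col in range(n_cols):
        # Find a pivot row at or below the current rank position.
        pivot = next((r for r in range(rank, n_rows) if m[r][col] != 0), None)
        if pivot is None:
            continue
        m[rank], m[pivot] = m[pivot], m[rank]
        # Eliminate this column from all rows below the pivot row.
        for r in range(rank + 1, n_rows):
            factor = m[r][col] / m[rank][col]
            m[r] = [a - factor * b for a, b in zip(m[r], m[rank])]
        rank += 1
    return rank
```

A 3×2 matrix (N_R = 3, N_T = 2) can have rank at most 2, and a matrix whose rows are multiples of one another has rank 1 regardless of its size.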
[0083] Proposal of Efficient Decoding Method Using Reinforcement Learning
[0084] In a MIMO HARQ scenario, a transceiver of a UE receives signals including a plurality of CWs through multiple antennas. In this case, the UE may use an SIC reception method to secure performance. The UE using the SIC reception method i) sequentially decodes the CWs, ii) re-encodes a successfully decoded CW, and iii) removes the re-encoded CW from received signals, thereby raising decoding performance of the next CW. However, upon failing to decode a CW due to a channel environment etc., the UE stores an LLR value corresponding to each CW in a HARQ buffer and requests retransmission. Then, the UE attempts to decode a CW using a combination of a newly received signal and an LLR value pre-stored in the HARQ buffer.
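The three SIC steps above, and the reason the decoding order matters, can be illustrated with a toy example. The sketch assumes two noise-free BPSK streams superposed with different real-valued gains, and it collapses channel decoding and re-encoding into a per-symbol hard decision; there is no CRC check or HARQ buffer. All names are hypothetical.

```python
from typing import Dict, List, Sequence


def hard_decision(values: Sequence[float]) -> List[int]:
    """Map each real sample to a BPSK symbol (+1 or -1)."""
    return [1 if v >= 0 else -1 for v in values]


def sic_decode(received: Sequence[float], gains: Sequence[float],
               order: Sequence[int]) -> Dict[int, List[int]]:
    """Detect superposed BPSK streams one at a time in the given order."""
    residual = list(received)
    decoded: Dict[int, List[int]] = {}
    for cw in order:
        # i) detect the current CW, treating the remaining CWs as interference
        symbols = hard_decision([r / gains[cw] for r in residual])
        decoded[cw] = symbols
        # ii) regenerate its contribution and iii) remove it from the signal
        residual = [r - gains[cw] * s for r, s in zip(residual, symbols)]
    return decoded
```

If the stream with the larger gain is detected first, both streams are recovered in this toy setting. If the weaker stream is detected first, the stronger one dominates the decision and the weaker stream is detected incorrectly, which is the kind of outcome a learned decoding order is meant to avoid.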
[0085] If a large number of CWs is transmitted through multiple antennas or if multiple retransmissions occur, a large number of HARQ buffers may be required. In addition, in order to design a decoder with high performance, since a combination of a plurality of CW decoding orders should be considered, complexity may increase. Accordingly, a method of efficiently decoding a plurality of CWs is needed.
[0086] To achieve the above technical object, the present disclosure proposes a HARQ reception method of a feed-forward scheme based on a decoding policy determined by a receiver through reinforcement learning.
[0087]
[0088] Referring to
[0089] Upon failing to decode a CW, the MIMO SIC receiver transmits information about the state and the reward to the agent, and the agent determines the action based on the information. The above-mentioned state, reward, and action may be information described below.
[0090] The state may include at least one of channel quality information of each CW, an average signal-to-noise ratio (SNR) of each CW, the number of retransmissions of each CW, a code rate of each CW, a modulation and coding scheme (MCS) index of each CW, layer mapping information of each CW, a received average SNR of each CW, a received average SNR of each layer, information about an interference relationship between CWs, or the total number of CWs.
[0091] The reward may include at least one of decoding success or failure (ACK or NACK) for each CW, throughput of data upon which decoding is successful, or
[0092] The action may include at least one of decoding order of CWs, combination or non-combination of a CW with a HARQ buffer during decoding of each CW, demodulation order of layers, a HARQ buffer update policy (add/replace/drop), or a size threshold value of an LLR of each CW.
[0093] The proposed HARQ reception method may include performing reinforcement learning to determine an action according to a state and a reward through training data, and transmitting a decoding policy according to the state based on learned information.
[0094] For convenience of description below, notations are defined as follows. First, $CW_n$ denotes the $n$-th CW received at the current reception timing, $CW_n^{-1}$ denotes the LLR value calculated from previous transmissions of $CW_n$ and stored in the HARQ buffer, and $CW_n + CW_n^{-1}$ denotes the sum of the currently received $CW_n$ and the LLR information pre-stored in the HARQ buffer. Meanwhile, the following description is given based on Q-learning among reinforcement learning methods. However, the following description may equally be applied to other reinforcement learning methods, such as deep Q-network and multi-armed bandit methods.
[0095] Implementation 1) In Case of CW-Level SIC
[0096] Assume that the receiving UE uses a CW SIC reception method in a 4×4 MIMO environment, and the transmitting UE transmits two CWs. First, steps of training an agent through training data will be described. For reinforcement learning of the agent, a state, a reward, and an action need to be defined first. According to an example or implementation of the present disclosure, the state and the reward may be defined as shown in [Table 1].
TABLE 1
State: Channel quality (e.g., CQI) of CW1, average SNR of CW1, code rate of CW1, number of retransmissions of CW1, channel quality (e.g., CQI) of CW2, average SNR of CW2, code rate of CW2, and/or number of retransmissions of CW2
Reward: 0 for NACK and 1 for ACK, or 0 for NACK and 1/(number of retransmissions) for ACK
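The two reward definitions in [Table 1] can be written as a single function. This is a minimal sketch: the function name and the `scaled` flag are hypothetical, and treating a retransmission count of zero as a reward of 1 is an assumption made here to avoid a division by zero.

```python
def reward(ack: bool, num_retransmissions: int = 0, scaled: bool = False) -> float:
    """Return 0 for NACK; for ACK return 1, or 1/retransmissions if scaled."""
    if not ack:
        return 0.0
    if scaled and num_retransmissions > 0:
        return 1.0 / num_retransmissions
    # Unscaled ACK, or a scaled ACK with no retransmission (assumption: reward 1).
    return 1.0
```

The scaled variant gives a smaller reward to a success that needed more retransmissions, so an agent trained on it is pushed toward policies that succeed earlier.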
[0097] The action or the decoding policy may be defined as follows.
[0098] 1-1) When a CW of a current reception timing and an LLR value of a HARQ buffer are considered individually or in combination, the agent may determine the decoding policy as follows. In this case, the number of HARQ buffers in which an LLR value for each CW is stored may be one.
[0099] Policy #1: $CW_1 \rightarrow CW_2$
[0100] Policy #2: $CW_2 \rightarrow CW_1$
[0101] Policy #3: $CW_1 + CW_1^{-1} \rightarrow CW_2 + CW_2^{-1}$
[0102] Policy #4: $CW_2 + CW_2^{-1} \rightarrow CW_1 + CW_1^{-1}$
[0103] 1-2) When the CW of the current reception timing and the LLR value of the HARQ buffer are always added, the agent may determine the decoding policy as follows.
[0104] Policy #1: $CW_1 + CW_1^{-1} \rightarrow CW_2 + CW_2^{-1}$
[0105] Policy #2: $CW_2 + CW_2^{-1} \rightarrow CW_1 + CW_1^{-1}$
[0106] The receiving UE performs reinforcement learning using training data based on the defined state, reward, and action. For example, when the transmitting UE transmits two CWs, the MIMO SIC receiver of the receiving UE transmits, to the agent, the channel quality indicator (CQI) of each CW obtained through channel measurement, the average SNR of each CW, the code rate of each CW, the number of retransmissions of each CW, and the reward, in order to learn the decoding policy of 1-1). Based on its Q-table, the agent transmits the decoding policy with the highest Q value among decoding policies #1 to #4 to the MIMO SIC receiver. The MIMO SIC receiver performs decoding based on the received decoding policy and transmits reward ‘1’ for success or reward ‘0’ for failure back to the agent together with the state. The agent learns the Q-table through the above process. The Q value may be defined as follows.
[0107] The agent may transmit a decoding policy that maximizes the Q value based on the learned Q-table and the state to the MIMO SIC receiver.
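The interaction between the agent and the MIMO SIC receiver can be sketched as a small tabular Q-learning loop over decoding policies #1 to #4 of case 1-1). Several things here are assumptions rather than part of the disclosure: the textbook one-step update Q <- Q + alpha * (r - Q) is used in place of the Q value defined in the disclosure, the state is reduced to two coarse CQI levels plus a retransmission flag, and `toy_reward` is a made-up stand-in for the ACK/NACK feedback of a real receiver.

```python
import random
from itertools import product
from typing import Dict, List, Tuple

# Policies #1-#4 of case 1-1: (decoding order, combine with HARQ buffer?)
POLICIES: List[Tuple[Tuple[int, int], bool]] = [
    ((1, 2), False),  # Policy #1: CW1 -> CW2
    ((2, 1), False),  # Policy #2: CW2 -> CW1
    ((1, 2), True),   # Policy #3: CW1+CW1^-1 -> CW2+CW2^-1
    ((2, 1), True),   # Policy #4: CW2+CW2^-1 -> CW1+CW1^-1
]

# (CQI level of CW1, CQI level of CW2, retransmission flag); hypothetical state
State = Tuple[int, int, int]


def toy_reward(state: State, action: int) -> float:
    """Hypothetical environment: ACK (1) only if the stronger CW is decoded
    first and the buffer is combined exactly when this is a retransmission."""
    cqi1, cqi2, retx = state
    order, combine = POLICIES[action]
    stronger_first = order[0] == (1 if cqi1 > cqi2 else 2)
    return 1.0 if stronger_first and combine == bool(retx) else 0.0


def greedy(q: Dict[State, List[float]], state: State) -> int:
    """Return the index of the policy with the highest Q value (first on ties)."""
    return max(range(len(POLICIES)), key=lambda a: q[state][a])


def train(rounds: int = 20, alpha: float = 0.5, epsilon: float = 0.0,
          seed: int = 0) -> Dict[State, List[float]]:
    """Tabular Q-learning with one-step episodes (no next-state term)."""
    rng = random.Random(seed)
    states = [(c1, c2, r) for (c1, c2), r in product([(0, 1), (1, 0)], [0, 1])]
    # Optimistic initial values make even a purely greedy agent try each policy.
    q = {s: [1.0] * len(POLICIES) for s in states}
    for _ in range(rounds):
        for s in states:
            explore = rng.random() < epsilon
            a = rng.randrange(len(POLICIES)) if explore else greedy(q, s)
            r = toy_reward(s, a)              # receiver feeds back the reward
            q[s][a] += alpha * (r - q[s][a])  # Q <- Q + alpha * (r - Q)
    return q
```

After training, `greedy(q, state)` plays the role of the agent returning the decoding policy with the highest Q value for the reported state. A table is used here for brevity; a neural network, as recited in the claims, would replace the table with a function approximator over a richer state.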
[0108]
[0109] According to an example or implementation of the present disclosure, even in the step of determining the action based on the pre-learned Q-function, the agent may continuously perform Q-function updating by receiving a reward. While the flowcharts illustrated in
[0110] Implementation 2) In Case of Symbol-Level SIC
[0111] Unlike CW-level SIC in which re-encoding is performed through CRC after performing channel decoding, the receiving UE may perform, in symbol-level SIC, SIC by demodulating a symbol without performing channel decoding. Therefore, symbol-level SIC has an advantage that recursive decoding is not needed, whereas symbol-level SIC has a disadvantage that there may be reliability loss.
[0112] Assume that the receiving UE receives two CWs through four layers in a 4×4 MIMO environment. That is, each CW may be received through two layers. For example, CW-to-layer mapping may be performed such that CW1 is mapped to layers 1 and 2, and CW2 is mapped to layers 3 and 4. According to an example or implementation of the present disclosure, the state, the reward, and the action may be defined as shown in [Table 2].
TABLE 2
State: Channel quality (e.g., CQI) of CW1, code rate of CW1, number of retransmissions of CW1, average SNR of layers 1 and 2, channel quality (e.g., CQI) of CW2, code rate of CW2, number of retransmissions of CW2, and/or average SNR of layers 3 and 4
Reward: 0 for NACK and 1 for ACK, or 0 for NACK and 1/(number of retransmissions) for ACK
Action: Layer demodulation policies, one for each of the 4! (= 24) possible layer orders, e.g., layer demodulation policy #1: layer 1 -> layer 2 -> layer 3 -> layer 4
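The action space in [Table 2] consists of every possible demodulation order of the four layers, which can be enumerated directly. In this sketch the mapping from policy number to layer order is an assumption (lexicographic order); the text fixes only policy #1.

```python
from itertools import permutations
from typing import Dict, Tuple

LAYERS = (1, 2, 3, 4)

# Policy number -> layer demodulation order; the numbering is an assumption.
LAYER_POLICIES: Dict[int, Tuple[int, ...]] = {
    idx: order for idx, order in enumerate(permutations(LAYERS), start=1)
}
```

This yields 4! = 24 candidate actions, with policy #1 being layer 1 -> layer 2 -> layer 3 -> layer 4.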
[0113] Similar to the case of CW-level SIC illustrated in
[0114] Implementation 3) Consideration of Interference Relationship Between CWs
[0115] According to an example or implementation of the present disclosure, an interference relationship between a plurality of CWs received by the MIMO SIC receiver may further be considered in the reinforcement learning process. For example, as illustrated in
[0116] As illustrated in (a) and (b) of
[0117] Implementation 4) HARQ Buffer Update Policy
[0118] According to an example or implementation of the present disclosure, when CW decoding fails, a method in which the receiving UE manages a HARQ buffer using an LLR value obtained for decoding from a currently received signal is proposed. The proposed method may be called a buffer update policy determined by the agent of the receiving UE. Specifically, the receiving UE may (i) add an LLR value obtained for decoding from the currently received signal to a previous LLR value stored in the HARQ buffer, (ii) replace the previous LLR value stored in the HARQ buffer with the LLR value obtained for decoding from the currently received signal, or (iii) maintain the previous LLR value stored in the HARQ buffer and drop the LLR value obtained for decoding from the currently received signal. The state, reward, and action according to an example or implementation of the present disclosure may be defined as shown in [Table 3] below.
TABLE 3
State: Channel quality (e.g., CQI) of CW1, code rate of CW1, number of retransmissions of CW1, channel quality (e.g., CQI) of CW2, code rate of CW2, and/or number of retransmissions of CW2
Reward: 0 for NACK and 1 for ACK, or 0 for NACK and 1/(number of retransmissions) for ACK
Action: Buffer update policy (adding/replacing/dropping policy for each CW)
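The three buffer update options can be sketched as a small function over per-bit LLR lists. The enum and function names are hypothetical, and the buffer is modeled as a plain list of floats for one CW.

```python
from enum import Enum
from typing import List


class BufferUpdate(Enum):
    ADD = "add"
    REPLACE = "replace"
    DROP = "drop"


def update_harq_buffer(stored: List[float], new: List[float],
                       policy: BufferUpdate) -> List[float]:
    """Return the HARQ buffer contents after a failed decoding attempt."""
    if policy is BufferUpdate.ADD:
        return [s + n for s, n in zip(stored, new)]  # (i) accumulate the LLRs
    if policy is BufferUpdate.REPLACE:
        return list(new)                             # (ii) keep only the new LLRs
    return list(stored)                              # (iii) keep the old, drop the new
```

In the proposed scheme the agent would select one of these options per CW as part of its action; here the choice is simply passed in as an argument.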
[0119] Implementation 5) Application of Threshold Value During HARQ Buffer Update
[0120] According to [Table 3] described above, upon failing to decode a CW from a currently received signal, the receiving UE may add the LLR value obtained for decoding from the currently received signal to the previous LLR value stored in the HARQ buffer. According to an example or implementation of the present disclosure, only LLR values equal to or greater than a specific threshold value may be added to the previous LLR values stored in the HARQ buffer. In this case, the agent may learn a threshold value that optimizes decoding performance and add the threshold value to the buffer update policy. The state, reward, and action according to Implementation 5 may be defined as shown in [Table 4].
TABLE 4
State: Channel quality (e.g., CQI) of CW1, code rate of CW1, number of retransmissions of CW1, channel quality (e.g., CQI) of CW2, code rate of CW2, number of retransmissions of CW2
Reward: 0 for NACK and 1 for ACK, or 0 for NACK and 1/(number of retransmissions) for ACK
Action: Buffer update policy (size threshold value of the LLR value when adding is performed for each CW)
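The thresholded variant of the adding policy can be sketched as follows. The sketch assumes that the "size" of an LLR value means its magnitude, so that only sufficiently reliable LLRs (large absolute value, either sign) are accumulated; that reading and the function name are assumptions.

```python
from typing import List


def add_above_threshold(stored: List[float], new: List[float],
                        threshold: float) -> List[float]:
    """Add only the new LLRs whose magnitude is at least `threshold`."""
    return [s + n if abs(n) >= threshold else s for s, n in zip(stored, new)]
```

With a threshold of 1.0, for example, a new LLR of 0.5 leaves the stored value unchanged, while new LLRs of -3.0 and 2.0 are added to it.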
[0121] Referring to
[0122] The CW decoding method of the UE according to an example or implementation of the present disclosure includes receiving a plurality of CWs, and decoding the CWs based on SIC. The SIC may be performed based on a decoding policy for decoding the CWs, and the decoding policy may be determined by a neural network trained based on a state and a reward related to the CWs.
[0123] The state may include the channel quality of each of a first CW and a second CW, and the reward may include decoding success or failure of each of the first CW and the second CW.
[0124] The decoding policy may include i) the order of decoding the CWs and ii) combination or non-combination of each CW with an LLR value calculated in previous transmission of each CW, stored in a HARQ buffer, and the neural network may be trained based on decoding results of the CWs based on the decoding policy.
[0125] The state may further include an interference relationship in the time domain and the frequency domain of the CWs, and the neural network may be trained based further on the interference relationship.
[0126] Upon failing to decode the CWs based on the decoding policy, the CW decoding method of the UE may further include managing a HARQ buffer using LLR values calculated for the respective CWs.
[0127] The managing the HARQ buffer may include i) adding the LLR values calculated for the respective CWs to previous LLR values stored in the HARQ buffer, (ii) replacing the previous LLR values stored in the HARQ buffer with the LLR values calculated for the respective CWs, or (iii) dropping the LLR values calculated for the respective CWs.
[0128] The managing the HARQ buffer may include adding only LLR values having a threshold value or more among the LLR values calculated for the respective CWs to the previous LLR values stored in the HARQ buffer.
[0129]
[0130] Referring to
[0131] In the present specification, the processor 21 of the UE and the processor 11 of the BS perform the operations of processing signals and data, as distinct from the functions of receiving, transmitting, and storing signals performed by the UE 20 and the BS 10. For convenience of description, however, the processors 11 and 21 are not specifically mentioned below. Even where they are not mentioned, the processors 11 and 21 may be regarded as performing operations such as data processing, rather than the function of receiving or transmitting signals.
[0132] The present disclosure proposes various new frame structures for a fifth generation (5G) communication system. In the next-generation 5G system, scenarios may be classified into enhanced mobile broadband (eMBB), ultra-reliable machine-type communications (uMTC), and massive machine-type communications (mMTC). Here, eMBB is a next-generation mobile communication scenario characterized by high spectral efficiency, high user experienced data rate, and high peak data rate, uMTC is a next-generation mobile communication scenario characterized by ultra-high reliability, ultra-low latency, and ultra-high availability (e.g., vehicle-to-everything (V2X), emergency services, and remote control), and mMTC is a next-generation mobile communication scenario characterized by low cost, low energy, short packet, and massive connectivity (e.g., Internet of things (IoT)).
[0133] The UE according to an example or implementation of the present disclosure may include a transceiver and a processor. The transceiver may receive a radio signal including a PDCCH and a PDSCH and transmit a radio signal including a PUCCH and a PUSCH. The transceiver may include a radio frequency (RF) unit.
[0134] The UE for decoding a CW according to an example or implementation of the present disclosure may include a transceiver for receiving a plurality of CWs and a processor for decoding the CWs based on SIC. The processor may perform the SIC based on a decoding policy for decoding the CWs and determine the decoding policy through a neural network which is trained based on a state and a reward related to the CWs.
[0135] The state may include channel quality of each of a first CW and a second CW, and the reward may include decoding success or failure of each of the first CW and the second CW.
[0136] The decoding policy may include (i) the order of decoding the CWs and (ii) whether each CW is combined with an LLR value that was calculated in a previous transmission of that CW and is stored in a HARQ buffer. The processor may train the neural network based on the results of decoding the CWs according to the decoding policy.
[0137] The state may further include an interference relationship in the time domain and the frequency domain of the CWs, and the processor may train the neural network based further on the interference relationship.
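The structure of the decoding policy described above (a decoding order plus a per-CW combining decision) and the form of the reward can be sketched as follows. This is a sketch under stated assumptions, not the disclosed implementation: the names `enumerate_policies`, `reward`, `select_policy`, and `toy_score` are hypothetical, and `toy_score` is a hand-written stand-in for the trained neural network, used only so that the example runs.

```python
from itertools import permutations, product
from typing import Callable, List, Sequence, Tuple

# A policy is (decoding order, combine-with-HARQ-buffer flag per CW).
Policy = Tuple[Tuple[int, ...], Tuple[bool, ...]]
State = Tuple[float, ...]  # assumption: one channel-quality value per CW


def enumerate_policies(num_cws: int) -> List[Policy]:
    """List every (order, combine flags) pair for `num_cws` codewords."""
    flags = list(product((False, True), repeat=num_cws))
    return [(order, f) for order in permutations(range(num_cws)) for f in flags]


def reward(decode_ok: Sequence[bool]) -> int:
    """Assumed reward: the number of codewords decoded successfully."""
    return sum(1 for ok in decode_ok if ok)


def select_policy(
    state: State,
    policies: Sequence[Policy],
    score: Callable[[State, Policy], float],
) -> Policy:
    """Pick the policy with the highest score for the given state."""
    return max(policies, key=lambda policy: score(state, policy))


def toy_score(state: State, policy: Policy) -> float:
    """Hypothetical placeholder for the trained network's output.

    It simply favors decoding CWs with better channel quality earlier
    and ignores the combining flags.
    """
    order, _combine = policy
    n = len(order)
    return sum(state[cw] * (n - pos) for pos, cw in enumerate(order))
```

With two CWs there are 2 orders and 4 flag combinations, hence 8 candidate policies. In a trained system, the scoring function would be the neural network output learned from observed states and rewards rather than a fixed heuristic.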
[0138] Upon failing to decode the CWs based on the decoding policy, the processor may manage a HARQ buffer using LLR values calculated for the respective CWs.
[0139] The processor may (i) add the LLR values calculated for the respective CWs to previous LLR values stored in the HARQ buffer, (ii) replace the previous LLR values stored in the HARQ buffer with the LLR values calculated for the respective CWs, or (iii) drop the LLR values calculated for the respective CWs.
[0140] The processor may add, to the previous LLR values stored in the HARQ buffer, only those LLR values that are equal to or greater than a threshold value among the LLR values calculated for the respective CWs.
[0141] The above-described embodiments of the present disclosure may be implemented through various means, for example, hardware, firmware, software, or a combination thereof.
[0142] In a hardware configuration, the methods according to the embodiments of the present disclosure may be achieved by at least one of application specific integrated circuits (ASICs), digital signal processors (DSPs), digital signal processing devices (DSPDs), programmable logic devices (PLDs), field programmable gate arrays (FPGAs), processors, controllers, microcontrollers, microprocessors, etc.
[0143] In a firmware or software configuration, the methods according to the embodiments of the present disclosure may be implemented in the form of a module, a procedure, a function, etc. for performing the above-described functions or operations. Software code may be stored in a memory unit and executed by a processor. The memory unit may be located inside or outside the processor and exchange data with the processor via various known means.
[0144] The detailed descriptions of the preferred embodiments of the present disclosure are provided to allow those skilled in the art to implement and embody the present disclosure. While the present disclosure has been described and illustrated herein with reference to the preferred embodiments thereof, it will be apparent to those skilled in the art that various modifications and variations may be made therein without departing from the spirit and scope of the disclosure. Therefore, the present disclosure is not limited to the embodiments disclosed herein but intends to give the broadest scope consistent with the new principles and features disclosed herein.
[0145] The present disclosure may be carried out in other specific ways than those set forth herein without departing from the spirit and essential characteristics of the present disclosure. The above embodiments are therefore to be construed in all aspects as illustrative and not restrictive. The scope of the disclosure should be determined by the appended claims and their legal equivalents, not by the above description, and all changes coming within the meaning and equivalency range of the appended claims are intended to be embraced therein. It is obvious to those skilled in the art that claims that are not explicitly cited in each other in the appended claims may be presented in combination as an embodiment of the present disclosure or included as a new claim by a subsequent amendment after the application is filed.
Industrial Applicability
[0146] While the above-described method of decoding CWs in a wireless communication system and the UE therefor have been described focusing on an example applied to the 3GPP LTE system, the method and the UE are applicable to various wireless communication systems in addition to the 3GPP LTE system.