FINITE-ALPHABET BEAMFORMING FOR MULTI-ANTENNA WIDEBAND SYSTEMS
20210281302 · 2021-09-09
Inventors
CPC classification
H04B7/0456
ELECTRICITY
H04B7/0478
ELECTRICITY
H04L25/067
ELECTRICITY
Y02D30/70
GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
H04B7/0404
ELECTRICITY
International classification
H04B7/0404
ELECTRICITY
H04B7/0456
ELECTRICITY
H04L25/02
ELECTRICITY
Abstract
Finite-alphabet beamforming for multi-antenna wideband systems is provided. The combination of massive multi-user multiple-input multiple-output (MU-MIMO) technology and millimeter-wave (mmWave) communication enables unprecedentedly high data rates for radio frequency (RF) communications. In such systems, beamforming must be performed at extremely high rates over hundreds of antennas. For example, spatial equalization applies beamforming in the uplink to mitigate interference among user equipment (UEs) at a base station (BS). Finite-alphabet equalization provides a new paradigm that restricts the entries of a spatial equalization matrix to low-resolution numbers, enabling high-throughput, low-power, and low-cost equalization hardware. Similarly, precoding applies beamforming in the downlink to maximize the reception of a signal transmitted from a BS to a target UE. Finite-alphabet precoding can be applied in the downlink to similarly improve power and cost in precoding hardware.
Claims
1. A method for digitally beamforming signals for an antenna array, the method comprising: estimating a wireless channel associated with a plurality of digital baseband signals and an antenna array to produce estimates of the wireless channel; and beamforming, with a finite-alphabet equalizer, the plurality of digital baseband signals based on the estimates of the wireless channel; wherein the plurality of digital baseband signals is beamformed at a given resolution less than a resolution of the estimates of the wireless channel.
2. (canceled)
3. The method of claim 1, further comprising determining digital beamforming weights which are optimized for the given resolution; wherein beamforming the plurality of digital baseband signals comprises steering the plurality of digital baseband signals with the digital beamforming weights.
4. The method of claim 3, further comprising adjusting the digital beamforming weights in response to new estimates of the wireless channel.
5. The method of claim 1, further comprising receiving a sequential array of digital signal samples corresponding to the antenna array and comprising the plurality of digital baseband signals.
6. The method of claim 5, wherein the sequential array of digital signal samples is received from analog-to-digital converter circuitry coupled to the antenna array.
7. The method of claim 5, wherein beamforming the plurality of digital baseband signals comprises recovering received user equipment signals from the sequential array of digital signal samples.
8. The method of claim 1, further comprising providing a sequential array of digital signal samples comprising the plurality of digital baseband signals to be transmitted via the antenna array.
9. The method of claim 8, wherein providing the sequential array of digital signal samples comprises providing the sequential array of digital signal samples to digital-to-analog converter circuitry coupled to the antenna array.
10. The method of claim 8, wherein beamforming the plurality of digital baseband signals comprises precoding each of the plurality of digital baseband signals by weighting the sequential array of digital signal samples.
11. The method of claim 1, wherein the plurality of digital baseband signals is represented by a received channel matrix.
12. The method of claim 11, wherein beamforming the plurality of digital baseband signals comprises performing a vector product of the received channel matrix and an equalization matrix.
13. The method of claim 12, wherein: estimating the wireless channel comprises providing a channel matrix; and the method further comprises determining the equalization matrix based on the channel matrix.
14. The method of claim 13, further comprising adjusting the equalization matrix to minimize a post-equalization mean square error of the plurality of digital baseband signals.
15. A radio access node, comprising: an antenna array; channel estimator circuitry coupled to the antenna array and configured to provide an estimate of a wireless channel associated with a plurality of digital baseband signals for the antenna array; finite-alphabet beamforming circuitry coupled to the antenna array and the channel estimator circuitry, wherein: the finite-alphabet beamforming circuitry is configured to beamform the plurality of digital baseband signals in accordance with the estimate of the wireless channel; and the finite-alphabet beamforming circuitry comprises a finite-alphabet precoder configured to beamform a sequential array of digital signal samples; and processing circuitry coupled to the finite-alphabet precoder and configured to determine a precoder matrix applied to the sequential array of digital signal samples.
16. The radio access node of claim 15, wherein: the finite-alphabet beamforming circuitry comprises a finite-alphabet equalizer configured to beamform the sequential array of digital signal samples; and the processing circuitry is coupled to the finite-alphabet equalizer and configured to determine an equalization matrix applied to the sequential array of digital signal samples.
17. The radio access node of claim 16, further comprising analog-to-digital circuitry configured to provide the sequential array of digital signal samples corresponding to antennas of the antenna array.
18. (canceled)
19. The radio access node of claim 15, further comprising digital-to-analog circuitry configured to provide analog signals to be transmitted by the antenna array from the sequential array of digital signal samples.
20. The radio access node of claim 15, wherein the finite-alphabet beamforming circuitry has a resolution of 8 bits or less.
21. The radio access node of claim 20, wherein the finite-alphabet beamforming circuitry has a resolution of 3 bits or less.
22. The radio access node of claim 15, wherein the antenna array comprises at least 64 antennas.
23. A radio access node, comprising: an antenna array; channel estimator circuitry coupled to the antenna array and configured to provide an estimate of a wireless channel associated with a plurality of digital baseband signals for the antenna array; and finite-alphabet beamforming circuitry coupled to the antenna array and the channel estimator circuitry, the finite-alphabet beamforming circuitry being configured to beamform the plurality of digital baseband signals in accordance with the estimate of the wireless channel and at a given resolution less than a resolution of the estimate of the wireless channel.
Description
BRIEF DESCRIPTION OF THE DRAWING FIGURES
[0013] The accompanying drawing figures incorporated in and forming a part of this specification illustrate several aspects of the disclosure, and together with the description serve to explain the principles of the disclosure.
DETAILED DESCRIPTION
[0041] The embodiments set forth below represent the necessary information to enable those skilled in the art to practice the embodiments and illustrate the best mode of practicing the embodiments. Upon reading the following description in light of the accompanying drawing figures, those skilled in the art will understand the concepts of the disclosure and will recognize applications of these concepts not particularly addressed herein. It should be understood that these concepts and applications fall within the scope of the disclosure and the accompanying claims.
[0042] It will be understood that, although the terms first, second, etc. may be used herein to describe various elements, these elements should not be limited by these terms. These terms are only used to distinguish one element from another. For example, a first element could be termed a second element, and, similarly, a second element could be termed a first element, without departing from the scope of the present disclosure. As used herein, the term “and/or” includes any and all combinations of one or more of the associated listed items.
[0043] It will be understood that when an element is referred to as being “connected” or “coupled” to another element, it can be directly connected or coupled to the other element or intervening elements may be present. In contrast, when an element is referred to as being “directly connected” or “directly coupled” to another element, there are no intervening elements present.
[0044] The terminology used herein is for the purpose of describing particular embodiments only and is not intended to be limiting of the disclosure. As used herein, the singular forms “a,” “an,” and “the” are intended to include the plural forms as well, unless the context clearly indicates otherwise. It will be further understood that the terms “comprises,” “comprising,” “includes,” and/or “including” when used herein specify the presence of stated features, integers, steps, operations, elements, and/or components, but do not preclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or groups thereof.
[0045] Unless otherwise defined, all terms (including technical and scientific terms) used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this disclosure belongs. It will be further understood that terms used herein should be interpreted as having a meaning that is consistent with their meaning in the context of this specification and the relevant art and will not be interpreted in an idealized or overly formal sense unless expressly so defined herein.
[0046] Finite-alphabet beamforming for multi-antenna wideband systems is provided. The combination of massive multi-user multiple-input multiple-output (MU-MIMO) technology and millimeter-wave (mmWave) communication enables unprecedentedly high data rates for radio frequency (RF) communications. In such systems, beamforming must be performed at extremely high rates over hundreds of antennas. For example, spatial equalization applies beamforming in the uplink to mitigate interference among user equipment (UEs) at a base station (BS). Conventional hardware designs of spatial equalizers in all-digital BSs, where each antenna is equipped with a pair of data converters, would entail prohibitively high power consumption and implementation costs. To address these issues, finite-alphabet equalization provides a new paradigm that restricts the entries of a spatial equalization matrix to low-resolution numbers, enabling high-throughput, low-power, and low-cost equalization hardware.
[0047] Similarly, precoding applies beamforming in the downlink to maximize the reception of a signal transmitted from a BS to a target UE. Finite-alphabet precoding can be applied in the downlink to similarly improve power and cost in precoding hardware.
[0048] To minimize the performance loss of finite-alphabet equalization, embodiments generate an equalization matrix using finite-alphabet minimum mean-square error (MMSE) equalization (FAME), which significantly outperforms a naïve quantization of a linear MMSE matrix. A similar approach is applied to replace part of linear Wiener filter (WF) precoding matrices with a finite-alphabet WF precoding (FAWP) matrix for the downlink. Efficient algorithms can be deployed to compute approximate solutions to the non-deterministic polynomial-time (NP)-hard FAME/FAWP problems defined herein. Through this, it is shown that for massive MU-MIMO mmWave systems, near-optimal error-rate performance can be achieved with equalization coefficients quantized to only 1-3 bits. In addition, very-large-scale integration (VLSI) results demonstrate reductions in equalization power and area by factors of at least 3.9× and 5.8×, respectively, over traditional approaches.
[0050] In an exemplary aspect, the radio access node 20 includes an antenna array 26 (e.g., having B antennas), which can be operated for MU-MIMO communication with multiple UEs 28 (e.g., U UEs 28). Accordingly, the radio access node 20 is configured to perform spatial equalization in uplink communications (e.g., communications from the UEs 28 to a core network 30 of the wireless communications system 18). The purpose of spatial equalization is to beamform received signals by collecting the signals from all U UEs 28 at B antennas of the antenna array 26, while suppressing inter-UE interference. After this beamforming, these signals may be further processed via baseband processing circuitry 32 and forwarded to the core network 30. In some embodiments, the radio access node 20 is further configured to perform precoding in downlink communications (e.g., from the core network 30 to the UEs 28). Precoding similarly beamforms signals to be transmitted to the UEs 28 to maximize the reception of a transmitted signal at its target UE 28.
[0051] In an exemplary aspect, the antenna array 26 of the radio access node 20 includes a large number of B antennas (e.g., at least B=64 antennas). As described above, for such a large antenna array 26, a conventional matrix-vector-product circuit consumes a large amount of power. Consequently, the finite-alphabet beamforming circuitry (e.g., the finite-alphabet equalizer 22 and/or the finite-alphabet precoder 24) provides more efficient spatial equalization in order to minimize power consumption and semiconductor area (which translate to system costs), while achieving high spectral efficiency.
[0052] The finite-alphabet equalizer 22 and the finite-alphabet precoder 24 beamform digital baseband signals based on a wireless channel estimation provided by channel estimator circuitry 34 using matrix-vector products. The matrix-vector products required for spatial equalization and precoding involve multiplications and additions, where the hardware multipliers dominate power and area. The area and delay of a hardware multiplier scales with O(mn) and O(log(max{m, n})), respectively, where m and n are the number of bits of each operand. Therefore, circuit area, delay, and power consumption (which is roughly proportional to circuit area) of a matrix-vector-product engine can be minimized by using a low number of bits to represent both operands.
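As a back-of-the-envelope illustration of the O(mn) area scaling described above, the following sketch compares a high-resolution multiplier with a low-resolution one under a hypothetical unit-area cost model; the function name and numbers are illustrative, not from the disclosure.

```python
# Hypothetical cost model: multiplier area scales as O(m*n)
# in the bit-widths m and n of the two operands.

def relative_multiplier_area(m_bits: int, n_bits: int) -> int:
    """Area of an m-bit x n-bit multiplier in arbitrary units (O(m*n) model)."""
    return m_bits * n_bits

# 16-bit x 16-bit high-resolution product vs. a 16-bit x 2-bit
# finite-alphabet product: 8x smaller under this model.
high_res = relative_multiplier_area(16, 16)   # 256 units
low_res = relative_multiplier_area(16, 2)     # 32 units
print(high_res / low_res)                     # 8.0
```

This is why reducing the resolution of the equalization-matrix operand alone already shrinks the matrix-vector-product engine substantially.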
[0053] In this regard, Section I below focuses on application of finite-alphabet equalization to uplink communications using the finite-alphabet equalizer 22. Section II adapts the finite-alphabet equalization approach to downlink communications using the finite-alphabet precoder 24.
Notation
[0054] Matrices and column vectors are represented with uppercase and lowercase boldface letters, respectively. The Hermitian transpose and the Frobenius norm of a matrix A are denoted by A.sup.H and ∥A∥.sub.F, respectively. The real part of a complex-valued matrix A is ℜ{A} and the imaginary part is ℑ{A}. The M×M identity matrix is denoted by I.sub.M. The kth entry and the ℓ.sub.2-norm of a vector a are a.sub.k and ∥a∥.sub.2, respectively; the entry-wise complex conjugate is denoted by a*. The kth standard basis vector is represented by e.sub.k. The signum function sgn(⋅) is defined as sgn(a)=+1 for a≥0 and sgn(a)=−1 for a<0 and is applied entry-wise to vectors. Expectation with respect to the random vector x is denoted by 𝔼.sub.x[⋅]. The set S.sub.+ contains all positive semidefinite matrices, and the set ℝ.sub.+ contains all the non-negative real numbers.
I. Finite-Alphabet Equalizer (Uplink)
[0056] To reduce power consumption and implementation costs of spatial equalization, the finite-alphabet equalizer 22 coarsely quantizes the coefficients of one or more spatial equalization matrices. In contrast to approaches that use low-resolution ADCs to quantize the received vector to be equalized, finite-alphabet equalization coarsely quantizes the entries of the spatial equalization matrix. While this approach appears straightforward, obtaining low-resolution finite-alphabet equalization matrices that achieve high spectral efficiency is a hard problem.
[0058] The finite-alphabet equalizer 22 uses a specific finite-alphabet equalization-matrix structure that enables one to reduce the complexity of a U×B matrix-vector product by using U×B low-resolution coefficients, while still being able to deliver a performance similar to conventional, high-resolution spatial equalization matrices, as further described in subsection A below. The so-called FAME problem is described in subsection B below, whose solution leads to finite-alphabet equalization matrices that minimize the post-equalization mean-square error (MSE). In subsection C, a range of algorithms are presented that approximate the NP-hard FAME problem—some of these algorithms achieve excellent performance even for 1-bit resolution; some require very low complexity.
[0059] In subsection D, the finite-alphabet equalization approach is evaluated for line-of-sight (LoS) and non-LoS mmWave channel models, which demonstrate the efficacy of FAME in terms of error-vector magnitude (EVM), beamforming capabilities, and uncoded bit error rate (BER). In addition, finite-alphabet equalization circuits are implemented for different numbers of bits in a 28 nanometer (nm) complementary metal-oxide-semiconductor (CMOS) process to demonstrate the effectiveness of FAME in practice.
A. System Model
[0061] Focusing on the uplink of the wireless communications system 18 (e.g., a massive MU-MIMO system), the following narrowband input-output relation can be defined:
y=Hs+n Equation 1
Here, yϵℂ.sup.B is the received signal vector at the radio access node 20, Hϵℂ.sup.B×U is the known uplink MIMO channel matrix, sϵS.sup.U is the transmit data vector, where S is the constellation set (e.g., 16-quadrature amplitude modulation (QAM)), and nϵℂ.sup.B is independent and identically distributed (i.i.d.) circularly-symmetric complex Gaussian noise with covariance matrix C.sub.n=𝔼.sub.n[nn.sup.H]=N.sub.0I.sub.B, i.e., variance N.sub.0 per complex entry. In what follows, it is assumed that the transmit signals of the UEs 28, s.sub.u, u=1, . . . , U, are i.i.d. zero mean with variance E.sub.s so that C.sub.s=𝔼.sub.s[ss.sup.H]=E.sub.sI.sub.U.
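The input-output relation of Equation 1 can be simulated directly; the following minimal numpy sketch uses illustrative dimensions (B=8, U=2) and QPSK in place of 16-QAM, with all variable names assumed for illustration only.

```python
import numpy as np

rng = np.random.default_rng(0)
B, U = 8, 2            # illustrative BS antenna and UE counts
Es, N0 = 1.0, 0.01     # symbol energy and noise variance per complex entry

# i.i.d. Rayleigh channel H (B x U) and a QPSK transmit vector s (U,)
H = (rng.standard_normal((B, U)) + 1j * rng.standard_normal((B, U))) / np.sqrt(2)
s = np.sqrt(Es / 2) * (rng.choice([-1, 1], U) + 1j * rng.choice([-1, 1], U))
n = np.sqrt(N0 / 2) * (rng.standard_normal(B) + 1j * rng.standard_normal(B))

y = H @ s + n          # Equation 1: received vector at the B antennas
print(y.shape)         # (8,)
```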
[0062] It should be understood that the input-output relation in Equation 1 is not only valid to model narrowband transmission, but can also be used to model the subcarriers of a wideband massive MU-MIMO system that uses orthogonal frequency-division multiplexing (OFDM) or single-carrier frequency-division multiple access (SC-FDMA). The theory and algorithms developed in the remainder of this disclosure can be generalized for systems with inter-symbol interference.
[0063] For this model, it is assumed that the channel remains constant over multiple symbol transmissions and, hence, can be estimated. For the mathematical derivations, it is further assumed that perfect channel state information is available at the radio access node 20. For systems in which the UEs 28 use an antenna array to perform transmit beamforming, the channel matrix H represents the joint effect of beamforming and the physical channel.
[0064] A key task of the radio access node 20 is to form estimates ŝϵℂ.sup.U of the transmitted data vector s. To develop methods that are computationally efficient and hardware friendly, embodiments focus on linear spatial estimators of the form ŝ=W.sup.Hy, where W.sup.Hϵℂ.sup.U×B is the L-MMSE equalization matrix that minimizes the post-equalization MSE defined as:
MSE=𝔼.sub.s,n[∥W.sup.Hy−s∥.sub.2.sup.2] Equation 2
Given the assumptions on the statistics of the transmit data and noise vectors, s and n:
MSE=E.sub.s∥W.sup.HH−I.sub.U∥.sub.F.sup.2+N.sub.0∥W.sup.H∥.sub.F.sup.2 Equation 3
[0065] Hence, the L-MMSE equalization matrix can be obtained by solving the following matrix least-squares problem:
W.sup.H=arg min.sub.{tilde over (W)}ϵℂ.sup.B×U∥I.sub.U−{tilde over (W)}.sup.HH∥.sub.F.sup.2+ρ∥{tilde over (W)}.sup.H∥.sub.F.sup.2 Equation 4
with regularization parameter ρ=N.sub.0/E.sub.s. This optimization problem has a closed-form solution given by:
W.sup.H=(ρI.sub.U+H.sup.HH).sup.−1H.sup.H Equation 5
which can be computed efficiently in hardware.
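The closed-form solution of Equation 5 is easy to check numerically; the sketch below uses illustrative dimensions (B=64, U=4) and an assumed regularization value ρ=10.sup.−3. With many antennas and little noise, W.sup.HH is close to the identity, which is what makes linear equalization effective.

```python
import numpy as np

def lmmse_matrix(H: np.ndarray, rho: float) -> np.ndarray:
    """Equation 5: W^H = (rho*I_U + H^H H)^(-1) H^H."""
    U = H.shape[1]
    # Solve the linear system instead of forming the explicit inverse.
    return np.linalg.solve(rho * np.eye(U) + H.conj().T @ H, H.conj().T)

rng = np.random.default_rng(1)
H = (rng.standard_normal((64, 4)) + 1j * rng.standard_normal((64, 4))) / np.sqrt(2)
WH = lmmse_matrix(H, rho=1e-3)

# W^H H should be close to I_U for a large array and small rho
print(np.round(np.abs(WH @ H), 2))
```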
[0066] Alternatively, the rows w.sub.u.sup.H, u=1, . . . , U, of the L-MMSE equalization matrix W.sup.H can be computed by solving:
w.sub.u=arg min.sub.{tilde over (w)}ϵℂ.sup.B∥e.sub.u−H.sup.H{tilde over (w)}∥.sub.2.sup.2+ρ∥{tilde over (w)}∥.sub.2.sup.2 Equation 6
This alternative formulation of the L-MMSE equalizer is useful in solving the FAME problem, as detailed in the next subsection.
B. FAME: Finite-Alphabet MMSE Equalization
[0068] Linear equalization in hardware requires the computation of an inner product ŝ.sub.u=⟨w.sub.u,y⟩=w.sub.u.sup.Hy per UE for every received vector y. As described above, executing even such simple computations at the bandwidth offered by mmWave systems can result in excessively large area and high power consumption. To reduce both the area and power consumption, embodiments of the finite-alphabet equalizer 22 reduce the numerical precision of the equalization vectors w.sub.u, u=1, . . . , U. In the extreme case where the entries of w.sub.u are quantized using 1 bit per real and imaginary component, an inner-product computation would only require additions and subtractions; this is significantly less costly (in area and power) than using high-precision multipliers.
[0069] However, reducing the numerical precision of the equalization vectors w.sub.u will affect the MSE and eventually the error-rate performance. Furthermore, quantization to, e.g., the finite alphabet X={+1+j, +1−j, −1+j, −1−j} will result in numerical-range issues, meaning that such matrices will not be able to represent large or small entries. To mitigate both of these issues, a principled way to perform equalization with finite-alphabet matrices is developed.
[0070] A U×B finite-alphabet equalization matrix is defined as follows:
V.sup.H=diag(β*)X.sup.H Equation 7
Here, βϵℂ.sup.U is a vector that consists of post-equalization scaling factors and X.sup.HϵX.sup.U×B is an equalization matrix with entries chosen from the finite alphabet X.
[0071] Embodiments of the finite-alphabet equalizer 22 use finite alphabets X of low cardinality and whose elements can be represented using a small number of bits (e.g., 8 bits or less). An example is the 1-bit finite alphabet X={+1+j, +1−j, −1+j, −1−j}, which uses 1-bit per real and imaginary component.
[0072] With Equation 7, the equalized received symbol for the uth UE 28 is given by:
ŝ.sub.u=v.sub.u.sup.Hy=β.sub.u*x.sub.u.sup.Hy Equation 8
where v.sub.u.sup.Hϵℂ.sup.1×B and x.sub.u.sup.HϵX.sup.1×B are the uth rows of the matrices V.sup.H and X.sup.H, respectively. The spatial equalization as in Equation 8 allows for efficient circuit implementations, especially for finite alphabets with low cardinality and regularly spaced elements. For such matrices, the inner product x.sub.u.sup.Hy can be implemented using low-resolution multipliers. As βϵℂ.sup.U, the post-equalization scaling operation by the scalar factor β.sub.u is performed using high-resolution multipliers. Nonetheless, this operation is executed only once per UE 28. Section I-D below illustrates that equalizer implementations which leverage finite-alphabet equalization matrices enable significant area and power savings.
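The claim that a 1-bit inner product x.sub.u.sup.Hy needs only additions and subtractions can be verified numerically; the sketch below expands each complex product into sign flips and additions explicitly (all values illustrative).

```python
import numpy as np

rng = np.random.default_rng(3)
B = 16
y = rng.standard_normal(B) + 1j * rng.standard_normal(B)

# A 1-bit equalization row x_u^H with entries from X = {+1+j, +1-j, -1+j, -1-j}
x_uH = rng.choice([1, -1], B) + 1j * rng.choice([1, -1], B)

# (a + jb)(c + jd) = (a*c - b*d) + j(a*d + b*c); with a, b = ±1 every
# product is just a sign flip, so the inner product reduces to add/subtract.
ref = x_uH @ y
manual = sum(
    (a * yr - b * yi) + 1j * (a * yi + b * yr)
    for a, b, yr, yi in zip(x_uH.real, x_uH.imag, y.real, y.imag)
)
print(np.isclose(ref, manual))   # True
```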
[0073] From the above equations, FAME is developed as a principled method to compute MSE-optimal finite-alphabet equalization matrices. Analogous to the derivation of the L-MMSE equalizer in Equation 6, FAME minimizes the post-equalization MSE:
FA−MSE=𝔼.sub.s,n[∥V.sup.Hy−s∥.sub.2.sup.2] Equation 9
with the difference that V.sup.H=diag(β*)X.sup.H is now a finite-alphabet equalization matrix as per Equation 7. From Equation 6, it follows that the rows v.sub.u.sup.H=β.sub.u*x.sub.u.sup.H, u=1, . . . , U, of such a FAME matrix can be computed by solving the following optimization problem:
{β.sub.u,x.sub.u}=arg min.sub.{tilde over (β)}ϵℂ,{tilde over (x)}ϵX.sup.B∥e.sub.u−H.sup.H{tilde over (β)}{tilde over (x)}∥.sub.2.sup.2+ρ∥{tilde over (β)}{tilde over (x)}∥.sub.2.sup.2 Equation 10
[0074] Intuitively, embodiments seek to find the finite-alphabet equalization vectors v.sub.u.sup.H=β.sub.u*x.sub.u.sup.H, u=1, . . . , U, that best mimic the infinite-precision L-MMSE equalizer.
[0075] For a fixed scaling factor β.sub.u, the FAME problem in Equation 10 is a closest vector problem, which is known to be NP-hard. For example, for a system with an antenna array 26 of B=256 antennas using a 1-bit finite-alphabet equalization matrix, solving the FAME problem using an exhaustive search would require one to evaluate the objective function in Equation 10 more than 10.sup.154 times for each UE 28. Clearly, without low-complexity algorithms, the FAME problem cannot be solved in practical massive MU-MIMO mmWave systems.
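The 10.sup.154 figure can be checked directly: with 1 bit per real and imaginary part, each of the B=256 entries takes one of four values, so the search space per UE has 4.sup.256 candidates.

```python
# Size of the 1-bit FAME search space per UE for B = 256 antennas:
# each entry takes one of 4 values (1 bit per real and imaginary part).
B = 256
candidates = 4 ** B
print(len(str(candidates)))   # 155 digits, i.e., more than 10^154
```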
[0076] Since the FAME problem in Equation 10 minimizes the cost function for two quantities at once, i.e., the scaling factor β.sub.u and the low-resolution vector x.sub.u, it is not obvious how to solve it efficiently. To derive computationally efficient algorithms in subsection C below, the following equivalent form of the FAME problem is used. The FAME problem in Equation 10 is equivalent to solving the following optimization problem for each UE u=1, . . . , U:
x.sub.u=arg min.sub.{tilde over (x)}ϵX.sup.B(∥H.sup.H{tilde over (x)}∥.sub.2.sup.2+ρ∥{tilde over (x)}∥.sub.2.sup.2)/|h.sub.u.sup.H{tilde over (x)}|.sup.2 Equation 11
where the associated optimal scaling factor is given by:
β.sub.u=(x.sub.u.sup.Hh.sub.u)/(∥H.sup.Hx.sub.u∥.sub.2.sup.2+ρ∥x.sub.u∥.sub.2.sup.2) Equation 12
[0077] This formulation of the FAME problem facilitates first finding the optimal vector x.sub.u using Equation 11 and then computing the associated optimal scaling factor β.sub.u using Equation 12. Note that Equation 12 models the MSE-optimal scaling factor β.sub.u for a given vector x.sub.u, regardless of how x.sub.u was computed.
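For toy dimensions, Equation 11 can be solved exactly by enumerating the 1-bit alphabet, after which Equation 12 gives the scaling factor; a minimal numpy sketch, with a function name and dimensions chosen for illustration only:

```python
import numpy as np
from itertools import product

def fame_exh_row(H, u, rho):
    """Exhaustively minimize the Equation-11 objective for UE u over the 1-bit
    alphabet {±1±j}^B, then compute the optimal scaling per Equation 12."""
    B = H.shape[0]
    h_u = H[:, u]
    best_gamma, best_x = np.inf, None
    for re in product([-1.0, 1.0], repeat=B):
        for im in product([-1.0, 1.0], repeat=B):
            x = np.array(re) + 1j * np.array(im)    # candidate x_u
            gamma = ((np.linalg.norm(H.conj().T @ x) ** 2
                      + rho * np.linalg.norm(x) ** 2)
                     / abs(h_u.conj() @ x) ** 2)
            if gamma < best_gamma:
                best_gamma, best_x = gamma, x
    # Equation 12: beta_u = x_u^H h_u / (||H^H x_u||^2 + rho*||x_u||^2)
    beta = (best_x.conj() @ h_u) / (np.linalg.norm(H.conj().T @ best_x) ** 2
                                    + rho * np.linalg.norm(best_x) ** 2)
    return beta, best_x, best_gamma

rng = np.random.default_rng(4)
B, U = 4, 2                 # tiny instance: 4^4 = 256 candidates per UE
H = (rng.standard_normal((B, U)) + 1j * rng.standard_normal((B, U))) / np.sqrt(2)
beta0, x0, gamma0 = fame_exh_row(H, u=0, rho=0.1)
print(gamma0 > 1.0)         # the optimal objective value always exceeds 1
```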
[0078] FL-MMSE: A Baseline Finite-Alphabet Equalizer: Since the FAME problem is NP-hard, a baseline method is presented to compute finite-alphabet equalization matrices as in Equation 7 without having to solve the FAME problem in Equation 11. This approach is referred to as FL-MMSE, as it obtains the entries of the low-resolution matrix X.sup.H by quantizing the L-MMSE equalizer in Equation 5. The corresponding scaling factors β.sub.u are then obtained using Equation 12. Throughout this disclosure, the FL-MMSE equalizer is used as a baseline method to evaluate the performance of embodiments of the finite-alphabet equalizer 22 that attempt to directly solve the FAME problem in Equation 11.
[0079] For the 1-bit case, FL-MMSE applies the signum function sgn(⋅) separately on the real and imaginary parts of the L-MMSE matrix W.sup.H to obtain X.sup.H. Then, FL-MMSE uses the optimal FAME scaling in Equation 12 to compute the high-resolution scaling factors in the vector β. FL-MMSE can also be used with finite alphabets that have more than 1 bit per complex entry. In such cases, after computing the L-MMSE equalization matrix W.sup.H in Equation 5, the real and imaginary parts are quantized as follows.
[0080] For each row w.sub.u.sup.H of W.sup.H, the scalar w.sub.max corresponding to the largest absolute value in [ℜ{w.sub.u.sup.H}, ℑ{w.sub.u.sup.H}] is identified. Then, assuming that the targeted resolution is r bits, the range [−w.sub.max, +w.sub.max] is divided into 2.sup.r uniform-width bins, and the entries of ℜ{w.sub.u.sup.H} and ℑ{w.sub.u.sup.H} are quantized to the centroid values of these bins. For 2-bit resolution, for example, the centroid values of the bins are {−0.75, −0.25, +0.25, +0.75}w.sub.max. In hardware, one would scale these centroid values so that the minimum absolute value corresponds to 1. Following the previous example, one would use the values {−3, −1, +1, +3} to represent the entries of ℜ{x.sub.u.sup.H} and ℑ{x.sub.u.sup.H}. Note that this scaling does not affect the solution of the FAME problem in Equation 11, as it can be absorbed into the scaling factor β.sub.u in Equation 12. After obtaining the low-resolution vector, the corresponding scaling factor β.sub.u is computed using Equation 12.
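The r-bit quantization step of paragraph [0080] can be sketched as follows; the function name and example row are illustrative.

```python
import numpy as np

def fl_mmse_quantize_row(w_uH: np.ndarray, r: int) -> np.ndarray:
    """Quantize real/imag parts of one L-MMSE row to 2^r uniform-width bins
    over [-w_max, +w_max], mapping each entry to its bin centroid."""
    parts = np.concatenate([w_uH.real, w_uH.imag])
    w_max = np.max(np.abs(parts))
    width = 2 * w_max / (2 ** r)

    def q(v):
        idx = np.clip(np.floor((v + w_max) / width), 0, 2 ** r - 1)
        return -w_max + (idx + 0.5) * width        # bin centroid

    return q(w_uH.real) + 1j * q(w_uH.imag)

# For r = 2 the centroids are {-0.75, -0.25, +0.25, +0.75} * w_max,
# matching the example in the text.
row = np.array([0.74 + 0.1j, -0.2 - 0.6j, 1.0 + 0.0j])
print(fl_mmse_quantize_row(row, r=2))
```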
[0083] While the infinite-precision L-MMSE equalizer achieves an EVM of 11.58%, quantizing its solution to 1-bit using FL-MMSE degrades the EVM to 30.58%, which blurs the decision regions of the considered 16-QAM constellation. In stark contrast, the 1-bit FAME-EXH equalizer achieves an EVM of only 15.30%, which is close to that of the infinite-precision L-MMSE equalizer; furthermore, the decision regions between constellation points are clearly visible. These results demonstrate the significant EVM advantage of solving the FAME problem over the simple FL-MMSE equalizer.
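For reference, one common definition of EVM (assumed here, as the disclosure does not spell out its formula) is the RMS error between equalized and transmitted symbols relative to the RMS symbol magnitude:

```python
import numpy as np

def evm_percent(s_hat: np.ndarray, s: np.ndarray) -> float:
    """EVM as RMS symbol error relative to RMS symbol magnitude (one common
    definition; other normalizations exist)."""
    return 100.0 * np.linalg.norm(s_hat - s) / np.linalg.norm(s)

# Illustrative QPSK symbols with a 10% perturbation per symbol
s = np.array([1 + 1j, -1 + 1j, 1 - 1j, -1 - 1j]) / np.sqrt(2)
s_hat = s + 0.1 * np.array([1, -1j, 1j, -1])
print(round(evm_percent(s_hat, s), 1))   # 10.0
```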
[0084] To illustrate the beamforming capability of finite-alphabet equalization, consider a LoS channel whose bth entry for a UE 28 at incident angle ϕ is modeled as:
h.sub.b(ϕ)=e.sup.−jπ(b−1)cos(ϕ),b=1, . . . ,B Equation 13
Here, a uniform linear array (ULA) of antennas with half-wavelength antenna spacing and constant path loss is assumed. A primary UE 28 is located at an angle of ϕ.sub.1=60° and a secondary UE 28 is located at ϕ.sub.2=120°. Next, the corresponding equalization matrix is computed using L-MMSE, FL-MMSE, and FAME-EXH equalization. To evaluate how much the equalization vector v.sub.1.sup.H (which corresponds to the UE at ϕ.sub.1=60°) captures (or rejects) signals incoming from different incident angles, |v.sub.1.sup.Hh(ϕ′)|.sup.2 is computed for 0≤ϕ′≤π. The equalization vector v.sub.1.sup.H should amplify the signal from the primary UE 28 at ϕ.sub.1=60° but attenuate the signal from the secondary UE 28.
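The LoS model of Equation 13 and the beam-pattern evaluation |v.sub.1.sup.Hh(ϕ′)|.sup.2 can be sketched numerically. The sketch below substitutes a simple matched-filter beam for the row v.sub.1.sup.H; this is an illustrative stand-in, not the L-MMSE/FL-MMSE/FAME-EXH vectors used in the disclosure.

```python
import numpy as np

def ula_channel(phi: float, B: int) -> np.ndarray:
    """Equation 13: h_b(phi) = exp(-j*pi*(b-1)*cos(phi)), b = 1..B."""
    b = np.arange(1, B + 1)
    return np.exp(-1j * np.pi * (b - 1) * np.cos(phi))

B = 64
h1 = ula_channel(np.deg2rad(60), B)      # primary UE at phi_1 = 60 deg
h2 = ula_channel(np.deg2rad(120), B)     # secondary UE at phi_2 = 120 deg

# Matched-filter beam (illustrative stand-in for v_1^H)
v1H = h1.conj() / B
gain_primary = abs(v1H @ h1) ** 2        # should be ~1 (amplified)
gain_secondary = abs(v1H @ h2) ** 2      # should be ~0 (rejected)
print(gain_primary > 100 * gain_secondary)
```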
[0086] Despite the significant performance advantages of 1-bit FAME-EXH over 1-bit FL-MMSE, solving 1-bit FAME-EXH for large-dimensional problems that arise in mmWave systems is infeasible in practice. To this end, low-complexity FAME solvers are developed that scale to large antenna arrays 26.
C. Fast Algorithms to Solve FAME
[0087] Approximate algorithms are presented to solve the FAME problem efficiently for a radio access node 20 with a large antenna array 26. This subsection begins by proposing a semidefinite relaxation (SDR)-based method and then develops a much faster method that uses forward-backward splitting (FBS).
[0088] FAME with SDR: First, SDR is used to solve the FAME problem in Equation 11 for a 1-bit finite alphabet. To do so, the FAME problem is re-expressed in the real domain using the quantities:
Throughout, it is assumed that ℜ{x} and ℑ{x} take values from the same alphabet. For example, for 1-bit finite alphabets, this alphabet is {−1, +1}. The FAME problem in Equation 11 can now be rewritten as:
[0089] It is now key to realize that the vector {tilde over (x)} can be scaled arbitrarily without changing the objective function of Equation 15. This observation enables an equivalent optimization problem to be stated:
where the discrete set Z.sub.α is a scaled version of the alphabet; for 1-bit finite alphabets, Z.sub.α={−α, +α} with α>0. This enables the formulation of a semidefinite program that solves the FAME problem approximately.
[0090] By focusing on 1-bit finite alphabets, Equation 16 can be relaxed by replacing the non-convex constraint on the entries of {tilde over (x)} with a constraint on a positive semidefinite matrix in S.sub.+.sup.2B that should approximate {tilde over (x)}{tilde over (x)}.sup.T. This SDR yields the semidefinite program in Equation 17, in which the diagonal elements of the positive semidefinite matrix are constrained to be equal (without specifying their value). This constraint is a result of the fact that a solution is sought in the set Z.sub.α, where the parameter α is not known. After solving the semidefinite program in Equation 17, the finite-alphabet vector is computed by first extracting the leading eigenvector of the solution matrix and then quantizing its entries to the finite alphabet.
[0091] While FAME-SDR can also be derived for multi-bit finite alphabets, this approach is not pursued for the following reasons. As described further below, the complexity of FAME-SDR does not scale well to a large number of BS antennas. Moreover, FAME-SDR cannot be applied to finite alphabets that are not separable into real and imaginary parts, such as a finite alphabet that contains the elements of an 8-phase shift keying (PSK) constellation. In addition, SDR can only handle finite alphabets with even cardinality that exclude a zero element. To avoid the drawbacks of SDR for FAME, an alternative approach is presented.
[0092] FAME with FBS: Due to the high complexity of FAME-SDR and the fact that SDR solvers are notoriously difficult to implement in hardware, a low-complexity alternative for solving the FAME problem approximately is presented next. To do so, it is assumed that, for each UE u=1, . . . , U, the optimal value of the objective in Equation 11 is known and denoted by γ.sub.u. Mathematically:
where x.sub.u is the solution to the problem in Equation 11. Note that it follows from Equation 18 that γ.sub.u>1. Rearranging Equation 18 yields:
0=∥H.sup.Hx.sub.u∥.sub.2.sup.2+ρ∥x.sub.u∥.sub.2.sup.2−γ.sub.u|h.sub.u.sup.Hx.sub.u|.sup.2 Equation 19
Thus, if γ.sub.u was known, solving the problem:
would yield the same solution as Equation 11. As the value of γ.sub.u is unknown in practice, it is used as an algorithm parameter that is tuned to empirically improve the error-rate performance.
[0093] Since the problem in Equation 20 still contains a search over the finite-alphabet X.sup.B, the non-convex constraint {tilde over (x)}ϵX.sup.B is relaxed to {tilde over (x)}ϵC.sup.B. Here, C corresponds to the convex hull of the finite alphabet X, which is defined as:
C={Σ.sub.i=1.sup.|X|α.sub.ix.sub.i:(α.sub.iϵℝ.sub.+,∀i)∧Σ.sub.i=1.sup.|X|α.sub.i=1}, Equation 21
where x.sub.i denotes the ith element of X. With this relaxation, the all-zeros vector becomes a trivial solution. To avoid it, the term −δ∥{tilde over (x)}∥.sub.2.sup.2 is added to the objective function, where δ>0 is a regularization parameter. The resulting optimization problem is given by:
[0094] FBS is used to compute a solution to Equation 22. FBS is an efficient, iterative solver for convex optimization problems of the form:
{circumflex over (x)}=arg min.sub.{tilde over (x)}ƒ({tilde over (x)})+g({tilde over (x)}) Equation 23
where both functions ƒ and g are convex, but ƒ is a smooth function and g is not necessarily smooth or bounded. FBS executes the following operations for t=1, 2, . . . , t.sub.max iterations or until convergence:
{tilde over (z)}.sup.(t+1)={tilde over (x)}.sup.(t)−τ.sup.(t)∇ƒ({tilde over (x)}.sup.(t)) Equation 24
{tilde over (x)}.sup.(t+1)=prox.sub.g({tilde over (z)}.sup.(t+1);τ.sup.(t)) Equation 25
Here, ∇ƒ({tilde over (x)}.sup.(t)) is the gradient of the function ƒ, {τ.sup.(t)>0} is a sequence of step sizes, and prox.sub.g(⋅) is the proximal operator of the function g, defined as:
prox.sub.g({tilde over (z)};τ)=arg min.sub.{tilde over (x)}{τg({tilde over (x)})+½∥{tilde over (x)}−{tilde over (z)}∥.sub.2.sup.2} Equation 26
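The FBS iteration in Equations 24-26 can be sketched in a few lines of code. The toy instance below (not from the disclosure) applies it to an ℓ1-regularized least-squares problem, whose proximal operator is the well-known soft-thresholding function; all names and parameter values are illustrative.

```python
import numpy as np

def fbs(grad_f, prox_g, x0, tau, t_max=500):
    """Forward-backward splitting (Equations 24-25)."""
    x = x0.copy()
    for _ in range(t_max):
        z = x - tau * grad_f(x)   # forward (gradient) step, Equation 24
        x = prox_g(z, tau)        # backward (proximal) step, Equation 25
    return x

# Toy instance: f(x) = 0.5*||Ax - b||^2 (smooth), g(x) = lam*||x||_1
# (non-smooth); prox_g is soft-thresholding, per Equation 26.
rng = np.random.default_rng(0)
A = rng.standard_normal((20, 5))
b = rng.standard_normal(20)
lam = 0.1
grad_f = lambda x: A.T @ (A @ x - b)
prox_g = lambda z, tau: np.sign(z) * np.maximum(np.abs(z) - tau * lam, 0.0)
tau = 1.0 / np.linalg.norm(A, 2) ** 2  # step size below 1/L for convergence
x_hat = fbs(grad_f, prox_g, np.zeros(5), tau)
```

The same split structure carries over to the non-convex FAME objective: only `grad_f` and `prox_g` change.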
[0095] The problem in Equation 22 is not convex and hence, FBS is not guaranteed to converge to an optimal solution. Nevertheless, FBS can be used to find approximate solutions to Equation 22 by setting:
ƒ({tilde over (x)})=½∥H.sup.H{tilde over (x)}∥.sub.2.sup.2−½γ.sub.u|h.sub.u.sup.H{tilde over (x)}|.sup.2 Equation 27
g({tilde over (x)})=½(ρ−δ)∥{tilde over (x)}∥.sub.2.sup.2+χ({tilde over (x)}) Equation 28
Here, the convex constraint {tilde over (x)}ϵC.sup.B in Equation 22 is incorporated into the function g({tilde over (x)}) via the indicator function χ({tilde over (x)}), which is zero if {tilde over (x)}ϵC.sup.B and infinity otherwise. With these definitions:
∇ƒ({tilde over (x)})=HH.sup.H{tilde over (x)}−γ.sub.uh.sub.uh.sub.u.sup.H{tilde over (x)} Equation 29
prox.sub.g({tilde over (z)})=sgn(ℜ{{tilde over (z)}})min{ν.sup.(t)|ℜ{{tilde over (z)}}|,1}+j sgn(ℑ{{tilde over (z)}})min{ν.sup.(t)|ℑ{{tilde over (z)}}|,1} Equation 30
where ν.sup.(t)=(1+τ.sup.(t)(ρ−δ)).sup.−1 and Equation 30 is applied element-wise to the vector {tilde over (z)}.
[0096] Note that three sets of algorithm parameters have been introduced: {τ.sup.(t)}, {ν.sup.(t)}, and {γ.sub.u}, where t=1, . . . , t.sub.max and u=1, . . . , U. In some examples, these parameters may be tuned manually. To avoid manual tuning of these parameters, other embodiments apply a neural-network-based approach. As the same algorithm parameters should work across several channel realizations, having a per-UE parameter such as {γ.sub.u} is impractical. As a result, embodiments set γ=γ.sub.u for u=1, . . . , U. Furthermore, to provide the neural network with greater flexibility during optimization, γ is allowed to be different in each iteration; i.e., another set of per-iteration parameters {γ.sup.(t)}, t=1, . . . , t.sub.max, is introduced. The resulting algorithm is referred to as FAME-FBS, which is summarized as follows:
TABLE-US-00001
Algorithm 1 (FAME-FBS). Initialize {tilde over (x)}.sup.(1) with either the maximum ratio combining (MRC) solution h.sub.u or the low-resolution vector x.sub.u computed by FL-MMSE, and fix the sets of parameters {τ.sup.(t)}, {ν.sup.(t)}, and {γ.sup.(t)}. Then, for every iteration t=1, . . . , t.sub.max, compute:
{tilde over (z)}.sup.(t+1)=(I.sub.B−τ.sup.(t)H(I.sub.U−γ.sup.(t)e.sub.ue.sub.u.sup.H)H.sup.H){tilde over (x)}.sup.(t) Equation 31
{tilde over (x)}.sup.(t+1)=prox.sub.g({tilde over (z)}.sup.(t+1)) Equation 32
Here, the proximal operator prox.sub.g(⋅) is the element-wise function given by Equation 30. The result {tilde over (x)}.sup.(t.sup.max.sup.+1) is quantized entry-wise to the finite alphabet to obtain x.sub.u.
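A minimal numerical sketch of Algorithm 1 follows, assuming a 1-bit alphabet and hand-picked values for τ, ν, γ, ρ, and δ; the disclosure instead tunes these parameters (e.g., with a neural network), so the function name and defaults below are illustrative only.

```python
import numpy as np

def fame_fbs(H, u, t_max=20, gamma=2.0, rho=0.1, delta=0.05):
    """Sketch of FAME-FBS (Algorithm 1) for UE u with a 1-bit alphabet.
    H is the B x U uplink channel matrix; tau, gamma, rho, delta are
    fixed heuristically here rather than learned."""
    B, U = H.shape
    tau = 1.0 / np.linalg.norm(H, 2) ** 2          # heuristic step size
    nu = 1.0 / (1.0 + tau * (rho - delta))         # per Equation 30
    e_u = np.zeros(U)
    e_u[u] = 1.0
    # Equation 31 iteration matrix: I_B - tau*H(I_U - gamma*e_u e_u^H)H^H
    M = np.eye(B) - tau * H @ (np.eye(U) - gamma * np.outer(e_u, e_u)) @ H.conj().T
    x = H[:, u].copy()                             # MRC initialization
    for _ in range(t_max):
        z = M @ x                                  # Equation 31
        x = (np.sign(z.real) * np.minimum(nu * np.abs(z.real), 1.0)
             + 1j * np.sign(z.imag) * np.minimum(nu * np.abs(z.imag), 1.0))  # Eq. 30
    # Quantize the final iterate to the 1-bit alphabet {±1 ± j}
    return np.sign(x.real) + 1j * np.sign(x.imag)

rng = np.random.default_rng(1)
H = rng.standard_normal((32, 4)) + 1j * rng.standard_normal((32, 4))
x_u = fame_fbs(H, u=0)
```

Note how the per-UE problems are independent, so this routine could be run in parallel for u=1, . . . , U, matching the parallelization observation in the text.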
[0097] FAME-FBS supports multi-bit finite alphabet equalization matrices. This is achieved by uniformly quantizing, in the range [−1, +1], the real and imaginary parts of the solution vector {tilde over (x)}.sup.(t.sup.max.sup.+1) produced by FAME-FBS.
[0098] Computational Complexity: The complexity of (i) computing the equalization matrix and (ii) performing equalization on a received vector y is assessed for both high-resolution and finite-alphabet equalization approaches. Computational complexity is measured as the number of real-valued multiplications performed by an algorithm.
[0099] Table I lists the computational complexity for computing a single equalization matrix using L-MMSE, FL-MMSE, FAME-SDR, and FAME-FBS. For the infinite-precision L-MMSE equalizer, the complexity corresponds to explicitly computing the equalization matrix W.sup.H. For the finite-alphabet equalizers (FL-MMSE and FAME-based algorithms), the complexity corresponds to the computation of the low-resolution matrix X.sup.H and the scaling factors in the vector β. Solving FAME-SDR results in the highest complexity, which asymptotically scales as O(B.sup.4.5) unless specific problem structures can be exploited. Evidently, FAME-SDR does not scale well to systems with a large antenna array 26. FAME-FBS has the same asymptotic scaling of O(BU.sup.2) as L-MMSE and FL-MMSE equalization, making it suitable for massive MU-MIMO mmWave systems.
TABLE-US-00002
TABLE I: Complexity for computing an equalization matrix
Algorithm | Computational complexity | Asymptotic scaling
L-MMSE | 2U.sup.3+6BU.sup.2−2BU−2U+1 | O(BU.sup.2)
FL-MMSE | 10BU.sup.2+2U.sup.3+2U.sup.2+U+1 | O(BU.sup.2)
FAME-SDR | n.a. | O(BU.sup.4.5)
FAME-FBS | (8t.sub.max+4)BU.sup.2+2U.sup.2+(4t.sub.max+2)U+(2t.sub.max+3)U | O(BU.sup.2)
[0100] While the constant associated with the term BU.sup.2 is larger for FAME-FBS than for L-MMSE and FL-MMSE, the complexity of the latter algorithms appears to be higher in practice. Computing the L-MMSE (and the FL-MMSE) equalizer in hardware requires square roots and divisions, which result in high numerical precision requirements. Furthermore, the Cholesky decomposition and forward- and back-substitution procedures required when computing the L-MMSE (and the FL-MMSE) equalization matrix result in stringent data dependencies that limit parallelism and, hence, reduce throughput. In contrast, FAME-FBS has a regular structure with few data dependencies and the matrix-vector multiplications can be parallelized easily. In addition, one can parallelize computation per UE as the FAME problem in Equation 11 is independent for u=1, . . . , U. In fact, a simple hardware engine could be used to efficiently execute FAME-FBS to determine the low-resolution equalization vectors x.sub.u.sup.H.
[0101] After computing the equalization matrix, one must perform spatial equalization on the received signal vectors y at the rate of the ADCs. For the infinite-precision L-MMSE equalizer, this corresponds to computing one high-resolution matrix-vector product ŝ=W.sup.Hy per receive vector. For finite-alphabet equalizers, this corresponds to a low-resolution matrix-vector product z=X.sup.Hy, followed by U high-resolution products ŝ.sub.u=β.sub.u*z.sub.u, u=1, . . . , U. The complexity of equalization is summarized in Table II, which distinguishes between high resolution and low resolution multiplications. While finite-alphabet equalization performs more multiplications than a conventional equalizer, most of these multiplications are performed at low resolution. Thus, for sufficiently low resolution, finite-alphabet equalization effectively reduces the complexity of spatial equalization.
TABLE-US-00003
TABLE II: Complexity of finite-alphabet equalization (real-valued multiplication count)
Equalization | High resolution | Low resolution
Traditional | 4BU | 0
Finite alphabet | 4U | 4BU
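The counts in Table II follow directly from the fact that one complex multiplication costs four real multiplications. A small helper (with illustrative naming) reproduces them:

```python
def equalization_mult_counts(B, U):
    """Real-valued multiplications per received vector y (Table II);
    one complex multiply costs 4 real multiplies."""
    traditional = {"high_res": 4 * B * U, "low_res": 0}           # s_hat = W^H y
    finite_alphabet = {"high_res": 4 * U, "low_res": 4 * B * U}   # z = X^H y, then beta_u* z_u
    return traditional, finite_alphabet

trad, fa = equalization_mult_counts(B=256, U=16)
```

For B=256 and U=16, the finite-alphabet equalizer trades 16384 high-resolution multiplications for 16384 low-resolution ones plus only 64 high-resolution scalings.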
[0102] While spatial equalization must be carried out at symbol rate, the computation of the equalization matrix must only be carried out if the channel matrix changes. Due to operation at extremely high bandwidths, the complexity of performing equalization will dominate in most mmWave systems. For scenarios with short coherence times, methods that minimize the complexity of computing the equalization matrix are to be preferred.
D. Evaluation
[0103]-[0109] (Error-rate and power-control simulation results for the proposed equalizers are illustrated in the accompanying figures.)
[0110] Hardware-Level Evaluation: To demonstrate the real-world benefits of finite-alphabet equalization, the power and area savings that can be attained in comparison with conventional, high-resolution equalizers are quantified. To arrive at a fair comparison between finite-alphabet equalization and conventional, high-resolution equalizers, two equalization circuits were implemented: one for finite-alphabet equalization and one for high-resolution equalization.
[0111] The high-resolution equalizer computes a matrix-vector product between the U×B equalization matrix W.sup.H and the received vector y. The matrix-vector product is computed in a column-by-column fashion by using a linear array of U parallel multiply-accumulate (MAC) units over B clock cycles. The multipliers in the MAC units are high-resolution and take as input 10-bit numbers from the equalization matrix W.sup.H and 7-bit numbers from the received vector y. The accumulators in the MAC units use 18 bits. Finally, 9 bits are taken from both real and imaginary accumulators as the outputs of each MAC unit. These outputs correspond to the estimates ŝ=W.sup.Hy.
[0112] The finite-alphabet equalizer computes a low-resolution matrix-vector product between the U×B finite-alphabet matrix X.sup.H and the received vector y. This matrix-vector product is implemented in the same way as in the traditional equalizer, with the difference that far fewer bits are used for the multipliers and accumulators. The multipliers take as input r-bit numbers from X.sup.H and 7-bit numbers from y, while the accumulators use r+13 bits (except for the case where r=1, where the accumulators use 13 bits). 9 bits are taken from the accumulators in each MAC unit as the output of the low-resolution matrix-vector product X.sup.Hy. Unlike conventional equalization, the entries of the U-dimensional vector X.sup.Hy are then scaled by the values in β. This scaling operation is implemented with a high-resolution multiplier that computes the product between the 9-bit x.sub.u.sup.Hy and the 10-bit scaling factor β.sub.u. The outputs of this multiplier are the estimates ŝ=V.sup.Hy, represented with 9 bits for each of the real and imaginary components.
[0113] Table III lists post-layout implementation results for the circuits discussed above, implemented in 28 nm CMOS technology for a system with B=256 BS antennas and U=16 UEs. The traditional, high-resolution equalizer corresponds to the design with an equalization resolution r of 10 bits, whereas the finite-alphabet equalizer was implemented for r={1, 2, . . . , 5} bits. To allow for a fair comparison between the different equalization circuits, a scenario is considered in which all of the designs support the same throughput. A throughput of 2 G (complex-valued) vectors/s is assumed, which implies that the 2B ADCs at the BS run at 2 G samples/s.
TABLE-US-00004
TABLE III: Implementation results in 28 nm CMOS for one equalizer instance operating in a system with B=256 and U=16
Equalization resolution r [bit] | 1 | 2 | 3 | 4 | 5 | 10
Silicon area [mm.sup.2] | 0.06 | 0.08 | 0.10 | 0.14 | 0.16 | 0.26
Clock freq. [GHz] | 1.33 | 1.25 | 1.25 | 1.16 | 1.16 | 1.05
Throughput [M vectors/s] | 5.18 | 4.88 | 4.88 | 4.53 | 4.53 | 4.10
Power [mW] | 18.5 | 29.2 | 38.8 | 42.6 | 51.3 | 57.1
[0114] As seen from Table III, a single instance of the equalizer design reaches throughputs on the order of M vectors/s, which is well below the target throughput of 2 G vectors/s. However, a time-multiplexed array of equalizers can be instantiated that achieves the desired throughput (at the expense of increased area).
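Assuming the per-instance throughputs reported in Table III, the number of parallel instances needed to reach the 2 G vectors/s target follows from simple division (the helper name is illustrative):

```python
import math

def instances_needed(per_instance_Mvps, target_Gvps=2.0):
    """Number of equalizer instances required to sustain the target
    throughput, given per-instance throughput from Table III."""
    return math.ceil(target_Gvps * 1e9 / (per_instance_Mvps * 1e6))

# e.g., the r = 1 bit design delivers 5.18 M vectors/s per instance
n_r1 = instances_needed(5.18)
```

For example, the r=1 design needs 387 instances while the r=10 design needs 488, so the low-resolution design's per-instance area and power advantages compound at the array level.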
[0115]-[0116] (Power and area comparisons for the example equalizer implementations are illustrated in the accompanying figures.)
[0117] Note that the power and area can be reduced even further. Once the number of bits in the equalization matrix has been reduced to 5 bits or fewer, emerging processing-in-memory architectures lower the area and power (in addition to the savings above) by about 2× to 4×.
E. Soft-Output Finite-Alphabet Equalization
[0118] In some embodiments, the finite-alphabet equalization approach described above is extended by unbiased estimation and soft-output computation. A compact expression of the post-equalization MSE is derived, which can be used to efficiently compute log-likelihood ratio (LLR) values. The effectiveness of this extension is demonstrated by error-rate simulation results for a coded massive MU-MIMO-OFDM system, for two unbiased soft-output finite-alphabet equalizers, both in LoS and non-LoS mmWave channel scenarios.
[0119] This subsection focuses on a massive MU-MIMO embodiment of the wireless communications system 18, in which W.sup.Hϵℂ.sup.U×B is an L-MMSE equalization matrix, which minimizes the MSE defined by:
MSE=𝔼.sub.s,n[∥s−W.sup.Hy∥.sub.2.sup.2] Equation 33
[0120] Under the statistical assumptions on s and n of Section I-A above, the L-MMSE equalization matrix is given by:
W.sup.H=(ρI.sub.U+H.sup.HH).sup.−1H.sup.H Equation 34
where ρ=N.sub.0/E.sub.s. The rows w.sub.u.sup.H, u=1, . . . , U, of the L-MMSE equalizer W.sup.H can be computed by solving:
w.sub.u=arg min.sub.{tilde over (w)}∥e.sub.u−H.sup.H{tilde over (w)}∥.sub.2.sup.2+ρ∥{tilde over (w)}∥.sub.2.sup.2 Equation 35
[0121] Spatial equalization with a biased L-MMSE estimate for each user u=1, . . . , U amounts to computing:
where h.sub.u is the uth column of H and ñ.sub.u=Σ.sub.i=1,i≠u.sup.Uh.sub.is.sub.i+n is the noise-plus-interference (NPI) vector. In general, the L-MMSE equalizer has rows for which w.sub.u.sup.Hh.sub.u≠1. Thus, to perform unbiased estimation, the goal is to compute the estimates for each UE u=1, . . . , U as follows:
In general, the biased estimates differ from the unbiased estimates of Equation 37 by the per-UE scaling factor (w.sub.u.sup.Hh.sub.u).sup.−1.
[0122] While the discussion of subsections A-C above focuses on hard-output data detection, coded communication systems benefit from spatial equalizers that compute soft outputs. To fully exploit forward error correction, the post-equalization NPI variance is extracted and then used to generate LLR values. For the uth UE, the NPI variance is given by the MSE of the unbiased estimate ŝ.sub.u, which is computed as follows:
Here, (a) follows from Equation 37 and (b) from Equation 12. Note that this result applies to any finite-alphabet equalizer as in Equation 7, as long as β.sub.u(x.sub.u) is computed as in Equation 12.
[0123] With this, soft outputs can be computed in the form of LLR values, by assuming that the residual error ŝ.sub.u−s.sub.u is circularly-symmetric Gaussian with variance ν.sub.u.sup.2. Concretely, the LLR values are computed as follows:
[0124] Here, S.sub.q.sup.(1) and S.sub.q.sup.(0) are the subsets of the constellation S in which the qth bit is 1 and 0, respectively. Note that computing soft outputs for finite-alphabet equalizers entails the same complexity as for infinite-precision L-MMSE.
[0125] The FL-MMSE and FAME-FBS algorithms described above can be used to obtain the rows x.sub.u.sup.H of X.sup.H. For both algorithms, once x.sub.u.sup.H is known, the associated β.sub.u(x.sub.u) is computed using Equation 12; this factor is required to compute the variance ν.sub.u.sup.2 using Equation 42, which is then used to compute LLR values with Equation 43.
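As a hedged illustration of the LLR computation described above, the sketch below evaluates exact LLRs for a Gray-mapped QPSK constellation; the constellation, bit labeling, and the availability of the NPI variance from Equation 42 are all assumptions made for brevity.

```python
import numpy as np

# Gray-mapped QPSK, used purely for illustration.
QPSK = np.array([1 + 1j, -1 + 1j, -1 - 1j, 1 - 1j]) / np.sqrt(2)
BITS = np.array([[0, 0], [0, 1], [1, 1], [1, 0]])  # bit labels per symbol

def llrs(s_hat, nu2):
    """LLRs for one unbiased estimate s_hat with NPI variance nu2,
    assuming circularly-symmetric Gaussian residual error
    (the structure of Equation 43)."""
    metrics = np.exp(-np.abs(s_hat - QPSK) ** 2 / nu2)
    out = []
    for q in range(BITS.shape[1]):
        p1 = metrics[BITS[:, q] == 1].sum()   # symbols whose qth bit is 1
        p0 = metrics[BITS[:, q] == 0].sum()   # symbols whose qth bit is 0
        out.append(np.log(p1 / p0))
    return np.array(out)

llr = llrs(QPSK[0], nu2=0.1)  # estimate lands on the all-zeros symbol
```

Since the per-UE variance ν.sub.u.sup.2 already captures the finite-alphabet equalizer's residual error, this routine is identical for L-MMSE and finite-alphabet equalization, matching the complexity remark above.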
II. Finite-Alphabet Precoder (Downlink)
[0128] Similar to the case of equalization in the uplink, the power consumption and silicon area of precoding in the all-digital mmWave MU-MIMO downlink (e.g., from the core network 30 to the UEs 28) can be reduced by restricting the entries of the precoding matrix to a finite alphabet.
A. System Model
[0129] A model is presented which focuses on the downlink of a mmWave massive MU-MIMO embodiment of the wireless communications system 18, with input-output relation y=Hx+n, where yϵℂ.sup.U is the received vector, Hϵℂ.sup.U×B is the channel matrix (e.g., received channel matrix, which may include multiple received channel matrices), xϵℂ.sup.B is the precoded vector, and nϵℂ.sup.U is i.i.d. circularly-symmetric complex Gaussian noise with variance N.sub.0 per complex entry. It is assumed that the channel matrix H is perfectly known to the radio access node 20, and that the precoded vector x is subject to the following average power constraint:
𝔼.sub.x[∥x∥.sub.2.sup.2]≤P Equation 44
[0130] The goal of precoding is to simultaneously transmit constellation points s.sub.uϵS to the u=1, . . . , U UEs while reducing MU interference. Here, s.sub.u is assumed to have zero mean and variance E.sub.s, and S denotes the constellation set (e.g., 16-QAM). The radio access node 20 maps the vector s into the precoded vector x with the aid of channel state information (e.g., received from the channel estimator circuitry 34). The precoded vector x is crafted such that the UEs 28 can form an estimate ŝ.sub.uϵℂ of the transmitted symbol s.sub.u simply by scaling the received signal y.sub.u. Specifically, it is assumed that each UE forms an estimate as ŝ.sub.u=βy.sub.u. Here, βϵℝ.sub.+ is a precoding factor that can be estimated at the UE using pilot-based transmission.
[0131] This section focuses on linear precoders for which it holds that x=Ps, where Pϵℂ.sup.B×U is the precoding matrix. Thus, some embodiments use linear precoders that attempt to minimize the MSE between the estimated symbols ŝ and the transmitted symbols s:
MSE=𝔼.sub.s,n[∥s−ŝ∥.sub.2.sup.2]=𝔼.sub.s,n[∥s−βHx∥.sub.2.sup.2]+β.sup.2UN.sub.0 Equation 45
Minimizing Equation 45 subject to the power constraint in Equation 44 results in the so-called WF precoder P.sup.WF=β.sup.−1Q.sup.WF, where the matrix Q.sup.WF is given by:
Q.sup.WF=(H.sup.HH+κ.sup.WFI.sub.B).sup.−1H.sup.H, κ.sup.WF=UN.sub.0/P Equation 46
with:
β=√{square root over (E.sub.s∥Q.sup.WF∥.sub.F.sup.2/P)} Equation 47
[0132] It is important to realize that the matrix Q.sup.WFϵℂ.sup.B×U in Equation 46 is the solution of the following optimization problem:
Q.sup.WF=arg min.sub.{tilde over (Q)}∥I.sub.U−H{tilde over (Q)}∥.sub.F.sup.2+κ.sup.WF∥{tilde over (Q)}∥.sub.F.sup.2 Equation 48
The columns q.sub.u.sup.WFϵℂ.sup.B, u=1, . . . , U, of the matrix Q.sup.WF can also be obtained by solving:
q.sub.u.sup.WF=arg min.sub.{tilde over (q)}∥e.sub.u−H{tilde over (q)}∥.sub.2.sup.2+κ.sup.WF∥{tilde over (q)}∥.sub.2.sup.2 Equation 49
Applying the Woodbury identity to Equation 46 yields:
Q.sup.WF=H.sup.H(HH.sup.H+κ.sup.WFI.sub.U).sup.−1 Equation 50
which is the solution to the following optimization problem:
Q.sup.WF=arg min.sub.{tilde over (Q)}∥I.sub.B−{tilde over (Q)}H∥.sub.F.sup.2+κ.sup.WF∥{tilde over (Q)}∥.sub.F.sup.2 Equation 51
Thus, the rows q.sub.b.sup.r,WF, b=1, . . . , B, of Q.sup.WF (where the superscript r denotes a row vector) can be computed as:
q.sub.b.sup.r,WF=arg min.sub.{tilde over (q)}.sup.r∥e.sub.b.sup.H−{tilde over (q)}.sup.rH∥.sub.2.sup.2+κ.sup.WF∥{tilde over (q)}.sup.r∥.sub.2.sup.2 Equation 52
[0133] The alternative optimization problems in Equation 49 and Equation 52 to compute the matrix Q.sup.WF will become useful in the next subsection.
B. FAWP: Finite-Alphabet WF Precoding
[0134] WF precoding computes x=β.sup.−1Q.sup.WFs for each transmitted vector s. Unfortunately, digital precoding circuitry will be power hungry and large, as mmWave MU-MIMO systems operate with high-dimensional data at extremely high sampling rates. As a remedy, FAWP proposes to represent the matrix Q.sup.WF using coarsely quantized numbers, with the objective of reducing the hardware complexity of the matrix-vector product Q.sup.WFs. Unfortunately, a direct quantization of the matrix Q.sup.WF typically leads to a significant error-rate degradation.
[0135] In order to design low-resolution matrices that are WF-optimal, i.e., that best mimic the infinite-precision WF-precoding matrix Q.sup.WF, finite-alphabet matrices are used, similar to those described above for spatial equalization in the mmWave MU-MIMO uplink. Since finite-alphabet matrices are applied to imitate the WF-precoding matrix Q.sup.WF, they are referred to herein as FAWP matrices. FAWP matrices introduce a few high-resolution scaling factors that help to bring a low-resolution matrix to the right scale. Two distinct FAWP matrix structures, namely pre-FAWP and post-FAWP matrices, are considered here.
[0136] Pre-FAWP Matrix: A pre-FAWP matrix is defined as a B×U matrix with the structure:
Q=A diag(α*) Equation 53
where AϵX.sup.B×U is a low-resolution matrix with entries taken from the finite alphabet X and αϵℂ.sup.U is a vector with per-UE scaling factors.
[0137] By using a pre-FAWP matrix, the matrix-vector product Qs becomes A(diag(α*)s). Such a matrix is called pre-FAWP as the U entries of the transmitted symbol vector s are scaled by the entries of α* before getting multiplied with the matrix A. Pre-FAWP reduces hardware complexity of Qs since the matrix A has low-resolution entries. Consider, for example, the case in which the entries of A are chosen from the 1-bit alphabet X={±1±j}; multiplying this matrix A with the vector diag(α*)s does not require hardware multipliers, but only adders and subtractors.
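The multiplier-free structure of the 1-bit pre-FAWP product can be sketched as follows; numpy matrix products stand in for what would be pure add/subtract networks in hardware, and all names are illustrative.

```python
import numpy as np

def pre_fawp_apply(A, alpha, s):
    """x = A @ (diag(conj(alpha)) @ s) for a pre-FAWP matrix (Equation 53).
    With A in {±1 ± j}^(B x U), the products with A.real and A.imag are
    sign flips only; matrix products are used here purely for clarity."""
    v = np.conj(alpha) * s                # U high-resolution scalings
    Ar, Ai = A.real, A.imag               # entries in {-1, +1}
    out_re = Ar @ v.real - Ai @ v.imag    # add/subtract network in hardware
    out_im = Ar @ v.imag + Ai @ v.real
    return out_re + 1j * out_im

rng = np.random.default_rng(3)
A = rng.choice([-1.0, 1.0], (8, 3)) + 1j * rng.choice([-1.0, 1.0], (8, 3))
alpha = rng.standard_normal(3) + 1j * rng.standard_normal(3)
s = rng.standard_normal(3) + 1j * rng.standard_normal(3)
x = pre_fawp_apply(A, alpha, s)
```

The decomposition into real and imaginary sign patterns makes explicit why only U high-resolution multiplications survive in the pre-FAWP datapath.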
[0138] To calculate pre-FAWP matrices that are WF-optimal, the problem in Equation 49 is solved by assuming that Q has the form given by Equation 53. By doing so, the following procedure is arrived at:
[0139] The problem in Equation 48 is equivalent to solving the following optimization problem for each UE u=1, . . . , U:
Here, a.sub.u is the uth column of A, h.sub.u.sup.r is the uth row of H, and the associated optimal scaling factor is given by:
[0140] Equation 54 can be established by first plugging Equation 53 into Equation 49. Then, Equation 55 is obtained by taking the Wirtinger derivative with respect to α.sub.u.
[0141] Post-FAWP Matrix: A post-FAWP matrix is defined as a B×U matrix with the structure:
Q=diag(ζ)Z.sup.H Equation 56
where ZϵX.sup.U×B is a low-resolution matrix with entries taken from the finite alphabet X and ζϵℂ.sup.B is a vector with per-BS-antenna scaling factors.
[0142] By using a post-FAWP matrix, the matrix-vector product Qs becomes diag(ζ)(Z.sup.Hs). Such a matrix is called post-FAWP as the B scaling factors in ζ are applied after multiplying the matrix Z.sup.H with the vector s. Post-FAWP reduces the hardware complexity of Qs since the B×U matrix-vector product Z.sup.Hs can be implemented using exclusively low-resolution arithmetic units. The results of Z.sup.Hs are then entry-wise scaled by ζ, which requires only B high-resolution scalar multiplications.
[0143] Akin to the case of pre-FAWP matrices, post-FAWP matrices that are WF-optimal are obtained by solving the problem in Equation 52 with a matrix Q that has the form given in Equation 56. By doing so, the following procedure is arrived at:
[0144] The problem in Equation 51 is equivalent to solving the following optimization problem for each BS antenna b=1, . . . , B:
Here, z.sub.b is the bth column of Z, h.sub.b is the bth column of H, and the associated optimal scaling factor is given by:
[0145] In summary, both pre-FAWP and post-FAWP matrices are composed of a low-resolution matrix and a set of scaling factors. The difference is that a pre-FAWP matrix applies its U scaling factors before the multiplication with the low-resolution matrix, whereas a post-FAWP matrix applies its B scaling factors after the matrix multiplication. As B>>U in typical massive MU-MIMO systems, a pre-FAWP matrix performs fewer high-resolution scaling operations than a post-FAWP matrix. However, the matrix-vector product is simpler with a post-FAWP matrix than with a pre-FAWP matrix, since the vector being multiplied has a lower resolution (the symbols in s have not yet been scaled). Thus, neither pre-FAWP nor post-FAWP matrices have a clear advantage over the other in terms of hardware complexity. Nonetheless, both FAWP matrix structures are expected to reduce hardware complexity when compared to traditional precoding, as the low-resolution matrices in both structures have coarsely quantized entries.
[0146] Computing FAWP Matrices: Different methods are proposed to compute pre-FAWP and post-FAWP matrices defined in Equation 53 and Equation 56, respectively. Means to estimate the precoding factor β are also discussed.
[0147] For pre-FAWP and post-FAWP matrices, the scaling factors are computed by means of Equation 55 and Equation 58, respectively, regardless of how the low-resolution matrix (A for pre-FAWP and Z for post-FAWP) is computed. Instead of solving the problems in Equation 54 or Equation 57, a simple approach is to directly quantize the infinite-precision matrix Q.sup.WF. This approach is called FAWP-WF; more specifically, pre-FAWP-WF and post-FAWP-WF when applied to pre-FAWP and post-FAWP matrices, respectively.
[0148] Q.sup.WF is quantized following the method put forward in Section I. For pre-FAWP-WF, the maximum value w.sub.max over the entries of |ℜ{q.sub.u.sup.WF}| and |ℑ{q.sub.u.sup.WF}| is found for each column q.sub.u.sup.WF of Q.sup.WF. The range [−w.sub.max, w.sub.max] is then divided into uniform-width bins, where each bin is represented by its centroid value. The centroid values are scaled by the same factor so that they are integer numbers, which preserves the objective value in Equation 54 and results in the low-resolution entries of the column a.sub.u. For post-FAWP-WF, the same procedure is applied on a per-row basis: each quantized row of Q.sup.WF corresponds to one row of Z.sup.H.
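A possible reading of the pre-FAWP-WF quantizer is sketched below; the choice of 2.sup.r uniform bins and the particular integer-centroid scaling are assumptions consistent with, but not explicit in, the text.

```python
import numpy as np

def pre_fawp_wf_quantize(Q, r):
    """Per-column uniform quantization of Q^WF with 2^r bins over
    [-w_max, w_max] (an assumed bin count).  Centroids are scaled so
    they land on the odd integers ±1, ±3, ..., ±(2^r - 1)."""
    n_bins = 2 ** r
    A = np.empty_like(Q)
    for u in range(Q.shape[1]):
        q = Q[:, u]
        w_max = max(np.abs(q.real).max(), np.abs(q.imag).max())

        def quant(v):
            # bin index k in {0, ..., n_bins-1}, then integer centroid 2k+1-n_bins
            k = np.clip(np.floor((v + w_max) / (2 * w_max) * n_bins), 0, n_bins - 1)
            return 2 * k + 1 - n_bins

        A[:, u] = quant(q.real) + 1j * quant(q.imag)
    return A

rng = np.random.default_rng(4)
Q = rng.standard_normal((16, 4)) + 1j * rng.standard_normal((16, 4))
A1 = pre_fawp_wf_quantize(Q, r=1)   # 1-bit: entries in {±1 ± j}
```

Applying the same routine to the rows of Q.sup.WF (i.e., to Q.conj().T column-wise) would yield the post-FAWP-WF matrix Z.sup.H.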
[0149] Since the problems in Equation 54 and Equation 57 are NP-hard, FAWP-WF significantly reduces complexity. Concretely, FAWP-WF requires the same complexity of O(BU.sup.2) as computing the infinite-precision Q.sup.WF in Equation 50. As a result, FAWP-WF is used as a baseline to evaluate the performance of the algorithm proposed next, which tackles the problems in Equation 54 and Equation 57.
[0150] FAWP via Forward-Backward Splitting (FBS): Similar to finite-alphabet equalization matrices in Section I, the FAWP problems in Equation 54 and Equation 57 can also be approximately solved using FBS, an approach dubbed FAWP-FBS. Pre-FAWP-FBS is presented as an algorithm for computing the low-resolution part of a pre-FAWP matrix starting from the problem in Equation 54. The algorithm for post-FAWP matrices, dubbed post-FAWP-FBS, can be derived in a similar way starting from Equation 57.
[0151] As in Section I, it is assumed that the optimal objective value γ.sub.u of Equation 54, u=1, . . . , U, is known. Then, solving the problem in Equation 54 is equivalent to solving the following problem:
As γ.sub.u is unknown, we will use it as a parameter that can be tuned to empirically improve the performance of our algorithm.
[0152] Next, the finite-alphabet constraint ãϵX.sup.B in Equation 59 is relaxed to ãϵC.sup.B, where C represents the convex hull of X. By doing so, the all-zeros vector 0.sub.B×1 becomes a trivial solution. To avoid this solution, the term −δ∥ã∥.sub.2.sup.2, with δ>0, is included in Equation 59 to encourage large entries in the vector ã. The resulting optimization problem is:
[0153] Now FBS can be applied. FBS is an efficient procedure for solving convex problems of the form â=arg min.sub.ãƒ(ã)+g(ã), where both functions ƒ and g are convex, but ƒ is smooth and g is not necessarily smooth or bounded. FBS is an iterative method that runs for t.sub.max iterations or until convergence. In each iteration t, FBS computes:
{tilde over (v)}.sup.(t+1)=ã.sup.(t)−τ.sup.(t)∇ƒ(ã.sup.(t)) Equation 61
ã.sup.(t+1)=prox.sub.g({tilde over (v)}.sup.(t+1);τ.sup.(t)) Equation 62
[0154] where ∇ƒ(ã.sup.(t)) is the gradient of the function ƒ and {τ.sup.(t)>0} is a sequence of step sizes. The proximal operator of the function g is defined as prox.sub.g({tilde over (v)};τ)=arg min.sub.ã{τg(ã)+½∥ã−{tilde over (v)}∥.sub.2.sup.2}.
[0155] Since the problem in Equation 60 is non-convex, FBS is not guaranteed to converge to an optimal solution. Nevertheless, FBS is used to approximately solve Equation 60 by setting:
ƒ(ã)=½∥Hã∥.sub.2.sup.2−½γ.sub.u|h.sub.u.sup.rã|.sup.2 Equation 63
g(ã)=½(κ.sup.WF−δ)∥ã∥.sub.2.sup.2+χ(ã) Equation 64
where χ(ã) is the indicator function, which is zero if ãϵC.sup.B and infinity otherwise. The indicator function is used to incorporate the convex constraint ãϵC.sup.B in Equation 60 into the function g(ã). These choices for ƒ(ã) and g(ã) result in:
∇ƒ(ã)=H.sup.HHã−γ.sub.u(h.sub.u.sup.r).sup.Hh.sub.u.sup.rã Equation 65
prox.sub.g({tilde over (v)})=sgn(ℜ{{tilde over (v)}})min{ν.sup.(t)|ℜ{{tilde over (v)}}|,1}+j sgn(ℑ{{tilde over (v)}})min{ν.sup.(t)|ℑ{{tilde over (v)}}|,1} Equation 66
where ν.sup.(t)=(1+τ.sup.(t)(κ.sup.WF−δ)).sup.−1 and Equation 66 is applied element-wise to {tilde over (v)}. Pre-FAWP-FBS can be summarized as follows:
TABLE-US-00005
Algorithm 2 (Pre-FAWP-FBS). Initialize ã.sup.(1) with either the maximum-ratio transmission (MRT) solution (h.sub.u.sup.r).sup.H or the pre-FAWP-WF solution a.sub.u.sup.WF, and fix the sets of parameters {τ.sup.(t)}, {ν.sup.(t)}, and {γ.sup.(t)}. Then, for every iteration t=1, . . . , t.sub.max, compute:
{tilde over (v)}.sup.(t+1)=(I.sub.B−τ.sup.(t)H.sup.H(I.sub.U−γ.sup.(t)e.sub.ue.sub.u.sup.H)H)ã.sup.(t) Equation 67
ã.sup.(t+1)=prox.sub.g({tilde over (v)}.sup.(t+1)) Equation 68
The result ã.sup.(t.sup.max.sup.+1) is quantized entry-wise to the finite alphabet to obtain a.sub.u.
[0156] To tune the algorithm parameters {τ.sup.(t)}, {ν.sup.(t)}, and {γ.sup.(t)}, some embodiments use a neural-network-based approach. Note that γ.sub.u is replaced with γ.sup.(t) in Algorithm 2 in order to (i) keep the algorithm general for different user locations and (ii) to increase flexibility during optimization.
[0157] Post-FAWP-FBS is now summarized, which can be derived following similar steps as for the derivation of pre-FAWP-FBS.
TABLE-US-00006
Algorithm 3 (Post-FAWP-FBS). Initialize {tilde over (z)}.sup.(1) with either the MRT solution h.sub.b or the post-FAWP-WF solution z.sub.b.sup.WF, and fix the sets of parameters {τ.sup.(t)}, {ν.sup.(t)}, and {γ.sup.(t)}. Then, for every iteration t=1, . . . , t.sub.max, compute:
{tilde over (v)}.sup.(t+1)=(I.sub.U−τ.sup.(t)H(I.sub.B−γ.sup.(t)e.sub.be.sub.b.sup.H)H.sup.H){tilde over (z)}.sup.(t) Equation 69
{tilde over (z)}.sup.(t+1)=prox.sub.g({tilde over (v)}.sup.(t+1)) Equation 70
The result {tilde over (z)}.sup.(t.sup.max.sup.+1) is quantized entry-wise to the finite alphabet to obtain z.sub.b.
[0158] We note that both FAWP-FBS algorithms have the same complexity order of O(BU.sup.2) as WF and FAWP-WF.
[0159] While the BS is able to compute the precoding factor β via Equation 47 with a FAWP matrix Q instead of Q.sup.WF, the UEs need to estimate this precoding factor in order to correctly estimate the transmitted symbols in s. Estimation can be achieved in a block-fading scenario by transmitting a pilot symbol that is known at the UE side. Specifically, the BS will transmit the pilot s.sub.u=√{square root over (E.sub.s)}, u=1, . . . , U. Then, the uth UE will receive y.sub.u=β.sup.−1h.sub.u.sup.rq.sub.us.sub.u+ĕ.sub.u+n.sub.u, where ĕ.sub.u represents residual interference from the other UEs. The objective now is for the UE to find a {circumflex over (β)}.sub.uϵℝ.sub.+ such that it generates an unbiased estimate ŝ.sub.u of s.sub.u, i.e., ŝ.sub.u={circumflex over (β)}.sub.uy.sub.u≈s.sub.u.
[0160] By taking into account that the transmitted pilot symbol s.sub.u is known to be √{square root over (E.sub.s)} and by assuming that ĕ.sub.u+n.sub.u is zero-mean Gaussian distributed and independent of s.sub.u, the UE can compute a maximum likelihood estimate (MLE) of {circumflex over (β)}.sub.u as:
{circumflex over (β)}.sub.u.sup.MLE=ℜ{√{square root over (E.sub.s)}/y.sub.u}. Equation 71
[0161] While more pilots could be transmitted to form a better estimate {circumflex over (β)}.sub.u.sup.MLE, the results in Subsection C below show that one pilot is sufficient to achieve reliable downlink communication.
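The single-pilot estimate of Equation 71 amounts to one complex division per UE. A minimal sketch follows; clamping the result to the nonnegative reals is an added implementation choice, since {circumflex over (β)}.sub.u is defined on ℝ.sub.+.

```python
import numpy as np

def beta_mle(y_u, Es=1.0):
    """Single-pilot MLE of the precoding factor at UE u (Equation 71):
    the pilot s_u = sqrt(Es) is known, so the UE takes the real part
    of sqrt(Es)/y_u.  Clamping to R_+ is an implementation choice."""
    return max(float(np.real(np.sqrt(Es) / y_u)), 0.0)

# Noiseless sanity check: if y_u = s_u / beta_true, the estimate
# recovers beta_true.
beta_hat = beta_mle(np.sqrt(1.0) / 0.5)  # y_u for beta_true = 0.5
```

With noise, averaging the estimate over a few pilots would sharpen it, although the text reports that one pilot suffices in practice.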
C. Evaluation
[0162] Simulation results for both pre-FAWP and post-FAWP matrices are generated by either FAWP-WF or FAWP-FBS. A comparison is provided in terms of BER and EVM versus normalized transmit power, defined as P/N.sub.0. For simplicity, the evaluation is restricted to a mmWave system with B=256 BS antennas serving U=16 UEs in a 16-QAM system operating over an i.i.d. Rayleigh-fading channel.
[0170] The exemplary computer system 1300 in this embodiment includes a processing device 1302 or processor, a system memory 1304, and a system bus 1306. The system memory 1304 may include non-volatile memory 1308 and volatile memory 1310. The non-volatile memory 1308 may include read-only memory (ROM), erasable programmable read-only memory (EPROM), electrically erasable programmable read-only memory (EEPROM), and the like. The volatile memory 1310 generally includes random-access memory (RAM) (e.g., dynamic random access memory (DRAM), such as synchronous DRAM (SDRAM)). A basic input/output system (BIOS) 1312 may be stored in the non-volatile memory 1308 and can include the basic routines that help to transfer information between elements within the computer system 1300.
[0171] The system bus 1306 provides an interface for system components including, but not limited to, the system memory 1304 and the processing device 1302. The system bus 1306 may be any of several types of bus structures that may further interconnect to a memory bus (with or without a memory controller), a peripheral bus, and/or a local bus using any of a variety of commercially available bus architectures.
[0172] The processing device 1302 represents one or more commercially available or proprietary general-purpose processing devices, such as a microprocessor, central processing unit (CPU), or the like. More particularly, the processing device 1302 may be a complex instruction set computing (CISC) microprocessor, a reduced instruction set computing (RISC) microprocessor, a very long instruction word (VLIW) microprocessor, a processor implementing other instruction sets, or other processors implementing a combination of instruction sets. The processing device 1302 is configured to execute processing logic instructions for performing the operations and steps discussed herein.
[0173] In this regard, the various illustrative logical blocks, modules, and circuits described in connection with the embodiments disclosed herein may be implemented or performed with the processing device 1302, which may be a microprocessor, field programmable gate array (FPGA), a digital signal processor (DSP), an application-specific integrated circuit (ASIC), or other programmable logic device, discrete gate or transistor logic, discrete hardware components, or any combination thereof designed to perform the functions described herein. Furthermore, the processing device 1302 may be a microprocessor, or may be any conventional processor, controller, microcontroller, or state machine. The processing device 1302 may also be implemented as a combination of computing devices (e.g., a combination of a DSP and a microprocessor, a plurality of microprocessors, one or more microprocessors in conjunction with a DSP core, or any other such configuration).
[0174] The computer system 1300 may further include or be coupled to a non-transitory computer-readable storage medium, such as a storage device 1314, which may represent an internal or external hard disk drive (HDD), flash memory, or the like. The storage device 1314 and other drives associated with computer-readable media and computer-usable media may provide non-volatile storage of data, data structures, computer-executable instructions, and the like. Although the description of computer-readable media above refers to an HDD, it should be appreciated that other types of media that are readable by a computer, such as optical disks, magnetic cassettes, flash memory cards, cartridges, and the like, may also be used in the operating environment, and, further, that any such media may contain computer-executable instructions for performing novel methods of the disclosed embodiments.
[0175] An operating system 1316 and any number of program modules 1318 or other applications can be stored in the volatile memory 1310, wherein the program modules 1318 represent a wide array of computer-executable instructions corresponding to programs, applications, functions, and the like that may implement the functionality described herein in whole or in part, such as through instructions 1320 on the processing device 1302. The program modules 1318 may also reside on the storage mechanism provided by the storage device 1314. As such, all or a portion of the functionality described herein may be implemented as a computer program product stored on a transitory or non-transitory computer-usable or computer-readable storage medium, such as the storage device 1314, volatile memory 1310, non-volatile memory 1308, instructions 1320, and the like. The computer program product includes complex programming instructions, such as complex computer-readable program code, to cause the processing device 1302 to carry out the steps necessary to implement the functions described herein.
[0176] An operator, such as the user, may also be able to enter one or more configuration commands to the computer system 1300 through a keyboard, a pointing device such as a mouse, or a touch-sensitive surface, such as the display device, via an input device interface 1322 or remotely through a web interface, terminal program, or the like via a communication interface 1324. The communication interface 1324 may be wired or wireless and facilitate communications with any number of devices via a wireless communications system 18 in a direct or indirect fashion. An output device, such as a display device, can be coupled to the system bus 1306 and driven by a video port 1326. Additional inputs and outputs to the computer system 1300 may be provided through the system bus 1306 as appropriate to implement embodiments described herein.
[0177] The operational steps described in any of the exemplary embodiments herein are described to provide examples and discussion. The operations described may be performed in numerous different sequences other than the illustrated sequences. Furthermore, operations described in a single operational step may actually be performed in a number of different steps. Additionally, one or more operational steps discussed in the exemplary embodiments may be combined.
[0178] Those skilled in the art will recognize improvements and modifications to the preferred embodiments of the present disclosure. All such improvements and modifications are considered within the scope of the concepts disclosed herein and the claims that follow.