Multi-axis position sensing system
11620466 · 2023-04-04
Assignee
Inventors
Cpc classification
G06F3/0321
PHYSICS
G06F3/0317
PHYSICS
H03M13/15
ELECTRICITY
International classification
G06K7/10
PHYSICS
G06F3/00
PHYSICS
G06K7/14
PHYSICS
Abstract
Multi-axis self-location method and apparatus wherein de Bruijn sequences on 2 or more axes are convolved into an array of symbols such as halftone dots to form a reference scale. The position of an imaging device such as a camera relative to the reference scale is ascertained from the captured camera image by bit-wise reconstitution of axis position codes with simple, predominantly linear operations over small neighbourhoods. Judicious choice of differential coding, LFSR generator polynomials, mathematical operators, and deconvolution kernels enables code digits of an axis to be regenerated while simultaneously cancelling out the contributions of other axes. Also optionally provided are uniform DC-balanced variants yielding greatly improved position interpolation, isometric implementations decodable from high-aspect-ratio sample windows, robust concatenated error correction, and extensions into n-space.
Claims
1. A computer-implemented method for multi-axis position sensing comprising the steps of: capturing, by an imaging device, a partial image of a reference scale; wherein the reference scale comprises a plurality of symbols in a grid pattern and each symbol represents a symbol value, and wherein the grid positions of said symbols along a plurality of axes in space are encoded using respective symbol values, wherein the plurality of symbol values each represent a combination of code sequence digits from differentially coded position code sequences along each axis of the plurality of axes, wherein multiple code sequence digits form a codeword of sufficient size to represent a digital position on each of the multiple axes, and wherein said combination is obtained by convolving the code sequence of each axis with a respective kernel based on a separable mathematical kernel to thereby define each symbol value by a respective combination function applied to two or more code sequence digits of each axis, such that said code sequence digits can also be decoded from said symbol values by application, for each axis, of an arithmetic function in form of a kernel to multiple symbol values, each application thereby yielding one code sequence digit of the respective axis; by a processing apparatus: performing a resampling transform of the captured partial image to yield an array with one discrete symbol value per entry; applying said kernels to the array to extract said code sequence digits of each axis; assembling the extracted code sequence digits into a plurality of vectors; and decoding the plurality of vectors to determine a value for integer-value alignment of the imaging device to the reference scale for each axis of the plurality of axes.
2. The method of claim 1, wherein the plurality of symbols comprise at least one of halftone dots in 2D, spheres, blocks, and hyperspheres in higher dimensional lattices, and wherein a value for the plurality of discrete symbols is represented by area or volume.
3. The method of claim 1, wherein the position code for each axis is a differentially-coded de Bruijn sequence or a Linear Feedback Shift Register (LFSR) sequence that is self-dual under differential coding.
4. The method of claim 1, wherein the combination function is linear addition, and wherein the step of applying kernels comprises extending calculations to accumulate linear sums over a plurality of symbols wherein the plurality of vectors are maintained in the linear processing domain for subsequent processing.
5. The method of claim 1, wherein a spectral analysis of the captured image is used to extract a plurality of fractional alignment parameters of the imaging device relative to the grid pattern; wherein the method further comprises the steps of: using the determined alignment values to resample the captured image into the array with one symbol per entry; and extracting a plurality of angular and fractional positions for subsequent concatenation with corresponding integer position alignments based on the decoded vectors.
6. The method of claim 1, further comprising estimating axes of further degrees of freedom between the imaging device and the reference scale according to perspective analysis of the captured image and calibration of mechanical characteristics of the imaging device.
7. The method of claim 1, wherein the resampling uses pixel mapping based on a frequency-domain derived transform with inclusion only of values of valid pixels wherein the kernel of each symbol group is fully represented in the captured image without clipping at the captured image edges.
8. The method of claim 1, wherein static, dynamic, inter-frame and intra-frame information, constraints and heuristics are used to reduce position vector decode complexity and increase reliability.
9. The method of claim 1, wherein the grid pattern is configured such that, when decoded by a processing apparatus, the grid pattern is used to compensate for (a) systematic interpolation errors introduced by pattern-related bias and (b) bias caused by known errors detected in the decode process.
10. A position sensing system comprising an imaging device and processing apparatus, wherein the position sensing system is configured to perform the method steps of: capturing, by the imaging device, a partial image of a reference scale; wherein the reference scale comprises a plurality of symbols in a grid pattern and each symbol represents a symbol value, and wherein the grid positions of said symbols along a plurality of axes in space are encoded using respective symbol values, wherein the plurality of symbol values each represent a combination of code sequence digits from differentially coded position code sequences along each axis of the plurality of axes, wherein multiple code sequence digits form a codeword of sufficient size to represent a digital position on each of the multiple axes, and wherein said combination is obtained by convolving the code sequence of each axis with a respective kernel based on a separable mathematical kernel to thereby define each symbol value by a respective combination function applied to two or more code sequence digits of each axis, such that said code sequence digits can also be decoded from said symbol values by application, for each axis, of an arithmetic function in form of a kernel to multiple symbol values, each application thereby yielding one code sequence digit of the respective axis; by the processing apparatus: performing a resampling transform of the captured partial image to yield an array with one discrete symbol value per entry; applying said kernels to the array to extract said code sequence digits of each axis; assembling the extracted code sequence digits into a plurality of vectors; and decoding the plurality of vectors to determine a value for integer-value alignment of the imaging device to the reference scale for each axis of the plurality of axes.
11. The position sensing system of claim 10, wherein the imaging device comprises at least one of an optical camera, electrostatic, micromechanical, X-ray, nuclear magnetic resonance, magnetic sensor or other imaging apparatus.
12. The position sensing system of claim 10 wherein the imaging device further comprises multiple sensors and the reference scale further comprises multiple scales systems configured to resolve a full 6-axis position of the imaging device or the higher dimensionality of a moving or flexing target object relative to the imaging device.
13. The position sensing system of claim 10 wherein (a) the grid pattern is configured to act as both the reference scale and a motor element including at least one of the stator grid of a Sawyer motor, an inch-worm drive, or a motion actuator system or (b) the image device is further configured to act as motion actuator.
14. A position coding apparatus for multi-axis position sensing by an imaging device, the position coding apparatus comprising: a reference scale comprising a plurality of symbols in a grid pattern and each symbol represents a symbol value, and wherein the grid positions of said symbols along a plurality of axes in space are encoded using respective symbol values, wherein the plurality of symbol values each represent a combination of code sequence digits from differentially coded position code sequences along each axis of the plurality of axes, wherein multiple code sequence digits form a codeword of sufficient size to represent a digital position on each of the multiple axes, and wherein said combination is obtained by convolving the code sequence of each axis with a respective kernel based on a separable mathematical kernel to thereby define each symbol value by a respective combination function applied to two or more code sequence digits of each axis, such that said code sequence digits can also be decoded from said symbol values by application, for each axis, of an arithmetic function in form of a kernel to multiple symbol values, each application thereby yielding one code sequence digit of the respective axis.
15. The position coding apparatus of claim 14, wherein a modulation is applied that inverts up to half of the code sequence digits to achieve a uniform DC-balanced reference scale.
16. The position coding apparatus of claim 14, wherein the plurality of symbols comprises axis sequences which are selected for the grid pattern such that every codeword used is unique in sequence codeword values of all axes and may be used to unambiguously identify both axis and position when processed by a processing apparatus.
17. The position coding apparatus of claim 16, wherein axis sequences are chosen such that every codeword used and its corresponding bit-reversed value is unique within sequence codeword values of all axes and thus unambiguously identifies axis, position, and quadrant under any rotation or reflection.
18. The position coding apparatus of claim 14, wherein the reference scale further comprises redundant information useable for error detection and correction by including at least one of averaging, majority voting, correlation functions, minimum Hamming distance calculation, maximum likelihood, and concatenated error correction.
19. The position coding apparatus of claim 14, wherein the grid pattern includes an extra or redundant code axis which enables an any-angle, high-aspect-ratio imaging scale pattern decoding.
20. The position coding apparatus of claim 14 wherein the reference scale is formed by shaping, printing, lithography, display, projection, or holography on a planar surface, planar surface with polar coordinate arrangement, curved surface including at least one of cylinder or sphere, and optionally where cyclic codes are used to seamlessly wrap around the scale on one or more axes.
Description
BRIEF DESCRIPTION OF DRAWINGS
(1)
(2)
(3)
(4)
(5)
(6)
(7)
(8)
(9)
(10)
DETAILED DESCRIPTION OF PREFERRED IMPLEMENTATIONS
(11) Decode Process
(12) The preferred embodiment decode pipeline from scale, 100, to position output is depicted in
(13) Scale Symbol Representation and Implementation
(14) The preferred halftone representation of digits is depicted in the drawings with area of circular dots proportional to digit value. When imaged, this translates to a greyscale distributed over several pixels and subsequently a number closely related to the original digit value when these pixels are correctly resampled and integrated. Circular dots have the advantage of being constant under rotation and easily printed, but it will be noted that any convenient pattern may be used including, but not limited to, lines, squares, superimposed printing inks with subtractive overlay properties, multi-dot symbols, or full greyscale patterns—and as such fall within the scope of the present invention. XOR combination produces 2-value symbol arrays, 200, 500. Linear addition produces 3-values (300, 400) for two axis combination, and 4-value symbols for 3-axis combinations (600). For illustrative purposes,
(15) A significant advantage of the present invention is that scale pattern construction is computationally straight forward due to separability of axis sequence generation and simplicity of axis symbol combination. Indeed, it will be understood by one skilled in the art that patterns may be produced iteratively line-by-line, block-by-block, or symbol-by-symbol using CPU, GPU or hardware for raster-based systems such as a scanning laser writer. Realtime, on-the-fly pattern generation obviates very lengthy processing times for intermediate file formats (e.g. Gerber, CIF) and enormous rasterised image output files—potentially terabytes per square metre of scale. Hence, a direct-write lithography or photo plotting scale manufacturing system utilising these position coding schemes can enable customised or one-off scales to be manufactured at little or no additional cost using.
(16) Isometric grids (500, 600), using a different LFSR polynomial and larger kernels (505, 506, 507) have a redundant axis and are decoded by kernel covering a minimum of 4 symbols. A high aspect ratio window, 604, extracts very little information from the y axis, but sufficient from x and w to locate position by simple geometric calculation. No matter the angle or translation, such a window accesses at least 2 of three axes and can provide higher speed decode. High-aspect-ratio decoding is achievable with window heights somewhat larger than kernel-diameters, in practise around a dozen video lines. As with most redundancy, additional information can be used for error checking. When additional axis position information is not required (e.g. when sample window is large enough), the third-axis can be used to embed ancillary, side-channel data. Furthermore, if a small part of a full m-sequence range is used, as described in the summary, the remaining range can be used for static information, to label scale properties, product identification, manufacturer name, for example. According to an embodiment of the present invention, cryptographic methods can be used to encode some of this data for security or intellectual property protection, e.g. to tie a scale to a specific sensor. According to an embodiment of the present invention, dynamic patterns, say from computer screens or cell phone displays (and using their cameras), can insert data blocks conveniently aligned to position data, to provide movement information and a relative high-bandwidth side-channel. DC-balanced codes appear almost flat grey and, in displayed scale applications, can be sub-coded into existing video with polarity alternated each frame. According to an embodiment of the present invention, foreground video can be cancelled out over two frames leaving a representation of the scale data. Applications may include using the subliminal positioning signal for pointing devices and/or data transfer and bridging the “airgap” between computers.
(17) According to an embodiment of the present invention, dot size variance (modulation) can maximise symbol differences, for example 600, and thereby aid symbol discrimination, providing smallest dots are still printable and largest do not overlap. Low modulation depth, 601, improves interpolation accuracy and can be minimised providing integer position decoding is not compromised. To improve interpolator performance of linear addition patterns further (
(18) The present invention is not limited to binary, Galois Field, two-tap LFSR, nor maximum-length sequences (m-sequence). Ternary or higher number bases with a larger symbol alphabet are viable, and polynomial generators can be selected for any specific application. Non-maximum length sequences are useful, for example, since even-number cyclic sequences (902) allow even-odd DC-balanced scale rings (900, 901). Indeed,
(19) The scale grid establishes the metrology reference and system accuracy relies fundamentally on scale accuracy. However, small localised imperfections will be averaged out over the sampling window providing there is little or no correlation with the underlying grid periodicity.
(20) Image Capture
(21) The scale grid, not optics, primarily defines x-y-α accuracy, providing the optical axis is maintained perfectly at right-angles to the scale. Hence, data is captured by an imaging device comprising a sensor pixel array without onerous constraints on focus or magnification and is digitised with analogue-to-digital converters, ADCs, for subsequent processing. Typically, motion blur is eliminated by strobed scale illumination. ADCs with 8 or more bits of resolution are typical, but the speed, complexity, and power consumption of conversion are related to the bit-depth. Pixel data is averaged over many sites according to a preferred embodiment of the present invention, and therefore even one-bit ADCs handling extremely high-speed (mega-Hz frame rates) could be workable at the expense of interpolation accuracy.
(22) To one skilled in the art, it will be clear that the present invention is not limited to optical systems, CMOS image sensors, nor specific scale technologies. Other sensor types such as magnetic, electrostatic, micromechanical, and so on, can be used as appropriate. For example, a lithographically defined scale on the flip-side of semiconductor wafers could be sensed electrostatically by an integrated array of capacitive sensors in proximity to the scale and processed in a similar way to the photo-diode arrays of commodity camera chips. However, unlike optical systems this would not be diffraction limited, and sub-micron dot-pitches for the scale and sensor are a real possibility. With the scale carried through manufacturing attached to the flip side of the wafer, this embodiment enables repeatable, sub-nanometre mask alignment over many lithographic and process steps. In wafer-to-wafer bonding (stacking, or “3D manufacturing”), the scale provides an alignment guide on the accessible, outer faces of two wafers that are to be bonded, thus greatly simplifying bringing together the inaccessible and invisible active inner layers with the extraordinary precision normally required. A further advantage of electrostatic operation is that a sensor chip can double as an enhanced Sawyer Motor, or multi-dimensional positioning actuator. The same array of plate capacitors can be alternately used as sensors and actuator drivers, or the two functions accomplished by separate chips as in
(23) Imaging sufficient data in a high-aspect-ratio window under all rotations is a problem also solved for the scales of
(24) The object of using high-aspect-ratio windows is speed. Applications such as metrology that do not need high sample rates can use a larger, slower sensor array. A single sensor can accommodate both, and be dynamically reconfigured (programmable windowing is a common feature of camera chips) to trade off speed for accuracy under different operating conditions. For instance, high-speed continuous servo motion and instantaneous, high-resolution metrology.
(25) Spectral Analysis, FFT
(26) The well-known 2D FFT, 701, provides an efficient spectral analysis tool and convenient vehicle to illustrate this component of the present invention. Frequency—distance of peaks from centre—relates to magnification or pixels per dot. Angle around FFT centre-lines of the four peaks corresponds to a with, at this stage, indeterminate quadrant. Phase (calculated as arctangent of the complex FFT output) of two orthogonal peaks, 702, produces the prized fractional part of prospective x-y positions. (Since FFT input is real, half the outputs are duplicate and 2 peaks redundant.) FFT outputs are discretised frequency-bins and therefore cannot yield smooth and precise results from a single bin, or FFT output pixel. However, well-known methods produce far superior, “sub-pixel” characterisation using calculations including neighbouring bins around the peaks. Furthermore, since coding is essentially linear (row/column summation) and the Fourier Transform linear also, the contribution of row and column symbol values to sample window asymmetry and interpolation bias will, to the first order, be separable and linear. In other words, given successful decode of the integer position, pre-calculated error maps or empirical calibrations can be applied as a series of fine interpolation corrections and, importantly, each codeword bit can be considered independently in the process.
(27) In a preferred embodiment of the present invention, only a small fraction of frequency bins (data around peaks, 702) are utilised and this presents significant opportunities for optimising this computationally demanding decode stage. Peaks can be determined from the first few video lines or rows, and subsequently only two or three small regions calculated in the second vertical dimension of a separable 2D FFT computation—one region of columns for each horizontal peak. Real-input horizontal FFTs require half the computation of vertical FFTs, hence for square-format video, 75% of computational resource and power can be saved, and memory requirements decimated. Early estimation and other constraints may also be applied to pruning horizontal FFT calculations and buffering, but more significantly can decimate frame-store requirements and reduce system latency since resampling (the next processing stage) starts much earlier. In summary, a fast, rough, but good-enough estimate is made to guide resampling, followed by a more accurate phase calculation that makes full use of all data to get the best possible interpolation results.
(28) The search for peaks in the FFT output can be directed and qualified by static and dynamic system constraints. For instance, if magnification or a is known a priori or can be predicted (as is normal in real-world kinematics), then less computation is generally needed and higher confidence results can be garnered. If a and magnification are fixed or variation small, full FFTs may not be needed at all and interpolation computation reduced by an order of magnitude. Note three peaks from a total of six would be used for isometric scale grids.
(29) The previous treatment considers digital computation of Fourier transforms. However, such spectral analysis is not limited to electronic implementation. Optical-domain processing is known in the art and could provide simple, virtually instantaneous Fourier transform results for use in subsequent processing either electronically or otherwise.
(30) Align and Resample
(31) Resampling uses FFT-derived affine-transform information to reduce the input frame to a one-symbol-per-entry array, 703, by rotating and downsizing wherein the symbol grid is aligned to output array rows and columns. Integration of source pixels by area-preserving resampling is preferred over bi-cubic and other resizing algorithms to preserve linearity. The process is well suited to GPUs and, for example, can be implemented as an over-sampling, anti-aliased texture mapping. Lens distortion and shading corrections can be applied in this process also.
(32) The 2D resampled output array generally forms an inclined rectangle within a larger array, and at its edges symbols that are only partially captured should be rejected. The corners of the inclined rectangle will contain fewer symbol repetitions than at the centre where there is a longer diagonal, and this is accounted for in averaging or weighting the output row or column. Depending on the aspect ratio of the input and its angle, the width and depth of valid output samples varies with some rows and columns having no valid data. A mask array can be generated to qualify which output symbol pixels are of use.
(33) In the isometric cases, resampling can include a shear component in the affine transform such that the output array becomes a rhomboid with, for example, the w axis on a diagonal and x-y on rows and columns.
(34) Convolution Kernels, Digital Discrimination
(35) In a preferred embodiment of the present invention, with symbols realigned, 703, kernels are applied to extract 1D symbol vectors, 704, for each axis. Kernel operations are convolutions in the mathematical sense: multiply by the kernel values and sum results at each valid site. Generally, in the realigned array, symbol values will be represented by signed, zero-centred values, i.e. a 3-valued system will have symbol values of nominally +1.0, 0.0, and −1.0, but will vary due to sampling artefacts and noise. Discrimination refers to converting analogue values into binary vectors, one for each axis, by thresholding kernel calculation results. Since thresholding is applied to absolute differences (reversing differential coding: a large difference signifies 1, a small result 0), this represents a non-linear operation in the signal chain. (For ternary or other number systems, discrimination digitises by slicing a more complex eye diagram.) In the examples of
(36) It will be recognised by those skilled in the art that resample, convolve, and discriminate may be combined into a single pipeline stage to simplify computation and obviate intermediate frame-buffering.
(37) LFSR Decode, Error Detection/Correction, Resolve Quadrant
(38) In a preferred embodiment of the present invention, bit vectors, 704, can be decoded to produce the integer portion of the position output by various known means including discrete logarithm, lookup table, and searching a reference LFSR symbol string. The resolved quadrant and x-y integer translations are added to interpolated fractions of the spectral analysis stage for final output. While trivial to encode, LFSR sequences are relatively expensive to decode and the process will be compounded by larger vectors that may contain errors. However, static system conditions, intra- and inter-frame information or constraints can reduce algorithmic complexity and/or improve confidence in the result at several stages of the pipeline including LFSR decode. Static information, for example, includes maximum travel of a CNC machine where x-y integer spans would typically be in the thousands rather than millions for some m-sequences. Such information can reduce search time to practical levels, typically milliseconds.
(39) In a preferred embodiment of the present invention, a computationally intensive and potentially slow calculation of initial position is found from the complete space of possibilities. Then, after an initial fix with full and unequivocal calculation of 4-orientation and position, subsequent samples are highly constrained by system kinematics and are more simply calculated. Typical machine velocity and acceleration maxima limit position translations between samples to very few integer shifts, e.g. at 10 kHz sampling, 10 ms.sup.−1, and 100 μm scale pitch, to no more than 10, and yet fewer if acceleration is accounted for. Similarly, an instantaneous 90° rotation is unthinkable, and once an angular fix is established, decode can track without needing to resolve quadrants or vector reversals. Given that it is almost trivial to extend LFSR strings around a seed (a previous position fix), the position vector decode may now be implemented as a cross-correlation function over, say, 16 locations on that extended string. Essentially, Hamming distance is calculated between input vector and possible next positions, and the result with lowest distance (fewest errors) wins. The Hamming distance of the best fit and other candidates give a good indication of system health and noise margin and can be used to decide when tracking has failed and unconstrained decode should restart, or a hard error raised and emergency system shut down invoked. It will be understood that the input vector is typically over 100 bits with substantial redundancy, and this method is an efficient way to achieve error detection and correction over long and variable-length vector inputs. Furthermore, it can be calculated by hardware in nanoseconds.
(40) As an example of error correction capabilities for the order-22 LFSR (
(41) For linear combination scales (
(42) The parity properties of the example LFSR polynomials can also be used for error correction.
(43) Alternatively, parity and other cues such as analysis of regions that lack the high-frequency data characteristic of the dot pattern, may be used to excluding dubious results from Hamming distance and correlation calculations. Where large scale areas may be occluded, for instance in marker field applications, more complex image processing algorithms may be implemented to make the best of the sample image. In summary, the present invention presents a powerful “concatenated error correction code” using repetition codes, parity, Hamming distance or correlation, kinematic constraints, and other cues that enables reliable retrieval of integer position even from signals buried in noise.
(44) Additional Coding Axes
(45) The previous treatment considers a system constrained to planar motion where a scale acts as primary metrology reference. However, in a preferred embodiment of the present invention, up to the full six degrees of freedom can be measured by an adapted, single-camera system, albeit with lower accuracy. With simple lenses, the dot pitch in the sampled image gives a direct height indication, z, of the camera from the scale and if this height is calibrated, z can be calculated. z accuracy now relies on dot-pitch frequency interpolation accuracy (which can be parts per million) as a fraction of the effective optical path length. Rather than a reference dot pitch of 100 μm or less, z could be a fraction of perhaps 20 mm and in many applications, such as 3D mice and video game controllers, this is easily good enough. The FFT data can also be used to estimate the last two degrees of freedom, camera pitch and roll angles, by analysing peak shapes and deviation from an orthogonal constellation. Alternatively, the sensor array can be segmented into, for example, four windows and these quadrants processed separately to yield perspective information from dot-pitch frequency relationships. In this circumstance, perspective information must inform the resampling process to correctly extract sequences vectors. Providing the system remains reasonably in focus, there is a wealth of information from which to calculate these extra axes by the above or other well-known methods.