Processor communications
09842067 · 2017-12-12
CPC classification
G06F5/06 (Physics)
G06F13/102 (Physics)
International classification
G06F3/00 (Physics)
Abstract
A processor module including a processor configured to share data with at least one further processor module processor; and a memory mapped peripheral configured to communicate with at least one further processor memory mapped peripheral to control the sharing of the data, wherein the memory mapped peripheral includes a sender part including a data request generator configured to output a data request indicator to the further processor module dependent on a data request register write signal from the processor; and an acknowledgement waiting signal generator configured to output an acknowledgement waiting signal to the processor dependent on a data acknowledgement signal from the further processor module, wherein the data request generator data request indicator is further dependent on the data acknowledgement signal and the acknowledgement waiting signal generator acknowledgement waiting signal is further dependent on the acknowledgement waiting register write signal.
Claims
1. An electronic device comprising: a first processor configured to share data with a second processor; and a first memory mapped peripheral associated with the first processor and configured to communicate with a second memory mapped peripheral associated with the second processor to control the sharing of the data, the first memory mapped peripheral comprising a data acknowledgement generator configured to output a data acknowledgement signal to the second processor dependent on a data acknowledgement register write signal from the first processor, and a data request waiting signal generator configured to output a data request waiting signal to the first processor dependent on a data request signal from the second processor, and the data acknowledgement signal, wherein the data acknowledgement generator comprises a toggle flip-flop configured to receive as an input the data acknowledgement register write signal from the first processor and to output the data acknowledgement signal to the second processor.
2. The electronic device of claim 1, wherein the data request waiting signal generator comprises an XOR logic combiner configured to receive as a first input an output of the toggle flip-flop, receive as a second input the data request signal from the second processor, and output the data request waiting signal to the first processor.
3. The electronic device of claim 2, wherein the data request waiting signal generator further comprises a data request synchronizer configured to synchronize the data request signal from the second processor into a clock domain of the first processor.
4. The electronic device of claim 1, further comprising a memory; and wherein the first processor is configured to share data with the second processor via the memory.
5. An electronic device comprising: a first circuit configured to share data with a second circuit; and a first memory mapped peripheral associated with the first circuit and configured to communicate with a second memory mapped peripheral associated with the second circuit to control the sharing of the data, the first memory mapped peripheral comprising a data acknowledgement generator configured to output a data acknowledgement signal to the second circuit dependent on a data acknowledgement register write signal from the first circuit, and a data request waiting signal generator configured to output a data request waiting signal to the first circuit dependent on a data request signal from the second circuit, and the data acknowledgement signal, wherein the data acknowledgement generator comprises a toggle flip-flop configured to receive as an input the data acknowledgement register write signal from the first circuit and to output the data acknowledgement signal to the second circuit.
6. The electronic device of claim 5, wherein the data request waiting signal generator comprises an XOR logic combiner configured to receive as a first input an output of the toggle flip-flop, receive as a second input the data request signal from the second circuit, and output the data request waiting signal to the first circuit.
7. The electronic device of claim 6, wherein the data request waiting signal generator further comprises a data request synchronizer configured to synchronize the data request signal from the second circuit into a clock domain of the first circuit.
8. The electronic device of claim 5, further comprising a memory; and wherein the first circuit is configured to share data with the second circuit via the memory.
9. A method comprising: sharing data between a first processor and a second processor; and communicating with the second processor to control the sharing of the data, wherein communicating with the second processor comprises controlling receiving data comprising outputting a data acknowledgement signal to the second processor dependent on a data acknowledgement register write signal from the first processor, and outputting a data request waiting signal to the first processor dependent on a data request signal from the second processor and the data acknowledgement signal, wherein outputting a data acknowledgement signal comprises configuring a toggle flip-flop to receive as an input the data acknowledgement register write signal from the first processor and to output the data acknowledgement signal to the second processor.
10. The method as claimed in claim 9, wherein outputting a data request waiting signal comprises configuring an XOR logic combiner to receive as a first input an output of the toggle flip-flop, receive as a second input the data request signal from the second processor, and output the data request waiting signal to the first processor.
11. The method as claimed in claim 10, wherein outputting a data request waiting signal further comprises synchronizing the data request signal from the second processor into a clock domain of the first processor.
12. The method as claimed in claim 9, wherein sharing data between the first processor and the second processor comprises sharing data via a memory.
Description
BRIEF DESCRIPTION OF THE DRAWINGS
(1) For better understanding of the present application, reference will now be made by way of example to the accompanying drawings.
DETAILED DESCRIPTION
(8) The following describes in further detail suitable apparatus and possible mechanisms for the provision of shared memory control.
(9) With respect to
(10) It would be understood that the first device 10 and the second device 20 can be any suitable electronic processing unit, such as processing cores fabricated on the same or different silicon structures, or packaged within the same or different integrated circuit packages. In some embodiments the first device 10, the second device 20 and the shared memory 30 are fabricated on the same silicon structure or packaged within the same integrated circuit package. In some embodiments the first device 10 is synchronized by a first clock domain signal and the second device 20 is synchronized by a second clock domain signal. In some embodiments the first clock domain signal and the second clock domain signal are the same signal; however the following examples are described where the first clock domain signal and the second clock domain signal are different, for example having a phase or frequency difference. Furthermore although the following examples show the first device and the second device as sender and receiver respectively it would be understood that in some embodiments each device can be configured to both send and receive. Furthermore in some embodiments the system can comprise more than two devices configured to communicate with each other. In such embodiments each device communication pairing can comprise a sender and receiver pair as shown in the examples described herein.
(11) The sender device 10 can in some embodiments comprise a central processing unit (CPU) 11 configured to generate data and enable the sending of data to the shared ring buffer 31 of the memory 30. The CPU 11 can be any suitable processor.
(12) The sender device 10 can further comprise a sender memory mapped peripheral (sender MMP) 13. The sender memory mapped peripheral can be configured to assist in the control of data flow between the sender and the receiver devices. In some embodiments the sender MMP 13 can be configured to receive data request (DREQ) register write information from the CPU 11, and output a data request (DREQ) to the receiver indicating that the sender requests transferring data to the receiver (in other words that there is data placed in the shared memory for retrieval). The sender MMP 13 can further in some embodiments be configured to receive from a receiver MMP a data acknowledge (DACK) signal indicating that the request has been acknowledged by the receiver device, and to output to the sender CPU 11 a data acknowledgement waiting register signal. In some embodiments the sender MMP can further be configured to receive the data acknowledgement (DACK) waiting register write signal from the CPU 11.
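From the CPU's point of view the sender MMP is simply a small set of memory mapped registers. The following C sketch illustrates one possible CPU-side view of this interface; the register names, layout and base address are illustrative assumptions, since the patent specifies the signals but not a concrete register map.

```c
#include <stdint.h>

/* Hypothetical register map for the sender MMP 13 (all names and the
 * base address are assumptions for illustration only). */
typedef struct {
    volatile uint32_t dreq_reg_write;   /* write 1: assert the DREQ register write signal   */
    volatile uint32_t dack_waiting;     /* read: 1 while an acknowledgement is waiting      */
    volatile uint32_t dack_waiting_clr; /* write 1: the DACK waiting register write (clear) */
} sender_mmp_t;

#define SENDER_MMP ((sender_mmp_t *)0x40001000u) /* hypothetical base address */

/* Request that the receiver fetch data from the shared memory. */
static inline void sender_request_transfer(void)
{
    SENDER_MMP->dreq_reg_write = 1u;
}

/* Poll for, then clear, the acknowledgement-waiting flag. */
static inline int sender_ack_pending(void)
{
    return (int)(SENDER_MMP->dack_waiting & 1u);
}

static inline void sender_ack_clear(void)
{
    SENDER_MMP->dack_waiting_clr = 1u;
}
```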
(13) The sender device 10 can further in some embodiments comprise a register 15 suitable for storing values to be used by the CPU. The sender register 15 in some embodiments comprises a sender write pointer S:WP, and a sender read pointer S:RP. The sender write pointer S:WP and sender read pointer S:RP define write and read addresses within the shared ring buffer 31, detailing the current address of the shared memory for the sender device to write to (the write pointer WP) and read from (the read pointer RP). The pointers can in some embodiments be absolute or relative pointers.
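To make the pointer arithmetic concrete, the following C sketch shows one way the sender could compute the buffer fill level and free space from S:WP and S:RP. The buffer size and the use of relative (offset) pointers are assumptions; the patent allows absolute or relative pointers. Keeping one slot free to distinguish a full buffer from an empty one is a common convention assumed here, not mandated by the text.

```c
#include <stdint.h>

#define BUF_SIZE 1024u /* hypothetical size of the shared ring buffer 31 */

static uint32_t s_wp; /* sender write pointer S:WP (offset into the buffer) */
static uint32_t s_rp; /* sender read pointer  S:RP (offset into the buffer) */

/* Bytes of data currently held in the buffer (the fill level). */
static uint32_t buffer_fill(void)
{
    return (s_wp + BUF_SIZE - s_rp) % BUF_SIZE;
}

/* Bytes that can be written without the write pointer passing the read
 * pointer; one slot is kept free so that full and empty differ. */
static uint32_t buffer_free(void)
{
    return BUF_SIZE - 1u - buffer_fill();
}
```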
(14) The receiver device 20 can in some embodiments comprise a central processing unit (CPU) 21. The central processing unit 21 can in some embodiments be a CPU similar to the sender CPU 11, however in other embodiments the receiver CPU 21 can be different from the sender CPU 11. The receiver CPU 21 can be configured to read from the shared memory.
(15) In some embodiments the receiver device 20 comprises a memory mapped peripheral (receiver MMP) 23. The receiver MMP can be configured to receive a data acknowledge (DACK) register write signal from the receiver CPU 21, and output an acknowledge signal (DACK) to the sender. Furthermore the receiver MMP 23 can be configured to receive the data request (DREQ) signal from the sender device and further be configured to output a request waiting signal (DREQ waiting) to the receiver CPU 21.
(16) The receiver device 20 can further comprise in some embodiments a register 25 comprising the receiver read pointer (R:RP). As described herein the receiver read pointer (R:RP) can be configured to contain an address value for the shared memory 30 detailing the next location to read from.
(17) With respect to
(18) The first flip-flop 101 receives as the set input the data request (DREQ) register write signal. Furthermore the first flip-flop 101 is configured to receive the data acknowledge (DACK) waiting register signal as a clear input. The first flip-flop 101 can be configured to output the data output (Q) to a first AND gate 103.
(19) The sender MMP 13 can in some embodiments further comprise an AND gate 103. The AND gate 103 is configured to receive as a first input the data output of the first flip-flop 101, and as a second input an inverted version of the data acknowledgement (DACK) waiting register signal. The output of the AND gate 103 is passed to a first XOR gate 105.
(20) In some embodiments the sender MMP 13 comprises a first XOR gate 105. The first XOR gate 105 is configured to receive as a first input the output of the AND gate 103 and further configured to receive as a second input the output of a fifth flip-flop 113 (flip-flop E). The first XOR gate 105 is further configured to output the XOR'ed logic combination to a second flip-flop 107.
(21) The sender memory mapped peripheral 13 in some embodiments further comprises a second flip-flop 107 (flip-flop B) configured to receive as a data input the first XOR gate 105 output. The second flip-flop 107 is further configured to output a synchronized version of the input which is the data request (DREQ) signal passed to the receiver device 20.
(22) In some embodiments the sender MMP 13 further comprises a third flip-flop 109 (flip-flop C). The third flip-flop 109 is configured to receive as a data input the data acknowledgement signal (DACK) from the receiver. The third flip flop 109 is configured to output a synchronized or clocked version of the input signal to a fourth flip-flop 111.
(23) In some embodiments the sender MMP 13 comprises a fourth flip flop (flip-flop D) 111. The fourth flip flop 111 is configured to receive as a data input the output of the third flip flop 109 and further configured to output a synchronized or clocked version of the input signal to the fifth flip-flop 113, and a second XOR gate 115.
(24) In some embodiments the sender MMP 13 comprises a fifth flip-flop 113 (flip-flop E) configured to receive as a data input the output of the fourth flip-flop 111, and configured to output a synchronized or clocked version of the input signal to the first XOR gate 105 and the second XOR gate 115.
(25) In some embodiments the sender MMP 13 further comprises a second XOR gate 115 configured to receive the output of the fourth flip-flop 111 as a first input and the output of the fifth flip-flop 113 as a second input. The second XOR gate 115 is configured to output the XOR'ed combination to a sixth flip-flop 117.
(26) In some embodiments the sender MMP 13 further comprises a sixth flip-flop 117 (flip-flop F). The sixth flip-flop 117 is configured to receive as a set input (SET) the output of the second XOR gate 115, and configured to receive as a clear input (CLR) a data acknowledgement waiting register write signal (DACK waiting register write). The sixth flip-flop 117 is configured with the set input given priority over the clear input. The output of the sixth flip-flop 117 (Q) is output as the data acknowledgement (DACK) waiting register signal which is output to the CPU, the first flip-flop 101, and as the inverted input of the AND gate 103.
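The gate-level description of paragraphs (18) to (26) can be summarized with a small cycle-level model. The following C sketch is a reconstruction from the text above, not RTL from the patent figures: all next-state values are computed from the current state and committed together, mimicking one edge of the sender clock. Giving the clear input of flip-flop A priority over a simultaneous set is an assumption consistent with paragraph (73).

```c
#include <stdint.h>

typedef struct {
    uint8_t a, b, c, d, e, f; /* flip-flops 101, 107, 109, 111, 113, 117 */
} sender_mmp_state_t;

typedef struct {
    uint8_t dreq_reg_write;         /* DREQ register write, from the sender CPU  */
    uint8_t dack_waiting_reg_write; /* DACK waiting register write, from the CPU */
    uint8_t dack_in;                /* DACK signal from the receiver (async)     */
} sender_mmp_in_t;

static void sender_mmp_clock(sender_mmp_state_t *s, sender_mmp_in_t in)
{
    sender_mmp_state_t n = *s;

    /* Second XOR gate 115: pulses while flip-flops D and E differ, i.e.
     * on a rising or falling edge of the synchronized acknowledge. */
    uint8_t ack_edge = s->d ^ s->e;

    /* Sixth flip-flop F (117): set by the edge detector, cleared by the
     * DACK waiting register write; set has priority over clear. */
    n.f = ack_edge ? 1 : (in.dack_waiting_reg_write ? 0 : s->f);

    /* First flip-flop A (101): set by the DREQ register write, cleared by
     * the DACK waiting signal (a waiting acknowledgement overrides a new
     * request, per paragraph (73)). */
    n.a = s->f ? 0 : (in.dreq_reg_write ? 1 : s->a);

    /* AND gate 103 and first XOR gate 105 feed flip-flop B (107), whose
     * output is the DREQ line to the receiver. */
    uint8_t and_out = (uint8_t)(s->a && !s->f);
    n.b = and_out ^ s->e;

    /* Flip-flops C and D (109, 111): two-stage synchronizer for DACK;
     * flip-flop E (113): delayed copy feeding both XOR gates. */
    n.c = in.dack_in;
    n.d = s->c;
    n.e = s->d;

    *s = n; /* outputs: DREQ = s->b, DACK waiting = s->f */
}
```

With all flip-flops reset to 0, a DREQ register write makes the DREQ output differ from the synchronized DACK one clock later; when DACK eventually toggles to match, the edge detector sets flip-flop F, which withdraws the request and raises the waiting flag.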
(27) With respect to
(28) In some embodiments the receiver MMP 23 can further comprise a second flip-flop 203 (flip-flop H), configured to receive the data request (DREQ) signal from the sender 10 and output a clocked version to a third flip-flop 205.
(29) The receiver MMP 23 can further comprise in some embodiments a third flip-flop 205 (flip-flop I) configured to receive as a data input the output of the second flip-flop 203 and configured to output a clocked version to the XOR gate 207.
(30) The receiver MMP 23 can in some embodiments further comprise an XOR gate 207 configured to receive as a first input the output of the toggle flip-flop 201 and as a second input the output of the third flip-flop 205. The XOR gate 207 can be configured to output the data request waiting (DREQ waiting) signal to the receiver CPU 21.
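The receiver side is simpler. A matching cycle-level sketch of the toggle flip-flop 201, the synchronizer flip-flops 203 and 205, and the XOR gate 207, again reconstructed from the description rather than taken from the patent figures:

```c
#include <stdint.h>

typedef struct {
    uint8_t g, h, i; /* flip-flops 201, 203, 205 */
} receiver_mmp_state_t;

static void receiver_mmp_clock(receiver_mmp_state_t *r,
                               uint8_t dack_reg_write, /* from the receiver CPU */
                               uint8_t dreq_in)        /* DREQ from the sender  */
{
    receiver_mmp_state_t n = *r;

    /* Toggle flip-flop G (201): each DACK register write inverts the DACK
     * line returned to the sender. */
    n.g = dack_reg_write ? (uint8_t)!r->g : r->g;

    /* Flip-flops H and I (203, 205): synchronize DREQ into the receiver
     * clock domain. */
    n.h = dreq_in;
    n.i = r->h;

    *r = n;
    /* Outputs: DACK to the sender = r->g;
     * DREQ waiting to the CPU = XOR gate 207 = r->g ^ r->i. */
}
```

The DREQ waiting output is asserted exactly while the synchronized request differs from the last acknowledged state, which is what makes the toggle scheme tolerant of arbitrary latency on the two wires.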
(31) In the examples herein the clock and reset connections of the sender and receiver MMP flip-flops have been omitted for clarity. Furthermore in the examples described herein all of the flip-flops are reset to 0 at power up. In such embodiments as described above the flip-flop inputs (set, clear and toggle) are considered to be synchronous inputs.
(32) In these examples all the sender flip-flops are furthermore clocked using the same ‘sender’ clock source and all the receiver flip-flops are clocked from the same ‘receiver’ clock source. In some embodiments the sender and receiver clock sources can be the same clock source or be substantially the same. However it would be understood that in some embodiments as described herein the clock sources may differ and have phase or frequency differences.
(33) With respect to
(34) With respect to
(35) The sender CPU 11 can be configured in some embodiments to write data into the circular buffer (or shared ring buffer 31). The sender CPU 11 can for example write data into the circular buffer using the sender write pointer S:WP. The sender CPU 11 furthermore can be configured to ensure that the data does not overflow the buffer by checking that the write pointer S:WP does not pass the read pointer S:RP for the determined data transfer size.
(36) The operation of writing data onto the circular buffer is shown in
(37) Once the sender CPU 11 has written data into the circular buffer the sender CPU 11 can be configured to determine or calculate how much data is remaining in the buffer. In other words the sender CPU 11 determines the buffer fill level. Where the available amount of data is greater than the transfer threshold then the CPU 11 can be configured to write to the data request register to send the request to the receiver 20. The transfer threshold can be any suitable value, such as zero; in other words the sender CPU can be configured to send a request whenever the buffer is not empty.
(38) The operation of determining the buffer fill level and, when it is greater than the transfer threshold, writing to the data request register to send the request is shown in
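Building on the earlier sketches, the sender-side flow of paragraphs (35) to (38) might look as follows in C; the transfer threshold of zero and the helper names are the assumptions introduced above.

```c
#define TRANSFER_THRESHOLD 0u /* request whenever the buffer is not empty */

static uint8_t shared_buf[BUF_SIZE]; /* the shared ring buffer 31 */

/* Copy data into the ring buffer and assert the data request if the
 * fill level exceeds the transfer threshold. */
static int sender_write(const uint8_t *data, uint32_t len)
{
    if (len > buffer_free())
        return -1; /* would overflow: S:WP must not pass S:RP */

    for (uint32_t k = 0; k < len; k++)
        shared_buf[(s_wp + k) % BUF_SIZE] = data[k];
    s_wp = (s_wp + len) % BUF_SIZE;

    if (buffer_fill() > TRANSFER_THRESHOLD)
        sender_request_transfer(); /* write to the DREQ register */
    return 0;
}
```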
(39) The data request register write signal being asserted sets the sender MMP first flip-flop 101 to a value of one.
(40) The operation of the asserted data request register write signal setting the flip-flop 101 to 1 is shown in
(41) The signal then propagates via the AND gate 103 and the first XOR gate 105 such that the input of the second flip-flop 107 is inverted and is then propagated at the next clock edge to output a request to the receiver in the form of the DREQ signal being output.
(42) The operation of outputting the signal DREQ to the receiver is shown in
(43) With respect to
(44) The receiver MMP 23 can be configured to receive the data request (DREQ) signal from the sender 10 where the second flip-flop 203 and the third flip-flop 205 synchronize the request into the receiver clock domain. It would be understood that the number of flip-flops required to synchronize the request can be greater than or fewer than two flip-flops depending on the clock frequencies used for the sender (CPU) and the receiver (CPU). In some embodiments the receiver MMP 23 can be configured to comprise no resynchronization flip-flops, in other words the DREQ signal passes directly to the XOR gate 207, where the sender and receiver are within the same clock domain or where the process technology is sufficient to allow auto-synchronisation.
(45) The operation of synchronizing the data request received from the sender into the receiver clock domain is shown in
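As a sketch of this flexibility, the synchronizer depth can be treated as a parameter; with a depth of two this reduces to flip-flops 203 and 205 above, while depth zero corresponds to passing DREQ straight to the XOR gate 207 (the code below assumes at least one stage). The chosen depth is an assumption to be made per clock relationship.

```c
#include <stdint.h>

#define SYNC_STAGES 2 /* assumed depth; more or fewer stages per the text */

static uint8_t sync_chain[SYNC_STAGES];

/* One receiver clock edge: shift the asynchronous DREQ through the chain
 * and return the synchronized value fed to the XOR gate 207. */
static uint8_t synchronize(uint8_t async_dreq)
{
    uint8_t out = sync_chain[SYNC_STAGES - 1];
    for (int k = SYNC_STAGES - 1; k > 0; k--)
        sync_chain[k] = sync_chain[k - 1];
    sync_chain[0] = async_dreq;
    return out;
}
```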
(46) The receiver MMP 23 then can be configured to compare the receiver acknowledgement (DACK) output with the resynchronized request from the sender. This comparison can be performed, as shown in
(47) The comparison of the receiver acknowledgement signal with the synchronized data request signal to generate a DREQ waiting signal is shown in
(48) The receiver CPU 21 can be configured to receive the request notification (DREQ waiting) and react by reading the agreed amount of data from the shared memory area using the receiver read pointer (R:RP). The receiver CPU 21 can then be configured to update the receiver read pointer (R:RP) to take account of the data that has been read and write to the data acknowledgement register to send an acknowledgement to the sender.
(49) The operation of receiving the request waiting notification, reading the agreed amount of data, updating the read pointer, and writing to the data acknowledgement register to send an acknowledgement is shown in
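A C sketch of this receiver-side reaction, using a hypothetical receiver register map analogous to the sender's and the shared buffer declared earlier; the transfer_size parameter stands in for the agreed amount of data.

```c
#include <stdint.h>

/* Hypothetical register map for the receiver MMP 23 (names and address
 * are assumptions for illustration only). */
typedef struct {
    volatile uint32_t dreq_waiting;   /* read: 1 while a request is waiting */
    volatile uint32_t dack_reg_write; /* write 1: toggle the DACK line      */
} receiver_mmp_t;

#define RECEIVER_MMP ((receiver_mmp_t *)0x40002000u) /* hypothetical address */

static uint32_t r_rp; /* receiver read pointer R:RP */

extern uint8_t shared_buf[]; /* the shared ring buffer 31, declared earlier */

static void receiver_poll(uint8_t *dst, uint32_t transfer_size)
{
    if (!(RECEIVER_MMP->dreq_waiting & 1u))
        return;                                /* no request waiting */

    for (uint32_t k = 0; k < transfer_size; k++)
        dst[k] = shared_buf[(r_rp + k) % BUF_SIZE];
    r_rp = (r_rp + transfer_size) % BUF_SIZE;  /* update R:RP */

    RECEIVER_MMP->dack_reg_write = 1u;         /* acknowledge to the sender */
}
```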
(50) The writing to the DACK register to send an acknowledgement to the sender causes the data acknowledge register write signal to be asserted, which in some embodiments is received by the toggle flip-flop 201 causing the value of the flip-flop to toggle.
(51) The operation of asserting the data acknowledgement register write signal and toggling the flip-flop 201 is shown in
(52) The toggling of the flip-flop 201 causes the output value to be the same as the resynchronised request from the sender, de-asserting the data request waiting signal.
(53) The de-asserting of the data request waiting signal is shown in
(54) Furthermore the toggle flip-flop 201 is configured to send the acknowledge signal (DACK) to the sender.
(55) The outputting of the flip-flop 201 output acknowledge signal to the sender is shown in
(56) With respect to
(57) The operation of synchronizing the acknowledge signal into the sender clock domain is shown in
(58) Furthermore the resynchronized acknowledge signal is passed through the fifth flip-flop 113 and the input and output of the fifth flip-flop 113 can then be compared by the second XOR gate 115. The second XOR gate 115 thus can be configured to detect a rising or falling edge of the acknowledge signal being received.
(59) The detection or determination of an acknowledge rising or falling edge is shown in
(60) The rising or falling edge detection output then sets the sixth flip-flop 117.
(61) The operation of setting the flip-flop 117 is shown in
(62) The setting of the flip-flop 117 causes an assertion of the output value, in other words the output of the sixth flip-flop is set to 1. This output value causes the output of the first flip-flop 101 to be cleared, cancelling the effect of the original data request register write (DREQ Reg Write) signal at the first flip-flop 101 which is propagated to the DREQ signal output to the receiver.
(63) The clearing of the DREQ value, in other words the cancelling of the request, is shown in
(64) The data acknowledge waiting register signal is also output from the sixth flip-flop 117. In other words in some embodiments the sender MMP 13 asserts the value of the sixth flip-flop to the sender CPU 11 to indicate that an acknowledgement has been received. This acknowledgement can be used according to some embodiments as an interrupt, register value or flag in a manner similar to the data request signal received at the receiver.
(65) The operation of outputting the data acknowledge waiting flip-flop signal to the sender CPU 11 is shown in
(66) The sender CPU 11 can in some embodiments react to the acknowledge waiting signal by updating the sender read pointer S:RP. In other words the sender CPU 11 frees up the shared memory space occupied by the data the receiver has read.
(67) The operation of updating the S:RP is shown in
(68) The sender CPU 11 can furthermore in some embodiments recalculate the buffer fill level. Where the amount of data in the buffer is greater than the transfer threshold, the sender CPU 11 can be configured to write to the data request (DREQ) register to reassert the data request.
(69) The operation of re-calculating the buffer capacity and writing to the DREQ register where the capacity is greater than the transfer threshold is shown in
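Completing the round trip, here is a sketch of the sender's acknowledgement handler built from the helpers above; clearing the waiting flag before reasserting matters, since a waiting acknowledgement overrides a new request (paragraph (73)).

```c
/* React to DACK waiting: free the space the receiver has read, clear the
 * flag, then reassert the request if data remains above the threshold. */
static void sender_on_ack(uint32_t transfer_size)
{
    if (!sender_ack_pending())
        return;

    s_rp = (s_rp + transfer_size) % BUF_SIZE; /* update S:RP */
    sender_ack_clear();                       /* DACK waiting register write */

    if (buffer_fill() > TRANSFER_THRESHOLD)
        sender_request_transfer();            /* reassert the data request */
}
```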
(70) This operation enables further cycles of generating DREQ and DACK signals.
(71) The advantages of these embodiments of the application are that the number of connections required is minimised (only one signal in each direction). Furthermore in some embodiments the operation is tolerant of any amount of latency in the request or acknowledgement signals.
(72) In these embodiments the sender can place multiple transfers' worth of data into the buffer without the receiver being required to remove any of the data, provided that the buffer is large enough. Furthermore in some embodiments multiple requests do not cancel each other out.
(73) The sender's CPU according to some embodiments can further be allowed to recalculate and reassert the request safely at any time without causing the receiver to receive a spurious request. For example, where the sender attempts to reassert the request whilst an acknowledgement is waiting, the request is ignored because the waiting acknowledgement overrides the request.
(74) It would be understood that while the read and write pointer management can be performed by software running on the CPUs, the embodiments of the application can be extended to include automatic pointer management hardware.
(75) Furthermore although the above description describes the control of a shared memory, embodiments of the application can also be used to communicate predetermined commands from one CPU to another.
(76) In general, the various embodiments may be implemented in hardware or special purpose circuits, software, logic or any combination thereof. For example, some aspects may be implemented in hardware, while other aspects may be implemented in firmware or software which may be executed by a controller, microprocessor or other computing device, although the invention is not limited thereto. While various aspects of the invention may be illustrated and described as block diagrams, flow charts, or using some other pictorial representation, it is well understood that these blocks, apparatus, systems, techniques or methods described herein may be implemented in, as non-limiting examples, hardware, software, firmware, special purpose circuits or logic, general purpose hardware or controller or other computing devices, or some combination thereof.
(77) Embodiments may be implemented by computer software executable by a data processor of the mobile device, such as in the processor entity, or by hardware, or by a combination of software and hardware. Further in this regard it should be noted that any blocks of the logic flow as in the Figures may represent program steps, or interconnected logic circuits, blocks and functions, or a combination of program steps and logic circuits, blocks and functions. The software may be stored on such physical media as memory chips, or memory blocks implemented within the processor, magnetic media such as hard disk or floppy disks, and optical media such as for example DVD and the data variants thereof, CD.
(78) The memory may be of any type suitable to the local technical environment and may be implemented using any suitable data storage technology, such as semiconductor-based memory devices, magnetic memory devices and systems, optical memory devices and systems, fixed memory and removable memory. The data processors may be of any type suitable to the local technical environment, and may include one or more of general purpose computers, special purpose computers, microprocessors, digital signal processors (DSPs), application specific integrated circuits (ASIC), gate level circuits and processors based on multi-core processor architecture, as non-limiting examples.
(79) Embodiments may be practiced in various components such as integrated circuit modules. The design of integrated circuits is by and large a highly automated process. Complex and powerful software tools are available for converting a logic level design into a semiconductor circuit design ready to be etched and formed on a semiconductor substrate.
(80) Programs, such as those provided by Synopsys, Inc. of Mountain View, Calif. and Cadence Design, of San Jose, Calif. automatically route conductors and locate components on a semiconductor chip using well established rules of design as well as libraries of pre-stored design modules. Once the design for a semiconductor circuit has been completed, the resultant design, in a standardized electronic format (e.g., Opus, GDSII, or the like) may be transmitted to a semiconductor fabrication facility or “fab” for fabrication.
(81) The foregoing description has provided by way of exemplary and non-limiting examples a full and informative description of the exemplary embodiment of this invention. However, various modifications and adaptations may become apparent to those skilled in the relevant arts in view of the foregoing description, when read in conjunction with the accompanying drawings and the appended claims. However, all such and similar modifications of the teachings of this invention will still fall within the scope of this invention as defined in the appended claims.