rfnoc: Refactor ctrlport_endpoint; fix MT issues - uhd

diff options

author	Aaron Rossetto <aaron.rossetto@ni.com>	2022-02-17 12:15:21 -0600
committer	Aaron Rossetto <aaron.rossetto@ni.com>	2022-03-03 14:09:27 -0600
commit	8601d225d52b5d5d3876f633bf5d3a97ff475bda (patch)
tree	d4c47b15075324c21690514fd3a7c02bcf6ddb72 /fpga
parent	7599789f83bcf9f9e9bdef3432c37e460bddc1e4 (diff)
download	uhd-8601d225d52b5d5d3876f633bf5d3a97ff475bda.tar.gz uhd-8601d225d52b5d5d3876f633bf5d3a97ff475bda.tar.bz2 uhd-8601d225d52b5d5d3876f633bf5d3a97ff475bda.zip

rfnoc: Refactor ctrlport_endpoint; fix MT issues

This commit refactors ctrlport_endpoint and fixes several issues related to multiple threads sending and receiving control transfers. First, it refactors the change that Martin Braun implemented in 0caed5529 by adding a tracking mechanism for control requests where clients have explicitly asked to receive an ACK when the corresponding control response is received. When a client wants to wait for an ACK associated with a control request, a combination of that request's opcode, address, and sequence number is added to a set when the request is sent. When a control response is received, the set is consulted to see if the corresponding request is there by matching the packet field data listed above. If so, the control response is added to the response queue, thus notifying all threads waiting in `wait_for_ack()` that there is a response that the thread may be waiting on. If the request is not in the set, the request is never added to the response queue. This prevents the initial problem that 0caed5529 was addressing of the response queue growing infinitely large with control responses that would never be popped from the queue. Secondly, it addresses issues when multiple threads have sent a request packet and are waiting in `wait_for_ack()` on the corresponding response. Originally, the function contained a loop which would sleep the calling thread until the control response queue had at least one element in it. When awakened, the thread would pop the frontmost control response off the queue to see if it matches the corresponding control request (i.e., has the same sequence number, opcode, and address elements). If so, the response would be handled appropriately, which may include signalling an error if the response indicates an exceptional status, and the function would return. If the response is not a matching one, the function would return to the top of the loop. If the corresponding response is not found within a specified period, the function would throw an op_timeout exception. However, there is a subtle issue with this algorithm when two different calling threads submit control requests and end up calling `wait_for_ack()` nearly simultaneously. Consider two threads issuing a control request. Thread T1 issues a request with sequence number 1 and thread T2 issues a request with sequence number 2. The two threads then call `wait_for_ack()`. Let's assume that neither of the control reponses have arrived yet. Both threads sleep, waiting to be notified of a response. Now the response for sequence number 1 arrives and is pushed to the front the response queue. This generates a signal that awakes one of the waiting threads, but which one is awakened is completely at the mercy of the scheduler. If T1 is awakened first, it pops the response from the queue, finds that it matches the request, and handles it as expected. Later, when the reponse for sequence number 2 is pushed onto the queue, the still-sleeping T2 will be awakened. It pops the response, finds it to be matching, and all is well. But if the scheduler decides to wake T2 first, T2 ends up popping the response with sequence number 1 off the front of the queue, but it doesn't match the request that T2 sent with sequence number 2, so T2 goes back to the top of the loop. At this point, it doesn't matter if T2 or T1 is awakened next; because the control response for sequence number 1 was already popped off the queue, T1 never sees the control response it expects, and will throw uhd::op_timeout back up the stack. This commit modifies the `wait_for_ack()` algorithm to search the queue for a matching response rather than indiscriminately popping the frontmost element from the queue and throwing it away if it doesn't match. That way, the order in which threads are awakened no longer matters as they will be able to find the corresponding response regardless. Furthermore, when a response is pushed onto the response queue, all waiting threads are notified of the condition via `notify_all()`, rather than just waking one thread at random (`notify_one()`). This gives all waiting threads the opportunity to check the queue for a response. Finally, the `wait_for_ack()` loop has been modified such that the thread waits to be signalled regardless of whether the queue has elements in it or not. (Prior to this change, the thread would only wait to be signalled if the queue was empty.) This effectively implements the behavior that all threads are awakened when a new control response is pushed into the queue, and combined with the changes above, ensures that all threads get a chance to react and check the queue when the queue is modified.

Diffstat (limited to 'fpga')

0 files changed, 0 insertions, 0 deletions


context:
space:
mode: