aboutsummaryrefslogtreecommitdiffstats
path: root/host/lib/rfnoc/ctrlport_endpoint.cpp
Commit message (Collapse)AuthorAgeFilesLines
* rfnoc: Refactor ctrlport_endpoint; fix MT issuesAaron Rossetto2022-03-031-121/+170
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This commit refactors ctrlport_endpoint and fixes several issues related to multiple threads sending and receiving control transfers. First, it refactors the change that Martin Braun implemented in 0caed5529 by adding a tracking mechanism for control requests where clients have explicitly asked to receive an ACK when the corresponding control response is received. When a client wants to wait for an ACK associated with a control request, a combination of that request's opcode, address, and sequence number is added to a set when the request is sent. When a control response is received, the set is consulted to see if the corresponding request is there by matching the packet field data listed above. If so, the control response is added to the response queue, thus notifying all threads waiting in `wait_for_ack()` that there is a response that the thread may be waiting on. If the request is not in the set, the request is never added to the response queue. This prevents the initial problem that 0caed5529 was addressing of the response queue growing infinitely large with control responses that would never be popped from the queue. Secondly, it addresses issues when multiple threads have sent a request packet and are waiting in `wait_for_ack()` on the corresponding response. Originally, the function contained a loop which would sleep the calling thread until the control response queue had at least one element in it. When awakened, the thread would pop the frontmost control response off the queue to see if it matches the corresponding control request (i.e., has the same sequence number, opcode, and address elements). If so, the response would be handled appropriately, which may include signalling an error if the response indicates an exceptional status, and the function would return. If the response is not a matching one, the function would return to the top of the loop. If the corresponding response is not found within a specified period, the function would throw an op_timeout exception. However, there is a subtle issue with this algorithm when two different calling threads submit control requests and end up calling `wait_for_ack()` nearly simultaneously. Consider two threads issuing a control request. Thread T1 issues a request with sequence number 1 and thread T2 issues a request with sequence number 2. The two threads then call `wait_for_ack()`. Let's assume that neither of the control reponses have arrived yet. Both threads sleep, waiting to be notified of a response. Now the response for sequence number 1 arrives and is pushed to the front the response queue. This generates a signal that awakes one of the waiting threads, but which one is awakened is completely at the mercy of the scheduler. If T1 is awakened first, it pops the response from the queue, finds that it matches the request, and handles it as expected. Later, when the reponse for sequence number 2 is pushed onto the queue, the still-sleeping T2 will be awakened. It pops the response, finds it to be matching, and all is well. But if the scheduler decides to wake T2 first, T2 ends up popping the response with sequence number 1 off the front of the queue, but it doesn't match the request that T2 sent with sequence number 2, so T2 goes back to the top of the loop. At this point, it doesn't matter if T2 or T1 is awakened next; because the control response for sequence number 1 was already popped off the queue, T1 never sees the control response it expects, and will throw uhd::op_timeout back up the stack. This commit modifies the `wait_for_ack()` algorithm to search the queue for a matching response rather than indiscriminately popping the frontmost element from the queue and throwing it away if it doesn't match. That way, the order in which threads are awakened no longer matters as they will be able to find the corresponding response regardless. Furthermore, when a response is pushed onto the response queue, all waiting threads are notified of the condition via `notify_all()`, rather than just waking one thread at random (`notify_one()`). This gives all waiting threads the opportunity to check the queue for a response. Finally, the `wait_for_ack()` loop has been modified such that the thread waits to be signalled regardless of whether the queue has elements in it or not. (Prior to this change, the thread would only wait to be signalled if the queue was empty.) This effectively implements the behavior that all threads are awakened when a new control response is pushed into the queue, and combined with the changes above, ensures that all threads get a chance to react and check the queue when the queue is modified.
* rfnoc: Ignore errors in ctrlport response packets if ACKs not wantedAaron Rossetto2022-01-211-0/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In 0caed5529, a change was made to ctrlpoint_endpoint's behavior such that if a client does not care about checking for ACKs on poke or poll operations, the code calls `wait_for_ack()` with a flag indicating that it should not wait for the ACK, but find and remove the corresponding response from the response queue. This prevents the queue from potentially growing endlessly with response packets that the client doesn't even care about. However, this introduced a subtle, undesired behavioral change. When `wait_for_ack()` finds the corresponding response for a request, it also checks the status field of the response to report any errors flagged by the hardware such as invalid command, routing error, etc. Prior to the change mentioned above, since `wait_for_ack()` was never called when the client doesn't want ACKs, the client would never be never notified of any errors associated with the request. However, with the aforementioned change in placd, when `wait_for_ack()` is called to find and remove the unwanted response packet corresponding to the request, errors **are** checked and reported up the user. The behavior change was unearthed by the X410 ZBX CPLD initialization code, which writes an initial value of 0 to all ZBX CPLD registers--even read-only registers. A control request to write a read-only register results in a response with CMDERR in its status field, as it should. However, since the ZBX CPLD register initialization is performed with a `poke32()` operation which by default doesn't wait for ACKs, this was never a problem until the change to drain the response queue inadvertently caused the error to surface. The result is that creating a USRP session or RFNoC graph session to an X410 device is seen to occasionally fail with a 'RuntimeError: RuntimeError: Failure to create rfnoc_graph' message printed to the console. This commit preserves the desired queue-draining behavior, but ignores any error status on the response when it is found and removed from the queue, thus restoring the behavior pre-0caed5529.
* rfnoc: Always clear response queue in ctrlport_endpointMartin Braun2022-01-131-16/+29
| | | | | | | | | | | | | | | | | | | | | | | | | | | This changes the behaviour of ctrlport_endpoint (the register interface for block controllers) to always check for an ACK after doing a poke or poll of any kind. Previously, the behaviour was to only check for an ACK if the policy was set that way, or if the user requested the ACK to be received. The problem with the former approach was that if many pokes were performed without ever requesting an ACK or a poll, the response queue would fill up without ever getting emptied, eventually draining the available heap space. Note that this is not a memory leak in the usual sense, as the response queue was correctly holding on to the response packets. With this change, ctrlport_endpoint::wait_for_ack() now receives a require_ack parameter. If it is false, the behaviour of wait_for_ack() is changed as follows: - If the response queue is empty, immediately return with an empty response payload object. - Otherwise, continue reading elements out of the response queue until it is either depleted (in which case the previous rule kicks in), or we find the ACK corresponding to the command previously sent out. Note that this replicates the corresponding behaviour in UHD 3 (see ctrl_iface_impl::wait_for_ack()).
* host: Update code base using clang-tidyMartin Braun2021-03-041-21/+21
| | | | | | | | | The checks from the new clang-tidy file are applied to the source tree using: $ find . -name "*.cpp" | sort -u | xargs \ --max-procs 8 --max-args 1 clang-tidy --format-style=file \ --fix -p /path/to/compile_commands.json
* rfnoc: Fix time conversion in ctrlport_endpoint sleep methodCiro Nishiguchi2020-12-211-2/+4
|
* fixup! rfnoc: Fix thread unsafe accesses in ctrlportCiro Nishiguchi2020-10-201-7/+5
|
* rfnoc: Fix thread unsafe accesses in ctrlportCiro Nishiguchi2020-10-201-56/+35
|
* rfnoc: Increase ctrlport_endpoint default timeoutSteven Koo2020-08-121-1/+1
| | | | The .1 second timeout fails on macOS. Expand this timeout to 1 second.
* utils: Expose CHDR Types in Public APIrobot-rover2020-07-131-1/+1
| | | | | | | | | | This commit exposes uhdlib/rfnoc/chdr_types.hpp in the public includes. Additionally, it takes some types from uhdlib/rfnoc/rfnoc_common.hpp and exposes them publicly in uhd/rfnoc/rfnoc_types.hpp. Finally, one constant is moved from uhdlib/rfnoc/rfnoc_common.hpp to uhd/rfnoc/constants.hpp Signed-off-by: robot-rover <sam.obrien@ni.com>
* rfnoc: Rename chdr_packet to chdr_packet_writerSamuel O'Brien2020-07-131-12/+8
| | | | | | | | | It would be confusing to have two classes named chdr_packet. As it makes more sense to name the new public chdr parser class chdr_packet, the internal uhd::rfnoc::chdr::chdr_packet class is being renamed to chdr_packet_writer to better represent its functionality. Signed-off-by: Samuel O'Brien <sam.obrien@ni.com>
* rfnoc: Use large timeout for reads when timed commands existCiro Nishiguchi2020-05-151-8/+24
| | | | | If a timed command is in the queue, writes use a large timeout. Changing reads to do the same.
* uhd: Apply clang-format against all .cpp and .hpp files in host/Martin Braun2020-03-031-7/+5
| | | | | Note: template_lvbitx.{cpp,hpp} need to be excluded from the list of files that clang-format gets applied against.
* rfnoc: ctrlport: Fixing timeouts for timed commandsBrent Stapleton2019-11-261-5/+25
| | | | | | When issuing a timed command, if there is no room in the command FIFO and there is a timed command queue'd up, wait for a long time before timing out.
* rfnoc: ctrlport: Separately validate and handle async messagesMartin Braun2019-11-261-5/+28
| | | | | | | | | | | This introduces the concept of an async message validator, an optional callback for functions to check if an async message has a valid payload. After validation, the async message is ack'd. Then, the async message handler is executed. This makes sure that an async message is ack'd as soon as possible, rather than after the async message handling, which can itself have all sorts of communication going on to the device.
* rfnoc: async message: Include timestamp in async message handlingMartin Braun2019-11-261-1/+2
| | | | | | | | | Async messages (like, e.g., overrun messages) can include a timestamp. This change enables access to the timestamp in the async message handler. It is up to the FPGA block implementation to include the timestamp, if desired/necessary. The definition of the timestamp may also depend on the block, for example, the overrun async message will include the time when the overrun occurred.
* rfnoc: ctrlport_endpoint: Audit locks, lock more selectivelyMartin Braun2019-11-261-5/+14
| | | | | The existing implementation would lock judiciously, causing a deadlock when the async message handler would try and call poke32().
* rfnoc: Added src port, EPID getters to register_iface, 64-bit callsAshish Chaudhari2019-11-261-1/+14
| | | | - Add peek64() and poke64() convenience calls
* rfnoc: Moved chdr types/packet class out of chdr dirAshish Chaudhari2019-11-261-2/+2
| | | | | | - Moved chdr_packet and chdr_types from rfnoc/chdr to rfnoc and updated all references - Moved non-CHDR definitions to rfnoc_common.hpp
* rfnoc: Added clock_iface to convey info about clocksAshish Chaudhari2019-11-261-15/+22
| | | | | The inteface provides a mechanism for users of clocks to query information such as the running status or rate
* rfnoc: Add block_poke support to reg_iface async msgAshish Chaudhari2019-11-261-4/+4
| | | | | The async message callback now has a vector of data words instead of a single one
* rfnoc: Added impl for reg_iface and ctrl_endpointAshish Chaudhari2019-11-261-0/+471
- Added new register_iface class that translates high-level peek/poke calls into CHDR control payloads - Added new chdr_ctrl_endpoint class that emulates a control stream endpoint in SW. It can create and handle multiple register interfaces