
Fixing PLC Communication Latency: Where the Milliseconds Really Go

Find the few settings draining your milliseconds and the timing follows.

Drew Gatti, Solutions Engineer, Sumit Shinde, Technical Writer
TL;DR

Most PLC latency doesn't come from the runtime, the network, or the CPU; it comes from a few unexamined settings. Run a SISO test to isolate the transport from your data handling. Tier poll rates to how fast each tag actually changes instead of polling everything at one rate. Set timeouts from a measured floor times 1.5 to 2, not by feel. Spend the controller's scan budget deliberately: poll only what needs the fast tier, respect connection limits, and keep telemetry off the time-critical protocol's driver. At the supervisory layer, no protocol is a winner; pick one the device supports and that's well implemented on both ends, then hold timing with the techniques above before considering a switch. Then sample and timestamp at the source and batch writes instead of streaming single ones, so the timing is built in rather than fought for over the wire.

If you run control over a PLC link (a fast loop, an interlock, a setpoint that has to land on time), the fix usually isn't a faster runtime or a different protocol. It's a handful of settings made once at setup and never revisited, each one quietly costing milliseconds the loop can't spare.

Most latency on a PLC link comes from how the data is handled, not from the runtime, the network, or the PLC's CPU. Fix the settings that govern that handling, in order, and the timing comes back on its own.

This post follows the time: where a PLC link loses its milliseconds, and how to get each one back.

First, check if it's even the transport

Following the time means starting at the bottom. Before you change anything, find out which layer is slow. You can spend a day tuning poll rates when the real problem is the protocol shaking underneath you, or the other way around.

The simplest check is a SISO test: single input, single output. Run a tiny exchange straight on the controller, with no PLC logic in the loop, just a line or two of code, and watch it on a scope. If even that bare exchange jumps around from one pass to the next, the problem is the transport itself, not anything you built on top. If the bare exchange holds steady but your real flow doesn't, the time is going somewhere in your data handling. That one test tells you which half of this article to read first.
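If a scope isn't to hand, a rough software-side version of the same check is to time the bare exchange many times and look at the spread. The sketch below is only that: `exchange` is a hypothetical callable standing in for one request and one response on your link, and timing taken on a general-purpose OS adds jitter of its own, so treat the numbers as a first indication rather than a measurement.

```python
import statistics
import time
from typing import Callable, List, Sequence


def jitter_stats(round_trips_ms: Sequence[float]) -> dict:
    """Summarise round-trip times from a bare single-input, single-output exchange."""
    return {
        "mean_ms": statistics.mean(round_trips_ms),
        # Peak-to-peak spread: how far the exchange jumps from one pass to the next.
        "spread_ms": max(round_trips_ms) - min(round_trips_ms),
        "stdev_ms": statistics.pstdev(round_trips_ms),
    }


def time_exchange(exchange: Callable[[], None], passes: int = 100) -> List[float]:
    """Time `exchange` (a hypothetical one-request, one-response call) `passes` times."""
    samples = []
    for _ in range(passes):
        start = time.perf_counter()
        exchange()
        samples.append((time.perf_counter() - start) * 1000.0)
    return samples
```

A small spread on the bare exchange and a large one on the real flow points at data handling; a large spread on both points at the transport.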

The data doesn't change at one speed, and your poll rate shouldn't either

When the test points at your data handling, the first place to look is also where most of the lost time hides. It usually starts with a thirty-second choice: pick a scan rate, use it for every tag, move on. It works on day one, so it stays that way. But one rate is the wrong answer for almost every real system, and a quick calculation shows why.

Take a shared bus with twenty devices, all polled at 500 ms, where each read takes 50 ms. That's not 500 ms per device, it's a queue: twenty reads at 50 ms each is a full second to get through one cycle. Each device is actually read once a second, half as often as configured, so any value you're holding can be up to a second old. Meanwhile you read a slow temperature sensor ten times before its value could have moved, and a fault bit that goes high for 800 ms can fall between two passes and never get seen. One rate ends up too fast for some tags and too slow for others.
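The arithmetic is easy to check for your own bus. This is a minimal sketch that assumes reads run strictly one after another with no idle time between them; the function names are illustrative.

```python
def poll_cycle_ms(devices: int, read_ms: float) -> float:
    """Time for one full pass when reads on a shared bus run one after another."""
    return devices * read_ms


def effective_period_ms(devices: int, read_ms: float, configured_ms: float) -> float:
    """A device is re-read no sooner than the configured period or the full cycle,
    whichever is longer."""
    return max(configured_ms, poll_cycle_ms(devices, read_ms))


def pulse_can_be_missed(pulse_ms: float, devices: int, read_ms: float,
                        configured_ms: float) -> bool:
    """A pulse shorter than the gap between two reads of the same device
    can fall entirely between them."""
    return pulse_ms < effective_period_ms(devices, read_ms, configured_ms)
```

With the numbers above, `poll_cycle_ms(20, 50)` is 1000 ms, so the configured 500 ms is never reached and an 800 ms fault pulse can be missed.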

The reason it's the wrong model is that the data isn't all one kind. A temperature drifts over minutes. A runtime counter ticks once an hour. An alarm bit can flip in under a second. A setpoint only changes when an operator touches it. Polling all four at the same rate fits none of them, and it wastes the controller's time on the ones that didn't need it.

The fix is to set the rate to how fast the data actually changes:

  • Fast, 100 to 500 ms: fault and alarm bits, interlocks, values that drive closed-loop control. A missed change here has real consequences.
  • Normal, 1 to 5 seconds: running values like flow, pressure, temperature, current draw. They change all the time, but not in an instant. A two-second poll catches the trend without flooding the bus.
  • Slow, 30 seconds to a few minutes: counters, runtime hours, setpoints. They change by operator action, or build up slowly enough that fast polling is pure waste.
  • Once, then cache: nameplate data, firmware versions, calibration values. Read them at startup and never poll again.

To sort any tag, ask whether a delay changes a control decision or only the timestamp on a record. If it changes a decision, it goes on the fast tier. If it only changes a timestamp, it's telemetry and belongs slower. Then ask how many tags really need the fast path, because it's almost always far fewer than the number on it now. Most points in most systems are reads, and most of those reads are not urgent.
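One way to picture the tiers is as a small scheduling table. The sketch below is illustrative only: the tag names are made up, and the periods are single values picked from inside the ranges listed above, not recommendations.

```python
from dataclasses import dataclass
from typing import List, Optional

# Poll periods in seconds, one illustrative value per tier. None means "read once, then cache".
TIER_PERIOD_S = {"fast": 0.25, "normal": 2.0, "slow": 60.0, "once": None}


@dataclass
class Tag:
    name: str
    tier: str
    last_read_s: Optional[float] = None  # None means the tag has never been read


def choose_tier(changes_control_decision: bool, static: bool = False,
                slow_moving: bool = False) -> str:
    """Apply the sorting question: a delay that changes a decision goes on the fast tier."""
    if changes_control_decision:
        return "fast"
    if static:
        return "once"
    return "slow" if slow_moving else "normal"


def due_tags(tags: List[Tag], now_s: float) -> List[Tag]:
    """Return the tags whose tier period has elapsed, plus any tag never read."""
    due = []
    for tag in tags:
        period = TIER_PERIOD_S[tag.tier]
        if tag.last_read_s is None:
            due.append(tag)  # every tag is read at least once
        elif period is not None and now_s - tag.last_read_s >= period:
            due.append(tag)
    return due
```

Each pass then reads only what `due_tags` returns, so the fast tier stays small and the slow tags stop competing with it.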

A timeout is a measurement, not a guess

Once the rates are sorted, the next thing that turns a slow moment into a stall is the timeout, and people usually set it by guessing a round number.

Both directions hurt. Too tight, and the request fails the moment the device is busy. That fires a retry, which adds more traffic to an already busy link, which makes the next request more likely to fail too. Too loose, and the master sits stuck on a dead request far longer than it should, holding up everything behind it. Either way, you've made latency.

The way to set it is to measure the floor and add margin. The floor has three parts: time to send the request, the device's own processing time, and time to send the response back. Take a slow serial device. Eight milliseconds to send, seventy-five for the device to process, twenty-five for the response: 108 ms before anything has gone wrong. A 100 ms timeout fails that read every time. A 150 ms timeout usually passes but fails under load, which is the worst case because it's hard to catch. A 250 ms timeout, a little over twice the floor, gives you real room. Find your floor, multiply by 1.5 to 2, and don't treat the margin as wasted time. It's the difference between a link that holds under load and one that drops frames when the device gets busy. The datasheet gives you a starting number. A protocol analyzer gives you the real one.
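The floor-and-margin rule reduces to a few lines. This sketch uses the example figures from the paragraph above; the function names are illustrative, and the inputs should come from your own measurements.

```python
from typing import Tuple


def timeout_floor_ms(send_ms: float, process_ms: float, response_ms: float) -> float:
    """Shortest time a healthy request can take: send, device processing, response."""
    return send_ms + process_ms + response_ms


def timeout_window_ms(floor_ms: float, low: float = 1.5, high: float = 2.0) -> Tuple[float, float]:
    """Range of timeouts given by multiplying the measured floor by 1.5 to 2."""
    return (floor_ms * low, floor_ms * high)
```

For the slow serial device, `timeout_floor_ms(8, 75, 25)` gives 108 ms and `timeout_window_ms(108)` gives 162 to 216 ms, which is why 150 ms sits too close to the floor.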

The controller's scan time is borrowed, so spend it deliberately

Rates and timeouts are two settings, but step back and there's a simple idea underneath both. The PLC's first job is running the process, not talking to you. It works through a fixed scan, and communication is squeezed in around the logic that actually drives the machine. Every request you send borrows time from that work. Set your rates and timeouts well and you're spending that borrowed time with care. Three habits waste it.

  • Polling more than you need. This is the poll rate problem from the controller's side. A big set of points read at a fast rate burns budget on data that didn't need the rate, leaving less room for the data that does. Tiering isn't only about the bus. It gives the scan its time back.
  • Ignoring the connection limit. Many devices accept only a few connections at once, sometimes as few as one to four. Point too many clients at one device, or fire too many requests at once, and the extras wait in line or get turned away. Over TCP that shows up as connections backing off on a timer that climbs through 250, 500, then 1000 ms. It looks like random network latency, but it's the device turning work away because it ran out of sockets. Know the limit and stay under it.
  • Stacking everything on one protocol. Most PLCs run each protocol through its own driver and process, with its own share of resources. Put fast control data and high-volume telemetry on the same protocol and they fight over the same driver, so the fast path picks up the load from the slow path. It also helps to know which way each connection opens, whether the controller reaches out or something reaches in, because that changes how the work gets scheduled and what it costs.
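For the connection limit in particular, one common way to stay under it is to gate requests on the client side so the device never sees more in-flight requests than it accepts. The sketch below assumes an asyncio-based client; `LimitedDevice` and `fake_read` are hypothetical names, and the limit of two is made up for the demonstration.

```python
import asyncio


class LimitedDevice:
    """Caps in-flight requests to one device at its connection limit."""

    def __init__(self, max_connections: int):
        self._slots = asyncio.Semaphore(max_connections)
        self.in_flight = 0
        self.peak_in_flight = 0

    async def request(self, read):
        # Extra callers wait here, on our side, instead of being refused by the device.
        async with self._slots:
            self.in_flight += 1
            self.peak_in_flight = max(self.peak_in_flight, self.in_flight)
            try:
                return await read()
            finally:
                self.in_flight -= 1


async def demo():
    device = LimitedDevice(max_connections=2)

    async def fake_read():
        await asyncio.sleep(0.01)  # stands in for one request/response on the wire
        return 1

    results = await asyncio.gather(*(device.request(fake_read) for _ in range(6)))
    return device, results
```

Six requests are issued at once, but the device only ever sees two at a time; the rest queue in the client where the wait is visible and measurable.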

Match the protocol to the data, not the other way around

Even with a driver to itself, no protocol that a runtime uses to reach a PLC or device at this level, the northbound supervisory layer, is real-time by design. That's true of Modbus TCP, OPC UA, MQTT, and EtherNet/IP's explicit messaging alike. They were built to move data reliably between a supervisory system and a controller, not to guarantee it lands inside a fixed time window. So there's no protocol to crown the winner here, only ones better or worse suited to the job in front of you.

Start with what each one is built for:

  • MQTT is pub/sub transport, built to move lots of telemetry, wrong for steady sub-loop response.
  • OPC UA is strong for structured, modeled data and a poor fit for sub-100 ms control.
  • WebSocket is light and often the quickest of the general-purpose options, but it makes no promise on turn time.
  • Modbus TCP can hold steady low-millisecond reads in good conditions, but never as a hard promise.
  • EtherNet/IP's explicit messaging carries structured requests over standard Ethernet and can be tuned to hold a steady turn time, but it's making the same kind of effort the others are, not a different one.

None of these guarantees delivery inside a fixed time window by default. They're non-deterministic at this layer.

The real determinism in a PLC system lives one level down, inside the controller's own I/O scan and the fieldbus underneath it, things like EtherCAT or PROFINET IRT, where timing is enforced by the hardware and the scan cycle itself, cycle after cycle, regardless of what's happening above it. A supervisory runtime doesn't poll that layer and doesn't reach it; it talks to the controller through one of the protocols above. Treating EtherNet/IP, or any of them, as carrying that hardware-level determinism up to the supervisory layer is where a "pick protocol X and you're guaranteed real-time" claim goes wrong.

So the answer isn't a ranking, it's a method. Pick a protocol the device actually supports and that's well implemented on both ends. Apply the techniques in this article (tiered poll rates, measured timeouts, a respected scan budget, sampling and timestamping at the source) before touching the protocol at all. Only consider switching protocols if the timing still won't hold after that. Keep the fast path lean either way: carry only the points needed to make the decision, and fetch everything else after the fact from a database or a slower read. Most slow links come from reaching for a protocol swap before doing the work above it.

The steadiest timing is built at the source, not fixed downstream

Tiering, timeouts, and the work above all cut wasted time. But the steadiest timing comes from a change in where the work happens: do it in the PLC, at the source, instead of trying to fix it downstream over the network.

Start with sampling. If you need a fixed gap between readings, the network can't give it to you. MQTT won't, WebSocket won't, none of them will. Everything past the point where the signal is read is transport, and transport doesn't promise a fixed gap. So build the guarantee in at the source: have the controller sample at a fixed rate as it reads the signal, and stamp each reading with the time it was taken. Once a sample carries its own timestamp, it doesn't matter that the transport delivers it at an uneven pace, because the timing of the measurement is already saved. Trying to fix this by controlling when data lands over the wire is solving it in the wrong place.
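A small model makes the point concrete. In this sketch the controller's fixed-rate sampling is simulated in Python, and the arrival times are made-up numbers standing in for an uneven transport; none of it is PLC code.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass(frozen=True)
class Sample:
    taken_ms: int   # stamped at the source, when the signal was read
    value: float


def sample_at_source(values: Sequence[float], start_ms: int, period_ms: int) -> List[Sample]:
    """Model a controller sampling at a fixed rate and stamping each reading."""
    return [Sample(start_ms + i * period_ms, v) for i, v in enumerate(values)]


def gaps_ms(times: Sequence[int]) -> List[int]:
    """Gap between each consecutive pair of times."""
    return [later - earlier for earlier, later in zip(times, times[1:])]


samples = sample_at_source([20.1, 20.3, 20.2, 20.4], start_ms=0, period_ms=100)
arrived_ms = [130, 210, 390, 405]  # hypothetical, uneven delivery over the wire

source_gaps = gaps_ms([s.taken_ms for s in samples])  # [100, 100, 100]
arrival_gaps = gaps_ms(arrived_ms)                    # [80, 180, 15]
```

The arrival gaps are all over the place, but the gaps between source timestamps are exactly the sampling period, which is the timing that matters for the measurement.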

That same move fixes the polling load. Instead of reading without pause, let the controller hold telemetry in a buffer and pull the buffer at the slower rate the data actually needs. The controller stops handling requests that never had to happen, and because each reading was already timestamped at the source, the slower, uneven pull never touches the timing of the data.
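The buffer-and-pull pattern can be modeled in a few lines. This is a sketch of the idea, not controller code: `TelemetryBuffer` is a hypothetical name, and the bounded capacity is an assumption about how a controller with limited memory would behave if the buffer is never drained.

```python
from collections import deque
from typing import List, Tuple


class TelemetryBuffer:
    """Holds timestamped samples until a slower reader pulls them in one go."""

    def __init__(self, capacity: int):
        # Bounded: if nobody drains it, the oldest samples are dropped first.
        self._samples = deque(maxlen=capacity)

    def record(self, taken_ms: int, value: float) -> None:
        """Called at the sampling rate, at the source."""
        self._samples.append((taken_ms, value))

    def drain(self) -> List[Tuple[int, float]]:
        """Called at the slower pull rate; returns everything buffered and empties it."""
        batch = list(self._samples)
        self._samples.clear()
        return batch
```

One `drain()` replaces many individual reads, and the list it returns is already in the shape a batched database write wants.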

Then move the buffer in batches, not a trickle. Build an array, say a window of samples, each with a timestamp and value, and write the whole batch to a time-series database at once. These databases are built for batch writes, so this is far lighter on both the database and the controller than a stream of single writes. One thing to watch when you batch: build each request from registers that sit together. Stack points that are scattered across the register map into one request, and you force the device to read the dead space between them, or split the read into many small ones behind the scenes. Either way the cycle slows. Group points that are next to each other, and let the gaps fall on request edges.
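Grouping adjacent registers is mechanical enough to automate. The sketch below is one possible approach, with two assumptions called out: `max_gap` is a tunable for how many unused registers you'll tolerate reading inside one request, and the default `max_count` of 125 reflects the Modbus limit on holding registers per read, so adjust it for other protocols.

```python
from typing import Iterable, List, Tuple


def group_registers(addresses: Iterable[int], max_gap: int = 0,
                    max_count: int = 125) -> List[Tuple[int, int]]:
    """Group register addresses into (start, count) read requests.

    Addresses join the current request while the gap to the previous address
    is at most `max_gap` and the request stays within `max_count` registers.
    """
    requests: List[Tuple[int, int]] = []
    for addr in sorted(set(addresses)):
        if requests:
            start, count = requests[-1]
            end = start + count - 1
            new_count = addr - start + 1
            if addr - end - 1 <= max_gap and new_count <= max_count:
                requests[-1] = (start, new_count)  # extend the current request
                continue
        requests.append((addr, 1))  # the gap falls on a request edge
    return requests
```

For example, `group_registers([100, 101, 102, 200, 201, 103])` returns `[(100, 4), (200, 2)]`: two tight reads instead of one read spanning a hundred registers of dead space.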

And make sure the receiving end is built for the rate too. Reliability isn't only an upstream worry. A time-series database can take data at high speed, but feed it single writes at that speed and the cost climbs far above batched writes. Whatever takes the data, database, broker, or app, should be set up for the rate it's actually getting.

The runtime isn't your variable

By the time poll rates, timeouts, protocol selection, and data handling are properly tuned, the runtime becomes just one fixed cost in the path—and a predictable one.

In the case of FlowFuse, which is built on Node-RED, the runtime runs on Node.js rather than on compiled embedded code. That introduces a small amount of overhead on each pass through a flow: typically a few milliseconds, and consistently so. It doesn't drift over time, and it doesn't increase when a device on the line becomes busy. A well-designed flow can maintain low-millisecond response times all day.

That fixed overhead is also a useful diagnostic tool. A constant delay of a few milliseconds cannot cause timing variations of tens of milliseconds. So if you're seeing swings that large, the root cause is somewhere else. In most cases, it comes down to one of the four issues covered in this article: an overloaded poll rate, a timeout chosen by guesswork, an ignored connection limit, or a protocol being asked to deliver timing guarantees it was never designed to provide.

There is one exception worth calling out. Node-RED runs as a single process with a shared event loop, so a CPU-intensive transformation or a slow database write anywhere within that instance can temporarily block everything else—including the fast path—whenever it executes. Moving that work to a different tab in the editor doesn't help, because all tabs within the same instance share the same event loop.

This isn't a case of the runtime being slow; it's a case of unrelated workloads competing for the same process. The solution is the same principle used throughout this article: give the fast path its own lane by moving heavy workloads into a separate instance.

Final thought

A slow PLC link is the configuration assuming the data is simpler than it is. One rate for data that moves at four different speeds. A timeout that's never been measured. A scan budget spent without counting. A protocol asked to promise what it was never designed to promise.

None of that is a tool problem. It's a mismatch between what the system expects and what the data actually needs. Close that gap, by how fast each tag changes, by what you actually measure on the wire, by what each protocol was actually built for, and the timing follows on its own.

Still seeing PLC latency, or not sure your runtime can keep up?

If you've worked through poll tiers, timeouts, and protocol splits and the link still isn't steady, or you're not sure whether your runtime can hold the timing your process needs, that's exactly what our team can help diagnose, and help you achieve with FlowFuse.

About the Author

Drew Gatti

Solutions Engineer

Drew is a Solutions Engineer at FlowFuse with more than 15 years of hands-on experience across automation, industrial controls, and full-stack systems integration. Before joining FlowFuse, he led projects in PLC and SCADA development, electrical design, and process optimization for manufacturing, energy, and biogas facilities. His work bridges industrial automation with modern software architecture to deliver reliable, scalable solutions.

About the Author

Sumit Shinde

Technical Writer

Sumit is a Technical Writer at FlowFuse who helps engineers adopt Node-RED for industrial automation projects. He has authored over 100 articles covering industrial protocols (OPC UA, MQTT, Modbus), Unified Namespace architectures, and practical manufacturing solutions. Through his writing, he makes complex industrial concepts accessible, helping teams connect legacy equipment, build real-time dashboards, and implement Industry 4.0 strategies.