14 May LoRaWAN Join Storm Resolution
Concept13 was approached by one of Europe’s largest facility management companies, which had a troubled LoRaWAN deployment: several thousand LoRaWAN sensors within a small, confined office building, served by a handful of LoRaWAN gateways. Substantial packet loss meant the data was neither accurate nor timely, which was causing distrust in the system and poor utilisation of resources. The entire solution had cost several hundred thousand pounds in hardware alone, and more in time and resources. The reputational damage, though, was more significant than the financial risk and cost.
Concept13 was appointed by the business to review its architecture and present its findings on the cause and the approach to resolution. As attendance on site was not possible, the deployment was mapped out remotely in terms of the LoRaWAN eco-system. The individual LoRaWAN sensor configurations were captured, the LoRaWAN gateway settings recorded, and traffic logs collected. This gave a detailed picture of the deployment, with exact details of the entire LoRaWAN eco-system. Of critical importance was understanding what was realistically possible compared with what was expected.
Concept13 undertook two days of consultancy to diagnose the entire setup. The first step was to look at each of the components individually. The LoRaWAN sensors were correctly configured but inappropriately used a shared AppKey. The LoRaWAN gateways were optimally configured but each ran as a self-contained LoRaWAN Network Server (LoRaWAN Islands). The LoRaWAN Network Server was configured appropriately but carried intensive gateway-side processing logic. Individually, the components of the LoRaWAN network had only minor issues, but collectively the choice of architecture created the potential for major collisions across LoRaWAN traffic.
Diagnosis of the traffic logs indicated serious issues and backed up the thinking of Concept13’s consultants. The number of LoRaWAN joins exceeded all expectations, with 1 in 5 sensors joining each day. A substantial number of node messages were being lost: they were broadcast but not received by the gateways. The available duty cycle of the gateways was depleted. Intensive Node-RED logic was consuming resources. With extensive logging activity now also running on the LoRaWAN gateways, the primary LoRaWAN activities were at risk of being delayed, and the subsequent receive windows (RX1 and RX2) were possibly being missed. The cumulative result was that simple configuration decisions compounded into much wider problems.
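The join figure above is the kind of number that can be pulled from traffic logs with very little code. The following is a minimal sketch only, not the tooling Concept13 used: the record layout (timestamp, device EUI, message type), the `join_request` label and the sample data are all assumptions made for illustration.

```python
from collections import defaultdict
from datetime import datetime


def daily_join_fraction(records, fleet_size):
    """Return {date: fraction of the fleet that sent a join request that day}.

    `records` is an iterable of (iso_timestamp, dev_eui, msg_type) tuples.
    This layout is hypothetical; real gateway logs will need their own parser.
    """
    if fleet_size < 1:
        raise ValueError("fleet_size must be at least 1")

    joined = defaultdict(set)
    for timestamp, dev_eui, msg_type in records:
        if msg_type == "join_request":
            day = datetime.fromisoformat(timestamp).date()
            # A set counts each device once per day, however often it retries.
            joined[day].add(dev_eui)

    return {day: len(devices) / fleet_size for day, devices in sorted(joined.items())}


# Made-up sample: a fleet of five sensors, one of which joins (twice) in a day.
sample_records = [
    ("2024-01-01T08:00:00", "A1", "join_request"),
    ("2024-01-01T08:00:07", "A1", "join_request"),
    ("2024-01-01T09:30:00", "B2", "uplink"),
    ("2024-01-01T10:15:00", "C3", "uplink"),
]

for day, fraction in daily_join_fraction(sample_records, fleet_size=5).items():
    print(day, f"{fraction:.0%} of sensors sent a join request")
```

In a healthy network a sensor typically joins once and then stays joined for a long time, so a daily fraction anywhere near 1 in 5 is a strong signal that sensors are repeatedly losing their sessions.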
Concept13 set about producing a twenty-page proposal documenting the LoRaWAN architecture and exact configurations. It went on to demonstrate factually the issues in evidence, such as packet loss and join requests. An extensive diagnosis followed, which showed categorically how the issues might not matter individually but accumulated collectively, and how, with up to 80% of packets being lost, the entire system collapsed under the weight of its own escalating retries. The LoRaWAN eco-system, in trying to self-repair, instead created further work and overloaded the entire network with a LoRaWAN Join Storm.
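The way retries multiply traffic can be illustrated with some simple arithmetic. The sketch below is not taken from the proposal. It assumes each transmission is lost independently with a fixed probability and that a sender gives up after a fixed number of attempts. The cap of 8 attempts is an illustrative assumption, and the function name is hypothetical.

```python
def expected_attempts(loss_rate, max_attempts):
    """Mean transmissions per message under independent loss and capped retries.

    Attempt k+1 only happens if the first k attempts were all lost, which has
    probability loss_rate ** k, so the mean is the sum of those probabilities.
    """
    if not 0.0 <= loss_rate <= 1.0:
        raise ValueError("loss_rate must be between 0 and 1")
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    return sum(loss_rate ** k for k in range(max_attempts))


# Illustrative loss rates; 0.8 matches the "up to 80%" figure in the text.
for loss in (0.0, 0.2, 0.5, 0.8):
    attempts = expected_attempts(loss, max_attempts=8)
    print(f"loss {loss:.0%}: about {attempts:.2f} transmissions per message")
```

Under these assumptions, 80% loss means each message is sent roughly four times on average. In a real network the extra airtime also raises the collision rate, so the loss rate does not stay fixed but climbs with the load, which is the feedback loop behind a join storm.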
The final piece by Concept13 was a clear list of 25 colour-coded improvements to address the issues, identifying which would make the biggest impact. Ultimately, a simple change to running a centralised cloud LoRaWAN Network Server offered the largest single improvement. The main problem was that, with each gateway running as a self-contained LoRaWAN Network Server, all the gateways were working independently and talking over each other in a confined space. Subject to commercial decisions on a suitable LoRaWAN Network Server, the entire solution was recoverable remotely within a few days.