Today I used a public web application that has thousands of users. The web interface asked me to validate a password confirmation code sent by SMS to my phone. There was a ten-minute countdown and I was very impatient.
The application warned me that I might never receive my SMS and, implicitly, that I might never be able to log in. I had been warned. I tried several times without success and realized, only because of my own background, that the system was running asynchronously on a message queue architecture!
I kept trying for half an hour, finally managed to log in, and was able to continue working with the app. Phew! I was happy then.
Several hours later, I received all of the SMS messages that had not arrived during those critical ten minutes. By then they were unnecessary and disturbing.
I think it could have been better designed.
Many solutions exist for setting up a fully integrated on-premises messaging architecture, such as Apache Kafka, RabbitMQ…
You can also use a PaaS solution in the cloud. Here is an official guide for doing this in an Azure environment, for example: https://docs.microsoft.com/en-us/azure/architecture/guide/technology-choices/messaging or an equivalent for AWS: https://aws.amazon.com/fr/blogs/compute/understanding-asynchronous-messaging-for-microservices/
Some ideas come to mind to improve the solution:
- configure an expiry time (TTL) on all messages, so that a notification which is no longer valid is not delivered, and/or route expired messages to a dead-letter queue so they can be processed later. Here is an example for RabbitMQ: https://www.rabbitmq.com/ttl.html
- prevent the application from producing several messages for the same event within a given period of time.
- evaluate a change of messaging solution and favor a lower-latency one if this is really necessary.
- do not block access to the logged-in part of the application when the user has not yet validated their password.
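
To make the first two ideas more concrete, here is a minimal sketch in Python. It is not how the application I used is built (I don't know that), only an illustration under my own assumptions. The queue name `sms.codes`, the exchange name `sms.expired`, the 60-second interval and the helper names are all hypothetical. The `x-message-ttl` and `x-dead-letter-exchange` queue arguments are the ones RabbitMQ documents, and `declare_sms_queue` assumes a channel object exposing a pika-style `queue_declare(queue=..., durable=..., arguments=...)` method.

```python
import time
from typing import Callable, Dict

# Assumption: the TTL mirrors the ten-minute countdown shown to the user.
SMS_TTL_MS = 10 * 60 * 1000


def sms_queue_arguments(ttl_ms: int = SMS_TTL_MS,
                        dead_letter_exchange: str = "sms.expired") -> dict:
    """Build RabbitMQ queue arguments: expire messages after ttl_ms and
    republish expired ones to a dead-letter exchange (hypothetical name)."""
    return {
        "x-message-ttl": ttl_ms,
        "x-dead-letter-exchange": dead_letter_exchange,
    }


def declare_sms_queue(channel, queue_name: str = "sms.codes") -> None:
    """Declare the queue on a pika-style channel (assumed interface).
    The dead-letter exchange itself must be declared separately."""
    channel.queue_declare(queue=queue_name, durable=True,
                          arguments=sms_queue_arguments())


class CodeRequestThrottle:
    """Allow at most one SMS request per user per interval (in-memory sketch)."""

    def __init__(self, min_interval_s: float = 60.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._min_interval_s = min_interval_s
        self._clock = clock
        self._last_sent: Dict[str, float] = {}

    def allow(self, user_id: str) -> bool:
        """Return True and record the time if the user may request a code now."""
        now = self._clock()
        last = self._last_sent.get(user_id)
        if last is not None and now - last < self._min_interval_s:
            return False  # too soon: do not publish another message
        self._last_sent[user_id] = now
        return True
```

With something like this, a code that has sat in the queue past the countdown would not be delivered hours later, and an impatient user clicking "resend" several times would produce one message instead of many. A real deployment would keep the throttle state in a shared store rather than in process memory.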