There are many different troubleshooting models available, some created by vendors and some by the IT industry in general. Look at them in more detail and you will find they all share the same first step: identify the problem.
In my opinion, this step is too often rushed or skipped, and IT professionals jump straight into fixing the problem without properly defining it. When looking at a new problem, it’s very important that we pause for a moment and ask: do I really know what the problem actually is? With WiFi, I think this first step is even more important, because there are so many things that users will see as a WiFi problem that quite regularly turn out not to be a WiFi issue at all. As good and as helpful as our customers are, they are not the best at reporting problems and often leave out information which they don’t see as important, but which to us as WiFi professionals is key to solving the problem.
It is not always possible to get to the site and see the problem firsthand, so we rely on the information given to us by our users, and it is very easy to make incorrect assumptions from those reports. To fully understand the problem, I always start by asking questions. Here are some of the more general questions which might help to find the problem:
- What exactly does not work?
- Has it ever worked?
- Do you see any error messages?
- Where are you located?
- What device are you using?
- Does it still happen when you are in a different place?
- Has it happened before?
- Does it happen at the same time of day every day?
- When did it start?
- What software are you using?
- Does it work for other users who are near you?
This doesn’t mean asking every question one by one. Instead, consider which question(s) are most likely to help identify the issue, and use the responses to guide you toward more focused follow-up questions.
There are many more questions I could add to that list, and depending on the issue, they may or may not be relevant. If the fault were related to software VoIP calls or video calling, the questions to ask might be very different, as they would need to be more application-specific.
For example:
A typical issue might be a user reporting WiFi problems. Below is an example of how the conversation could flow, including sample questions and possible user responses.
Q: What specific issues are you having?
A: Cannot make phone calls.
Q: Are you having any other issues surfing websites or accessing email?
A: No
Q: Is this affecting other users?
A: Yes, it’s affecting everyone in the office.
At this point, I will either use what I know about the customer’s setup to work out whether all users are on the WiFi or whether some of them use a VoIP desk phone. If I have no such knowledge, I will ask the user.
Q: Is everyone using their laptop for making calls?
A: No, the same thing happens if we use a desk phone.
At this point, I’m quite sure it’s not a WiFi issue and will send this ticket over to the team who manage the VoIP system.
Over my years of working in this industry, I have tried (and failed) on many occasions to create a flow chart of questions to ask, where each answer decides what the next question should be. There are so many questions which can be asked, and so many different problems, that it always became too complicated.
In my training and the presentations I give at conferences, I love to tell stories, and while writing this, I am reminded of a few incidents from my years in this industry which I hope highlight the importance of asking questions or seeing the problem firsthand.
Story 1.
A home user reported that the WiFi was not working upstairs but was working downstairs.
From what I knew about this specific home, there was a single AP located centrally, on the landing. Given that, the problem made no logical sense. After a few questions, it turned out that the user was using a laptop when upstairs but a mobile phone when downstairs, which makes the problem specific to the device, not the WiFi. When reporting the issue, the user did not think that this additional information was important and, as is so often the case, saw everything as “WiFi”. Without the information gained by asking questions, it would have been very easy to go down a long WiFi diagnostic path, only to have wasted lots of time because of incorrect assumptions. The problem was soon fixed by updating the laptop’s NIC driver.
Story 2.
A small independent coffee shop where the staff complained that the WiFi never works.
This one was a site visit to the coffee shop in question, so I did not need to rely on the users to give information or try to diagnose the problem remotely. On this visit, I connected my laptop to the WiFi and it was unable to get an IP address from the DHCP server. This is not something we would expect users to diagnose and report to us. Quite often, modern devices that do not get an IP address when connecting to the WiFi just disconnect straight away, giving the end user the perception that the WiFi is not working. This problem was a classic misconfiguration. The coffee shop just had a SOHO internet connection with the ISP-supplied router. Once I had logged into the router, I quickly found that the DHCP lease was 8 days and the pool was 192.168.1.100-192.168.1.150. That is only 51 addresses, each held for 8 days once handed out, so the pool was most likely being exhausted by the devices passing through the shop. Reducing the lease time to 10 minutes and increasing the pool size to 240 addresses fixed the problem in around 10 minutes. That problem is much harder to resolve without remote access or someone technical on site with a laptop, as I’ve not yet seen a mobile device that reports “I am unable to connect to the WiFi because I can’t get an IP address from the network”!
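The arithmetic behind this story can be sketched in a few lines of Python. This is only a rough estimate, not a model of any particular router: the pool range and lease times come from the story, while the visitor rate, the visit length and the helper names `pool_size` and `addresses_in_use` are assumptions made up for illustration. It treats every visitor as a new device and assumes an address stays reserved for the length of the visit plus one full lease.

```python
from ipaddress import IPv4Address


def pool_size(first: str, last: str) -> int:
    """Number of addresses in an inclusive DHCP range."""
    return int(IPv4Address(last)) - int(IPv4Address(first)) + 1


def addresses_in_use(devices_per_hour: float,
                     visit_minutes: float,
                     lease_minutes: float) -> float:
    """Rough steady-state estimate of addresses tied up at any one time.

    Assumes every visitor is a new device and that an address stays
    reserved for the visit plus one full lease after the device leaves.
    """
    return devices_per_hour * (visit_minutes + lease_minutes) / 60


# Values from the story.
original_pool = pool_size("192.168.1.100", "192.168.1.150")  # 51 addresses
new_pool = 240
eight_days = 8 * 24 * 60   # original lease time, in minutes
ten_minutes = 10           # new lease time, in minutes

# Assumed values, purely for illustration.
devices_per_hour = 10
visit_minutes = 30

before = addresses_in_use(devices_per_hour, visit_minutes, eight_days)
after = addresses_in_use(devices_per_hour, visit_minutes, ten_minutes)

print(f"Original pool: {original_pool} addresses, estimated demand: {before:.0f}")
print(f"New pool: {new_pool} addresses, estimated demand: {after:.1f}")
```

Under these assumed numbers, the estimated demand with an 8-day lease is far larger than the 51-address pool, while a 10-minute lease keeps it to a handful of addresses. The exact figures matter much less than the order of magnitude.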
Story 3.
A small branch office of about 10 users reported that the WiFi was down for everyone.
This ticket had the support staff baffled. They could see from the monitoring system that the whole site was offline. They had the staff reboot the router and switch, which did not resolve anything. Concluding that the fault lay with the ISP, they started the process of raising a fault with them. It was only a few hours later, when the ticket was escalated, that I got involved and asked the staff if there were any lights flashing on the router. It turned out that there were no lights on the router at all, and the staff in this office had been sitting in the dark for the last couple of hours because there was a power cut! Unfortunately, this is not the only incident I have seen of a user not making the link that the WiFi also needs power, and not thinking it was important to report that the power was off.
Conclusion.
Ask questions, never assume anything, and do not jump to conclusions when diagnosing WiFi issues.