By: Michael Terranova
In today’s world, it is unreasonable to expect that any device you own is not talking to someone back home. It may be reporting what it’s doing, checking for new updates, or maybe even telling its home what else is happening on your network. What might we see if we were to dive into our network? Well, that’s where a handy tool called “Pi-Hole” comes in. This service requires an internet connection, power, a Raspberry Pi, and about 10 minutes of setup (to get the basics running). Once the basics are done, you have a nice interface that can be used to do analytics on your network, telling you which devices are most talkative and where they call to. By default, 99,794 domains/hosts are blocked; I checked that these were only advertising/malicious hosts before installing. I kept these defaults in to see if they would immediately interfere with any of the devices on my network. All devices kept working with no issues, and none of their DNS callouts/replies were blocked, so I felt this was a safe starting point. At this point in the article I should make it clear: I am not using Pi-Hole for its primary purpose of ad blocking. I am using it to prevent companies and devices from calling home with anything that is not vital to them working. The ad-blocking functionality is merely a bonus to me. By following this article, and with some patience, you can reduce the amount of corporate intelligence gathering within your home!
Stuffing Your Pi-Hole:
So, we know that we want to hide more of our information in this world, but where do we start? Well, assuming your Pi-Hole is all set to go, it really isn’t too hard. Let it sit for a day or so to get a more accurate idea of what’s going on in the network. Once the day goes by, open up your web console and scroll down to “Top Permitted Domains”. Give that a good look over and find the more talkative domains. If it is something you expect to be there, such as “reddit.com”, keep going. For me, I found something called “google-analytics.com”.
Upon visiting the website and seeing its giant greeting, “Get to know your customers”, I decided this was a great candidate to be my first block for Pi-Hole! Surprisingly, it was as simple as pressing the Blacklist button at the right of the query. After that, head over to your logs, search for “google”, and see what else comes up.
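The two steps so far, ranking the most talkative domains and searching the logs for a keyword, can also be sketched outside the web console. The snippet below is only an illustration: it assumes dnsmasq-style query lines of the form "query[A] domain from client", and the sample lines, client addresses, and the "tracker.google.example" name are made up for the example rather than taken from my network.

```python
import re
from collections import Counter

# Assumed line shape: "<timestamp> dnsmasq[pid]: query[TYPE] <domain> from <client>"
QUERY_RE = re.compile(
    r"query\[(?P<qtype>[^\]]+)\]\s+(?P<domain>\S+)\s+from\s+(?P<client>\S+)"
)


def count_queries(lines, keyword=None):
    """Count queries per domain, optionally keeping only domains containing keyword."""
    counts = Counter()
    for line in lines:
        match = QUERY_RE.search(line)
        if match is None:
            continue  # not a query line (e.g. "forwarded" or "reply")
        domain = match.group("domain").lower()
        if keyword is None or keyword.lower() in domain:
            counts[domain] += 1
    return counts


# Hypothetical sample data, only here to exercise the function.
SAMPLE_LOG = [
    "Jan  4 12:00:01 dnsmasq[512]: query[A] google-analytics.com from 192.168.1.20",
    "Jan  4 12:00:01 dnsmasq[512]: forwarded google-analytics.com to 1.1.1.1",
    "Jan  4 12:00:02 dnsmasq[512]: query[A] reddit.com from 192.168.1.21",
    "Jan  4 12:00:03 dnsmasq[512]: query[AAAA] google-analytics.com from 192.168.1.20",
    "Jan  4 12:00:04 dnsmasq[512]: query[A] tracker.google.example from 192.168.1.22",
    "Jan  4 12:00:05 dnsmasq[512]: query[A] reddit.com from 192.168.1.21",
    "Jan  4 12:00:06 dnsmasq[512]: query[A] google-analytics.com from 192.168.1.23",
]

if __name__ == "__main__":
    print("Most talkative:", count_queries(SAMPLE_LOG).most_common(3))
    print("Matching 'google':", dict(count_queries(SAMPLE_LOG, keyword="google")))
```

In practice you would pass in the lines of your own query log instead of SAMPLE_LOG. The log location and exact format depend on your Pi-Hole version, so treat the regular expression as a starting point.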
It appears there are at least two more domains collecting analytics for Google. Let’s go ahead and Blacklist these too. You can repeat this method until you are comfortable with the data leaving your network. I added about 15 more domains that I realized were sending my data, some of which will be seen in the next section. The main reason these domains were easy to find is that they were talkative and quickly showed up in the “Top Permitted Domains” list within Pi-Hole. After a few Google searches and visiting/curling the websites, I was able to see what data was going to them and what they may be doing with it. Once I verified they were doing nothing but analytics and information collecting, I added them to the blacklist.
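To make the idea of a blacklist concrete, here is a minimal sketch of how a DNS blocker might decide whether a queried name is on the list. This is not Pi-Hole’s actual implementation. The is_blocked function and its include_subdomains switch are hypothetical names for illustration, and whether subdomains are covered on a real Pi-Hole depends on how the entry was added.

```python
def is_blocked(domain, blacklist, include_subdomains=True):
    """Return True if domain (or, optionally, one of its parent domains) is blacklisted."""
    # Normalize: DNS names are case-insensitive and may end with a trailing dot.
    name = domain.lower().rstrip(".")
    if name in blacklist:
        return True
    if include_subdomains:
        labels = name.split(".")
        # Check each parent domain: a.b.c -> b.c -> c
        for i in range(1, len(labels)):
            if ".".join(labels[i:]) in blacklist:
                return True
    return False


# Two domains mentioned in this article, used as example entries.
BLACKLIST = {"google-analytics.com", "graph.oculus.com"}

if __name__ == "__main__":
    for name in ("google-analytics.com", "ssl.google-analytics.com", "reddit.com"):
        print(name, "->", "blocked" if is_blocked(name, BLACKLIST) else "allowed")
```

Matching on whole labels rather than raw string suffixes matters here: a name such as “notgoogle-analytics.com” ends with a blacklisted string but is a different domain, and should not be blocked.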
Analyzing the Pi-Hole:
After letting this run for the weekend, the results were honestly surprising to see. The top six domains calling out from my network are shown. The top one, with a whopping 7k callouts over 48 hours, is Nvidia’s event logger, which is their telemetry solution for gathering information about their users. The next two were my Wyze camera’s way of calling out to a Chinese server. I couldn’t find out online why it was doing this, so I opted to stop it for peace of mind. Next up is Reddit’s analytics page, which watches how long you stayed on a page, how far you scrolled, and what you upvoted/downvoted while you were there. Rapidvideo comes in 5th, but this is just a malware-serving video platform. That finally brings us to 6th place with Crashlytics, a mobile application analytics provider, which provides info on who you are, what phone you have, how long you played a game, and how long you watched an ad, to name a few things.
Over a 48-hour period, from roughly 12am Friday morning to 12am Sunday morning, we can observe how many DNS queries were blocked compared with how many were made in total. You can see where I began to actively use my devices, at roughly noon on Saturday, and committed the changes I had made to the Blacklist. There was a major spike in the number of blocked queries compared to the previous day. From this observation, we can conclude that many more queries are being blocked by my Pi-Hole and that my data is already safer than it was. I was surprised that the few changes I had made resulted in such a large increase in blocked queries. Of course, there is room for improving my block list, but this is a good start for my purposes and for where I intended to take this project.
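The blocked-versus-total comparison described above boils down to grouping queries by hour and computing the share that was blocked. Here is a minimal sketch of that arithmetic, assuming each query has already been reduced to a (timestamp, was_blocked) pair. The function name and the sample records are hypothetical and are not my real numbers.

```python
from collections import defaultdict
from datetime import datetime


def blocked_share_by_hour(records):
    """Group (timestamp, was_blocked) pairs by hour.

    Returns {hour_start: (total_queries, blocked_queries, blocked_percent)}.
    """
    totals = defaultdict(lambda: [0, 0])  # hour -> [total, blocked]
    for timestamp, was_blocked in records:
        hour = timestamp.replace(minute=0, second=0, microsecond=0)
        totals[hour][0] += 1
        if was_blocked:
            totals[hour][1] += 1
    return {
        hour: (total, blocked, 100.0 * blocked / total)
        for hour, (total, blocked) in sorted(totals.items())
    }


# Made-up records: nothing blocked before noon, most queries blocked after.
SAMPLE_RECORDS = [
    (datetime(2020, 1, 4, 11, 5), False),
    (datetime(2020, 1, 4, 11, 40), False),
    (datetime(2020, 1, 4, 12, 1), True),
    (datetime(2020, 1, 4, 12, 2), False),
    (datetime(2020, 1, 4, 12, 30), True),
    (datetime(2020, 1, 4, 12, 59), True),
]

if __name__ == "__main__":
    for hour, (total, blocked, pct) in blocked_share_by_hour(SAMPLE_RECORDS).items():
        print(f"{hour:%a %H:%M}  total={total}  blocked={blocked}  ({pct:.0f}%)")
```

A jump in the blocked percentage from one hour to the next, like the one in the sample data, is the kind of spike I saw after committing my Blacklist changes.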
What Else Could’ve Been Done?
I initially wanted to take this experiment further than this. My plan was to write a Python script that would wait for a query requesting some info to come into my network, then send my own reply with incorrect information. Not only would this have given more privacy to me and the people on my network, it would also have skewed any results already collected from me and allowed me to control what information companies see. I started on this idea by researching how companies send the data they collect on me. During my analysis, I noticed another domain that would be a potential test for my idea, graph.oculus.com. Once I decided on this test case, I went off and tried to research how Oculus specifically collects info. During my research, I found a Reddit post by someone who had exactly the same question. This person reverse engineered Oculus’s software and found exactly what information is collected and how it is sent. In short, they are collecting: what applications I’m running, if I ever went into the transaction menu, how long I was there for, if I bought anything, or if I cancelled an order. Most importantly, he found something I hadn’t thought about. The information is all sent as encrypted JSON over HTTPS. That means I am unlikely to be able to intercept the information, change it, and send it on with my own values. Once I began looking in a Wireshark capture at the information I would change, I noticed it is typical for companies to send this data over secure connections, which, in all honesty, makes a lot of sense.
I made one last attempt and reached out to the creators of Pi-Hole, wondering if they had any such functionality within the application, or even any ideas they could give me. Unfortunately, the two that responded told me that Pi-Hole had no capabilities of the sort, and that I likely would not be able to do what I was looking for without a method of grabbing the certificate, decrypting the traffic, changing the values, re-encrypting, and then sending it off. Alternatively, I could decompile the program and change it entirely to send my own custom replies, but that is far beyond my knowledge and I did not have enough time to learn how to do it. Had I had more time available, I would have liked to attempt this method and learn how to decompile a program and change it to my specifications.
Corporate tracking has become a normal part of almost everyone’s life in our current world. My Pi-Hole project revealed that a shocking amount of information is being collected from within my network. When I think of things that would be giving information to their company, my smart TV and my computer come to mind, but not my lightbulbs, or my security camera, or even my Oculus. It was kind of scary once I saw how often Wyze would phone home, so much so that I disabled any of its communications outside of my network. I learned much about how companies get this information sent to them, and just how much information they are collecting. Even the simplest of things seemed to be sending information to some sort of analytics page, and it is nice to know that I am protecting my data more now than before the experiment. I wish I had been able to take my project where I wanted to within this time, but I hope to continue working on it when I have more free time in the future. My next steps would be to learn how to decompile the program, change it to send what I want, and then re-compile it. It will be interesting to observe how corporate tracking evolves now that Pi-Hole and other domain blocking services are becoming more easily available. Even if this isn’t the main intent of these services, it is possible to prevent tracking using them. It will only be a matter of time before companies find a way around it and continue to mine our data.