Who's collecting analytics data from mobile apps?

I have the absolute pleasure and honour of being one of the beta testers of the GuardianApp. In an upcoming post I'll explain in more detail how it helps protect your privacy. For now, tl;dr: the GuardianApp creates a VPN connection between your device and their servers, and their algorithm blocks outgoing analytics calls that carry sensitive data. After a few weeks of using it every day, and even though I don't use many different apps, I was surprised by the number of requests it was blocking. The team behind the GuardianApp has made an amazing effort not to collect data themselves, and because of that the app doesn't show which app is making which request, though they are actively searching for a solution. My curiosity fired up, and I wanted to know which apps were making which requests.

The GuardianApp team also published an article showing how analytics companies were (are?) selling your information, such as geolocation, device type, battery level, accelerometer info and more.

Most of the apps I use fall under two categories: social media and food delivery (yes, lazy 😬). None of the social media apps I use have 3rd party analytics; they all use their own analytics libraries. That left me with a pretty clear picture: most of the requests being blocked by Guardian come from food delivery apps. To test this I force-closed all apps on my device (yeh yeh, even though iOS halts processes in the background), then opened the Grubhub app. Just by launching it, I saw 5 analytics requests to 4 different analytics companies being blocked by Guardian.

I went ahead and downloaded the top 10 free apps from the Food & Drink category in the US App Store. This is what I found:

Apps tested: DoorDash, UberEATS, GrubHub, Postmates, Starbucks, McDonalds, Chick-fil-A, Dominos Pizza, Chipotle, Dunkin Donuts

Analytics library      Apps using it (of 10)
New Relic              5
Button                 2
Facebook               6
Segment                1
Adjust                 1
App Measurement        4
Mobile App Tracking    2
App Boy                2
Taplytics              2
Perimeterx             1
In Auth                1
MParticle              4
Apptimize              1
Apptentive             3
Urban Airship          2
Kochava                2
Hockeyapp              1
Apps Flyer             2
App Dynamics           1
Flurry                 1

Maybe this is easier to visualize:

As you can see, the top 3 analytics companies are:

  • Facebook: Present in 6 of the top 10 free Food & Drink apps.
  • New Relic: Present in 5 of the top 10 free Food & Drink apps.
  • MParticle & App Measurement (tied): Present in 4 of the top 10 free Food & Drink apps.
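
The ranking above is just a tally of the table. As a quick sanity check, here is a small Python sketch that reproduces it. The counts are transcribed from the table above; the variable names are only illustrative.

```python
# Number of apps (out of the 10 tested) in which each library was found,
# transcribed from the table above.
apps_per_library = {
    "New Relic": 5,
    "Button": 2,
    "Facebook": 6,
    "Segment": 1,
    "Adjust": 1,
    "App Measurement": 4,
    "Mobile App Tracking": 2,
    "App Boy": 2,
    "Taplytics": 2,
    "Perimeterx": 1,
    "In Auth": 1,
    "MParticle": 4,
    "Apptimize": 1,
    "Apptentive": 3,
    "Urban Airship": 2,
    "Kochava": 2,
    "Hockeyapp": 1,
    "Apps Flyer": 2,
    "App Dynamics": 1,
    "Flurry": 1,
}

# Sort by count, highest first. sorted() is stable, so libraries with the
# same count keep the order they have in the table.
ranking = sorted(apps_per_library.items(), key=lambda item: item[1], reverse=True)

# Print the top entries (four, because third place is a tie).
for library, count in ranking[:4]:
    print(f"{library}: {count} of 10 apps")
```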

These are the hosts each analytics library sends and receives data from:

Analytics library      Host
New Relic              mobile-collector.newrelic.com
Button                 api.usebutton.com
Facebook               graph.facebook.com
Segment                cdn-settings.segment.com
Adjust                 app.adjust.com
App Measurement        app-measurement.com
Mobile App Tracking    [number].engine.mobileapptracking.com
App Boy                dev.appboy.com
Taplytics              api.taplytics.com
Perimeterx             px-conf.perimeterx.net
In Auth                risk-api.inauth.com
MParticle              config2.mparticle.com
Apptimize              md-i-c.apptimize.com
Apptentive             api.apptentive.com
Kochava                kvinit-prod.api.kochava.com
Hockeyapp              gate.hockeyapp.net
Apps Flyer             t.appsflyer.com
Urban Airship          device-api.urbanairship.com
App Dynamics           mobile.eum-appdynamics.com
Flurry                 data.flurry.com
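
To make the list above a bit more concrete, here is a minimal Python sketch that maps a hostname to the analytics library it belongs to. The hosts are copied from the list; the `classify_host` helper and the regular expression standing in for the `[number]` placeholder are assumptions made for illustration. This is not how GuardianApp filters traffic, just a simple exact-match lookup.

```python
import re
from typing import Optional

# Hosts copied from the list above (library name -> host).
ANALYTICS_HOSTS = {
    "New Relic": "mobile-collector.newrelic.com",
    "Button": "api.usebutton.com",
    "Facebook": "graph.facebook.com",
    "Segment": "cdn-settings.segment.com",
    "Adjust": "app.adjust.com",
    "App Measurement": "app-measurement.com",
    "App Boy": "dev.appboy.com",
    "Taplytics": "api.taplytics.com",
    "Perimeterx": "px-conf.perimeterx.net",
    "In Auth": "risk-api.inauth.com",
    "MParticle": "config2.mparticle.com",
    "Apptimize": "md-i-c.apptimize.com",
    "Apptentive": "api.apptentive.com",
    "Kochava": "kvinit-prod.api.kochava.com",
    "Hockeyapp": "gate.hockeyapp.net",
    "Apps Flyer": "t.appsflyer.com",
    "Urban Airship": "device-api.urbanairship.com",
    "App Dynamics": "mobile.eum-appdynamics.com",
    "Flurry": "data.flurry.com",
}

# Mobile App Tracking is listed as "[number].engine.mobileapptracking.com".
# Assumption: "[number]" stands for one or more digits.
MAT_PATTERN = re.compile(r"^\d+\.engine\.mobileapptracking\.com$")

# Reverse lookup: host -> library name.
HOST_TO_LIBRARY = {host: library for library, host in ANALYTICS_HOSTS.items()}


def classify_host(host: str) -> Optional[str]:
    """Return the analytics library a host belongs to, or None if unknown."""
    # Normalize case and drop a trailing dot from fully qualified names.
    host = host.strip().lower().rstrip(".")
    if host in HOST_TO_LIBRARY:
        return HOST_TO_LIBRARY[host]
    if MAT_PATTERN.match(host):
        return "Mobile App Tracking"
    return None


if __name__ == "__main__":
    for name in ("app.adjust.com", "123.engine.mobileapptracking.com", "example.com"):
        print(name, "->", classify_host(name))
```

A lookup like this only sees the hostname, so it has no way to tell an analytics call to graph.facebook.com apart from a regular content request to the same host.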

I've never seen GuardianApp block any Facebook analytics calls, but if you look at the host, you can see it's the same one used for regular requests for social media content. Blocking this kind of request will probably be very hard, but I trust the Sudo Security Group (the team behind GuardianApp) will be working hard to figure out a way, and if one exists they'll find it.

I don't know how many times all these apps have been downloaded, but it's probably on the order of tens or even hundreds of millions. Even more worrisome is that we use them very often, especially on weekends and holidays. This gives these companies (indirectly) insight into where people go on vacation during the holidays, to name one example.

Another takeaway is that within just 10 apps there were 20 different analytics companies. There are a lot of people interested in tracking all this data, and it's nearly impossible for an individual to keep up with all of the existing players as well as new ones. This is why I love what GuardianApp is doing: they'll keep track of all these companies and the new ones, and will block as many requests as they can.

For me this research was an eye-opener. It really sparked my interest in understanding how this space works, and I want to learn as much as I can about all these analytics companies and their libraries. In my next post I'll dig a bit deeper and try to figure out what data is actually being sent from the device to all these analytics endpoints. Some of these requests send the payload encrypted and other libraries use certificate pinning, so it will take me some time to gather all the information and expose the raw data. Stay tuned.

Photo by Luke Chesser on Unsplash