Photo by Štefan Štefančík on Unsplash
So far I have samples that cover the following data sources:
- Google Analytics - building on my previous learning, I have added an analysis of the browsers used by site visitors and an example Shiny app for data exploration.
- InfluxDB - Working with time series data generated by the product. The two examples here are API response times and feature usage. This includes an example of manipulating time series data to set missing values to 0 for plotting.
- UptimeRobot - Simple example of taking error data and using a pivot table to explore the data, after some cleaning and filtering. This kind of workflow can be useful with large data sets.
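To make the zero-filling step from the InfluxDB example a bit more concrete, here is a minimal sketch in base R. The data frame, column names and values are invented for illustration and are not taken from the scripts themselves; the idea is simply to build the full sequence of dates, join the observed data onto it, and replace the resulting gaps with 0.

```r
# Hypothetical feature usage data with gaps: no rows for 3 and 4 January.
usage <- data.frame(
  day    = as.Date(c("2019-01-01", "2019-01-02", "2019-01-05")),
  events = c(12, 7, 3)
)

# Build every day in the observed range, whether or not data exists for it.
all_days <- data.frame(
  day = seq(min(usage$day), max(usage$day), by = "day")
)

# Left join: keep every day, leaving NA where there was no measurement.
filled <- merge(all_days, usage, by = "day", all.x = TRUE)

# Treat "no measurement" as zero usage so the plot shows a drop to 0
# instead of a line drawn straight across the gap.
filled$events[is.na(filled$events)] <- 0

print(filled)
```

Whether a gap really means zero depends on the metric. For feature usage it probably does; for API response times a missing value more likely means "no requests", so leaving it as NA may be the more honest choice.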
One thing that I really wanted to get working was Elasticsearch, to get an example analysis of components and logging output, including relative error rates. Sadly, the only R wrappers available at the moment support versions up to just before the one we use. Elasticsearch is also the first NoSQL system I have used, so it takes a while to get your head around how everything hangs together. Already having example visualisations in Kibana was helpful in working it out, though.
Another thing on my to-do list is getting standard pirate metrics out of Google Analytics and then formatting them into a nice report. R Markdown comes with a range of templates that you can use, for example one based on the Bulma framework, or you can create your own with your own branding.
This exercise has been really helpful in getting disparate sources of information into a standard reporting framework. Previously I would have tackled this by building some kind of "data warehouse", but as long as I can query each source well enough to get data in a format that can be manipulated into a report, a warehouse feels like an extra step, especially where you don't need to drill down further in the report. If something needs more investigation, I'll likely go to the source anyway.
I hope that someone out there finds some value in my R scripts. Please feel free to fork them and send me pull requests with any improvements! I am also always happy to talk to data-driven members of the Product Management community who might be interested in using R.