5. The Stats Page

Building on my most popular page.

Written by

Sam Larsen-Disney

Software Engineer at Amplitude

I nerded out in this post. It's longer than most of my posts and features rather a lot of code examples. Hope that's cool.

Did you know that the UK government website has a secret stats page for each of its site pages? If you put /info before a route, it will show you the stats page for any given page. As an example, if we wanted to look at the stats for the government's page on universal credit, we could type https://www.gov.uk/info/universal-credit. You can read more about this here.

I have always admired the UX of these stats pages. They seem to convey a large amount of data without it being overwhelming to the user. This is what I wanted for my stats page and was the source of much inspiration when building it.

Data Collection

My site collects data in a variety of ways: site tracking, a database, code statistics, real-time connections and data from external sources. But we can't pull from any of them until they have been set up to collect data.

Site Tracking

Site tracking is a powerful tool. But I think if you're going to track your users, you should only take the minimum you need to make informed decisions. Previous versions of this site used Google Analytics to track users. I have since switched to Fathom.

It should be noted that there are plenty of alternatives to Fathom. As it is the tool I use, some of this article is geared towards it.

Adding Fathom Analytics

Like everything I seem to want to add to my site - "there's a plugin for that". I added gatsby-plugin-fathom by following these steps:

  1. Sign up to Fathom and acquire a Site ID.

  2. Install the plugin:

    npm install gatsby-plugin-fathom

  3. Add it to your Gatsby config, being sure to set your Site ID as a plugin option:

    gatsby-config.js

    module.exports = {
      plugins: [
        {
          resolve: `gatsby-plugin-fathom`,
          options: {
            siteId: "Your-Site-ID",
          },
        },
      ],
    };

If you start your local development server, you won't notice a difference. The plugin is disabled in development, which ensures that you don't accidentally track yourself while working away on your site. But if you want to test it out you can always run gatsby build && gatsby serve.

This is all you need to do to get Fathom tracking working. You should start to see page views being collected.

Database

I use a database to track how many reactions each of my articles has received, as well as for polls. In fact, another part of this case study covers how I added page reactions, which you can read if you want to find out more about the implementation in detail.

Firebase Implementation Overview

In previous iterations I bundled Firebase in with Gatsby to use it within my front-end, but I have since moved all Firebase interaction to Gatsby Functions that can be called from anywhere. This makes my bundle size smaller and makes it easier to manage.
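As a rough illustration of that shape, here is a minimal sketch of a Gatsby Function that returns reactions for an article. It is not my actual implementation: the file path, the slug query parameter and the fetchReactions helper are all assumptions made for the example, and the Firebase call is injected so the sketch stays self-contained.

```javascript
// Hypothetical file: src/api/reactions.js
// fetchReactions is a hypothetical data-access function. In a real project it
// would wrap the Firebase call; it is injected here to keep the sketch small.
function createReactionsHandler(fetchReactions) {
  // Gatsby Functions receive Express-like req and res objects.
  return async function handler(req, res) {
    const slug = req.query.slug; // assumed query parameter
    if (!slug) {
      return res.status(400).json({ error: "Missing slug" });
    }
    try {
      const reactions = await fetchReactions(slug);
      return res.status(200).json({ slug, reactions });
    } catch (err) {
      return res.status(500).json({ error: "Could not load reactions" });
    }
  };
}

// In a Gatsby project the handler would be the file's default export, e.g.
// export default createReactionsHandler(fetchReactionsFromFirebase);
```

The front-end then only needs a plain fetch call to the function's URL, so no Firebase code ends up in the browser bundle.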

If you would prefer to use Firebase from within Gatsby itself you can check out gatsby-plugin-firebase. There's also a great article by Kyle Shelvin that has a very solid introduction to using it. I used that in combination with react-firebase-hooks to read from the database.

Real-time Connections

I use these connections to present without sharing my screen, but they also work as a great way to retrieve the number of active users on the site. Socket.IO is my go-to solution for this kind of thing.

Implementing Socket.IO

To use sockets in the project we need to create a server to communicate between connected clients. You can get started quickly by throwing the following on a Node server (probably best to keep this separate from your Gatsby project):

server.js

const express = require("express");
const PORT = process.env.PORT || 3000;
const server = express().listen(PORT, () =>
  console.log(`Listening on ${PORT}`)
);
const io = require("socket.io")(server);

io.on("connection", function (socket) {
  console.log("socket connected: " + socket.id);
  // On socket connection, emit the new count to everyone.
  io.emit("userCount", {
    data: io.engine.clientsCount,
  });
  socket.on("disconnect", function () {
    // On socket disconnect, emit the new count to everyone.
    io.emit("userCount", {
      data: io.engine.clientsCount,
    });
  });
});

I use React context to retrieve data from the socket connection. This is a great way to pass data between components. The magic really all happens in these two useEffect hooks:

// Use a useEffect to connect to the socket on mount.
useEffect(() => {
  setSocket(io("ws://localhost:3000"));
}, []);

useEffect(() => {
  if (socket) {
    // On connect and disconnect, update state to reflect the connection status.
    socket.on("connect", () => {
      setConnected(true);
    });
    socket.on("disconnect", () => {
      setConnected(false);
    });
    // On userCount, update the count in state.
    socket.on("userCount", ({ data: count }) => {
      setCount(count);
    });
  }
}, [socket]);
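One thing the second effect does not do is remove its listeners again. A minimal sketch of how that could look is below; subscribeToSocket is a hypothetical helper name, and setConnected and setCount stand in for the state setters used above. It relies only on the socket's on and off methods.

```javascript
// Registers the three listeners and returns a function that removes them,
// which is the shape useEffect expects from a cleanup function.
// subscribeToSocket is a hypothetical helper, not part of Socket.IO.
function subscribeToSocket(socket, { setConnected, setCount }) {
  const onConnect = () => setConnected(true);
  const onDisconnect = () => setConnected(false);
  const onUserCount = ({ data }) => setCount(data);

  socket.on("connect", onConnect);
  socket.on("disconnect", onDisconnect);
  socket.on("userCount", onUserCount);

  return () => {
    socket.off("connect", onConnect);
    socket.off("disconnect", onDisconnect);
    socket.off("userCount", onUserCount);
  };
}

// Inside the component this could be used as:
// useEffect(() => {
//   if (socket) return subscribeToSocket(socket, { setConnected, setCount });
// }, [socket]);
```

Returning the cleanup function means the listeners are not registered twice if the socket in state is ever replaced.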

I usually wrap the root element in the context.

This wrapper needs to wrap our site content in a Redux provider.

src/state/wrapWithProvider.js

import React from "react";
import { Provider } from "react-redux";
import createStore from "./createStore";

export default ({ element }) => {
  return <Provider store={createStore}>{element}</Provider>;
};

In our store we add methods to handle any incoming actions from the server.

src/state/createStore.js

import { createStore, applyMiddleware } from "redux";
import createSocketIoMiddleware from "redux-socket.io";
import io from "socket.io-client";

let socket = io("http://your-socket-server-here.com");
// Actions whose type starts with "server/" are sent to the server. By default
// redux-socket.io dispatches whatever the server emits on its "action" event,
// so the server needs to emit { type: "userCount", data } on that event.
let socketIoMiddleware = createSocketIoMiddleware(socket, "server/");

function reducer(
  state = {
    count: 0,
  },
  action
) {
  switch (action.type) {
    case "userCount":
      return { ...state, count: action.data };
    default:
      return state;
  }
}

let store = applyMiddleware(socketIoMiddleware)(createStore)(reducer);

store.subscribe(() => {
  console.log("new client state", store.getState());
});

export default store;

Now if we connect any component using Redux we will have access to these values:

import React from "react";
import { connect } from "react-redux";

const Stats = ({ count }) => {
  return (
    <div>
      <h1>There are {count} users online!</h1>
    </div>
  );
};

const mapStateToProps = ({ count }) => {
  return { count };
};

const ConnectedStats =
  typeof window !== `undefined` ? connect(mapStateToProps, null)(Stats) : Stats;

export default ConnectedStats;

Code Stats

To collect code statistics, I use cloc. Cloc counts blank lines, comment lines, and physical lines of source code in many programming languages including JS! I am always impressed by how quickly it runs. I have created a custom script in my package.json that runs the tool.

{
  "scripts": {
    "count_totals": "cloc --exclude-ext=svg --exclude-dir=animation-data --json src/ MDX/ > data/stats/count_total.json"
  }
}

I've included quite a few flags in the command so let's run through them:

  • --exclude-ext=svg ignores files with the .svg extension.
  • --exclude-dir=animation-data ignores the folder that contains my animation data.
  • --json produces the output in JSON.
  • src/ MDX/ looks at code only in my src and MDX directories.
  • > data/stats/count_total.json writes the output of the run to a file at the specified path.
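To give a feel for what ends up in that file: cloc's JSON output is an object keyed by language, alongside a header entry and a SUM entry holding the totals. The sketch below shows one way such a report could be boiled down; summariseCloc is a hypothetical helper and the numbers in the usage example are made up.

```javascript
// Turns a cloc --json report into totals plus a per-language list,
// sorted by lines of code. summariseCloc is a hypothetical helper name.
function summariseCloc(report) {
  // "header" holds run metadata and "SUM" the totals; every other key is a language.
  const { header, SUM, ...languages } = report;
  const byLanguage = Object.entries(languages)
    .map(([language, counts]) => ({
      language,
      files: counts.nFiles,
      code: counts.code,
    }))
    .sort((a, b) => b.code - a.code);
  return { totalCode: SUM.code, totalFiles: SUM.nFiles, byLanguage };
}

// Usage with made-up numbers:
const exampleReport = {
  header: { n_files: 3, n_lines: 190 },
  CSS: { nFiles: 1, blank: 2, comment: 0, code: 40 },
  JavaScript: { nFiles: 2, blank: 10, comment: 8, code: 130 },
  SUM: { blank: 12, comment: 8, code: 170, nFiles: 3 },
};
console.log(summariseCloc(exampleReport));
```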

I added gatsby-transformer-json to ingest this data and sourced the file in my gatsby-config.js:

module.exports = {
  plugins: [
    `gatsby-transformer-json`,
    {
      resolve: `gatsby-source-filesystem`,
      options: {
        name: `stats`,
        path: `${__dirname}/data/stats/count_total.json`,
      },
    },
  ],
};

External Sources

I also use data that is collected for me by other services connected to this site, most notably GitHub. These require no setup within the code.

Using Collected Data

Source Plugins At Build

I have created a couple of local plugins that I use to source data at build time. This data is then available in my GraphQL layer to use on the page. It's important to note that data pulled into the site in this way is not updated in real time. It only updates when the site is built and deployed. To save myself from manually building it every time, I have a cron job set up on a Node server that pings Gatsby Cloud at 9PM UK time:

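const CronJob = require("cron").CronJob;
// node-fetch v2; on Node 18+ the global fetch can be used instead.
const fetch = require("node-fetch");

const GatsbyWebHook = "your-gatsby-cloud-webhook-url";

// Fires at 21:00:00 every day in the Europe/London time zone.
const job = new CronJob(
  "0 00 21 * * *",
  function () {
    fetch(GatsbyWebHook, { method: "POST", body: "a=1" }).then(() =>
      console.log("Pinged Gatsby")
    );
  },
  null, // no onComplete callback
  true, // start the job straight away
  "Europe/London"
);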

There's a great cheat sheet for cron that I would totally recommend using if you're setting this up.
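One thing that can trip people up is that the pattern used with the cron package here has six fields rather than the usual five, with seconds first. The small sketch below labels each field of a six-field pattern; describeCronPattern is a hypothetical helper for illustration only and does not validate the values.

```javascript
// Field order for a six-field pattern, seconds first.
const CRON_FIELDS = ["second", "minute", "hour", "dayOfMonth", "month", "dayOfWeek"];

// Splits a six-field pattern into labelled parts (hypothetical helper).
function describeCronPattern(pattern) {
  const parts = pattern.trim().split(/\s+/);
  if (parts.length !== CRON_FIELDS.length) {
    throw new Error(`Expected ${CRON_FIELDS.length} fields, got ${parts.length}`);
  }
  return Object.fromEntries(CRON_FIELDS.map((name, i) => [name, parts[i]]));
}

// The pattern from the job above: second 0, minute 00, hour 21, every day.
console.log(describeCronPattern("0 00 21 * * *"));
```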

GitHub

GitHub has an awesome API you can use to grab your GitHub stats and repos. Their most recent version is queried via GraphQL, which should be familiar to Gatsby users. Convenient! You can discover the data you can retrieve using the API explorer (https://developer.github.com/v4/explorer/).

In my gatsby-source-github-profile plugin, I fetch repository information via this API and add it to my GraphQL layer using the createNode action. An example fetching just the forks and stars of a repo can be found below:

const fetch = require("node-fetch");
const crypto = require("crypto");

exports.sourceNodes = async ({ actions }, configOptions) => {
  const { createNode } = actions;
  const headers = { Authorization: `bearer ${configOptions.token}` };
  const body = {
    query: `query {
      user(login: "${configOptions.username}") {
        name
        repository(name: "personal-site") {
          id
          createdAt
          url
          forkCount
          stargazers {
            totalCount
          }
        }
      }
    }`,
  };

  const response = await fetch("https://api.github.com/graphql", {
    method: "POST",
    body: JSON.stringify(body),
    headers: headers,
  });
  const data = await response.json();
  const { forkCount } = data.data.user.repository;
  const stars = data.data.user.repository.stargazers.totalCount;

  createNode({
    forks: Number(forkCount),
    stars: Number(stars),
    id: "Github-Profile",
    internal: {
      type: `GitHubProfile`,
      mediaType: `text/plain`,
      contentDigest: crypto
        .createHash(`md5`)
        .update(JSON.stringify({ stars, forkCount }))
        .digest(`hex`),
      description: `Github Profile Information`,
    },
  });
};
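The example above assumes the request succeeds. A GraphQL API reports problems in an errors array in the response body, so a slightly more defensive version could pull the numbers out with a helper along these lines. extractRepoStats is a hypothetical name and this is only a sketch of the idea, not what my plugin does.

```javascript
// Pulls forks and stars out of the parsed response body, failing loudly if
// the API reported errors or the repository is missing (hypothetical helper).
function extractRepoStats(payload) {
  if (payload.errors && payload.errors.length > 0) {
    throw new Error(`GitHub API error: ${payload.errors[0].message}`);
  }
  const repository =
    payload.data && payload.data.user && payload.data.user.repository;
  if (!repository) {
    throw new Error("Repository not found in response");
  }
  return {
    forks: Number(repository.forkCount),
    stars: Number(repository.stargazers.totalCount),
  };
}

// In sourceNodes this could replace the two destructuring lines:
// const { forks, stars } = extractRepoStats(await response.json());
```

Throwing here makes a bad token or a renamed repo fail the build instead of silently creating a node full of NaN values.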

Google Analytics

I take a similar approach with Google Analytics, but instead of calling the REST API directly I use the googleapis npm package. In my implementation I am using V3 of the Google Analytics API, even though V4 has been out for a while. I am used to V3 so just feel more comfortable using it, but I will update this code soon. I make multiple queries, but as an example here is my retrieval and ingestion of my site-wide views to date:

const crypto = require("crypto");
const { google } = require("googleapis");

// Excerpt from inside an async sourceNodes function: jwt, viewId, startDate
// and createNode are defined elsewhere in the plugin.
const SiteWideStats = await google.analytics("v3").data.ga.get({
  auth: jwt,
  ids: "ga:" + viewId,
  "start-date": startDate || "2009-01-01",
  "end-date": "today",
  metrics: "ga:pageviews, ga:sessions",
});

for (let [pageViews, sessions] of SiteWideStats.data.rows) {
  createNode({
    pageViews: Number(pageViews),
    sessions: Number(sessions),
    id: "All-site",
    internal: {
      type: `SiteWideStats`,
      contentDigest: crypto
        .createHash(`md5`)
        .update(JSON.stringify({ pageViews, sessions }))
        .digest(`hex`),
      mediaType: `text/plain`,
      description: `Page views & sessions for the site`,
    },
  });
}
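The per-page queries follow the same pattern. As a sketch, assuming a query that adds ga:pagePath as a dimension, each row comes back as an array of strings with the dimension first and the metrics after it. The helper below (a hypothetical name, not code from my plugin) turns such rows into objects ready to pass to createNode, using the path, totalCount and sessions field names that the footer query later in this article reads.

```javascript
// Converts rows shaped like [path, pageViews, sessions] (all strings) into
// plain objects with numeric counts. rowsToPageViews is a hypothetical helper.
function rowsToPageViews(rows) {
  // Guard against a missing rows property when a query has no results.
  return (rows || []).map(([path, pageViews, sessions]) => ({
    path,
    totalCount: Number(pageViews),
    sessions: Number(sessions),
  }));
}

// Usage with made-up rows:
console.log(
  rowsToPageViews([
    ["/", "1200", "800"],
    ["/stats", "300", "250"],
  ])
);
```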

Accessing Firebase Firestore Data

Retrieving data from the store is easy with the previously mentioned react-firebase-hooks. I do just want to note that I use the hook useDocumentOnce as opposed to useDocument. This means that it reads the data once when loading the page, instead of continually reading. This reduced the usage my database saw from thousands of reads to hundreds of reads and therefore saves me money. The only downside is that the reactions may be out of sync if someone else likes the article while you are still viewing the page.

Using this implementation I can retrieve reactions with the following code, which uses the collection equivalent of that hook, useCollectionOnce:

const [value, loading, error] = useCollectionOnce(
  firebase.firestore().collection("likes")
);

What's cool about this hook is that it gives us a loading boolean to indicate if the data is still being loaded, and any errors can be retrieved too. If both loading and error are falsy then we know we have retrieved our data and can use it in our components:

if (!loading && !error) {
  console.log(value);
}
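The value returned for a collection is a Firestore query snapshot, so the documents live in value.docs and each one exposes its fields via data(). As a sketch of what can be done with it, the helper below tallies reactions by type. Both the countReactions name and the type field are assumptions for illustration; your documents may be shaped differently.

```javascript
// Tallies documents in a query snapshot by their "type" field.
// countReactions and the "type" field are hypothetical.
function countReactions(snapshot) {
  const totals = {};
  for (const doc of snapshot.docs) {
    const { type } = doc.data();
    totals[type] = (totals[type] || 0) + 1;
  }
  return totals;
}

// Usage inside the component, once the data has loaded:
// if (!loading && !error) {
//   const totals = countReactions(value);
// }
```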

Stats Page Design

Inspired by the gov.uk website, I wanted to make my stats human-readable. To do this I opted to incorporate most of the stats into sentences.
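As a small sketch of the idea (the wording and the statsSentence helper are made up for this example rather than lifted from my page), raw numbers can be formatted and dropped into a sentence like so:

```javascript
// Formats numbers with thousands separators for readability.
const formatNumber = (n) => new Intl.NumberFormat("en-GB").format(n);

// Builds a human-readable sentence from raw stats (hypothetical helper).
function statsSentence({ pageViews, sessions }) {
  const perSession = sessions > 0 ? (pageViews / sessions).toFixed(1) : "0";
  return (
    `This site has been viewed ${formatNumber(pageViews)} times ` +
    `across ${formatNumber(sessions)} sessions, ` +
    `which is about ${perSession} pages per visit.`
  );
}

console.log(statsSentence({ pageViews: 12345, sessions: 6789 }));
```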

I also created a graph for views by day. I initially implemented it using the recharts library but found it to be a little heavyweight for my use case. I instead decided to just create it with divs:

allViewsPerDate.map((item) => {
  return (
    <div
      className="is-white-bg border-radius margin-1-r"
      style={{
        position: "relative",
        width: 100 / allViewsPerDate.length + "%",
      }}
      data-tip={`${item.node.views} views`}
    >
      <div
        className="is-pink-bg-always border-radius margin-1-r"
        style={{
          position: "absolute",
          bottom: 0,
          width: "100%",
          height: Math.floor((item.node.views / maxViews) * 100),
        }}
      />
    </div>
  );
});

You'll be able to see here that I am just placing one div on top of the other to create the bars, but I think it is quite effective.
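The snippet relies on a maxViews value that isn't shown. A minimal sketch of how it and the bar heights could be computed is below; barHeights is a hypothetical helper, and it assumes the same item.node.views shape as the snippet.

```javascript
// Scales each day's views against the busiest day, giving a height from
// 0 to 100 for every bar. barHeights is a hypothetical helper.
function barHeights(allViewsPerDate) {
  const maxViews = Math.max(0, ...allViewsPerDate.map((item) => item.node.views));
  return allViewsPerDate.map((item) =>
    // Avoid dividing by zero when there are no views at all.
    maxViews === 0 ? 0 : Math.floor((item.node.views / maxViews) * 100)
  );
}

// Usage with made-up data: the busiest day gets the full height.
console.log(
  barHeights([
    { node: { views: 50 } },
    { node: { views: 200 } },
    { node: { views: 100 } },
  ])
);
```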

Implementation

Code for the page can be found on my personal site's repo.

Looking at my retrieved Google Analytics data in my GraphQL layer, it occurred to me that I had everything I would need to put the stats that each individual page had received in its footer. I would need a fallback though. If I add a new page, it won't have any stats until the following build. In such instances I thought I would display site-wide stats.

In my footer I use a StaticQuery to retrieve this information from my GraphQL layer:

<StaticQuery
  query={graphql`
    {
      allPageViews {
        edges {
          node {
            totalCount
            path
            sessions
          }
        }
      }
      siteWideStats {
        sessions
        pageViews
      }
    }
  `}
  render={(data) => {
    const pageViews = data.allPageViews.edges.find(
      (item) => item.node.path === location.pathname
    );
    if (pageViews) {
      let node = pageViews.node;
      return (
        <p className="footer-sub-text margin-0 margin-1-t margin-2-b">
          {node.totalCount} page views | {node.sessions} sessions |{" "}
          <Link to="/stats" className="is-special-blue">
            All Stats
          </Link>{" "}
          -{" "}
          <Link to="/disclaimer" className="is-special-blue">
            Disclaimer
          </Link>
        </p>
      );
    } else {
      // render site wide stats
    }
  }}
/>
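The fallback logic itself can be pulled out into a plain function, which makes the else branch easier to reason about. This is a sketch rather than the code from my footer: pickFooterStats and the scope field are hypothetical, while the data shape matches the query above.

```javascript
// Returns the stats for the given path, or the site-wide stats when the page
// has none yet (for example, a page added since the last build).
// pickFooterStats is a hypothetical helper.
function pickFooterStats(data, pathname) {
  const match = data.allPageViews.edges.find(
    (item) => item.node.path === pathname
  );
  if (match) {
    return {
      scope: "page",
      pageViews: match.node.totalCount,
      sessions: match.node.sessions,
    };
  }
  return {
    scope: "site",
    pageViews: data.siteWideStats.pageViews,
    sessions: data.siteWideStats.sessions,
  };
}

// In the render prop this could be used as:
// const stats = pickFooterStats(data, location.pathname);
```

The render prop can then output the same markup in both cases and only vary the numbers.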

The Result

You can check out the results using the magical teleportation device (buttons) below:

If you enjoyed this deep dive why not show me by using the article reactions? You can also sign up to my newsletter using the form below this article - no pressure though!
