Aggregate with Vega-Lite

Posted on April 24, 2020
2 min read

This post is a follow-up to my Vega-Lite journey, which you can start reading from here.

I'm using the same dataset to stress-test my abilities with Vega-Lite.

Now, I want to explore the dataset using different aggregation types. As a reminder, the dataset contains information about the athletes of the Sochi Winter Olympic Games.

I want to see the age distribution. Vega-Lite lets you add a handy aggregate function within the encoding, such as:

"y": {
  "aggregate": "count",
  "field": "name",
  "type": "quantitative"
}

This way I can count the athletes of each age.
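For context, here is a minimal sketch of how that "y" encoding might sit inside a complete spec. The data URL and the "age" field name are assumptions of mine for illustration (the snippet above only confirms the "name" field), so adjust them to the real dataset.

```json
{
  "$schema": "https://vega.github.io/schema/vega-lite/v4.json",
  "description": "Sketch only: the data URL and the 'age' field name are assumed, not taken from the dataset.",
  "data": {"url": "athletes.csv"},
  "mark": "bar",
  "encoding": {
    "x": {"field": "age", "type": "ordinal"},
    "y": {
      "aggregate": "count",
      "field": "name",
      "type": "quantitative"
    }
  }
}
```

Vega-Lite groups the rows by every non-aggregated field in the encoding, so with "age" on the x axis the count is computed once per age value.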

Now, if I want to count by a different field, say sport, only the grouping field changes; the y encoding stays the same, with a title added:

"y": {
  "aggregate": "count",
  "field": "name",
  "type": "quantitative",
  "title": "Athletes"
}
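As a rough sketch, the full encoding for the per-sport count could look like the following. The "sport" field appears elsewhere in this post; the data URL is a placeholder, and the descending sort is an optional extra I added for readability.

```json
{
  "$schema": "https://vega.github.io/schema/vega-lite/v4.json",
  "description": "Sketch only: the data URL is a placeholder and the sort is an optional addition.",
  "data": {"url": "athletes.csv"},
  "mark": "bar",
  "encoding": {
    "x": {"field": "sport", "type": "nominal", "sort": "-y"},
    "y": {
      "aggregate": "count",
      "field": "name",
      "type": "quantitative",
      "title": "Athletes"
    }
  }
}
```

Setting "sort" to "-y" orders the bars by the aggregated count, from the largest to the smallest.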

Now I can add a further dimension by encoding another field with color. With the countries on the x axis, for example, color can encode the sport:

"color": {
  "field": "sport",
  "type": "nominal"
}
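To show where the color channel goes, here is a hedged sketch of a complete spec. The "country" field name and the data URL are assumptions; only "sport" and "name" come from the snippets in this post.

```json
{
  "$schema": "https://vega.github.io/schema/vega-lite/v4.json",
  "description": "Sketch only: 'country' and the data URL are assumed names.",
  "data": {"url": "athletes.csv"},
  "mark": "bar",
  "encoding": {
    "x": {"field": "country", "type": "nominal"},
    "y": {
      "aggregate": "count",
      "field": "name",
      "type": "quantitative",
      "title": "Athletes"
    },
    "color": {"field": "sport", "type": "nominal"}
  }
}
```

With a bar mark and an aggregated quantitative axis, adding a color field makes Vega-Lite stack the bars by default, one segment per color value.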

Now it's time to show the medals. Here is the bar chart of the medals won by each country, with gender highlighted by color:

"color": {
  "field": "gender",
  "type": "nominal",
  "scale": {
    "domain": ["Male", "Female"],
    "range": ["blue", "red"]
  }
}
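A possible full spec for this chart is sketched below. I am assuming the medals are summed from the "total_medals" column (the field used in the filter later in this post) and that the country column is called "country"; the data URL is a placeholder too.

```json
{
  "$schema": "https://vega.github.io/schema/vega-lite/v4.json",
  "description": "Sketch only: 'country', the data URL and the use of sum over 'total_medals' are assumptions.",
  "data": {"url": "athletes.csv"},
  "mark": "bar",
  "encoding": {
    "x": {"field": "country", "type": "nominal"},
    "y": {
      "aggregate": "sum",
      "field": "total_medals",
      "type": "quantitative",
      "title": "Medals"
    },
    "color": {
      "field": "gender",
      "type": "nominal",
      "scale": {
        "domain": ["Male", "Female"],
        "range": ["blue", "red"]
      }
    }
  }
}
```

The explicit "domain" and "range" arrays pair up by position, so "Male" is drawn in blue and "Female" in red regardless of the order in which the values appear in the data.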

To filter out the countries with zero medals, I just add a filter to the transform array:

"transform": [
  {
    "filter": "datum.total_medals > 0"
  }
]
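The "transform" array lives at the top level of the spec, next to "mark" and "encoding". The sketch below shows that placement and uses the field-predicate form of the filter, which is an equivalent alternative to the expression string. As before, "country" and the data URL are assumed names.

```json
{
  "$schema": "https://vega.github.io/schema/vega-lite/v4.json",
  "description": "Sketch only: 'country' and the data URL are assumed names.",
  "data": {"url": "athletes.csv"},
  "transform": [
    {"filter": {"field": "total_medals", "gt": 0}}
  ],
  "mark": "bar",
  "encoding": {
    "x": {"field": "country", "type": "nominal"},
    "y": {
      "aggregate": "sum",
      "field": "total_medals",
      "type": "quantitative",
      "title": "Medals"
    }
  }
}
```

The filter runs on each row before the aggregation in the encoding, so athletes without medals are dropped first, and a country only disappears from the chart when none of its athletes remain.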

My next wish is to create a stacked bar chart showing the three types of medals won by each country. In the dataset those values sit in three different columns, because a single athlete might win more than one type of medal. If they were in a single column, with the dataset reshaped into a longer format, it would be straightforward in Vega-Lite.

Unfortunately, with my dataset structure, it looks much more complicated. I've made a couple of (failed) attempts, meaning I need to dig deeper into it. Stay tuned for the next updates.
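To make the "single column" case concrete, here is a sketch with a handful of made-up placeholder rows in that long format. The column names "country", "medal_type" and "medals" are hypothetical, and the values are invented purely for illustration; they are not real results.

```json
{
  "$schema": "https://vega.github.io/schema/vega-lite/v4.json",
  "description": "Sketch with invented placeholder rows; 'country', 'medal_type' and 'medals' are hypothetical column names.",
  "data": {
    "values": [
      {"country": "A", "medal_type": "gold", "medals": 2},
      {"country": "A", "medal_type": "silver", "medals": 1},
      {"country": "A", "medal_type": "bronze", "medals": 3},
      {"country": "B", "medal_type": "gold", "medals": 1},
      {"country": "B", "medal_type": "bronze", "medals": 2}
    ]
  },
  "mark": "bar",
  "encoding": {
    "x": {"field": "country", "type": "nominal"},
    "y": {
      "aggregate": "sum",
      "field": "medals",
      "type": "quantitative",
      "title": "Medals"
    },
    "color": {"field": "medal_type", "type": "nominal"}
  }
}
```

With one row per country and medal type, a single color encoding is enough to get one stacked segment per medal type.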