S2E5 - Filter Methods
Demystify Vega-Lite Examples in this step-by-step rebuild 🕊️🧙🏼‍♂️✨
💌 PBIX file available at the end of the article [1]. Enjoy!
Intro
Welcome to Season 2 of the Vega-Lite walkthrough series. In this season, we will go step-by-step through many of the Vega-Lite Examples - learning loads of techniques and tips along the way. Enjoy and welcome! 🕊️
Data
All data used in this series can be found in the Vega github repo:
Official Vega & Vega-Lite Data Source Repo
Today’s dataset is provided by yours truly (see the PowerQuery custom function fn_generate_randomised_data).
. . .
Concept
In this episode we will focus on filtering methods.
Important Note: Viewing Vega/Vega-Lite Outputs
When opening in the Vega Online Editor, remember to delete the raw path URL before /data/. Examples:
Github pages:
• "data": {"url": "https://raw.githubusercontent.com/vega/vega/refs/heads/main/docs/data/stocks.csv"}
For online editor:
• "data": {"url": "data/stocks.csv"}
For PowerBI:
• "data": {"name": "dataset"}
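To see the short online-editor path in context, here is a minimal sketch of a complete spec that can be pasted into the Vega Online Editor. The GOOG filter and the line mark are my own illustrative choices (stocks.csv carries symbol, date and price columns); the only point being made is where the "data" block sits.

```json
{
  "$schema": "https://vega.github.io/schema/vega-lite/v5.json",
  "description": "Illustrative sketch for the Vega Online Editor. The GOOG filter and line mark are assumptions made for this example.",
  "data": {"url": "data/stocks.csv"},
  "transform": [
    {"filter": {"field": "symbol", "equal": "GOOG"}}
  ],
  "mark": "line",
  "encoding": {
    "x": {"field": "date", "type": "temporal"},
    "y": {"field": "price", "type": "quantitative"}
  }
}
```

For Power BI, the same spec would swap the "data" block for {"name": "dataset"} and use field names from your own dataset.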
. . .
Build
What we are building here is a strong foundation: an understanding of the many ways in which we can filter marks and datasets directly in our Vega-Lite spec.
The concept is fairly simple, and we can reuse and reapply the following methods anywhere within our Deneb/Vega/Vega-Lite visuals.
Step 1: How do we filter?
Filters can be applied with the Filter Transform block. You’ll be familiar with this from our previous tutorials. Here is our example:
{
  "data": {"name": "dataset"},
  /* transform block */
  //-----------------
  "transform": [{
    "filter": {
      "field": "year_num",  //-- field to filter on
      "equal": 2025         //-- numerical value
    }}],
  //-----------------
  "layer": [{"mark": {...}}],
  "encoding": {...}
}
For a wordy explanation, we could say:
The Filter Transform must have a filter property. This property is described as the predicate and it determines the manner or method in which we perform the filter operation. A predicate is a logical condition that returns a Boolean value (true or false).
In other words:
A predicate is a test that says whether each row of data should be kept or removed.
So we can indeed think of it like this:
• Keep only the rows where this condition is true
• Keep only the rows of data where the values in the [category] field equal “Laptops”
{
  "data": {"name": "dataset"},
  /* transform block */
  //-----------------
  "transform": [{             // transform
    "filter": {               //-- filter operation
      "field": "category",    //-- field to filter on
      "equal": "Laptops"      //-- predicate
    }}],
  //-----------------
  "layer": [{"mark": {...}}],
  "encoding": {...}
}
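If you want to watch the predicate at work outside Power BI, the sketch below is a complete, self-contained version of the same filter. The inline rows and the sales field are invented purely for illustration; only the two Laptops rows should reach the bar mark.

```json
{
  "$schema": "https://vega.github.io/schema/vega-lite/v5.json",
  "description": "Illustrative sketch: the inline rows and the sales field are invented for this example.",
  "data": {
    "values": [
      {"category": "Laptops", "year_num": 2025, "sales": 120},
      {"category": "Laptops", "year_num": 2026, "sales": 150},
      {"category": "PCs", "year_num": 2025, "sales": 90},
      {"category": "Tablets", "year_num": 2025, "sales": 60}
    ]
  },
  "transform": [
    {"filter": {"field": "category", "equal": "Laptops"}}
  ],
  "mark": "bar",
  "encoding": {
    "x": {"field": "year_num", "type": "ordinal"},
    "y": {"field": "sales", "type": "quantitative"}
  }
}
```

Removing the transform block brings the PCs and Tablets rows back, which makes for a quick before/after check.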
Step 2: Filter Transformation Types
Predicate Types
There are four types of predicate:
- Field Predicate
- Vega Expression
- Selection Predicate
- A logical composition (a combination of the above with and, or, not)
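As a rough illustration of the fourth type, the sketch below nests a not inside an or. The inline rows are made up for the example; the filter reads as "keep Laptops, or anything that is not earlier than 2026".

```json
{
  "$schema": "https://vega.github.io/schema/vega-lite/v5.json",
  "description": "Illustrative sketch of a logical composition. The inline rows and the sales field are invented for this example.",
  "data": {
    "values": [
      {"category": "Laptops", "year_num": 2025, "sales": 120},
      {"category": "PCs", "year_num": 2026, "sales": 90},
      {"category": "Tablets", "year_num": 2027, "sales": 60},
      {"category": "Tablets", "year_num": 2025, "sales": 40}
    ]
  },
  "transform": [
    {
      "filter": {
        "or": [
          {"field": "category", "equal": "Laptops"},
          {"not": {"field": "year_num", "lt": 2026}}
        ]
      }
    }
  ],
  "mark": "bar",
  "encoding": {
    "x": {"field": "category", "type": "nominal"},
    "y": {"field": "sales", "type": "quantitative"}
  }
}
```

With this data, only the Tablets row from 2025 is removed: it is neither a Laptop nor from 2026 onwards.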
Field Predicates:
The table below lists the field predicates and their descriptions:
predicate | description |
---|---|
equal | equal to |
lt | less than |
lte | less than or equal to |
gt | greater than |
gte | greater than or equal to |
range | an array [min, max]; keeps values within that range (inclusive) |
oneOf | behaves like the IN operator |
Example with and, gte & lte:
"transform": [{
  "filter": {
    "and": [
      { "field": "year_num", "gte": 2026 },
      { "field": "year_num", "lte": 2027 }
    ]
  }}]
Example with oneOf:
"transform": [{
  "filter": {
    "field": "subcategory",
    "oneOf": [
      "Smartphones",
      "Mini PCs",
      "Gaming PCs"
    ]
  }}]
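The range predicate from the table has no example of its own, so here is a small sketch. It should behave like the and + gte + lte combination shown earlier, since both ends of the range are inclusive. The inline rows are invented for the example.

```json
{
  "$schema": "https://vega.github.io/schema/vega-lite/v5.json",
  "description": "Illustrative sketch of the range predicate. The inline rows and the sales field are invented for this example.",
  "data": {
    "values": [
      {"year_num": 2025, "sales": 100},
      {"year_num": 2026, "sales": 130},
      {"year_num": 2027, "sales": 160},
      {"year_num": 2028, "sales": 190}
    ]
  },
  "transform": [
    {"filter": {"field": "year_num", "range": [2026, 2027]}}
  ],
  "mark": "bar",
  "encoding": {
    "x": {"field": "year_num", "type": "ordinal"},
    "y": {"field": "sales", "type": "quantitative"}
  }
}
```

Only the 2026 and 2027 rows pass the filter.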
Vega-Expressions
An alternative method, and one which I find preferable for both ease and aesthetic reasons, is to use Vega Expressions. They feel more like your DAX, M-code or SQL-type statements, albeit with a sprinkling of JSON syntax.
Field Predicate vs Vega-Expression
Let’s make a comparison between the two filter methods…
Field Predicate:
// predicate method
"transform": [{
  "filter": {
    "and": [
      { "field": "year_num", "gte": 2026 },
      { "field": "year_num", "lte": 2027 }
    ]
  }}]
— VS —
Vega-Expression:
// vega-expression method
"transform": [
  {
    "filter": "datum.year_num >= 2026 && datum.year_num <= 2027"
  }
]
What do you think? There is a slickness to the Vega-Expression method that is most satisfying, isn’t there? 🥲
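The oneOf and range predicates have expression counterparts too. The sketch below leans on two functions from the Vega expression language, indexof and inrange; treat it as one possible translation rather than the only one. The inline rows are invented for the example.

```json
{
  "$schema": "https://vega.github.io/schema/vega-lite/v5.json",
  "description": "Illustrative sketch: expression versions of oneOf and range. The inline rows and the sales field are invented for this example.",
  "data": {
    "values": [
      {"subcategory": "Smartphones", "year_num": 2026, "sales": 80},
      {"subcategory": "Mini PCs", "year_num": 2025, "sales": 50},
      {"subcategory": "Tablets", "year_num": 2027, "sales": 70},
      {"subcategory": "Gaming PCs", "year_num": 2027, "sales": 110}
    ]
  },
  "transform": [
    {"filter": "indexof(['Smartphones', 'Mini PCs', 'Gaming PCs'], datum.subcategory) >= 0"},
    {"filter": "inrange(datum.year_num, [2026, 2027])"}
  ],
  "mark": "bar",
  "encoding": {
    "x": {"field": "subcategory", "type": "nominal"},
    "y": {"field": "sales", "type": "quantitative"}
  }
}
```

Here the first filter drops Tablets (not in the list) and the second drops Mini PCs (2025 is outside the range), leaving Smartphones and Gaming PCs.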
Step 3: Operators
You will notice that Vega Expressions allow us to use Operators. Operators are a kind of shorthand and make complex code easier to read and write.
🔧 Logical Operators (for combining conditions)
Operator | Meaning | Example |
---|---|---|
&& | AND | a > 5 && b < 10 |
\|\| | OR | a < 5 \|\| b == 20 |
! | NOT | !(a == 'PCs') |
🔍 Comparison Operators (for evaluating values)
Operator | Meaning | Example |
---|---|---|
== | Equals | datum.type == 'A' |
!= | Not equal | datum.status != 'closed' |
> | Greater than | datum.value > 100 |
>= | Greater than or equal | datum.score >= 80 |
< | Less than | datum.age < 18 |
<= | Less than or equal | datum.size <= 50 |
🧮 Arithmetic Operators (for math in expressions)
Operator | Meaning | Example |
---|---|---|
+ | Addition | datum.x + datum.y |
- | Subtraction | datum.sales - datum.returns |
* | Multiplication | datum.price * 1.1 |
/ | Division | datum.total / datum.count |
% | Modulo (remainder) | datum.id % 2 == 0 |
🧠 Ternary Operator (if-else shorthand)
Syntax | Example |
---|---|
condition ? valueIfTrue : valueIfFalse | datum.score > 50 ? 'Pass' : 'Fail' |
The calculate transforms below show how easily these operators apply to real-world fields:
"transform": [
  // multiplication (*)
  {"calculate": "(datum.price * datum.quantity)", "as": "fx_multiply"},

  // division (/)
  {"calculate": "(datum.price / datum.quantity)", "as": "fx_divide"},

  // modulo (%)
  {"calculate": "(datum.price % datum.quantity)", "as": "fx_modulo"},

  // greater than (>)
  {"calculate": "(datum.fx_modulo > 1)", "as": "fx_greater_than"},

  // equals (==)
  {"calculate": "(datum.category_name == 'PCs')", "as": "fx_equals"},

  // not equal (!=)
  {"calculate": "(datum.category_name != 'PCs')", "as": "fx_not_equals"},

  // OR (||)
  {"calculate": "(datum.category_name == 'PCs' || datum.category_name == 'Laptops')", "as": "fx_or"},

  // IF-ELSE - AND (&&)
  {"calculate": "(datum.price >= 1000 && datum.quantity > 2 ? 'pricey' : null)", "as": "fx_condition"}
]
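Since this episode is about filtering, it is worth noting that a calculated field can feed straight into a filter further down the same transform array. The sketch below assumes that chaining: the revenue and revenue_band names, the 2000 threshold and the inline rows are all invented for illustration.

```json
{
  "$schema": "https://vega.github.io/schema/vega-lite/v5.json",
  "description": "Illustrative sketch: calculate, then filter on the result. Field names revenue and revenue_band, the 2000 threshold and the inline rows are invented for this example.",
  "data": {
    "values": [
      {"category_name": "Laptops", "price": 1200, "quantity": 2},
      {"category_name": "Tablets", "price": 300, "quantity": 3},
      {"category_name": "PCs", "price": 800, "quantity": 4}
    ]
  },
  "transform": [
    {"calculate": "datum.price * datum.quantity", "as": "revenue"},
    {"calculate": "datum.revenue >= 2000 ? 'high' : 'low'", "as": "revenue_band"},
    {"filter": "datum.revenue_band == 'high'"}
  ],
  "mark": "bar",
  "encoding": {
    "x": {"field": "category_name", "type": "nominal"},
    "y": {"field": "revenue", "type": "quantitative"}
  }
}
```

Transforms run in order, so the filter can see revenue_band. With these rows, Laptops (2400) and PCs (3200) are kept, while Tablets (900) is filtered out.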
. . .
En Fin, Serafin
Thank you for staying to the end of the article… I hope you find it useful 😊. See you soon, and remember… #StayQueryous!🧙‍♂️🪄
PBIX 💾
🔗 Repo: Github Repo PBIX Treasure Trove
. . .
Footnotes
PBIX: Repo - Walkthrough Series ↩︎