Essential Reading
The genstudio.plot module is an interface to the Observable Plot library, with 100% coverage and a straightforward mapping between what you write in Python and how plots are typically written in JavaScript. The Python code you write produces a structured representation which is serialized and rendered in a browser environment. To use this library effectively, you will want to frequently refer to the Observable Plot documentation to understand the API surface we're targeting.
Marks
Marks are the basic visual elements used to represent data. Common marks include line, dot, bar, and text.
Each mark type has its own set of properties that control its appearance and behavior. For example, with line, we can control the stroke, stroke width, and curve:
import genstudio.plot as Plot

six_points = [[1, 1], [2, 4], [1.5, 7], [3, 10], [2, 13], [4, 15]]

Plot.line(
    six_points,
    {
        "stroke": "steelblue",  # Set the line color
        "strokeWidth": 3,  # Set the line thickness
        "curve": "natural",  # Set the curve type
    },
)
Marks can be layered into a single plot by combining them with +:
line_plot = Plot.line(six_points, {"stroke": "pink", "strokeWidth": 10})
dot_plot = Plot.dot(six_points, {"fill": "purple"})
line_plot + dot_plot + Plot.frame()
Layouts
To show more than one plot, we can compose layouts using & (for rows) and | (for columns).
line_plot & dot_plot
For more advanced layout options, including grids and responsive layouts, see the Layouts guide.
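To build intuition for how expressions like line_plot + dot_plot or line_plot & dot_plot can yield a structured, serializable representation, here is a minimal pure-Python sketch based on operator overloading. The Node class, its "kind" labels, and the to_dict method are hypothetical names invented for this illustration; this is a toy model, not GenStudio's implementation.

```python
import json


class Node:
    """Hypothetical stand-in for a plot or layout element (not a GenStudio class)."""

    def __init__(self, kind, children=(), **props):
        self.kind = kind
        self.children = list(children)
        self.props = props

    def __add__(self, other):
        # "+" layers two items into a single plot.
        return Node("plot", [self, other])

    def __and__(self, other):
        # "&" combines two items into a row.
        return Node("row", [self, other])

    def __or__(self, other):
        # "|" combines two items into a column.
        return Node("column", [self, other])

    def to_dict(self):
        # Recursively convert the tree into JSON-friendly dictionaries.
        result = {"kind": self.kind, **self.props}
        if self.children:
            result["children"] = [child.to_dict() for child in self.children]
        return result


line = Node("line", stroke="pink")
dot = Node("dot", fill="purple")

# A layered plot placed next to a plain dot plot.
layout = (line + dot) & dot
print(json.dumps(layout.to_dict(), indent=2))
```

The point of the sketch is only that composing with operators can build a plain tree of data, which is what makes it possible to serialize the result and hand it to a renderer running elsewhere.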
Supplying Data
To render a mark we must (1) supply some data, and (2) indicate how that data maps to visual properties (or channels, in Observable lingo). Common channels include:
- x and y for 2D coordinates
- fill and stroke for colors
- opacity for alpha blending
The documentation for each mark will indicate what channels are available or required.
There are a few ways to supply data and map it to channels. Shorthand syntax exists to make common cases faster to specify; this tends to be appreciated by advanced users but can be tricky when getting started.
Below is an example of the "base case" that Observable Plot is designed around. In this mode of working, data arrives as a list of objects, and our task is to specify how each object's properties should map to the necessary channels.
object_data = [
    {"X": 1, "Y": 2, "CATEGORY": "A"},
    {"X": 2, "Y": 4, "CATEGORY": "B"},
    {"X": 1.5, "Y": 7, "CATEGORY": "C"},
    {"X": 3, "Y": 10, "CATEGORY": "D"},
    {"X": 2, "Y": 13, "CATEGORY": "E"},
    {"X": 4, "Y": 15, "CATEGORY": "F"},
]
We always pass data as the first argument to a mark, followed by options (which may be a dictionary or keyword args). For each required or optional channel, we specify "where to find" that channel's data in each entry. Let's start with the simplest case: strings, which are used as keys to look up a property in each object.
Plot.dot(object_data, x="X", y="Y", fill="CATEGORY", r=20)
A mark takes data followed by an options dictionary, which specifies how channel names get their values.
There are a few ways to specify channel values in Observable Plot:
- A string is used to specify a property name in the data object. If it matches, that property's value is used. Otherwise, it's treated as a literal value.
- A function will receive two arguments, (data, index), and should return the desired value for the channel. We use Plot.js to insert a JavaScript source string; this function is evaluated within the rendering environment, not in Python.
- An array provides explicit values for each data point. It should have the same length as the list passed in the first (data) position.
- Other values will be used as a constant for all data points.
Plot.dot(
    object_data,
    {
        "x": "X",
        "y": "Y",
        "stroke": Plot.js("(data, index) => data.CATEGORY"),
        "strokeWidth": [1, 2, 3, 4, 5, 6],
        "r": 8,
        "fill": None,
    },
)
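The four ways of specifying a channel value can be easier to remember with a small model in hand. The sketch below mimics them in plain Python, following the rules exactly as stated above. The resolve_channel function is a hypothetical helper written for this guide: the real resolution happens inside Observable Plot in the browser, functions are JavaScript (via Plot.js) rather than Python callables, and the library's actual behavior may differ in its details.

```python
def resolve_channel(data, spec):
    """Return one value per data point for a single channel (toy model)."""
    n = len(data)
    if isinstance(spec, str):
        # A string naming a property present on every row is a lookup key...
        if n and all(isinstance(row, dict) and spec in row for row in data):
            return [row[spec] for row in data]
        # ...otherwise it is treated as a literal value (e.g. a color name).
        return [spec] * n
    if callable(spec):
        # A function receives (data, index) for each data point.
        return [spec(row, i) for i, row in enumerate(data)]
    if isinstance(spec, (list, tuple)):
        # An array supplies explicit values and must match the data length.
        if len(spec) != n:
            raise ValueError(f"expected {n} values, got {len(spec)}")
        return list(spec)
    # Any other value is a constant shared by all data points.
    return [spec] * n


rows = [
    {"X": 1, "Y": 2, "CATEGORY": "A"},
    {"X": 2, "Y": 4, "CATEGORY": "B"},
]

print(resolve_channel(rows, "X"))  # string matching a property
print(resolve_channel(rows, "steelblue"))  # string used as a literal
print(resolve_channel(rows, lambda d, i: d["CATEGORY"]))  # function
print(resolve_channel(rows, [1, 2]))  # explicit per-point values
print(resolve_channel(rows, 8))  # constant
```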
There are a couple of special cases to be aware of.
- If all of your data is in columnar format (i.e. each channel's values are in their own array), we can pass it as a dictionary in the first (data) position, e.g. Plot.dot({"x": [...], "y": [...]}).
- Some marks that expect x/y coordinates, like Plot.dot and Plot.line, will accept an array of arrays without any need to manually map channel names, e.g. Plot.line([[x1, y1], [x2, y2], ...]).
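If your data starts out as a list of objects, it can be reshaped into either of these two forms with a few lines of plain Python. The helpers rows_to_columns and rows_to_pairs below are hypothetical names used only for this sketch; they are not part of GenStudio.

```python
def rows_to_columns(rows, mapping):
    """Convert a list of objects into columnar form.

    `mapping` maps channel names to property names, e.g. {"x": "X"}.
    """
    return {channel: [row[key] for row in rows] for channel, key in mapping.items()}


def rows_to_pairs(rows, x_key, y_key):
    """Convert a list of objects into [[x, y], ...] pairs."""
    return [[row[x_key], row[y_key]] for row in rows]


object_data = [
    {"X": 1, "Y": 2},
    {"X": 2, "Y": 4},
    {"X": 1.5, "Y": 7},
]

columns = rows_to_columns(object_data, {"x": "X", "y": "Y"})
pairs = rows_to_pairs(object_data, "X", "Y")

print(columns)
print(pairs)

# With GenStudio installed, these could then be passed as described above:
#   Plot.dot(columns)
#   Plot.line(pairs)
```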
Data Serialization
Data is passed to the JavaScript runtime as JSON with binary buffer support. The serialization process handles various data types:
Data Type | Conversion |
---|---|
Basic types (str, int, bool) | Direct JSON serialization |
Binary data (bytes, bytearray, memoryview) | Stored in binary buffers with reference |
NumPy/JAX arrays | Converted to binary buffers with dtype and shape metadata |
Objects with for_json method | object.for_json() result is serialized |
Datetime objects | Converted to JavaScript Date |
Iterables (list, tuple) | Recursively serialized |
Callable objects | Converted to JavaScript callback functions (widget mode only) |
Binary data is handled efficiently by storing the raw bytes in separate buffers rather than base64 encoding in JSON. This is particularly important for large numeric arrays and binary data.
There is a 100 MB limit on the size of the initial data, and the same limit applies to each subsequent message.
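To illustrate the idea of "JSON with binary buffers", here is a simplified sketch that walks a value, moves raw bytes into a separate list of buffers, and leaves a reference behind in the JSON. The marker keys (`__buffer_index__`, `__type__`), the serialize and payload_size functions, and the reading of the limit as 100 MiB are all assumptions made for this example; GenStudio's actual wire format is not shown here. NumPy/JAX arrays and callables are left out to keep the sketch dependency-free.

```python
import datetime
import json


def serialize(value, buffers):
    """Convert `value` to JSON-friendly data, moving raw bytes into `buffers`."""
    if isinstance(value, (bytes, bytearray, memoryview)):
        # Store raw bytes out of band and leave a reference in the JSON.
        buffers.append(bytes(value))
        return {"__buffer_index__": len(buffers) - 1}  # hypothetical marker
    if hasattr(value, "for_json"):
        # Objects can supply their own serializable representation.
        return serialize(value.for_json(), buffers)
    if isinstance(value, (datetime.datetime, datetime.date)):
        # Hypothetical encoding; a receiver could rebuild a Date from it.
        return {"__type__": "datetime", "value": value.isoformat()}
    if isinstance(value, dict):
        return {str(k): serialize(v, buffers) for k, v in value.items()}
    if isinstance(value, (list, tuple)):
        return [serialize(item, buffers) for item in value]
    # str, int, float, bool, and None pass through unchanged.
    return value


def payload_size(json_text, buffers):
    """Total bytes in the JSON text plus all binary buffers."""
    return len(json_text.encode("utf-8")) + sum(len(b) for b in buffers)


class Point:
    """Example object exposing a for_json method."""

    def __init__(self, x, y):
        self.x, self.y = x, y

    def for_json(self):
        return {"x": self.x, "y": self.y}


buffers = []
message = serialize(
    {
        "label": "demo",
        "raw": b"\x00\x01\x02\x03",
        "point": Point(1, 2),
        "when": datetime.date(2024, 1, 1),
        "values": (1, 2, 3),
    },
    buffers,
)
json_text = json.dumps(message)

LIMIT_BYTES = 100 * 1024 * 1024  # assumption: "100 MB" read as 100 MiB
print(json_text)
print(payload_size(json_text, buffers) <= LIMIT_BYTES)
```

Keeping the bytes out of the JSON text avoids the roughly one-third size overhead of base64 encoding, which is why this approach matters for large arrays.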
The serialization process also handles state management for interactive widgets, collecting initial state and synced keys to enable bidirectional updates between Python and JavaScript. For more details on state management, see the State guide.
Widgets vs HTML
GenStudio offers two rendering modes:
HTML mode: Renders visualizations as standalone HTML, ideal for embedding in web pages or exporting. Plots persist across kernel restarts.
Widget mode: Renders visualizations as interactive Jupyter widgets. Enables bidirectional communication between Python and JavaScript.
You can choose the rendering mode in two ways:
- Globally, using Plot.configure():
Plot.configure(display_as="widget") # Set global rendering mode to widget
- Using a plot's .display_as(...) method:
categorical_data = [
    {"category": "A", "value": 10},
    {"category": "B", "value": 20},
    {"category": "C", "value": 15},
    {"category": "D", "value": 25},
]
(
    Plot.dot(categorical_data, {"x": "value", "y": "category", "fill": "category"})
    + Plot.colorLegend()
).display_as("html")
The global setting affects all subsequent plots unless overridden by .display_as().
You can switch between modes as needed for different use cases.