2023-09-16, updated: 2024-03-12
NJSON—One JSON Library to Rule Them All
Working on Nyxt often means interacting with the Internet in ways no other Lisp project ever has. As a natural consequence of this, we've accumulated a set of libraries covering unexplored areas of Lisp use. One of these libraries is NJSON, a JSON handling framework that focuses on interactive exploration of huge JSON objects (as are often returned by APIs on the Web).
A typical illustration of working with JSON. La Carte de l'Enfer - Sandro Botticelli, 1480 - 1490
Most JSON-related libraries in Lisp are parsers, though there are also standalone libraries supporting JSON-adjacent standards like JSON Pointer or JSON Schema. Yet, despite a sea of libraries, there seems to be no library that optimizes a typical JSON workflow: getting through multiple layers of data, destructuring and validating objects, and type-checking parsed objects.
NJSON is a library that optimizes exactly those things. It abstracts over JSON parsers and makes the data they return predictable and useful. Based on this abstraction, NJSON adds operations to explore data through multiple layers of objects and arrays; to validate, destructure, and match the given JSON against an expected structure; and to use the found data with Lispy macros like jif and jbind that are aware of JSON specifics and conventions.
JSON Exploration
One of the JSON APIs that actually need an involved library like NJSON is the Reddit one. Reddit lets you append .json to the URL of almost any post, comment, or subreddit to get the raw data about it as a huge JSON string. Parsing this JSON is only half the trouble: the output of the parser, however good it might be, is too deeply nested and noisy to process effectively.
NJSON allows you to peel the layers off the Reddit data and explore it incrementally. jget indexes deeply nested objects (when you provide it with a sequence of keys), while jkeys lists all the keys present in an object.
Using the Nyxt 3.7.0 release post (the most recent one in the Nyxt subreddit) as an example, here's how one might explore the post properties:
;; Getting the data (dex:get comes from the Dexador HTTP client).
(defvar post (njson:decode (dex:get "https://www.reddit.com/r/Nyxt/comments/16frsqb/nyxt_370.json")))
;; Exploring the general structure:
post
;; => #(#<HASH-TABLE :TEST EQUAL :COUNT 2 {10039FD1F3}>
;; #<HASH-TABLE :TEST EQUAL :COUNT 2 {10039FFE63}>)
(njson:jget 0 post)
;; => #<HASH-TABLE :TEST EQUAL :COUNT 2 {10039FD1F3}>
(njson:jkeys (njson:jget 0 post))
;; => ("kind" "data")
(njson:jget #(0 "data") post)
;; => #<HASH-TABLE :TEST EQUAL :COUNT 6 {10039FD2F3}>
(njson:jkeys (njson:jget #(0 "data") post))
;; => ("after" "dist" "modhash" "geo_filter" "children" "before")
(njson:jget #(0 "data" "children") post)
;; => #(#<HASH-TABLE :TEST EQUAL :COUNT 2 {10039FD433}>)
(njson:jkeys (njson:jget #(0 "data" "children" 0) post))
;; => ("kind" "data")
(njson:jkeys (njson:jget #(0 "data" "children" 0 "data") post))
;; => ("approved_at_utc" "subreddit" "selftext" "user_reports" "saved"
;; "mod_reason_title" "gilded" "clicked" "title" "link_flair_richtext"
;; "subreddit_name_prefixed" "hidden" "pwls" "link_flair_css_class"
;; "downs" "thumbnail_height" ...)
(njson:jget #(0 "data" "children" 0 "data" "title") post)
;; => "Nyxt 3.7.0"
(njson:jget #(0 "data" "children" 0 "data" "author") post)
;; => "aadcg"
Getting the title of the post is a matter of a short jget call:
(njson:jget #(0 "data" "children" 0 "data" "title") post)
The same path can also be written more compactly in JSON Pointer syntax.
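To try both forms without hitting the network, here is a minimal sketch on a tiny stand-in document. It assumes that NJSON and its cl-json backend are installed under the system name njson/cl-json, and that jget accepts a JSON Pointer written as a pathname literal. Both are assumptions to check against the NJSON README, and *sample* is a made-up variable.

```lisp
;; Assumption: NJSON with the cl-json backend is installed where ASDF finds it.
(asdf:load-system "njson/cl-json")

;; A made-up document with the same nesting as the Reddit response.
(defvar *sample*
  (njson:decode
   "[{\"kind\": \"Listing\",
      \"data\": {\"children\": [{\"kind\": \"t3\",
                                 \"data\": {\"title\": \"Nyxt 3.7.0\"}}]}}]"))

;; Indexing with a sequence of keys, as in the session above.
(njson:jget #(0 "data" "children" 0 "data" "title") *sample*)
;; => "Nyxt 3.7.0"

;; Assumed equivalent: the same path as a JSON Pointer pathname.
(njson:jget #p"/0/data/children/0/data/title" *sample*)
```

If the pathname form behaves as assumed, both calls return the same string. The pointer is just the key sequence joined with slashes.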
Destructuring the Document
Now that the structure of the document is more or less clear and we've got to the real post metadata, we can extract more information. There's no need for lots of jget calls: NJSON has jbind, a JSON-aware destructuring macro. The destructuring syntax is simple:
- Lists match objects, as alternating keys and sub-patterns.
- Lisp arrays (#(...)) match JSON arrays.
- Literal numbers, strings, and the keywords :true, :false, and :null match the respective JSON literals and validate them.
- Symbols match any value and get bound to it.
So, if we need to extract useful data from the Reddit post above, we can destructure it like this:
;; Same as (njson:jget #(0 "data") post)
(njson:jbind #(("data" data))
post
data)
;; => #<HASH-TABLE :TEST EQUAL :COUNT 6 {1004108F03}>
;; Same as the full nested jget above.
(njson:jbind #(("data" ("children" #(("data" ("title" title))))))
post
title)
;; => "Nyxt 3.7.0"
;; Multiple matches.
(njson:jbind #(("data" ("children" #(("data" ("title" title "author" author))))))
post
(format t "Post ~s by u/~a" title author))
;; Post "Nyxt 3.7.0" by u/aadcg
That's where jbind shines: it can bind many variables (at several levels of nesting!), allowing easy re-use of structure instead of repetitive jget calls.
jbind also validates the data it destructures. So if there's a wrong value in the object or the structure is different from the expected one, you'll see an error explaining what's wrong. Let's look at the moderation reports of the post:
(njson:jbind #(("data" ("children" #(("data" ("title" title "author" author
"mod_reports" #(one)))))))
post
one)
;; NJSON:NO-KEY: There's no index 0 in array [].
Ooooops! There are no moderation reports. Which is good for a post on Reddit, but breaks our expectations. The data is probably malformed, so we should adjust the expectations or clean up the data.
And then, if we expect a certain literal value (like null or 5), we can compare it right inside the destructuring pattern:
(njson:jbind #(("data" ("children" #(("data" ("title" :null))))))
post
:matched)
;; NJSON:VALUE-MISMATCH Expected null in object {...} and got "Nyxt 3.7.0".
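A literal in a pattern can also serve as a guard that passes. The sketch below is offline and uses a made-up object: it checks that "kind" is "t3" (Reddit's type prefix for posts) while binding the title and author, then shows the mismatch case caught programmatically. describe-post and *child* are hypothetical names, and the njson/cl-json system name is an assumption.

```lisp
;; Assumption: NJSON with the cl-json backend is installed where ASDF finds it.
(asdf:load-system "njson/cl-json")

;; A made-up stand-in for one element of the "children" array.
(defvar *child*
  (njson:decode
   "{\"kind\": \"t3\", \"data\": {\"title\": \"Nyxt 3.7.0\", \"author\": \"aadcg\"}}"))

;; The literal "t3" must equal the value under "kind"; symbols get bound.
(defun describe-post (child)
  (njson:jbind ("kind" "t3"
                "data" ("title" title "author" author))
      child
    (format nil "~a by u/~a" title author)))

(describe-post *child*)
;; => "Nyxt 3.7.0 by u/aadcg"

;; Expecting another literal signals value-mismatch, which we catch here
;; and turn into a string instead of landing in the debugger.
(handler-case
    (njson:jbind ("kind" "t1") *child* :matched)
  (njson:value-mismatch (condition)
    (format nil "Rejected: ~a" condition)))
```

Putting the literal check in the pattern keeps validation and extraction in one place, so the body only runs on data of the expected shape.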
Something's Wrong
jbind signals errors on data mismatch, and the underlying jget is extremely cautious about the data it gets. If you try to index an array with a string key, or if you try to index something that isn't an array or object, jget will notify you of that:
(njson:jget "key" "non-indexable-string")
;; NJSON:NON-INDEXABLE: Non-indexable "non-indexable-string".
(njson:jget 0 (njson:decode "{}"))
;; NJSON:INVALID-KEY: Cannot index JSON object {} with key 0.
;; Use string keys instead.
You've already seen the value-mismatch and no-key errors from jbind, and how useful they are.
NJSON is optimized for interactive JSON inspection, which is why all of these errors have custom restarts that let you debug and fix the mismatches interactively: replace the key, ignore the value, or simply abort the operation. Looking through the JSON (with jget) helps to understand the structure, which can then be destructured, validated, and used (via jbind) in the compiled code.
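In non-interactive code, where nobody is there to pick a restart, the same conditions can be handled with the standard condition system. This is a minimal sketch, not NJSON's own API: jget-or is a hypothetical helper that falls back to a default when jget signals the two errors shown above. The njson/cl-json system name is an assumption, and the restarts are left out because their names aren't shown in this post.

```lisp
;; Assumption: NJSON with the cl-json backend is installed where ASDF finds it.
(asdf:load-system "njson/cl-json")

;; Hypothetical helper: return DEFAULT instead of entering the debugger
;; when OBJECT cannot be indexed or KEY is of the wrong type for it.
(defun jget-or (default key object)
  (handler-case (njson:jget key object)
    (njson:non-indexable () default)
    (njson:invalid-key () default)))

(jget-or :nothing "key" "non-indexable-string")    ; => :NOTHING
(jget-or :nothing 0 (njson:decode "{}"))           ; => :NOTHING
(jget-or :nothing "a" (njson:decode "{\"a\": 1}")) ; => 1
```

Catching only the specific conditions keeps unrelated errors visible, which matters when the debugger isn't around to show them.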
Give NJSON a Try!
If you work with JSON and often encounter the daunting size of some datasets, you can try using NJSON and make your interaction with JSON APIs more comfortable.
You can load NJSON from GitHub, Quicklisp, or Guix.
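With Quicklisp, loading might look like the sketch below. It assumes the system to load is njson/cl-json (NJSON plus a cl-json parser backend), so check the NJSON README for the exact system names and other available backends.

```lisp
;; Assumption: Quicklisp is set up and its dist provides NJSON;
;; "njson/cl-json" is assumed to name NJSON with the cl-json backend.
(ql:quickload "njson/cl-json")

;; Quick smoke test: a JSON array should come back as a Lisp vector.
(njson:decode "[1, 2, 3]")
```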