Recently, while implementing a geographic search for an API in MongoDB and Node, I came across a variant of JSON I had never heard of: Newline Delimited JSON (NDJSON). I had copied and pasted this example dataset from MongoDB and started trying to use it before noticing a couple peculiar features in its structure: there was no outer array or parent object, and there were no commas between each member object.
I started searching Google with terms like, “parse json no comma.” As I learned that I was dealing with NDJSON, it surprised me that I had gone 5 years as a developer without ever hearing of any variants of JSON, and that when I found questions on Stackoverflow and various GitHub tickets asking similar questions, some of the top, first responses just said things like, “That’s improperly formatted JSON.”
All I wanted to do was seed my database with a one-off script. I ended up installing a package called ndjson
to import the data, do a slight transformation of some keys, and write an output file in “traditional” JSON.
const fs = require('fs');
const ndjson = require('ndjson');
let zipsToOutput = [];
fs.createReadStream('src/data/raw/zips.json')
.pipe(ndjson.parse())
.on('data', (obj) => {
zipsToOutput.push(obj);
})
.on('end', () => {
const mappedZips = zipsToOutput.map(({ _id, city, loc, pop, state }) => ({
zip: _id,
city,
loc,
pop,
state
}));
fs.writeFile(
'src/data/processed/zips.json',
JSON.stringify(mappedZips),
'utf8',
(err) => {
if (err) {
console.error(err);
return;
}
console.log('done writing ndjson to json');
}
);
});