@bzerangue
Last active April 15, 2026 11:39
JSON to NDJSON

NDJSON is a convenient format for storing or streaming structured data that may be processed one record at a time.

  • Each line is a valid JSON value
  • The line separator is '\n'

1. How to convert JSON to NDJSON

cat test.json | jq -c '.[]' > testNDJSON.json

This one-liner reads the top-level JSON array in test.json and writes each element as one compact JSON object per line to testNDJSON.json.

Note: jq is a lightweight and flexible command-line JSON processor.
https://stedolan.github.io/jq/

Source: https://medium.com/datadriveninvestor/json-parsing-error-how-to-load-json-into-bigquery-successfully-using-ndjson-2b7d94616bcb
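If jq isn't available, the same transformation is easy to sketch in plain Python with the standard json module. This is a minimal equivalent of `jq -c '.[]'`, not part of the original gist; the file names just mirror the command above, and the demo input written at the bottom is made up for illustration.

```python
import json

def json_array_to_ndjson(src_path, dest_path):
    """Read a top-level JSON array and write one compact JSON object per line."""
    with open(src_path) as f:
        records = json.load(f)  # expects the file to hold a JSON array
    with open(dest_path, "w") as f:
        for record in records:
            # separators=(",", ":") matches jq's compact (-c) output
            f.write(json.dumps(record, separators=(",", ":")) + "\n")

# Tiny demo input, standing in for the test.json used in the jq command above
with open("test.json", "w") as f:
    json.dump([{"a1": 1}, {"a2": 2}], f)

json_array_to_ndjson("test.json", "testNDJSON.json")
print(open("testNDJSON.json").read())
```

This prints the two records on separate lines, exactly as the jq version would.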

@bzerangue
Author

Helpful when importing data into Sanity.io.

https://www.sanity.io/docs/data-store/importing-data

@babuloseo

Commenting so that I start using this more and remember to use jq :/

@dreamyguy

Needed to get the same done in node, recursively, and this worked for me:

import fs from 'fs';
import jq from 'node-jq';

const pathInput = './export/json/';
const pathOutput = './export/ndjson/';
const fileExtensionSource = '.json';
const fileExtensionExport = '.json';

// Run jq's '.[]' filter over one file and write the compact
// (one-object-per-line) output to the export directory.
const writeSanityObjectToFileSyncAsNdjson = fileName => {
  const fileNameWithoutExtension = fileName.replace(fileExtensionSource, '');
  const fileNameOutput = `${fileNameWithoutExtension}NDJSON${fileExtensionExport}`;
  const pathOutputWithFileName = fileNameOutput.replace(pathInput, pathOutput);
  console.log('\n');
  jq.run('.[]', fileName, { output: 'compact' })
    .then((output) => {
      fs.writeFileSync(pathOutputWithFileName, output);
      console.log(`✨ The file '${fileName}' was converted to NDJSON!`);
    })
    .catch((err) => {
      console.error(`🐛  Something went wrong: ${err}`);
    });
};

const jsonToNDJSON = () => {
  if (!fs.existsSync(pathInput)) {
    console.log(`dir: ${pathInput} does not exist!`);
    return;
  }
  const files = fs.readdirSync(pathInput);
  files.forEach((fileName) => {
    if (fileName !== '.DS_Store') {
      const pathInputWithFileName = `${pathInput}${fileName}`;
      const stat = fs.lstatSync(pathInputWithFileName);
      // Match files ending in '.json' (dot escaped so it isn't a regex wildcard)
      const regex = new RegExp(`${fileExtensionSource.replace('.', '\\.')}$`, 'i');
      if (!stat.isDirectory() && regex.test(pathInputWithFileName)) {
        writeSanityObjectToFileSyncAsNdjson(pathInputWithFileName);
      }
    }
  });
};

jsonToNDJSON();

@m9aertner

Useful as a stepping stone for creating input data for Elasticsearch bulk API.
Concrete example:

$ jq -c '.a | .[]' <<END
{
    "a": [
        {
            "a1": 1
        },
        {
            "a2": 2
        }
    ]
}
END

Output:

{"a1":1}
{"a2":2}
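The Elasticsearch _bulk endpoint additionally expects an action metadata line before each document. A small Python sketch of that interleaving step, assuming a hypothetical index name "my-index" (not from the original comment):

```python
import json

def ndjson_to_bulk(records, index_name):
    """Interleave an action line before each document, as the
    Elasticsearch _bulk endpoint expects (newline-delimited body)."""
    lines = []
    for doc in records:
        # Action line naming the target index, then the document itself
        lines.append(json.dumps({"index": {"_index": index_name}}, separators=(",", ":")))
        lines.append(json.dumps(doc, separators=(",", ":")))
    return "\n".join(lines) + "\n"  # bulk bodies must end with a newline

body = ndjson_to_bulk([{"a1": 1}, {"a2": 2}], "my-index")
print(body)
```

The resulting body can be POSTed to `_bulk` with a `Content-Type: application/x-ndjson` header.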

@UweW

UweW commented Nov 4, 2021

I have a similar issue with my JSON data for Elasticsearch.
My challenge is that I need something similar to @m9aertner's example, but my nesting goes one level deeper.

{
   "x": {
        "a": [
            {
                "a1": 1
            },
            {
                "a2": 2
            }
        ]
    }
}

should result in:

{"x":{"a1":1}}
{"x":{"a2":2}}

@m9aertner

@UweW try

jq -c 'to_entries[] | { (.key) : (.value | .[] | .[]) }' <<<'{ "x": { "a": [ { "a1": 1 }, { "a2": 2 } ] } }'
{"x":{"a1":1}}
{"x":{"a2":2}}
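For comparison, here is a hedged Python equivalent of that to_entries trick, useful for checking what the jq filter is doing: iterate the top-level entries, descend one level into each inner array, and re-wrap every element under the outer key.

```python
import json

def rewrap_nested(obj):
    """For each top-level key whose value is an object of arrays, emit one
    compact JSON line per array element, re-wrapped under the outer key
    (mirrors jq's to_entries[] | { (.key) : (.value | .[] | .[]) })."""
    lines = []
    for key, inner in obj.items():      # e.g. key = "x"
        for arr in inner.values():      # e.g. the "a" array
            for element in arr:         # each {"a1": 1}-style object
                lines.append(json.dumps({key: element}, separators=(",", ":")))
    return lines

data = {"x": {"a": [{"a1": 1}, {"a2": 2}]}}
for line in rewrap_nested(data):
    print(line)
```

Note that, like the jq filter, this drops the intermediate "a" key; keep it in the output dict if your mapping needs it.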

@draxil

draxil commented Aug 9, 2022

jq can choke on very large files, and be slow. For these situations I made json2nd.

@bzerangue
Author

> jq can choke on very large files, and be slow. For these situations I made json2nd.

Thanks for sharing @draxil !

@fixdollar7-cyber

Very handy jq approach 👍

For people who need a quick browser-based alternative without command line setup, I’ve been using:

https://jsonviewertool.com/json-ndjson

It’s useful when validating API responses, BigQuery imports, JSONL datasets, or large array payloads before pushing them into pipelines.

The nice part is it instantly converts:

[
  {"id":1},
  {"id":2}
]

into:

{"id":1}
{"id":2}

which is perfect for NDJSON / JSONL workflows.

The jq command is still ideal for CI scripts, but the browser tool is great for debugging and quick checks.
