Encoding / Payload Transport workflow

Why CSV Exports Break JSON Imports

Debug data moved from CSV to JSON by checking delimiters, quotes, encodings, newlines, numeric conversion, leading zeros and spreadsheet formatting.

Quick Answer

CSV exports break JSON imports when spreadsheet or export tools change types, delimiters, quotes, encodings or line endings. Treat CSV as a lossy transfer format unless you validate columns, preserve text fields and inspect the generated JSON before sending it to an API.

Example Scenario

A support team exports customers from a spreadsheet, converts the CSV to JSON, and imports it into an API. Some phone numbers lose leading zeros, large ids turn into scientific notation, names with commas split into extra columns, and non-ASCII characters appear garbled.

Step-by-Step Explanation

  1. Inspect the CSV delimiter, quote character, encoding and newline style.
  2. Confirm column headers match API field names.
  3. Force ids, phone numbers and postal codes to text before export.
  4. Validate converted JSON before import.
  5. Check rows with commas, quotes, newlines and non-ASCII characters.
  6. Keep a small import fixture with edge cases.

CSV Has Fewer Types Than APIs Need

CSV is rows and columns of text, but spreadsheets often infer types while displaying and exporting. A value that should remain a string can become a number, date or formula result. JSON APIs usually care about the difference.

Phone numbers, postal codes, account ids and product codes should often stay strings even when they contain digits. If a spreadsheet removes leading zeros, the later JSON conversion cannot recover the original value.

Before export, mark sensitive identifier columns as text and test a few edge rows. Do not wait until the API rejects a large batch.

Commas and Quotes Need Proper CSV Parsing

A CSV row cannot be safely split on commas. Fields can contain commas, quotes and newlines when they are quoted correctly. A hand-written splitter may work for a small sample and fail on real customer names or addresses.

Use a real CSV parser that understands quoted fields. Then map parsed columns to JSON fields explicitly. This avoids subtle column shifts where every field after a comma-containing value moves into the wrong property.

When debugging, find the first bad row and inspect it in raw text. The row that breaks parsing often contains a comma, quote, embedded newline or missing closing quote.

Encoding Problems Look Like Bad Data

CSV exports can be UTF-8, UTF-8 with BOM, Windows-1252 or another encoding. If the importer assumes the wrong encoding, names and addresses can become garbled. The JSON may still be syntactically valid while containing corrupted text.

Look for replacement characters, broken accents or question marks where real characters should appear. If the same file looks correct in one editor and wrong in another, encoding is a strong suspect.

Document the expected CSV encoding and normalize before conversion. Do not let each import path guess differently.

Header Mapping Is a Contract

CSV headers such as First Name, first_name and firstName are not automatically equivalent. The import mapping should be explicit and validated. A typo in a header can silently create missing JSON fields.

Reject unknown or missing required columns early. A friendly import error is better than creating half-populated records or sending malformed JSON to the API.

If the API uses nested JSON, document how flat CSV columns map to nested fields. For example, address.city and address.postalCode may be clearer than guessing from human labels.

Validation Before Import

After converting CSV to JSON, validate the JSON syntax and the schema. Syntax validation catches broken conversion. Schema validation catches missing fields, wrong types and unexpected nulls.

Compare one converted row against a hand-written known-good JSON object. If the shape differs, fix the mapping before sending a batch. Small sample validation prevents noisy partial imports.

Keep import errors row-specific. Users need to know which row and column failed, not only that the batch was invalid.

Import Fixture Checklist

Create a fixture with leading zeros, long numeric-looking ids, commas, quotes, embedded newlines, non-ASCII characters, empty optional fields and missing required fields. This fixture should be exported, converted and validated whenever the import path changes.

Use Text Diff Checker for raw CSV comparisons and JSON Formatter & Validator after conversion. If the converted JSON is valid but wrong, compare field types and values rather than only syntax.

For recurring imports, keep the original CSV file, converted JSON sample and import summary together. That evidence makes later data corrections much easier.

Add row counts at every stage: source CSV rows, parsed rows, converted JSON records, accepted records and rejected records. Count mismatches reveal skipped blank lines, broken quoted rows and partial failures quickly.

When imports are user-facing, return a downloadable error report with row numbers and column names. This turns a vague failed import into a fixable editing task instead of a support investigation.

Locale settings can change CSV interpretation. Some regions use semicolons as delimiters or commas as decimal separators. If the import tool assumes one locale, values exported from another locale can shift columns or change numeric meaning.

Formula injection is another import concern. Values beginning with equals, plus, minus or at signs may be interpreted by spreadsheet tools if exported later. Treat imported text as data and escape it appropriately when generating new spreadsheets.

For auditability, store the import mapping version with the import result. If the mapping changes later, teams can explain why older imported records have a different shape.

Check blank lines and trailing delimiters. Some CSV exporters add a final blank row, and some parsers treat it as an empty record. If the JSON importer then validates required fields, users see a confusing error for a row they never thought existed.

Preview mode is worth the effort for CSV imports. Show the first few parsed rows, inferred types and validation warnings before committing data. Users can catch delimiter, encoding and header mistakes while the source file is still fresh in mind.

For large imports, process in batches and make the result resumable. If row 8,000 fails, users should not need to guess which records were already imported. Store import id, row number, external id and validation outcome.

Do not rely only on successful HTTP response status for import quality. An API may accept the upload job but later reject rows asynchronously. The import summary should include accepted, skipped, failed and duplicate counts.

If imports update existing records, include a dry-run mode that reports creates, updates and unchanged rows. That prevents users from treating every accepted row as a new record and helps teams catch accidental overwrites before data changes are committed safely in production.

Code Examples

Preserve numeric-looking ids as strings
{
  "customerId": "00012345",
  "phone": "02155551234"
}
Reject missing required headers
const required = ['customer_id', 'email', 'name'];
for (const header of required) {
  if (!headers.includes(header)) throw new Error('Missing column: ' + header);
}
Validate converted row shape
if (typeof row.customerId !== 'string') {
  throw new Error('customerId must remain a string');
}

Common Mistakes

  • Splitting CSV rows on commas instead of using a parser.
  • Letting spreadsheets convert ids and postal codes to numbers.
  • Ignoring file encoding until characters are corrupted.
  • Sending converted JSON without checking schema.
  • Reporting batch import failure without row and column details.

FAQ

Is CSV safe for long numeric ids?

Only if the export path preserves them as text. Spreadsheets may reformat them.

Why do names with commas break my import?

The converter may be splitting on commas instead of parsing quoted CSV fields.

Why are characters garbled after import?

The CSV encoding may not match the importer assumption.

Should I validate JSON after CSV conversion?

Yes. Validate both syntax and expected schema before calling the API.