FileHint

Convert Shift_JIS files to UTF-8

Mojibake in Japanese CSV and text files usually means a Shift_JIS ↔ UTF-8 mismatch: the file was saved in one encoding and read as the other. The sections that follow give concrete conversion recipes by OS.

By FileHint editorial team. Supervised by Netwiz LLC.

Symptoms

  • CSV opened in Excel shows garbled characters like "繝・Η繝シ".
  • Japanese filenames on GitHub render as mojibake.
  • Terminal output collapses to question marks.

Mac / Linux (CLI)

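# CP932 is the Windows superset of Shift_JIS; try -f CP932 if iconv rejects characters such as ①
iconv -f SHIFT_JIS -t UTF-8 in.csv > out.csv
# With BOM (helps Excel on Windows); octal escapes work in every POSIX printf
printf '\357\273\277' > out.csv && iconv -f SHIFT_JIS -t UTF-8 in.csv >> out.csv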
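
Running the conversion on a file that is already UTF-8 produces fresh mojibake, so it can help to check first. The sketch below is one way to guard against that, assuming an iconv that exits non-zero on invalid input (GNU and BSD iconv do). The function name to_utf8 and its arguments are hypothetical, and the check is a heuristic: it only asks whether the bytes decode cleanly as UTF-8.

```shell
# Hypothetical helper: to_utf8 SRC DST [SOURCE_ENCODING]
# Converts SRC to UTF-8 only when it is not already valid UTF-8.
to_utf8() {
    src=$1
    dst=$2
    enc=${3:-SHIFT_JIS}   # assumed default; pass CP932 for Windows-made files
    if iconv -f UTF-8 -t UTF-8 "$src" >/dev/null 2>&1; then
        # Decodes cleanly as UTF-8 (pure ASCII also lands here): copy unchanged.
        cp "$src" "$dst"
    else
        # Not valid UTF-8: treat it as the given legacy encoding and convert.
        iconv -f "$enc" -t UTF-8 "$src" > "$dst"
    fi
}
```

For example, `to_utf8 in.csv out.csv` leaves a UTF-8 file untouched and converts a Shift_JIS one. The output carries no BOM; prepend one as shown above if the file is headed for Excel.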

Windows (PowerShell)

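# .NET resolves relative paths against the process directory, not the PowerShell location, so build full paths first
$in = Join-Path $PWD 'in.csv'
$out = Join-Path $PWD 'out.csv'
$bytes = [System.IO.File]::ReadAllBytes($in)
$text = [System.Text.Encoding]::GetEncoding('shift_jis').GetString($bytes)
[System.IO.File]::WriteAllText($out, $text, [System.Text.UTF8Encoding]::new($true))  # $true = write a BOM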

VS Code

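  1. Open the file and click the encoding indicator in the status bar at the bottom-right.
  2. Choose Reopen with Encoding > Japanese (Shift JIS).
  3. Click the indicator again and choose Save with Encoding > UTF-8 with BOM (or plain UTF-8 if Excel will not open the file).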

Which encoding to save as

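  • For CSVs opened in Excel: UTF-8 with BOM. Without the BOM, Excel assumes the system code page when the file is double-clicked.
  • For anything consumed by web or programmatic tools: UTF-8 (no BOM). Many parsers treat the BOM bytes as part of the first field.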
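
To see which of the two variants a file currently is, it is enough to look at its first three bytes: a UTF-8 BOM is EF BB BF. The following is a minimal sketch, assuming head -c, od, and tail are available (they are on typical macOS and Linux systems). The function names has_bom and strip_bom are hypothetical.

```shell
# Hypothetical helper: succeeds when the file starts with the UTF-8 BOM (EF BB BF).
has_bom() {
    [ "$(head -c 3 "$1" | od -An -tx1 | tr -d ' \n')" = "efbbbf" ]
}

# Hypothetical helper: prints the file to stdout without its BOM, if it has one.
strip_bom() {
    if has_bom "$1"; then
        tail -c +4 "$1"   # start output at byte 4, skipping the 3 BOM bytes
    else
        cat "$1"
    fi
}
```

For example, `strip_bom out.csv > out-nobom.csv` produces a copy suited to programmatic tools while the BOM version stays available for Excel.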
