Japanese Regex

I was formatting a list of words from Tanos into a CSV document to import as a deck on my site, SRS-Ninja. I've done this before, but this time, while looking for an easier way to do the formatting, I found out you can use regular expressions on Japanese! I found a nice guide on GitHub that goes over a lot of the possible variations. In this post I want to highlight a few I found especially helpful.

  • Kana alone = ([ぁ-ゔゞァ-・ヽヾ゛゜ー])
  • Kanji alone = ([一-龯])
  • Kanji and Kana = ([一-龯ぁ-ゔゞァ-・ヽヾ゛゜ー])
  • Half-width numbers + roman characters = ([0-9A-Za-z])
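
As a rough sketch of how these character classes behave, here they are in Python's `re` module. The sample sentence is made up for illustration, and a `+` is added to each class so that it matches whole runs instead of single characters.

```python
import re

# Character classes copied from the list above; the trailing + matches runs.
KANA = re.compile(r"[ぁ-ゔゞァ-・ヽヾ゛゜ー]+")
KANJI = re.compile(r"[一-龯]+")
KANJI_AND_KANA = re.compile(r"[一-龯ぁ-ゔゞァ-・ヽヾ゛゜ー]+")

# Made-up sample line mixing Japanese and English.
sample = "日本語を勉強する (to study Japanese)"

kana_runs = KANA.findall(sample)
kanji_runs = KANJI.findall(sample)
japanese_runs = KANJI_AND_KANA.findall(sample)

print(kana_runs)      # ['を', 'する']
print(kanji_runs)     # ['日本語', '勉強']
print(japanese_runs)  # ['日本語を勉強する']
```

The combined class is the handy one for pulling a whole Japanese word or phrase out of a line while leaving the English gloss behind.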

I didn’t need the half-width ones for this particular task, but I hate how those sneak in when you’re writing in Japanese and English, and I love that I can now flush them out!
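
Here is a minimal sketch of flushing those out in Python. The function name `find_half_width` and the sample line are my own inventions for illustration. The letters are spelled out as `A-Za-z` because a single `A-z` range would also match the punctuation that sits between `Z` and `a` in ASCII.

```python
import re

# Half-width digits and Roman letters. A-Za-z is spelled out because the
# single range A-z would also match [ \ ] ^ _ and the backtick.
HALF_WIDTH = re.compile(r"[0-9A-Za-z]+")


def find_half_width(text):
    """Return (start_index, run) pairs for each half-width run in text."""
    return [(m.start(), m.group()) for m in HALF_WIDTH.finditer(text)]


# Made-up sample line with half-width characters mixed into Japanese.
line = "今日はJLPT N5の単語を30個覚えた"
hits = find_half_width(line)
print(hits)  # [(3, 'JLPT'), (8, 'N5'), (14, '30')]
```

The start indexes make it easy to jump to each offender in an editor instead of just knowing that one exists somewhere in the line.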

On a more random note, I wanted to leave myself a reminder of how Dreamweaver wildcards work. Laugh if you will, but Dreamweaver has a very powerful find and replace feature, and it has been the easiest way I’ve found to convert HTML tables to CSV files. In short, you put your regex in the find box, say “before ([0-9A-Za-z]+) after”, and then in the replace box you use $1 to insert whatever text the parentheses captured, like “new before $1 new after”. So, something like this:

Screenshot (90)
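
For anyone without Dreamweaver, the same capture-and-reuse idea can be sketched in Python. This is only an approximation of the workflow above: the table row is a made-up example, and Python writes the backreference as `\1` where Dreamweaver's replace box uses `$1`.

```python
import re

# The find/replace pair from the paragraph above, with \1 in place of $1.
renamed = re.sub(
    r"before ([0-9A-Za-z]+) after",
    r"new before \1 new after",
    "before abc123 after",
)
print(renamed)  # new before abc123 new after

# Hypothetical table row, standing in for the HTML being converted to CSV.
row = "<tr><td>猫</td><td>ねこ</td><td>cat</td></tr>"

# Each (.*?) captures one cell; \1, \2 and \3 put the cells back, comma-separated.
row_pattern = re.compile(r"<tr><td>(.*?)</td><td>(.*?)</td><td>(.*?)</td></tr>")
csv_line = row_pattern.sub(r"\1,\2,\3", row)
print(csv_line)  # 猫,ねこ,cat
```

This only suits simple, regular tables with plain cells. Anything with attributes, nested tags, or commas inside the cells would need a real HTML parser and the `csv` module.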

Published in Productivity
