Last active
July 8, 2018 05:45
Utility for easily building SKK dictionaries.
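| A utility for building SKK dictionaries that make it easy to convert words that have abbreviations or multiple readings/spellings. | |
| It can write the result in SKK dictionary format, in cdb format, or in both. | |
| Strings containing characters that have a special meaning in SKK dictionaries, such as "[" or "/", are not supported. | |
| Requires awk, nkf, and skkdic-p2cdb (distributed with multiskkserv). | |
| Without skkdic-p2cdb you can still write SKK dictionary format, so the tool is also usable with servers other than multiskkserv. | |
| Note that SKK dictionary output is encoded in EUC-JP and is not sorted, deduplicated, or checked for invalid entries, | |
| so post-process it with another tool if necessary. | |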
| sample commandline: | |
| mskk2cdb.sh -s my_jisho.skkdic -c my_jisho.cdb -d "sudo /etc/init.d/multiskkserv restart" | |
| sample input file: | |
| ; comment | |
| リトルバスターズ!/Little Busters!|りとばす/りとるばすたーず | |
| ; This will be converted to: | |
| ; りとばす /リトルバスターズ!/Little Busters!/ | |
| ; りとるばすたーず /リトルバスターズ!/Little Busters!/ | |
| ; An entry may also have a single spelling and a single reading. | |
| うみねこのなく頃に|うみねこ | |
| ; This will be converted to: | |
| ; うみねこ /うみねこのなく頃に/ |
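The conversion step can be tried on its own, without nkf or skkdic-p2cdb. The following is a minimal sketch, assuming UTF-8 input; the function name `convert_entries` is hypothetical, and the awk rules mirror the ones embedded in the script, except that readings are emitted in input order.

```shell
# Hypothetical wrapper (not part of the gist) around the awk conversion rules.
convert_entries() {
    awk '
    BEGIN { FS = "|" }
    # Input comments become SKK comments (prefixed with one more ";").
    /^;/ { print ";" $0; next }
    # "spelling1/spelling2|reading1/reading2": one output line per reading.
    index($0, "|") {
        n = split($2, readings, "/")
        for (i = 1; i <= n; i++) print readings[i] " /" $1 "/"
    }'
}

convert_entries <<'EOF'
; comment
リトルバスターズ!/Little Busters!|りとばす/りとるばすたーず
うみねこのなく頃に|うみねこ
EOF
```

Each reading becomes the key of its own line, and every spelling on the left-hand side of `|` becomes a candidate for that key.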
| #!/bin/sh | |
| eecho() { | |
| echo "$@" >&2 | |
| } | |
| usage() { | |
| # printf is used instead of "echo -e", which /bin/sh does not guarantee. | |
| printf 'Usage: %s [<options>] [<input files>...]\n' "$(basename "$0")" >&2 | |
| printf '\tTool for multiskkserv. Reads standard input when no input file is given.\n' >&2 | |
| printf 'options:\n' >&2 | |
| printf '\t-s <skkdic output>\tOutput skkdic file.\n' >&2 | |
| printf '\t\t\t\tNote that this output might be invalid.\n' >&2 | |
| printf '\t\t\t\tPost-process it with tools such as skk-tools before use.\n' >&2 | |
| printf '\t-c <cdb output>\t\tOutput cdb file.\n' >&2 | |
| printf '\t-d <command>\t\tCommand to execute when done.\n' >&2 | |
| } | |
| OUTPUT_SKK= | |
| OUTPUT_CDB= | |
| DONE_COMMAND= | |
| case "$1" in | |
| "--help"|"-h") | |
| usage | |
| exit | |
| ;; | |
| esac | |
| while getopts "c:d:s:" OPT ; do | |
| case "$OPT" in | |
| "c") | |
| # cdb output file | |
| OUTPUT_CDB="$OPTARG" | |
| ;; | |
| "s") | |
| # skkdic output file | |
| OUTPUT_SKK="$OPTARG" | |
| ;; | |
| "d") | |
| # command | |
| DONE_COMMAND="$OPTARG" | |
| ;; | |
| *) | |
| # unknown option or missing argument | |
| usage | |
| exit 2 | |
| ;; | |
| esac | |
| done | |
| shift $(( OPTIND - 1 )) | |
| # The remaining arguments are input files; the pipelines below write the cdb to OUTPUT_FILE. | |
| OUTPUT_FILE="${OUTPUT_CDB}" | |
| AWK_CONVERT_SCRIPT=' | |
| BEGIN { | |
| FS="|" | |
| } | |
| /^;/ { | |
| print ";"$0 | |
| } | |
| /.*\|.*/ && /^[^;]/ { | |
| # Emit one "reading /spellings/" entry per reading, in input order. | |
| N = split($2, SRC, "/") | |
| for (ITER = 1; ITER <= N; ITER++) { | |
| print SRC[ITER] " /" $1 "/" | |
| } | |
| }' | |
| if [ -z "${OUTPUT_SKK}" ] ; then | |
| if [ -z "${OUTPUT_CDB}" ] ; then | |
| eecho "ERROR: No output files." | |
| exit 2 | |
| fi | |
| cat "$@" | awk "${AWK_CONVERT_SCRIPT}" | nkf -e | skkdic-p2cdb "${OUTPUT_FILE}" | |
| else | |
| if [ -z "${OUTPUT_CDB}" ] ; then | |
| cat "$@" | awk "${AWK_CONVERT_SCRIPT}" | nkf -e | tee "${OUTPUT_SKK}" | |
| else | |
| cat "$@" | awk "${AWK_CONVERT_SCRIPT}" | nkf -e | tee "${OUTPUT_SKK}" | skkdic-p2cdb "${OUTPUT_FILE}" | |
| fi | |
| fi | |
| if [ "x${DONE_COMMAND}" != "x" ] ; then | |
| # never quote args of eval. | |
| eval ${DONE_COMMAND} | |
| fi |
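Because the script neither sorts nor deduplicates its output, two input entries that share a reading end up on separate lines. One possible post-processing step is sketched below; `merge_skk_entries` is a hypothetical helper, not part of the gist or of skk-tools. It assumes plain "reading /candidate/.../" lines, drops comment lines, and is probably best run with `LC_ALL=C` if it is applied after the `nkf -e` step, since the data is then EUC-JP.

```shell
# Hypothetical helper: merge lines that share a reading and drop duplicate
# candidates. Comment lines and lines without " /" are discarded.
merge_skk_entries() {
    awk '
    /^;/ || !/ \// { next }
    {
        sep = index($0, " /")
        key = substr($0, 1, sep - 1)
        n = split(substr($0, sep + 2), cand, "/")
        for (i = 1; i <= n; i++) {
            # Skip the empty piece after the trailing "/" and repeated candidates.
            if (cand[i] == "" || seen[key, cand[i]]++) continue
            merged[key] = merged[key] "/" cand[i]
        }
    }
    END {
        for (key in merged) print key " " merged[key] "/"
    }' | LC_ALL=C sort
}
```

The final bytewise sort only makes the result deterministic. For real dictionaries, dedicated tools such as skk-tools remain the safer choice.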