outcassed/abbrev.md

Look for abbreviations in pattern files

I put a commented version at the bottom that explains what each step does.

1. Install Homebrew (http://brew.sh/)
2. In Terminal: brew install poppler hunspell
3. mkdir ~/Dictionaries
4. Download dictionaries from https://cgit.freedesktop.org/libreoffice/dictionaries/tree/en and put them in ~/Dictionaries. Use the "plain" link. You need both the .dic and the .aff files.
5. Run:
pdftotext pattern.pdf - | tr -s '[:blank:][:punct:]' '\n' | awk 'length($1) >= 2 && length($1) <= 5 { print $1 }' | hunspell -d ~/Dictionaries/en_US -a | awk '{print $1,$2}' | grep -v '^[\*@]' | tr -d '&#' | sort | uniq -c

   3  10cm
   2  4sts
   1  5cm
   1  60cm
   2  7mm
   1  Aran
   3  Liesl
   3  bo
   1  bo2
   1  bo4
   1  dpn
   1  dpns
   2  grey  # funny, this is probably because I used the en_US dictionary instead of en_GB
  19  k1
   2  k17
   1  k2
  10  k2tog
  21  k3
   2  k31
   5  k4
   2  kwise
   5  liesl
   5  pdf
   2  psso
   3  pwise
   6  rnd
   6  sl
  22  sl1
  40  sts
   3  ws
   6  www
   3  yds

outcassed · 2016-04-26T14:31:37Z

Explanatory notes:

Install http://brew.sh and brew install poppler hunspell
Homebrew is the easiest way to install the PDF to text converter (pdftotext, part of poppler) and the spell checker (hunspell).
mkdir ~/Dictionaries, download dictionaries
Unfortunately, you have to install the dictionaries manually.
The long command
To see how this works, run just the first command (everything up to the first pipe, |), then add the remaining commands back one pipe at a time and run it again to see what each step produces. It's fun :)

pdftotext pattern.pdf -
extract the text from the pattern

tr -s '[:blank:][:punct:]' '\n'
replace all spaces and punctuation with newlines so that we get one word per line
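To try this step without a PDF, here is a minimal sketch that feeds a made-up pattern line (the `sample` text is an assumption, not taken from a real pattern) through the same tr command:

```shell
# Hypothetical line standing in for pdftotext output.
sample='Rnd 1: k2tog, sl1 (4sts).'

# Turn blanks and punctuation into newlines; -s squeezes runs of
# newlines so no empty lines are left between words.
words=$(printf '%s\n' "$sample" | tr -s '[:blank:][:punct:]' '\n')
printf '%s\n' "$words"
```

Each word ends up on its own line, which is the shape the later steps expect.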

awk 'length($1) >= 2 && length($1) <= 5 { print $1 }'
print out all words that are 2-5 characters long
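A small sketch of the length filter on its own, using a handful of made-up words (the word list is an assumption for illustration):

```shell
# One word per line, as produced by the tr step.
filtered=$(printf '%s\n' a k1 k2tog knitting sts |
  awk 'length($1) >= 2 && length($1) <= 5 { print $1 }')

# "a" (1 character) and "knitting" (8 characters) are dropped.
printf '%s\n' "$filtered"
```

Changing the 2 and the 5 widens or narrows the range if your patterns use longer abbreviations.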

hunspell -d ~/Dictionaries/en_US -a
run them through the spell checker. In this example we are using the en_US dictionary

awk '{print $1,$2}'
the spell checker prints out a lot of stuff; we only need the first two fields of each line, which are the result of the check and the word that was submitted

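grep -v '^[\*@]'
drop lines that start with * or @. A * means the word is spelled correctly and the @ line is the version banner hunspell prints at startup, so neither is a bad spelling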

tr -d '&#'
bad spellings are marked with & (suggestions available) or # (no suggestions). Remove those symbols; we don't need to see them
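The three filtering steps after hunspell can be tried without hunspell installed. This is a sketch that assumes a hand-written approximation of `hunspell -a` output; the banner version and the suggestions in `sample_output` are made up, only the line shapes (`*`, `& word ...`, `# word ...`, blank separators) matter here:

```shell
# Hand-written stand-in for `hunspell -a` output for three words:
# one correct word, then "sts" and "k2tog" flagged as misspelled.
sample_output() {
cat <<'EOF'
@(#) International Ispell Version 3.2.06 (but really Hunspell 1.7.0)
*

& sts 2 0: sits, sets

# k2tog 0

EOF
}

# Keep the first two fields, drop the banner and correct words,
# then strip the & and # markers.
flagged=$(sample_output | awk '{print $1,$2}' | grep -v '^[\*@]' | tr -d '&#')
printf '%s\n' "$flagged"
```

Each surviving word keeps a leading space where the marker used to be, and the blank separator lines pass through as whitespace-only lines, which uniq -c later collapses into a single entry.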

sort
sort the results so we can do a unique count

uniq -c
do a unique count
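A quick sketch of the counting step on a made-up word list. The trailing `sort -rn` is an optional extra that is not part of the original command; it lists the most frequent words first:

```shell
# uniq -c only merges adjacent duplicates, which is why sort comes first.
counts=$(printf '%s\n' sts k1 sts k2tog k1 sts | sort | uniq -c | sort -rn)
printf '%s\n' "$counts"
```

The exact padding in front of each count differs between uniq implementations, so treat the column width as cosmetic.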

outcassed · 2016-04-26T14:44:20Z

If you want to find abbreviations by using a master list instead of a spell checker

1. Install Homebrew (http://brew.sh/)
2. In Terminal, use brew to install the PDF to text utility: brew install poppler
3. Put abbreviations in abbreviations.txt, one per line. Note that k1 will also match things like k17. I figure it's safer to match partial strings.
4. Run:
pdftotext pattern.pdf - | tr -s '[:blank:][:punct:]' '\n' | grep -io -f abbreviations.txt | sort | uniq -c


  21 k1
  10 k2tog
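To see the partial-match behaviour concretely, here is a sketch with a made-up word list and a two-entry abbreviations.txt written to a temporary directory (both are assumptions for illustration). It also shows grep's -x flag, which is not in the original command, as one way to require whole-line matches if partial matching turns out to be too loose:

```shell
workdir=$(mktemp -d)
printf '%s\n' k1 k2tog > "$workdir/abbreviations.txt"

# One word per line, as produced by the tr step.
words='k1
k17
k2tog
sts'

# Original behaviour: -o prints only the matched part, so k17 counts as k1.
partial=$(printf '%s\n' "$words" | grep -io -f "$workdir/abbreviations.txt" | sort | uniq -c)

# Stricter variant: -x only matches when the whole line equals an abbreviation.
exact=$(printf '%s\n' "$words" | grep -ix -f "$workdir/abbreviations.txt" | sort | uniq -c)

printf 'partial:\n%s\nexact:\n%s\n' "$partial" "$exact"
rm -r "$workdir"
```

With partial matching k1 is counted twice (once for k17); with -x it is counted once.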

outcassed · 2016-04-26T17:24:09Z

Add this to the end to colorize your known abbreviations 🌈

| grep --color -f abbreviations.txt -e '$'
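The extra `-e $` pattern matches the end of every line, so grep lets all lines through and only colours the parts that match an abbreviation. A small sketch with made-up counts and a temporary abbreviations.txt (both assumptions) shows the difference:

```shell
workdir=$(mktemp -d)
list="$workdir/abbreviations.txt"
printf '%s\n' k1 k2tog > "$list"

# Made-up output of the counting step.
counts='   2 k1
   1 k2tog
   3 yds'

# With the end-of-line pattern every line matches; without it, unknown
# words such as yds are filtered out.
with_dollar=$(printf '%s\n' "$counts" | grep -c -f "$list" -e '$')
without_dollar=$(printf '%s\n' "$counts" | grep -c -f "$list")
printf 'lines kept with -e $: %s, without: %s\n' "$with_dollar" "$without_dollar"

# Highlighting only shows when the output goes to a terminal.
printf '%s\n' "$counts" | grep --color=auto -f "$list" -e '$'
rm -r "$workdir"
```

Quoting the `$` is not strictly needed in most shells, but it makes the intent clearer.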
