RegexForAccordance

Search the Bible with Regular Expressions Using Accordance

Accordance is a Bible study program for macOS produced by OakTree Software. It is host to many different Bible translations and original language manuscripts. RegexForAccordance allows you to search any of the text modules in Accordance using regular expressions. It uses the AppleScript interface of Accordance to get the lines of text, and then it searches those lines with the regular expression.

System Requirements

Download

Visit the Releases page to download the latest version.

Documentation

See the included Help file for features and technical details.

Have a look at the Accordance forum for some examples of what you can do.

RegexForAccordance uses ICU Regular Expression syntax. If you are not familiar with regular expressions at all, just search the web for "regular expression tutorial". You will find plenty of guides to get you started.

Examples

Acrostics of the Tetragrammaton

There is an interesting article in
The Companion Bible, Appendix 60, which shows that the book of Esther contains exactly four acrostics of the name of God, יהוה.

  1. The first acrostic has four consecutive words which begin with those letters.
    הוהי
  2. The second has four words which begin with those letters in reverse order.
    יהוה
  3. The third has four words which end with those letters.
    הוהי
  4. The fourth has four words which end with those letters in reverse order.
    יהוה

Let's use regular expressions to find these four occurrences. After that, we will search the entire Hebrew text to see if we can find similar cases in other books.

Open RegexForAccordance and use these settings.

  1. Text: HMT-W4
  2. Range: Esther
  3. Scope: Verse

Case 1: First Letters, Forward Direction

Here is the regular expression.

‭\b(י\w*)\W+(ה\w*)\W+(ו\w*)\W+(ה\w*)

I have added parentheses around each of the four word-matching parts to make it easier to understand.

The \b at the beginning matches a word boundary, to anchor the first match to the beginning of a word.

The \W+ between the parentheses matches non-word characters, such as whitespace and punctuation.

The \w* part after each letter matches word characters, i.e., not spaces or punctuation.

The result is Esther 5:4, which contains יבוא המלך והמן היום.

Notice that it matched in right-to-left order, the forward direction for Hebrew, even though we wrote the regular expression left-to-right.

Case 2: First Letters, Reverse Direction

This regular expression is the same as the previous one, but we reverse the Hebrew letters.

‭\b(ה\w*)\W+(ו\w*)\W+(ה\w*)\W+(י\w*)

The result is Esther 1:20, which contains היא וכל־הנשים יתנו.

Case 3: Last Letters, Forward Direction

Now, we want to match words which end with the four letters. We just move the \w* part to the other side of the Hebrew letters. We also put \b at the end of the expression so that it only matches the last letter at the end of the last word.

\b(\w*י)\W+(\w*ה)\W+(\w*ו)\W+(\w*ה)\b

The result is Esther 7:7, which contains כי־כלתה אליו הרעה.

Case 4: Last Letters, Reverse Direction

The final regex is the same as the previous, but with the Hebrew letters reversed.

\b(\w*ה)\W+(\w*ו)\W+(\w*ה)\W+(\w*י)\b

The result is Esther 5:13, which contains זה איננו שוה לי.

Full Hebrew Text Search

Expanding the range to search the full Hebrew text, Gen-2Chr (Hebrew book order), we find quite a few more examples of all four cases.

Note that you will get different results, depending on whether you include or omit the bracketed text. I included both results here and show the differences in bold. To omit bracketed text in RegexForAccordance, click Filters…. Under "Remove…" select Bracketed Text.

Note that Num 1:51 contains two separate results, so it counts twice.

Acrostics of the Tetragrammaton
Position, Direction Bracketed Text Result Count Result List (differences in bold)
First, Forward Include 24 Gen 19:25; Exod 4:14; Num 13:32; Deut 11:2; 2 Sam 18:4; 1 Kgs 7:12; 8:42; 18:37; 2 Kgs 10:1; Isa 45:18; Ezek 46:1; Ps 96:11; Esth 5:4; 1 Chr 5:12; 8:39; 16:31; 18:8; 22:18; 23:11,19; 26:4; 2 Chr 20:34; 26:11; 27:3
Omit same same
First, Reverse Include 33 Gen 11:9; Exod 4:16; Lev 8:15; 9:9; 21:22; Num 1:51; 5:18; 19:12; Deut 10:7; 20:8; Josh 2:15; 11:16; 18:28; 24:18; 2 Sam 18:4; 1 Kgs 18:3; 2 Kgs 7:2; Isa 30:26; 35:2; 45:20; Jer 31:7; 33:20; Ezek 46:1; Zech 1:5; 8:19; Ps 18:8; 96:11; Ruth 1:21; Eccl 3:17; Esth 1:20; Dan 12:1; 1 Chr 27:30; 2 Chr 23:6;
Omit 34 Gen 11:9; Exod 4:16; Lev 8:15; 9:9; 21:22; Num 1:51; 5:18; 19:12; Deut 10:7; 20:8; Josh 2:15; 11:16; 18:28; 24:18; 2 Sam 18:4; 1 Kgs 18:3; 2 Kgs 7:2; Isa 30:26; 35:2; 45:20; Jer 31:7; 33:20; Ezek 46:1; Zech 1:5; 8:19; Ps 18:8; 96:11; Ruth 1:21; Eccl 3:17; 10:20; Esth 1:20; Dan 12:1; 1 Chr 27:30; 2 Chr 23:6
Last, Forward Include 40 Gen 12:15; 19:13; 38:7; 43:10; Exod 3:13; 16:22; Num 5:12; Deut 24:5; 30:12; 31:29; Josh 10:18; Judg 16:16; 19:24; 20:18,41; 2 Sam 18:3; 1 Kgs 13:26; 16:7; Isa 16:3; 33:22; Jer 9:11,17; 15:19; 49:19; 51:31; Ezek 23:8; 30:2; 31:15; Hos 11:10; Joel 2:7,17; Zech 9:17; Ps 57:7; 73:15; 107:24; 115:11; Song 2:11; Esth 7:7; 1 Chr 7:34; 23:17
Omit 38 Gen 12:15; 19:13; 38:7; 43:10; Exod 3:13; 16:22; Num 5:12; Deut 24:5; 30:12; 31:29; Josh 10:18; Judg 16:16; 19:24; 20:18, 41; 2 Sam 18:3; 1 Kgs 13:26; 16:7; Isa 33:22; Jer 9:11, 17; 15:19; 49:19; 50:29; 51:31; Ezek 23:8; 30:2; 31:15; Hos 11:10; Joel 2:7, 17; Zech 9:17; Ps 57:7; 73:15; 107:24; 115:11; Esth 7:7; 1 Chr 23:17
Last, Reverse Include 26 Gen 24:58; 49:11,31; Exod 4:3; 16:7; 25:23; 37:10; Lev 8:29; Num 13:30; 24:13; Josh 19:47; 24:27; Judg 14:2; 1 Sam 20:21; 2 Sam 15:14; Isa 16:3; Jer 48:2; 49:19; 50:15,29; Ezek 1:27; Ps 106:1; Lam 3:33; Esth 5:13; Ezra 8:19; 1 Chr 21:17
Omit 28 Gen 24:58; 49:31; Exod 4:3; 16:7; 25:23; 37:10; Lev 8:29; Num 13:30; 24:13; Josh 19:47; 24:27; Judg 14:2; 1 Sam 20:2, 21; 2 Sam 15:14; Isa 16:3; Jer 2:15; 48:2; 49:19; 50:15, 29; Ezek 1:27; Ps 106:1; Lam 3:33; Esth 5:13; Ezra 8:19; 1 Chr 6:11; 21:17

If the search scope is set to Book instead of Verses, then even more results are found, because the matches can span verse boundaries.


Counting Letters and Words

Letters Frequency

In RegexForAccordance, use the statistics pane to get frequency counts of the matches. Click the Count column header to sort by that value. If the Count column isn't visible, right click and choose it.

To export the statistics, select all of the rows in the table. Then copy and paste into Excel or Numbers, for example.

Hebrew Letter Frequency
 

Search: \w

HitCount
‮י‬138794
‮ו‬130517
‮ה‬102346
‮א‬96045
‮ל‬88638
‮ר‬68366
‮ב‬65525
‮ת‬63583
‮ש‬58452
‮מ‬57894
‮ע‬45032
‮ם‬41446
‮נ‬40101
‮כ‬33603
‮ד‬32544
‮ח‬27743
‮פ‬16983
‮ק‬16339
‮ן‬15310
‮ך‬14068
‮צ‬11775
‮ג‬10139
‮ס‬9672
‮ז‬9135
‮ט‬6352
‮ץ‬3290
‮ף‬2561

Hebrew Letter Frequency
At the Beginning of Words

Search: \b\w

HitCount
‮ו‬51303
‮א‬44413
‮ה‬30910
‮ל‬25022
‮ב‬24852
‮י‬24236
‮מ‬20236
‮ע‬16489
‮כ‬14345
‮ש‬11071
‮נ‬6978
‮ת‬6976
‮ח‬5466
‮פ‬4630
‮ר‬4285
‮ד‬4274
‮ס‬3418
‮ק‬3012
‮ג‬2958
‮צ‬2554
‮ז‬2169
‮ט‬900
‮ן‬9

Hebrew Letter Frequency
At the End of Words

Search: \w\b

HitCount
‮ה‬41446
‮ם‬41446
‮י‬32990
‮ו‬30246
‮ת‬29098
‮ל‬24905
‮ר‬23360
‮ן‬15310
‮א‬14137
‮ך‬14068
‮ד‬8602
‮ב‬6984
‮ש‬6176
‮ח‬3952
‮ע‬3851
‮ץ‬3290
‮ס‬2699
‮ף‬2561
‮ק‬1908
‮ט‬1235
‮פ‬1181
‮ז‬587
‮ג‬430
‮מ‬25
‮כ‬19

(The charts were made in Numbers.)

From these results, we find that ה, ו, and י are all very common at the beginning and ending of words. It is easy to see how acrostics of יהוה could occur randomly, and also how they could be constucted deliberately, since there are so many words which begin or end with those letters.

Word Count

Hebrew Word Count

How many different words are in the Hebrew Old Testament?

Search: \w+

Result: 39,928 distinct hits in HMT-W4.

We are defining a "word" as just a sequence of Hebrew characters. Of course, many of these words are just different forms of the same root word, with different prefixes and suffixes.

How many different words begin or end with ה, ו, and י?

This table shows the total number of words and the number of distinct words for each case.

regex
total
distinct
BeginEnd
ה\bה\w+
30,811
4,095
\w+ה\b
41,347
5,370
ו\bו\w+
51,023
11,154
\w+ו\b
29,966
6,141
י\bי\w+
24,232
2,596
\w+י\b
32,986
4,265

Greek New Testament Word Count

How many words are in the book of Matthew in the Greek New Testament?

Search: \w+

This matches strings of one or more "word" characters (not whitespace or punctuation).

The result shows 18,350 hits, which is the total word count. It also shows that 3,970 different word were found, and it lists each word with count and length.


Palindromes

A palindrome is a word or phrase that reads the same in both directions. If there are an odd number of letters, the pattern will be like this: ABCDCBA. If there are an even number of letters, then the central D part is omitted: ABCCBA.

Palindromic Words

Find whole words which are palindromes.

There isn't a general solution, but the brute force approach works. Look for words, \b...\b, which have 1, 2, or 3 pairs of letters, (\w)...\1, and with an optional central letter, \w?. This can be extended to match more pairs of letters, but the longest match in the Hebrew text is 7 letters.

This makes use of the capturing group feature of regular expressions. The \1, \2, and \3, refer back to the thing that was matched within the sets of parentheses.

The (?:...) part is for non-capturing parentheses. It groups the three "or" options between the \b word boundaries without creating a capture group.

Search: \b(?:(\w)\w?\1|(\w)(\w)\w?\3\2|(\w)(\w)(\w)\w?\6\5\4)\b

The longest word is לאיתיאל, which occurs twice in Prov 30:1.

Palindromic Phrases

This is the same as the previous search, but allows spaces and punctuation between the letters: "Madam, I'm Adam" (Genesis something, probably).

The longest phrase that I can find in Hebrew is Prov 30:1 again, which just repeats the 7 letter palindromic word.

Search: (?:(\w)\W*(\w)\W*(\w)\W*(\w)\W*(\w)\W*(\w)\W*(\w)\W*\w?\W*\7\W*\6\W*\5\W*\4\W*\3\W*\2\W*\1)

More results are found by matching shorter phrases.

Search \b(?:(\w)\W*(\w)\W*(\w)\W*(\w)\W*\w?\W*\4\W*\3\W*\2\W*\1)\b

Result:

  1. Gen 27:31: ויבא לאביו
  2. Gen 37:2: אביו ויבא
  3. 1 Chr 11:45: ויחא אחיו

Search \b(?:(\w)\W*(\w)\W*(\w)\W*\w?\W*\3\W*\2\W*\1)\b

Result: 30 hits
Gen 11:16; 50:17; Deut 4:42; 19:6; Josh 8:20; 20:5; 1 Sam 9:26; 16:10; 17:33; 22:5; 2 Sam 3:1; 13:26; 16:9; 19:26,30; 24:3; 1 Kgs 7:23; 13:32; 20:40; 2 Kgs 2:8,14; Isa 55:13; 59:11; Ezek 1:3; Zeph 3:2; Zech 1:16; Ps 68:26; Prov 30:1; Ruth 4:22


Repeated Phrases

Search for repeated words or phrases. It consists of any sequence of characters, (.+), followed by non-word characters (white space or punctuation), \W+, followed by the same sequence of characters, \1. It must also be surrounded by word boundaries \b...\b.

Search: \b(.+)\W+\1\b

Greek New Testament

Text: GNT-T
Filter: Remove Greek punctuation and diacritics
Options: Ignore case

The longest match is John 10:11: ο ποιμην ο καλος ο ποιμην ο καλος.

The most frequent match is αμην αμην, found 25 times.

Hebrew Old Testament

Text: HMT-W4
Filter: Remove Hebrew cantillation, points, and punctuation

Longest match: Exod 25:35 and Exod 37:21: וכפתר תחת שני הקנים ממנה וכפתר תחת שני הקנים ממנה

Most frequent: סביב סביב, found 27 times.

Greek Old Testament (LXX)

Longest: 1 Sam 23:26:  και οι ανδρες αυτου εκ μερους του ορους τουτου και ην Δαυιδ και οι ανδρες αυτου εκ μερους του ορους τουτου και ην Δαυιδ (This one doesn't make sense because the phrases are not divided properly.)

Second Longest: 1 Sam 14:4: και ακρωτηριον πετρας ενθεν και ακρωτηριον πετρας ενθεν

Most frequent: κυριε κυριε, 16 times.


Capitalized Greek Words

Find all words in the Greek New Testament which are capitalized, but are not at the beginning of a sentence. We expect to find proper nouns mostly.

Let's build this regular expression step by step. Open a new window of RegexForAccordance.

  1. Turn off "Ignore case".
  2. Click on Filters. Under "Remove Greek…", turn off Diacritics and Punctuation.
  3. Text: GNT-T (or whatever Greek New Testament you have)
  4. Range: Matt-Rev
  5. Scope: Verse

First, just search for capital letters.

[Α-Ω]

Next, we want only capital letters which are at the beginning of a word. Add \b at the front.

\b[Α-Ω]

That gives us all the first letters, but we want to match whole words. Add \w+ to the end. This matches matches one or more "word" characters, which includes letters but not whitespace or punctuation.

\b[Α-Ω]\w+

Now it matches all capitalized words, but we want to exclude the words which are at the beginning of a sentence. Let's figure out how to find the beginning of a sentence first.

Search for \. \w. That will match a period, followed by a space, followed by one word character.

What about questions? The Greek question mark is a semicolon, so we will include that also.

[.;] \w

The most common place for a sentence to begin is at the beginning of a verse, which we can match with the beginning-of-line anchor.

^

Now, put all of those together into one "or" statement.

(^|[.;] )\w

Spot check: notice that Matt 1:6 has two sentences, and this matches the beginning of both.

Finally, we want to exclude the words which are at the beginning of a sentence. We need to turn our beginning-of-sentence matcher into a "negative look-behind assertion". It is "negative" because we are specifying what we want to exclude. It is "look-behind" because we want to look at what precedes the capitalized word.

According to the documentation, a negative look-behind assertion has this form.

(?<! ... )

So, this is what we need.

(?<!(^|[.;] ))\b[Α-Ω]\w+

That gives us what we want.


.