Greek Hebrew Ideal Format: USFM 3

Jesse Griffin edited this page Nov 20, 2017 · 9 revisions

Recommendation

USFM 3 is the ideal data format for source-language Scripture texts in the D43 API.

Current standard formats

  • CSV: from editors
  • OSIS: for Greek/Hebrew Source texts
  • USFM: for GL & OL Scripture

Use CSV

Notes:

  • Don't use in tC
  • Use for source files
  • Use for editing team

Pros:

  • One format from beginning to end
  • Preserves all content natively
  • Easy to edit with Google Sheets, Excel, Open Office
  • Easy to track changes

Cons:

  • Not a standard Scripture interchange format
  • Requires maintaining another format in API and apps
  • Different ingestion format from other Scriptures

Example Verse:

MID,ORDER,VERSE,WORD,UWORD,UMEDIEVAL,LEXEME,LEMMA,ULEMMA,SYN,MORPH,LEX,PUNC
BHP,1,010101,biblos,βιβλοσ,Βίβλος,9760,biblos,βίβλος,N.,....NFS,.,
BHP,2,010101,genesews,γενεσεωσ,γενέσεως,10780,genesis,γένεσις,N.,....GFS,.,
BHP,3,010101,=iu,ι̅υ̅,Ἰησοῦ,24240,ihsous,Ἰησοῦς,N.,....GMS,.,
BHP,4,010101,=cu,χ̅υ̅,χριστοῦ,55470,cristos,Χριστός,N.,....GMS,.,","
BHP,5,010101,uiou,υιου,υἱοῦ,52070,uios,υἱός,N.,....GMS,.,
BHP,6,010101,daueid,δαυειδ,Δαυεὶδ,11380,daueid,Δαυείδ,N.,....GMS,I,","
BHP,7,010101,uiou,υιου,υἱοῦ,52070,uios,υἱός,N.,....GMS,.,
BHP,8,010101,abraam,αβρααμ,Ἀβραάμ,110,abraam,Ἀβραάμ,N.,....GMS,I,.
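
As a rough illustration of how this layout could be read, the sketch below loads two of the rows above with Python's standard `csv` module. The function name `read_words` is hypothetical and not part of any uW tool. The point it shows is that the quoted `","` in the PUNC column is handled by an ordinary CSV parser.

```python
import csv
import io

# Header and two rows copied from the example verse above.
SAMPLE = '''MID,ORDER,VERSE,WORD,UWORD,UMEDIEVAL,LEXEME,LEMMA,ULEMMA,SYN,MORPH,LEX,PUNC
BHP,4,010101,=cu,χ̅υ̅,χριστοῦ,55470,cristos,Χριστός,N.,....GMS,.,","
BHP,8,010101,abraam,αβρααμ,Ἀβραάμ,110,abraam,Ἀβραάμ,N.,....GMS,I,.
'''


def read_words(text):
    """Return one dict per word row, keyed by the CSV header names.

    Hypothetical helper for illustration only.
    """
    return list(csv.DictReader(io.StringIO(text)))


words = read_words(SAMPLE)
for word in words:
    # PUNC holds the punctuation that follows the word, if any.
    print(word["ORDER"], word["LEXEME"], word["MORPH"], repr(word["PUNC"]))
```

Each row becomes a plain dictionary, so the same loop could feed a converter to another format.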

Use OSIS

Notes:

  • Only use for publishing
  • Don't use in tC
  • Use for 3rd parties

Pros:

  • Preserves all content natively
  • Standard for Source Texts

Cons:

  • Not a standard we currently use
  • Requires maintaining another format in API and apps
  • Hard to edit
  • Hard to track changes

Example Verse:

<verse osisID="Matt.1.1">
  <w lemma="βίβλος" strong="G9760" morph="N./....NFS" >βιβλοσ</w>
  <w lemma="γένεσις" strong="G10780" morph="N./....GFS" >γενεσεωσ</w>
  <w lemma="Ἰησοῦς" strong="G24240" morph="N./....GMS" >ι̅υ̅</w>
  <w lemma="Χριστός" strong="G55470" morph="N./....GMS" >χ̅υ̅</w>
  <w lemma="υἱός" strong="G52070" morph="N./....GMS" >υιου</w>
  <w lemma="Δαυείδ" strong="G11380" morph="N./....GMS" >δαυειδ</w>
  <w lemma="υἱός" strong="G52070" morph="N./....GMS" >υιου</w>
  <w lemma="Ἀβραάμ" strong="G110" morph="N./....GMS" >αβρααμ</w>
</verse>
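
For comparison, a minimal sketch of reading the same data from the OSIS fragment with Python's standard `xml.etree.ElementTree`. It assumes a bare `<verse>` fragment like the one above; a full OSIS document normally declares an XML namespace, which this sketch does not handle. `read_verse` is a hypothetical name.

```python
import xml.etree.ElementTree as ET

# Two words copied from the example verse above.
SAMPLE = '''<verse osisID="Matt.1.1">
  <w lemma="βίβλος" strong="G9760" morph="N./....NFS" >βιβλοσ</w>
  <w lemma="Ἀβραάμ" strong="G110" morph="N./....GMS" >αβρααμ</w>
</verse>'''


def read_verse(xml_text):
    """Return (osisID, words) for a single namespace-free <verse> fragment.

    Hypothetical helper; each word is a dict of its attributes plus its text.
    """
    verse = ET.fromstring(xml_text)
    words = [{"text": w.text, **w.attrib} for w in verse.findall("w")]
    return verse.get("osisID"), words


osis_id, words = read_verse(SAMPLE)
for word in words:
    print(osis_id, word["strong"], word["morph"], word["text"])
```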

Use USFM 3

Notes

Pros:

  • Standard for GL & OL Scripture
  • Mostly implemented in all uW apps
  • Use dev time from other formats to focus on this one
  • Maintain one format for all scripture
  • Easy to track changes
  • Maybe use for editing team
  • Could be used as one format from editing to publishing
  • Easy to create sample verse

Cons:

  • Doesn't natively support all data
  • Requires user-defined attributes (of the form x-myattr) to preserve all data
  • The recommended release, USFM 3, isn't officially finalized

Example Verse:

\v 1
\w βιβλοσ|lemma="βίβλος" strong="G09760" x-morph="Gr,N,,,,,,NFS,"\w*
\w γενεσεωσ|lemma="γένεσις" strong="G10780" x-morph="Gr,N,,,,,,GFS,"\w*
\w ι̅υ̅|lemma="Ἰησοῦς" strong="G24240" x-morph="Gr,N,,,,,,GMS,"\w*
\w χ̅υ̅|lemma="Χριστός" strong="G55470" x-morph="Gr,N,,,,,,GMS,"\w*,
\w υιου|lemma="υἱός" strong="G52070" x-morph="Gr,N,,,,,,GMS,"\w*
\w δαυειδ|lemma="Δαυείδ" strong="G11380" x-morph="Gr,N,,,,,,GMS,I"\w*,
\w υιου|lemma="υἱός" strong="G52070" x-morph="Gr,N,,,,,,GMS,"\w*
\w αβρααμ|lemma="Ἀβραάμ" strong="G00110" x-morph="Gr,N,,,,,,GMS,I" x-tw="rc://*/tw/dict/bible/other/abraham"\w*.
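
To show how the word-level attributes, including the user-defined `x-` ones, might be recovered from text like the example above, here is a minimal regex-based sketch in Python. It is not a full USFM parser and is not taken from any uW app: `parse_words` and both patterns are assumptions written for this sample, and they only cover `\w ...|key="value" ...\w*` markers whose attribute values contain no backslash or double quote.

```python
import re

# One word marker: \w text|key="value" key="value"\w*
WORD_RE = re.compile(r'\\w\s+(?P<text>[^|\\]+)\|(?P<attrs>[^\\]*)\\w\*')
# One attribute inside the marker, e.g. x-morph="Gr,N,,,,,,GMS,"
ATTR_RE = re.compile(r'([\w-]+)="([^"]*)"')

# Two words copied from the example verse above.
SAMPLE = r'''\v 1
\w υιου|lemma="υἱός" strong="G52070" x-morph="Gr,N,,,,,,GMS,"\w*
\w αβρααμ|lemma="Ἀβραάμ" strong="G00110" x-morph="Gr,N,,,,,,GMS,I" x-tw="rc://*/tw/dict/bible/other/abraham"\w*.
'''


def parse_words(usfm):
    """Return one dict per \\w marker: its text plus every attribute found.

    Hypothetical helper for illustration only.
    """
    words = []
    for match in WORD_RE.finditer(usfm):
        attrs = dict(ATTR_RE.findall(match.group("attrs")))
        words.append({"text": match.group("text").strip(), **attrs})
    return words


words = parse_words(SAMPLE)
for word in words:
    # Separate the user-defined attributes from the standard ones.
    custom = {k: v for k, v in word.items() if k.startswith("x-")}
    print(word["strong"], custom)
```

Because the custom data lives in ordinary key/value attributes, a reader that does not know `x-morph` or `x-tw` can simply ignore those keys.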