Published in: Pseudocode
Specifications of the TOPIC database format, a standardized structure for plain text databases that's easy to read and edit in most text editors, and
easy to programmatically parse as well.
+SYNOPSIS
This document describes the formal specifications of the TOPIC database
format and serves as its canonical reference. Designed for humans first
and machines second, the TOPIC format attempts to provide a standardized
structure for plain text databases that's easy to read and edit in most
text editors, and easy to programmatically parse as well [i].

Uses include: knowledge-bases, glossaries, apropos, notes...

+KEY CONCEPTS
- TOPIC databases are OS neutral.
- TOPIC databases are self-indexing.
- TOPIC databases provide associations linking blocks of data.
- TOPIC databases are written and read as standard ASCII [ii], so
  virtually any plain text editor is suitable for editing chores.
- TOPIC databases use fundamentally simple markup [iii] employing only
  the plus ('+') and comma (',') characters to delimit content.
- TOPIC databases allow the end user to label data in a
  straightforward, intuitive manner.

+TAGS
- A tag line always begins with a single + character [HEX:2B] [iv].
- Tags are always located above the block they describe, alone on a
  single line.
- Tags only contain alphanumeric characters A-Z, a-z, 0-9, and
  optionally spaces [HEX:20] [iv].
- A tag can be either a single word, or a group of words.
- Multiple tags are comma delimited [HEX:2C] [iv].

+BLOCKS
- A block is always located below the tags that describe it.
- Block lines never begin with a + character [HEX:2B] [iv].
- A block may contain any number of lines.
- Empty lines within a block are valid.

+LINES
- Lines are terminated with one of CR [HEX:0D], LF [HEX:0A], or a CR/LF
  pair [iv].
- No limits are imposed on the length of a given line [v].

+ASSOCIATIONS
Using multiple tags establishes associations between otherwise unrelated
blocks. In the example below, the first block has a tag named 'Apples',
the second block has a tag named 'Oranges', and both blocks have a
common tag named 'Fruit' as shown in the next two blocks:

+Apples, Fruit
Block line 1
Block line 2
Block line n...

+Oranges, Fruit
Block line 1
Block line 2
Block line n...

This means you can stream the first block with the 'Apples' tag, stream
the second block with the 'Oranges' tag, or stream both blocks via the
'Fruit' tag. The advantage gained is that your data can be filtered in
an arbitrary manner. For instance, you could have twelve blocks, each
with differing month tags, and a common year tag allowing you to
scrutinize your data by month as well as year...

+PARSING
- TOPIC databases are parsed line-by-line sequentially from top to
  bottom, and left to right.
- Parsing ignores blocks, seeking only tags matching the current query,
  and when a match is found, outputs the associated block.
- Because a given tag can define multiple blocks, the data should be
  parsed in its entirety 'per query'.

+MODIFICATIONS
There are no formally sanctioned modifications to the TOPIC database
specification. However, the user is free to extend and alter the format
as best fits the need provided all legalese is observed.

+UPDATES
Updates, parsing examples, and other resources are located at:
http://www.topcat.hypermart.net/index.html

+NOTES
i.   Parse: To scan/analyze data looking for a desired pattern.
ii.  ASCII: American Standard Code for Information Interchange.
iii. Markup: A system for annotating text.
iv.  See topic 'HEX TABLE' for ASCII/hexadecimal equivalents.
v.   Caveat: The user should recognize the constraints governing both
     the hardware and software rendering the data.

+HEX TABLE
ASCII/hexadecimal equivalents used in this document:

    0   1   2   3   4   5   6   7   8   9   A   B   C   D   E   F
0   ^@  ^A  ^B  ^C  ^D  ^E  ^F  ^G  ^H  ^I  ^J  ^K  ^L  ^M  ^N  ^O
1   ^P  ^Q  ^R  ^S  ^T  ^U  ^V  ^W  ^X  ^Y  ^Z  ^[  ^\  ^]  ^^  ^_
2   SPC !   "   #   $   %   &   '   (   )   *   +   ,   -   .   /
3   0   1   2   3   4   5   6   7   8   9   :   ;   <   =   >   ?
4   @   A   B   C   D   E   F   G   H   I   J   K   L   M   N   O
5   P   Q   R   S   T   U   V   W   X   Y   Z   [   \   ]   ^   _
6   `   a   b   c   d   e   f   g   h   i   j   k   l   m   n   o
7   p   q   r   s   t   u   v   w   x   y   z   {   |   }   ~   DEL

+LEGALESE
The TOPIC database specification is copyright Topcat Software LLC.
and is absolutely free for anyone to use for any reason in perpetuity.
A single line citation is requested in the form of:

TOPIC database specification by Topcat Software LLC.

eof
URL: http://www.topcat.hypermart.net/index.html
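
The PARSING and ASSOCIATIONS topics describe a single top-to-bottom pass per query: skip block lines until a tag line matches, then output the block beneath it. The following is a minimal Python sketch of that idea, not a reference implementation. The function names `split_lines` and `stream_blocks` and the `SAMPLE` data are hypothetical, and two behaviors are assumptions the specification does not state: tags are compared case-sensitively, and whitespace around each comma-delimited tag is stripped.

```python
import re
from typing import Iterator, List

# Hypothetical sample data: two blocks sharing the common tag 'Fruit'.
# Terminators are deliberately mixed (LF and CR/LF) to exercise +LINES.
SAMPLE = (
    "+Apples, Fruit\n"
    "Block line 1\n"
    "Block line 2\n"
    "+Oranges, Fruit\r\n"
    "Block line A\r\n"
)


def split_lines(text: str) -> List[str]:
    """Split on CR/LF pairs, lone CR, or lone LF, per the LINES topic."""
    lines = re.split(r"\r\n|\r|\n", text)
    # A trailing terminator leaves one empty string at the end; drop it.
    if lines and lines[-1] == "":
        lines.pop()
    return lines


def stream_blocks(lines: List[str], wanted: str) -> Iterator[str]:
    """Yield the lines of every block whose tag line contains `wanted`.

    One sequential pass over all lines, so a tag that labels several
    blocks returns all of them.
    """
    emitting = False
    for line in lines:
        if line.startswith("+"):
            # Tag line: '+' followed by comma-delimited tags.
            # Assumption: surrounding spaces are not part of a tag,
            # and matching is case-sensitive.
            tags = [tag.strip() for tag in line[1:].split(",")]
            emitting = wanted in tags
        elif emitting:
            # Block line under a matching tag line.
            yield line


if __name__ == "__main__":
    for query in ("Apples", "Oranges", "Fruit"):
        print(query, "->", list(stream_blocks(split_lines(SAMPLE), query)))
```

Querying 'Fruit' streams both blocks, while 'Apples' or 'Oranges' streams only one, which mirrors the association example in the specification. Because the pass never stops at the first match, it follows the advice to parse the data in its entirety per query.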