parsing: Difference between revisions

From Microformats Wiki
(started matrix for parsing rules, move this page if needed)
 
(added the rest of the HTML elements)

Revision as of 11:04, 12 September 2007

Parsing

This is a braindump. The page will need cleaning up, so take everything with a grain of salt for the moment.

By Element

This is a matrix of element and type. It should describe under what circumstances each value is used and where that value comes from. The list of elements is taken from http://www.w3.org/TR/html4/index/elements.html

data types

(this probably needs a better name) There are two types in microformats: protocol types and strings. A string could hold an integer (such as a rating), free text (such as a note), or a datetime (such as dtstart). Protocol types are UIDs, URLs, email addresses, and sometimes telephone and fax numbers.

If a cell contains a comma-separated list, the sources are tried in order of availability. For instance, the ABBR element is @title,node-value: if the @title is present, it is used; if not, the stack is popped and the node-value is looked at; if there is no node-value, the value is NULL.
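The fallback order described above can be sketched in Python as follows. This is only an illustration, not part of any microformats parser: the function name `resolve_value` and its parameters are hypothetical, and it assumes that only a missing attribute or missing node-value (not an empty one) counts as unavailable, which the text above does not settle.

```python
from typing import Mapping, Optional, Sequence


def resolve_value(sources: Sequence[str],
                  attributes: Mapping[str, str],
                  node_value: Optional[str]) -> Optional[str]:
    """Return the first available value named in `sources`, else None (NULL)."""
    for source in sources:
        if source.startswith("@"):
            # "@title" means: read the title attribute, if it is present.
            value = attributes.get(source[1:])
        else:
            # "node-value" means: use the element's text content.
            value = node_value
        # Assumption: only a missing value is skipped; empty strings are kept.
        if value is not None:
            return value
    return None


# The ABBR cell from the matrix: @title,node-value
abbr_sources = ["@title", "node-value"]
with_title = resolve_value(abbr_sources, {"title": "2007-09-12"}, "12 September")
without_title = resolve_value(abbr_sources, {}, "12 September")
nothing = resolve_value(abbr_sources, {}, None)
```

Here `with_title` takes the attribute, `without_title` falls back to the node-value, and `nothing` ends up as `None`, which stands in for NULL.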

Element             protocol              string
A                   @href,node-value      node-value
ABBR                @title,node-value     @title,node-value
ACRONYM             @title,node-value     @title,node-value
ADDRESS             node-value            node-value
APPLET              ???                   ???(node-value)
AREA                @href,node-value      node-value
B                   node-value            node-value
BASE (valid?)       @href
BASEFONT (valid?)
BDO (valid?)
BIG                 node-value            node-value
BLOCKQUOTE          @cite?,node-value     node-value
BODY                node-value            node-value
BR (valid?)
BUTTON              @value?               @value?
CAPTION             node-value            node-value
CENTER              node-value            node-value
CITE                node-value            node-value
CODE                node-value            node-value
COL                 node-value            node-value
COLGROUP            node-value            node-value
DD                  node-value            node-value
DEL                 @cite,node-value      node-value
DFN                 node-value            node-value
DIR                 node-value            node-value
DIV                 node-value            node-value
DL                  node-value            node-value
DT                  node-value            node-value
EM                  node-value            node-value
FIELDSET            node-value            node-value
FONT                node-value            node-value
FORM                @action?,node-value   node-value
FRAME               @src?,node-value      node-value
FRAMESET            node-value            node-value
H1                  node-value            node-value
H2                  node-value            node-value
H3                  node-value            node-value
H4                  node-value            node-value
H5                  node-value            node-value
H6                  node-value            node-value
HEAD (valid?)       node-value            node-value
HR (valid?)         node-value            node-value
HTML (valid?)       node-value            node-value
I                   node-value            node-value
IFRAME              @src?                 node-value
IMG                 @src                  @alt
INPUT               @value?               @value?
INS                 @cite,node-value      node-value
ISINDEX (valid?)
KBD                 node-value            node-value
LABEL               node-value            node-value
LEGEND              node-value            node-value
LI                  node-value            node-value
LINK (valid?)
MAP                 node-value            node-value
MENU (valid?)
META (valid?)
NOFRAMES            node-value            node-value
NOSCRIPT            node-value            node-value
OBJECT              @data,node-value      node-value
OL                  node-value            node-value
OPTGROUP (valid?)   node-value            node-value
OPTION              node-value            node-value
P                   node-value            node-value
PARAM (?)           node-value            node-value
PRE                 node-value            node-value
Q                   node-value            node-value
S                   node-value            node-value
SAMP                node-value            node-value
SCRIPT              node-value            node-value
SELECT (valid?)     node-value            node-value
SMALL               node-value            node-value
SPAN                node-value            node-value
STRIKE              node-value            node-value
STRONG              node-value            node-value
STYLE (valid?)      node-value            node-value
SUB                 node-value            node-value
SUP                 node-value            node-value
TABLE (valid?)      node-value            node-value
TBODY               node-value            node-value
TD                  node-value            node-value
TEXTAREA            node-value            node-value
TFOOT               node-value            node-value
TH                  node-value            node-value
THEAD               node-value            node-value
TITLE               node-value            node-value
TR                  node-value            node-value
TT                  node-value            node-value
U                   node-value            node-value
UL                  node-value            node-value
VAR                 node-value            node-value
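
One possible way to hold the matrix in code is a lookup table keyed by element name, with one source list per data type. The sketch below is an assumption about how a parser might organise this, not an agreed design: `MATRIX` and `sources_for` are hypothetical names, and only a handful of rows without question marks are copied from the table.

```python
from typing import Dict, List, Tuple

# element -> (protocol sources, string sources), copied from a few matrix rows.
MATRIX: Dict[str, Tuple[List[str], List[str]]] = {
    "A": (["@href", "node-value"], ["node-value"]),
    "ABBR": (["@title", "node-value"], ["@title", "node-value"]),
    "IMG": (["@src"], ["@alt"]),
    "OBJECT": (["@data", "node-value"], ["node-value"]),
    "SPAN": (["node-value"], ["node-value"]),
}


def sources_for(element: str, data_type: str) -> List[str]:
    """Return the ordered source list for an element and data type."""
    protocol_sources, string_sources = MATRIX[element.upper()]
    if data_type == "protocol":
        return protocol_sources
    if data_type == "string":
        return string_sources
    raise ValueError(f"unknown data type: {data_type!r}")


# A URL-like property on an IMG reads @src; a text property reads @alt.
img_protocol = sources_for("img", "protocol")
img_string = sources_for("img", "string")
```

Keeping the rows as data rather than as branching code means the open questions in the table (the entries marked with ?) can be settled later by editing one entry each.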