Microformalyze speeds up the collection and analysis of Microformat examples.
One of the first steps of the Microformat creation process is example gathering and website analysis. This step requires the Microformat creators to analyze tens if not hundreds of websites, which can be tedious and time-consuming. The Microformalyze tool automates the tasks of tracking website properties, saving example data and calculating analysis statistics. It can greatly increase an author's productivity when collecting examples and analyzing websites. It also creates a paper trail that others can verify quickly and easily, helping to ensure that Microformat authors perform proper analysis.
Why This Tool?
This tool was created because tracking the number of properties across websites for hAudio and hVideo became too complicated, too time-consuming and too error-prone when using the Microformats wiki. Rather than continuing to dread collecting examples via the wiki, the author decided to procrastinate by attempting to write a tool that would do the work for him. Initially code-named Frank the Tank, the tool ended up becoming useful - three cheers for procrastination!
The features of the Microformalyze tool are:
- Website URL Storage - Allows an author to add a number of website URLs for further analysis.
- Flexible Property Naming - New website properties can be added as needed while performing analysis.
- Property Descriptions - Allows an author to describe a property so others can understand what it means for a website to have a certain property.
- Standardized Load/Save Format - Allows the example and analysis data to be written to and read from a file. This allows collaboration between multiple authors using an open, standardized file format.
- Verifiable Paper Trail - The tool stores all data in a file that can be used to double-check another author's analysis. This is also useful if further analysis is required at a later point in time.
- Auto-statistics Calculation - Automatically calculates and sorts the most current statistics related to the gathered example URLs and properties for each URL.
- Cross Platform* - Coded in Python and PyGtk - tested under Linux, should work in Windows with minimal effort. *(crosses fingers).
- Open Source - All source code and documentation is available under the GPL. Allows anybody to embrace and extend the application to meet their needs.
The following is a brief tutorial for using the Microformalyze tool. We will be analyzing a number of recipe sites in an attempt to determine a common semantic grammar for describing recipes.
Download and Install
You can download the latest version of the Microformalyze tool from there:
To uncompress it, copy the file above to a directory of your choice and run the following decompression command:
tar jxvf microformalyze-0.7.tar.bz2
To start the program, go into the Microformalyze directory and start the application:
cd microformalyze-0.7
./microformalyze data/recipes.ufa
If your window manager allows you to do so, set the window to "Always be on top". This will help later when new browser windows are created on the screen.
To add a URL, perform the following:
- Click on the Add URL button at the top of the application.
- Enter the URL.
- Enter the URL Description, which is usually the name of the site.
- Click on the Add URL button at the bottom right part of the window.
Add the following URLs to the application:
http://www.cookingforengineers.com/recipe/89/Cheesecake-Plain-New-York-Style Cooking For Engineers
http://www.copykat.com/component/option,com_rapidrecipe/Itemid,28/page,viewrecipe/recipe_id,560/ CopyKat Recipes
To add a property, follow these steps:
- Click on the Add Property button at the top of the interface.
- Enter the name of the property.
- Enter a description for the property.
- Click on the Add Property button at the bottom right part of the window.
Add the following properties to the application:
- name - The name of the recipe.
- course - The part of the meal at which the item is typically served.
- ingredient - An item that is part of the recipe.
- step - Part of a series of steps that must be executed in order to prepare the recipe.
- cheatsheet - A compact image or set of HTML that outlines all ingredients and the process.
Analyzing the URLs
To start analyzing the URLs:
- Click on the Next button at the bottom right part of the screen.
- A web browser will appear and load the currently active URL. Review the website.
- Click on each checkbox whose property is represented on the website. When you are done, click the Next button.
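The bookkeeping behind this step can be pictured as a small data model: a set of URLs, a set of named properties, and a record of which properties were ticked for each URL. The following is only an illustrative sketch of that idea, not the tool's actual code; the `Analysis` class, its method names and the example.org URL are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Analysis:
    """Hypothetical in-memory model of one analysis session (not the tool's real classes)."""
    urls: dict = field(default_factory=dict)        # URL -> description
    properties: dict = field(default_factory=dict)  # property name -> description
    observed: dict = field(default_factory=dict)    # URL -> set of property names

    def add_url(self, url, description):
        """Register a site to analyze, with no properties observed yet."""
        self.urls[url] = description
        self.observed.setdefault(url, set())

    def add_property(self, name, description):
        """Register a property that sites may or may not exhibit."""
        self.properties[name] = description

    def mark(self, url, name):
        """Record that `url` exhibits property `name` (one ticked checkbox)."""
        if url not in self.urls:
            raise KeyError(f"unknown URL: {url}")
        if name not in self.properties:
            raise KeyError(f"unknown property: {name}")
        self.observed[url].add(name)


# Example session with a placeholder URL.
analysis = Analysis()
analysis.add_url("http://example.org/recipes/1", "Example Recipe Site")
analysis.add_property("name", "The name of the recipe.")
analysis.add_property("ingredient", "An item that is part of the recipe.")
analysis.mark("http://example.org/recipes/1", "name")
```

Rejecting unknown URLs and property names in `mark` mirrors the workflow above, where URLs and properties are added before analysis begins.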
Saving the Data
After you have finished analyzing each site, click on the Save button at the top of the window.
When you want to display the current analysis statistics, click on the Display Stats button at the top of the window. The statistics will be printed to the terminal from which you started the Microformalyze application.
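To give a rough idea of what such statistics involve, the sketch below counts how many analyzed sites exhibit each property and sorts the result by frequency. It is an assumption-laden illustration rather than the tool's implementation: the `property_stats` function, the `observed` mapping and the `site-a` style keys are hypothetical, and the real output format may differ.

```python
def property_stats(observed):
    """Return (property, count, percent) rows, most common property first.

    `observed` is a hypothetical mapping of site identifier -> set of
    property names found on that site.
    """
    total = len(observed)
    counts = {}
    for props in observed.values():
        for name in props:
            counts[name] = counts.get(name, 0) + 1
    rows = [(name, count, 100.0 * count / total) for name, count in counts.items()]
    # Sort by descending count, then alphabetically so ties are stable.
    rows.sort(key=lambda row: (-row[1], row[0]))
    return rows


# Made-up sample data for four sites, using the tutorial's property names.
observed = {
    "site-a": {"name", "ingredient", "step"},
    "site-b": {"name", "ingredient"},
    "site-c": {"name", "cheatsheet"},
    "site-d": {"name", "ingredient", "step", "course"},
}

stats = property_stats(observed)
for name, count, percent in stats:
    print(f"{name:<12} {count}/{len(observed)}  {percent:5.1f}%")
```

A property that appears on most sites is a strong candidate for inclusion in the Microformat, while one that appears on only a few may be better left out.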
The Microformalyze name is a combination of "Microformat" and "Analyze". It is also a play on words, as the purpose of the tool is to "formalize" small semantic data grammars through example collection and analysis.