EDITOR: Structuring HTML
EDITOR starts with an existing ADM scheme
- Scheme generated by inspecting the web site
EDITOR maps web page text to attributes of an ADM page scheme
- “Wrapping” a web page
- Imposes structure on web pages
EDITOR uses a procedural language to guide the wrapping process
- Each page seen as an object with extraction methods
- One method for each attribute of page
- Method accesses page’s HTML source, extracts value of corresponding attribute
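
The one-method-per-attribute idea can be sketched in Python. This is only an illustration and is not written in EDITOR's own procedural language. The page scheme `AuthorPage`, its attributes `name` and `papers`, the sample HTML, and the regular expressions are all hypothetical, chosen to show the shape of a wrapper rather than any actual ADM scheme.

```python
import re
from html import unescape


class AuthorPage:
    """Hypothetical page scheme with two attributes: name and papers."""

    def __init__(self, html_source):
        # The wrapper object holds the raw HTML source of one page.
        self.html = html_source

    def name(self):
        # Extraction method for the 'name' attribute: text of the first <h1>.
        match = re.search(r"<h1[^>]*>(.*?)</h1>", self.html, re.S | re.I)
        return unescape(match.group(1)).strip() if match else None

    def papers(self):
        # Extraction method for the 'papers' attribute: text of every <li>.
        items = re.findall(r"<li[^>]*>(.*?)</li>", self.html, re.S | re.I)
        return [unescape(item).strip() for item in items]

    def wrap(self):
        # Call one method per attribute to build the structured record.
        return {"name": self.name(), "papers": self.papers()}


# Made-up page used only to exercise the sketch.
SAMPLE = """<html><body>
<h1>Jane Doe</h1>
<ul><li>Paper A</li><li>Paper B &amp; C</li></ul>
</body></html>"""

record = AuthorPage(SAMPLE).wrap()
print(record)
```

Regular expressions keep the sketch short. A real wrapper would more likely use an HTML parser, since pattern matching on raw markup breaks easily when the page layout changes.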