So What Should We Do?
We could “screen scrape” HTML pages
- Not a friendly storage medium or format
- Interesting sites may have a DBMS inside
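To make the scraping option concrete, here is a minimal sketch using only Python's standard `html.parser`. The page fragment, the `CellScraper` class, and the `scrape` helper are all hypothetical, made up for illustration. A real site would be fetched over HTTP and would rarely be this tidy.

```python
from html.parser import HTMLParser

# Hypothetical page fragment (an assumption, not taken from any real site).
PAGE = """
<table>
  <tr><td>Widget</td><td>$4.50</td></tr>
  <tr><td>Gadget</td><td>$12.00</td></tr>
</table>
"""

class CellScraper(HTMLParser):
    """Collects the text of each <td>, grouped by enclosing <tr>."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._in_cell = False

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self.rows.append([])
        elif tag == "td":
            if not self.rows:          # tolerate a <td> with no <tr>
                self.rows.append([])
            self._in_cell = True
            self.rows[-1].append("")

    def handle_endtag(self, tag):
        if tag == "td":
            self._in_cell = False

    def handle_data(self, data):
        # Only text inside a cell is kept; layout whitespace is ignored.
        if self._in_cell:
            self.rows[-1][-1] += data

def scrape(page):
    """Return one tuple of cell strings per table row."""
    parser = CellScraper()
    parser.feed(page)
    parser.close()
    return [tuple(cell.strip() for cell in row) for row in parser.rows]

rows = scrape(PAGE)
```

The result is only positional strings: nothing in the markup says which column is a price or what currency it is in, and a change to the page layout silently breaks the scraper. If the site has a DBMS inside, the schema that would answer those questions never reaches the page.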
We could manage <name, value> pairs
- Universal relation revisited?
- Differences will probably kill us without conventions for attribute naming, meaning, units, structure, and so on
- XML probably won’t cure cancer
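The <name, value> option and its dependence on conventions can be sketched as follows. Everything here is an assumption made for illustration: the two sources, the attribute names, and the `CONVENTIONS` table. The sketch shows a naive union of attributes (the universal-relation flavor) next to a merge that relies on agreed names and units.

```python
# Two hypothetical sources describing the same kind of item
# with different attribute names and units.
SOURCE_A = [("name", "Widget"), ("weight_lb", "2.0"), ("price", "4.50")]
SOURCE_B = [("title", "Widget"), ("mass_kg", "0.9"), ("cost_usd", "4.50")]

LB_PER_KG = 2.20462

# Assumed convention: source attribute -> (canonical attribute, converter).
CONVENTIONS = {
    "name": ("name", str),
    "title": ("name", str),
    "weight_lb": ("weight_kg", lambda v: float(v) / LB_PER_KG),
    "mass_kg": ("weight_kg", float),
    "price": ("price_usd", float),
    "cost_usd": ("price_usd", float),
}

def naive_merge(*sources):
    """Union of all attribute names: one wide, mostly empty row per source."""
    columns = sorted({name for src in sources for name, _ in src})
    rows = [{**dict.fromkeys(columns), **dict(src)} for src in sources]
    return columns, rows

def normalize(pairs):
    """Map each pair to its canonical name and unit via CONVENTIONS."""
    out = {}
    for name, value in pairs:
        canonical, convert = CONVENTIONS[name]
        out[canonical] = convert(value)
    return out

columns, wide_rows = naive_merge(SOURCE_A, SOURCE_B)
a, b = normalize(SOURCE_A), normalize(SOURCE_B)
```

Without the table, the merge yields six columns and no two rows share one, so nothing is comparable. With it, both sources reduce to the same three attributes. The table is the hard part: someone has to agree on it, and changing the syntax of the pairs (XML included) does not supply it.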